In this part, I analyze the quality of the data in terms of completeness, coherence, and correctness.
Indicators and Dimensions
This is a panel dataset covering 38 population indicators for 258 countries over ten years (2006 to 2015). The indicators include literacy rate, mortality rate, life expectancy, and others. Together they roughly represent a country's stage of development, which supports cross-country comparisons and can inform public policy for both international organizations and national governments.
You can browse the raw data here.
Here I mainly use the indicator "Female First Marriage Age" as an example to show how the analysis proceeds and how the data could be cleaned.
Is the data complete?
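One way to start answering this question is to measure how much of each indicator is missing, both overall and per country. The sketch below is only an illustration with pandas: the wide layout (one row per country-year, one column per indicator), the column names, and the toy values are all assumptions and do not come from the real dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real panel. Layout, column names, and values
# are hypothetical and only serve to illustrate the check.
panel = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year": [2006, 2007, 2008, 2006, 2007, 2008],
    "female_first_marriage_age": [24.1, np.nan, 24.6, np.nan, np.nan, 21.3],
    "female_literacy_rate": [91.0, 91.5, 92.0, 60.2, np.nan, 62.8],
})


def missing_share(df, id_cols=("country", "year")):
    """Fraction of missing values in each indicator column, worst first."""
    indicators = df.drop(columns=list(id_cols))
    return indicators.isna().mean().sort_values(ascending=False)


def missing_share_by_country(df, indicator):
    """Fraction of missing years per country for one indicator."""
    return df[indicator].isna().groupby(df["country"]).mean()


overall = missing_share(panel)
by_country = missing_share_by_country(panel, "female_first_marriage_age")
print(overall)
print(by_country)
```

Indicators or countries with a high missing share are the natural candidates for closer inspection, interpolation, or exclusion when cleaning the data.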
An interesting result: the female literacy rate is positively correlated with the age at first marriage of both men and women.
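A correlation like this could be checked with a pairwise Pearson correlation, for example as in the sketch below. The column names and numbers are made up for illustration and are not the results of this analysis. The sketch uses pandas' `DataFrame.corr`, which skips missing values pair by pair, a convenient property for a dataset with gaps.

```python
import numpy as np
import pandas as pd

# Hypothetical values, one row per country; not taken from the real data.
df = pd.DataFrame({
    "female_literacy_rate": [45.0, 60.0, 75.0, 88.0, 99.0, np.nan],
    "female_first_marriage_age": [19.5, 21.0, 22.4, 25.1, 28.0, 23.0],
    "male_first_marriage_age": [24.0, 25.2, np.nan, 28.3, 30.5, 27.0],
})

# Pearson correlation matrix; rows with a missing value are dropped
# only for the pair of columns being compared.
corr = df.corr(method="pearson")

literacy_vs_marriage = corr.loc[
    "female_literacy_rate",
    ["female_first_marriage_age", "male_first_marriage_age"],
]
print(literacy_vs_marriage)
```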