********************************************************************************
/*
Citation:
Oxford Poverty and Human Development Initiative (OPHI), University of Oxford.
2018 Global Multidimensional Poverty Index - South Africa NIDS 2014-2015
[STATA do-file]. Available from OPHI website: http://ophi.org.uk/

For further queries, contact: ophi@qeh.ox.ac.uk
*/
********************************************************************************

clear all
set more off
set maxvar 10000
set mem 800m
cap log close

*** Working Folder Path ***
global path_in "T:/GMPI 2.0/rdta/South Africa NIDS 2014-15"
global path_out "G:/pov"
global path_logs "G:/logs"
global path_ado "G:/ado"

*** Log file ***
log using "$path_logs/zaf_nids14-15_dataprep.log", replace

********************************************************************************
*** South Africa NIDS 2014-2015 ***
*** We use Wave 4, as a cross-section data set
********************************************************************************

********************************************************************************
*** Step 1: Data preparation
********************************************************************************

********************************************************************************
*** Step 1.1 Children's nutrition (under 5 years)
********************************************************************************

use "$path_in/Child_W4_Anon_V1.1.dta", clear
sort w4_hhid
merge m:1 w4_hhid using "$path_in/hhderived_W4_Anon_V1.1.dta"
tab _merge
drop _merge

*** Generate individual unique key variable required for data merging
*** no cluster identifier
*** w4_hhid=household number
*** pid=child's line number in household
gen double ind_id = w4_hhid*10000000 + pid
format ind_id %20.0g
label var ind_id "Individual ID"

tab w4_c_outcome
    //We will only retain children who were successfully interviewed in Wave 4
drop if w4_c_outcome!=1
    //871 Observations deleted

duplicates report ind_id
    //No duplicates

/* For this
part of the do-file we use the WHO Anthro macros (igrowup_restricted.ado).
This is to calculate the z-scores of the children's nutritional variables. */

*** Indicate to STATA where the igrowup_restricted.ado file is stored:
*** Source of ado file: http://www.who.int/childgrowth/software/en/
adopath + "$path_ado/igrowup_stata"

gen str100 reflib = "$path_ado/igrowup_stata"
lab var reflib "Directory of reference tables"

gen str100 datalib = "$path_out"
lab var datalib "Directory for datafiles"

gen str30 datalab = "children_nutri_zaf"
lab var datalab "Working file"

*** Next check the variables that WHO ado needs to calculate the z-scores:
*** sex, age, weight, height, measurement, oedema & child sampling weight

*** Variable: SEX ***
tab w4_c_gen, miss    //no missing
tab w4_c_gen, nol     //1=Male; 2=Female
clonevar gender = w4_c_gen
desc gender
tab gender, miss

*** Variable: AGE ***
//South Africa NIDS 2014-15: only have year and month of birth
codebook w4_c_intrv_y, tab(99)    //Date of interview (year); no missing
codebook w4_c_intrv_m, tab(99)    //Date of interview (month); no missing
codebook w4_c_dob_y, tab(99)      //Date of birth (year); 3333=Missing, 9999=Don't know
codebook w4_c_dob_m, tab(99)      //Date of birth (month); 33=Missing, 99=Don't know
tab w4_c_dob_y w4_c_dob_m, miss

/*In some cases the year is indicated but not the month. We will assign
month of birth = 6 if month is missing but year is available*/
clonevar temp=w4_c_dob_m
replace temp=6 if (w4_c_dob_m==.  & w4_c_dob_y<=9999) | ///
                  (w4_c_dob_m==33 & w4_c_dob_y<=9999) | ///
                  (w4_c_dob_m==99 & w4_c_dob_y<=9999)

//Date of the interview and birth
gen w4_c_intrv = ym(w4_c_intrv_y, w4_c_intrv_m)
gen w4_c_dob = ym(w4_c_dob_y, temp)
format w4_c_intrv %tm
format w4_c_dob %tm
drop temp

gen w4_c_agemonth = w4_c_intrv - w4_c_dob
sum w4_c_agemonth, det
    //Some values are incoherent
replace w4_c_agemonth = . if w4_c_agemonth<0
replace w4_c_agemonth = . if w4_c_agemonth>180
    //Module goes up to children 15 years of age

clonevar age_month = w4_c_agemonth
desc age_month
summ age_month

gen str6 ageunit = "months"
lab var ageunit "months"

gen age_year = age_month/12
replace age_year = round(age_year)
    //Consistent with the questionnaire (0-15)

*** Variable: BODY WEIGHT (KILOGRAMS) ***
/*NOTE: According to the official document "NIDS Wave 4: Overview 2016.pdf"
(page 4): "Interviewers weigh and measure every resident sample member who is
six months or older". Indeed, variables on weight and height have missing
values for babies 0-6 months. Thus, no member younger than 6 months should
have data on nutrition.

Weight is measured as follows:
    1. They take two measures
    2. If there is a difference greater than +-1 kg between the two,
       they take a third one.
As such, we will take the average of the first two measures if the third one
is not needed OR the third measure if it was needed */
codebook w4_c_weight_*, tab(999)
clonevar temp1=w4_c_weight_1
clonevar temp2=w4_c_weight_2
clonevar temp3=w4_c_weight_3
replace temp1 = . if w4_c_weight_1<0
replace temp2 = . if w4_c_weight_2<0
replace temp3 = . if w4_c_weight_3<0
gen diff_weight = temp1-temp2
egen weight1 = rowmean(temp1 temp2) if diff_weight>=-1 & diff_weight<=1
replace weight1 = temp3 if (diff_weight<-1 | diff_weight>1) & temp3!=.

/*Replace as missing if child younger than 6 months, as they were not
measured */
replace weight1 = . if age_month<6
    //5 changes

desc weight1
summ weight1
drop temp* diff*

*** Variable: HEIGHT (CENTIMETERS) ***
/*Note: Height is measured as follows:
    1. They take two measures
    2. If there is a difference greater than +-1 cm between the two,
       they take a third one
We will take: the average of the first two measures if the third one is not
needed OR the third measure if it was needed */
codebook w4_c_height_*, tab(999)
clonevar temp1=w4_c_height_1
clonevar temp2=w4_c_height_2
clonevar temp3=w4_c_height_3
replace temp1 = . if w4_c_height_1<0
replace temp2 = . if w4_c_height_2<0
replace temp3 = . if w4_c_height_3<0
gen diff_height = temp1-temp2
egen height = rowmean(temp1 temp2) if diff_height>=-1 & diff_height<=1
replace height = temp3 if (diff_height<-1 | diff_height>1) & temp3!=.

//AC: replace as missing if child younger than 6 months, as they were not measured
replace height = . if age_month<6
    //4 changes

desc height
summ height
drop temp* diff*

*** Variable: MEASURED STANDING/LYING DOWN ***
/* NOTE: Official document "National Income Dynamics Study Panel User
Manual.pdf", Edited by Michelle Chinhema, Timothy Brophy, Michael Brown,
Murray Leibbrandt, Cecil Mlatsheni and Ingrid Woolard (page 58) indicates
that: "In calculating the weight for height z-scores, we assume that the
child was measured in the recumbent position if the child's age is below
24 months (731 days). If the child is aged 24 months or above, we assume that
the measured height is standing height". We have adjusted the code to
reflect this. */
gen measure = "l" if age_month<24
    *Child measured lying down
replace measure = "h" if age_month>=24 & age_month<.
    *Child measured standing up

*** Variable: OEDEMA ***
lookfor oedema
gen oedema = "n"
    //It assumes no-one has oedema
desc oedema
tab oedema

*** Variable: INDIVIDUAL CHILD SAMPLING WEIGHT ***
/* Note: There is no sampling weight for children.
As such we use the household sampling weight in this section */
sort w4_hhid
gen sw = w4_wgt
desc sw
summ sw

//We will only keep children under 5 years
drop if age_month>=60

//Generate identification variable for observations in child questionnaire
gen child_data=1

/*We now run the command to calculate the z-scores with the adofile */
igrowup_restricted reflib datalib datalab gender age_month ageunit weight1 ///
height measure oedema sw

/*We now turn to using the dta file that was created and that contains
the calculated z-scores to create the child nutrition variables following WHO
standards */
use "$path_out/children_nutri_zaf_z_rc.dta", clear

*** Standard MPI indicator ***
//Takes value 1 if the child is under 2 stdev below the median & 0 otherwise
gen underweight = (_zwei < -2.0)
replace underweight = . if _zwei == . | _fwei==1
lab var underweight "Child is undernourished (weight-for-age) 2sd - WHO"
tab underweight, miss

gen stunting = (_zlen < -2.0)
replace stunting = . if _zlen == . | _flen==1
lab var stunting "Child is stunted (length/height-for-age) 2sd - WHO"
tab stunting, miss

gen wasting = (_zwfl < - 2.0)
replace wasting = . if _zwfl == . | _fwfl == 1
lab var wasting "Child is wasted (weight-for-length/height) 2sd - WHO"
tab wasting, miss

count if _zlen==1 | _zwfl==1

//Retain relevant variables:
keep ind_id w4_hhid pid child_data age_month under* stunting* wasting*
order ind_id w4_hhid pid child_data age_month under* stunting* wasting*
sort ind_id
duplicates report ind_id

//Erase files from folder:
erase "$path_out/children_nutri_zaf_z_rc.xls"
erase "$path_out/children_nutri_zaf_prev_rc.xls"
erase "$path_out/children_nutri_zaf_z_rc.dta"

//Save a temp file for merging later:
save "$path_out/ZAF14-15_child.dta", replace

********************************************************************************
*** Step 1.2 Children's BMI-for-age (5-15 years)
********************************************************************************

use "$path_in/Child_W4_Anon_V1.1.dta", clear
sort w4_hhid
merge m:1 w4_hhid using "$path_in/hhderived_W4_Anon_V1.1.dta"
drop if _m==2
drop _merge

//This merging is necessary because of bmi variables for 5-15
sort pid
merge 1:1 pid using "$path_in/indderived_W4_Anon_V1.1.dta"
drop _merge

*** Generate individual unique key variable required for data merging
*** no cluster identifier
*** w4_hhid=household number
*** pid=child's line number in household
gen double ind_id = w4_hhid*10000000 + pid
format ind_id %20.0g
label var ind_id "Individual ID"

tab w4_c_outcome
    //We will only retain children who were successfully interviewed in Wave 4
drop if w4_c_outcome!=1
    //871 Observations deleted

duplicates report ind_id
    //no duplicates

/* For this part of the do-file we use the WHO AnthroPlus software. This is to
calculate the z-scores for 5-15 years.
*/

*** Indicate to STATA where the who2007.ado file is stored:
*** Source of ado file: https://www.who.int/growthref/tools/en/
adopath + "$path_ado/who2007_stata"

gen str100 reflib = "$path_ado/who2007_stata"
lab var reflib "Directory of reference tables"

gen str100 datalib = "$path_out"
lab var datalib "Directory for datafiles"

gen str30 datalab = "teen_nutri_zaf"
lab var datalab "Working file"

*** Next check the variables that WHO ado needs to calculate the z-scores:
*** sex, age, weight, height, measurement, oedema & sampling weight

*** Variable: SEX ***
tab w4_c_gen, miss    //no missing
tab w4_c_gen, nol     //1=Male; 2=Female
clonevar gender = w4_c_gen
desc gender
tab gender, miss

*** Variable: AGE ***
//South Africa NIDS 2014-15: only have year and month of birth
codebook w4_c_intrv_y, tab(99)    //Date of interview (year); no missing
codebook w4_c_intrv_m, tab(99)    //Date of interview (month); no missing
codebook w4_c_dob_y, tab(99)      //Date of birth (year); 3333=Missing, 9999=Don't know
codebook w4_c_dob_m, tab(99)      //Date of birth (month); 33=Missing, 99=Don't know
tab w4_c_dob_y w4_c_dob_m, miss

/*In some cases the year is indicated but not the month. We will assign
month of birth = 6 if month is missing but year is available */
clonevar temp=w4_c_dob_m
replace temp=6 if (w4_c_dob_m==.  & w4_c_dob_y<=9999) | ///
                  (w4_c_dob_m==33 & w4_c_dob_y<=9999) | ///
                  (w4_c_dob_m==99 & w4_c_dob_y<=9999)

//Date of the interview and birth
gen w4_c_intrv = ym(w4_c_intrv_y, w4_c_intrv_m)
gen w4_c_dob = ym(w4_c_dob_y, temp)
format w4_c_intrv %tm
format w4_c_dob %tm
drop temp

gen w4_c_agemonth = w4_c_intrv - w4_c_dob
sum w4_c_agemonth, det
    //Some values are incoherent
replace w4_c_agemonth = . if w4_c_agemonth<0
replace w4_c_agemonth = . if w4_c_agemonth>180
    //Module goes up to children 15 years of age

clonevar age_month = w4_c_agemonth
desc age_month
summ age_month

gen str6 ageunit = "months"
lab var ageunit "months"

gen age_year=age_month/12
replace age_year=round(age_year)
    //Consistent with the questionnaire (0-15)

*** Variable: BODY WEIGHT (KILOGRAMS) ***
/*NOTE: Weight is measured as follows:
    1. They take two measures
    2. If there is a difference greater than +-1 kg between the two,
       they take a third one
We will take: the average of the first two measures if the third one is not
needed OR the third measure if it was needed */
codebook w4_c_weight_*, tab(999)
clonevar temp1=w4_c_weight_1
clonevar temp2=w4_c_weight_2
clonevar temp3=w4_c_weight_3

//Clean data
replace temp1 = . if w4_c_weight_1<0
replace temp2 = . if w4_c_weight_2<0
replace temp3 = . if w4_c_weight_3<0
gen diff_weight = temp1-temp2
egen weight1 = rowmean(temp1 temp2) if diff_weight>=-1 & diff_weight<=1
replace weight1 = temp3 if (diff_weight<-1 | diff_weight>1) & temp3!=.

desc weight1
summ weight1
drop temp* diff*

*** Variable: HEIGHT (CENTIMETERS) ***
/*Height is measured as follows:
    1. They take two measures
    2. If there is a difference greater than +-1 cm between the two,
       they take a third one
We will take: the average of the first two measures if the third one is not
needed OR the third measure if it was needed */
codebook w4_c_height_*, tab(999)
clonevar temp1=w4_c_height_1
clonevar temp2=w4_c_height_2
clonevar temp3=w4_c_height_3

//Clean data
replace temp1 = . if w4_c_height_1<0
replace temp2 = . if w4_c_height_2<0
replace temp3 = . if w4_c_height_3<0
gen diff_height = temp1-temp2
egen height = rowmean(temp1 temp2) if diff_height>=-1 & diff_height<=1
replace height = temp3 if (diff_height<-1 | diff_height>1) & temp3!=.
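
/* Added sketch (not part of the original OPHI code): the two-measure rule
   used above for weight and height, applied to three made-up records. The
   names m1, m2, m3, gap and final are hypothetical and exist only inside
   this preserve/restore block, so the survey data in memory is left
   untouched. */
preserve
clear
input m1 m2 m3
120.0 120.4 .
120.0 123.0 121.5
120.0 123.0 .
end
gen gap = m1 - m2
//Measures agree within +-1: take the mean of the first two
egen final = rowmean(m1 m2) if gap>=-1 & gap<=1
//Measures disagree: fall back on the third measure when it exists
replace final = m3 if (gap<-1 | gap>1) & m3!=.
assert abs(final - 120.2) < 0.001 in 1
assert final == 121.5 in 2
//Measures disagree and no third measure was taken: value stays missing
assert final == . in 3
restore
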
desc height
summ height
drop temp* diff*

*** Variable: OEDEMA ***
lookfor oedema
gen oedema = "n"
    //It assumes no-one has oedema
desc oedema
tab oedema

*** Variable: SAMPLING WEIGHT ***
/* Note: There is no sampling weight for individuals 5-15 years. As such we use
the household sampling weight in this section */
sort w4_hhid
gen sw = w4_wgt
desc sw
summ sw

drop if age_month<=59
    //We will only keep the individuals between 5-15

gen teen_data=1
    //Generate id to identify individuals aged 5-15 years

/*We now run the command to calculate the z-scores with the adofile */
who2007 reflib datalib datalab gender age_month ageunit weight1 height oedema sw

/*We now turn to using the dta file that was created and that contains
the calculated z-scores to compute BMI-for-age*/
use "$path_out/teen_nutri_zaf_z.dta", clear

gen z_bmi = _zbfa
replace z_bmi = . if _fbfa==1
lab var z_bmi "z-score bmi-for-age WHO"

*** Standard MPI indicator ***
gen low_bmiage_teen = (z_bmi < -2.0)
    *Takes value 1 if the child is under 2 stdev below the median & 0 otherwise
replace low_bmiage_teen = . if z_bmi==.
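
/* Added sketch (not part of the original OPHI code): why the replace above
   is needed. Stata treats a numeric missing value as larger than any number,
   so the expression (z_bmi < -2.0) evaluates to 0, not missing, when z_bmi
   is missing. The asserts below only read the data; they stop the do-file if
   the indicator is not coded 0/1/missing as intended. */
display (. < -2.0)    //displays 0
assert low_bmiage_teen==0 | low_bmiage_teen==1 | low_bmiage_teen==.
assert low_bmiage_teen==. if z_bmi==.
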
lab var low_bmiage_teen "Teenage low bmi 2sd - WHO"

//Save a temp file for merging with PR:
keep w4_hhid pid w4_c_outcome ind_id teen_data age_month low_bmiage*
order w4_hhid pid w4_c_outcome ind_id teen_data age_month low_bmiage*

//Erase files from folder:
erase "$path_out/teen_nutri_zaf_z.xls"
erase "$path_out/teen_nutri_zaf_prev.xls"
erase "$path_out/teen_nutri_zaf_z.dta"

sort ind_id
save "$path_out/ZAF14-15_teen.dta", replace

********************************************************************************
*** Step 1.3 Adult's questionnaire
/*We will use the Adult Questionnaire to calculate bmi-for-age for young
adults (15 - 19 years)*/
********************************************************************************

use "$path_in/Adult_W4_Anon_V1.1.dta", clear
sort w4_hhid
merge m:1 w4_hhid using "$path_in/hhderived_W4_Anon_V1.1.dta"
tab _merge
drop _merge

*** Generate individual unique key variable required for data merging
*** w4_hhid=household number;
*** pid=individual's line number in household
gen double ind_id = w4_hhid*10000000 + pid
format ind_id %20.0g
label var ind_id "Individual ID"

tab w4_a_outcome
    //We will only retain adults who were successfully interviewed in Wave 4
drop if w4_a_outcome!=1
    //4066 Observations deleted

duplicates report ind_id
    // no duplicates

*** Variables required to calculate the z-scores to produce BMI-for-age:

*** Variable: AGE IN MONTHS ***
codebook w4_a_dob_y w4_a_dob_m, tab(99)
gen w4_a_dob = ym(w4_a_dob_y, w4_a_dob_m) if w4_a_dob_y<8888 & w4_a_dob_m<20
gen w4_a_intrv = ym(w4_a_intrv_y, w4_a_intrv_m)
format w4_a_intrv %tm
format w4_a_dob %tm
gen age_month = (w4_a_intrv - w4_a_dob)
gen temp=age_month/12
gen age_year=round(temp)
drop temp
sum age_year, det
    //Some people are over 100 years old

*** Variable: SEX ***
clonevar gender=w4_a_gen

*** Variable: HEIGHT (CENTIMETERS) ***
codebook w4_a_height_*, tab(999)
clonevar temp1=w4_a_height_1
clonevar temp2=w4_a_height_2
clonevar temp3=w4_a_height_3
replace temp1 = . if w4_a_height_1<0
replace temp2 = . if w4_a_height_2<0
replace temp3 = . if w4_a_height_3<0
gen diff_height = temp1-temp2
egen height = rowmean(temp1 temp2) if diff_height>=-1 & diff_height<=1
replace height = temp3 if (diff_height<-1 | diff_height>1) & temp3!=.
desc height
summ height
drop temp* diff*

*** Variable: BODY WEIGHT (KILOGRAMS) ***
codebook w4_a_weight_*, tab(999)
clonevar temp1=w4_a_weight_1
clonevar temp2=w4_a_weight_2
clonevar temp3=w4_a_weight_3
replace temp1 = . if w4_a_weight_1<0
replace temp2 = . if w4_a_weight_2<0
replace temp3 = . if w4_a_weight_3<0
gen diff_weight = temp1-temp2
egen weight1 = rowmean(temp1 temp2) if diff_weight>=-1 & diff_weight<=1
replace weight1 = temp3 if (diff_weight<-1 | diff_weight>1) & temp3!=.
desc weight1
summ weight1
drop temp* diff*

*** Variable: OEDEMA ***
gen oedema = "n"

*** Variable: AGE UNIT ***
gen str6 ageunit = "months"
lab var ageunit "months"

*** Variable: SAMPLING WEIGHT ***
sort w4_hhid
gen sw = w4_wgt
desc sw
summ sw

/* For this part of the do-file we use the WHO AnthroPlus software. This is to
calculate the z-scores for individuals 15-19 years.
*/

*** Indicate to STATA where the who2007.ado file is stored:
*** Source of ado file: https://www.who.int/growthref/tools/en/
adopath + "$path_ado/who2007_stata"

gen str100 reflib = "$path_ado/who2007_stata"
lab var reflib "Directory of reference tables"

gen str100 datalib = "$path_out"
lab var datalib "Directory for datafiles"

gen str30 datalab = "yadult_nutri_zaf"
lab var datalab "Working file"

//We will only keep individuals between 15-19 to compute BMI-for-age
gen man_1519=(gender==1 & age_year<=19 & age_year>=15)
gen woman_1519=(gender==2 & age_year<=19 & age_year>=15)
gen yadult_data=(man_1519 | woman_1519)
tab yadult_data, miss
keep if yadult_data==1

//We now run the command to calculate the z-scores with the adofile
who2007 reflib datalib datalab gender age_month ageunit weight1 height oedema sw

/*We now turn to using the dta file that was created and that contains
the calculated z-scores to compute BMI-for-age*/
use "$path_out/yadult_nutri_zaf_z.dta", clear

gen z_bmi = _zbfa
replace z_bmi = . if _fbfa==1
lab var z_bmi "z-score bmi-for-age WHO"

*** Standard MPI indicator ***
gen low_bmiage = (z_bmi < -2.0)
    /*Takes value 1 if BMI-for-age is under 2 stdev below the median & 0
    otherwise */
replace low_bmiage = . if z_bmi==.
lab var low_bmiage "Teenage low bmi 2sd - WHO"

//Retain relevant variables:
keep ind_id yadult w4_hhid pid age* gender low_bmiage*
order ind_id yadult w4_hhid pid age* gender low_bmiage*
sort ind_id

//Save a temp file for merging with PR:
save "$path_out/ZAF14-15_yadult.dta", replace

********************************************************************************
*** Step 1.4 Adult's questionnaire
/*We will use the Adult Questionnaire to identify death of children*/
********************************************************************************

use "$path_in/Adult_W4_Anon_V1.1.dta", clear
sort w4_hhid
merge m:1 w4_hhid using "$path_in/hhderived_W4_Anon_V1.1.dta"
tab _merge
drop _merge

*** Generate individual unique key variable required for data merging
*** w4_hhid=household number;
*** pid=individual's line number in household
gen double ind_id = w4_hhid*10000000 + pid
format ind_id %20.0g
label var ind_id "Individual ID"

tab w4_a_outcome
    //We will only retain adults who were successfully interviewed in Wave 4
drop if w4_a_outcome!=1
    //4066 Observations deleted

duplicates report ind_id
    //no duplicates

gen adult_data=1
    //Generate id to identify adults with child mortality information

//Gender
clonevar gender=w4_a_gen

//Date of interview (month and year)
codebook w4_a_intrv_m w4_a_intrv_y, tab(99)
gen date_inter = ym(w4_a_intrv_y, w4_a_intrv_m)

//Date of death
codebook w4_a_bhdod_m*, tab(99)
codebook w4_a_bhdod_y*, tab(99)

forvalues i=1/17 {
    clonevar temp`i'=w4_a_bhdod_m`i'
    replace temp`i'=6 if (w4_a_bhdod_m`i'==.  & w4_a_bhdod_y`i'< 3333) | ///
                         (w4_a_bhdod_m`i'==33 & w4_a_bhdod_y`i'< 3333) | ///
                         (w4_a_bhdod_m`i'==88 & w4_a_bhdod_y`i'< 3333) | ///
                         (w4_a_bhdod_m`i'==99 & w4_a_bhdod_y`i'< 3333)
    /*Note: If we know the year of death of the child but the month is
    missing, then we will assign month of death = 6. */

    replace temp`i'=. if (w4_a_bhdod_y`i'>= 3333)
    /*Note: If the year of death is missing, then the variable takes a
    missing value */

    /*
    replace temp`i'=6 if (w4_a_bhdod_m`i'==.  & w4_a_bhdod_y`i'<=9999) | ///
                         (w4_a_bhdod_m`i'==33 & w4_a_bhdod_y`i'<=9999) | ///
                         (w4_a_bhdod_m`i'==99 & w4_a_bhdod_y`i'<=9999)
    */

    gen date_death`i'=ym(w4_a_bhdod_y`i',temp`i')
    drop temp*
    gen mdead_survey`i'=date_inter-date_death`i'
    gen ydead_survey`i'=mdead_survey`i'/12
    gen child_dead`i'=1 if ydead_survey`i'!=.
        // child died and we know when
    gen child_dead_5y`i'=1 if ydead_survey`i'!=. & ydead_survey`i'<=5
}
*

local aux1 "child_dead1 child_dead2 child_dead3 child_dead4 child_dead5 child_dead6 child_dead7 child_dead8 child_dead9 child_dead10 child_dead11 child_dead12 child_dead13 child_dead14 child_dead15 child_dead16 child_dead17"
egen temp=rowmiss(`aux1')
egen child_died_per_wom=rowtotal(`aux1')
tab temp, miss
    //19,881 adults never reported any child death
    //20,242 adults never reported any child death
tab temp gender, miss
    //Of the reported death, 9,461 are men. Men are never asked the question
replace child_died_per_wom=. if temp==17
replace child_died_per_wom=0 if w4_a_bhdth==2
    //no dead children
lab var child_died_per_wom "Total child death for each woman"
tab child_died_per_wom w4_a_bhdth, miss
drop temp

local aux2 "child_dead_5y1 child_dead_5y2 child_dead_5y3 child_dead_5y4 child_dead_5y5 child_dead_5y6 child_dead_5y7 child_dead_5y8 child_dead_5y9 child_dead_5y10 child_dead_5y11 child_dead_5y12 child_dead_5y13 child_dead_5y14 child_dead_5y15 child_dead_5y16 child_dead_5y17"
egen temp=rowmiss(`aux2')
egen child_died_per_wom_5y=rowtotal(`aux2')
tab temp, miss
    //21,409 adults never reported any child death
replace child_died_per_wom_5y=. if temp==17
replace child_died_per_wom_5y=0 if w4_a_bhdth==2 | ///
    (child_died_per_wom!=0 & child_died_per_wom!=. & child_died_per_wom_5y==.)
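
/* Added sketch (not part of the original OPHI code): how rowmiss() and
   rowtotal() combine in the counts above, shown on three made-up women with
   up to three reported deaths. The names d1, d2, d3, nmiss and total are
   hypothetical. rowtotal() treats missing as 0, so a woman with no recorded
   death date would get a total of 0; the rowmiss() count is what allows such
   rows to be reset to missing. */
preserve
clear
input d1 d2 d3
1 . .
. . .
1 1 .
end
egen nmiss = rowmiss(d1 d2 d3)
egen total = rowtotal(d1 d2 d3)
replace total = . if nmiss==3
assert total==1 in 1
assert total==. in 2
assert total==2 in 3
restore
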
lab var child_died_per_wom_5y "Total child death for each woman in the last 5 years"
drop temp

//Retain relevant variables
keep ind_id w4_hhid pid adult_data child_died_per_wom child_died_per_wom_5y
order ind_id w4_hhid pid adult_data child_died_per_wom child_died_per_wom_5y
sort ind_id

//Erase files from folder:
erase "$path_out/yadult_nutri_zaf_z.xls"
erase "$path_out/yadult_nutri_zaf_prev.xls"
erase "$path_out/yadult_nutri_zaf_z.dta"

save "$path_out/ZAF14-15_mortality.dta", replace

********************************************************************************
*** Step 1.5 Merging of all data
********************************************************************************

use "$path_in/HHQuestionnaire_W4_Anon_V1.1.dta", clear
sort w4_hhid

merge 1:1 w4_hhid using "$path_in/hhderived_W4_Anon_V1.1.dta"
drop _merge
sort w4_hhid        //merge with HH

merge 1:m w4_hhid using "$path_in/HouseholdRoster_W4_Anon_V1.1.dta"
drop _merge
sort w4_hhid pid    //merge with HH

merge 1:1 w4_hhid pid using "$path_in/Adult_W4_Anon_V1.1.dta"
gen w4_adults = 1 if _m==3
drop _merge
sort w4_hhid pid    //merge with adults original

merge 1:1 w4_hhid pid using "$path_in/Child_W4_Anon_V1.1.dta"
gen w4_child=1 if _m==3
drop _merge
sort w4_hhid pid    //merge with children

merge 1:1 w4_hhid pid using "$path_in/Proxy_W4_Anon_V1.1.dta"
drop _merge
sort w4_hhid pid    //merge with proxy adults

merge 1:1 w4_hhid pid using "$path_in/indderived_W4_Anon_V1.1.dta"
drop _merge
sort w4_hhid pid    //merge with individual derived variables

merge 1:1 w4_hhid pid using "$path_out/ZAF14-15_child.dta"
drop _merge
sort w4_hhid pid    //merge with children under 5

merge 1:1 w4_hhid pid using "$path_out/ZAF14-15_teen.dta"
drop _merge
sort w4_hhid pid    //merge with individuals 5-15 years old

merge 1:1 w4_hhid pid using "$path_out/ZAF14-15_yadult.dta"
drop _merge
sort w4_hhid pid    //merge with young adults 15-19 years old

merge 1:1 w4_hhid pid using "$path_out/ZAF14-15_mortality.dta"
drop _merge
sort w4_hhid pid    //merge with data for child mortality

merge m:1 pid using "$path_in/Link_File_W4_Anon_V1.1.dta", keepusing(cluster)
drop if _m==2
drop _merge
sort w4_hhid pid

save "$path_out/full_data_ZAF14-15.dta", replace

********************************************************************************
*** Step 1.6 KEEPING ONLY DE JURE HOUSEHOLD MEMBERS ***
********************************************************************************

use "$path_out/full_data_ZAF14-15.dta", clear

//Permanent (de jure) household members
clonevar resident = w4_r_pres
codebook resident, tab (10)
replace resident = . if w4_r_pres==2 | w4_r_pres==3
    //Not permanent member OR deceased
label var resident "Permanent (de jure) household member"
tab resident, miss

drop if resident!=1
    //7203 observations are deleted

********************************************************************************
*** Step 1.7 RE-GENERATING DEMOGRAPHIC VARIABLES
********************************************************************************

drop ind_id age_year gender
    //delete and re-generate demographic variable

drop if w4_h_outcome!=1
    /*4358 observations deleted. As such we will finally be working with
    37,979 observations of which 1584 are proxy adults only */

//Household ID
clonevar hh_id=w4_hhid

//Individual id
gen double ind_id = w4_hhid*10000000 + pid
format ind_id %20.0g
label var ind_id "Individual ID"
duplicates report ind_id
duplicates tag ind_id, gen(duplicates)

//Age of household member
codebook w4_r_dob*, tab(1000)
gen w4_r_dob = ym(w4_r_dob_y, w4_r_dob_m) if w4_r_dob_y<8888 & w4_r_dob_m<88
gen w4_h_intrv = ym(w4_h_intrv_y, w4_h_intrv_m)
format w4_h_intrv %tm
format w4_r_dob %tm
gen temp_age = (w4_h_intrv - w4_r_dob)/12
gen age = round(temp_age)
replace age=. if age<0

//Age group
recode age (0/4 = 1 "0-4")(5/9 = 2 "5-9")(10/14 = 3 "10-14") ///
           (15/17 = 4 "15-17")(18/59 = 5 "18-59")(60/max=6 "60+"), gen(agec7)
lab var agec7 "age groups (7 groups)"

recode age (0/9 = 1 "0-9") (10/17 = 2 "10-17")(18/59 = 3 "18-59") ///
           (60/max=4 "60+"), gen(agec4)
lab var agec4 "age groups (4 groups)"

//Sex of household member
clonevar sex=w4_r_gen

********************************************************************************
*** Step 1.8 CONTROL VARIABLES
********************************************************************************

*** No Eligible Women
*****************************************
//Eligibility for child mortality indicator as provided by women 15-49 years
gen fem_eligible = (sex==2 & age>=15 & age<=49)
bysort hh_id: egen hh_n_fem_eligible = sum(fem_eligible)
    //Number of eligible women for interview in the hh
gen no_fem_eligible = (hh_n_fem_eligible==0)
    //Takes value 1 if the household had no eligible females for child mortality
lab var no_fem_eligible "Household has no eligible women for child mortality"
tab no_fem_eligible, miss

/*Eligibility for child nutrition indicator as provided by women 15 years and
older */
gen fem_eligible_nutri = (sex==2 & age>=15 & age<.)
bysort hh_id: egen hh_n_fem_eligible_nutri = sum(fem_eligible_nutri)
    //Number of eligible women for interview in the hh
gen no_fem_eligible_nutri = (hh_n_fem_eligible_nutri==0)
    //Takes value 1 if the household had no eligible females for nutrition
lab var no_fem_eligible_nutri "Household has no eligible women for nutrition"
tab no_fem_eligible_nutri, miss

*** No Eligible Men
*****************************************
gen male_eligible = (sex==1 & age>=15 & age!=.)
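
/* Added sketch (not part of the original OPHI code): the household-level
   eligibility pattern used throughout this step, on four made-up people in
   two households. The names hh, fem, n_fem and no_fem are hypothetical.
   egen total() is used here; it is the documented name of the sum() function
   called in the surrounding code. */
preserve
clear
input hh sex age
1 2 30
1 1 12
2 1 40
2 2 60
end
//Individual flag: woman aged 15-49
gen fem = (sex==2 & age>=15 & age<=49)
//Household count of flagged members, repeated on every member's row
bysort hh: egen n_fem = total(fem)
//Household-level indicator: 1 if nobody in the household is eligible
gen no_fem = (n_fem==0)
assert no_fem==0 if hh==1
assert no_fem==1 if hh==2
restore
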
bysort hh_id: egen hh_n_male_eligible = sum(male_eligible) //Number of eligible men for interview in the hh gen no_male_eligible = (hh_n_male_eligible==0) //Takes value 1 if the household had no eligible males for an interview lab var no_male_eligible "Household has no eligible man" tab no_male_eligible, miss *** No Eligible Children 0-5 years ***************************************** gen child_eligible = (age>=0 & age<=5) bysort hh_id: egen hh_n_children_eligible = sum(child_eligible) //Number of eligible children gen no_child_eligible = (hh_n_children_eligible==0) //Takes value 1 if there were no eligible children for anthropometrics lab var no_child_eligible "Household has no children eligible" tab no_child_eligible, miss *** No Eligible Children 5-15 years ***************************************** gen teen_eligible = (age>=5 & age<=15) bysort hh_id: egen hh_n_teen_eligible = sum(teen_eligible) //Number of eligible children 5 -16 years gen no_teen_eligible = (hh_n_teen_eligible==0) //Takes value 1 if there were no eligible children 5-15 years for anthropometrics lab var no_teen_eligible "Household has no children 5-15 years eligible" tab no_teen_eligible, miss *** No Eligible Women and Men *********************************************** gen no_adults_eligible = (no_fem_eligible_nutri==1 & no_male_eligible==1) //Takes value 1 if the household had no eligible men & women for an interview lab var no_adults_eligible "Household has no eligible women or men" tab no_adults_eligible, miss *** No Eligible Children and Women *********************************************** gen no_child_fem_eligible = (no_child_eligible==1 & no_fem_eligible_nutri==1) lab var no_child_fem_eligible "Household has no children or women eligible" tab no_child_fem_eligible, miss *** No Eligible Women, Men or Children *********************************************** gen no_eligibles = (no_fem_eligible_nutri==1 & no_male_eligible==1 & /// no_teen_eligible==1 & no_child_eligible==1) lab var 
no_eligibles "Household has no eligible women, men, or children" tab no_eligibles, miss *** No Eligible Subsample ***************************************** gen no_hem_eligible = . lab var no_hem_eligible "Household has no eligible individuals for hemoglobin measurements" tab no_hem_eligible, miss drop fem_eligible hh_n_fem_eligible male_eligible hh_n_male_eligible /// child_eligible hh_n_children_eligible teen_eligible hh_n_teen_eligible sort hh_id ind_id ******************************************************************************** *** Step 1.9 SUBSAMPLE VARIABLE *** ******************************************************************************** /* In the context of South Africa NIDS 2014-15, height and weight measurements were collected from children and adults in all households. As such there is no presence of subsample */ gen subsample =. label var subsample "Households selected as part of nutrition subsample" tab subsample, miss ******************************************************************************** *** Step 1.10 RENAMING DEMOGRAPHIC VARIABLES *** ******************************************************************************** //Corresponding country and year gen cty = "South Africa" gen ccty = "ZAF" gen year = "2014-2015" gen survey = "NIDS" gen ccnum = 710 //Household sampling weight* gen weight = w4_wgt //Area: urban or rural gen area=1 if w4_geo2011==2 replace area=0 if w4_geo2011==1 | w4_geo2011==3 label define lab_area 1 "urban" 0 "rural" label values area lab_area label var area "Area: urban-rural" tab area, miss //Relationship to the head of household clonevar relationship = w4_r_relhead codebook relationship, tab (30) recode relationship (-9 -3=.) 
(1=1)(3=2)(4/7=3)(8/25=4)(26 30=5) label define lab_rel 1"head" 2"spouse" 3"child" 4"extended family" 5"not related" 6"maid" label values relationship lab_rel label var relationship "Relationship to the head of household" tab w4_r_relhead relationship, miss //Marital status of household member gen marital = . label var marital "Marital status of household member" //Total number of de jure hh members in the household gen member = 1 bysort hh_id: egen hhsize = sum(member) label var hhsize "Household size" tab hhsize, miss drop member //Subnational region /*NOTE: The NIDS survey is only nationally representative, not subnationally */ gen region= . lab var region "Region for subnational decomposition" ******************************************************************************** *** Step 2 Data preparation *** *** Standardization of the 10 Global MPI indicators *** Identification of non-deprived & deprived individuals ******************************************************************************** ******************************************************************************** *** Step 2.1 Years of Schooling *** ******************************************************************************** codebook w4_a_edschgrd, tab(99) //variable only for adults 15+ codebook w4_c_edcmpgrd, tab(99) //variable only for children younger than 15 tab w4_c_edcmpgrd if age==15 tab w4_a_edschgrd if age==15 & w4_c_edcmpgrd!=. /*There are 467 children aged exactly 15 with real data. All of them have missing values in the variable for 15+. Thus, we use the info provided in var w4_c_edcmpgrd */ codebook w4_p_edschgrd, tab(99) //variable only for proxy adults gen temp_edu_c=w4_c_edcmpgrd if w4_c_edcmpgrd>=0 & w4_c_edcmpgrd<24 //educ for <15 years old /*We have adjusted so that not yet completed a grade (=26) is ==0 (only 3 obs) */ replace temp_edu_c=0 if w4_c_edcmpgrd==24 | w4_c_edcmpgrd==26 //Other or Not yet completed a grade replace temp_edu_c=. 
	if w4_c_edcmpgrd<0
	//Set to missing

gen temp_edu_a=w4_a_edschgrd if w4_a_edschgrd>=0 & w4_a_edschgrd<24
	//Educ for adults
replace temp_edu_a=0 if w4_a_edschgrd==24 | w4_a_edschgrd==25
	//Other, No schooling
replace temp_edu_a=. if w4_a_edschgrd<0
	//Set to missing

gen temp_edu_p=w4_p_edschgrd if w4_p_edschgrd>=0 & w4_p_edschgrd<24
	//Educ for proxy adults
replace temp_edu_p=0 if w4_p_edschgrd==24 | w4_p_edschgrd==25
	//Other, No schooling
replace temp_edu_p=. if w4_p_edschgrd<0
	//Set to missing

gen aux1=(temp_edu_a!=.) if age>=15
gen aux2=(temp_edu_p!=.) if age>=15
tab aux1 aux2, miss
	//We will take education data from 1420 proxy observations

gen eduyears=.
replace eduyears=temp_edu_c if age<=15
replace eduyears=temp_edu_a if age>=15 & temp_edu_a!=.
replace eduyears=temp_edu_p if age>=15 & eduyears==. & temp_edu_p!=.

replace eduyears = . if eduyears>30
	//Recode any unreasonable years of highest education as missing value
replace eduyears = . if eduyears>=age & age>0
	/*The variable "eduyears" is replaced with a '.' if total years of
	education is equal to or greater than the individual's age */
replace eduyears = 0 if age < 10
	/*The variable "eduyears" is replaced with a '0' given that the criterion
	for this indicator is household members aged 10 years or older */


	/*A control variable is created on whether there is information on
	years of education for at least 2/3 of the household members aged 10 years
	and older */
gen temp = 1 if eduyears!=. & age>=10 & age!=.
bysort hh_id: egen no_missing_edu = sum(temp)
	/*Total household members who are 10 years and older with no missing
	years of education */
gen temp2 = 1 if age>=10 & age!=.
bysort hh_id: egen hhs = sum(temp2)
	//Total number of household members who are 10 years and older
replace no_missing_edu = no_missing_edu/hhs
replace no_missing_edu = (no_missing_edu>=2/3)
	/*Identify whether there is information on years of education for at
	least 2/3 of the household members aged 10 years and older */
tab no_missing_edu, miss
	//Values for 0 are less than 1%
label var no_missing_edu "No missing edu for at least 2/3 of the HH members aged 10 years & older"
drop temp temp2 hhs


/*The entire household is considered deprived if no household member aged
10 years or older has completed SIX years of schooling. */
gen years_edu6 = (eduyears>=6)
	/* The years of schooling indicator takes a value of "1" if at least
	someone in the hh has reported 6 years of education or more */
replace years_edu6 = . if eduyears==.
bysort hh_id: egen hh_years_edu6_1 = max(years_edu6)
gen hh_years_edu6 = (hh_years_edu6_1==1)
replace hh_years_edu6 = . if hh_years_edu6_1==.
replace hh_years_edu6 = . if hh_years_edu6==0 & no_missing_edu==0
	//Final variable missing if household has info for < 2/3 of members
lab var hh_years_edu6 "Household has at least one member with 6 years of edu"
tab hh_years_edu6, miss


********************************************************************************
*** Step 2.2 Child School Attendance ***
********************************************************************************

/*NOTE: Since the survey was conducted between Oct-Dec 2014 and Jan-Aug 2015,
there are two questions that should be used:
	(i) for those interviewed in 2014: w4_c_ed14cur
	(ii) for those interviewed in 2015: w4_c_ed15cur
In addition, school attendance for children aged 7 years and younger was
captured in a separate variable:
	(iii) w4_c_edcurgrd
Also, there are children aged 15 years (so within the school age interval)
that were asked about school attendance in the adults' questionnaire and have
no data on the children's questionnaire. Their info was also considered.
	(iv) for those interviewed in 2014: w4_a_ed14cur
	(v) for those interviewed in 2015: w4_a_ed15cur
*/

tab age w4_c_ed14cur if age>=7 & age<=15, miss
	//w4_c_ed14cur: Currently (2014) enrolled in school?
tab age w4_c_ed15cur if age>=7 & age<=15 , miss
	//w4_c_ed15cur: Currently (2015) enrolled in school?
tab w4_c_ed14cur w4_c_ed15cur if age>=7 & age<=15, miss
	/*Cross tab between those who were interviewed between Oct-Dec 2014
	versus those interviewed between Jan-Aug 2015 */

gen attendance = 1 if w4_c_ed14cur==1 | w4_c_ed15cur==1
	/*Attendance variable takes a value of '1' if children are currently
	attending school*/
replace attendance = 0 if w4_c_ed14cur==2 | w4_c_ed15cur==2
	/*Attendance indicator replaced with a value of '0' if children were
	not attending school when the interview was carried out*/
replace attendance = 1 if w4_c_edcurgrd == 1 & attendance==.
	/*Attendance variable takes a value of '1' if school-age children are
	currently attending primary school as captured by the 'w4_c_edcurgrd'
	variable.*/
replace attendance = 0 if (w4_c_edcurgrd>1 & w4_c_edcurgrd<=7) & attendance==.
	/*Attendance variable takes a value of '0' if children who responded to
	the 'w4_c_edcurgrd' variable are not attending formal school. Note that
	the final school attendance indicator for the global MPI of South Africa
	is focused on children aged 7-15 years. Hence, the interest is in formal
	schooling. However, we recognise that not all children recorded as 7 years
	old should be in formal schooling: some may still be just under 7 years.
	This is caused by the age variable, which we constructed and rounded from
	the difference between date of birth and date of interview. This is
	addressed in the following section when we construct the eligible
	'child_schoolage' variable. */

replace attendance = 1 if w4_a_ed14cur == 1 & attendance==. & age==15
replace attendance = 0 if w4_a_ed14cur == 2 & attendance==. ///
	& age==15
	/*For 15 year olds whose interview was done between Oct-Dec 2014 and
	data collected from the adult questionnaire */
replace attendance = 1 if w4_a_ed15cur == 1 & attendance==. & age==15
replace attendance = 0 if w4_a_ed15cur == 2 & attendance==. & age==15
	/*For 15 year olds whose interview was done between Jan-Aug 2015 and
	data collected from the adult questionnaire */

label define attendance 0 "no" 1 "yes"
label values attendance attendance
tab attendance if age>=7 & age<=15, miss
tab attendance if temp_age>=7 & temp_age<=15, miss


/*The entire household is considered deprived if any school-aged child is not
attending school up to class 8. */
gen child_schoolage = (temp_age>=7 & temp_age<=15)
	/*Note: In South Africa, the official school entrance age is 7 years.
	So, age range is 7-15 (=7+8). We constructed this indicator using the age
	variable that was constructed from the difference between date of birth
	and date of interview. If we use an age variable that was rounded, we tend
	to overestimate the number of young children just under the age of 7 years
	as not attending formal schooling. These children are still in informal
	schooling because they are a few months short of the age of 7 years. To
	prevent this error, we have opted to use age that was not rounded. */


	/*A control variable is created on whether there is information on
	school attendance for at least 2/3 of the school age children */
count if child_schoolage==1 & attendance==.
	/*Number of eligible school-aged children with missing information on
	school attendance */
gen temp = 1 if child_schoolage==1 & attendance!=.
	/*Generate a variable that flags eligible school-aged children with
	non-missing information on school attendance */
bysort hh_id: egen no_missing_atten = sum(temp)
	/*Total school age children with no missing information on school
	attendance */
gen temp2 = 1 if child_schoolage==1
bysort hh_id: egen hhs = sum(temp2)
	//Total number of household members who are of school age
replace no_missing_atten = no_missing_atten/hhs
replace no_missing_atten = (no_missing_atten>=2/3)
	/*Identify whether there is information on school attendance for at
	least 2/3 of the school age children */
tab no_missing_atten, miss
label var no_missing_atten "No missing school attendance for at least 2/3 of the school aged children"
drop temp temp2 hhs

bysort hh_id: egen hh_children_schoolage = sum(child_schoolage)
replace hh_children_schoolage = (hh_children_schoolage>0)
	//Control variable:
	//It takes value 1 if the household has children in school age
lab var hh_children_schoolage "Household has children in school age"

gen child_not_atten = (attendance==0) if child_schoolage==1
replace child_not_atten = . if attendance==. & child_schoolage==1
bysort hh_id: egen any_child_not_atten = max(child_not_atten)
gen hh_child_atten = (any_child_not_atten==0)
replace hh_child_atten = . if any_child_not_atten==.
replace hh_child_atten = 1 if hh_children_schoolage==0
replace hh_child_atten = . if hh_child_atten==1 & no_missing_atten==0
	/*If the household has been initially identified as non-deprived, but
	has missing school attendance for more than 1/3 of the school aged
	children, then we replace this household with a value of '.' because
	there is insufficient information to conclude that the household is
	not deprived */
lab var hh_child_atten "Household has all school age children up to class 8 in school"
tab hh_child_atten, miss

/*Note: The indicator takes value 1 if ALL children in school age are attending
school and 0 if there is at least one child not attending.
Households with no children receive a value of 1 as non-deprived. The indicator
has a missing value only when there are all missing values on children
attendance in households that have children in school age. */


********************************************************************************
*** Step 2.3 Nutrition ***
********************************************************************************

********************************************************************************
*** Step 2.3a Adult Nutrition ***
********************************************************************************

/*In the context of South Africa NIDS 2014-15, BMI Indicator for individuals
aged 20 years and older*/

foreach var in w4_a_height_1 w4_a_height_2 w4_a_height_3 ///
	w4_a_weight_1 w4_a_weight_2 w4_a_weight_3 {
	sum `var'
	replace `var'=. if `var'<0
	sum `var'
}

* Height
gen diff_a_h = w4_a_height_1 - w4_a_height_2
egen a_height = rowmean(w4_a_height_1 w4_a_height_2) if diff_a_h>=-1 & diff_a_h<=1
replace a_height = w4_a_height_3 if (diff_a_h<-1 | diff_a_h>1) & w4_a_height_3!=.

* Weight
gen diff_a_w = w4_a_weight_1 - w4_a_weight_2
egen a_weight = rowmean(w4_a_weight_1 w4_a_weight_2) if diff_a_w>=-1 & diff_a_w<=1
replace a_weight = w4_a_weight_3 if (diff_a_w<-1 | diff_a_w>1) & w4_a_weight_3!=.

gen a_bmi = a_weight/((a_height/100)^2)
replace a_bmi = . if age<15
	//We will use all the information to compute the old MPI
lab var a_bmi "Adult Body Mass Index"
sum a_bmi, det

count if age>=15 & age!=.
	//25,264 individuals aged 15 years and older
count if a_bmi==. & (age>=15 & age!=.)
	//We don't have BMI info for 2,943 adults aged 15 years and older


*** BMI-for-age for individuals 15-19 years & BMI for individuals 20+ years ***
*******************************************************************
gen low_bmi = (a_bmi<18.5) if a_bmi!=.
lab var low_bmi "BMI < 18.5"

bysort hh_id: egen hh_low_bmi = max(low_bmi)
gen hh_no_low_bmi = (hh_low_bmi==0) if hh_low_bmi!=.
replace hh_no_low_bmi=1 if no_adults_eligible==1 & hh_no_low_bmi==.
tab hh_no_low_bmi, miss
	/*In the context of South Africa NIDS 2014-15, nutrition information is
	from men and women */

gen low_bmi_byage = 0
lab var low_bmi_byage "Individuals with low BMI or BMI-for-age"
replace low_bmi_byage = 1 if low_bmi==1
	//Replace variable "low_bmi_byage = 1" if eligible women & men have low BMI

/*Note: The following command replaces BMI with BMI-for-age for those between
the age group of 15-19 by their age in months where information is available */
replace low_bmi_byage = 1 if low_bmiage==1 & (age>=15 & age<=19)
	//Replace variable "low_bmi_byage = 1" if eligible young adults have low BMI
replace low_bmi_byage = 0 if low_bmiage==0 & (age>=15 & age<=19)
	/*Replace variable "low_bmi_byage = 0" if eligible young adults are
	identified as having low BMI but normal BMI-for-age */

/*Note: The following control variable is applied when there is BMI
information for women and men, as well as BMI-for-age for teenagers */
replace low_bmi_byage = . if low_bmi==. & low_bmiage==.

bysort hh_id: egen temp = max(low_bmi_byage)
gen hh_no_low_bmiage = (temp==0)
	/*Households take a value of '1' if all eligible adults and teenagers in
	the household have normal BMI or BMI-for-age */
replace hh_no_low_bmiage = . if temp==.
	/*Households take a value of '.' if there is no information from
	eligible individuals in the household */
replace hh_no_low_bmiage = 1 if no_adults_eligible==1
	/*Households take a value of '1' if there is no eligible population.
	In cases where there is BMI & BMI-for-age information from women and men,
	then activate the following command */
drop temp
lab var hh_no_low_bmiage "Household has no adult with low BMI or BMI-for-age"
tab hh_no_low_bmi, miss
tab hh_no_low_bmiage, miss

/*NOTE that hh_no_low_bmi takes value 1 if: (a) no eligible adult in the
household has (observed) low BMI or (b) there are no eligible adults in the
household.
One has to check and adjust the do-file so all people who are eligible and/or
measured are included. It is particularly important to check whether males are
measured and which age groups among males and females. The variable takes
value 0 for those households that have at least one adult with observed low
BMI. The variable has a missing value only when there is missing info on BMI
for ALL eligible adults in the household */


********************************************************************************
*** Step 2.3b Child Nutrition (5 - 15 years) ***
********************************************************************************

*** Child 5 - 15 BMI-for-age Indicator ***
************************************************************************
bysort hh_id: egen temp = max(low_bmiage_teen)
gen hh_no_low_bmiage_teen = (temp==0)
	//Takes value 1 if no child 5-15 in the hh has low BMI-for-age
replace hh_no_low_bmiage_teen = . if temp==.
replace hh_no_low_bmiage_teen = 1 if no_teen_eligible==1
	//Households with no eligible children will receive a value of 1
lab var hh_no_low_bmiage_teen "Household has no child 5-15 years with low BMI-for-age"
drop temp
tab hh_no_low_bmiage_teen, miss


********************************************************************************
*** Step 2.3c Child Nutrition (0 - 5 years) ***
********************************************************************************

*** Child Underweight Indicator ***
************************************************************************
bysort hh_id: egen temp = max(underweight)
gen hh_no_underweight = (temp==0)
	//Takes value 1 if no child in the hh is underweight
replace hh_no_underweight = . if temp==.
replace hh_no_underweight = 1 if no_child_eligible==1
	//Households with no eligible children will receive a value of 1
lab var hh_no_underweight "Household has no child underweight - 2 stdev"
drop temp


*** Child Stunting Indicator ***
************************************************************************
bysort hh_id: egen temp = max(stunting)
gen hh_no_stunting = (temp==0)
	//Takes value 1 if no child in the hh is stunted
replace hh_no_stunting = . if temp==.
replace hh_no_stunting = 1 if no_child_eligible==1
	//Households with no eligible children will receive a value of 1
lab var hh_no_stunting "Household has no child stunted - 2 stdev"
drop temp


*** Child Either Stunted or Underweight Indicator ***
************************************************************************
gen uw_st = 1 if stunting==1 | underweight==1
replace uw_st = 0 if stunting==0 & underweight==0
replace uw_st = . if stunting==. & underweight==.

bysort hh_id: egen temp = max(uw_st)
gen hh_no_uw_st = (temp==0)
	//Takes value 1 if no child in the hh is underweight or stunted
replace hh_no_uw_st = . if temp==.
replace hh_no_uw_st = 1 if no_child_eligible==1
	//Households with no eligible children will receive a value of 1
lab var hh_no_uw_st "Household has no child underweight or stunted"
drop temp


********************************************************************************
*** Step 2.3d Household Nutrition Indicator ***
********************************************************************************

/* The indicator takes value 1 if there is no low BMI among adults, no low
BMI-for-age among children 5-15 years and no children under 5 stunted or
underweight */
************************************************************************
gen hh_nutrition_uw_st = 1
replace hh_nutrition_uw_st = 0 if hh_no_low_bmiage==0 | hh_no_low_bmiage_teen==0 | hh_no_uw_st==0
replace hh_nutrition_uw_st = . if hh_no_low_bmiage==. & hh_no_low_bmiage_teen==. & hh_no_uw_st==.
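
/* Optional consistency check (a sketch added for illustration; it is not part
of the original OPHI routine). Given how hh_nutrition_uw_st is built above, a
household flagged as deprived in any of the three components should have
hh_nutrition_uw_st==0, and the indicator should be missing when all three
components are missing. Each assert stops the do-file if its condition fails. */
assert hh_nutrition_uw_st==0 if hh_no_low_bmiage==0 | ///
	hh_no_low_bmiage_teen==0 | hh_no_uw_st==0
assert hh_nutrition_uw_st==. if hh_no_low_bmiage==. & ///
	hh_no_low_bmiage_teen==. & hh_no_uw_st==.
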
lab var hh_nutrition_uw_st "Household has no child underweight/stunted or adult deprived by BMI/BMI-for-age"


********************************************************************************
*** Step 2.4 Child Mortality ***
********************************************************************************

/*In the context of South Africa NIDS 2014-15, women aged 15 years and older
responded to the question on child mortality. Men do not answer on this issue.
However, to be consistent with the global MPI work, we only use information
from women 15-49 */

tab child_died_per_wom sex, miss
tab age if child_died_per_wom!=. & sex==2
	//Almost 65% of women that answered are aged 15-49
	//There is one man with info, which has been replaced as missing
replace child_died_per_wom=. if sex!=2
replace child_died_per_wom=. if age<15 | age>49
replace child_died_per_wom=0 if w4_a_bhbrth==2

//Total child mortality reported by women
bysort hh_id: egen child_mortality = sum(child_died_per_wom), missing
lab var child_mortality "Total child mortality within household reported by women"
tab child_mortality, miss


/* Deprived if any children died in the household */
************************************************************************
gen hh_mortality = (child_mortality==0)
	/*Household takes a value of "1" if there is no incidence of child
	mortality*/
replace hh_mortality = . if child_mortality==.
replace hh_mortality = 1 if no_fem_eligible==1
	/*Households with no eligible women aged 15-49 are assigned a value
	of '1' */
lab var hh_mortality "Household had no child mortality"
tab hh_mortality, miss


/*Deprived if any children died in the household in the last 5 years from the
survey year */
************************************************************************
/* The new standard MPI indicator takes a value of "1" if eligible women within
the household reported no child mortality or if any child died longer than
5 years from the survey year.
The indicator takes a value of "0" if women in the household reported any
child mortality in the last 5 years from the survey year. The indicator takes
a missing value if there was missing information on reported death from
eligible individuals. */

tab child_died_per_wom_5y sex, miss
tab age if child_died_per_wom_5y!=. & sex==2
	//Almost 65% of women that answered are aged 15-49
	//There is one man with info, which has been replaced as missing
replace child_died_per_wom_5y=. if sex!=2
replace child_died_per_wom_5y=. if age<15 | age>49
replace child_died_per_wom_5y = 0 if w4_a_bhbrth==2
	/*Assign a value of "0" for:
	- all eligible women who never gave birth */
replace child_died_per_wom_5y = 0 if no_fem_eligible==1
	/*Assign a value of "0" for:
	- individuals living in households that have no eligible women */

bysort hh_id: egen child_mortality_5y = sum(child_died_per_wom_5y), missing
replace child_mortality_5y = 0 if child_mortality_5y==. & child_mortality==0
	/*Replace all households as 0 deaths if women have missing values and men
	reported no death in those households. However, in the context of South
	Africa NIDS, there were zero changes given that child mortality
	information was not collected from men. */
label var child_mortality_5y "Total child mortality within household past 5 years reported by women"
tab child_mortality_5y, miss

gen hh_mortality_5y = (child_mortality_5y==0)
replace hh_mortality_5y = . if child_mortality_5y==.
tab hh_mortality_5y, miss
lab var hh_mortality_5y "Household had no child mortality in the last 5 years"


********************************************************************************
*** Step 2.5 Electricity ***
********************************************************************************

/*Members of the household are considered deprived if the household has no
electricity */

lookfor electricity
codebook w4_h_enrgelec, tab(10)
gen electricity = (w4_h_enrgelec==1) if w4_h_enrglght!=.
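
/* Optional check (a sketch added for illustration; it is not part of the
original OPHI routine). The electricity indicator is coded from w4_h_enrgelec
but is set to missing based on w4_h_enrglght. The cross-tabulation shows
whether the two variables are missing for the same households. The count shows
how many non-response codes in w4_h_enrgelec (assumed here to be negative, as
for other NIDS variables in this file) end up coded as "no electricity". */
tab w4_h_enrgelec w4_h_enrglght, miss
count if w4_h_enrgelec<0 & electricity==0
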
lab define lab_yes_no 0 "no" 1 "yes"
lab values electricity lab_yes_no
label var electricity "Household has electricity"
codebook electricity, tab (10)


********************************************************************************
*** Step 2.6 Sanitation ***
********************************************************************************

/*Members of the household are considered deprived if the household's
sanitation facility is not improved, according to MDG guidelines, or it is
improved but shared with other households. */

lookfor toilet
codebook w4_h_toi, tab(99)
clonevar toilet=w4_h_toi

//Shared toilet
gen shared_toilet = .
replace shared_toilet = 1 if w4_h_toishr==1
replace shared_toilet = 0 if w4_h_toishr==2

gen toilet_mdg = (toilet<5 & shared_toilet!=1)
	//Other is considered as non-improved
replace toilet_mdg = 0 if toilet<5 & shared_toilet==1
replace toilet_mdg = . if toilet==. | toilet==99
lab var toilet_mdg "Household has improved sanitation with MDG Standards"
tab toilet toilet_mdg, miss


********************************************************************************
*** Step 2.7 Drinking Water ***
********************************************************************************

/*Members of the household are considered deprived if the household does not
have access to safe drinking water according to MDG guidelines, or safe
drinking water is more than a 30-minute walk from home roundtrip.*/

lookfor water
codebook w4_h_watsrc, tab(99)
clonevar water=w4_h_watsrc
replace water = . if water<0
gen ndwater = .
	//No information on non-drinking water

codebook w4_h_watdis , tab(99)
	//time to water
gen timetowater = (w4_h_watdis>=1 & w4_h_watdis<=4)
replace timetowater=. if w4_h_watdis<0
replace timetowater=1 if w4_h_watdis==. ///
	& (water==1 | water==2 | water==5 | water==7)
replace timetowater=0 if w4_h_watdis==5

gen water_mdg = 1 if water==1 | water==2 | water==3 | water==5 | water==6 | water==7
	/*Non deprived if water is "piped into dwelling", "piped to yard/plot",
	"public tap/standpipe", "tube well or borehole", "protected well",
	"protected spring", "rainwater", "bottled water" */
replace water_mdg = 0 if water==4 | water==8 | water==9 | water==10 | water==11 | water==12
	/*Deprived if it is "unprotected well", "unprotected spring",
	"tanker truck", "surface water (river/lake, etc)",
	"cart with small tank", "other" */
replace water_mdg = 0 if water_mdg==1 & timetowater==0
	//Deprived if water is at more than 1 km walk (roundtrip)
	//category -3: 'Missing'
replace water_mdg = . if water==. | water==-3
lab var water_mdg "Household has drinking water with MDG standards (considering distance)"
tab water water_mdg, miss


********************************************************************************
*** Step 2.8 Housing ***
********************************************************************************

/* Members of the household are considered deprived if the household has a
dirt, sand or dung floor */
clonevar floor = w4_h_dwlmatflr
replace floor = . if floor<0
codebook floor, tab(99)
gen floor_imp = 1
replace floor_imp = 0 if (floor==1)
replace floor_imp = . if floor==.
lab var floor_imp "Household has floor that is not earth/sand/dung"
tab floor floor_imp, miss


/* Members of the household are considered deprived if the household has walls
made of natural or rudimentary materials */
clonevar wall = w4_h_dwlmatrwll
replace wall=. if wall<0
codebook wall, tab(99)
gen wall_imp = 0
replace wall_imp = 1 if wall<=4 | wall==7 | wall==9 | wall==10 | wall>=12
replace wall_imp = . if wall==.
lab var wall_imp "Household has wall that is not of low quality materials"
tab wall wall_imp, miss


/* Members of the household are considered deprived if the household has a
roof made of natural or rudimentary materials */
clonevar roof = w4_h_dwlmatroof
replace roof=. if roof<0
codebook roof, tab(99)
gen roof_imp = 0
replace roof_imp = 1 if roof<=4 | roof==7 | roof==9 | roof==10 | roof>=12
replace roof_imp = . if roof==.
lab var roof_imp "Household has roof that is not of low quality materials"
tab roof roof_imp, miss


/*Household is deprived in housing if the roof, floor OR walls use low quality
materials.*/
gen housing_1 = 1
replace housing_1 = 0 if floor_imp==0 | wall_imp==0 | roof_imp==0
replace housing_1 = . if floor_imp==. & wall_imp==. & roof_imp==.
lab var housing_1 "Household has roof, floor & walls that are not low quality material"
tab housing_1, miss


********************************************************************************
*** Step 2.9 Cooking Fuel ***
********************************************************************************

/* Members of the household are considered deprived if the household cooks
with solid fuels: wood, charcoal, crop residues or dung.
"Indicators for Monitoring the Millennium Development Goals", p. 63 */

lookfor cooking
clonevar cookingfuel = w4_h_enrgck
replace cookingfuel = . if cookingfuel<0
codebook cookingfuel, tab(99)

gen cooking_mdg = 1
replace cooking_mdg = 0 if cookingfuel>=5 & cookingfuel<=8
replace cooking_mdg = . if cookingfuel==.
lab var cooking_mdg "Household has cooking fuel by MDG standards"
	/*Deprived if: "coal/lignite", "charcoal", "wood", "straw/shrubs/grass",
	"agricultural crop", "animal dung" */
tab cookingfuel cooking_mdg, miss


********************************************************************************
*** Step 2.10 Assets ownership ***
********************************************************************************

lookfor car
gen car = .
replace car = 0 if w4_h_ownvehpri==2
replace car = 1 if w4_h_ownvehpri==1
tab car, miss

gen motorbike = .
replace motorbike = 0 if w4_h_ownmot==2
replace motorbike = 1 if w4_h_ownmot==1
tab motorbike, miss

gen bicycle = .
replace bicycle = 0 if w4_h_ownbic==2
replace bicycle = 1 if w4_h_ownbic==1
tab bicycle, miss

gen television = .
replace television = 0 if w4_h_owntel==2
replace television = 1 if w4_h_owntel==1
tab television, miss
gen bw_television=.

gen telephone = .
replace telephone = 0 if w4_h_tellnd==2 | w4_h_tellnd==3
replace telephone = 1 if w4_h_tellnd==1
tab telephone, miss

gen mobiletelephone = .
replace mobiletelephone = 0 if w4_h_telcel==2
replace mobiletelephone = 1 if w4_h_telcel==1
tab mobiletelephone, miss

gen refrigerator = .
replace refrigerator = 0 if w4_h_ownfrg==2
replace refrigerator = 1 if w4_h_ownfrg==1
tab refrigerator, miss

gen radio = .
replace radio = 0 if w4_h_ownrad==2 & w4_h_ownhif==2
replace radio = 1 if w4_h_ownrad==1 | w4_h_ownhif==1
tab radio, miss

lookfor computer
gen computer=.
replace computer=0 if w4_h_owncom==2
replace computer=1 if w4_h_owncom==1

lookfor cart
gen animal_cart=.
replace animal_cart=1 if w4_h_owncrt==1
replace animal_cart=0 if w4_h_owncrt==2


//Combine information on telephone and mobiletelephone
replace telephone=1 if telephone==0 & mobiletelephone==1
replace telephone=1 if telephone==. & mobiletelephone==1


/* Members of the household are considered deprived in assets if the household
does not own more than one of: radio, TV, telephone, bike, motorbike,
refrigerator, computer or animal_cart and does not own a car or truck.*/
egen n_small_assets2 = rowtotal(television radio telephone refrigerator bicycle motorbike computer animal_cart), missing
lab var n_small_assets2 "Household Number of Small Assets Owned"

gen hh_assets2 = (car==1 | n_small_assets2 > 1)
replace hh_assets2 = . if car==. & n_small_assets2==.
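
/* Optional check (a sketch added for illustration; it is not part of the
original OPHI routine). Stata treats a missing value as larger than any
number, so (n_small_assets2 > 1) is true when n_small_assets2 is missing, and
(car==1) is false when car is missing. The counts below show how many
observations are classified through one of these two edge cases. */
tab n_small_assets2 hh_assets2 if car!=1, miss
count if car==. & n_small_assets2<=1 & hh_assets2==0
count if car==0 & n_small_assets2==. & hh_assets2==1
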
lab var hh_assets2 "Household Asset Ownership: HH has car or more than 1 small assets incl computer & animal cart"


********************************************************************************
*** Step 2.11 Rename and keep variables for MPI calculation
********************************************************************************

//Retain data on sampling design:
gen strata = 1
gen psu = 1

//Retain year, month & date of interview:
clonevar year_interview = w4_p_intrv_y
clonevar month_interview = w4_p_intrv_m
clonevar date_interview = w4_p_intrv_d


*** Rename key global MPI indicators for estimation ***
recode hh_mortality_5y    (0=1)(1=0) , gen(d_cm)
recode hh_nutrition_uw_st (0=1)(1=0) , gen(d_nutr)
recode hh_child_atten     (0=1)(1=0) , gen(d_satt)
recode hh_years_edu6      (0=1)(1=0) , gen(d_educ)
recode electricity        (0=1)(1=0) , gen(d_elct)
recode water_mdg          (0=1)(1=0) , gen(d_wtr)
recode toilet_mdg         (0=1)(1=0) , gen(d_sani)
recode housing_1          (0=1)(1=0) , gen(d_hsg)
recode cooking_mdg        (0=1)(1=0) , gen(d_ckfl)
recode hh_assets2         (0=1)(1=0) , gen(d_asst)


*** Keep selected variables for global MPI estimation ***
keep hh_id ind_id ccty ccnum cty survey year subsample ///
	strata psu weight area relationship sex age agec7 agec4 marital hhsize ///
	region year_interview month_interview date_interview ///
	d_cm d_nutr d_satt d_educ d_elct d_wtr d_sani d_hsg d_ckfl d_asst

order hh_id ind_id ccty ccnum cty survey year subsample ///
	strata psu weight area relationship sex age agec7 agec4 marital hhsize ///
	region year_interview month_interview date_interview ///
	d_cm d_nutr d_satt d_educ d_elct d_wtr d_sani d_hsg d_ckfl d_asst


*** Sort, compress and save data for estimation ***
sort ind_id
compress
save "$path_out/zaf_nids14-15_pov.dta", replace
log close


//erase files
erase "$path_out/ZAF14-15_child.dta"
erase "$path_out/ZAF14-15_teen.dta"
erase "$path_out/ZAF14-15_yadult.dta"
erase "$path_out/ZAF14-15_mortality.dta"
erase "$path_out/full_data_ZAF14-15.dta"
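

/* Optional final check (a sketch added for illustration; it is not part of
the original OPHI routine). It reloads the saved file and confirms that each
deprivation indicator only takes the values 0, 1 or missing, and that ind_id
uniquely identifies individuals. Uniqueness of ind_id in the final file is an
assumption; isid stops with an error if it does not hold. The log is already
closed at this point, so results are shown on screen only. */
use "$path_out/zaf_nids14-15_pov.dta", clear
foreach var of varlist d_cm d_nutr d_satt d_educ d_elct d_wtr d_sani d_hsg d_ckfl d_asst {
	assert inlist(`var', 0, 1) | missing(`var')
}
isid ind_id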