---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name:
log: e:\DHS\programs\U5_mortality\child_mort_BD61_no_ci_22Jan2015_log.txt
log type: text
opened on: 22 Jan 2015, 13:02:49
. set more off
.
. /*
> ****************************************************************************************
> PROGRAM TO PRODUCE UNDER 5 MORTALITY RATES FOR SPECIFIC WINDOWS OF TIME, WITH COVARIATES
> ****************************************************************************************
>
> Written by Thomas W. Pullum, tom.pullum@icfi.com, first version September 2011, many subsequent updates
>
> This program has been tested on several reports. For example, for the Ethiopia 2005 survey (ET51),
> it matches exactly the rates in the report on pp. 111 and 113
>
> start_i and end_i are scalars referring to the boundaries of the age intervals.
>
> Add dob (b3) to them to get cmc's of time
>
> The generic chapter number for child mortality is 8,
> but in some reports it is another number, e.g. 9
>
> chap8sr.APP is the generic CSPro program for tables
> See P:\DHS\PROJECTS\UgandaDHS2006\DataProcessing\tables\chapt8
>
> U5mort_C.sps, U5mortDHS_D.sps, and U5mortDHS_E.sps are the generic SPSS programs
>
> The standard periods are 0-4, 5-9, etc. years before the survey, but there are
> exceptions. For example, the periods in the Uganda 2006 report are 1-5, 6-11,
> etc. years before the survey.
>
> This Stata program has similarities to the Stata fertility rates program.
> Children, rather than women, are the units of analysis for the calculations.
> The child MAY have AT MOST ONE event, i.e. death. Therefore, rather than
> poisson regression (glm with poisson error, log link), the statistical
> model is log probability (glm with binomial error, log link).
>
> Another crucial difference is that for births, we are given the month (cmc) when
> the birth occurred, but for child deaths, we are given the age of the child
> (in months) at death. This produces some ambiguity in the month of death.
> This is particularly true for age at death above 23 months. For example, "24"
> means "24-35" and the month of death (dod) could be anywhere from dob+24 to dob+36.
>
> Within a window of time, each child is exposed to intervals of age, in months:
>
> 1 0
> 2 1–2
> 3 3–5
> 4 6–11
> 5 12–23
> 6 24–35
> 7 36–47
> 8 48–59
>
>
> The program agrees exactly with DHS procedures.
> It can use either the IR or the BR file.
>
> The main output files must be renamed or they
> will be over-written the next time the program is run.
>
> Here are the crucial variables, with DHS names
>
> caseid "Case Identification" for mother
> bidx Child id, nested in caseid
> v005 "Sample wt"
> v008 "Date of interview (CMC)" or doi
> b3 "Date of birth (CMC)" for the children
> b7 "Age at death (months-imputed)"; "." if child survived
>
>
> Rates (correctly called conditional probabilities) calculated here:
>
> Neonatal mortality: the probability of dying in the first month of life;
> Postneonatal mortality: the difference between infant and neonatal mortality (similar to the
> probability of dying during months 1 through 11, but different);
> Infant mortality: the probability of dying in the first year of life;
> Child mortality: the probability of dying between the first and fifth birthday;
> Under-five mortality: the probability of dying before the fifth birthday.
>
>
> DHS procedures, from http://www.measuredhs.com/help/Datasets/index.htm:
>
> A. Infant mortality rate:
>
> 1. Calculate the component survival probability by subtracting the component death probability from one.
> 2. Calculate the product of the component survival probabilities for 0, 1–2, 3–5, and 6–11 months of age.
> 3. Subtract the product from 1 and multiply by 1000 to get the infant mortality rate.
>
> Postneonatal mortality rate: subtract the neonatal mortality rate from the infant mortality rate.
>
> B. Child mortality rate:
>
> 1. Calculate the component survival probability by subtracting the component death probability from 1.
> 2. Calculate the product of the component survival probabilities for 12–23, 24–35, 36–47, and 48–59 months of age.
> 3. Subtract the product from 1 and multiply by 1000 to get the child mortality rate.
>
> C. Under-five mortality rate:
>
> 1. Calculate the component survival probability by subtracting the component death probability from 1.
> 2. Calculate the product of the component survival probabilities for 0, 1–2, 3–5, 6–11, 12–23, 24–35, 36–47,
> and 48–59 months of age.
> 3. Subtract the product from 1 and multiply by 1000 to get the under-five mortality rate.
>
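> Worked example (illustrative; the figures are taken from the "All" row listed near the
> end of this log, so small discrepancies reflect rounding to five decimals). The product
> of all eight component survival probabilities factors as (1 - infant)*(1 - child):
>
> infant (1q0) = 0.04250, neonatal = 0.03239
> postneonatal = 0.04250 - 0.03239 = 0.01011 (listed as 0.01012)
> child (4q1) = 0.01147
> under-five (5q0) = 1 - (1 - 0.04250)*(1 - 0.01147) = 0.05348 (listed as 0.05349)
>
> Multiplied by 1000 and rounded, these are approximately 43, 11, and 53 per 1,000.
>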
>
> This routine does not do any re-distribution of deaths to adjust for possible
> age heaping on a boundary such as 12 months.
>
> Dates are in cmc's and intervals are in months.
>
> The version of the program with a covariate works only for categorical covariates.
> The q's for each value of the covariate are written as lines in a separate output file.
>
> The filenames of the saved files will include v000 (e.g. results_TZ5.dta and results_with_ci_TZ5.dta).
>
> GO TO THE REPEATED LINES OF ASTERISKS FOR THE
> BEGINNING OF THE EXECUTABLE STATEMENTS
>
> */
.
. ******************************************************************************
. program define setup
1.
. scalar run_number=0
2.
. * Specification of nageints, lengths of age intervals
.
. /*
> Standard specification but it can be changed:
>
> i start_i end_i length_i
> 1 0 1 1
> 2 1 3 2
> 3 3 6 3
> 4 6 12 6
> 5 12 24 12
> 6 24 36 12
> 7 36 48 12
> 8 48 60 12
>
> */
.
. scalar nageints=8
3.
. scalar length_1=1
4. scalar length_2=2
5. scalar length_3=3
6. scalar length_4=6
7. scalar length_5=12
8. scalar length_6=12
9. scalar length_7=12
10. scalar length_8=12
11.
. scalar start_1=0
12. local i=2
13. while `i'<=nageints+1 {
14. local iminus1=`i'-1
15. scalar start_`i'=start_`iminus1'+length_`iminus1'
16. local i=`i'+1
17. }
18.
. local i=1
19. while `i'<=nageints {
20. local iplus1=`i'+1
21. scalar end_`i'=start_`iplus1'
22. local i=`i'+1
23. }
24.
. use temp.dta, clear
25.
. * The next seven lines are only used if the input file is IR rather than BR.
. * If the input file is BR, then insert /* before and */ after the next seven lines
. * If the input file is IR, then omit /* before and */ after the next seven lines
.
. /*
> drop bidx*
> renpfix b3_0 b3_
> renpfix b7_0 b7_
> reshape long b3_ b7_, i(caseid) j(bidx)
> drop if b3_==.
> rename b3_ b3
> rename b7_ b7
> */
.
. rename v008 doi
26. summarize doi [fweight=v005]
27. scalar doi_mean=r(mean)
28. gen wt=v005/1000000.
29. scalar v000_string=v000[1]
30. save temp2.dta, replace
31. end
.
. ******************************************************************************
.
. program define end_date_start_date
1.
. * This routine calculates the end date and start date for the current window of time
.
. /*
> There are two ways to specify the window of time.
>
> Method 1: as calendar year intervals, e.g. with
>
> scalar lw=1992
> scalar uw=1996
>
> for a window from January 1992 through December 1996, inclusive.
>
> Method 2: as an interval before the date of interview, e.g. with
>
> scalar lw=-2
> scalar uw=0
>
> for a window from 0 to 2 years before the interview, inclusive
> (that is, three years)
>
> The program knows you are using method 2 if the two numbers
> you enter are negative or zero.
>
> lw is the lower end of the window and uw is the upper end.
> (Remember that both are negative or 0.)
>
> start_date is the cmc for the earliest month in the window and
> end_date is the cmc for the latest month in the window
>
> */
.
. * lw and uw will be NEGATIVE OR ZERO for "years ago"
.
. if lw<=0 {
2. gen start_date=doi+12*lw-12
3. gen end_date=doi+12*uw-1
4. }
5.
. if lw>0 {
6. * lw and uw will be >0 for "calendar years"
. gen start_date=12*(lw-1900)+1
7. gen end_date=12*(uw-1900)+12
8. }
9.
. replace end_date=min(end_date,doi-1)
10. gen refcmc=(start_date+end_date)/2
11. summarize refcmc [fweight=v005]
12. scalar refcmc_mean=r(mean)
13. drop refcmc
14.
. end
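.
. /*
> A minimal sketch of Method 2, not part of the original program. The interview
> month (cmc 1342) is a hypothetical value; the formulas are the ones used in
> end_date_start_date above, written with scalars instead of variables.
>
> scalar doi=1342
> scalar lw=-4
> scalar uw=0
> scalar start_date=doi+12*lw-12
> scalar end_date=min(doi+12*uw-1,doi-1)
> display start_date " to " end_date ", " end_date-start_date+1 " months"
>
> With these inputs the window runs from cmc 1282 to cmc 1341, that is, the 60
> months ending with the month before the interview.
> */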
.
. ******************************************************************************
.
. program prepare_child_file
1.
. * This routine calculates dob, doi, dod_1, dod_2, and the start and end months
. * of each age interval for each child.
.
. * This routine calls end_date_start_date
.
. use temp2.dta, clear
2.
. end_date_start_date
3.
. rename b3 dob
4.
. *drop if missing on birthdate
. drop if dob==.
5.
. *drop if dob is later than the end of this time interval
. drop if dob>end_date
6.
. * b7 is given in single months up to 23 and then given as 24, 36, 48, 60, ...
. * but for the Uganda 2006 survey there are 2 deaths at 25 months, 1 at 26, 1 at 35
.
. rename b7 age_at_death
7. replace age_at_death=. if age_at_death>end_date
8. replace age_at_death=. if age_at_death>59
9.
. /*
> doi month of interview (CMC)
> dob "Date of birth (CMC)"
> age_at_death "Age at death (months-imputed)"
> dod_1 the lower bound of the range of possible months of death (CMC)
> dod_2 the upper bound of the range of possible months of death (CMC)
>
> construct the beginning and end months of each age interval, in months after dob
> dob+start_i is the cmc when the child entered age interval i
> dob+end_i is the cmc when the child left age interval i
>
> because births happen at the middle of the month, end_1=start_2, etc.
>
> drop if all risk preceded the window
> */
.
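. drop if dob+end_8<start_date
10.
. gen age_int_at_death=.
11. gen dod_1=.
12. gen dod_2=.
13.
. local i=1
14. while `i'<=nageints {
15. replace age_int_at_death=`i' if age_at_death>=start_`i' & age_at_death<=end_`i'-1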
16. replace dod_1=dob+start_`i' if age_int_at_death==`i'
17. replace dod_2=dob+end_`i' if age_int_at_death==`i'
18. local i=`i'+1
19. }
20.
. * age_at_death and age_int_at_death will be "." if the child survived to doi-1
.
. * drop the child if died before the beginning of this time interval
. drop if dod_2<start_date
[...] >end_date & end_date==doi-1
22.
. save temp3.dta, replace
23.
. end
.
. ******************************************************************************
. program define make_risk_and_deaths
1.
. /*
> CHILDREN'S DEATHS AND RISK IN AGE INTERVALS WITHIN THE WINDOW OF TIME
>
> Standard age intervals 1 through 8 refer to
> months 0, 1-2, 3-5, 6-11, 12-23, 24-35, 36-47, 48-59.
>
> In other DHS programs these intervals are numbered 0 through 7 rather than 1 through 8.
>
> Each child gets values for died1 through died8. The initial values are 0. They remain
> at 0 if the child survived to the date of interview.
>
> The other possible values are 1 and 0.5: 1 if the child died in the time interval, 0.5
> if the age interval at death is divided across two time intervals.
>
> DHS does not calculate months of exposure. To emphasize that, this program uses the term
> "risk" rather than "exposure". Each child gets values for risk1 through risk8.
> The initial values are 0. They remain at 0 for any ages for which the child was not observed
> (within the window of time), because of death or censoring.
>
> The other possible values are 1 and 0.5: 1 if the child was observed for that age and all
> such observation was within the time interval, and 0.5 if the observation for that age is
> divided across two time intervals.
>
> The risk is not altered for the age interval and time interval(s) in which the child died,
> but risk drops to 0 after that age interval.
>
> In the log probability analysis, .5 is replaced with 1 and the weight is multiplied by .5.
>
> */
.
. prepare_child_file
2.
. use temp3.dta, clear
3.
. * Values of risk and died are calculated in the following loop. It could be streamlined
. * somewhat but you must be careful or you will lose the match with DHS.
.
. local i=1
4. while `i'<=nageints {
5. gen died`i'=.
6.
. * age interval is entirely in the time window
. replace died`i'=1 if age_int_at_death==`i' & dod_2<=end_date & dod_1>=start_date
7.
. * age interval is partly in the time window and partly in the previous time window
. replace died`i'=.5 if age_int_at_death==`i' & dod_2>=start_date & dod_1< start_date
8.
. * age interval is partly in the time window and partly in the next time window
. replace died`i'=.5 if age_int_at_death==`i' & dod_2> end_date & dod_1<=end_date
9.
. gen risk`i'=.
10.
. * age interval is entirely in the time window
. replace risk`i'=1 if (age_int_at_death>=`i' | age_at_death==.) & dob+end_`i'<=end_date & dob+start_`i'>=start_date
11.
. * age interval is partly in the time window and partly in the previous time window
. replace risk`i'=.5 if (age_int_at_death>`i' | age_at_death==.) & dob+end_`i'>=start_date & dob+start_`i'< start_date
12.
. * age interval is partly in the time window and partly in the next time window
. replace risk`i'=.5 if (age_int_at_death>`i' | age_at_death==.) & dob+end_`i'> end_date & dob+start_`i'<=end_date
13.
. * child dies in this interval
. * age interval is partly in the time window and partly in the previous time window
. replace risk`i'=.5 if age_int_at_death==`i' & dob+end_`i'>=start_date & dob+start_`i'< start_date
14.
. * age interval is partly in the time window and partly in the next time window
. replace risk`i'=.5 if age_int_at_death==`i' & dob+end_`i'> end_date & dob+start_`i'<=end_date
15.
. * Next line sets risk to .5 wherever died=.5. Its conditions duplicate the two
. * replace statements just above (dod_1=dob+start_i and dod_2=dob+end_i); it is retained
. * as a safeguard for a perfect match with the DHS programs.
.
. replace risk`i'=.5 if died`i'==.5
16.
. * the preceding lines produce some values of died and risk that should be changed from missing to 0
. replace died`i'=0 if died`i'==. & risk`i'>0 & risk`i'<=1
17. replace risk`i'=0 if risk`i'==. & died`i'>0 & died`i'<=1
18.
. * SEE NOTE BELOW!!
. *replace died`i'=.5 if risk`i'==.5 & died`i'==1
. replace risk`i'=1 if died`i'==1
19.
. gen wtd_died`i'=wt*died`i'
20. gen wtd_risk`i'=wt*risk`i'
21.
. * To this point, everything was unweighted. Now carry along both wtd and unwtd values.
. * Weighted values can be used in calc_rates_division.
. * Unweighted values are used in the logprob model, which will include weights in the model.
.
. ren died`i' unwtd_died`i'
22. ren risk`i' unwtd_risk`i'
23.
. local i=`i'+1
24. }
25.
. /*
> IN THE CSPRO AND SAS AND SPSS PROGRAMS IT CAN HAPPEN THAT RISK=.5 AND DIED=1.
> FROM THE PERSPECTIVE OF PROBABILITY THEORY, THIS IS AN ERROR.
> IN THOSE PROGRAMS IT ONLY HAPPENS IF THE CHILD DIED IN THE MOST RECENT TIME
> INTERVAL AND IN AN AGE GROUP THAT WOULD OTHERWISE HAVE BEEN CENSORED BY THE INTERVIEW.
> The above line "replace risk`i'=1 if died`i'==1" corrects this, possibly introducing
> a small difference from published estimates.
> */
.
. gen v_run_number=_n
26. gen v_lw=.
27. gen v_uw=.
28. gen v_refcmc=.
29.
. gen str10 variable="All"
30. gen value=.
31. label variable value "All"
32.
. * se`i' is the (estimated, of course) se of the log of rate i
.
. local i=1
33. while `i'<=nageints {
34. gen v_rate`i'=.
35. gen v_se`i'=.
36. local i=`i'+1
37. }
38.
. sort caseid bidx
39.
. save risk_and_deaths.dta, replace
40. end
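.
. /*
> A minimal sketch of the risk rules for a surviving child, not part of the
> original program. The dates are hypothetical: the window is cmc 1282 to 1341
> and the child was born in cmc 1280. Age interval 2 covers ages 1-2 months
> (start_2=1, end_2=3). Only the "entirely inside" and "straddles the start of
> the window" rules are shown; the "straddles the end" rule is analogous.
>
> scalar start_date=1282
> scalar end_date=1341
> scalar dob=1280
> scalar risk2=cond(dob+3<=end_date & dob+1>=start_date, 1, cond(dob+3>=start_date & dob+1<start_date, .5, 0))
> display risk2
>
> The interval begins in cmc 1281, before the window, and ends in cmc 1283,
> inside it, so the child contributes risk .5 to interval 2 in this window.
> */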
.
. ******************************************************************************
. program define calc_rates_division
1.
. /*
>
> THIS ROUTINE IS NOT ACTUALLY CALLED BUT IS INCLUDED HERE FOR POSSIBLE USE
>
> It may have to be revised because it has not been used for a long time....
>
> This routine produces one line for each combination of time and category of covariate.
>
> It calculates rates simply by aggregating deaths and risk,
> and dividing the sum of the deaths by the sum of the risk.
>
> Only the WEIGHTED deaths and risk are used here.
>
> By contrast, the log probability approach uses unweighted deaths and risk
> and does the weighting with pweight.
> */
.
. keep v_lw v_uw value variable wtd_died* wtd_risk*
2.
. *collapse (sum) wtd_died* wtd_risk*, by(time value variable)
. collapse (sum) wtd_died* wtd_risk*, by(value variable)
3.
. * calculate rates
.
. local i=1
4. while `i'<=nageints {
5. gen rate`i'=wtd_died`i'/wtd_risk`i'
6. local i=`i'+1
7. }
8.
. format %10.2f wtd_died* wtd_risk*
9.
. * Unweighted numerators and denominators can be useful to gauge statistical stability
. list lw uw variable value wtd_died1 wtd_died2 wtd_died3 wtd_died4, table clean
10. list lw uw variable value wtd_died5 wtd_died6 wtd_died7 wtd_died8, table clean
11. list lw uw variable value wtd_risk1 wtd_risk2 wtd_risk3 wtd_risk4, table clean
12. list lw uw variable value wtd_risk5 wtd_risk6 wtd_risk7 wtd_risk8, table clean
13.
. end
.
. ******************************************************************************
. program define calc_5_from_8
1.
. * Routine to calculate the various q's from rate1-rate8
.
. ren rate1 q_month_0
2. ren rate2 q_month_1to2
3. ren rate3 q_month_3to5
4. ren rate4 q_month_6to11
5. ren rate5 q_year_1
6. ren rate6 q_year_2
7. ren rate7 q_year_3
8. ren rate8 q_year_4
9.
. * calculate the compound q's
.
. gen neonatal=q_month_0
10. gen prob_1q0=1-(1-q_month_0)*(1-q_month_1to2)*(1-q_month_3to5)*(1-q_month_6to11)
11. gen postneonatal=prob_1q0-neonatal
12. gen prob_4q1=1-(1-q_year_1)*(1-q_year_2)*(1-q_year_3)*(1-q_year_4)
13. gen prob_5q0=1-(1-prob_1q0)*(1-prob_4q1)
14.
. end
.
. ******************************************************************************
.
. program define save_results
1.
. * This routine fills in the variables that were set up earlier,
. * using values calculated in logprob or catlogprob
.
. scalar run_number=run_number+1
2.
. replace v_lw=lw if v_run_number==run_number
3. replace v_uw=uw if v_run_number==run_number
4. replace v_refcmc=refcmc_mean if v_run_number==run_number
5.
. replace value=code if v_run_number==run_number
6.
. local i=1
7. while `i'<=nageints {
8. replace v_rate`i'=rate`i' if v_run_number==run_number
9. replace v_se`i'=se`i' if v_run_number==run_number
10. local i=`i'+1
11. }
12.
. end
.
. ******************************************************************************
.
. program define logprob
1.
. /*
> this routine produces one line for each combination of time and category of covariate
>
> each line gives results for all of the age intervals
>
> it calculates rates using log probability models (glm, log link, binomial error)
>
> the UNWEIGHTED deaths and risk are used, with adjustments
> for weights, clusters, and strata within the model
> */
.
. svyset v001 [pweight=v005], strata(strata) singleunit(centered)
2.
. local i=1
3. quietly while `i'<=nageints {
4. quietly summarize unwtd_died`i'
5. scalar died`i'tot=r(sum)
6. quietly summarize unwtd_risk`i'
7. scalar risk`i'tot=r(sum)
8.
. quietly summarize value
9. scalar code=r(mean)
10.
. scalar rate`i'=0
11. scalar se`i'=0
12.
. if died`i'tot>0 & risk`i'tot>0 & died`i'tot<. & risk`i'tot<. {
13. ***********************
. svy: glm unwtd_died`i', family(binomial unwtd_risk`i') link(log)
14. ***********************
. matrix bb=e(b)
15. scalar rate`i'=exp(bb[1,1])
16. matrix sesq`i'=e(V)
17. scalar se`i'=sqrt(sesq`i'[1,1])
18.
. }
19.
. local i=`i'+1
20. }
21.
. save_results
22.
. scalar list se1 se2 se3 se4 se5 se6 se7 se8
23.
. end
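.
. /*
> A minimal sketch, not part of the original program, of why an intercept-only
> binomial model with a log link returns a rate: the exponentiated constant is
> total deaths divided by total risk. The four records are made-up data, and the
> weights and svy settings are left out. Run it in a separate session, because
> "clear" discards the data in memory.
>
> clear
> input died risk
> 0 1
> 1 1
> 0 1
> 0 1
> end
> glm died, family(binomial risk) link(log)
> display exp(_b[_cons])
>
> Here one death in four units of risk should give a value of about .25.
> */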
.
. ******************************************************************************
.
. program define catlogprob
1.
. svyset v001 [pweight=v005], strata(strata) singleunit(centered)
2.
.
. *************************************************
. **SPECIAL COMMAND TO DISABLE SVYSET, FOR EXAMPLE IF GETTING RATES AT THE LEVEL OF THE CLUSTER
.
. correlate value v001
3.
. scalar turnoffsvyset=r(rho)
4.
.
. *************************************************
.
.
.
. * loop through all the values of the categorical variable called "value"
.
. levelsof value, local(levels)
5.
. foreach cat of local levels {
6.
. scalar code=`cat'
7.
. local i=1
8.
. quietly while `i'<=nageints {
9. quietly summarize unwtd_died`i' if value==code
10. scalar died`i'tot=r(sum)
11. quietly summarize unwtd_risk`i' if value==code
12. scalar risk`i'tot=r(sum)
13.
. scalar rate`i'=0
14. scalar se`i'=0
15.
. if died`i'tot>0 & risk`i'tot>0 & died`i'tot<. & risk`i'tot<. {
16. ***********************
. if turnoffsvyset<1 {
17. quietly svy: glm unwtd_died`i' if value==code, family(binomial unwtd_risk`i') link(log)
18. }
19.
. if turnoffsvyset==1 {
20. quietly glm unwtd_died`i' if value==code, family(binomial unwtd_risk`i') link(log)
21. }
22.
. ***********************
. matrix bb=e(b)
23.
. scalar rate`i'=exp(bb[1,1])
24. matrix sesq=e(V)
25. scalar se`i'=sqrt(sesq[1,1])
26. }
27.
. local i=`i'+1
28. }
29.
. save_results
30.
. }
31.
. end
.
. ******************************************************************************
. program define calc_q_after_logprob
1.
. * Routine that uses rate1-rate8 to get the various q's following the log prob model.
. * First part is exactly the same as calc_5_from_8. The additional part is for the
. * calculation of standard errors and confidence intervals.
.
. calc_5_from_8
2.
. /*
> Here is a complete list of the q's (plus postneonatal, which is not a q!)
> q_month_0
> q_month_1to2
> q_month_3to5
> q_month_6to11
> q_year_1
> q_year_2
> q_year_3
> q_year_4
>
> neonatal
> postneonatal
> prob_1q0
> prob_4q1
> prob_5q0
> */
.
. * calculate confidence intervals for the 8 basic q's
.
. local i=1
3. while `i'<=nageints {
4. gen factor`i'=exp(1.96*se`i')
5. local i=`i'+1
6. }
7.
. gen L_q_month_0=q_month_0/factor1
8. gen L_q_month_1to2=q_month_1to2/factor2
9. gen L_q_month_3to5=q_month_3to5/factor3
10. gen L_q_month_6to11=q_month_6to11/factor4
11. gen L_q_year_1=q_year_1/factor5
12. gen L_q_year_2=q_year_2/factor6
13. gen L_q_year_3=q_year_3/factor7
14. gen L_q_year_4=q_year_4/factor8
15.
. gen H_q_month_0=q_month_0*factor1
16. gen H_q_month_1to2=q_month_1to2*factor2
17. gen H_q_month_3to5=q_month_3to5*factor3
18. gen H_q_month_6to11=q_month_6to11*factor4
19. gen H_q_year_1=q_year_1*factor5
20. gen H_q_year_2=q_year_2*factor6
21. gen H_q_year_3=q_year_3*factor7
22. gen H_q_year_4=q_year_4*factor8
23.
. drop factor*
24.
. * a procedure to estimate the se and ci's for the compound rates using the se
. * of the 8 basic rates is in a separate do file, child_mortality_ci_do.txt
.
. end
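.
. /*
> A minimal sketch of the interval calculation above, not part of the original
> program. The values q=.030 and se=.10 are hypothetical; se is the standard
> error of log(q), so the limits are symmetric on the log scale.
>
> scalar q=.030
> scalar se=.10
> scalar factor=exp(1.96*se)
> display "L=" q/factor " H=" q*factor
>
> Because L*H=q^2, the estimate is the geometric mean of the two limits rather
> than their midpoint.
> */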
.
. **********************************************************
.
. program define partial_file_save
1.
. drop if v_run_number==.
2. keep v*
3.
. if run_number>1 {
4. append using partial_results.dta
5. }
6.
. save partial_results.dta, replace
7. use risk_and_deaths.dta, clear
8. end
.
. **********************************************************
.
. program define final_file_save
1.
. use partial_results.dta, clear
2.
. * construct two Stata data files that save the results.
. * The first includes confidence intervals, the second does not
.
. renpfix v_
3. drop if lw==.
4.
. calc_q_after_logprob
5.
. sort variable lw uw value
6. format q* se* L* H* p* neo* %7.5f
7. format lw uw value %5.0f
8.
. scalar results_with_ci_="results_with_ci_"
9. scalar cid=v000[1]
10. scalar dotdta=".dta"
11. scalar sfn=results_with_ci_+cid+dotdta
12. local lfn=sfn
13. save `lfn', replace
14.
. keep lw uw refcmc variable value neonatal postneonatal prob*
15. gen str4 v000=v000_string
16. * If all the interviews were in Jan. 1900, i.e. cmc 1, they would be placed
. * at the middle of January, i.e. one-half of one month into 1900.
. * Thus the reference doi must be adjusted downwards by one-half month.
.
. gen doi=1900+(doi_mean-.5)/12
17.
. * The reference date of interview would initially be calculated as starting month plus ending
. * month divided by 2 (which has already been calculated as refcmc_mean). Say the
. * starting month is 1 (January) and the ending month is 11 (December). The interval goes from
. * the start of the start month to the end of the end month, so the mean is July 1, i.e. month 7.0.
. * But expressed as the fraction of the year, it is .5. In years, the correct ref date will be
. * 1900 + refcmc_mean/12.
.
. gen refdate=1900+refcmc/12
18.
. format doi refdate %8.2f
19.
. scalar results_="results_"
20. scalar scid=v000[1]
21. scalar sfnend=".dta"
22. scalar sfn=results_+scid+sfnend
23. local lfn=sfn
24. save `lfn', replace
25.
. end
.
. ******************************************************************************
. ******************************************************************************
. ******************************************************************************
. ******************************************************************************
. ******************************************************************************
. * EXECUTION BEGINS HERE
.
. /*
> Input file can be either IR or BR; check the end of the setup routine for seven
> extra lines, including reshape, that are to be included if the input file is IR.
>
> v002, v003 can be helpful for linking to other files, otherwise not needed
>
> v001 is the cluster variable
> v005 is the weight variable
> v023 is the stratum variable (usually but not always; can usually construct the strata,
> aka domains, as combinations of urban/rural and region)
>
> Weights, clustering, and stratification enter through svyset and svy:
>
> The adjustments for clustering and stratification will only affect the standard errors, which
> will be manipulated by an optional companion program, child_mortality_ci_(date)_do.txt
>
> To produce estimates for each cluster, v001, it is necessary to disable the svyset command.
>
> For that level, the cluster, stratification and weighting effects drop out completely.
>
> The program will automatically do this within the catlogprob routine
>
>
> */
.
. use c:\DHS\DHS_data\BR_files\BDBR61FL.dta
.
. keep caseid v000 v001 v002 v003 v005 v008 v011 bidx* bord* b3* b4* b7* b11* v023 v024 v025 v106 v149 v190 v743* v744* v502 s823
.
. gen strata=v023
.
. * Any recoding of the original variables will go here, before temp.dta is saved
.
. ********************************************************************************************
.
. * Construction of empowerment variables; the source variables they require must be included in the keep line above
.
. gen wt=v005/1000000
.
. **Panel 1 -- Decisionmaking -- uses vars v743a, v743b, v743d, s823
.
. recode v743a (1/2=1) (4/9=0), g(dmrhealth) /*This turns the vars into dichotomous variables & handles missing/odd values*/
(36856 differences between v743a and dmrhealth)
. recode v743b (1/2=1) (4/9=0), g(dmbuy)
(39458 differences between v743b and dmbuy)
. recode v743d (1/2=1) (4/9=0), g(dmvisit)
(38239 differences between v743d and dmvisit)
. recode s823 (1/2=1) (4/9=0), g(dmchealth)
(35859 differences between s823 and dmchealth)
. label def yesno 0 "no" 1 "yes"
. lab val dm* yesno
.
. gen dmnum = dmrhealth + dmbuy + dmvisit + dmchealth /*This provides a count of # of decisions Respondent makes*/
(2945 missing values generated)
. recode dmnum (0=0) (1/2=1) (3=3) (4=4) /*This collapses into the categories in the final report*/
(dmnum: 5096 changes made)
. lab def dmnum 1 "1-2"
. lab val dmnum dmnum
. lab var dmnum "Number of decisions in which women participate"
.
. tab dmnum [iw=wt] /*Check that results match final report Table 13.8*/
Number of |
decisions |
in which |
women |
participate | Freq. Percent Cum.
------------+-----------------------------------
0 | 7,109.3432 16.50 16.50
1-2 | 9,591.7872 22.26 38.75
3 | 6,800.1866 15.78 54.53
4 | 19,595.262 45.47 100.00
------------+-----------------------------------
Total |43,096.5791 100.00
.
. **Panel 2 -- Wife beating attitudes -- uses v744a-e
.
. recode v744a (8=0), g(v744a_r) /*This handles don't know values*/
(58 differences between v744a and v744a_r)
. recode v744b (8=0), g(v744b_r)
(71 differences between v744b and v744b_r)
. recode v744c (8=0), g(v744c_r)
(101 differences between v744c and v744c_r)
. recode v744d (8=0), g(v744d_r)
(207 differences between v744d and v744d_r)
. recode v744e (8=0), g(v744e_r)
(79 differences between v744e and v744e_r)
.
. gen wfbnum = v744a_r + v744b_r + v744c_r + v744d_r + v744e_r /*This provides a count of # of wife beating reasons Respondent agrees with*/
. recode wfbnum (1/2=1) (3/4=3) /*This collapses into the categories in the final report*/
(wfbnum: 5678 changes made)
. lab def wfbnum 1 "1-2" 3 "3-4"
. lab val wfbnum wfbnum
. lab var wfbnum "Number of reasons for which wife beating is justified"
.
. tab wfbnum if v502==1 [iw=wt] /*Check that results match Final Report Table 13.8*/
Number of |
reasons for |
which wife |
beating is |
justified | Freq. Percent Cum.
------------+-----------------------------------
0 | 28,132.289 65.28 65.28
1-2 | 9,811.5596 22.77 88.04
3-4 |3,926.01068 9.11 97.15
5 | 1,226.7203 2.85 100.00
------------+-----------------------------------
Total |43,096.5791 100.00
.
. drop wt
. ********************************************************************************************
.
. save temp.dta, replace
file temp.dta saved
.
. quietly setup
.
. * JUST RUN FOR THE EMPOWERMENT VARIABLES IN THE FIVE YEARS BEFORE THE SURVEY; MUST BEGIN WITH THE TOTAL
.
. scalar lw=-4
. scalar uw=0
. quietly make_risk_and_deaths
. quietly logprob
. partial_file_save
(0 observations deleted)
file partial_results.dta saved
.
. replace value=dmnum
(17636 real changes made)
. label variable value "dmnum"
. replace variable="Number of decisions in which women participate"
variable was str10 now str46
(18115 real changes made)
. quietly catlogprob
. partial_file_save
(0 observations deleted)
(label v023 already defined)
(label v024 already defined)
(label v025 already defined)
(label v106 already defined)
(label v149 already defined)
(label v190 already defined)
(label v502 already defined)
(label v743a already defined)
(label v743b already defined)
(label v743c already defined)
(label v743d already defined)
(label v743e already defined)
(label v743f already defined)
(label v744a already defined)
(label v744b already defined)
(label v744c already defined)
(label v744d already defined)
(label v744e already defined)
file partial_results.dta saved
.
. replace value=wfbnum
(18115 real changes made)
. label variable value "wfbnum"
. replace variable="Number of reasons for which wife beating is justified"
variable was str10 now str53
(18115 real changes made)
. quietly catlogprob
. partial_file_save
(0 observations deleted)
(label v744e already defined)
(label v744d already defined)
(label v744c already defined)
(label v744b already defined)
(label v744a already defined)
(label v743f already defined)
(label v743e already defined)
(label v743d already defined)
(label v743c already defined)
(label v743b already defined)
(label v743a already defined)
(label v502 already defined)
(label v190 already defined)
(label v149 already defined)
(label v106 already defined)
(label v025 already defined)
(label v024 already defined)
(label v023 already defined)
file partial_results.dta saved
.
. /*
>
> * Generic lines for setting the time interval and running on the total, for
> * the first table in the mortality chapter of the main report
> * For the Bangladesh 2011 report this is table 8.1
>
> scalar lw=-4
> scalar uw=0
> quietly make_risk_and_deaths
> quietly logprob
> partial_file_save
>
> scalar lw=-9
> scalar uw=-5
> quietly make_risk_and_deaths
> quietly logprob
> partial_file_save
>
> scalar lw=-14
> scalar uw=-10
> quietly make_risk_and_deaths
> quietly logprob
> partial_file_save
>
>
> * Generic lines for setting the time interval and running
> * the second table in the mortality chapter of the main report
> * For the Bangladesh 2011 report this is table 8.3
>
> * first you MUST run the total
>
> scalar lw=-4
> scalar uw=0
> quietly make_risk_and_deaths
> quietly logprob
> partial_file_save
>
>
> * next run the program within each category of covariates
> * IMPORTANT! You must previously run "make_risk_and_deaths" for these values of lw and uw
>
> replace value=v025
> label variable value "v025"
> replace variable="Type of Place"
> quietly catlogprob
> partial_file_save
>
> replace value=v024
> label variable value "v024"
> replace variable="Division"
> quietly catlogprob
> partial_file_save
>
> replace value=v149
> label variable value "v149"
> replace variable="Mother's education"
> quietly catlogprob
> partial_file_save
>
> replace value=v190
> label variable value "v190"
> replace variable="Wealth"
> quietly catlogprob
> partial_file_save
>
> */
.
. * the next line is essential at the end of the run
. final_file_save
(54336 observations deleted)
file results_with_ci_BD6.dta saved
file results_BD6.dta saved
.
.
. list v000 doi refdate refcmc lw uw variable value neonatal postneonatal prob*, table clean compress
v000 doi refdate refcmc lw uw variable value neona~l postn~l pro~1q0 prob_~1 pro~5q0
1. BD6 2011.75 2009.25 1311.025 -4 0 All . 0.03239 0.01012 0.04250 0.01147 0.05349
2. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 0 0.03504 0.01436 0.04940 0.01044 0.05932
3. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 1 0.03338 0.00987 0.04325 0.01370 0.05636
4. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 3 0.03602 0.00794 0.04397 0.01415 0.05750
5. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 4 0.02886 0.00906 0.03793 0.00955 0.04711
6. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 0 0.03034 0.00904 0.03938 0.00996 0.04895
7. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 1 0.04053 0.00878 0.04931 0.01305 0.06172
8. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 3 0.03277 0.02249 0.05525 0.01641 0.07076
9. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 5 0.01113 0.01071 0.02184 0.02026 0.04166
.
.
. gen Neonatal=round(1000*neonatal)
. gen Postneonatal=round(1000*postneonatal)
. gen Prob_1q0=round(1000*prob_1q0)
. gen Prob_4q1=round(1000*prob_4q1)
. gen Prob_5q0=round(1000*prob_5q0)
.
. * alternative list that may be more user friendly
.
. list v000 doi refdate refcmc lw uw variable value Neonatal Postneonatal Prob*, table clean
v000 doi refdate refcmc lw uw variable value Neonatal Postne~l Prob_1q0 Prob_4q1 Prob_5q0
1. BD6 2011.75 2009.25 1311.025 -4 0 All . 32 10 43 11 53
2. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 0 35 14 49 10 59
3. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 1 33 10 43 14 56
4. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 3 36 8 44 14 57
5. BD6 2011.75 2009.25 1311.025 -4 0 Number of decisions in which women participate 4 29 9 38 10 47
6. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 0 30 9 39 10 49
7. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 1 41 9 49 13 62
8. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 3 33 22 55 16 71
9. BD6 2011.75 2009.25 1311.025 -4 0 Number of reasons for which wife beating is justified 5 11 11 22 20 42
.
.
. * the next lines are optional; "ci" for "confidence intervals"; must have
. * child_mortality_ci_(date)_do.txt in the path
. scalar results_with_ci_="results_with_ci_"
. scalar cid=v000[1]
. scalar dotdta=".dta"
. scalar sfn=results_with_ci_+cid+dotdta
. local lfn=sfn
. use `lfn', clear
.
. * OMIT CONFIDENCE INTERVALS
. *do child_mortality_ci_5June2014_do.txt
.
.
. **************************************************************
. *END OF PROGRAM***********************************************
. **************************************************************
.
end of do-file
. exit, clear