* make sure there's enough RAM (only matters in stata 11 and earlier; newer versions ignore this setting)
set mem 500m
* load the downloaded data set
use "C:\Users\AnthonyD\Desktop\MWIR4DDT\MWIR4DFL.DTA" , clear
* scale the weights as recommended
generate weight = v005/1000000
* construct the strata variable as described in this forum
egen strata = group(v022 v025), label
* construct a "40-49 year olds" binary variable
generate ffn = ( v447a >= 40 & v447a < 50 )
* construct a "never married" binary variable
generate nm = v501 == 0
********************************
* for "Children ever born" the publication says:
* standard error 0.032
* DEFT 1.279
* deal with singleton psus by scaling..
svyset [pweight=weight], psu(v021) strata(strata) singleunit( scaled )
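* optional check, not part of the original run: list the strata that contain
* only one PSU, since those are the strata the singleunit() option is handling.
* the `single` option restricts the listing to single-PSU strata.
svydescribe, single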
svy: mean v201
* this standard error is 0.034975
estat effects
* this DEFT says 1.39108
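* a sketch for reproducing that DEFT by hand, assuming DEFT is the square root of
* the design-based variance over the simple-random-sampling variance that svy
* leaves behind in e(V_srs) (no fpc is set here, so no finite population correction).
* this relies on the e() results from the `svy: mean v201` above still being in memory.
* V and Vsrs are just scratch matrix names chosen for this check.
matrix V = e(V)
matrix Vsrs = e(V_srs)
display "DEFT by hand: " sqrt( V[1,1] / Vsrs[1,1] )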
* try with jackknife replication instead..
svy jackknife: mean v201
* this standard error is 0.0349968
estat effects
* this DEFT says 1.39195
* try with certainty PSUs..
svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )
svy: mean v201
* this standard error CORRECTLY rounds down to 0.032 -- it's 0.0324946
estat effects
* but the DEFT is 1.29243, which is still not correct
svy jackknife: mean v201
* this standard error rounds up to 0.033 -- it's 0.0325148
estat effects
* and the DEFT is also wrong: 1.29323
* try with centered PSUs..
svyset [pweight=weight], psu(v021) strata(strata) singleunit( centered )
svy: mean v201
* this standard error rounds up to 0.033 -- it's 0.0325691
estat effects
* and the DEFT is also wrong: 1.29539
svy jackknife: mean v201
* this standard error rounds up to 0.033 -- it's 0.0325148
estat effects
* the DEFT is also wrong: 1.29323
*********************
* certainty PSUs got me the closest, so I'll go with that for the next round
svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )
* the publication says
* 0.168 of women were never married
* with a standard error of 0.006
* and a DEFT of 1.612
svy: mean nm
* gives the correct mean and standard error (0.0055304)
estat effects
* but the DEFT is still off: 1.59838
* the publication says
* there are an unweighted 1,710 records of women aged 40-49
* confirm that `ffn` has been built correctly:
tab ffn
* the publication says
* there are a weighted 1,684 records of women aged 40-49
svy: total ffn
* gives 1,684.034 so I assume that's built correctly.
* the publication says: among women aged 40-49
* 6.550 children born
* standard error: 0.080
* DEFT of 1.188
svy, subpop( ffn ): mean v201
* this command gives mean: 6.549798 and SE: 0.0799961,
* which round to the two published numbers above
estat effects
* but the DEFT is 1.18558, which is off by 0.002?
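* one more thing that could be tried here (untested against the publication):
* by default, estat effects compares against simple random sampling from the
* entire population. the srssubpop option compares against simple random
* sampling within the subpopulation instead, which may or may not be
* what the published DEFT uses.
estat effects, srssubpop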
* the publication says
* "The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates."
* so for the fertility rate, I believe it should be "svy jackknife"?
svy jackknife, subpop( ffn ): mean v201
* this command gives mean: 6.549798 and SE: 0.0798322,
* so the standard error moves slightly but still rounds to the published 0.080
estat effects
* and here the DEFT is even further away: 1.18315