* make sure there's enough RAM
set mem 500m

* load the downloaded data set
use "C:\Users\AnthonyD\Desktop\MWIR4DDT\MWIR4DFL.DTA" , clear

* scale the weights as recommended
generate weight = v005/1000000

* construct the strata variable as described in this forum
egen strata = group(v022 v025), label

* construct a "40-49 year olds" binary variable
generate ffn = ( v447a >= 40 & v447a < 50 )

* construct a "never married" binary variable
generate nm = v501 == 0

********************************

* for "Children ever born" the publication says:
* standard error 0.032
* DEFT 1.279

* deal with singleton psus by scaling..
svyset [pweight=weight], psu(v021) strata(strata) singleunit( scaled )
svy: mean v201
* this standard error is 0.034975
estat effects
* this DEFT says 1.39108

* try with jackknife replication instead..
svy jackknife: mean v201
* this standard error is 0.0349968
estat effects
* this DEFT says 1.39195

* try with certainty PSUs..
svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )
svy: mean v201
* this standard error CORRECTLY rounds down to 0.032 -- it's 0.0324946
estat effects
* but the DEFT is 1.29243, which is still not correct

svy jackknife: mean v201
* this standard error rounds up to 0.033 -- it's 0.0325148
estat effects
* and the DEFT is also wrong: 1.29323

* try with centered PSUs..
svyset [pweight=weight], psu(v021) strata(strata) singleunit( centered )
svy: mean v201
* this standard error rounds up to 0.033 -- it's 0.0325691
estat effects
* and the DEFT is also wrong: 1.29539

svy jackknife: mean v201
* this standard error rounds up to 0.033 -- it's 0.0325148
estat effects
* the DEFT is also wrong: 1.29323

*********************

* certainty PSUs got me the closest, so i'll go with that for the next round
svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )

* the publication says
* 0.168 of women were never married
* with a standard error of 0.006
* and a DEFT of 1.612
svy: mean nm
* gives the correct mean and standard error (0.0055304)
estat effects
* but the DEFT is still off: 1.59838

* the publication says
* there are an unweighted 1,710 records of women aged 40-49
* confirm that `ffn` has been built correctly:
tab ffn

* the publication says
* there are a weighted 1,684 records of women aged 40-49
svy: total ffn
* gives 1,684.034 so i assume that's built correctly.

* the publication says: among women aged 40-49
* 6.550 children born
* standard error: 0.080
* DEFT of 1.188
svy, subpop( ffn ): mean v201
* this command gives mean: 6.549798 and SE: 0.0799961,
* which round to the two published numbers above
estat effects
* but the DEFT is 1.18558, which is off by 0.002?

* the publication says
* "The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates."
* so for the fertility rate, i believe it should be "svy jackknife"?
svy jackknife, subpop( ffn ): mean v201
* this command gives mean: 6.549798 and SE: 0.0798322,
* so the standard error no longer rounds correctly?
estat effects
* and here the DEFT is even further away: 1.18315
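
*********************

* a rough sketch, NOT the publication's method: rebuild the DEFT by hand
* from the two variance matrices that svy stores after estimation, to see
* whether the numerator (design-based variance) or the denominator
* (simple-random-sampling variance) is what differs from the published figure.
* this assumes no fpc() has been svyset, so that DEFT = sqrt(DEFF).
* the matrix names Vdesign and Vsrs are made up for this illustration.
svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )
svy: mean v201

* design-based variance of the mean
matrix Vdesign = e(V)

* variance of the mean under a hypothetical simple random sample
matrix Vsrs = e(V_srs)

* under the assumption above, this should reproduce the DEFT from -estat effects-
display "DEFT by hand: " sqrt( Vdesign[1,1] / Vsrs[1,1] )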