  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   11.2   Copyright 1985-2009 StataCorp LP
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
     MP - Parallel Edition            College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)

2-user 2-core Stata network perpetual license:
       Serial number:  50110590558
         Licensed to:  Kaiser Family Foundation
                       Kaiser Family Foundation

Notes:
      1.  (/m# option or -set memory-) 50.00 MB allocated to data
      2.  (/v# option or -set maxvar-) 5000 maximum variables

. do "C:\Users\AnthonyD\Desktop\20140429 malawi 2004 replication attempt.do"

. * make sure there's enough RAM
. set mem 500m

Current memory allocation

                    current                                 memory usage
    settable          value     description                 (1M = 1024k)
    --------------------------------------------------------------------
    set maxvar         5000     max. variables allowed           1.947M
    set memory         500M     max. data space                500.000M
    set matsize         400     max. RHS vars in models          1.254M
                                                            -----------
                                                               503.201M

.
. * load the downloaded data set
. use "C:\Users\AnthonyD\Desktop\MWIR4DDT\MWIR4DFL.DTA" , clear

.
. * scale the weights as recommended
. generate weight = v005/1000000

.
. * construct the strata variable as described in this forum
. egen strata = group(v022 v025), label

.
. * construct a "40-49 year olds" binary variable
. generate ffn = ( v447a >= 40 & v447a < 50 )

.
. * construct a "never married" binary variable
. generate nm = v501 == 0

.
.
.
. ********************************
.
.
. * for "Children ever born" the publication says:
. * standard error 0.032
. * DEFT 1.279
.
.
.
. * deal with singleton psus by scaling..
. svyset [pweight=weight], psu(v021) strata(strata) singleunit( scaled )

      pweight: weight
          VCE: linearized
  Single unit: scaled
     Strata 1: strata
         SU 1: v021
        FPC 1: <zero>

.
. svy: mean v201
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Design df        =     551

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |    3.03055    .034975      2.961849    3.099251
--------------------------------------------------------------
Note: variance scaled to handle strata with a single sampling unit.

. * this standard error is 0.034975
.
. estat effects

----------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |    3.03055    .034975     1.93512   1.39108
----------------------------------------------------------
Note: variance scaled to handle strata with a single sampling unit.

. * this DEFT says 1.39108
.
. * try with jackknife replication instead..
. svy jackknife: mean v201
(running mean on estimation sample)

Jackknife replications (858)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
..................................................   400
..................................................   450
..................................................   500
..................................................   550
..................................................   600
..................................................   650
..................................................   700
..................................................   750
..................................................   800
..................................................   850
........

Survey: Mean estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Replications     =     858
                                    Design df        =     551

--------------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |    3.03055   .0349968      2.961807    3.099293
--------------------------------------------------------------
Note: variance scaled to handle strata with a single sampling unit.

. * this standard error is 0.0349968
.
. estat effects

----------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |    3.03055   .0349968     1.93752   1.39195
----------------------------------------------------------
Note: variance scaled to handle strata with a single sampling unit.

. * this DEFT says 1.39195
.
. * try with certainty PSUs..
. svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )

      pweight: weight
          VCE: linearized
  Single unit: certainty
     Strata 1: strata
         SU 1: v021
        FPC 1: <zero>

.
. svy: mean v201
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Design df        =     551

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |    3.03055   .0324946      2.966721    3.094379
--------------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * this standard error CORRECTLY rounds down to 0.032 -- it's 0.0324946
. estat effects

----------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |    3.03055   .0324946     1.67038   1.29243
----------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * but the DEFT is 1.29243, which is still not correct
.
. svy jackknife: mean v201
(running mean on estimation sample)

Jackknife replications (858)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
..................................................   400
..................................................   450
..................................................   500
..................................................   550
..................................................   600
..................................................   650
..................................................   700
..................................................   750
..................................................   800
..................................................   850
........

Survey: Mean estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Replications     =     858
                                    Design df        =     551

--------------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |    3.03055   .0325148      2.966682    3.094418
--------------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * this standard error rounds up to 0.033 -- it's 0.0325148
. estat effects

----------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |    3.03055   .0325148     1.67246   1.29323
----------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * and the DEFT is also wrong: 1.29323
.
.
. * try with centered PSUs..
. svyset [pweight=weight], psu(v021) strata(strata) singleunit( centered )

      pweight: weight
          VCE: linearized
  Single unit: centered
     Strata 1: strata
         SU 1: v021
        FPC 1: <zero>

.
. * this standard error rounds up to 0.033 -- it's 0.0325691
. estat effects

----------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |    3.03055   .0325148     1.67246   1.29323
----------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * and the DEFT is also wrong: 1.29539
.
. svy jackknife: mean v201
(running mean on estimation sample)

Jackknife replications (858)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
..................................................   400
..................................................   450
..................................................   500
..................................................   550
..................................................   600
..................................................   650
..................................................   700
..................................................   750
..................................................   800
..................................................   850
........

Survey: Mean estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Replications     =     858
                                    Design df        =     551

--------------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |    3.03055   .0325148      2.966682    3.094418
--------------------------------------------------------------
Note: strata with single sampling unit centered at overall mean.

. * this standard error rounds up to 0.033 -- it's 0.0325148
. estat effects

----------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |    3.03055   .0325148     1.67246   1.29323
----------------------------------------------------------
Note: strata with single sampling unit centered at overall mean.

. * the DEFT is also wrong: 1.29323
.
. *********************
.
.
. * certainty PSUs got me the closest, so i'll go with that for the next round
. svyset [pweight=weight], psu(v021) strata(strata) singleunit( certainty )

      pweight: weight
          VCE: linearized
  Single unit: certainty
     Strata 1: strata
         SU 1: v021
        FPC 1: <zero>

.
.
. * the publication says
. * 0.168 of women were never married
. * with a standard error of 0.006
. * and a DEFT of 1.612
. svy: mean nm
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Design df        =     551

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
          nm |   .1683825   .0055153       .157549    .1792161
--------------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * gives the correct mean and standard error (0.0055304)
. estat effects

----------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
          nm |   .1683825   .0055153     2.54089   1.59402
----------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * but the DEFT is still off: 1.59838
.
.
.
. * the publication says
. * there are an unweighted 1,710 records of women aged 40-49
. * confirm that the `ffn` has been built correctly:
. tab ffn

        ffn |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      9,988       85.38       85.38
          1 |      1,710       14.62      100.00
------------+-----------------------------------
      Total |     11,698      100.00

.
. * the publication says
. * there are a weighted 1,684 records of women aged 40-49
. svy: total ffn
(running total on estimation sample)

Survey: Total estimation

Number of strata =     307          Number of obs    =   11698
Number of PSUs   =     858          Population size  = 11697.9
                                    Design df        =     551

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         ffn |   1684.034   69.48961      1547.537    1820.531
--------------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * gives 1,684.034 so i assume that's built correctly.
.
. * the publication says: among women aged 40-49
. * 6.550 children born
. * standard error: 0.080
. * DEFT of 1.188
.
.
. svy, subpop( ffn ): mean v201
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =     265          Number of obs    =   11619
Number of PSUs   =     804          Population size  = 11556.2
                                    Subpop. no. obs  =    1710
                                    Subpop. size     = 1684.03
                                    Design df        =     539

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |   6.549798    .079726      6.393186     6.70641
--------------------------------------------------------------
Note: 42 strata omitted because they contain no subpopulation members.
Note: strata with single sampling unit treated as certainty units.

. * this command gives mean: 6.549798 and SE: 0.0799961,
. * which round to the two published numbers above
. estat effects

----------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |   6.549798    .079726     1.39612   1.18157
----------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * but the DEFT is 1.18558, which is off by 0.002?
.
.
. * the publication says
. * "The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates."
. * so for the fertility rate, i believe it should be "svy jackknife"?
. svy jackknife, subpop( ffn ): mean v201
(running mean on estimation sample)

Jackknife replications (858)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
.......................s.......s..................   100
...................................s..............   150
..............................................s...   200
........s........................s................   250
........s...................s.....................   300
..............................................s...   350
............s.....................................   400
..................................................   450
.............................s....................   500
..................................................   550
........................................s..s......   600
..............s...s.................s.............   650
..s...............ss..............................   700
..s...........................ssss................   750
.s......s.....ss..s..s...s..ss.....ss..s.....s....   800
..s..ss........ss..ss..s..ss..s.........s..ss.....   850
...s..ss

Survey: Mean estimation

Number of strata =     265          Number of obs    =   11619
Number of PSUs   =     804          Population size  = 11556.2
                                    Subpop. no. obs  =    1710
                                    Subpop. size     = 1684.03
                                    Replications     =     804
                                    Design df        =     539

--------------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        v201 |   6.549798   .0798322      6.392978    6.706619
--------------------------------------------------------------
Note: 42 strata omitted because they contain no subpopulation members.
Note: strata with single sampling unit treated as certainty units.

. * this command gives mean: 6.549798 and SE: 0.0798322,
. * so the standard error no longer rounds correctly?
. estat effects

----------------------------------------------------------
             |              Jackknife
             |       Mean   Std. Err.       DEFF      DEFT
-------------+--------------------------------------------
        v201 |   6.549798   .0798322     1.39984   1.18315
----------------------------------------------------------
Note: strata with single sampling unit treated as certainty units.

. * and here the DEFT is even further away: 1.18315
.
.
.
.
end of do-file

.
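
A possible follow-up check, offered only as a sketch: for the subpopulation estimate, `estat effects` accepts an `srssubpop` option, which bases the simple-random-sampling variance on sampling within the subpopulation rather than from the entire population, and so changes DEFF and DEFT. Whether the published DEFT values were computed that way is an assumption; nothing in the log confirms it. The variable names (`weight`, `strata`, `ffn`) are the ones generated in the do-file above, and the data set is assumed to still be in memory.

```stata
* sketch only: assumes MWIR4DFL.DTA is loaded and weight, strata, ffn exist as in the log
svyset [pweight=weight], psu(v021) strata(strata) singleunit(certainty)

* same subpopulation mean as in the log
svy, subpop(ffn): mean v201

* default: DEFF/DEFT use an SRS variance for sampling from the entire population
estat effects

* alternative: SRS variance for sampling within the subpopulation
* (assumption: this may or may not be what the publication did)
estat effects, srssubpop

* DEFT is the square root of DEFF; here using the DEFF from the scaled linearized run
display sqrt(1.93512)
```

Comparing the two `estat effects` tables side by side would show whether the gap to the published DEFT of 1.188 comes from the choice of SRS baseline or from something else, such as the strata definition.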