I am trying to replicate, from the Nigeria 2013 data, the percentage of children who received ORS among children who had diarrhea in the past 15 days. I am using SAS for my analysis. For the country, region and urban/rural estimates I get the correct point estimates and standard errors only when I omit the STRATA statement in SAS and use only clusters. I know that the data were first stratified by state and then by urban/rural. Is it okay to ignore stratification when constructing confidence intervals? Would this change if I wanted confidence intervals for each state? Am I missing something about the study/sample design? I would really appreciate any insight into this!

Please find below the code that gives me the correct answer. When I add the STRATA statement (strata v022;) to proc surveyfreq, I get the wrong answer.

/* Create an indicator for having been given ORS, and rescale the sample weight */
data ors;
    set ors;
    if h13 = 1 or h13 = 2 then ors = 1;
    else ors = 0;
    wgt = v005/1000000;
run;

/* h11 = 2 when child had diarrhea in the past 15 days, v024 = region */
proc surveyfreq data = ors;
    where h11 = 2 and v024 = 5;
    cluster v021;
    weight wgt;
    tables ors;
run;

Quote:

The standard errors in the 2013 Nigeria DHS report were calculated using region by urban/rural (V024 cross V025) as the stratification variable, which is the case for most DHS surveys. However, the 2013 Nigeria DHS used a more detailed stratification, state cross urban/rural, which is coded in V022. Attention must be paid when using V022 for the 2013 Nigeria DHS, because it has one single-cluster stratum, the urban stratum of AKWA IBOM state. You may need to collapse it with the rural stratum of that state, because sampling error calculation does not allow a single-cluster stratum. If you want to replicate the DHS results on standard errors, you need to use V024 cross V025 as the stratification. But you can also use V022: it gives the exact stratification used in the sample selection, and this may give you more accurate results for other analyses such as modeling.
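A minimal sketch of the two stratification choices described above, assuming the `ors` dataset and `wgt` variable built in the original post. The V022 codes of the two AKWA IBOM strata are not given in this thread, so `akwa_urban` and `akwa_rural` are hypothetical placeholders that would have to be looked up in the V022 value labels; `ors_strat` and `strat_v022` are illustrative names as well.

```sas
/* Placeholders (hypothetical values): replace with the actual V022 codes
   of the urban and rural strata of AKWA IBOM state. */
%let akwa_urban = 9991;
%let akwa_rural = 9992;

data ors_strat;
    set ors;
    /* Collapse the single-cluster urban stratum into the rural stratum
       of the same state so that every stratum has at least two clusters. */
    strat_v022 = v022;
    if v022 = &akwa_urban then strat_v022 = &akwa_rural;
run;

/* (a) Stratification used for the published standard errors:
       region crossed with urban/rural. */
proc surveyfreq data = ors_strat;
    where h11 = 2 and v024 = 5;
    strata v024 v025;
    cluster v021;
    weight wgt;
    tables ors;
run;

/* (b) Design stratification (state by urban/rural) after collapsing. */
proc surveyfreq data = ors_strat;
    where h11 = 2 and v024 = 5;
    strata strat_v022;
    cluster v021;
    weight wgt;
    tables ors;
run;
```

Listing two variables in the STRATA statement makes each combination of their values a separate stratum, so no extra crossed variable is needed for (a).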

By the way, if you just want a point estimate with a confidence interval (CI), you would do better to use Proc Surveymeans instead of Proc Surveyfreq; Proc Surveymeans gives the 95% CI in its output. Note, however, that your limits may not exactly match the published ones: DHS uses the mean minus/plus 2 times the estimated standard error as confidence limits, whereas SAS uses the mean minus/plus the t(d.f., 95%) critical value times the standard error. You can use the "Class" statement to produce results for urban/rural and regions separately within Proc Surveymeans, and use "Domain" to produce results by other socio-economic, education or age group variables. Hope this helps.
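As a rough sketch of that suggestion, the step below estimates the ORS proportion with Proc Surveymeans and then recomputes DHS-style limits as the mean minus/plus 2 standard errors. It assumes the `ors` dataset from the original post, uses DOMAIN for the urban/rural and region breakdowns, and assumes the ODS table is named `Domain` with columns `Mean` and `StdErr`; check these names against your SAS/STAT release. `ors_est`, `ors_ci`, `lower_2se` and `upper_2se` are illustrative names.

```sas
/* Capture the domain estimates in a dataset (assumed ODS table name: Domain). */
ods output Domain = ors_est;

proc surveymeans data = ors mean stderr clm;
    where h11 = 2;            /* children with diarrhea, as in the original code */
    strata v024 v025;         /* stratification used for the published SEs */
    cluster v021;
    weight wgt;
    var ors;                  /* 0/1 indicator, so the mean is a proportion */
    domain v025 v024;         /* urban/rural and region breakdowns */
run;

/* DHS-style limits: estimate minus/plus 2 standard errors,
   in place of the t-based limits that SAS prints. */
data ors_ci;
    set ors_est;
    lower_2se = Mean - 2 * StdErr;
    upper_2se = Mean + 2 * StdErr;
run;
```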

1) If I were trying to produce the same health indicators for every state, what would be the appropriate strata to use? Would I ignore the strata altogether? Or would I just use urban/rural (V025), since in the sampling scheme each state was divided into urban and rural areas and then clusters were identified? I am aware that the standard errors could be large, but the unweighted denominators for most of my indicators are at least 25 children/people for most states. What sample size is too low in DHS surveys for the result to be considered useful?

2) Is there a way to create confidence intervals in SAS for under-5 mortality estimates? I tried writing a program for the under-5 mortality rates and can replicate the statistics in the report for the country and regions for Nigeria 2013, but it seems too complicated to create jackknife CIs for those rates, since they are a product of other ratios. Is there a simpler way to code that?

Thanks so much for your guidance so far! I really appreciate your input.

Quote:

To produce state-level results, you can use V022 (state cross urban/rural) as the stratification variable. In Proc Surveymeans, you can add "Domain SHSTATE". It should produce the indicator results at the state level with the SE and 95% CI. However, we do not recommend producing state-level estimates, especially for the main indicators, since the survey sample was not designed for this purpose. The sample size needed to produce a DHS indicator differs by indicator type. For the TFR and the mortality rates, at least 800 to 1000 households are required to produce reliable indicators (with acceptable precision).

Regarding the mortality rates, if you are able to replicate the rate itself, it should be easy to apply the jackknife variance estimation using the formula given in Appendix B of the DHS report: calculate a series of under-5 mortality rates, each time dropping one cluster, then apply the formula.

Thank you for your post.
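The last step can be sketched as a small macro, assuming the delete-one-cluster jackknife in the form usually given in the sampling error appendix of DHS reports: with k clusters, full-sample rate r and rate r_(i) with cluster i dropped, the pseudo-values are r_i = k*r - (k-1)*r_(i) and SE^2(r) = sum of (r_i - r)^2 divided by k*(k-1). Please check this against Appendix B before relying on it. The macro and all its parameter names are hypothetical, and it assumes you have already built a dataset with one row per cluster holding the rate recomputed without that cluster.

```sas
/* Hypothetical inputs:
     reps   = dataset with one row per cluster
     repvar = variable holding the rate recomputed with that cluster dropped
     full   = rate estimated from the full sample
     out    = output dataset with the SE and 2-SE limits */
%macro jk_se(reps=, repvar=, full=, out=jk_out);
    /* k = number of clusters (one replicate per cluster) */
    proc sql noprint;
        select count(*) into :k trimmed from &reps;
    quit;

    data &out;
        set &reps end = last;
        /* pseudo-value: r_i = k*r - (k-1)*r_(i) */
        pseudo = &k * &full - (&k - 1) * &repvar;
        /* running sum of squared deviations from the full-sample rate */
        ss + (pseudo - &full)**2;
        if last then do;
            se    = sqrt(ss / (&k * (&k - 1)));
            lower = &full - 2 * se;   /* DHS-style limits: rate -/+ 2 SE */
            upper = &full + 2 * se;
            output;
        end;
        keep se lower upper;
    run;
%mend jk_se;

/* Example call (dataset, variable and macro variable names are hypothetical): */
/* %jk_se(reps = u5mr_reps, repvar = u5mr_drop, full = &u5mr_full); */
```

Building the replicate dataset is the expensive part: the mortality program has to be rerun once per cluster, each time excluding that cluster's records.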