I'm doing an analysis with Stata by pooling all DHS surveys (children recode) from 1900 - 2013.

When I specify svyset to account for the survey design, the regression returns no standard errors because some strata contain only one PSU. In some cases this happens because Stata drops observations with missing outcome or explanatory variables when it runs the regression. In other cases it is a result of the sample design.

What should I do with strata that have a single PSU?

Stata has an option that lets me treat all of those PSUs as if they were selected into the sample with a probability of 1. Is this the right way to handle all the strata with only one PSU?

Thanks a lot for your help!


Second, if people are being dropped from your model because of missing explanatory variables, you have bigger issues than the svyset problem you are describing. Perhaps they are missing values on just one, but not all, of your explanatory variables. If a substantial number of people are being dropped because of missing values, I suggest you find out which variables are causing it. If large numbers are dropping out because of missing values on a small number of variables, you can create a missing flag for each such variable, equal to 1 if the value is missing and 0 otherwise, and recode the missing values in the explanatory variable itself to 0. That way, every observation that is not missing on the outcome stays in the model. The coefficients on the missing flags may not have a meaningful interpretation, but at least you are not selectively losing people over one or two variables with missing values out of the 10 or however many you have in your model.
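
A minimal sketch of the flag approach in Stata, meant to be run from a do-file. The names are placeholders, not DHS variables: y stands for the outcome, and x1 and x2 stand for explanatory variables with missing values. It first tabulates missingness to show which variables are responsible, then builds the flags and recodes.

```stata
* Placeholder names: y is the outcome, x1 and x2 are explanatory variables
* Step 1: see how many observations are missing on each model variable
misstable summarize y x1 x2

* Step 2: for each explanatory variable, flag missing values and recode them to 0
foreach v of varlist x1 x2 {
    * flag is 1 if the value is missing, 0 otherwise
    generate byte miss_`v' = missing(`v')
    * recode the missing values themselves to 0
    replace `v' = 0 if missing(`v')
}

* Step 3: include both the recoded variables and their flags in the model
svy: regress y x1 x2 miss_x1 miss_x2
```

Observations that are missing on y are still dropped, so the outcome needs to be checked separately.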

Either way, it seems like you might be dealing with selection issues here. I would investigate the explanatory variables first and see if creating those flags helps. Then I would look at the outcome variable. Are its values missing at random, or is there endogeneity/self-selection into answering or not answering the question used for your outcome?

HTH,

RHS


I am working with a couple of DHS surveys and I encountered the problem of strata with single PSU in three of them:

1)Tanzania 2011

2)Nigeria 2013

3)Cameroon 2004

I would like to ask what it means, and why it is the case, that all of the strata in these surveys have only one PSU. How should I deal with this problem if I want to use the svy: prefix in Stata to account for the survey design?

Many thanks for any suggestions,

Ewa

I recommend that you always add the "singleunit" option to svyset. There are three versions: centered, scaled, and certainty. It makes very little difference which one you use. I usually use the centered option. That is, you pick one of the following three versions of svyset, for example:

svyset v021 [pweight=v005], strata(stratumid) singleunit(centered)

svyset v021 [pweight=v005], strata(stratumid) singleunit(scaled)

svyset v021 [pweight=v005], strata(stratumid) singleunit(certainty)
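
To see which strata are affected before settling on an option, something like the following may help. It is only a sketch: stratumid is the stratum variable from the commands above, while y, x1 and x2 are placeholder names for the outcome and explanatory variables in your model.

```stata
* Declare the design as above
svyset v021 [pweight=v005], strata(stratumid) singleunit(centered)

* List only the strata that contain a single PSU in the data as they stand
svydescribe, single

* List the strata left with a single PSU once observations with missing
* values on the model variables are excluded (y, x1, x2 are placeholders)
svydescribe y x1 x2, single
```

Comparing the two listings shows whether the single-PSU strata come from the sample design or from observations lost to missing values.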
