The DHS Program User Forum      
Discussions regarding The DHS Program data and results
2011 stata data file [message #8408] Sat, 24 October 2015 13:02
rsroche
Hi,

I am using the 2011 Stata data set for Uganda, and it appears to be missing any data for the HV022 Sample Strata variable. Without this I am not sure how I can apply the survey weighting. Is there a reason why it is missing, or should I be using another variable?

Thanks for your help,

Rachel
Re: 2011 stata data file [message #8421 is a reply to message #8408] Mon, 26 October 2015 14:47
Reduced-For(u)m
Which recode are you using? Is the variable "V022" there?

Re: 2011 stata data file [message #8427 is a reply to message #8421] Tue, 27 October 2015 01:12
rsroche
It's the Household recode. V022 is not there. I have been advised elsewhere that there has been confusion about HV022, as it is not the strata variable but the sample strata used for sampling errors. I'm no longer sure which variable I should have been using as the strata, and maybe HV022 is the wrong one, but I've used it in my analyses of other data sets and it seemed to work. Any advice is greatly appreciated.

Thanks for your help.

Rachel
Re: 2011 stata data file [message #8461 is a reply to message #8427] Thu, 29 October 2015 14:54
Liz-DHS
Dear User,
Here is a response from Senior Statistician, Dr. Ruilin Ren,
Quote:

Variable V022 is at the individual level and HV022 is at the household level, but the two variables have the same contents. In older survey data sets, HV022/V022 might hold codes for the paired clusters used for sampling error calculation, even though the variable is labeled as the sampling stratum. Checking the frequency of the variable will help you tell whether it holds paired clusters (HV022/V022 then has a large number of categories) or not. In recent surveys, HV022/V022 is labeled "sample strata used for sampling errors", but it is actually the sampling stratum used in sample selection. Another variable, HV023/V023, is labeled "sample strata used for sample selection"; it is the same thing as HV022/V022 without exception. It may be that not both of the two variables are coded; if so, use the one that is coded. Again, you can check the frequency of the variable. In any case, if you cross HV025/V025 with HV024/V024, you will get the correct stratification in most cases, or something close enough to the true stratification to be good enough for all analysis purposes where stratification information is required.

If you have additional questions, please feel free to post again. Thank you!
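
The two checks suggested in the quoted reply (tabulate HV022/HV023 to see whether they are coded, then cross HV024 with HV025) can be sketched as follows. This is only an illustration in plain Python, assuming the household records have already been loaded as dictionaries; the sample values and the helper names `frequency`, `is_coded`, and `cross_strata` are hypothetical and not part of any DHS tool.

```python
from collections import Counter

# Made-up household records for illustration only. The variable names follow
# the DHS household recode, but the values are NOT real Uganda 2011 data.
households = [
    {"hv022": None, "hv023": None, "hv024": 1, "hv025": 1},
    {"hv022": None, "hv023": None, "hv024": 1, "hv025": 2},
    {"hv022": None, "hv023": None, "hv024": 2, "hv025": 1},
    {"hv022": None, "hv023": None, "hv024": 2, "hv025": 2},
    {"hv022": None, "hv023": None, "hv024": 1, "hv025": 1},
]


def frequency(rows, var):
    """Tabulate a variable; missing values are counted under the key None."""
    return Counter(row.get(var) for row in rows)


def is_coded(rows, var):
    """Return True if at least one record has a non-missing value for var."""
    return any(row.get(var) is not None for row in rows)


def cross_strata(rows, region_var="hv024", residence_var="hv025"):
    """Give each observed (region, residence) pair its own stratum id."""
    pairs = sorted({(row[region_var], row[residence_var]) for row in rows})
    ids = {pair: number for number, pair in enumerate(pairs, start=1)}
    return [ids[(row[region_var], row[residence_var])] for row in rows]


print(frequency(households, "hv022"))   # every value is missing in this toy data
print(is_coded(households, "hv022"))    # so hv022 cannot be used as the strata
print(cross_strata(households))         # one stratum id per household
```

A variable with a very large number of categories in the tabulation would suggest paired clusters rather than design strata, as the reply notes. The crossing only works if the region and residence variables are themselves coded.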
Re: 2011 stata data file [message #8479 is a reply to message #8461] Mon, 02 November 2015 13:26
luke
I have the same question as Rachel. In other data sets, I use the hv022 sampling strata to calculate confidence intervals and standard errors on binary variables, but in Uganda 6 hv022 consists only of null values. Rachel - it should be possible to recreate hv022 from the sampling frame description in the final report, but I have not done this. Have you had any luck?
Re: 2011 stata data file [message #8481 is a reply to message #8479] Mon, 02 November 2015 14:36
luke
Update: The DHS Final Report for Uganda 2011 (FR264) says "clusters, were selected from among a list of clusters sampled for the 2009/10 Uganda National Household Survey (2010 UNHS)." The UNHS 2010 report is available for download online, and that report says that ten regions compose the strata. UNHS did not divide each region into urban/rural for sampling, which we often see in other surveys and which had been done for an earlier UNHS.

So, I believe that you can pass region to svyset as the strata. That would certainly be true if you were analyzing UNHS data, but since the Uganda 2011 DHS clusters are a random sample of UNHS clusters, I'm not totally sure this is correct. Perhaps there is no clearly defined "strata" which is why hv023 is left as null. In any case, it might be good enough for your purposes.
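
To show where the strata choice enters the standard error of a binary variable, here is a minimal sketch of a with-replacement, Taylor-linearized variance estimate for a weighted proportion. It assumes no finite population correction and at least two clusters per stratum. It is an illustration of the general technique, not a reproduction of any particular package's output, and the function name `svy_proportion` and the input layout are hypothetical.

```python
import math
from collections import defaultdict


def svy_proportion(rows):
    """Estimate a weighted proportion and its linearized standard error.

    rows: iterable of (stratum, psu, weight, y) tuples with y equal to 0 or 1.
    Assumes clusters (PSUs) are sampled with replacement within each stratum.
    Returns (p, se).
    """
    rows = list(rows)
    total_weight = sum(w for _, _, w, _ in rows)
    p = sum(w * y for _, _, w, y in rows) / total_weight

    # Linearized score for each PSU: sum of w * (y - p) / total_weight.
    scores = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        scores[stratum][psu] += w * (y - p) / total_weight

    variance = 0.0
    for stratum, psu_scores in scores.items():
        n_h = len(psu_scores)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} has fewer than two PSUs")
        mean_h = sum(psu_scores.values()) / n_h
        squares = sum((z - mean_h) ** 2 for z in psu_scores.values())
        # Between-PSU variation is measured within each stratum separately.
        variance += n_h / (n_h - 1) * squares

    return p, math.sqrt(variance)


# Toy data: two strata with two clusters each, all weights equal to 1.
toy = [
    ("A", 1, 1.0, 1), ("A", 1, 1.0, 0),
    ("A", 2, 1.0, 1), ("A", 2, 1.0, 1),
    ("B", 3, 1.0, 0), ("B", 3, 1.0, 0),
    ("B", 4, 1.0, 1), ("B", 4, 1.0, 0),
]
p, se = svy_proportion(toy)
print(p, se)
```

Because the squared deviations are taken around each stratum's own mean, a coarser strata definition (region only instead of region by urban/rural) changes the standard error but not the point estimate.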
Re: 2011 stata data file [message #8630 is a reply to message #8481] Mon, 23 November 2015 09:25
duke2015
Hi Luke,

I looked online and found this link for the UNHS 2010 report: http://www.ubos.org/UNHS0910/unhs200910.pdf. If you look at the sampling section, it says that:

The UNHS 2009/10 sample was designed to allow reliable estimation of key
indicators for the Uganda, rural-urban, and separately for ten sub regions. A
two-stage stratified sampling design was used. At the first stage,
Enumeration Areas (EAs) were grouped by districts and rural-urban
location; then drawn using Probability Proportional to Size (PPS). At the
second stage, households which are the Ultimate Sampling Units were
drawn using Systematic Sampling.

It looks like they used district (not sure if this is the same as region?) and rural/urban for their stratification. Where in the report did you see that they didn't use rural/urban?
Re: 2011 stata data file [message #11425 is a reply to message #8630] Mon, 19 December 2016 10:40
jack.murphy
Hello all,

I am using the Individual Recode 2011 Uganda DHS data set in Stata, and have encountered a very high number of missing values (90-100% of the values) for the variables v023, v024, v101 and v139. Is there another variable that contains the region/district values? My goal is to make a "stratumid" variable from v025 (urban/rural) and region.

Thanks,
Jack