The DHS Program User Forum
Discussions regarding The DHS Program data and results
Set up DRC data in Stata [message #2871] Thu, 04 September 2014 12:15
Ryan
Messages: 6
Registered: May 2013
Location: Washington
Member
Hello,

I would like to calculate means for a number of variables across provinces in the DRC (2007 dataset). I am not sure whether the sampling procedure precludes this, and I am a little confused about how to set up my analysis (weights, strata, PSU) in Stata. The main issue is that no strata are reported (v022 is empty). I understand that the DRC used an uncommon three-stage sample design, but between my poor French and the unfamiliar design, I'm not sure how to proceed.

Any advice would be greatly appreciated.

Thank you,

Ryan
Re: Set up DRC data in Stata [message #2876 is a reply to message #2871] Fri, 05 September 2014 13:33
Bridgette-DHS
Messages: 3199
Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

In this survey the stratum variable can be constructed as v024 x v025. In all the DHS surveys I know about, this will give the strata or be the best approximation if, say, v022 is not in the data. The summarize command does not accept pweights or the svy prefix, but "svy: mean y, over(x)" will work. For example, for y=v201 (children ever born), here is the syntax:

* Open CDIR50FL.dta

* region or province: v024
* children ever born: v201
* stratum: v024 x v025
* weight: v005
* cluster: v001 or v021 (the same)

egen stratum=group(v024 v025)
svyset v001 [pweight=v005], strata(stratum)
svy: mean v201, over(v024)
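
Before relying on the estimates, it can help to confirm what the constructed design looks like. The lines below are a minimal sketch, assuming CDIR50FL.dta sits in the current working directory; the variable name onestratum is introduced here purely for illustration.

```stata
* Sketch only: assumes CDIR50FL.dta is in the current working directory.
use CDIR50FL.dta, clear

* Build the approximate stratum identifier from province and urban/rural.
egen stratum = group(v024 v025)

* Count distinct strata: tag one observation per stratum, then count the tags.
egen byte onestratum = tag(stratum)
count if onestratum

* Declare the design, then list each stratum with its number of clusters (PSUs).
svyset v001 [pweight=v005], strata(stratum)
svydes
```

The svydes listing also makes it easy to spot any stratum that contains a single cluster, which would prevent Stata from computing standard errors.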

Re: Set up DRC data in Stata [message #2877 is a reply to message #2876] Fri, 05 September 2014 13:38
Ryan
Messages: 6
Registered: May 2013
Location: Washington
Member
Excellent. Thank you Tom (and Bridgette).
Re: Set up DRC data in Stata [message #2878 is a reply to message #2877] Fri, 05 September 2014 13:39
Bridgette-DHS
Messages: 3199
Registered: February 2013
Senior Member
You're welcome.
Re: Set up DRC data in Stata [message #11761 is a reply to message #2878] Tue, 07 February 2017 02:05
soniwe
Messages: 7
Registered: October 2016
Location: Auckland
Member
Hi there,

I have just seen this post and have come across the same problem with DRC 2007. However, the report definitely says that there were 34 sampling strata created using three strata (statutory city, city, rural area) for 10 of the 11 provinces (except Kinshasa). I can't find any variable that relates to these three strata, and even if there was, that would create 31 strata not 34! Using v024 and v025 gives 21 strata. I am now quite confused about how this stratification was done. Can you give any more clarification on this?

I'm having a similar problem with Uganda 2011. There is no data for sample strata in either v022 or v023. The report states that a stratified design was used, but does not explicitly say what the strata are. I assumed v024 and v025 (region and urban/rural area of residence), but just wanted to verify, as Uganda 2006 DHS seems to use region plus refugee camps, not urban/rural.

Thanks,

Sonia
Re: Set up DRC data in Stata [message #11783 is a reply to message #11761] Wed, 08 February 2017 14:31
Bridgette-DHS
Messages: 3199
Registered: February 2013
Senior Member
Following is a response from DHS Senior Sampling Specialist, Ruilin Ren:

Regarding the DRC 2007 survey, the stratification is given in Appendix A. Urban areas were split into two categories, with rural areas as a third; because Kinshasa has only the statutory city category, there were 34 sampling strata in total. However, the data file may not distinguish the two urban categories, so crossing V024 and V025 does not reproduce the exact stratification. If you want to know which stratum is which, produce a frequency table of the number of clusters by v024 and the stratification variable and compare it with the sample allocation table; you will then see exactly which stratum is which.
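
The frequency table described above could be sketched as follows. Because the true stratification variable is not coded in the file, v025 is used here as a stand-in, so the table can only be compared with the sample allocation table at the urban/rural level. The variable name onecluster is introduced for illustration.

```stata
* Sketch only: v025 stands in for the stratification variable,
* which is not coded in the data file.
use CDIR50FL.dta, clear

* Tag one record per cluster so the table counts clusters, not women.
egen byte onecluster = tag(v001)

* Number of clusters by province (rows) and urban/rural residence (columns).
tabulate v024 v025 if onecluster
```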

Regarding the Uganda 2011 survey, it is true that Appendix B does not give the stratification details, but the sample allocation table in Appendix A shows the stratification, because the sample was allocated by stratum. Crossing V024 with V025 will give the correct stratification.
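
For the Uganda 2011 case, a minimal sketch of checking the stratum variables and then building the strata might look like this. The file name uganda2011_ir.dta is a hypothetical placeholder for the Uganda 2011 individual recode file, and using v001 as the cluster identifier is assumed to carry over from the DRC example earlier in the thread.

```stata
* Hypothetical file name; substitute the Uganda 2011 individual recode file.
use uganda2011_ir.dta, clear

* Confirm that v022 and v023 carry no usable stratum information.
tabulate v022, missing
tabulate v023, missing

* Construct the strata by crossing region with urban/rural residence.
egen stratum = group(v024 v025)

* Declare the design (v001 as the cluster identifier is an assumption here).
svyset v001 [pweight=v005], strata(stratum)
```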
Re: Set up DRC data in Stata [message #11785 is a reply to message #11783] Thu, 09 February 2017 02:23
soniwe
Messages: 7
Registered: October 2016
Location: Auckland
Member
Thanks for your response, and the clarification on the Uganda 2011 sample. For DRC 2007, I have now understood why there are 34 strata, but I am still not sure what to use as the "stratification variable" as v022 has no data in it and I can't find anything else that looks relevant for the three strata ville statutaire, cite, secteur/chefferie, and the four districts of Kinshasa. How do I correctly create a strata variable for the svyset command?
Re: Set up DRC data in Stata [message #11799 is a reply to message #11785] Fri, 10 February 2017 08:32
Bridgette-DHS
Messages: 3199
Registered: February 2013
Senior Member
Following is another response from Ruilin Ren:

I do not know why the stratification variable was not coded in the final data. However, if your aim is just statistical analysis and you do not intend to compare the urban categories (meaning you do not need to identify the stratum for a restricted analysis), then using V024 crossed with V025 as the stratification will be good enough, although it is not the exact stratification. In practice, the two categories of urban areas do not differ much in the DRC setting.
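
One way to see what the choice of strata affects is to run the same estimate under two design declarations and compare. The sketch below reuses the DRC example from earlier in the thread; the stored-estimate names nostrata and approx are introduced for illustration. The point estimates should be identical in both columns, since stratification enters only the variance estimation, so any difference will show up in the standard errors.

```stata
* Sketch only: assumes CDIR50FL.dta is in the current working directory.
use CDIR50FL.dta, clear
egen stratum = group(v024 v025)

* Design 1: clusters and weights only, no stratification.
svyset v001 [pweight=v005]
svy: mean v201, over(v024)
estimates store nostrata

* Design 2: approximate strata from v024 x v025.
svyset v001 [pweight=v005], strata(stratum)
svy: mean v201, over(v024)
estimates store approx

* Side-by-side means with standard errors under each design.
estimates table nostrata approx, se
```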