The DHS Program User Forum
Discussions regarding The DHS Program data and results
NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #2134] Wed, 07 May 2014 10:45
drsanmis
Hi,
I am looking at a categorical outcome in the NFHS 3 data. Do I need to use PROC LOGISTIC or PROC SURVEYLOGISTIC in SAS? (Published papers have used either.)
If PROC SURVEYLOGISTIC is used, the syntax requires STRATA and CLUSTER statements. I am not sure which variables in the dataset identify the strata and clusters.
It would be very helpful if someone could guide me on this.
Re: NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #2167 is a reply to message #2134] Thu, 15 May 2014 08:58
Bridgette-DHS
Our SAS Specialist is reviewing your posting, and we will get back to you soonest.

Thanks.
Re: NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #2169 is a reply to message #2134] Thu, 15 May 2014 09:21
Bridgette-DHS
Following is a response from our Senior SAS Specialist, Ruilin Ren.

Whether to use PROC LOGISTIC or PROC SURVEYLOGISTIC depends on the purpose of your analysis. If you are interested in exploring patterns in the survey data themselves and do not intend to extrapolate to the entire survey population, you can use PROC LOGISTIC. But if you intend to explore patterns in the entire survey population, you need to use PROC SURVEYLOGISTIC. In the latter case, you need to declare CLUSTER (variable HV001 at household level, V001 at individual level) and STRATA (HV022 at household level, V022 at individual level). You also need to declare the sampling weight WEIGHT (HV005 divided by 1000000 at household level, and V005 divided by 1000000 at individual level).
Re: NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #2660 is a reply to message #2169] Sat, 02 August 2014 22:22
ABLR
Just to double-check, the strata variable would be v022 and not v023 or a combination of v024 and v025? Isn't it unusual in DHS data for v022 to be the strata variable? [I'm using Stata, but I assume this wouldn't differ across software.] Thanks for checking!
Re: NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #2739 is a reply to message #2660] Tue, 19 August 2014 11:45
Bridgette-DHS
Another response from Ruilin Ren:

For NFHS-3, the stratification should be HV024 (V024) crossed with HV025 (V025). HV022 (V022) is not the stratification variable.
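
For Stata users following this thread, the declaration described here could be sketched roughly as follows. This is only an illustration under stated assumptions: the NFHS-3 individual recode is assumed to be in memory already, `wgt` and `strata` are made-up variable names, `outcome` stands in for whatever 0/1 outcome you have constructed, and `v106` (education level) is used purely as an example covariate.

```stata
* Assumes the NFHS-3 individual recode is already loaded.
* wgt, strata and outcome are hypothetical names used for illustration.

* Rescale the sampling weight
gen wgt = v005/1000000

* Stratification: state (v024) crossed with urban/rural residence (v025)
egen strata = group(v024 v025)

* Declare the design: v001 is the cluster number
svyset v001 [pweight = wgt], strata(strata)

* Design-based logistic regression on a binary outcome
svy: logit outcome i.v106
```

For household-level files the same pattern would apply with hv001, hv005, hv024 and hv025 in place of the individual-level variables.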

Re: NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #6751 is a reply to message #2739] Thu, 09 July 2015 06:20
Kathryn Kershaw
Hi,

My query is a request for clarification on this thread. In Stata I used the following commands to survey-set my data:

gen wgt = v005/1000000
svyset v021 [pweight = wgt], strata(v023)


But I found this thread and thought maybe I should create a new strata variable:

egen strata = group(v024 v025)

This produces a different variable from v023, with separate groups for the rural and urban parts of each state, which v023 doesn't have.
Could you clarify which one I should use, please?

Many thanks!

Kathryn
Re: NFHS 3 data - what variables define STRATA and CLUSTER for SAS analysis [message #6755 is a reply to message #6751] Thu, 09 July 2015 17:15
Reduced-For(u)m

Check the country report for the country and year you are using. It should tell you how the sampling was done (see the introduction or the appendix). From that, you should be able to determine which stratification is correct. It is not the same for every country, but in my experience stratification by region x urban [group(v024 v025)] is the most commonly used.
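
Alongside reading the report, it may help to look at how the two candidate definitions actually differ in the data. The sketch below is one possible check, not a definitive procedure: it assumes the individual recode is in memory, and `strata_alt` is a made-up name chosen so it does not clash with any variable you have already created.

```stata
* Assumes the NFHS-3 individual recode is in memory.
* strata_alt is a hypothetical variable name.
egen strata_alt = group(v024 v025)

* Number of distinct categories under each candidate definition
quietly tabulate v023
display "v023 categories: " r(r)
quietly tabulate strata_alt
display "v024 x v025 categories: " r(r)

* Every cluster should fall inside exactly one stratum;
* this stops with an error if any cluster spans two strata
bysort v001 (strata_alt): assert strata_alt[1] == strata_alt[_N]
```

If the counts differ, the sampling description in the report is what should settle which definition matches the design.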