Appending multiple waves of the NFHS and PSU codes [message #18169]
Thu, 03 October 2019 07:41
Niranjana
Messages: 13 Registered: October 2019
Member
Hello,
I am working with the women's module of the Indian NFHS. I would like to create a repeated cross-sectional dataset from all four rounds of the survey in Stata, i.e. append all four datasets. I have pulled the individual datasets for all four rounds, kept only the variables I am looking at, and recoded them. I would like to confirm that appending these datasets is a sound way to get the repeated cross-sectional dataset I am looking for.
Also, what is the difference between PSU numbers in NFHS 2 and 3 and cluster numbers in NFHS 1 and 4? I want to use -svyset- on the data when running summary statistics and analyses, and since the PSU is a key variable for that, I am not sure how to proceed with PSU numbers that vary across waves. For example:
PSU ranges between
2015-16 : 10001 to 360482
2005-06 : 1001 to 33214
1998-99 : 1001 to 33214
1992-93 : 4 to 341
I would really appreciate it if the experts at DHS could provide some guidance on that front. Thank you!
Niranjana
[Updated on: Mon, 14 October 2019 11:21]
Re: Appending multiple waves of the NFHS and PSU codes [message #18242 is a reply to message #18169]
Fri, 18 October 2019 13:42
Bridgette-DHS
Messages: 3199 Registered: February 2013
Senior Member
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
These four surveys have very different dates and sample sizes. You can append them as repeated cross-sections in order to make comparisons, but I think it would be meaningless to regard them as a single cross-section, if that is what you are suggesting. For example, a CPR calculated from all cases in the four surveys would not have a clear reference population.
In general, in DHS datasets, the PSU is given both as v001 and v021 (or hv001 and hv021). The two almost always agree exactly. If they do not, use v021 (or hv021). For a small number of surveys only one of the two is included, in which case you use the one that is included.
svy and svyset after appending multiple waves of the NFHS [message #18304 is a reply to message #18242]
Mon, 04 November 2019 05:19
Niranjana
Messages: 13 Registered: October 2019
Member
Dear Bridgette and Tom,
While working with the appended version of the DHS India women's files (all four waves appended to form a repeated cross-sectional dataset), I've found that the strata variable (v023) is coded differently across the years from 1992-93 to 2015-16. I would like to work with the -svy- and -svyset- commands but am concerned that the variable definition varies significantly between survey waves. Currently, this is how I was hoping to run the commands to obtain SEs and CIs:
gen wt=v005/1000000
egen stratum = group(v024 v025)
svyset v021 [pw=wt], strata(stratum)
br stratum
svy : mean var_interest, over(v024)
However, I am unable to produce any standard errors or CI after this and I get the following error message:
Missing standard errors because of stratum with single sampling unit
I am also not sure if I need to generate a different weight variable for each year of the India women's survey rounds.
Do let me know how to proceed on this front.
Thank you,
Niranjana
[Updated on: Mon, 04 November 2019 05:43]
Re: svy and svyset after appending multiple waves of the NFHS [message #18327 is a reply to message #18304]
Mon, 11 November 2019 08:52
Bridgette-DHS
Messages: 3199 Registered: February 2013
Senior Member
Following is a response from Senior DHS Specialist, Kerry MacQuarrie:
When appending surveys, you will have to do the following two steps:
1. Harmonize the strata variable;
2. Ensure that variable has unique values across all surveys.
(Same for the PSU variable).
Step 1:
The appropriate strata variable is usually v023 but not always. In our workshops, we train participants to match it by examining Appendix A. You may want to create a new variable, called "strata", before appending that is equal to v024 x v025 for NFHS-1 and NFHS-2, and equal to v022 for NFHS-3 and NFHS-4.
Step 2:
There may be equivalent values on this variable in 2 different surveys that refer to different strata (or PSUs). I typically handle this by adding a prefix, e.g. add 100 or 1,000 to the strata variable in the first survey, 200 or 2,000 to the strata variable in the second survey, etc. This should be done BEFORE appending. (Note: Because of the number of strata and PSUs in all of India's surveys, but especially NFHS-4, you may need more digits!)
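The two steps above can be sketched in Stata roughly as follows. This is a minimal sketch under stated assumptions, not a tested recipe: the file names nfhs1.dta to nfhs4.dta are hypothetical, the new variable names (survey, strata, strata_u, psu_u) are illustrative, and it assumes v021 is present in every round and v022 in NFHS-3 and NFHS-4. The offset of 1,000,000 per round is a choice made here because the NFHS-4 PSU codes quoted earlier in the thread run to six digits.

```stata
* Hypothetical file names; replace with your own trimmed recode files.
forvalues r = 1/4 {
    use "nfhs`r'.dta", clear
    gen byte survey = `r'

    * Step 1: harmonize the strata definition within each round.
    if `r' <= 2 {
        egen long strata = group(v024 v025)   // v024 x v025 for NFHS-1, NFHS-2
    }
    else {
        gen long strata = v022                // NFHS-3, NFHS-4
    }

    * Step 2: make strata and PSU codes unique across rounds BEFORE appending.
    * Assumption: an offset of 1,000,000 per round leaves room for 6-digit codes.
    gen long strata_u = `r' * 1000000 + strata
    gen long psu_u    = `r' * 1000000 + v021

    save "nfhs`r'_prepped.dta", replace
}

* Append the four prepared files into one repeated cross-sectional file.
use "nfhs1_prepped.dta", clear
append using "nfhs2_prepped.dta" "nfhs3_prepped.dta" "nfhs4_prepped.dta"
```

Storing the new codes as long avoids the rounding that can affect large integers held as float.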
Re: svy and svyset after appending multiple waves of the NFHS [message #18333 is a reply to message #18327]
Mon, 11 November 2019 09:30
Bridgette-DHS
Messages: 3199 Registered: February 2013
Senior Member
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
We apologize for the long delay with this reply. There is no need to append the files. You can append them if you want, of course, but the resulting file will be very large because of the size of the NFHS surveys.
You need to include v024 (or hv024) as part of the id if you do any merging of the India files (within a single round).
In DHS surveys in general, the PSU and cluster are usually the same, but not always. Usually the cluster is v001 and the PSU is v021, and you can confirm that v001=v021. In the India surveys, as I recall, you have v021 and not v001. If you have both v001 and v021 and they are different, then svyset should use v021.
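The check described above could be run on each round's file before any svyset call. A minimal sketch, assuming one round's file is already in memory and that at least one of v001 and v021 exists in it; psu_var is a hypothetical local macro name used only for illustration:

```stata
* Pick the PSU variable for svyset: prefer v021, fall back to v001.
capture confirm variable v021
local has_v021 = (_rc == 0)
capture confirm variable v001
local has_v001 = (_rc == 0)

if `has_v021' & `has_v001' {
    count if v001 != v021          // number of cases where the two disagree
    local psu_var v021             // use v021 whether or not they agree
}
else if `has_v021' {
    local psu_var v021             // only v021 is present in this file
}
else {
    local psu_var v001             // assumes v001 is the one that is present
}
display "PSU variable to use: `psu_var'"
```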
Re: svy and svyset after appending multiple waves of the NFHS [message #18361 is a reply to message #18335]
Mon, 18 November 2019 07:38
Bridgette-DHS
Messages: 3199 Registered: February 2013
Senior Member
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
In the appended file, I suggest that you include a variable "survey" that is numbered 1, 2, 3, 4 for the successive NFHS rounds. You then construct the combined PSU number with one of the "egen" functions, specifically "group". That is, if the PSU is given by v021, the command would be "egen PSU_all=group(survey v021)". You need to do something similar for the strata, for example "egen strata_all=group(survey v023)". There are other ways to develop unique id codes for the clusters and strata, but this is the easiest way.
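Put together with the weight and estimation commands from the earlier post, the suggestion might look like the sketch below. It assumes the appended file already contains "survey" (coded 1 to 4) along with v005, v021 and v023 in every round; if the strata were harmonized into a different variable before appending, that variable would replace v023. The singleunit(centered) option is an assumption added here as one of Stata's ways of handling a stratum with a single PSU; it is not a recommendation made in this thread.

```stata
* Run on the appended file. Assumes survey, v005, v021 and v023 exist in all rounds.
egen PSU_all    = group(survey v021)     // unique PSU id across rounds
egen strata_all = group(survey v023)     // unique stratum id across rounds
gen double wt   = v005 / 1000000         // DHS sampling weight, rescaled

* Assumption: singleunit(centered) is used only so that standard errors are
* reported when some stratum holds a single PSU; other methods exist.
svyset PSU_all [pw = wt], strata(strata_all) singleunit(centered)

* List any strata that contain only one sampling unit.
svydescribe, single

* Estimate by round rather than pooling all rounds into one figure.
svy: mean var_interest, over(survey)
```

If svydescribe still reports many single-unit strata, that suggests the strata variable needs harmonizing rather than a different singleunit() setting.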