The DHS Program User Forum
Discussions regarding The DHS Program data and results
Total number households reported vs v002 [message #20267] Fri, 16 October 2020 22:50
Glory
Messages: 17
Registered: April 2017
Location: Newcastle
Member

Hi,

I'm trying to combine data sets from the 2003 and 2008 Kenya DHS. However, the total number of households given in the DHS reports does not correspond to the total number in the "v002" variable for either survey. Also, the total number of clusters is far greater than the total number of households in both surveys (400 vs 203), specifically in the women's file. This doesn't seem right.

Please let me know if there is something I'm missing.

Thanks

Re: Total number households reported vs v002 [message #20277 is a reply to message #20267] Mon, 19 October 2020 10:00
Bridgette-DHS
Messages: 2025
Registered: February 2013
Senior Member

Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

The cluster id (v001) is unique for each cluster. However, the household id (v002) is numbered within clusters. That is, each cluster has a household 1, etc. Similarly, the line number (v003) is numbered within households. Perhaps that is the source of the discrepancy.

To find out how many households there are you could enter a line such as "egen hh=group(v001 v002)" and then "summarize hh". The maximum value of hh would be the number of households in the survey.
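
For readers not working in Stata, the same idea can be sketched in plain Python. This is only an illustration on invented records (the values and the helper name count_households are assumptions, not DHS data): because household numbers restart in every cluster, a household is identified by the pair (v001, v002), not by v002 alone.

```python
# Toy records standing in for rows of a DHS data file; all values are invented.
records = [
    {"v001": 1, "v002": 1, "v003": 2},
    {"v001": 1, "v002": 1, "v003": 3},  # second respondent in the same household
    {"v001": 1, "v002": 2, "v003": 1},
    {"v001": 2, "v002": 1, "v003": 2},  # household 1 again, but in cluster 2
    {"v001": 2, "v002": 3, "v003": 1},
]


def count_households(rows):
    """Count distinct (cluster, household) pairs (hypothetical helper)."""
    return len({(r["v001"], r["v002"]) for r in rows})


# Counting distinct v002 values alone undercounts, since numbering restarts per cluster.
distinct_v002 = len({r["v002"] for r in records})
n_households = count_households(records)
print(distinct_v002, n_households)
```

On these toy records, v002 takes 3 distinct values while there are 4 distinct households.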
Re: Total number households reported vs v002 [message #20279 is a reply to message #20267] Mon, 19 October 2020 10:40
Glory

Thank you for the quick response.

I double-checked the procedure for estimating the total number of households as advised. The numbers of households in the data sets (individual women's file) still do not correspond to the reports for the 3 survey years (data set vs. report):

2003 - 6159 vs 8561
2008/09 - 6458 vs 9057
2014 - 24705 vs 36430

Please let me know if there is something I have missed.

Thank you.
Re: Total number households reported vs v002 [message #20283 is a reply to message #20279] Mon, 19 October 2020 14:12
Bridgette-DHS

Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

To get the number of households in the survey you should use either the HR or PR files. What I said before will give you the number of households in the IR file, i.e. households in which there were eligible and interviewed women. Many households do not have any women who were eligible (hv117=1) and interviewed. That accounts for the difference.
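
A small Python sketch may make the gap concrete. The rows below are invented, not taken from any DHS file: one list stands in for the household (HR) file with one row per household, the other for the women's (IR) file with one row per interviewed woman. Households with no interviewed woman never appear in the IR-style list, so counting households there gives a smaller number.

```python
# Invented toy data, for illustration only.
# HR-style: one row per household, as (hv001, hv002).
hr_rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
# IR-style: one row per interviewed woman, as (v001, v002, v003).
ir_rows = [(1, 1, 2), (1, 1, 4), (1, 3, 1), (2, 2, 2)]

households_in_hr = set(hr_rows)
households_in_ir = {(v001, v002) for v001, v002, _ in ir_rows}

# Households present in the HR-style data but with no interviewed woman.
missing_from_ir = sorted(households_in_hr - households_in_ir)
print(len(households_in_hr), len(households_in_ir), missing_from_ir)
```

Here the HR-style data has 5 households, while only 3 of them show up among the interviewed women.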
Re: Total number households reported vs v002 [message #20293 is a reply to message #20283] Tue, 20 October 2020 05:20
Glory

This was helpful, many thanks. I also wanted to clarify whether the clusters and households in all surveys are randomly selected from the same master list. If so, there could be temporal correlation within clusters and households, since a PSU or household might be selected in more than one survey.

Thank you.
Re: Total number households reported vs v002 [message #20294 is a reply to message #20293] Tue, 20 October 2020 09:03
Bridgette-DHS

Following is another response from DHS Research & Data Analysis Director, Tom Pullum:

The clusters are enumeration areas from the latest census, sampled with probability proportional to size. If a census occurs between two surveys, the sampling frame is updated. There is no correspondence between the cluster numbers in a DHS survey and the numbering system for the enumeration areas in the sampling frame, or between the cluster numbers in two successive surveys (except for the continuous surveys in Peru and Senegal). There is no way to know whether a specific enumeration area has appeared in two DHS surveys, but it is believed that the chance of this is extremely small. The chance that the same household would appear in two DHS surveys is much smaller than that. You can safely consider two successive surveys to be statistically independent.
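
One practical consequence when appending surveys: cluster 1 in one survey has nothing to do with cluster 1 in the next, so pooled identifiers need a survey label. The Python sketch below shows one possible way to do this on invented rows. The field names survey, cluster_id and household_id and the helper pool are assumptions made for illustration, not DHS variables.

```python
# Invented toy rows, as (v001, v002), from two separate surveys.
survey_2003 = [(1, 1), (1, 2), (2, 1)]
survey_2008 = [(1, 1), (2, 1), (2, 2)]


def pool(surveys):
    """Append surveys, tagging every id with its survey label (hypothetical helper)."""
    pooled = []
    for label, rows in surveys.items():
        for v001, v002 in rows:
            pooled.append({
                "survey": label,
                "cluster_id": (label, v001),          # unique across surveys
                "household_id": (label, v001, v002),  # unique across surveys
            })
    return pooled


pooled = pool({"2003": survey_2003, "2008": survey_2008})
n_clusters = len({r["cluster_id"] for r in pooled})
n_households = len({r["household_id"] for r in pooled})

# Without the survey label, unrelated clusters and households would be merged.
naive_clusters = len({v001 for v001, _ in survey_2003 + survey_2008})
print(n_clusters, n_households, naive_clusters)
```

With the label, the toy data has 4 clusters and 6 households. Without it, the two surveys' clusters would collapse into 2.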
Re: Total number households reported vs v002 [message #20318 is a reply to message #20294] Thu, 22 October 2020 09:08
Glory

This was helpful. Thank you.