The DHS Program User Forum: India » Merged HH and Individual Data


Merged HH and Individual Data [message #16072]

Thu, 01 November 2018 17:11

SR
Messages: 3
Registered: November 2018

Member

Hello

I am working with NFHS4 to look at women's variables and need the household information for a full analysis. I have successfully merged the individual and household member data (in Stata). After the merge, the total number of women is 699,686, which corresponds to the number of women interviewed in NFHS4.

My question is:
1. I am assuming that multiple women from some households were interviewed. If so, how do I make sure that my analysis includes only the one woman who responded to the household-level questions, or only one woman per household?
I do not want repeated household records, that is, responses from several women belonging to the same household, because I want to look at household-level indicators.
My guess is that listwise deletion during analysis will solve this problem to an extent, but is there any suggestion on how to choose only one respondent per household?

Any help will be appreciated. Thanks!

[Updated on: Thu, 01 November 2018 17:20]


Re: Merged HH and Individual Data [message #16085 is a reply to message #16072]

Fri, 02 November 2018 10:40

Bridgette-DHS
Messages: 3230
Registered: February 2013

Senior Member

Following is a response from Senior DHS Stata Specialist, Tom Pullum:

I'm not sure what your question is. In order to do a merge of the IR and PR files for this survey, you must match hv024, hv001, hv002, hvidx in the PR file with v024, v001, v002, and v003, respectively, in the IR file. When there are multiple women respondents in the same household, this will give you all of them. Are you asking how to randomly select just one of these women, in households that had more than one woman? Why would you want to do that? If you do that, you will lose representativeness of the population of women (although you can restore representativeness by multiplying v005 by the number of women in the household) and you will reduce the effective sample size.
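The key matching described above can be sketched in Stata on toy data. This is only an illustration: the two small datasets are invented stand-ins for the IR and PR files, and a real run would load the survey files instead. The variable names (v024, v001, v002, v003, hv024, hv001, hv002, hvidx) are the ones named in the reply; hv104 (sex of household member) is included just to give the toy PR file some content.

```stata
* Toy stand-in for the IR (women's) file: one row per interviewed woman
clear
input v024 v001 v002 v003 v005
1 1 1 2 1000000
1 1 1 3 1000000
1 1 2 1 1500000
end
* Rename the IR keys to the PR names so both files share the same keys
rename (v024 v001 v002 v003) (hv024 hv001 hv002 hvidx)
tempfile ir
save `ir'

* Toy stand-in for the PR (household member) file: one row per member
clear
input hv024 hv001 hv002 hvidx hv104
1 1 1 1 1
1 1 1 2 2
1 1 1 3 2
1 1 2 1 2
1 1 3 1 1
end

* Each household member matches at most one interviewed woman
merge 1:1 hv024 hv001 hv002 hvidx using `ir'
* Keep only members who are also interviewed women
keep if _merge == 3
drop _merge
list hv024 hv001 hv002 hvidx v005
```

Both women from the first toy household remain after the merge, which is the behaviour the reply describes for households with multiple respondents.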

If you want to look at how the women's characteristics are related to household characteristics, you do not need to do any such subsampling. You would keep all the women, and just interpret the household's characteristics as characteristics of the woman. For some purposes, you could think of doing multi-level analysis and treating the household as the level 2 unit (and the cluster as level 3), but there is little gain from doing this because most households contain only one woman.
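A minimal sketch of the "keep all the women" approach, again on invented toy data rather than the real files: every woman is kept, and the household's characteristics are attached to her with a many-to-one merge. hv270 (the wealth index) stands in for any household-level variable, and the weight is rescaled with the usual DHS convention of dividing v005 by 1,000,000.

```stata
* Toy stand-in for the HR (household) file: one row per household
clear
input hv024 hv001 hv002 hv270
1 1 1 3
1 1 2 5
1 1 3 1
end
tempfile hr
save `hr'

* Toy stand-in for the IR (women's) file: one row per interviewed woman
clear
input v024 v001 v002 v003 v005
1 1 1 2 1000000
1 1 1 3 1000000
1 1 2 1 1500000
end
gen hv024 = v024
gen hv001 = v001
gen hv002 = v002

* Many women can belong to one household, hence m:1
merge m:1 hv024 hv001 hv002 using `hr'
* Drop households that contain no interviewed woman
keep if _merge == 3
drop _merge

* Household wealth is now read as a characteristic of each woman
gen wt = v005/1000000
tab hv270 [iweight = wt]
```

The household values are repeated for women who share a household, and the tabulation is then a distribution over women, not over households.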


Re: Merged HH and Individual Data [message #16089 is a reply to message #16085]

Fri, 02 November 2018 14:30

SR
Messages: 3
Registered: November 2018

Member

Thanks for your response and clarification. Yes, I was asking whether I could select one woman when there is more than one woman in the household. The India report for NFHS4 says 601,509 households and 699,686 women were interviewed. Some households might not have any women; most, as you suggest, would have only one woman; and some would have more than one. In the latter case, with a merge, the household variables would be replicated for each woman in that household, and I wanted to see if there was a way to take care of this replication. Thanks again.
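One way to see how much replication there actually is would be to count interviewed women per household in the IR file. The following is a rough sketch on invented toy data; the variables nwomen and hhtag are hypothetical names introduced here. Households with no interviewed woman do not appear in the IR file at all, so they are not counted.

```stata
* Toy stand-in for the IR (women's) file
clear
input v024 v001 v002 v003
1 1 1 2
1 1 1 3
1 1 2 1
1 2 1 4
end

* Number of interviewed women in each household
bysort v024 v001 v002: gen nwomen = _N

* Flag exactly one row per household so households are counted once
egen hhtag = tag(v024 v001 v002)

* Distribution of households by number of interviewed women
tab nwomen if hhtag
```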


Re: Merged HH and Individual Data [message #16099 is a reply to message #16089]

Mon, 05 November 2018 07:27

Bridgette-DHS
Messages: 3230
Registered: February 2013

Senior Member

Following is another response from Senior DHS Stata Specialist, Tom Pullum:

You could do something like the following. Before you do the merge, assign each woman in the IR file a random number; then sort the women within each household by this number; then keep the one who comes first in the sort. Then do the merge with the household file. In the following, I use the HR file rather than the PR file, and just keep the household-level variables.

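set more off
set maxvar 10000

* Household (HR) file: keep only the household-level variables
use "...IAHR74FL.DTA", clear
* Drop every variable with an underscore in its name
* (in the HR file, the per-member variables such as hvidx_01)
drop *_*
sort hv024 hv001 hv002
save ...IAHRtemp.dta, replace

* Women's (IR) file: randomly keep one woman per household
use "...IAIR74FL.DTA", clear
* Any fixed seed will do; it only makes the random draw reproducible
set seed 12345
gen rn = runiform()
sort v024 v001 v002 rn
egen sequence = seq(), by(v024 v001 v002)
tab sequence
keep if sequence == 1
gen hv024 = v024
gen hv001 = v001
gen hv002 = v002
sort hv024 hv001 hv002
* The using file must be the temporary HR file saved above
merge 1:1 hv024 hv001 hv002 using e:\DHS\DHS_data\scratch\IAHRtemp.dta
keep if _merge == 3
drop sequence _merge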
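As noted earlier in the thread, selecting one woman per household loses representativeness of the population of women unless v005 is multiplied by the number of women in the household. A minimal sketch of that adjustment on invented toy data follows; nwomen and v005adj are hypothetical names, the seed is arbitrary, and the count used here is the number of interviewed women found in the IR file.

```stata
* Toy stand-in for the IR (women's) file
clear
input v024 v001 v002 v003 v005
1 1 1 2 1000000
1 1 1 3 1000000
1 1 2 1 1500000
end

* Arbitrary seed so the random selection is reproducible
set seed 12345
gen rn = runiform()

* Count the women per household BEFORE dropping anyone
bysort v024 v001 v002 (rn): gen nwomen = _N

* Keep the woman with the smallest random number in each household
by v024 v001 v002: keep if _n == 1

* Scale the weight up by the number of women she now represents
gen v005adj = v005 * nwomen
list v024 v001 v002 v003 nwomen v005adj
```

The count has to be taken before the `keep`, otherwise every household would appear to contain exactly one woman.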


Re: Merged HH and Individual Data [message #16159 is a reply to message #16099]

Mon, 12 November 2018 21:28

SR
Messages: 3
Registered: November 2018

Member

Thank you Dr Pullum. That is very helpful.
regards


Re: Merged HH and Individual Data [message #17921 is a reply to message #16072]

Thu, 18 July 2019 07:19

sujata
Messages: 18
Registered: May 2019

Member

Hello,
I am working on the NFHS4 dataset and am having trouble matching my values with the report. I have merged the PR, IR and MR files, since some values are not in the MR file. But when I compare the values for variables like religion, caste, marital status and schooling with the published report, they come out different. I merged the files using the hv024, hv001, hv002 and hvidx variables from the PR file, v024, v001, v002 and v003 from the IR file, and the same variables from the MR file.
Any help will be highly appreciated.

