The DHS Program User Forum
Discussions regarding The DHS Program data and results
Merging from multiple countries [message #16054] Tue, 30 October 2018 18:47
no7321
Hi,

I created a pooled dataset from the most recent IR files of six South Asian countries (Afghanistan, Bangladesh, India, Maldives, Nepal, Pakistan).

I then realized that I also needed a variable from the household files (how households treat their water, e.g. by boiling), so I wanted to merge the HR data files with the pooled IR dataset. I first created a pooled HR dataset for the same countries and then ran the following code:

/*merge HR to IR*/
use "E:\South Asia\dta\sasiahr.dta"
gen v000=hv000
gen v001=hv001
gen v002=hv002
gen v003=hv003
sort v000 v001 v002 v003
save "E:\South Asia\dta\sasiahr.dta", replace

use "E:\South Asia\dta\sasia.dta"
sort v000 v001 v002 v003
merge m:m v000 v001 v002 v003 using "E:\South Asia\dta\sasiahr.dta"

However, many women from the IR dataset are still not matched to household records. The merge results are:

    Result                           # of obs.
    -----------------------------------------
    not matched                       935,707
        from master                   521,319  (_merge==1)
        from using                    414,388  (_merge==2)

    matched                           259,242  (_merge==3)
    -----------------------------------------


Why are 521,319 women from my master file (the IR file) not matched to the HR file? I thought every woman interviewed would have a household record I could match her to.

Thanks for the help.


Re: Merging from multiple countries [message #16100 is a reply to message #16054] Mon, 05 November 2018 07:30
Bridgette-DHS
The following is a response from DHS Senior Stata Specialist Tom Pullum:

In the IR files the line number is v003, but in the HR and PR files the line number is hvidx, NOT hv003. (In those files, hv003 is the line number of the household respondent, so it is the same for everyone in the same household.) You just need to use hvidx in place of hv003 when you construct v003.
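
A minimal sketch of that change follows. It is not Tom Pullum's code, and it rests on several assumptions: that a pooled PR file exists (the file name sasiapr.dta is hypothetical, built the same way as sasiahr.dta), that hvidx is available there as a single variable with one record per household member, and that the water treatment variables are named hv237 and hv237a onward (check the recode manual for each survey). The sketch also uses merge 1:1 instead of m:m, because the Stata documentation itself advises against m:m merges.

```stata
* Sketch only: sasiapr.dta is a hypothetical pooled PR file (one record per household member)
use "E:\South Asia\dta\sasiapr.dta", clear

* Give the PR identifiers their IR names; the line number comes from hvidx, not hv003
rename (hv000 hv001 hv002 hvidx) (v000 v001 v002 v003)

* Keep the keys plus the household variables of interest
* (hv237* is an assumed name for the water treatment variables)
keep v000 v001 v002 v003 hv237*

* Stops with an error if the keys do not uniquely identify a household member
isid v000 v001 v002 v003

tempfile prkeys
save `prkeys'

use "E:\South Asia\dta\sasia.dta", clear

* Each interviewed woman should match exactly one household member
merge 1:1 v000 v001 v002 v003 using `prkeys'

* _merge==2 are household members with no IR record
* (for example men, children, and women who were not interviewed)
tab _merge
drop if _merge == 2
drop _merge
```

If isid fails for a particular survey, the cluster, household, and line numbers alone may not identify records uniquely in that survey, and the keys would need checking country by country. Because water treatment is a household-level variable, an m:1 merge on v000 v001 v002 against the household file may also work; that alternative is a suggestion here, not part of the reply above.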