The DHS Program User Forum
Discussions regarding The DHS Program data and results
Re: Confirming the correctiness of merging two datasets [message #25756 is a reply to message #25738] Mon, 05 December 2022 13:13
Bridgette-DHS
Following is a response from Senior DHS staff member, Tom Pullum:

Your merge is OK. The WI file has one record for every household in the survey, and 7095 of those households had no children under 5, so they have no match in the KR file. Below is the Stata code for this merge, because it shows how to unpack whhid in the WI file. I use an older version of the merge command, which I prefer because it does not require specifying 1:m, etc.

* Specify a workspace
cd e:\DHS\DHS_data\scratch

* Prepare the WI file
use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\ETWI41FL.DTA" 
describe whhid

* whhid is str12

* Split whhid into one variable per character to see where the codes sit
forvalues li=1/12 {
  gen col`li'=substr(whhid,`li',1)
}

list col* if _n<=20, table clean
tab1 col*

* It appears that hv001 is cols 7-9 and hv002 is cols 10-12
gen hv001=substr(whhid,7,3)
gen hv002=substr(whhid,10,3)

* Convert the string pieces to numeric merge keys
destring hv001, generate(cluster)
destring hv002, generate(hh)

sort cluster hh
save ETWItemp.dta, replace

* Prepare the KR file
use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\ETKR41FL.DTA" 
summarize v001 v002
* v001 and v002 are numeric and 1 to 3 digits wide
gen cluster=v001
gen hh=v002

list cluster hh if _n<=20, table clean
sort cluster hh

* Do the merge
merge cluster hh using ETWItemp.dta
tab _merge

* _merge=2 for 7095 cases; these are households that have no children under 5; drop them

drop if _merge==2
drop _merge
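
For readers who use the newer merge syntax, the same logic can be sketched on toy data. This is only an illustration, not part of the original response: the three whhid values, the child variable, and the tempfile name wi are made up, and the whhid layout (cluster in columns 7-9, household in columns 10-12) is assumed from the code above. Several children can belong to one household, so the KR side is "many" and the WI side is "one", which makes the match m:1.

```stata
* Toy WI file: one record per household (hypothetical values)
clear
input str12 whhid
"      001002"
"      001015"
"      002003"
end
* Assumed layout: cluster in cols 7-9, household in cols 10-12
gen cluster = real(substr(whhid, 7, 3))
gen hh      = real(substr(whhid, 10, 3))
tempfile wi
save `wi'

* Toy KR file: one record per child under 5 (hypothetical values)
clear
input cluster hh child
1 2 1
1 2 2
2 3 1
end

* Newer syntax: no prior sort needed; m:1 = many children to one household
merge m:1 cluster hh using `wi'
tab _merge

* Every child should find its household; WI-only households have no children
assert _merge != 1
drop if _merge == 2
drop _merge
assert _N == 3
```

In this sketch the household in cluster 1, hh 15 has no children and shows up with _merge==2, which is the toy counterpart of the 7095 unmatched households in the real files.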
