The DHS Program User Forum
Discussions regarding The DHS Program data and results
Re: Merging issue [message #5624 is a reply to message #5589] Wed, 17 June 2015 14:22
Bridgette-DHS
Following is a response from DHS Senior Stata Specialist, Tom Pullum:

As I understand it, you want to merge the AR data with the IR data. Here are Stata lines to do this:

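* Open the AR (HIV test) file and rename its identifiers to match the IR file
use c:\DHS\DHS_data\AR_files\ETar61FL.dta, clear
rename hivclust v001
rename hivnumb v002
rename hivline v003
sort v001 v002 v003
save c:\DHS\DHS_data\scratch\temp.dta, replace
* Open the IR (women's) file and sort it on the same keys
use c:\DHS\DHS_data\IR_files\ETIR61FL.dta, clear
sort v001 v002 v003
* Old-style merge: both files must already be sorted on the keys
merge v001 v002 v003 using c:\DHS\DHS_data\scratch\temp.dta
tab _merge
* Keep only the women who appear in both files
keep if _merge==3
drop _merge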
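
For readers on Stata 11 or later, the same merge can be sketched with the newer "merge 1:1" syntax. This is only an illustration on toy data: the values below are made up, and hiv03 and v012 merely stand in for whatever variables the real AR and IR files carry. It assumes each combination of v001 v002 v003 appears at most once in each file.

```stata
* Toy stand-in for the AR (HIV test) file; all values are made up
clear
input hivclust hivnumb hivline hiv03
1 1 2 0
1 2 1 1
2 1 2 0
end
rename hivclust v001
rename hivnumb  v002
rename hivline  v003
tempfile ar
save `ar'

* Toy stand-in for the IR (women's) file; all values are made up
clear
input v001 v002 v003 v012
1 1 2 25
1 2 1 31
2 1 2 19
2 3 1 40
end

* Newer syntax: no prior sort is needed, and 1:1 makes Stata stop with
* an error if the key is not unique in either file
merge 1:1 v001 v002 v003 using `ar'
tab _merge
keep if _merge==3
drop _merge
```

Because "merge 1:1" refuses to run when the key variables do not uniquely identify observations, it doubles as a check that v001 v002 v003 is a valid identifier in both files.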

You will need to change the paths, of course. I am using the old version of the merge command, but the version you used would work equally well. Your formula for an id code did not produce unique values. Look at the following results for the IR file:

gen long id=((1000+v001)*10000)+(v002*100)+v003
gen n=1
collapse (sum) n, by(id)
tab n

[Image: output of "tab n"]

There are 601 id codes that appear 2 times, 28 that appear 3 times, and 1 that appears 4 times. It is safer, and also easier, to use a hierarchical sort command such as "sort v001 v002 v003". To get a truly unique id you could use this: "egen id=group(v001 v002 v003)". Your formula with powers of 10 will not always work, because it yields unique values only when every component fits within the digits allotted to it (here, two digits each for v002 and v003).
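
A minimal sketch of how such a collision can arise, using made-up values rather than the actual Ethiopia data. It assumes a household number above 99, which spills out of the two digits the formula reserves for v002; whether that is what happened in the real file is not confirmed here. The variable name gid is hypothetical.

```stata
* Toy data: three distinct women (values are made up)
clear
input v001 v002 v003
1 101 1
2   1 1
2   2 1
end

* The powers-of-10 formula: rows 1 and 2 both evaluate to 10020101
gen long id = ((1000+v001)*10000) + (v002*100) + v003
list v001 v002 v003 id
duplicates report id

* group() numbers each distinct (v001, v002, v003) combination 1, 2, 3, ...
egen long gid = group(v001 v002 v003)

* isid exits with an error if the variables do not uniquely identify rows
isid gid
isid v001 v002 v003
```

Running "isid id" on the same data would fail, which is a quick way to test any constructed identifier before merging on it.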
  • Attachment: tab.jpg