The DHS Program User Forum
Discussions regarding The DHS Program data and results
South Africa (1998): Merging Wealth Index with Individual Recode File in Stata [message #882] Tue, 29 October 2013 10:48
synvox
I tried to merge the Wealth Index file for South Africa (ZAWI31FL.DTA) with the Individual Recode file, but was unable to. I created a new variable in the IR file called whhid from a substring of caseid:

gen whhid = substr(caseid,1,12) 


I then tried to merge the IR and WI files:

  merge whhid using "..."


I was told that "the variable whhid does not uniquely identify observations". How can I merge the WI and IR files in Stata?

I understand that CASEID is a 15-character string variable made up of the household ID plus 3 characters that identify the individual (CASEID = HHID + V003). I do not understand how to merge the two files, given that there seem to be no common/shared ID variables.

Please provide Stata specific coding in your reply if possible!
Re: South Africa (1998): Merging Wealth Index with Individual Recode File in Stata [message #884 is a reply to message #882] Tue, 29 October 2013 11:17
synvox
The attached file is from another question in another section of the forum (Data use in Stata). It demonstrates merging WI and IR files from Ethiopia.

The file worked after adjusting for the South Africa file names.

My only concern: the IR file contained fewer observations than the WI file. How is this possible? Are the merged data for the new IR file (Original IR + WI) reliable?

Thank you for the Stata help!

[Updated on: Tue, 29 October 2013 11:18]

Re: South Africa (1998): Merging Wealth Index with Individual Recode File in Stata [message #915 is a reply to message #884] Fri, 01 November 2013 14:13
Bridgette-DHS
Here is a comment from one of our Stata experts, Tom Pullum, that should answer your question.

Instead of extracting a substring of caseid, it is better to use v001 and v002. That is, open the WI file; then "sort v001 v002"; then open the IR file; "sort v001 v002"; then "merge v001 v002" using the sorted WI file.
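
For readers on Stata 11 or later, a many-to-one merge can be sketched roughly as follows. This is only an illustration, not the do-file attached earlier in the thread. It assumes the IR file is named ZAIR31FL.DTA (a guess based on the WI file name) and that the WI file stores the household ID in a string variable called whhid whose values match the first 12 characters of caseid, as described in the original question. If both files carry v001 and v002, the same pattern should work with those two variables as the merge keys instead.

```stata
* Sketch only, under the assumptions stated above.
* ZAIR31FL.DTA is an assumed file name; adjust paths as needed.
use "ZAIR31FL.DTA", clear

* Household part of the woman's case identifier (first 12 characters)
gen whhid = substr(caseid, 1, 12)

* Many women per household, one wealth index record per household
merge m:1 whhid using "ZAWI31FL.DTA"

* _merge: 1 = IR only, 2 = WI only, 3 = matched in both files
tabulate _merge

* Drop WI households that have no woman in the IR file
drop if _merge == 2
drop _merge
```

The m:1 form is what avoids the "does not uniquely identify observations" complaint: the key only needs to be unique in the WI (using) file, not in the IR (master) file, where several women can share one household. The tabulation of _merge also bears on the observation counts. One likely reason the two files differ in size is that the WI file has one record per household while the IR file has one record per interviewed woman, so households with no interviewed woman appear only in the WI file (_merge == 2). If every woman shows _merge == 3, the merged IR data should be usable as usual.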
