| Home » Topics » Wealth Index » Merge wealth index with individual recode - Malawi 2000 Goto Forum:
	| 
		
			| Merge wealth index with individual recode - Malawi 2000 [message #10793] | Wed, 14 September 2016 20:58  |  
			| 
				
				
					|  jswindle Messages: 5
 Registered: September 2016
 | Member |  |  |  
	| Hi all, 
 I would like to merge the household wealth index data file for Malawi 2000 with the individual recode data file for Malawi 2000. I would like to do this because I am then merging the Malawi 2000, 2004, and 2010 individual recode data files together, and the 2004 and 2010 data files include the wealth index variables (v190 and v191), but the 2000 data file does not.
 
 My understanding is that I will need to merge the 2000 wealth index data file with the 2000 household recode data file first, after which I can merge this new edited 2000 household recode data file with the 2000 individual recode data file. If that is incorrect and I can merge the 2000 wealth index data file with the 2000 individual recode data file directly, then I would prefer to do that, but am not sure how I would do it.
 
 When I follow my plans to merge the wealth and household files first, and then merge the household and individual files, I run into problems with the output of the merge between the 2000 individual recode and 2000 household recode data files. The files merge, but the merge "match" results appear to be quite off. The total n for the new merged file is 20,052, which is the total n of the two data files put together, rather than the n of the original individual recode data file alone, which is 13,220.
 
 Below is the code I am using in Stata. Any advice would be appreciated. Thank you in advance.
 
 STATA CODE
 
 * Add wealth index to 2000 individual recode
 use "H:\DHS\Malawi\2000 wealth recode\MWwi42fl.DTA", clear
 rename whhid hhid
 merge 1:1 hhid using "H:\DHS\Malawi\2000 household recode\MWHR41FL.DTA"
 drop _merge
 rename hv001 v001
 rename hv002 v002
 rename hv003 v003
 rename wlthindf v190
 rename wlthind5 v191
 keep v001 v002 v003 v190 v191
 sort v001 v002 v003
 save "H:\DHS\Malawi\2000 wealth recode\wealthmergeprep.DTA", replace
 clear
 use "H:\DHS\Malawi\2000 individual recode\MWIR41FL.DTA", clear
 sort v001 v002 v003
 merge m:1 v001 v002 v003 using "H:\DHS\Malawi\2000 wealth recode\wealthmergeprep.DTA"
 
 
 This is the output generated after the merge command:
 
 . merge m:1 v001 v002 v003 using "H:\DHS\Malawi\2000 wealth recode\wealthmergeprep.DTA"
 
 Result                           # of obs.
 -----------------------------------------
 not matched                        12,671
 from master                     5,839  (_merge==1)
 from using                      6,832  (_merge==2)
 
 matched                             7,381  (_merge==3)
 -----------------------------------------
 
 .
 end of do-file
 
 
 
 
 
 |  
	|  |  |  
	| 
		
			| Re: Merge wealth index with individual recode - Malawi 2000 [message #10835 is a reply to message #10793] | Thu, 22 September 2016 20:56   |  
			| 
				
				
					| Liz-DHS Messages: 1516
 Registered: February 2013
 | Senior Member |  |  |  
	| Dear User, A response from one of our experts, Dr. Sarah Staveteig:
 Quote:
 The household ID (whhid) in the Malawi 2000 wealth index file is a concatenation of the cluster number and the household number. It is similar to the caseid string found in the IR file except that caseid also has the respondent's line number at the end.* To directly merge the IR file with the WI file, you have three options: (1) create a new whhid in the IR file based only on v001 and v002, (2) parse out a numeric cluster number and numeric household number from whhid in the WI file to merge with v001 and v002, or (3) truncate the caseid string in the IR file to match whhid by excluding the respondent's line number.
 
 Option 1 requires that you know the rules for spacing strings within whhid and add a certain number of spaces based on the number of digits in the cluster number. Option 2 involves splitting a string and transforming it into numeric values. Option 3 is the most straightforward, as you only need to know the length of whhid and truncate caseid to that width.
 
 To determine how long whhid is in the WI file, you can click on "variable properties" in the bottom right corner of your Stata screen, or type:
 
 
 Quote:
 Stata will tell you that it is a 12-character string. Based on that information, you can use the substring command to return that number of characters from caseid in the MWIR41FL file. Namely:
 
 
 
gen whhid = substr(caseid,1,12)
Quote:
 Now, if you merge whhid between IR and WI files they will match.
 
 For clarity, the full code is:
 
 
 
use MWIR41FL
gen whhid = substr(caseid,1,12)
merge m:1 whhid using "MWwi42fl"
*  According to the DHS recode manual, CASEID is typically a concatenation of v001, v002, and v003. To confirm that this is true in Malawi, browse the first few columns of MWIR41FL or open that dataset and type
 
 
list caseid v001-v003 in 1/10 
Then browse the first few rows of MWwi42fl or open it and type
 
 You can see that caseid is indeed v001, v002, and v003 concatenated together with extra spaces for alignment and that whhid is similar to caseid except--as it is a household-level dataset--there is no respondent line number.
 [Updated on: Thu, 22 September 2016 21:10] Report message to a moderator |  
	|  |  |  
	|  |  
	|  |  
	|  | 
 
 
 Current Time: Sun Oct 26 05:59:35 Coordinated Universal Time 2025 |