* do e:\DHS\Requests_and_queries\questions_on_merges\PH_health_do_25Aug2015.txt

use e:\DHS\DHS_data\PR_files\PHPR61FL.dta
cd c:\scratch

* It appears that all the relevant variables have the form sh2*_1 through sh2*_9
* Say that there is another set of variables you want to save. I will say they are hv0* and hv1*

set more off

drop if sh207==. | sh207==0
save tempa.dta, replace

* all the desired information is exactly the same on every line in the household; keep one line per hh
keep hv001 hv002 hvidx sh208*_* sh209*_* sh210*_* sh211*_* sh212*_* sh213a*_*
* add any other variables that you want to keep
sort hv001 hv002 hvidx
save tempb.dta, replace

keep if hvidx==1
egen maxline=rowmax(sh208_*)
tab maxline
* the highest line number actually used is 17

* loop through all 9 subscripts and 17 lines. This approach will omit the 6 cases with line 0, i.e. the person died.
local lj=1
while `lj'<=17 {
  gen newsh209_`lj'=.
  gen newsh210_`lj'=.
  gen newsh211_`lj'=.
  gen newsh212_`lj'=.
  gen newsh213a_`lj'=.
  * similarly for any other variables that you want to keep
  local lj=`lj'+1
}

local lj=1
quietly while `lj'<=17 {
  local li=1
  while `li'<=9 {
    replace newsh209_`lj' =sh209_`li' if sh208_`li'==`lj'
    replace newsh210_`lj' =sh210_`li' if sh208_`li'==`lj'
    replace newsh211_`lj' =sh211_`li' if sh208_`li'==`lj'
    replace newsh212_`lj' =sh212_`li' if sh208_`li'==`lj'
    replace newsh213a_`lj'=sh213a_`li' if sh208_`li'==`lj'
    * similarly for the other variables that you want to keep
    local li=`li'+1
  }
  local lj=`lj'+1
}

keep hv001 hv002 new*
reshape long newsh209_ newsh210_ newsh211_ newsh212_ newsh213a_, i(hv001 hv002) j(hvidx)
* add any other sh variables
rename *_ *
drop if newsh209==.
sort hv001 hv002 hvidx
merge hv001 hv002 hvidx using tempb.dta
keep if _merge==3
drop _merge

label values newsh209 SH209
label values newsh210 SH210
label values newsh211 SH211
label values newsh212 SH212
label values newsh213a SH213a
* add any other sh variables

* the following line can be used for checking that the procedure has worked correctly
*list hv001 hv002 hvidx sh208_1 sh208_2 sh208_3 sh209_1 sh209_2 sh209_3 newsh209 if _n<=100, table clean nolabel

keep hv001 hv002 hvidx new*
list if _n<=100, table clean nolabel

* if you want any other variables from the PR file, you can merge the current file
* with tempa.dta which was saved above
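
* A minimal sketch of the merge with tempa.dta described above. It is an
* illustration only, not part of the original procedure. Assumptions:
*   - tempa.dta is still in the working directory (c:\scratch)
*   - hv001 hv002 hvidx uniquely identify each line in tempa.dta
*   - hv104 (sex) and hv105 (age) are only examples; substitute the PR
*     variables that you actually want
* merge 1:1 is the newer merge syntax (Stata 11 and later) and does not
* require the data to be sorted first.
merge 1:1 hv001 hv002 hvidx using tempa.dta, keepusing(hv104 hv105)
* keep only the lines that appear in both files
keep if _merge==3
drop _merge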