Merging DHS data in Stata [message #70]
Wed, 20 February 2013 11:45
DHS user
Messages: 111 Registered: February 2013
Senior Member
Could you kindly advise me on how to merge the Individual, Male and HIV Recode files in Stata?

Re: Merging DHS data in Stata [message #71 is a reply to message #70]
Wed, 20 February 2013 11:46
Bridgette-DHS
Messages: 3214 Registered: February 2013
Senior Member
Here is a response from one of our Stata experts, Tom Pullum, that should answer your question.
I assume you want a single file that includes men and women, as individuals, and has the HIV data merged onto the individual records. In Stata, I would go through the following steps:
Open the IR file and construct a new variable, sex=2 (female), for all cases. Save as file 1.
Open the MR file and construct a new variable, sex=1 (male), for all cases. For all variables you need that have an mv prefix, drop the m, so the prefix becomes v. Save as file 2.
Open the AR file and rename its identifiers to match, so that hivclust, hivnumb and hivline become v001, v002 and v003. Sort on v001 v002 v003. Save as file 3.
Open file 1. Then APPEND file 2 to the end of file 1, giving a file with all the men and women as observations. Sort on v001 v002 v003. Merge with file 3 on v001 v002 v003. Drop any cases with hiv05 missing (these are cases with no HIV result). Save as file 4. This will be your working file.
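The four steps above might be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the file names (IRfile.dta, MRfile.dta, ARfile.dta and file1 to file4) are placeholders rather than real DHS file names, the AR identifiers are assumed to be hivclust, hivnumb and hivline, and the wildcard and group forms of rename assume Stata 12 or later.

```stata
* Sketch only: all file names below are placeholders, not actual DHS names.

* Step 1: women (IR file)
use "IRfile.dta", clear
gen sex = 2
save file1.dta, replace

* Step 2: men (MR file); every mv### variable becomes v###
use "MRfile.dta", clear
gen sex = 1
rename mv* v*
save file2.dta, replace

* Step 3: HIV results (AR file); identifier names are assumed
use "ARfile.dta", clear
rename (hivclust hivnumb hivline) (v001 v002 v003)
sort v001 v002 v003
save file3.dta, replace

* Step 4: stack women and men, then attach the HIV results
use file1.dta, clear
append using file2.dta
sort v001 v002 v003
merge 1:1 v001 v002 v003 using file3.dta
tab _merge
* Drop respondents with no HIV result
drop if missing(hiv05)
save file4.dta, replace
```

The tab _merge line shows how many records matched. Records that appear only in the AR file are kept by this sketch, so add keep if _merge==3 if only interviewed respondents are wanted.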
In file 4, the preferred weight will be hiv05, rather than v005. The cluster variable will be v001 (it is duplicated in the v020s, but we always use v001). There may be a variable that is identified as a stratum variable, e.g. v022, but we recommend that you use what is identified as the domain variable, e.g. v023. If domains are not given, you can construct a domain variable for virtually all the surveys as all combinations of region and urban/rural.
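As a rough sketch of the survey settings just described, assuming the working file is called file4.dta (a placeholder name), that hiv05 follows the usual DHS convention of being stored multiplied by 1,000,000, and that v024 and v025 hold region and urban/rural residence:

```stata
* Sketch only: file4.dta is a placeholder name for the working file.
use file4.dta, clear

* HIV weight, rescaled (assumes the usual DHS scaling by 1,000,000)
gen wt = hiv05/1000000

* Constructed domain: all combinations of region and urban/rural.
* v024 and v025 are assumed names; this is only needed when the survey
* does not already provide a domain variable such as v023.
egen domain = group(v024 v025)

* Cluster is v001; use strata(v023) instead if v023 is the domain variable
svyset v001 [pweight=wt], strata(domain)
```

After svyset, estimation commands prefixed with svy: pick up the weight, cluster and strata automatically.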
I hope this helps.
Bridgette-DHS
[Updated on: Mon, 18 March 2013 09:12]

Re: Merging DHS data in Stata [message #4377 is a reply to message #275]
Sat, 16 May 2015 07:46
Malachi Arunda
Messages: 30 Registered: February 2014
Member
Dear DHS expert,
Below is my syntax for merging the HIV dataset (AR) to the individual dataset (IR) using the Tanzania 2011-12 dataset in SPSS. Please let me know why I do not get the desired results. The combined dataset does not have HIV data for the IR cases; rather, the AR (HIV test results) data are scattered within the combined IR/AR dataset and there seems to be no connection.
DATASET ACTIVATE DataSet4.
SORT CASES BY HIVCLUST(A) HIVNUMB(A) HIV01(A).
DATASET ACTIVATE DataSet1.
SORT CASES BY V001(A) V002(A) V003(A).
SAVE
RENAME VARIABLES (V001 V002 V003 =
HIVCLUST HIVNUMB HIVLINE).
EXECUTE
DATASET ACTIVATE DataSet6.
MATCH FILES /FILE=*
/FILE='DataSet4'
/BY HIVCLUST HIVNUMB HIVLINE.
EXECUTE.
Thank you
[Updated on: Sun, 17 May 2015 10:12]

Re: Merging DHS data in Stata [message #5493 is a reply to message #4393]
Sat, 30 May 2015 13:46
Malachi Arunda
Messages: 30 Registered: February 2014
Member
Dear DHS experts,
Thank you once again for the guidance. However, I still seem to be having hitches with the Tanzania 2011-12 SPSS datasets and the HIV datasets. I have tried to merge the two datasets using the syntax guideline you provided in the forum, but they seem to merge with different case IDs, and the HIV dataset merges only as new cases with its own variables. Here is the syntax I used. Could I get further guidance? Thank you.
DATASET ACTIVATE DataSet4.
SORT CASES BY hivCLUST(A) hivNUMB(A) hivLINE(A).
DATASET ACTIVATE DataSet1.
SORT CASES BY V001(A) V002(A) V003(A).
SAVE
RENAME VARIABLES (V001 V002 V003 =
hivCLUST hivNUMB hivLINE).
EXECUTE
DATASET ACTIVATE DataSet6.
MATCH FILES /FILE=*
/FILE='DataSet4'
/BY hivCLUST hivNUMB hivLINE.
EXECUTE.
Kind regards,
Malachi
[Updated on: Sat, 30 May 2015 13:53]

Re: Merging DHS data in Stata [message #6928 is a reply to message #5578]
Tue, 04 August 2015 17:55
Malachi Arunda
Messages: 30 Registered: February 2014
Member
Hello Bridgette, experts,
Merging worked well and the work is almost complete. However, if I wanted to add the "age at death" variable from the children's data to the already merged women and HIV dataset, how would I go about it? (In SPSS, the merged women + HIV dataset assigns NA to the "age at death" variable.)
Thank you,
Malachi
[Updated on: Tue, 04 August 2015 17:57]

Re: Merging DHS data in Stata [message #8154 is a reply to message #6940]
Sun, 30 August 2015 06:21
Malachi Arunda
Messages: 30 Registered: February 2014
Member
Dear Bridgette, experts,
I looked at the Tanzania 2003-2004 under-5 mortality frequencies below (34.7%) and I was awed. Please reassure me that these figures are real, or tell me whether I made a mistake somewhere, perhaps during merging. Thank you. (Of course, I restricted some variables.)
Children born alive who died
              Frequency   Percent   Valid Percent   Cumulative Percent
Valid  No          2704      65.3            65.3                 65.3
       Yes         1434      34.7            34.7                100.0
       Total       4138     100.0           100.0
Kind regards,
Malachi
[Updated on: Sun, 30 August 2015 06:24]

Re: Merging DHS data in Stata [message #8155 is a reply to message #8154]
Sun, 30 August 2015 14:36
Reduced-For(u)m
Messages: 292 Registered: March 2013
Senior Member
I think your estimates are about three times too high. The DHS final report for Tanzania pegs the U5 mortality rate at around 112/1,000, or about 11%.
Also, to just get the mortality rate, you don't need to merge anything, so... Is this just the HIV-positive sub-sample or something (glancing back over the thread)? In that case, given that your period would cover the late 1990s and early 2000s, before lots of ART drugs were available, your 30% figure could be about right. But it is certainly too high for the whole sample.

Re: Merging DHS data in Stata [message #8161 is a reply to message #8156]
Mon, 31 August 2015 12:27
Malachi Arunda
Messages: 30 Registered: February 2014
Member
Thank you. Could I be mixing up reports? I can see the DHS/AIS 2003-4 report (http://dhsprogram.com/pubs/pdf/AIS1/AIS1.pdf) and then the 2004-5 report link you sent, and I am using the 2003-4 dataset. Could you please clarify which one I should use? As you say, the mortality numbers are too high. I just obtained the raw frequencies and did not calculate anything, and that is what I have to use unless you advise me otherwise. And yes, I merged the HIV data to the 2003-4 survey. However, I wanted to consider only last-born children under 5; this variable is much more easily selected in the 2011-12 dataset, but in 2003-4 I find it a bit difficult to select these cases. Any help?
Thank you very much,
Malachi

Re: Merging DHS data in Stata [message #8347 is a reply to message #2496]
Wed, 14 October 2015 09:38
kinsukmanisinha@gmail.com
Messages: 9 Registered: January 2015 Location: Milan
Member
Dear DHS experts,
I am trying to merge the IR and PR files. I have read the comments on this page about how to use the m:m merge (question by owraza and reply by the DHS expert). I understand the entire procedure (use hv001 and hv002 from PR, and v001 and v002 from IR), but towards the beginning of the explanation the DHS expert mentions that it is better to perform the old merge, as the new m:m merge may do something we don't want it to.
I have Stata version 12 and it does not allow me to use the old merge, leaving m:m merge as the only option. However, the Stata help file suggests that I may use joinby. Can I do this?
So, if I use joinby:
In the PR (household file):
rename hv001 v001
rename hv002 v002
sort v001 v002
In the IR (women file):
sort v001 v002
I am lost after this point. I can't use m:m merge, as that is not advised, and I don't know how to proceed further. I would appreciate any help.
Thanks a lot.

Re: Merging DHS data in Stata [message #9719 is a reply to message #8388]
Tue, 10 May 2016 10:01
mianrashid
Messages: 13 Registered: February 2016 Location: Pau, France
Member
I tried to merge the PR and IR files as follows:
use "C:\Users\Rashid Javed\Desktop\DHS Data Set\Pakistan\Stata_PK_2012-13_DHS_01052016_926_87403\6_House hold Member Recode_pkpr61dt\PKPR61FL.DTA", clear
rename hv001 v001
rename hv002 v002
gen in_PR=1
sort v001 v002
save "C:\Users\Rashid Javed\Desktop\temp.dta
use "C:\Users\Rashid Javed\Desktop\DHS Data Set\Pakistan\Stata_PK_2012-13_DHS_01052016_926_87403\3_Indiv idual Recode_pkir61dt\PKIR61FL.DTA", clear
gen in_IR=1
sort v001 v002
merge v001 v002 using C:\Users\Rashid Javed\Desktop\temp.dta
After this, I finally received this error:
(note: you are using old merge syntax; see [D] merge for new syntax)
variables v001 v002 do not uniquely identify observations in the master data
file C:\Users\Rashid.dta not found
r(601);
MianRashid

Re: Merging DHS data in Stata [message #9747 is a reply to message #9730]
Thu, 12 May 2016 09:08
Bridgette-DHS
Messages: 3214 Registered: February 2013
Senior Member
An additional response from Tom Pullum to the post (quote: "I tried to merge the file of PR and IR, by following....."):
The problem is that you have not used the line numbers. The following should work. You do not really need in_IR and in_PR, but since you constructed them, I am keeping them.
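* Household member (PR) file: rename the identifiers to match the IR names
use PKPR61FL.DTA, clear
rename hv001 v001
rename hv002 v002
rename hvidx v003
gen in_PR=1
sort v001 v002 v003
save temp.dta, replace
* Individual (IR) file: merge on cluster, household and line number
use PKIR61FL.DTA, clear
gen in_IR=1
sort v001 v002 v003
merge v001 v002 v003 using temp.dta
* Keep only the women found in both files, then drop the helper variables
keep if in_PR==1 & in_IR==1
drop _merge in_PR in_IR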
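For readers who prefer to avoid the old merge syntax, the same match could be sketched with the newer merge 1:1 form (available from Stata 11). This is an alternative to the code above, not Tom Pullum's own version. It assumes v001 v002 v003 uniquely identify records in both files, and it uses keep(match) in place of the in_PR and in_IR flags.

```stata
* Sketch using the newer merge syntax; same file names as in the post above
use PKPR61FL.DTA, clear
rename hv001 v001
rename hv002 v002
rename hvidx v003
save temp.dta, replace

use PKIR61FL.DTA, clear
* 1:1 merge: stops with an error if the keys are not unique in either file
* keep(match) retains only women found in both files; nogenerate omits _merge
merge 1:1 v001 v002 v003 using temp.dta, keep(match) nogenerate
```

Because merge 1:1 refuses to run when the keys are not unique, it avoids the silent mismatches that make m:m merges risky.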
[Updated on: Thu, 12 May 2016 09:12]

Re: Merging DHS data in Stata [message #9784 is a reply to message #9747]
Tue, 17 May 2016 13:01
mianrashid
Messages: 13 Registered: February 2016 Location: Pau, France
Member
Hello,
Thank you.
I merged the men, women and children level data (IR, MR, KR) as a household unit. Now I want to declare the survey design for the dataset. Please let me know whether this command is correct:
gen wgt=hv005/1000000
(880 missing values generated)
svyset [pw=wgt], psu(hv021) strata(hv023)
??
Many Thanks
Rashid
MianRashid

Re: Merging DHS data in Stata [message #11129 is a reply to message #9796]
Wed, 02 November 2016 13:29
phres110
Messages: 39 Registered: October 2014 Location: korea
Member
Bridgette-DHS,
Kindly help me. I want to merge the WI (pkwi21) file with the IR (pkir21) file in PDHS 1990 with SPSS. I tried many times using the same commands that are already discussed here, but failed. Kindly tell me where I am going wrong.
GET
FILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\pkwi21sv\PKWI21FL.SAV'.
SORT CASES BY whhid withindf wlthind5.
SAVE OUTFILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\PKWI21FL.SAV'.
GET
FILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\pkir21sv\PKIR21FL.SAV'.
SORT CASES BY V001 V002 V003.
RENAME VARIABLES (V001 V002 V003 =
whhid withindf wlthind5).
EXECUTE.
MATCH FILES /FILE=*
/FILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\pkwi21sv\PKWI21FL.SAV'.
/BY whhid withindf wlthind5.
EXECUTE.
dhs110