The DHS Program User Forum
Discussions regarding The DHS Program data and results
Merging DHS data in Stata [message #70] Wed, 20 February 2013 11:45
DHS user
Could you kindly advise me on how to merge the Individual, Male and HIV Recode files in Stata?
Re: Merging DHS data in Stata [message #71 is a reply to message #70] Wed, 20 February 2013 11:46
Bridgette-DHS
Here is a response from one of our Stata experts, Tom Pullum, that should answer your question.

I assume you want a single file that includes men and women as individuals, with the HIV data merged onto the individual records. In Stata, I would go through the following steps:

Open the IR file and construct a new variable, sex=2 (female), for all cases. Save as file 1.

Open the MR file and construct a new variable, sex=1 (male), for all cases. For all variables you need that have an mv prefix, drop the m, so the prefix becomes v. Save as file 2.

Open the AR file and rename hivclust, hivnumb and hivline to v001, v002 and v003, so that the identifiers match those in files 1 and 2. Sort on v001 v002 v003. Save as file 3.

Open file 1. Then APPEND file 2 to the end of file 1, getting a file with all the men and women as observations. Sort on v001 v002 v003. Merge with file 3, on v001 v002 v003. Drop any cases with hiv05 missing (these are cases with no HIV result). Save as file 4. This will be your working file.

In file 4, the preferred weight will be hiv05, rather than v005. The cluster variable will be v001 (it is duplicated in the v020's but we always use v001). There may be a variable that is identified as a stratum variable, e.g. v022, but we recommend that you use what is identified as the domain variable, e.g. v023. If domains are not given, you can construct a domain variable for virtually all the surveys as all combinations of region and urban/rural.
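
The steps above can be sketched in Stata roughly as follows. This is only an illustration on made-up toy data, not on real recode files: the three `input` blocks stand in for the IR, MR and AR files, all values are invented, and the newer `merge 1:1` syntax is used instead of sort-then-merge. It assumes that hivclust, hivnumb and hivline in the AR file correspond to v001, v002 and v003. Run it as a do-file so that the temporary files are handled correctly.

```stata
* Toy stand-in for the IR (women) file; all values are invented
clear
input v001 v002 v003 v012
1 1 2 25
1 2 1 31
2 1 2 19
end
* sex = 2 marks the women
gen sex = 2
tempfile file1
save `file1'

* Toy stand-in for the MR (men) file, which uses the mv prefix
clear
input mv001 mv002 mv003 mv012
1 1 1 30
2 1 1 22
end
* sex = 1 marks the men
gen sex = 1
* Drop the leading m: mv001 becomes v001, mv012 becomes v012, and so on
rename mv* v*
tempfile file2
save `file2'

* Toy stand-in for the AR (HIV) file
clear
input hivclust hivnumb hivline hiv03 hiv05
1 1 1 0 1100000
1 1 2 1  900000
2 1 2 0 1000000
end
rename (hivclust hivnumb hivline) (v001 v002 v003)
tempfile file3
save `file3'

* Stack women and men, then attach the HIV records person by person
use `file1', clear
append using `file2'
* Stop with an error if any person appears more than once
isid v001 v002 v003
merge 1:1 v001 v002 v003 using `file3'
* Keep only people who have an HIV record
drop if missing(hiv05)
drop _merge
list v001 v002 v003 sex hiv03 hiv05
```

With real data, the `use` commands would point at copies of the recode files instead of the toy `input` blocks, and only the mv variables actually needed have to be renamed.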

I hope this helps.

Bridgette-DHS

[Updated on: Mon, 18 March 2013 09:12]

Re: Merging DHS data in Stata [message #206 is a reply to message #71] Tue, 26 March 2013 16:20
kf2349
Do you have a similar suggestion for merging DHS data in SAS? I would like to have a single data file that includes men and women as individuals, as well as HIV status.

Karin
Re: Merging DHS data in Stata [message #217 is a reply to message #206] Thu, 28 March 2013 09:25
Bridgette-DHS
The suggestions are the same for merging in SAS. Please follow the same steps in SAS, and you can also reference the page on Merging Datasets, for additional information.

Bridgette-DHS
Re: Merging DHS data in Stata [message #252 is a reply to message #70] Tue, 02 April 2013 18:46
caragh
Can you please advise me if there are similar steps to be followed to merge the HIV test result with the Men's questionnaire only using SPSS? Thank you
Re: Merging DHS data in Stata [message #275 is a reply to message #252] Tue, 09 April 2013 12:07
Bridgette-DHS
Below is an example syntax for merging the HIV Test data with the Women's data in SPSS. You can modify this for other datasets.

GET FILE='C:\DATAUSER\ZMAR51FL.SAV'.
SORT CASES BY
HIVCLUST (A) HIVNUMB (A) HIVLINE (A).
SAVE OUTFILE='C:\DATAUSER\HIV.sav'
/COMPRESSED.

GET FILE='C:\DATAUSER\ZMIR51FL.SAV'.
SORT CASES BY
V001 (A) V002 (A) V003 (A) .
SAVE OUTFILE='C:\DATAUSER\WOMEN.sav'
/RENAME(V001 V002 V003=
HIVCLUST HIVNUMB HIVLINE)
/COMPRESSED.

GET FILE='C:\DATAUSER\WOMEN.sav'.
MATCH FILES /FILE=*
/TABLE='C:\DATAUSER\HIV.sav'
/BY HIVCLUST HIVNUMB HIVLINE.
EXECUTE.
SAVE OUTFILE='C:\DATAUSER\ZMAR_IR.SAV'
/COMPRESSED.
Re: Merging DHS data in Stata [message #2488 is a reply to message #275] Sat, 28 June 2014 20:38
owraza
How do I know whether I have merged files in Stata correctly? I am trying to merge the IR file with the PR file, and I used the following code:

use "C:\Users\Owais\Copy\DHS\DHS13 Datasets\Households\PKPR61FL.DTA", clear
gen v001=hv001
gen v002=hv002
label variable v001 "Cluster number_copy"
recast long v001
format %12.0g v001
label variable v002 "Household number_copy"
recast int v002
format %8.0g v002 
sort v001 v002

use "C:\Users\Owais\Copy\DHS\DHS13 Datasets\Women\PKIR61FL.DTA", clear
sort v001 v002
merge m:m v001 v002 using "C:\Users\Owais\Copy\DHS\DHS13 Datasets\Households\PKPR61FL.DTA"
Re: Merging DHS data in Stata [message #2496 is a reply to message #2488] Mon, 30 June 2014 13:20
Bridgette-DHS
Following is a response from DHS Senior Specialist, Tom Pullum:

You are right to be concerned about whether the merge was correct. It is very easy to make a mistake when merging. The way I would do it is given below. You can change the paths and run it.

During a merge, a variable called "_merge" is constructed. The codes for it are given in "help merge". In your case, you will want to keep the cases with _merge=3. In the lines below I construct "in_PR=1" and "in_IR=1" in the PR and IR files, respectively, and if you run the code you will see that the cases in the merged file with _merge=3 are exactly the same as those with in_PR=1 AND in_IR=1. I sometimes add this extra check to be absolutely sure that I have what I want.

You have to save the sorted PR file. You should save it with another name. Never over-write the basic recode files. I use old syntax for the merge command. I know some other people do too. The current syntax doesn't always mean what you think it means. I dropped the lines to recast v001 and v002. They are not needed. I used rename instead of gen to get v001 and v002 in the PR file. You can do things like that in temporary or scratch files.

Let us know if you have other questions.

use c:\DHS\DHS_data\PR_files\PKPR61FL.dta, clear
rename hv001 v001
rename hv002 v002
gen in_PR=1
sort v001 v002

save c:\DHS\DHS_data\scratch\temp.dta, replace

use c:\DHS\DHS_data\IR_files\PKIR61FL.dta, clear
gen in_IR=1
sort v001 v002
merge v001 v002 using c:\DHS\DHS_data\scratch\temp.dta

tab1 _merge in_PR in_IR

keep if in_PR==1 & in_IR==1

tab _merge
drop _merge
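
One way to check whether a set of key variables is safe to merge on is to test whether it identifies exactly one record per case. The sketch below uses invented toy data (not the Pakistan files) to show what can go wrong when both files hold several records per household and the merge key is only cluster and household number: `merge m:m` pairs records by their position within the household, not by person. The variable names follow this thread; the values and the use of the newer merge syntax are assumptions made for illustration.

```stata
* Toy PR-like file: household 1 in cluster 1 has three members
clear
input v001 v002 hvidx
1 1 1
1 1 2
1 1 3
end
tempfile pr
save `pr'

* Toy IR-like file: the two interviewed women of that household,
* who sit on household lines 2 and 3
clear
input v001 v002 v003
1 1 2
1 1 3
end

* Check whether cluster + household identifies one record per woman;
* a nonzero return code means it does not
capture isid v001 v002
display "isid return code: " _rc

* m:m pairs records by position within the household
merge m:m v001 v002 using `pr'
list v001 v002 v003 hvidx _merge
```

In the listing, at least one woman is attached to a household member whose line number (hvidx) differs from her own (v003), even though every record has _merge equal to 3. Adding the line number to the key, as the later replies in this thread do, makes each record unique and avoids this.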

Re: Merging DHS data in Stata [message #2510 is a reply to message #2496] Wed, 02 July 2014 07:24
owraza
Thank you very much Pullum & Bridgette. Your reply was helpful.
Re: Merging DHS data in Stata [message #2511 is a reply to message #2510] Wed, 02 July 2014 10:21
Bridgette-DHS
You are welcome.
Re: Merging DHS data in Stata [message #4376 is a reply to message #275] Sat, 16 May 2015 07:39
Malachi Arunda
Tanzania 2011-12, Zambia 2011-12: merging HIV data with the women's file (IR) in SPSS. I hope the merging syntax provided above works in this case too? Since males and females are combined in the IR file, I will merge and then create a new variable for women only. Would that be the right procedure? Thank you.
Re: Merging DHS data in Stata [message #4377 is a reply to message #275] Sat, 16 May 2015 07:46
Malachi Arunda
Dear DHS expert,
Below is my syntax for merging the HIV dataset (AR) with the individual dataset (IR), using the Tanzania 2011-12 data in SPSS. Please let me know why I do not get the desired results. The combined dataset does not attach the HIV data to the IR cases; rather, the AR (HIV test result) records are scattered within the combined IR/AR dataset and there seems to be no connection between them.


DATASET ACTIVATE DataSet4.
SORT CASES BY HIVCLUST(A) HIVNUMB(A) HIV01(A).

DATASET ACTIVATE DataSet1.
SORT CASES BY V001(A) V002(A) V003(A).
SAVE

RENAME VARIABLES (V001 V002 V003 =
HIVCLUST HIVNUMB HIVLINE).
EXECUTE

DATASET ACTIVATE DataSet6.
MATCH FILES /FILE=*
/FILE='DataSet4'
/BY HIVCLUST HIVNUMB HIVLINE.
EXECUTE.
Thank you

[Updated on: Sun, 17 May 2015 10:12]

Re: Merging DHS data in Stata [message #4393 is a reply to message #4377] Tue, 19 May 2015 11:14
Bridgette-DHS
In line 2 you are sorting by the wrong variable:

SORT CASES BY HIVCLUST(A) HIVNUMB(A) HIV01(A).

It should be

SORT CASES BY HIVCLUST(A) HIVNUMB(A) HIVLINE(A).
Re: Merging DHS data in SPSS [message #4422 is a reply to message #4393] Sun, 24 May 2015 08:29
Malachi Arunda
Thank you,
I corrected that error, but the two datasets still do not merge, even after sorting the cases and renaming the variables HIVCLUST, HIVNUMB and HIVLINE properly. The case IDs still remain different and the merged data cannot link HIV status to any individual record (SPSS). I will only consider mothers in this study. Could there be a solution? Thank you.

Warm regards,
Malachi

Re: Merging DHS data in Stata [message #5493 is a reply to message #4393] Sat, 30 May 2015 13:46
Malachi Arunda
Dear DHS experts,
Thank you once again for the guidance. However, I am still having problems with the Tanzania 2011-12 SPSS datasets and the HIV dataset. I have tried to merge the two datasets using the syntax you provided in the forum, but they seem to merge with different case IDs, and the HIV dataset is added only as new cases with its own variables. Here is the syntax I used. Could I get further guidance? Thank you.

DATASET ACTIVATE DataSet4.
SORT CASES BY hivCLUST(A) hivNUMB(A) hivLINE(A).

DATASET ACTIVATE DataSet1.
SORT CASES BY V001(A) V002(A) V003(A).
SAVE

RENAME VARIABLES (V001 V002 V003 =
hivCLUST hivNUMB hivLINE).
EXECUTE

DATASET ACTIVATE DataSet6.
MATCH FILES /FILE=*
/FILE='DataSet4'
/BY hivCLUST hivNUMB hivLINE.
EXECUTE.
Kind regards,
Malachi

[Updated on: Sat, 30 May 2015 13:53]

Re: Merging DHS data in Stata [message #5501 is a reply to message #5493] Mon, 01 June 2015 13:00
Bridgette-DHS
Our SPSS Specialist, Ladys Ortiz, took a closer look at your syntax, and had the following comments:

You are missing a period after the "save" command, and the 1st "execute" command. Also, you are referencing "dataset6" where you are expected to be referencing "dataset1". Following is the syntax that you can use or compare to yours:

GET
FILE='C:\Tanzania2011\TZAR6AFL.SAV'.
SORT CASES BY hivCLUST(A) hivNUMB(A) hivLINE(A).
SAVE OUTFILE='C:\Tanzania2011\TZAR6AFL.SAV'.

GET
FILE='C:\Tanzania2011\TZIR6AFL.SAV'.
SORT CASES BY V001(A) V002(A) V003(A).

RENAME VARIABLES (V001 V002 V003 =
hivCLUST hivNUMB hivLINE).
EXECUTE.

MATCH FILES /FILE=*
/FILE='C:\Tanzania2011\TZAR6AFL.SAV'
/BY hivCLUST hivNUMB hivLINE.
EXECUTE.

Re: Merging DHS data in Stata [message #5574 is a reply to message #5501] Tue, 09 June 2015 23:21
Malachi Arunda
Thank you very much,
This time the merging has worked.

Warm regards,
Malachi
Re: Merging DHS data in Stata [message #5578 is a reply to message #5574] Wed, 10 June 2015 10:41
Bridgette-DHS
You are welcome.
Re: Merging DHS data in Stata [message #6928 is a reply to message #5578] Tue, 04 August 2015 17:55
Malachi Arunda
Hello Bridgette, experts,
The merging worked well and the work is almost complete. However, if I wanted to add the "age at death" variable from the children's data to the already merged women and HIV dataset, how would I go about it? (The merged women + HIV dataset assigns NA to the "age at death" variable in SPSS.)
Thank you,
Malachi

[Updated on: Tue, 04 August 2015 17:57]

Re: Merging DHS data in Stata [message #6931 is a reply to message #6928] Wed, 05 August 2015 10:04
Bridgette-DHS
Following is a response from Senior Data Processing Specialist, Noureddine Abderrahim:

Tanzania 2011-12 is an AIDS Indicators Survey. In this survey we did not collect information about age at death of children, which explains why this variable is set to Not Applicable (NA).
Re: Merging DHS data in Stata [message #6940 is a reply to message #6931] Wed, 05 August 2015 13:22
Malachi Arunda
Alright, thank you.

Kind regards,
Malachi

[Updated on: Wed, 05 August 2015 13:23]

Re: Merging DHS data in Stata [message #8154 is a reply to message #6940] Sun, 30 August 2015 06:21
Malachi Arunda
Dear Bridgette, experts,
I looked at the Tanzania 2003-2004 under-5 mortality frequencies below (34.7%) and was taken aback. Please confirm whether these figures are real or whether I made a mistake somewhere, perhaps during merging. Thank you. (Of course, I restricted some variables.)

Children born alive who died
               Frequency   Percent   Valid Percent   Cumulative Percent
Valid   No          2704      65.3            65.3                 65.3
        Yes         1434      34.7            34.7                100.0
        Total       4138     100.0           100.0



Kind regards,
Malachi

[Updated on: Sun, 30 August 2015 06:24]

Re: Merging DHS data in Stata [message #8155 is a reply to message #8154] Sun, 30 August 2015 14:36
Reduced-For(u)m
I think your estimates are off by about triple. The DHS final report for Tanzania pegs the U5 mortality rate around 112/1,000, or about 11%.

Also, to just get the mortality rate, you don't need to merge anything, so... Is this just the HIV-positive sub-sample or something (glancing back over the thread)? In that case, given that your period would cover the late 1990's and early 2000's before lots of ART drugs were available, your 30% figure could be about right. But certainly it is too high for the whole sample.
Re: Merging DHS data in Stata [message #8156 is a reply to message #8155] Sun, 30 August 2015 14:37
Reduced-For(u)m
Sorry - link to final Tanzania 2004 report:

http://dhsprogram.com/pubs/pdf/FR173/FR173-TZ04-05.pdf
Re: Merging DHS data in Stata [message #8161 is a reply to message #8156] Mon, 31 August 2015 12:27
Malachi Arunda
Thank you. Could I be mixing up reports? I can see the DHS/AIS 2003-4 report (http://dhsprogram.com/pubs/pdf/AIS1/AIS1.pdf) and then the 2004-5 report link you sent, and I am using the 2003-4 dataset. Could you please help clarify which one I should use? As you say, the mortality numbers are too high. I just obtained the raw frequencies and didn't calculate anything, and that is what I have to use unless you advise me otherwise. And yes, I merged the HIV data to the 2003-4 survey. However, I wanted to consider only last-born children under 5. This variable is much more easily selected in the 2011-12 dataset, but in 2003-4 I find it a bit difficult to select these cases. Any help?

Thank you very much,
Malachi
Re: Merging DHS data in Stata [message #8163 is a reply to message #8161] Mon, 31 August 2015 15:10
Reduced-For(u)m
Are you using the DHS (2004-5) or the HIV/AIDS Indicator Survey (2003-4)? If the latter, someone else will need to chime in...
Re: Merging DHS data in Stata [message #8164 is a reply to message #8163] Mon, 31 August 2015 15:43
Malachi Arunda
Yes I am using standard HIV/AIDS Indicator Survey (2003-4). Thank you
Re: Merging DHS data in Stata [message #8204 is a reply to message #70] Fri, 11 September 2015 08:45
PublicHealthMaster
I am trying to evaluate the HIV status data in relationship to a few questions about family planning. Since there is no 1:1 identifier between the HIV status respondents and the questionnaire respondents, is the way to approximate that through the sample weights in both data sets when merging? Pardon my beginner question, but what's the term for this kind of analysis/method? I'm using SPSS and just need to be pointed in the right direction and I'll take it from there!
Re: Merging DHS data in Stata [message #8283 is a reply to message #8204] Wed, 30 September 2015 07:31
Bridgette-DHS
Following is a response from Senior DHS Data Processing Specialist, Noureddine Abderrahim:

For respondents in the HIV sample, there is a one to one relationship between the data in the main file and the data in the HIV Status file. Each respondent in the HIV subsample of the main survey has a record in the HIV file.
Re: Merging DHS data in Stata [message #8347 is a reply to message #2496] Wed, 14 October 2015 09:38
kinsukmanisinha@gmail.com
Dear DHS experts,

I am trying to merge the IR and PR files. I have read the comments on this page about how to use an m:m merge (the question by owraza and the reply by the DHS expert). I understand the procedure (use hv001 and hv002 from the PR file and v001 and v002 from the IR file), but near the beginning of the explanation the DHS expert mentions that it is better to use the old merge syntax, as the new m:m merge may do something we don't want it to.

I have Stata version 12 and it does not allow me to use the old merge, leaving m:m merge as the only option. However, the Stata help file suggests that I may use joinby. Can I do this?

So, if I use joinby

In the PR (household file):
rename hv001 v001
rename hv002 v002
sort v001 v002

In the IR (women file):
sort v001 v002

I am lost after this point. I can't use m:m merge, as that is not advised, and I don't know how to proceed further. I would appreciate any help.

Thanks a lot.
Re: Merging DHS data in Stata [message #8388 is a reply to message #8347] Wed, 21 October 2015 13:55
Bridgette-DHS
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

I can do everything I want with the old version of merge, which did not include 1:1 or 1:m or m:1 or m:m. I prefer simplicity, so I stay with the old version, but it's not better or worse. You can use whatever version of the merge command that you want to.

You should only occasionally need to use joinby. It's a very powerful command but tends to produce files that are much much larger than the input files. It will give you all possible combinations of cases that match on whatever id code you are using. If you have 10 cases in file 1 with id=12345 and 8 cases in file 2 with id=12345, you will get 10*8 joined (or paired) cases with id=12345.
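
The 10-by-8 example can be reproduced with a few lines of Stata. This is a toy sketch: the id value comes from the paragraph above, while the variable names rec1 and rec2 are made up for the illustration.

```stata
* File 1: 10 records that all share id 12345
clear
set obs 10
gen long id = 12345
gen rec1 = _n
tempfile f1
save `f1'

* File 2: 8 records with the same id
clear
set obs 8
gen long id = 12345
gen rec2 = _n

* joinby forms every pairing of matching records
joinby id using `f1'
count
assert _N == 10*8
```

The final `assert` passes only if the joined file has 80 records, which is why joinby output can grow so quickly on real household data.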

I think what you want to do would be as follows:

use PRXXXXFL.dta, clear
rename hv001 v001
rename hv002 v002
rename hvidx v003
sort v001 v002 v003
save temp.dta, replace

use IRXXXXFL.dta, clear
sort v001 v002 v003

merge v001 v002 v003 using temp.dta
tab _merge
keep if _merge==3
drop _merge
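
For readers who prefer the newer merge syntax, the same merge can be written as a 1:1 merge. The following is a minimal sketch on invented toy data, assuming (as in the code above) that hv001, hv002 and hvidx in the PR file correspond to v001, v002 and v003 in the IR file. The option `keep(match)` plays the role of `keep if _merge==3`, and `nogenerate` suppresses the _merge variable.

```stata
* Toy PR-like file: one record per household member (values invented)
clear
input hv001 hv002 hvidx hv105
1 1 1 40
1 1 2 35
1 1 3 12
1 2 1 28
end
rename (hv001 hv002 hvidx) (v001 v002 v003)
* Confirm that the key identifies one record per household member
isid v001 v002 v003
tempfile pr
save `pr'

* Toy IR-like file: one record per interviewed woman
clear
input v001 v002 v003 v012
1 1 2 35
1 2 1 28
end
isid v001 v002 v003

* Each woman matches exactly one household-member record
merge 1:1 v001 v002 v003 using `pr', keep(match) nogenerate
list
```

The two `isid` lines are optional, but they stop the do-file early if the key does not uniquely identify records in either file.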

Re: Merging DHS data in Stata [message #9719 is a reply to message #8388] Tue, 10 May 2016 10:01
mianrashid
I tried to merge the PR and IR files with the following commands:

use "C:\Users\Rashid Javed\Desktop\DHS Data Set\Pakistan\Stata_PK_2012-13_DHS_01052016_926_87403\6_House hold Member Recode_pkpr61dt\PKPR61FL.DTA", clear
rename hv001 v001
rename hv002 v002
gen in_PR=1
sort v001 v002
save "C:\Users\Rashid Javed\Desktop\temp.dta

use "C:\Users\Rashid Javed\Desktop\DHS Data Set\Pakistan\Stata_PK_2012-13_DHS_01052016_926_87403\3_Indiv idual Recode_pkir61dt\PKIR61FL.DTA", clear
gen in_IR=1
sort v001 v002
merge v001 v002 using C:\Users\Rashid Javed\Desktop\temp.dta

After this I received the following error:

(note: you are using old merge syntax; see [D] merge for new syntax)
variables v001 v002 do not uniquely identify observations in the master data
file C:\Users\Rashid.dta not found
r(601);


MianRashid
Re: Merging DHS data in Stata [message #9727 is a reply to message #9719] Tue, 10 May 2016 15:24
owraza
clear all

*Renaming Household's unique variable*

use "C:\Users\...\PKPR61FL.DTA" 
gen v001=hv001
gen v002=hv002
label variable v001 "Cluster number_copy"
recast long v001
format %12.0g v001
label variable v002 "Household number_copy"
recast int v002
format %8.0g v002 

sort v001 v002
save "C:\Users\...\PKPR61FL_A.DTA"

*Merging Women's file with Household's file*

use "C:\Users\...\PKIR61FL.DTA", clear
sort v001 v002
merge m:m v001 v002 using "C:\Users\...\PKPR61FL_A.DTA"

* For checking merging has done correctly *
* v138=eligible women in women's questionnaire & hv010=eligible women in household's questionnaire*
tab v138 hv010 
Re: Merging DHS data in Stata [message #9729 is a reply to message #9727] Tue, 10 May 2016 17:04
mianrashid
Thank you.
How can I merge the IR, KR and MR files of Pakistan DHS data set?


MianRashid
Re: Merging DHS data in Stata [message #9730 is a reply to message #9729] Tue, 10 May 2016 20:33
owraza
Since I haven't used the MR file, any advice from me might be risky. I am sure you can get help from here: https://www.dhsprogram.com/data/Merging-Datasets.cfm

Re: Merging DHS data in Stata [message #9747 is a reply to message #9730] Thu, 12 May 2016 09:08
Bridgette-DHS
An additional response from Tom Pullum to the post above about merging the PR and IR files:


The problem is that you have not used the line numbers. The following should work. You do not really need in_IR and in_PR, but since you constructed them, I am keeping them.

use PKPR61FL.DTA, clear
rename hv001 v001
rename hv002 v002
rename hvidx v003
gen in_PR=1
sort v001 v002 v003
save temp.dta, replace
use PKIR61FL.DTA, clear
gen in_IR=1
sort v001 v002 v003
merge v001 v002 v003 using temp.dta
keep if in_PR==1 & in_IR==1
drop _merge in_PR in_IR
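
A separate detail in the log quoted earlier is the message "file C:\Users\Rashid.dta not found". The path after `using` in that log contains a space and is not enclosed in double quotes, so Stata appears to have read the file name only up to the space. The sketch below repeats the merge on invented toy data with a quoted file name. The file name "temp pr.dta" is hypothetical and is written to, and then erased from, the current working directory.

```stata
* Toy PR-like file saved under a name that contains a space
clear
input v001 v002 v003 hv105
1 1 1 40
1 1 2 35
end
save "temp pr.dta", replace

* Toy IR-like file
clear
input v001 v002 v003
1 1 2
end

* Double quotes keep the whole file name together
merge 1:1 v001 v002 v003 using "temp pr.dta"
tab _merge
keep if _merge == 3
drop _merge

* Remove the scratch file again
erase "temp pr.dta"
```

The same quoting applies to `use` and `save` whenever a folder or file name contains spaces.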

[Updated on: Thu, 12 May 2016 09:12]

Re: Merging DHS data in Stata [message #9784 is a reply to message #9747] Tue, 17 May 2016 13:01
mianrashid
Hello,
Thank you.
I merged the men, women, and children level data (IR, MR, KR) as a household unit. Now I want to declare the survey design for the dataset. Please let me know whether this command is correct:

gen wgt=hv005/1000000
(880 missing values generated)

. svyset [pw=wgt], psu(hv021) strata(hv023)
??

Many Thanks

Rashid


MianRashid
Re: Merging DHS data in Stata [message #9796 is a reply to message #9784] Thu, 19 May 2016 10:14
Bridgette-DHS
Following is a response from Senior DHS Stata Specialist, Tom Pullum:


This should work. You will get the same results with wgt=hv005/1000000 or with wgt=hv005. There is no need to divide by 1000000 because Stata automatically re-normalizes pweights to have a mean of 1. For the psu you can use hv021 or hv001. They are identical.

There is typically a small difference between the weight for the same woman in the PR file and in the IR files. That is, her value of hv005 and v005 can be a little different. Similarly for a child. Similarly, hv005 and mv005 for the same man may differ. This is because of adjustments for nonresponse in the IR and MR files. However, given how you have constructed your file, I think it will be ok to use hv005 as the weight for the household units.
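
The point about rescaling the weight can be checked on a small example. The sketch below uses invented toy data with the variable names from this thread (hv005, hv021, hv023) plus a made-up outcome y, and compares the weighted mean and its standard error under both versions of the weight. It uses the `svyset psu [pw=...], strata()` form of the command, and the toy design (two strata with two clusters each) is an assumption for illustration.

```stata
* Toy data: 2 strata, 2 clusters per stratum, 2 people per cluster
clear
input hv021 hv023 hv005 y
1 1 1200000 0
1 1 1200000 1
2 1  800000 1
2 1  800000 1
3 2 1500000 0
3 2 1500000 0
4 2  500000 1
4 2  500000 0
end

* Version 1: weight divided by 1,000,000
gen double wgt = hv005/1000000
svyset hv021 [pw=wgt], strata(hv023)
svy: mean y
matrix b1 = e(b)
matrix V1 = e(V)

* Version 2: raw weight
svyset hv021 [pw=hv005], strata(hv023)
svy: mean y
matrix b2 = e(b)
matrix V2 = e(V)

* Point estimate and variance agree up to rounding
assert reldif(b1[1,1], b2[1,1]) < 1e-8
assert reldif(V1[1,1], V2[1,1]) < 1e-8
```

Both `assert` lines pass only if the two versions of the weight give the same mean and the same variance for this toy outcome.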

Re: Merging DHS data in Stata [message #11129 is a reply to message #9796] Wed, 02 November 2016 13:29
dhs110
Bridgette-DHS
Kindly help me. I want to merge the WI (pkwi21) file with the IR (pkir21) file in the PDHS 1990 with SPSS. I have tried many times using the commands already discussed here, but failed. Kindly tell me where I am going wrong.


GET
FILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\pkwi21sv\PKWI21FL.SAV'.
SORT CASES BY whhid withindf wlthind5.
SAVE OUTFILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\PKWI21FL.SAV'.

GET
FILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\pkir21sv\PKIR21FL.SAV'.
SORT CASES BY V001 V002 V003.

RENAME VARIABLES (V001 V002 V003 =
whhid withindf wlthind5).
EXECUTE.

MATCH FILES /FILE=*
/FILE='E:\ALL MY PAPER CS\1.DATA SETS\ALL IR SPSS DATA\pkwi21sv\PKWI21FL.SAV'.
/BY whhid withindf wlthind5.
EXECUTE.


dhs110
Re: Merging DHS data in Stata [message #11165 is a reply to message #11129] Fri, 11 November 2016 10:27
Bridgette-DHS
Following is a response from Senior DHS Stata Specialist, Tom Pullum:

You are using SPSS and I only work with Stata, so I cannot give you SPSS code. In Stata, the correct steps would be as follows. You want to merge on the household id code, which is whhid in the WI file and is the first 12 characters of caseid in the IR file. I hope that in SPSS you know a way to extract the first 12 characters of caseid.

use e:\DHS\DHS_data\WR_files\PKWI21FL.dta, clear
sort whhid
save e:\DHS\DHS_data\scratch\PKWRtemp.dta, replace

use e:\DHS\DHS_data\IR_files\PKIR21FL.dta, clear
gen whhid=substr(caseid,1,12)
sort whhid
merge whhid using e:\DHS\DHS_data\scratch\PKWRtemp.dta
tab _merge
keep if _merge==3
drop _merge
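
With the newer merge syntax, the same household-level merge is a many-to-one merge, because several women can share one household record. A minimal sketch on invented toy data: whhid and wlthind5 are names taken from this thread, and the first 12 characters of the toy caseid are the household id as described above. The remaining characters of caseid and all values are made up for the illustration.

```stata
* Toy WI-like file: one record per household (values invented)
clear
input str12 whhid wlthind5
"000100010001" 1
"000100010002" 4
end
* Confirm one record per household
isid whhid
tempfile wi
save `wi'

* Toy IR-like file: first 12 characters of caseid are the household id
clear
input str15 caseid
"000100010001  2"
"000100010001  3"
"000100010002  1"
end
gen whhid = substr(caseid, 1, 12)

* Many women can share one household record
merge m:1 whhid using `wi'
tab _merge
keep if _merge == 3
drop _merge
list caseid whhid wlthind5
```

The two women in the first toy household both receive that household's wlthind5 value, which is the behaviour wanted when attaching household-level data to individual records.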