How to merge DHS databases of different countries [message #12439]
Wed, 17 May 2017 17:31
AmsP
Messages: 24 Registered: April 2016
Member
Hello,
I want to create a single database that combines the DHS databases of several countries. I have already de-normalised the different survey rounds of each country and created a single database per country. What should I take care of when I append the various countries' databases together? My purpose is to find out whether living in different countries has any impact on v133 for a particular ethnic group that lives in all of these countries.
Because different countries have different weighting systems, I think the "svyset" command may not work for such a combined database.
Thank you very much!

Re: How to merge DHS databases of different countries [message #12445 is a reply to message #12439]
Thu, 18 May 2017 10:55
Bridgette-DHS
Messages: 3215 Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
There have been several posts on this topic. You need to construct new variables that distinguish ALL the strata in all the surveys, and ALL the clusters in all the surveys. You can assign each survey a numeric code (1, 2, ..., for example) in a variable called "survey". For each survey, you can use v021 as the cluster id. The stratification variable is not completely consistent across surveys, so you can give a name such as "stratumid" to v023 or whatever is the correct identifier for each survey. You then use lines such as "egen allclusterid=group(survey v021)" and "egen allstratumid=group(survey stratumid)". Even if the strata are the same in two successive surveys in the same country, say, you need to distinguish them this way. You can leave the weight variable alone or you can re-weight as described in earlier postings. Then you put "allclusterid" and "allstratumid" and the weight variable into svyset for the pooled file.
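The steps above can be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the pooled file is already in memory, "survey" is a variable that uniquely identifies each survey, v023 happens to be the correct stratum variable in every survey (it often is not), and "wt" is a name chosen here for the weight variable, built from v005 without any re-weighting.

```stata
* Sketch only; assumes the pooled file is in memory and "survey" exists.

* Harmonised stratum identifier. v023 is an assumption here: check each
* survey and substitute the correct variable where it differs.
gen stratumid = v023

* Cluster and stratum ids that are unique across ALL surveys.
egen allclusterid = group(survey v021)
egen allstratumid = group(survey stratumid)

* Weight left as it is, only rescaled from v005 ("wt" is an illustrative name).
gen wt = v005/1000000

* Declare the pooled design, then run a survey-adjusted estimate.
svyset allclusterid [pweight=wt], strata(allstratumid)
svy: mean v133
```

If you re-weight as described in the earlier postings, the re-weighted variable would take the place of "wt" in the svyset line.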

Re: How to merge DHS databases of different countries [message #12449 is a reply to message #12445]
Thu, 18 May 2017 17:30
AmsP
Messages: 24 Registered: April 2016
Member
Thank you very much! So should I apply the manipulation to the original cluster and stratum variables (e.g. v021 and v022) in each survey, or to the recoded cluster and stratum variables (e.g. v021+2000 and v022+2000)? Because Stata cannot distinguish clusters and strata across different surveys, Dr. Ruilin Ren suggested in previous posts adding 1000 or 2000 to the cluster and stratum codes in different survey rounds.
Thank you again!

Re: How to merge DHS databases of different countries [message #12450 is a reply to message #12449]
Fri, 19 May 2017 12:05
Bridgette-DHS
Messages: 3215 Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
If you are using Stata, then the "egen group" command will probably be easier and safer than adding 1000, or 2000, etc. If you are not using Stata, then adding those numbers may be easier. You do not need to do any manipulation before appending the files, so long as you have a unique identifier for each survey, as I suggested. So long as you have a "survey" variable or other unique identifier, you can do all your survey-specific edits, etc., AFTER the appending. Note: do not rely on v000 or hv000 as survey identifiers. There are many cases of two different surveys, conducted close together in time, having the same values of v000 or hv000.
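As a rough sketch of appending with a survey identifier created on the way in, something like the following could work. The file names are hypothetical placeholders, and the sketch assumes that none of the files already contains a variable called "survey".

```stata
* Sketch only; file names are hypothetical placeholders.
use "countryA.dta", clear
gen survey = 1

* Newly appended observations have survey == . until they are coded.
append using "countryB.dta"
replace survey = 2 if missing(survey)

append using "countryC.dta"
replace survey = 3 if missing(survey)

* Check that every observation has a survey code.
tab survey, missing
```

With "survey" in place, the egen group lines and any other survey-specific edits can be run on the pooled file afterwards.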

Re: How to merge DHS databases of different countries [message #12460 is a reply to message #12454]
Mon, 22 May 2017 08:23
AmsP
Messages: 24 Registered: April 2016
Member
Sorry for a perhaps naive question. After appending the surveys of different countries and running "egen allclusterid=group(survey cluster)" and "egen allstratumid=group(survey stratum)", I think that in the subsequent -svyset- I should use psu(allclusterid) and strata(allstratumid) instead of psu(v021) and strata(v022). Is this right?
Thank you!