The DHS Program User Forum
Discussions regarding The DHS Program data and results
Survey identifiers [message #4265] Tue, 28 April 2015 12:27
harry

I'd like to be able to reliably and automatically match between DHS survey data (from any survey) and the corresponding regional boundaries downloaded (where available) from the spatial data repository.

To do this I'm looking for some clarification on survey identifiers that are used in the DHS surveys. As far as I can tell there are (at least) three different systems and I can't quite see how I can reliably match between them.

Downloaded survey data files have names such as BDIR61, and this page shows that this means it's a survey for Bangladesh, Phase 6, the first survey done in that country in that phase.

The boundary data downloaded from the new spatial data site here contain a column "SVYID", which is a three-digit identifier, as well as country and year; in the case of the above Bangladesh survey it has the value 349. (Unfortunately this isn't in the GPS cluster data, when that is available.)

The DHS API, when used to retrieve information on available surveys with a request like this, returns a field SurveyID which has a different value again; in the case of the above survey it has the value BD2011DHS.

I can see that for a given survey file I could extract the country code and year of interview from HV000 and HV007 and use that to match the boundary polygons. But I don't know if this is always going to match the year recorded in the corresponding boundary file, e.g. for surveys that spanned more than one year.

It feels like there ought to be a cleaner way of doing this: a single survey identifier that is common to the survey data, the boundaries (and the GPS data when available) and the DHS API. But I can't find much information on the different survey identifiers. Can anyone explain or clarify these? Is there a single, published mapping between (in this case) BDIR61, 349, and BD2011DHS?
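
To illustrate the country-and-year fallback described above, here is a minimal sketch. It assumes each household record carries HV000 as a string such as "BD6" (country code followed by phase) and HV007 as a four-digit year of interview; real files may code the year differently. The function name `candidate_keys` and the sample records are hypothetical and made up for illustration only.

```python
def candidate_keys(records):
    """Return the sorted set of (country_code, year) pairs seen in the records.

    Assumes HV000 starts with the two-letter country code and HV007 is a
    four-digit interview year (both assumptions, not checked here).
    """
    keys = set()
    for rec in records:
        country = str(rec["HV000"])[:2].upper()
        year = int(rec["HV007"])
        keys.add((country, year))
    return sorted(keys)


# Made-up records: interviews falling in two different calendar years.
sample = [
    {"HV000": "BD6", "HV007": 2011},
    {"HV000": "BD6", "HV007": 2011},
    {"HV000": "BD6", "HV007": 2012},
]

print(candidate_keys(sample))
```

If more than one key comes back, a country-plus-year join against the boundary file is ambiguous, which is the concern raised above.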

Re: Survey identifiers [message #4266 is a reply to message #4265] Tue, 28 April 2015 16:13
Trevor-DHS

You are right that there isn't a published mapping between these various IDs. We will look into updating the API to provide a full mapping of these. I'm attaching a file here that provides the current mapping.

You will find included here the survey ID from the API, as well as the numeric ID, plus the range for the datasets (as version numbers change). For the dataset range, the first two characters are the country code, the next two are the dataset type (represented in this file as ??), and the next two are the phase and sequence/version number.
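
The dataset naming layout described above could be split apart roughly as follows. This is only a sketch: the names `DatasetCode` and `parse_dataset_code` are hypothetical, and allowing letters as well as digits in the phase and version positions is an assumption rather than something stated in this thread.

```python
import re
from typing import NamedTuple


class DatasetCode(NamedTuple):
    country: str       # first two characters, e.g. "BD"
    dataset_type: str  # next two characters, e.g. "IR", or "??" as a wildcard
    phase: str         # fifth character
    version: str       # sixth character (sequence/version number)


# Two letters, then two letters or "??", then two alphanumeric characters.
# Anything after the sixth character (e.g. a file extension) is ignored.
_PATTERN = re.compile(r"^([A-Z]{2})([A-Z]{2}|\?\?)([0-9A-Z])([0-9A-Z])")


def parse_dataset_code(name: str) -> DatasetCode:
    """Split a dataset name such as 'BDIR61' into its four parts."""
    match = _PATTERN.match(name.upper())
    if match is None:
        raise ValueError(f"not a recognised dataset name: {name!r}")
    return DatasetCode(*match.groups())


print(parse_dataset_code("BDIR61"))
print(parse_dataset_code("BD??61"))
```

Comparing the country, phase and version parts while ignoring the dataset type is one way to test whether a given file falls inside a range written with ??.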

Let us know if you find any problems using this file.
  • Attachment: SurveyIDs.csv
    (Size: 16.47KB, Downloaded 414 times)
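
A lookup against a mapping file like this one might be sketched as below. The column names (`SurveyId`, `SurveyNum`, `DatasetFrom`, `DatasetTo`) and the single sample row are assumptions for illustration, not the actual layout of SurveyIDs.csv; only the three identifiers BD2011DHS, 349 and BD??61 come from this thread.

```python
import csv
import io

# Illustrative stand-in for the attachment; the real column names may differ.
SAMPLE_CSV = """SurveyId,SurveyNum,DatasetFrom,DatasetTo
BD2011DHS,349,BD??61,BD??61
"""


def load_mapping(text):
    """Read the mapping CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def find_survey(rows, dataset_name):
    """Return (SurveyId, SurveyNum) for a dataset name such as 'BDIR61'.

    Matches on country code (characters 1-2) and on the phase/version pair
    (characters 5-6) lying within the row's range; the dataset type
    (characters 3-4) is ignored, mirroring the ?? wildcard.
    """
    name = dataset_name.upper()
    country, suffix = name[:2], name[4:6]
    for row in rows:
        low, high = row["DatasetFrom"], row["DatasetTo"]
        if low[:2] == country and low[4:6] <= suffix <= high[4:6]:
            return row["SurveyId"], int(row["SurveyNum"])
    return None


rows = load_mapping(SAMPLE_CSV)
print(find_survey(rows, "BDIR61"))
```

The numeric ID returned here is what would be joined to the SVYID column in the boundary data, and the string ID is what the API reports as SurveyID.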
Re: Survey identifiers [message #4271 is a reply to message #4266] Wed, 29 April 2015 05:19
harry

Hello Trevor,

Many thanks for the quick reply. That CSV is exactly what I need! Thank you.

Yes, if the API could be updated to include this information going forward that would be a great idea.
