Understanding dataset structure [message #13515]
Mon, 13 November 2017 09:44
popanalyst
Messages: 2 Registered: November 2017
Member
Hi, I'm working with the Malawi 2015-2016 DHS data to identify out-of-school adolescent girls and young women aged 15-24 who have an HIV test result. I am a bit confused about all the different datasets and wondering from where I should pull observations. Currently, I have tried to pull from the individual recode and couples recode, and match them to their HIV status. I'm wondering if I should pull from the household roster, or if I have already counted those by using the individual dataset? It looks like I will need to pull current schooling status from the household dataset as well, and I'm wondering how to connect people in the roster to the other datasets, if possible. Thanks!
Re: Understanding dataset structure [message #14577 is a reply to message #13515]
Sat, 21 April 2018 22:14
kingx025
Messages: 95 Registered: August 2016 Location: Minneapolis. Minnesota
Senior Member
The households covered by the household member (PR) files are randomly selected from within primary sampling areas, regardless of whether they include a woman of childbearing age. Women of childbearing age within those households (most commonly defined as women age 15-49) then receive the long individual women's questionnaire and go into the IR (women's) and couples files. The young women age 15-24 from the household roster are already included in the IR file, so I'm afraid you would be double counting them if you took the same women from the PR file as additional observations rather than merging the two file types. Though I don't have personal experience working with the couples file, I believe the women in it are a subset of the women in the IR file.
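To make the double-counting risk concrete, here is a small sketch in Python with pandas. The data are invented, and the column names `hh_id`, `line`, and `age` are hypothetical stand-ins rather than actual DHS variable names. It only illustrates that stacking roster rows on top of interview rows counts each young woman twice.

```python
import pandas as pd

# Toy household-roster (PR-style) rows for female members. Invented data.
# hh_id, line, and age are hypothetical column names, not DHS variables.
pr_women = pd.DataFrame({
    "hh_id": [101, 101, 102, 103],
    "line":  [2, 3, 1, 2],
    "age":   [17, 52, 22, 19],
})

# Toy individual-interview (IR-style) rows: the same young women again.
ir_women = pd.DataFrame({
    "hh_id": [101, 102, 103],
    "line":  [2, 1, 2],
    "age":   [17, 22, 19],
})

# Restrict both files to the age group of interest (15-24, inclusive).
pr_young = pr_women[pr_women["age"].between(15, 24)]
ir_young = ir_women[ir_women["age"].between(15, 24)]

# Stacking the two files treats each woman as two observations.
stacked = pd.concat([pr_young, ir_young], ignore_index=True)
naive_count = len(stacked)

# Counting distinct (household, line number) pairs recovers the true number.
unique_count = len(stacked.drop_duplicates(subset=["hh_id", "line"]))

print(naive_count, unique_count)
```

In this toy case the stacked file has twice as many rows as there are distinct women, which is why the two files should be merged on person identifiers rather than appended.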
It may help to think of how the data are collected, in terms of the survey forms. Usually there is just a household survey form (including the roster of household members), a woman's survey form (which also collects the information that goes into the children and birth recode files), and maybe a men's form. There is no separate survey for couples, and the women who get the individual women's form are drawn from the larger group of randomly selected households with their household rosters. While it's easy to think of the separate DHS files as having their own separate existence, they are really rearrangements of material collected via the three forms: household, women's, and men's.
We hope to eventually do such merging of IR and PR files within the IPUMS-DHS project, but so far we have only linked across household, women (IR), child (KR), and birth (BR) files. I wonder if you might roughly approximate the education data you need using the data on highest level of schooling (V106) or total years of schooling (V133) for women in your age group of interest in the IR files. If you need to link between the IR file and the PR file for the school attendance data, I assume you would match on the household id number together with the person's line number in the household, which is V003 in the IR file and HVIDX in the PR file.
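A rough sketch of that IR-to-PR link in Python with pandas follows, using invented data. V003 and HVIDX are the line-number variables named above. Everything else is an assumption to check against the recode documentation: the household is assumed to be identified by a cluster number plus a household number (`v001`/`v002` in the IR file, `hv001`/`hv002` in the PR file), and `in_school` is a hypothetical placeholder for whichever PR variable holds current school attendance.

```python
import pandas as pd

def link_ir_to_pr(ir: pd.DataFrame, pr: pd.DataFrame) -> pd.DataFrame:
    """Attach PR (household roster) columns to IR (women's) rows.

    Assumes cluster, household number, and line number together
    identify one person in each file.
    """
    # Rename the PR keys so both files share the same key names.
    pr_keyed = pr.rename(
        columns={"hv001": "v001", "hv002": "v002", "hvidx": "v003"}
    )
    return ir.merge(
        pr_keyed,
        on=["v001", "v002", "v003"],
        how="left",               # keep every interviewed woman
        validate="one_to_one",    # raise if a key repeats in either file
        indicator=True,           # adds a _merge column for checking matches
    )

# Toy IR rows: three interviewed women. Invented data.
ir = pd.DataFrame({
    "v001": [1, 1, 2],     # cluster (assumed variable name)
    "v002": [1, 2, 1],     # household number (assumed variable name)
    "v003": [2, 1, 3],     # woman's line number in the household
    "age":  [17, 22, 19],  # hypothetical column
})

# Toy PR rows: every household member, including people not in the IR file.
pr = pd.DataFrame({
    "hv001": [1, 1, 1, 2, 2],
    "hv002": [1, 1, 2, 1, 1],
    "hvidx": [1, 2, 1, 3, 4],
    "in_school": [0, 1, 0, 1, 1],  # hypothetical attendance flag
})

linked = link_ir_to_pr(ir, pr)
print(linked[["v001", "v002", "v003", "age", "in_school", "_merge"]])
```

The left merge keeps one row per interviewed woman, so the result has as many rows as the IR input. Any row where `_merge` is not `both` would be a woman who could not be found in the roster and is worth inspecting before analysis.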
Good luck!
Miriam King
Dr. Miriam King
IPUMS-DHS Project Manager (www.idhsdata.org)