The DHS Program User Forum
Discussions regarding The DHS Program data and results
Selecting one child per mother [message #16448] Wed, 16 January 2019 11:54
caroline.mckenna

I am an undergraduate student investigating the association between women's decision-making variables (participation in decisions regarding spending income, her healthcare, major household purchases, visits to family and relatives) and stunting and wasting in her children under five. I am using the 2013-14 Democratic Republic of the Congo DHS.

I have already created dichotomous variables for "Stunting Status" (HAZ < -2 SD) and "Wasting Status" (WHZ < -2 SD) according to the WHO child growth standards. I currently have the child data set with linked mothers, but the problem is that mothers are duplicated if they have several children (e.g., if a mother has 6 children, her data are counted 6 times). Now, I would like to randomly select one child per mother to create "mother-child pairs." Specifically, I would like to classify a mother-child pair as "exposed/malnourished" if the mother has at least one child who is stunted/wasted. How would I do this?

Re: Selecting one child per mother [message #16572 is a reply to message #16448] Tue, 29 January 2019 08:18
Bridgette-DHS

The following is a response from Senior DHS Stata Specialist Tom Pullum:

I assume that you are using the children in the KR file. In that file, the mothers are identified by v001 v002 v003. I suggest the following:

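* One ID per mother (cluster, household, respondent line number)
egen mother_id=group(v001 v002 v003)
* Random number for each child; runiform() is the current name for uniform()
gen rn=runiform()
* Order each mother's children by the random number
sort mother_id rn
egen sequence=seq(), by(mother_id)
* Keep the first (randomly ordered) child of each mother
keep if sequence==1
drop rn sequence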

This procedure sorts the children of each woman in a random order and then keeps the child with the smallest random number. Because of the random step, the results will not be exactly replicable unless you fix the random-number seed first (for example with set seed): every time you run it without a seed, you will get a slightly different sample of children. That is inevitable if you select at random. An alternative would be to use all children, with a multi-level adjustment for the similarity of children of the same mother. Another alternative would be to take just the youngest child, for example, the child with bidx=1. However, that will introduce some bias (see
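For readers not working in Stata, the random selection described above can be combined with the "at least one stunted or wasted child" classification asked about in the first message. What follows is only a minimal Python sketch, not part of the original reply. It assumes the child records are held as a list of dictionaries carrying the KR-file identifiers v001, v002 and v003. The field names stunted and wasted, the function name one_child_per_mother, the fixed seed, and the toy rows are all hypothetical, and missing anthropometry is not handled.

```python
import random
from collections import defaultdict


def one_child_per_mother(children, seed=12345):
    """Return one randomly chosen child row per mother, with mother-level flags.

    `children` is a list of dicts with the identifiers v001, v002, v003 and
    0/1 fields `stunted` and `wasted` (hypothetical names for the indicators).
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw can be repeated

    # Group the child rows by mother, identified by (v001, v002, v003).
    by_mother = defaultdict(list)
    for row in children:
        by_mother[(row["v001"], row["v002"], row["v003"])].append(row)

    pairs = []
    # Sorted mother IDs give a stable order; repeatability also assumes the
    # input rows arrive in the same order each time.
    for mother_id in sorted(by_mother):
        siblings = by_mother[mother_id]
        chosen = dict(rng.choice(siblings))  # copy of one randomly selected child
        # Mother-level exposure: at least one of her children is stunted / wasted.
        chosen["any_stunted"] = int(any(c["stunted"] == 1 for c in siblings))
        chosen["any_wasted"] = int(any(c["wasted"] == 1 for c in siblings))
        pairs.append(chosen)
    return pairs


# Toy rows for illustration only, not DHS data.
children = [
    {"v001": 1, "v002": 1, "v003": 2, "stunted": 0, "wasted": 0},
    {"v001": 1, "v002": 1, "v003": 2, "stunted": 1, "wasted": 0},
    {"v001": 1, "v002": 1, "v003": 2, "stunted": 0, "wasted": 0},
    {"v001": 1, "v002": 2, "v003": 2, "stunted": 0, "wasted": 1},
]
pairs = one_child_per_mother(children)
```

Note that the any_stunted and any_wasted flags describe the mother rather than the selected child, so they take the same value whichever child is drawn. The selected child's own stunted and wasted values are kept alongside them.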