The DHS Program User Forum
Discussions regarding The DHS Program data and results
Creating a household level variable [message #3302] Wed, 19 November 2014 12:15
Mercysh
I am trying to generate a household composition variable using the ages of household members (hv105), for households in which at least one member is aged 0-19 years. The categories should be adult presence (ages 20-59), older adult (ages 60 and over) and children-adolescents only (0-19 years). In multigenerational households, the order of precedence should be the same as listed above. I have the following code, but it does not seem to work:
I use the Lesotho 2009 household dataset (LSPR60FL.DTA) and keep only if hv105>=97 & hv102==1
****************
sort hhid
egen hhrestrict=tag(hhid)

*Create Adolescents
gen adolescents =.
replace adolescents =1 if hv105 <=15 & hv105>=17

*Households with adolescent member
egen hhadolescents = max(adolescents), by (hhid)
tab hhadolescents if hhrestrict==1

**Household composition
gen household_composition=.
replace household_composition =1 if(hv105 >=20 & hv105 <=59) & hhadolescents ==1 & hhrestrict ==1
replace household_composition =2 if hv105 >=60 & hhadolescents ==1 & hhrestrict ==1
replace household_composition =3 if hv105 <=19 & hhadolescents ==1 & hhrestrict ==1

I then merge with the individual dataset (LSIR60FL.DTA) after restricting to v502==0, v012 <=17, v135==1 and v531 >=90. Everything still looks fine at that point; the first output is:
. tab household_composition

 Household composition |      Freq.     Percent        Cum.
-----------------------+-----------------------------------
                 Adult |        683       65.74       65.74
           Older adult |        263       25.31       91.05
Child-adolescents only |         93        8.95      100.00
-----------------------+-----------------------------------
                 Total |      1,039      100.00
The problem only appears when I add the svy prefix:

svy: tab household_composition

which gives the following output:

Number of strata   =        19      Number of obs     =         63
Number of PSUs     =        53      Population size   =   62.20275
                                    Design df         =         34

-----------------------
Household |
compositi |
on        | proportions
----------+------------
 Child-ad |           1
-----------------------
  Key:  proportions  =  cell proportions

I have already run svyset:
generate weight= v005/1000000
svyset [pw=weight], psu( hv021) strata(hv023)

Please advise

Thank you

Mercy
Re: Creating a household level variable [message #3312 is a reply to message #3302] Thu, 20 November 2014 00:02
Trevor-DHS
It looks to me that you have < and > reversed in a couple of cases:
1) In your keep condition, if hv105>=97 & hv102==1 probably should be if hv105<=97 & hv102==1
2) replace adolescents =1 if hv105 <=15 & hv105>=17
should be
replace adolescents =1 if hv105>=15 & hv105<=17
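
To see why the reversed version never matches, here is a small sketch on made-up data (the variable age and the 30 observations are purely illustrative, not from the DHS file). No number can be both at most 15 and at least 17, so the first condition selects nothing. The inrange() function is one way to write a range test without having to get the direction of each inequality right.

```stata
* Toy data, for illustration only: ages 0 to 29
clear
set obs 30
gen age = _n - 1

* Reversed inequalities: no value satisfies both, so the count is 0
count if age <= 15 & age >= 17

* Corrected inequalities: matches ages 15, 16 and 17, so the count is 3
count if age >= 15 & age <= 17

* Same test written with inrange(), which is true when 15 <= age <= 17
count if inrange(age, 15, 17)
```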

I'm not quite following your code, but if I understand it correctly then here is how I would create the household variable:
use "LSPR60FL.DTA", clear

* Recode age into the relevant age groups
recode hv105 (20/59=1)(60/97=2)(0/19=3) if hv105<=97 & hv102==1, gen(household_composition)

* Collapse and use the minimum value for each household 
collapse (min) household_composition, by(hhid)
lab def hh_comp 1 "adult presence (ages 20-59)" 2 "older adult (ages 60 and over)" 3 "children-adolescents only (0-19 years)"
lab val household_composition hh_comp

* Save this variable for merging
keep hhid household_composition
sort hhid
save "hh_comp.dta", replace

* Open women's data file 
use "LSIR60FL.DTA", clear
* generate hhid from caseid by dropping the last three characters for the line number
gen hhid = substr(caseid,1,length(caseid)-3)
sort hhid
* merge the data - many women to one household 
merge m:1 hhid using "hh_comp.dta"
* drop households without interviewed women
drop if _merge==2

* tabulate to check.  Note: some cases are missing because the household has no de jure members, or because all ages in it are "don't know" or missing
tab household_composition,m
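
For the survey-weighted tabulation asked about in the first message, something along the following lines might be a starting point once the merged women's file is in memory. This is only a sketch under assumptions: it takes v021 as the PSU and v023 as the stratum variable, which should be checked against the documentation for this survey, and the names wt and insample are hypothetical. It uses the women's-file design variables together with v005 rather than mixing them with hv021 and hv023, and it restricts the analysis with subpop() instead of dropping cases, so the full sample is still available for variance estimation. Only two of the restrictions mentioned in the question are used in the flag, as an illustration.

```stata
* Sketch only: assumes the merged women's file from the steps above is in memory.
* Assumption: v021 = PSU and v023 = stratum for this survey (check the documentation).
gen wt = v005/1000000
svyset v021 [pweight=wt], strata(v023)

* Hypothetical subpopulation flag built from two of the restrictions in the question
gen byte insample = (v502==0 & v012<=17)

* Restrict with subpop() rather than dropping observations
svy, subpop(insample): tab household_composition
```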

Re: Creating a household level variable [message #3318 is a reply to message #3312] Thu, 20 November 2014 09:22
Mercysh
It's my first time using collapse; I was using tag instead, which did not work. Thank you for your help.
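
For anyone comparing the two approaches: in the code from the first message, household_composition was assigned only on the one tagged row per household, so it reflected the age of whichever member happened to be on that row rather than the household as a whole. If the member-level file needs to be kept instead of collapsed, egen with min() by household is one possible alternative. The sketch below uses made-up data (the household IDs and ages are invented, and agegrp, hh_comp and hh_tag are illustrative names).

```stata
* Toy member-level data, for illustration only
clear
input str4 hhid hv105
"H001" 34
"H001" 16
"H001" 70
"H002" 65
"H002" 12
"H003" 17
"H003" 9
end

* 1 = adult (20-59), 2 = older adult (60+), 3 = child/adolescent (0-19)
recode hv105 (20/59=1) (60/97=2) (0/19=3), gen(agegrp)

* The lowest code present in the household takes precedence
bysort hhid: egen hh_comp = min(agegrp)

* Tag one row per household only for tabulating, after hh_comp is complete
egen hh_tag = tag(hhid)
tab hh_comp if hh_tag
```

In this toy data H001 gets code 1 (an adult is present), H002 gets code 2 (older adult but no adult aged 20-59) and H003 gets code 3 (children and adolescents only).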



Mercy