The DHS Program User Forum: Dataset use in Stata


family structure [message #18035]

Fri, 23 August 2019 04:16

Muntaha
Messages: 3
Registered: August 2019

Member

I need help constructing a family structure variable in Stata using data from a household survey. I have household roster data, and to create the variable I will use these variables: hhcode, relation to head, gender, marital status, and residential status.
I want the family structure variable to distinguish three types of family: nuclear family (coded as 1), extended family (coded as 2), and multiple family (coded as 3).
Nuclear family criteria: husband and wife living with their unmarried children.
Extended family criteria: husband, wife, married/unmarried children, parents, and any unmarried siblings living together.
Multiple family criteria: more than one married sibling living together.
I have not been able to find Stata commands that can help with this.

[Updated on: Fri, 23 August 2019 04:27]


Re: family structure [message #18062 is a reply to message #18035]

Tue, 03 September 2019 14:46

Bridgette-DHS
Messages: 3230
Registered: February 2013

Senior Member

Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

Below are some Stata lines that will help get you started, using the 2017 DHS survey of the Philippines to identify nuclear households. The lines tell you whether the household contains exactly one person with hv101=1 (head), exactly one person with hv101=2 (spouse), at least one person with hv101=3 (child of the head), and no other persons. Some modifications are possible, e.g. using hv111-hv114 to check whether each child of the household head is also a child of the spouse of the household head. You might also want to check the ages of the children (it's possible that some children are adults). No matter how carefully you define your household types, a few households will probably be hard to classify. A limitation is that you do not know how everyone in the household is related to everyone else, only how each person is related to the head. Good luck.

* identify nuclear households: head, spouse, children

use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\PHPR70FL.DTA" , clear

cd e:\DHS\DHS_data\scratch

keep hv001 hv002 hvidx hv101-hv105 hv111-hv114 
label list HV101

sort hv001 hv002 hvidx
save PHtemp.dta, replace

* n counts all household members; each n_# flags members with hv101 equal to #
gen n=1
levelsof hv101, local(levels_hv101)
foreach li of local levels_hv101 {
    gen n_`li'=0
    replace n_`li'=1 if hv101==`li'
}

collapse (sum) n*, by(hv001 hv002)

gen family_type=.
replace family_type=1 if n_1==1 & n_2==1 & n_3>0 & n==n_1+n_2+n_3

tab family_type, m
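
The age check mentioned above could be added along the following lines. This is only a sketch on a made-up six-person roster: the variable names follow the DHS PR file (hv105 is age), but the cutoff of 18, the helper variable n_3adult, and the data values are assumptions for illustration, not part of the original lines.

```stata
* Made-up roster: two households, each with head, spouse and one child
clear
input hv001 hv002 hvidx hv101 hv105
1 1 1 1 40
1 1 2 2 38
1 1 3 3 10
1 2 1 1 60
1 2 2 2 58
1 2 3 3 30
end

gen n = 1
gen n_1 = (hv101 == 1)
gen n_2 = (hv101 == 2)
gen n_3 = (hv101 == 3)
* n_3adult (hypothetical name) flags children of the head aged 18 or older;
* the upper bound of 97 is assumed to leave out "don't know" codes and missing ages
gen n_3adult = (hv101 == 3 & inrange(hv105, 18, 97))

collapse (sum) n n_1 n_2 n_3 n_3adult, by(hv001 hv002)

* same nuclear rule as above, plus the requirement of no adult children
gen family_type = .
replace family_type = 1 if n_1==1 & n_2==1 & n_3>0 & n_3adult==0 & n==n_1+n_2+n_3

list, noobs
```

In this toy example only the first household is coded as nuclear; the second is left missing because its only child is 30. A marital status check could be added the same way if the survey collected marital status in the household roster.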


* construct other types
* then can merge back to PHtemp.dta
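
The merge-back step mentioned in the last comment might look like the sketch below. It uses made-up data and a tempfile in place of PHtemp.dta, so the file names and values are illustrative only. The household-level file is the master and the person-level roster is the using file, hence merge 1:m.

```stata
* Made-up person-level roster (stands in for PHtemp.dta)
clear
input hv001 hv002 hvidx hv101
1 1 1 1
1 1 2 2
1 1 3 3
1 2 1 1
1 2 2 3
1 2 3 5
end
tempfile roster
save `roster'

* Made-up household-level result (stands in for the collapsed file)
clear
input hv001 hv002 family_type
1 1 1
1 2 .
end

* one household row matches many person rows
merge 1:m hv001 hv002 using `roster'
* every household should be found in the roster and vice versa
assert _merge == 3
drop _merge

sort hv001 hv002 hvidx
list, sepby(hv002) noobs
```

After the merge, every person carries the family_type of his or her household, so the variable can be tabulated against individual characteristics.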


Re: family structure [message #18132 is a reply to message #18062]

Wed, 25 September 2019 05:41

Mr_Bokoboko
Messages: 2
Registered: September 2019

Member

Morning,

Could this work?

* identify nuclear households: head, spouse, children
clear all
* set mem matters only in Stata 11 and earlier; newer versions ignore it
set mem 1000
cd "C:\Users\Bokoboko\Desktop\SADHS\DATASET\ZAPR71DT"
use "ZAPR71FL", clear

** ========================================================================== **

*keep hhid hv001 hv002 hvidx hv101-hv105 hv111-hv114 
label list HV101

** ========================================================================== **

cap drop relationship
gen relationship=99
replace relationship = 1 if hv101 == 1
replace relationship = 2 if hv101 == 2
replace relationship = 3 if inlist(hv101,3,11,13,14)
replace relationship = 4 if hv101 == 8
replace relationship = 5 if hv101 == 6
*replace relationship = 6 if hv101 == ??
replace relationship = 7 if hv101 == 5
replace relationship = 8 if inlist(hv101,4,7,10)
replace relationship = 9 if hv101 == 12
label define relationship 1 "Head/acting head" 2 "Husband/wife/partner" ///
    3 "Son/daughter/stepchild/adopted child" 4 "Brother/sister/stepbrother/stepsister" ///
    5 "Father/mother/stepfather/stepmother" 6 "Grandparent/great grandparent" ///
    7 "Grandchild/great grandchild" 8 "Other relative" 9 "Non-related persons" ///
    99 "Unspecified"
label var relationship "Relationship to head"
label val relationship relationship

sort hhid hv001 hv002 hvidx

by hhid: generate hhsize=_N
egen hhtag = tag(hhid)

sort hv001 hv002

save PHtemp.dta, replace

** ========================================================================== **

gen n=1
levelsof relationship, local(levels_hv101)
foreach li of local levels_hv101 {
    gen n_`li'=0
    replace n_`li'=1 if relationship==`li'
}
collapse (sum) n*, by(hhsize hv001 hv002)

tab1 n*, m


cap drop family_type
gen family_type=.
*replace family_type=1 if n_1==1 & n_2==1 & n_3>0 & n==n_1+n_2+n_3
replace family_type = 1 if hhsize == 1
replace family_type = 2 if (n_1 >=1 & n_2 >=1) | (n_1 >=1 & n_3 >=1)
replace family_type = 3 if (n_4 >=1 | n_5>=1) | (/*n_6 >=1 |*/ n_7>=1) | (n_8 >=1)
replace family_type = 4 if n_9 >= 1
replace family_type = . if n_1 == 0
label define family_type 1"Single" 2"Nuclear" 3"Extended" 4"Complex" 9"Unspecified"
label var family_type "Family type / household composition"
label val family_type family_type

tab family_type, m

** ========================================================================== **

** MERGE **
sort hv001 hv002

* PHtemp.dta was saved above in the working directory set by cd
merge 1:m hv001 hv002 using PHtemp.dta
duplicates report
cap drop _merge

order hhid hhsize hv001 hv002 hv005 family_type hv105 hv104 hv217 hv021 hv023 relationship ///
n n_1 n_2 n_3 n_4 n_5 n_7 n_8 n_9 *

** ========================================================================== **

** SURVEY SET
gen person_wgt=hv005/1000000
gen psu =    hv021
gen strata = hv023

svyset psu [pw = person_wgt], strata(strata) vce(linearized)

** ========================================================================== **

tab hv101 family_type, m
svy: tab family_type, percent format(%9.1f) col miss
tabstat hhsize [aw=person_wgt], by(family_type) stat(mean median sd min max) format(%9.1f)
tabstat hhsize if hhtag==1 [aw=person_wgt], by(family_type) stat(mean median sd min max) format(%9.1f)

***************

tab family_type [iw= person_wgt], m
svy: tab family_type, count format(%9.0f) miss
svy: tab family_type, percent format(%9.1f) col miss

svy: tab hv270 family_type, count format(%9.0f) miss
svy: tab hv270 family_type, percent format(%9.1f) row miss

svy: tab hv109 family_type, percent format(%9.1f) row miss

** ========================================================================== **

exit
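
One thing to check in tabulations like these: after the merge the file has one row per person, so a plain tab of family_type gives the share of persons living in each type of household, not the share of households. Restricting to one row per household (as hhtag does) gives the household distribution. The sketch below shows the difference on made-up data; hhid_demo and the values are hypothetical.

```stata
* Made-up data: one 2-person nuclear household, one 6-person extended household
clear
input hhid_demo hvidx family_type
1 1 2
1 2 2
2 1 3
2 2 3
2 3 3
2 4 3
2 5 3
2 6 3
end

* flag exactly one row per household
egen hhtag = tag(hhid_demo)

* person-level distribution: 2 persons nuclear, 6 persons extended
tab family_type

* household-level distribution: 1 household nuclear, 1 household extended
tab family_type if hhtag
```

With the survey design set, the analogous restriction would be something like svy, subpop(hhtag): tab family_type, which keeps the full sample for variance estimation.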
