The DHS Program User Forum
Discussions regarding The DHS Program data and results
Correct Way to Do Sample Weighting for Cross-Sectional Pooled Data from Kenya [message #14283] Sun, 18 March 2018 13:30
subanara
Messages: 1
Registered: December 2016
Location: Los Angeles, CA
Member
Hi,

I am using a pooled sample of the Kenya 2003, 2008-09, and 2014 datasets, restricted to women who answered the domestic violence module. So far I have run my regressions unweighted, but I know this likely biases the results toward the 2014 survey (since its total sample size is much larger), and I have included a control variable for the interview year to account for variation between surveys.

I was wondering what the correct procedure (in Stata) is to generate weights for this type of pooled sample. I'm assuming I should use the d005 weight.

Thanks so much!
Re: Correct Way to Do Sample Weighting for Cross-Sectional Pooled Data from Kenya [message #14298 is a reply to message #14283] Tue, 20 March 2018 08:55
Bridgette-DHS
Messages: 2536
Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:


Yes, you should use d005. I also recommend that you give equal weight to each survey. There are various ways to do this; the following is one. Label the three surveys in the pooled file as survey=1, survey=2, and survey=3. Calculate the sum of the DV weights within each survey as D1, D2, and D3, and their total as D=D1+D2+D3. Then calculate the revised weight as d005r=(d005/D1)*(D/3) in the first survey, d005r=(d005/D2)*(D/3) in the second survey, and d005r=(d005/D3)*(D/3) in the third survey.
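To see why this gives each survey equal weight, the short check below sums the revised weights within survey k, using only the definitions above (the index i over women in a survey is notation introduced here for illustration):

```latex
\sum_{i \in \text{survey } k} \mathrm{d005r}_i
  = \frac{D}{3\,D_k} \sum_{i \in \text{survey } k} \mathrm{d005}_i
  = \frac{D}{3\,D_k}\, D_k
  = \frac{D}{3},
  \qquad k = 1, 2, 3.
```

Each survey therefore contributes the same weighted total, D/3, and the pooled total remains D.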

The following lines will accomplish this (change the paths, of course):

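set more off

* First survey: keep the ID, the v0* variables, and the DV module (d*) variables
use "e:\DHS\DHS_data\IR_files\KEIR42FL.dta", clear
keep caseid v0* d*
gen survey=1

save "e:\DHS\DHS_data\scratch\KEdvtemp.dta", replace

* Second survey: appended rows have survey missing, so label them 2
append using "e:\DHS\DHS_data\IR_files\KEIR52FL.dta"
keep caseid v0* d* survey
replace survey=2 if survey==.

* Third survey: same steps, labelled 3
append using "e:\DHS\DHS_data\IR_files\KEIR70FL.dta"
keep caseid v0* d* survey
replace survey=3 if survey==.

* Drop women without a DV weight
drop if d005==.

* Sum of the DV weights within each survey
summarize d005 if survey==1
scalar D1=r(sum)

summarize d005 if survey==2
scalar D2=r(sum)

summarize d005 if survey==3
scalar D3=r(sum)

* Total across the three surveys
scalar D=D1+D2+D3
scalar list D1 D2 D3 D

* Rescale so that the weights in each survey sum to D/3
* (double avoids losing precision in the large intermediate product)
gen double d005r=d005*(D/3)

replace d005r=d005r/D1 if survey==1
replace d005r=d005r/D2 if survey==2
replace d005r=d005r/D3 if survey==3

summarize d005*

* Overwrites the scratch file saved above
save, replace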
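
As a quick check, and as a sketch of how the revised weight might then be used, something like the following could be run on the pooled file. It assumes that v021 and v022 identify the primary sampling units and strata in all three surveys, which should be verified for each file. The names psu_pooled and strata_pooled are illustrative and are not part of the DHS recode.

```stata
* Check: the revised weights should sum to the same total in every survey
tabstat d005r, by(survey) statistics(sum n)

* Assumption: v021 = PSU and v022 = stratum in each file (verify per survey).
* Cluster and stratum codes repeat across surveys, so make them unique first.
egen psu_pooled = group(survey v021)
egen strata_pooled = group(survey v022)
svyset psu_pooled [pweight=d005r], strata(strata_pooled) singleunit(centered)

* Each survey should now account for about one third of the weighted sample
svy: tabulate survey
```

With the design declared this way, a pooled regression would use the svy: prefix with the chosen outcome and covariates, plus i.survey as the control for survey round.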
