The DHS Program User Forum
Discussions regarding The DHS Program data and results
Weights not normalized? (Unweighted number of observations and sum of household weights differ) [message #22751] Sat, 01 May 2021 16:36
MiFoo
Hello,

in my dataset (DHS from Bangladesh, PR file), the weighted number of observations is always lower than the unweighted one. Shouldn't the weights be normalized in DHS surveys for each survey year, so that the sum of the normalized weights equals the sum of the cases over the entire sample?

Even when using the full data in the PR file from a single year such as 2017,
data %>% summarize(n=survey_total())
or equivalently
sum(dataPR$hv005/1000000)

I get a lower number than the number of observations in the dataset. Have I misunderstood the normalization in DHS surveys?
Note: I am using RStudio.

Thank you!

[Updated on: Sun, 02 May 2021 17:22]

Re: Weights not normalized? [message #22777 is a reply to message #22751] Thu, 06 May 2021 14:27
Bridgette-DHS

Following is a response from DHS Research & Data Analysis Director, Tom Pullum:

In the HR file, the mean value of hv005 is 1 (or 1000000 if you keep the factor of one million). In that file, the units are households and hv005 is a household weight. The PR file lists all the individuals in all the households but keeps the same value of hv005 as the HR file. That is, the PR file is an individual-level file but the weight variable is the household weight, so the mean of hv005 over all rows of the PR file is generally not 1. If you calculate the average value of hv005 in the PR file, restricted to hvidx=1 or hv101=1 (one person per household), you will get a mean of 1.
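To make this concrete, here is a minimal base R sketch with a made-up toy dataset (the household sizes and weights are invented for illustration and are not taken from any DHS file). It builds a PR-like table from an HR-like table and compares the unweighted count, the sum of hv005/1000000 over all persons, and the mean weight restricted to hvidx == 1.

```r
# Toy household (HR-like) table; every value here is hypothetical.
hr <- data.frame(
  hhid  = 1:4,
  hv005 = c(500000, 500000, 1500000, 1500000),  # household weight x 1,000,000
  size  = c(6, 5, 2, 1)                         # number of household members
)

# Build a PR-like table: one row per member, each carrying the household weight.
pr <- hr[rep(seq_len(nrow(hr)), times = hr$size), c("hhid", "hv005")]
pr$hvidx <- sequence(hr$size)  # line number of the member within the household
rownames(pr) <- NULL

n_unweighted <- nrow(pr)                           # number of person records
n_weighted   <- sum(pr$hv005 / 1e6)                # sum of household weights over persons
mean_hh      <- mean(pr$hv005[pr$hvidx == 1] / 1e6)  # one person per household

print(c(unweighted = n_unweighted, weighted = n_weighted, mean_hh = mean_hh))
```

In this toy example the larger households happen to carry the smaller weights, so the weighted total over persons falls below the number of rows, while the mean over one record per household is still 1.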

The mean of v005 is 1 in the IR file, but the children in the BR file are given the mother's weight (v005), so the mean of v005 in the BR file is not 1.

We always use hv005 when analyzing the PR file and v005 when analyzing the BR file (or KR file) and do not re-normalize to a mean of 1. If you want to go through that step, you certainly can, but it won't make much difference. You are not using Stata, but in Stata, with the pweight option, the weights are automatically normalized to have a mean of 1.
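If you do want to re-normalize in R, one possible approach is sketched below in base R. The data frame and the indicator x are hypothetical. Dividing the weights by their mean makes them sum to the number of rows, and because this is a rescaling by a constant, a weighted mean or proportion is unchanged.

```r
# Hypothetical person-level data: hv005 is the household weight x 1,000,000,
# x is an arbitrary 0/1 indicator. All values are invented for illustration.
pr <- data.frame(
  hv005 = c(500000, 500000, 500000, 1500000, 1500000),
  x     = c(1, 0, 1, 1, 0)
)

wt      <- pr$hv005 / 1e6   # weights as stored, with the factor of one million removed
wt_norm <- wt / mean(wt)    # rescaled so the mean is 1 within this file

sum_norm <- sum(wt_norm)                   # now equals nrow(pr)
est_raw  <- weighted.mean(pr$x, wt)        # weighted proportion, original weights
est_norm <- weighted.mean(pr$x, wt_norm)   # weighted proportion, re-normalized weights

print(c(sum_norm = sum_norm, est_raw = est_raw, est_norm = est_norm))
```

Weighted totals and weighted counts do change with the rescaling, which is why the weighted number of observations differs from the unweighted one when the original hv005 is used.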