Weighting Couples HIV data [message #8380]
Mon, 19 October 2015 12:10
DaniD
Messages: 13 Registered: November 2014
Member
Hi!
I am working with the couples data merged with the HIV data. Which HIV weight (the men's or the women's) should I use when running analyses that involve both men and women?
Thanks!
Dani
Re: Weighting Couples HIV data [message #8498 is a reply to message #8484]
Wed, 04 November 2015 12:04
DaniD
Messages: 13 Registered: November 2014
Member
Hi again,
This might be a good question for Tom Pullum, since he has done this exact type of weighting.
Here is how I am thinking about the weighting issue; maybe he or someone else can confirm that it is correct.
I am doing an analysis of HIV-tested couples in xx African countries. I am planning to use the WOMEN'S HIV weights, since the women were the ones selected for the DHS samples. I will then follow the strategy for combining countries that Tom describes in another topic thread, which is copied below:
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
Any analysis using hiv03 (result of the HIV test) should be weighted with hiv05. Once you have merged with the AR file, you should ignore v005 or mv005 or hv005.
Your within-survey analyses are fine with the original hiv05 as the weight. Any adjustment to the weights related to pooling would be by a survey-specific multiplier and would have no effect on within-survey estimates.
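A minimal sketch of the point above, that a survey-specific multiplier leaves within-survey estimates unchanged. The toy data, the 0/1 coding of hiv03, the multiplier 3.7, and the variable hiv05x are made up for illustration. In a real file hiv03 may carry additional codes that need recoding first, and "strata" is assumed to have been constructed already.

```stata
* Toy single-survey file with made-up values (illustration only)
clear
input v001 strata hiv03 hiv05
1 1 0 900000
1 1 1 900000
2 1 0 1100000
2 1 0 1100000
3 2 1 1000000
3 2 0 1000000
4 2 0 1200000
4 2 0 1200000
end

* Weighted HIV prevalence using the original hiv05
svyset v001 [pweight=hiv05], strata(strata)
svy: mean hiv03
scalar p1 = _b[hiv03]

* Same estimate after multiplying every weight by an arbitrary constant
gen double hiv05x = hiv05*3.7
svyset v001 [pweight=hiv05x], strata(strata)
svy: mean hiv03
scalar p2 = _b[hiv03]

display "original weight: " p1 "   rescaled weight: " p2
```

The two point estimates should agree up to rounding, because a constant multiplier cancels out of a weighted mean.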
My preferred way to handle the renumbering of clusters and strata in a pooled file is to use the "egen group" command. Within each survey, the cluster variable is always v001 (which is duplicated as v021). The stratum variable is not always stored under the same variable number, and it is not always even named correctly. The strata are virtually always the combinations of region x v025 (v025 is urban/rural). I would find or construct that variable and then rename it as "strata", e.g. "gen strata=v022".
You also need a unique identifier for "survey". You cannot rely on v000 for this, because v000 is a 3-character string such as "NG5", where "NG" is the country id and "5" is the phase of DHS. Sometimes there will be two surveys in the same phase, and v000 will be the same for both of them. (This is not an issue if you are using just one survey per country.) Anyway, you will need lines such as
egen cluster_pooled=group(survey v001)
egen strata_pooled=group(survey strata)
and then you will have the unique identifiers.
To give equal weight to each survey, you need lines such as these FOR EACH SURVEY SEPARATELY:
scalar TOTWT=1000000
quietly summarize hiv05
scalar T=r(sum)
gen hiv05r=hiv05*TOTWT/T
You can do this adjustment before the pooling, or put those lines in a loop after the pooling, but just be sure that the rescaling is survey-specific. These lines will remove the arbitrary factor of 1000000 from the original hiv05 and will give each survey an arbitrary TOTAL weight of 1000000. (That number could be anything you want.)
This approach will give the same weight to every survey, regardless of the population of the country or the size of the sample. You need to make it very clear that your regional estimates were calculated that way. If, say, you wanted to weight each survey in proportion to its population size, you would set TOTWT to the country's total population or the population age 15-49 or something like that.
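A sketch of the "loop after the pooling" option, under stated assumptions: the pooled file has a numeric survey identifier called "survey" and a constructed "strata" variable, as described above. The toy data and the variable tot_check are made up for illustration, and the svyset line is one possible way to declare the design, not something prescribed in the response.

```stata
* Toy pooled file: two surveys with made-up weights (illustration only)
clear
input survey v001 strata hiv05
1 1 1 500000
1 2 1 1500000
2 1 1 800000
2 2 1 1200000
2 3 2 2000000
end

* Unique cluster and stratum identifiers across surveys
egen cluster_pooled = group(survey v001)
egen strata_pooled  = group(survey strata)

* Rescale hiv05 so that every survey has the same total weight
scalar TOTWT = 1000000
gen double hiv05r = .
levelsof survey, local(surveys)
foreach s of local surveys {
    quietly summarize hiv05 if survey == `s'
    quietly replace hiv05r = hiv05*TOTWT/r(sum) if survey == `s'
}

* Check: the rescaled weights sum to TOTWT within each survey
bysort survey: egen double tot_check = total(hiv05r)
assert reldif(tot_check, TOTWT) < 1e-8

* Declare the pooled design with the new identifiers and weight
svyset cluster_pooled [pweight=hiv05r], strata(strata_pooled)
```

If "survey" is a string variable, the comparison inside the loop would need the value in double quotes. To weight surveys by population instead, TOTWT would be set separately for each survey inside the loop.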
If you think there are problems with this strategy and/or can propose a better one, please let me know.
Thanks so much!
Dani
Re: Weighting Couples HIV data [message #8545 is a reply to message #8498]
Thu, 12 November 2015 09:44
Bridgette-DHS
Messages: 3214 Registered: February 2013
Senior Member
Following is a response from Senior DHS Stata Specialist, Tom Pullum:
If you are using couples with the HIV data attached, DHS would recommend that you use hiv05 from the men as the weight variable. We always have worse nonresponse for men than for women, especially for HIV tests. This is contrary to your suggestion. Sorry about that.
Just "off the top of my head" I would suggest that in the couples file you take the log of hiv05 for the women and the log of hiv05 for the men, and compare their standard deviations. I would predict that in most surveys the standard deviation of log(hiv05) will be greater for men than for women. That's a way of quantifying the variability in nonresponse.
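One way to carry out that comparison, sketched with made-up weights. The variable names hiv05_w and hiv05_m are hypothetical: they assume the women's and men's hiv05 were renamed when the HIV data were merged onto the couples file.

```stata
* Toy couples file: one row per couple, made-up weights (illustration only)
clear
input hiv05_w hiv05_m
950000 700000
1000000 1000000
1050000 1400000
1000000 900000
end

* Log of each partner's HIV weight
gen double ln_w = ln(hiv05_w)
gen double ln_m = ln(hiv05_m)

* Standard deviation of the log weights, women versus men
quietly summarize ln_w
scalar sd_w = r(sd)
quietly summarize ln_m
scalar sd_m = r(sd)
display "SD of log(hiv05), women: " sd_w "   men: " sd_m
```

In a pooled file the same comparison would be run separately within each survey, since the weights are survey-specific.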
Stan Becker at Johns Hopkins University has proposed a way to synthesize the weights from men and women in a couples file. Here's a reference: http://uaps2015.princeton.edu/abstracts/151669.