Weighting issues of multilevel modelling using the DHS survey data with multi-stage sampling [message #15703]
Mon, 03 September 2018 05:19 
YUJP

Dear DHS expert,
I have been reading extensively the past and current forum discussions on the weighting issues of multilevel modelling using DHS survey data with multi-stage sampling. I would appreciate it if you could help enlighten me on the following question:
Basically, I am using a multilevel model to analyse a dataset from the Cambodia DHS 2014, with diarrhoea among children under five as the outcome and predictors at both the child level and the cluster (PSU) level. I am using the "melogit" command for this analysis (the same results can be produced using the "meglm" command). I plan to use the scaling methods (methods A or B) proposed by Sophia Rabe-Hesketh (2006) (http://www.gllamm.org/JRSSAsurvey_06.pdf) and Adam C Carle (2009) (https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-9-49). One problem is that in the DHS database we only have the weight (v005 or hv005) that already takes both stages of sampling (the cluster (PSU) and the woman (or household)) into consideration. As stated in the Stata Multilevel Mixed-Effects Reference Manual, Release 15 (page 104) (https://www.stata.com/manuals/me.pdf), we don't have w_j or w_i|j but only w_ij:
"Now take these same data and fit a two-level model with meglm, it is not sufficient to use the single sampling weight w_ij, because weights enter the log likelihood at both the group level and the individual level. Instead, what is required for a two-level model under this sampling design is w_j, the inverse of the probability that group j is selected in the first stage, and w_i|j, the inverse of the probability that individual i from group j is selected at the second stage conditional on group j already being selected. You cannot use w_ij without making any assumptions about w_j.
Given the rules of conditional probability, w_ij = w_j w_i|j. If your dataset has only w_ij, then you will need to either assume equal probability sampling at the first stage (w_j = 1 for all j) or find some way to recover w_j from other variables in your data; see Rabe-Hesketh and Skrondal (2006) and the references therein for some suggestions on how to do this, but realize that there is little yet known about how well these approximations perform in practice.
What you really need to fit your two-level model are data that contain w_j in addition to either w_ij or w_i|j. If you have w_ij (that is, the unconditional inclusion weight for observation i, j), then you need to divide w_ij by w_j to obtain w_i|j."
However, when I reread the Cambodia DHS report, I found that it actually gives the distribution of enumeration areas in the sampling, by stratum (page 282, Appendix A, Table A2, Cambodia Demographic and Health Survey 2014: https://dhsprogram.com/pubs/pdf/fr312/fr312.pdf). Call these counts C_j (j = stratum 1, 2, ..., 38). Since we can easily get the number of selected clusters in each stratum, which I call CS_j (j = stratum 1, 2, ..., 38), it seems that I would be able to calculate the probability that a cluster in each stratum was selected (CS_j/C_j) and thus the cluster-level weight w_j = C_j/CS_j. With w_j, I can then calculate the conditional weight w_i|j = w_ij/w_j.
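The calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the stratum counts and v005 values are made-up placeholders (real counts would come from Table A2 of the report), the function names are hypothetical, and w_j = C_j/CS_j assumes every enumeration area within a stratum had the same chance of selection. The only DHS-specific fact used is that v005 is stored multiplied by 1,000,000.

```python
# Hypothetical stratum counts: C_j = enumeration areas in the stratum,
# CS_j = clusters actually selected. Replace with values from the report.
strata = {
    1: {"total_eas": 200, "selected_eas": 10},
    2: {"total_eas": 150, "selected_eas": 15},
}


def cluster_weight(total_eas, selected_eas):
    """Approximate first-stage weight w_j = C_j / CS_j.

    Assumes equal selection probability for all enumeration areas
    within the same stratum.
    """
    return total_eas / selected_eas


def conditional_weight(w_ij, w_j):
    """Second-stage weight w_i|j = w_ij / w_j."""
    return w_ij / w_j


# Hypothetical respondent records; v005 is stored as weight * 1,000,000.
records = [
    {"stratum": 1, "v005": 1_200_000},
    {"stratum": 2, "v005": 800_000},
]

for r in records:
    w_ij = r["v005"] / 1_000_000
    counts = strata[r["stratum"]]
    r["w_j"] = cluster_weight(counts["total_eas"], counts["selected_eas"])
    r["w_i_given_j"] = conditional_weight(w_ij, r["w_j"])
```

Every respondent in the same stratum receives the same w_j here, so w_i|j differs from w_ij only by a constant within each stratum.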
I used this method and the information in the Appendix of the report to recalculate the scaled weights, and got results that are a bit different from (but still very similar to) the results produced by using w_ij (v005) as the first-level weight and presuming the second-level weight to be "1".
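For reference, here is a minimal Python sketch of the two within-cluster scalings as I understand them from Carle (2009): method A rescales the level-one weights so that they sum to the cluster sample size, and method B so that they sum to the effective cluster size. The function names and the example weights are hypothetical, and this is not the code used for the analysis above.

```python
def scale_method_a(weights):
    """Scale the weights of one cluster so they sum to the cluster sample size n_j."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


def scale_method_b(weights):
    """Scale the weights of one cluster so they sum to the effective
    cluster size (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return [w * total / total_sq for w in weights]


# Hypothetical level-one weights for the respondents of a single cluster.
cluster = [1.0, 2.0, 3.0]
scaled_a = scale_method_a(cluster)
scaled_b = scale_method_b(cluster)
```

Both scalings are unchanged when all weights in a cluster are multiplied by the same constant, so within a cluster they give the same scaled level-one weights whether w_ij or w_i|j = w_ij/w_j is supplied.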
I would appreciate your guidance on whether this is a valid way to obtain the two-level weights for multilevel analysis using DHS data. Or, at least, can it provide a better (less biased) estimate of the parameters than using w_ij as the first-level weight and presuming the second-level weight to be "1"?
Many thanks in advance.
[Updated on: Mon, 03 September 2018 06:22]
