The DHS Program User Forum: Weighting data » Weighting data after merging survey rounds with different levels of representation

Home » Data » Weighting data » Weighting data after merging survey rounds with different levels of representation

Show: Today's Messages :: Show Polls :: Message Navigator

Re: Weighting data after merging survey rounds with different levels of representation [message #11576 is a reply to message #11575]

Wed, 11 January 2017 21:20

jswindle
Messages: 5
Registered: September 2016

Member

Hello Tom and Bridgett,

Thank you for your very helpful and prompt reply.

I ran the code you shared and got some results that surprised me. When I ran the final command of "tab strata survey, table clean" I got an error message saying that I could not use those options. When I instead ran "tab strata survey", I got these interesting results:

tab strata survey

group(surv
ey
strata_tem survey
p) 1 2 3 Total

1 226 0 0 226
2 416 0 0 416
3 607 0 0 607
4 320 0 0 320
5 17 0 0 17
6 253 0 0 253
7 190 0 0 190
8 470 0 0 470
9 27 0 0 27
10 447 0 0 447
11 767 0 0 767
12 174 0 0 174
13 556 0 0 556
14 172 0 0 172
15 438 0 0 438
16 433 0 0 433
17 611 0 0 611
18 187 0 0 187
19 459 0 0 459
20 195 0 0 195
21 340 0 0 340
22 800 0 0 800
23 105 0 0 105
24 121 0 0 121
25 450 0 0 450
26 331 0 0 331
27 186 0 0 186
28 193 0 0 193
29 28 0 0 28
30 188 0 0 188
31 435 0 0 435
32 185 0 0 185
33 239 0 0 239
34 67 0 0 67
35 22 0 0 22
36 591 0 0 591
37 193 0 0 193
38 787 0 0 787
39 95 0 0 95
40 614 0 0 614
41 285 0 0 285
42 0 420 0 420
43 0 283 0 283
44 0 47 0 47
45 0 850 0 850
46 0 40 0 40
47 0 732 0 732
48 0 81 0 81
49 0 693 0 693
50 0 263 0 263
51 0 690 0 690
52 0 78 0 78
53 0 625 0 625
54 0 31 0 31
55 0 789 0 789
56 0 101 0 101
57 0 705 0 705
58 0 307 0 307
59 0 403 0 403
60 0 42 0 42
61 0 735 0 735
62 0 230 0 230
63 0 3,553 0 3,553
64 0 0 92 92
65 0 0 754 754
66 0 0 825 825
67 0 0 318 318
68 0 0 33 33
69 0 0 789 789
70 0 0 35 35
71 0 0 786 786
72 0 0 60 60
73 0 0 718 718
74 0 0 45 45
75 0 0 821 821
76 0 0 32 32
77 0 0 781 781
78 0 0 138 138
79 0 0 650 650
80 0 0 76 76
81 0 0 832 832
82 0 0 480 480
83 0 0 646 646
84 0 0 53 53
85 0 0 723 723
86 0 0 55 55
87 0 0 746 746
88 0 0 44 44
89 0 0 786 786
90 0 0 41 41
91 0 0 823 823
92 0 0 127 127
93 0 0 668 668
94 0 0 197 197
95 0 0 755 755
96 0 0 29 29
97 0 0 706 706
98 0 0 42 42
99 0 0 778 778
100 0 0 70 70
101 0 0 747 747
102 0 0 81 81
103 0 0 737 737
104 0 0 63 63
105 0 0 831 831
106 0 0 37 37
107 0 0 782 782
108 0 0 35 35
109 0 0 767 767
110 0 0 90 90
111 0 0 761 761
112 0 0 66 66
113 0 0 723 723
114 0 0 85 85
115 0 0 778 778
116 0 0 137 137
117 0 0 746 746

Total 13,220 11,698 23,020 47,938

The part of these results that I found surprising is that the number of strata per survey vary in strange way. There are 41 categories for 2000, 22 categories for 2004, and 54 categories for 2010. The result for 2010 makes sense; there were 27 districts and when stratified by urban/rural you get 54. The result for 2004, I believe comes from 11 districts categories stratified by urban/rural; those 11 district categories are the ten largest districts that were sampled in a representative manner and then there is one big catch-all for the other 17 districts, hence the huge total of 3,553 respondents in the catch-all rural category (at least that is my guess). The 2000 results are perplexing. From what I can gather in the final report for the 2000 Malawi DHS, the sampling was done in the same manner as the 2004 survey, so I'm not sure why there are 41 categories here. Thoughts?

Once I have calculate the strata correctly, would the rest of this code (pasted below) work to appropriately survey set the data?

generate weight = v005/10000000
egen clusters=group(survey v021), label
svyset clusters [pweight=weight], strata(strata) singleunit(centered)

Or would you simply do:

generate weight = v005/10000000
svyset [pweight=weight], psu(v021) strata(strata)

In case it is relevant for deciding how to svyset the data, my ultimate goal is to do a three-level mixed effects model with the higher orders being the districtyear and district variables.

A final issue I am facing if I do this sort of mixed effects model is whether the 2000 and 2004 data from the 17 districts that are not sampled sufficiently to be representative could be appropriately incorporated into such a model. I realize that is outside the purvue of the DHS surveys, but I'm guessing you have faced these types of issue before in your own research. Any thoughts?

thank you kindly,
Jeff

Report message to a moderator

[Message index]

		Weighting data after merging survey rounds with different levels of representation By: jswindle on Wed, 11 January 2017 01:11
		Re: Weighting data after merging survey rounds with different levels of representation By: Bridgette-DHS on Wed, 11 January 2017 18:56
		Re: Weighting data after merging survey rounds with different levels of representation By: jswindle on Wed, 11 January 2017 21:20
		Re: Weighting data after merging survey rounds with different levels of representation By: Bridgette-DHS on Fri, 13 January 2017 11:45

Previous Topic:	Post-stratification for DHS data
Next Topic:	Disaggregating to lower Administrative Division

Goto Forum:

-=] Back to Top [=-

[ Syndicate this forum (XML) ] [

] [

]

Current Time: Thu Jan 29 11:00:11 Coordinated Universal Time 2026