recoding v034X (line number of husband) [message #9853]
Tue, 31 May 2016 11:12
mm4599
Messages: 3 Registered: May 2016 Location: New York
I am using the 2003/04 Tanzania HIV/AIDS Indicator Survey, the 2007/08 Tanzania AIDS Indicator Survey (AIS) and the 2011/12 AIS.
The sample size in the 2003/04 individual recode (IR) file, tzir4afl.dta, is 12,522. The husband's line number variable (v034X) is stored in wide format, and there are no observations in v034_5 - v034_8. Tabulating v034_1 - v034_4 gives:
Value v034_1 v034_2 v034_3 v034_4 Total
0 677 203 22 4 906
1 3,541 2 3,543
2 2,635 36 6 2,677
3 199 38 3 240
4 119 4 4 127
5 50 3 1 54
6 40 4 44
7 35 2 37
8 33 2 35
9 22 3 1 26
10 21 1 22
11 14 14
12 10 10
13 5 1 6
14 6 6
15 3 3
16 2 2
17 2 2
20 1 1
22 1 1
31 1 1
35 1 1
36 2 2
39 1 1
42 1 1
43 1 1
0
Total 7,422 300 36 5 7,763
When I reshape the file (gen id = _n; reshape long v034_, i(id) j(linenum)) and tabulate v034_, I get 7,763 non-missing observations, but the dataset grows to 50,088 records, with 42,325 missing values for v034_.
What do I need to do so that the sample size for the reshaped file is NOT 50,088?
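For reference, the 50,088 figure is consistent with how reshape long works: it creates one row per woman for each of the four variables, so 12,522 x 4 = 50,088 rows, of which 50,088 - 7,763 = 42,325 hold no line number. A minimal sketch of reshaping and then keeping only the filled rows follows. It assumes that the empty v034_5 - v034_8 are dropped first and that rows with a missing line number are not needed for the analysis; id and linenum are the names from the command above.

```stata
* Sketch only: assumes tzir4afl.dta is in the working directory
use tzir4afl.dta, clear

* Assumption: these four variables are entirely missing in this file
drop v034_5 v034_6 v034_7 v034_8

* One id per woman, then one row per woman per v034_ suffix
gen id = _n
reshape long v034_, i(id) j(linenum)
count                      // expected: 12,522 x 4 rows

* Keep only the rows that actually hold a husband line number
drop if missing(v034_)
count                      // should match the non-missing total from the tabulation
```

Note that this leaves one row per reported line number rather than one row per woman, so a woman with entries in more than one of v034_1 - v034_4 still appears more than once.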
When I create a variable (gen v034 = .) and use the replace command to combine the multiple husband line number variables, I get 12,522 records (7,422 + 5,100 missing), which is not correct:
tab v034_
v034_ Freq. Percent Cum
0 677 9.12 9.12
1 3,541 47.71 56.83
2 2,635 35.5 92.33
3 199 2.68 95.01
4 119 1.6 96.62
5 50 0.67 97.29
6 40 0.54 97.83
7 35 0.47 98.3
8 33 0.44 98.75
9 22 0.3 99.04
10 21 0.28 99.33
11 14 0.19 99.51
12 10 0.13 99.65
13 5 0.07 99.72
14 6 0.08 99.8
15 3 0.04 99.84
16 2 0.03 99.87
17 2 0.03 99.89
20 1 0.01 99.91
31 1 0.01 99.92
35 1 0.01 99.93
36 2 0.03 99.96
39 1 0.01 99.97
42 1 0.01 99.99
43 1 0.01 100
Total 7,422 100
Another thing I tried was to split the dataset into 4 files, one each for v034_1, v034_2, v034_3 and v034_4, and then merge the files. Doing so, I obtain 12,863 records (7,763 + 5,100 missing, or 12,522 + 341).
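The extra 341 records equal the non-missing counts in v034_2 - v034_4 (300 + 36 + 5), which suggests that some women have an entry in more than one of the four variables. A minimal sketch of one way to check this on the original wide file before deciding how to combine them; nhusb is a hypothetical variable name introduced here for illustration.

```stata
* Sketch only: run on the original wide file, one record per woman
use tzir4afl.dta, clear

* Count how many of the four variables are filled for each woman
egen nhusb = rownonmiss(v034_1 v034_2 v034_3 v034_4)

* Distribution of women by number of line-number entries
tab nhusb

* Inspect the women with more than one entry
list v034_1 v034_2 v034_3 v034_4 if nhusb > 1 in 1/200
```

The in 1/200 restriction only limits the listing to the first 200 records and can be removed.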
Which is the correct way to reformat this dataset so that I only have one variable v034?
I am attaching the do file.
Thanks
MM