Re: Reproductive calendar [message #1390 is a reply to message #1388] |
Thu, 20 February 2014 15:09 |
|
user-rhs
Messages: 132 Registered: December 2013
|
Senior Member |
|
|
Kash,
From a USAID 2009 report: Levels, Trends, and Reasons for Contraceptive Discontinuation (http://pdf.usaid.gov/pdf_docs/PNADQ639.pdf)
Quote:The DHS has created a system for generating events-based datasets from the calendar data,
where each change in the calendar becomes one observation, or "row," in a dataset. Each event
in the calendar--an episode of contraceptive use, a pregnancy, a birth, a termination, or an
episode of contraceptive non-use--is converted from the calendar string (the VCAL variables in
individual recode or woman-based datasets) into a separate observation for analysis. The start
and end date of each event is also recoded, allowing us to calculate directly the duration of the
event, women's age, women's parity, and children ever born (using the birth history) at the start
or end of each event.
Alternatively, you may want to read the documentation on Stata's string functions, such as substr(). Someone may already have answered your question on Statalist: http://www.stata.com/statalist/archive/
RHS
EDIT: I have attached a do-file that does what the report describes (it takes the calendar string and creates a new observation for each character). For this country, the length of the VCAL_1 variable was 65, so I generated 65 new variables and reshaped from wide to long. From here, you can use -stset- or what have you to set the dataset up as survival-time data, panel data, etc. I used Excel to help me repeat all of those command lines 65 times. I'm sure there's a more elegant way to do this, but this is one way you can get started.
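For anyone who wants to see what an events-based layout could look like, here is a minimal sketch on made-up data. It is not the attached do-file and not the DHS program: the variable names (cal, code, episode, start, duration) and the two toy strings are assumptions for illustration. As in VCAL, the first character is taken to be the most recent month.

```stata
* Toy data: two women with short, made-up calendar strings (not real DHS data)
clear
input str2 caseid str8 cal
"w1" "11BPP000"
"w2" "00003333"
end

* One row per woman-month; month 1 is the oldest month (last character)
gen len = strlen(cal)
expand len
bysort caseid: gen month = _n
gen str1 code = substr(cal, len + 1 - month, 1)

* A new episode starts whenever the code differs from the previous month
bysort caseid (month): gen byte newep = (_n == 1) | (code != code[_n-1])
bysort caseid (month): gen episode = sum(newep)

* Collapse to one row per episode with its code, first month, and length
collapse (min) start = month (count) duration = month, by(caseid episode code)
list, sepby(caseid)
```

With a real VCAL string, the blank months after the interview would show up as an episode of blanks that you would probably want to drop before analysis.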
-
Attachment: vcal.do
(Size: 2.27KB, Downloaded 1135 times)
-
Attachment: vcal.xlsx
(Size: 10.67KB, Downloaded 1126 times)
[Updated on: Thu, 20 February 2014 18:10]
|
|
|
Re: Reproductive calendar [message #1395 is a reply to message #1390] |
Fri, 21 February 2014 14:54 |
Reduced-For(u)m
Messages: 292 Registered: March 2013
|
Senior Member |
|
|
Wow. So that is a cool thing. But I'm not exactly sure what it is doing (I only had a chance to glance at the documentation). Can I ask: I've been creating woman-year birth history panels using "expand", some re-numbering of within-person observations into years, and then some loops that go through the cross-sectional data and check whether anything happened in that observation/year (did you have a baby in 1997, yes/no, in 1998, in 1999?). Does this method make that same panel (for whatever outcome), or is it making something else entirely? I can't really see the structure of the final data from this description.
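For readers unfamiliar with that approach, a woman-year panel of the kind described might be built roughly like this. This is only a sketch on made-up data: birthyr1 and birthyr2 are hypothetical stand-ins for birth-history variables, and the three-year window (1997 to 1999) is an assumption taken from the example years above.

```stata
* Toy cross-section: one row per woman, up to two birth years (hypothetical names)
clear
input str2 caseid birthyr1 birthyr2
"w1" 1997 1999
"w2" 1998 .
end

* Three copies of each woman, renumbered into calendar years 1997-1999
expand 3
bysort caseid: gen year = 1996 + _n

* Yes/no indicator: did this woman have a birth in this year?
gen byte birth = (year == birthyr1) | (year == birthyr2)
list caseid year birth, sepby(caseid)
```

The unit of observation here is the woman-year, not the event, which is the difference from the calendar-based layout.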
|
|
|
Re: Reproductive calendar [message #1396 is a reply to message #1395] |
Fri, 21 February 2014 18:44 |
|
user-rhs
Messages: 132 Registered: December 2013
|
Senior Member |
|
|
Hi Reduced, since the calendar is stored as a string variable, for it to be useful in survival analysis, it should be transformed so that each event is its own line (at least in my mind, that is how I think about it). Like I said earlier, I'm sure there's a more elegant way of separating the parts than what I did. I use Excel a lot when I'm managing data, especially for repetitious work, because I'm too lazy and impatient to do it manually.
Re: the panel question. In a sense it is a type of panel, in that births are nested within women. You can do survival analysis using the same data with the same long structure too. Of course, creating this sort of "panel" is only appropriate in some settings, i.e. when recall is reasonably accurate. I assume that recall of birth history is generally accurate, especially if the birth happened in the past 5 years.
|
|
|
Re: Reproductive calendar [message #1408 is a reply to message #1396] |
Sat, 22 February 2014 15:39 |
Reduced-For(u)m
Messages: 292 Registered: March 2013
|
Senior Member |
|
|
Thanks. I feel like sometimes when I read certain papers I don't know what the unit of observation is: woman, event, woman-year, potential-event-year... I think depending on your estimator and your strategy, any of those might be appropriate, but I was curious what the calendar variable creates. Got it.
"I use Excel a lot when I'm managing data, especially for repetitious work, because I'm too lazy and impatient to do it manually." - Me too! Auto-fill has saved me hours and hours of stupid code writing (followed by stupid typo de-bugging). Sometimes a loop is more elegant, but sometimes that trade-off isn't worth the time and energy.
|
|
|
Re: Reproductive calendar [message #1510 is a reply to message #993] |
Wed, 05 March 2014 11:33 |
Liz-DHS
Messages: 1516 Registered: February 2013
|
Senior Member |
|
|
Dear User,
Do you still need help with this post? If so, can you please provide a little more detail on what you are trying to accomplish and which data set you are working with?
Thank you!
|
|
|
Re: Reproductive calendar [message #1619 is a reply to message #1510] |
Tue, 18 March 2014 20:20 |
|
user-rhs
Messages: 132 Registered: December 2013
|
Senior Member |
|
|
Hi Liz,
I have got one for you about VCAL (feeding my curiosity here). In my previous post, I trimmed the VCAL_1 variable down to 65 characters. Upon reading the documentation for VCAL more closely, I see this is not the right thing to do. The recode manual says:
Quote:The first character in each variable represents the most recent point in time, while the 80th character
position represents data for January of the year in which the calendar started. The calendars
are fixed at the 80th character position, such that the first few entries in the calendar
represent points in time after the date of interview, and are consequently left blank.
I am wondering if you can help me understand what I'm seeing.
FYI, I'm looking at the Ethiopia 2011 data. I separate the variable VCAL_1 into 80 separate variables (I call them 'en', where n = location of the character in the string, so that the first character, AKA the most recent time point, is 'e1' and the point where the calendar starts is 'e80'). I do this in preparation for converting the data from wide to long. Just looking at this step, I see that e1-e11 are blank for all observations. Variables e12-e80 all contain some non-blank values. Given that the recall period is typically 5 years (60 months), do those first 11 columns mean anything? I noticed that for column 12, 99.81% had " " as the value, .15% had "0", and .04% had "P", which makes me think this column was only populated because those women were interviewed at the tail end of data collection or something like that. Is this a correct conclusion?
Is my "most recent timepoint" (month of interview) in fact column 12/13?
Thanks,
RHS
[Updated on: Tue, 18 March 2014 20:21]
|
|
|
Re: Reproductive calendar [message #1620 is a reply to message #1619] |
Tue, 18 March 2014 22:25 |
Liz-DHS
Messages: 1516 Registered: February 2013
|
Senior Member |
|
|
Dear RHS,
I will send your question over to one of our experts. The one thing I know about Ethiopia is that their calendar is different; that is, they use a system other than the Gregorian calendar. I will get back to you as soon as I have an answer.
Thanks!
|
|
|
Re: Reproductive calendar [message #1630 is a reply to message #1620] |
Wed, 19 March 2014 14:18 |
Liz-DHS
Messages: 1516 Registered: February 2013
|
Senior Member |
|
|
Dear RHS,
Here is a response from one of our experts, Guillermo Rojas:
Regarding the calendar in general (not just Ethiopia), the calendar should be used in conjunction with the date of interview. Row 80 in the calendar corresponds to the beginning of the calendar, and that is fixed for all interviews. In the case of Ethiopia it corresponds to January 1998 (variable V017 = 1177 = (1998-1900)*12+1). It is important to know that Ethiopia uses a different calendar: 2011 in the western calendar corresponds to 2003 in Ethiopia.
The first row with information in the calendar corresponds to the month of interview for that woman. Variable V018 tells where the calendar begins for a particular case. In the case of Ethiopia, row 12 corresponds to interviews conducted in September 2003 (V006=9, V007=2003, V008=1245), row 13 to interviews conducted in August 2003 (V006=8, V007=2003, V008=1244), and so on. The frequencies show that only 32 women were interviewed in September 2003, which explains the number of blanks in that row.
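The relationship between the calendar row and the century month codes can be checked with a little arithmetic. The sketch below assumes an 80-character calendar whose row 80 is the month in V017, so that a month with century month code c sits in row V017 + 80 - c. The two rows of toy data are the V017 and V008 values quoted above, not values read from the dataset, and pos_int is a hypothetical variable name.

```stata
* Toy values: calendar start (V017) and two interview dates (V008), all as CMCs
clear
input v017 v008
1177 1245
1177 1244
end

* Row of the month of interview; in real data this should match V018
gen pos_int = v017 + 80 - v008
list
```

For these two interview months the formula gives rows 12 and 13, and every row before the interview row is blank for that woman.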
Thank you for your post. If this does not answer your question, please feel free to post again.
|
|
|
Re: Reproductive calendar [message #2203 is a reply to message #2202] |
Mon, 26 May 2014 11:41 |
|
kash
Messages: 5 Registered: December 2013 Location: Pretoria
|
Member |
|
|
RHS,
Following our discussions, this is how I derive, in Stata, the duration to first pregnancy since the start of the reproductive calendar. I hope other users find it useful or can improve on it.
----------------------------------------------------------------------------------------
* check the calendar length (80 here) and show the full string
di length(vcal_1)
format %80s vcal_1
* split the string so that mcont1 is the oldest month (position 80)
local i = 1
forvalues n = 80(-1)1 {
    gen str1 mcont`i' = substr(vcal_1, `n', 1)
    local ++i
}
* one row per woman-month, then the first month coded "P" (pregnant)
reshape long mcont, i(caseid) j(event_time)
by caseid, sort: egen fecund = min(cond(mcont == "P", event_time, .))
----------------------------------------------------------------------------------------
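Because event_time = 1 is the first month of the calendar, the month of first pregnancy can be turned into a century month code by adding it to V017. Here is a small self-contained check of that idea on made-up data. The 8-character toy strings and the name cmc_firstpreg are assumptions for illustration; real calendars are 80 characters long.

```stata
* Toy data: 8-character calendars, most recent month first (not real DHS data)
clear
input str2 caseid str8 vcal_1 v017
"w1" "11BPP000" 1273
"w2" "00000000" 1273
end

* Same split as above, with the length held in a local instead of hard-coded
local len = 8
forvalues n = `len'(-1)1 {
    local i = `len' - `n' + 1
    gen str1 mcont`i' = substr(vcal_1, `n', 1)
}
reshape long mcont, i(caseid) j(event_time)
bysort caseid: egen fecund = min(cond(mcont == "P", event_time, .))

* CMC of the first pregnancy month; missing if no pregnancy in the calendar
gen cmc_firstpreg = v017 + fecund - 1
```

Women with no "P" anywhere in the calendar get a missing fecund, which you would treat as censored in a duration analysis.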
Best regards,
Kash
|
|
|
Re: Reproductive calendar [message #7047 is a reply to message #993] |
Mon, 17 August 2015 06:32 |
Kisaakye
Messages: 15 Registered: August 2015
|
Member |
|
|
Hello..
I am just beginning to work with calendar data (for now I am using the 2011 Uganda DHS). I have extracted the data from the standard recode file. In the meantime I am trying to study the data, but as I browse through, I realize the first 11 columns have no record/event. I am struggling to find an explanation for this. The second question relates to the events column. What does "N" stand for? I seem to understand all other letters and figures.
Thanks in advance
Peter
|
|
|
Re: Reproductive calendar [message #7048 is a reply to message #7047] |
Mon, 17 August 2015 10:59 |
Trevor-DHS
Messages: 802 Registered: January 2013
|
Senior Member |
|
|
The calendar (vcal_1) is set up as a string of 80 characters, with the 80th character representing the start point of the calendar, which, in the case of Uganda DHS 2011, is January 2006. V017 tells you the century month code for the start of the calendar (1273 = January 2006). Position 1 in the calendar in this case would then be 79 months later, which would be August 2012; however, this is beyond the date of interview. Any months beyond the date of interview are left blank. If the interview, for example, took place in November 2011, then the information in the calendar would start in position 10, and positions 1-9 (corresponding to August 2012 back to December 2011) would all be blank. V018 tells you the position of the month of interview, and for an interview in November 2011 would be position 10.
You say that the first 11 positions have no record/event, but that is only the case for an interview taking place in September 2011. The interviews took place between June 2011 and December 2011, so the number of blanks at the beginning of the record will differ, depending on the month of interview.
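The month arithmetic can be reproduced with a few scalars. This is only a worked example using the usual century month code definition (CMC = (year - 1900) * 12 + month); the scalar names are made up for illustration.

```stata
* Position 1 is 79 months after the start of the calendar (CMC 1273)
scalar c  = 1273 + 79
scalar yr = 1900 + floor((c - 1) / 12)
scalar mo = mod(c - 1, 12) + 1
display "Position 1: CMC " c " = month " mo " of " yr

* Position of the month of interview for an interview in November 2011
scalar c_int = (2011 - 1900) * 12 + 11
scalar pos   = 1273 + 80 - c_int
display "Interview CMC " c_int " falls in position " pos
```

The first display gives month 8 of 2012, and the second gives position 10 for a November 2011 interview, so positions 1-9 are blank for that woman.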
IMPORTANT NOTE: For some surveys, the calendar data are misaligned in the Stata data files. In those surveys affected, the calendar data are all trimmed so that they are left aligned. This only affects the Stata versions of the datasets, and does not affect all surveys (only a subset). I have written code to correct the misalignment, which I am attaching here. You will find a foreach statement in the code that lists the datasets affected (that I know of to date). There is a quick check to see if your dataset is affected:
gen v = substr(vcal_1,1,1)
tab v
If the calendar is correctly aligned, v will be mostly blank - in fact in the vast majority of surveys it will be all blank.
If v is not blank for most/all cases, then the calendar is misaligned. To correct the alignment, you can use the code attached.
I'm also attaching code I use for reshaping the calendar, which you can adapt for your own needs. Make sure you run the realign code first, though, before the reshape code, if the survey needs it [Uganda DHS 2011 is fine and does not need realigning - see the foreach for the datasets that I know do need realigning].
|
|
|
Re: Reproductive calendar [message #7049 is a reply to message #7048] |
Tue, 18 August 2015 05:57 |
Kisaakye
Messages: 15 Registered: August 2015
|
Member |
|
|
Thanks very much, Trevor, for this wonderful explanation. In the meantime, as I try to understand this, do you know what events "N" and "W" stand for?
Thanks.
|
|
|
Re: Reproductive calendar [message #7052 is a reply to message #7051] |
Tue, 18 August 2015 10:17 |
Kisaakye
Messages: 15 Registered: August 2015
|
Member |
|
|
Yes, it has, but these two specific letters (N and W) are not defined and yet they appear in the data. I do not know what they stand for or which specific method they represent.
|
|
|
Re: Reproductive calendar [message #7053 is a reply to message #7051] |
Tue, 18 August 2015 10:33 |
|
user-rhs
Messages: 132 Registered: December 2013
|
Senior Member |
|
|
By "country-specific documentation," I mean the Word document ("Individual Recode Documentation") that comes with the zipped dataset. I don't have access to the Uganda dataset, so I can't tell you what your codes mean, but in the Indonesia 2012 dataset, for VCAL1, "W" stands for "Other" and "N" stands for "Implants/Norplant," for VCAL2 (discontinuation), "W" stands for "Other" and "N" stands for "IUD expelled," for VCAL5 (source), "N" stands for "Friends/relatives," and for VCAL6, "W" stands for "Other" and "N" stands for "Implants/Norplant."
[Updated on: Tue, 18 August 2015 10:34]
|
|
|
Re: Reproductive calendar [message #7056 is a reply to message #7054] |
Tue, 18 August 2015 11:19 |
Liz-DHS
Messages: 1516 Registered: February 2013
|
Senior Member |
|
|
Dear User,
In the Recode6 dictionary, for Col1, W is Other and N is Implants/Norplant; for Col2 (discontinuation), W is Other and there is no N code. In this survey, there was only a two-column calendar.
Thank you!
|
|
|
Re: Reproductive calendar [message #7058 is a reply to message #7056] |
Tue, 18 August 2015 11:24 |
Kisaakye
Messages: 15 Registered: August 2015
|
Member |
|
|
Thanks for the response, but there is a big mix-up. According to the coding frame available to me (the one used in the survey, derived from the questionnaire), 5 was for Implants in Col1 and X was for other modern methods in Col1. For Col2, however, X was kept for Other. I am still stuck on how to go about it.
|
|
|
Re: Reproductive calendar [message #8126 is a reply to message #7066] |
Wed, 26 August 2015 08:17 |
Kisaakye
Messages: 15 Registered: August 2015
|
Member |
|
|
Trevor, the information (codes for specific methods) in the manual is very helpful in understanding what the letters stand for in the datasets. However, "M" is not listed or explained in the recode manual (page 98), and yet it appears in the dataset (UDHS 2011). Could you help with this?
Thanks
|
|
|