District-level sex ratio and district-level age distribution [message #23960]
Fri, 21 January 2022 09:54
Lea F
Hi,
I am using the 2015-2016 DHS data for India (NFHS-4), and I would like to calculate the following:
1. District-level sex ratio (not at birth but in the total population)
2. District-level age distribution (e.g. in 5-year categories, or % below 15 years of age, or something similar)
I was wondering: a) how these would be calculated from the dataset, and b) which dataset would be best to use for these estimations - I assume the household survey would be best, but am not sure.
Thank you very much for your help!
Re: District-level sex ratio and district-level age distribution [message #23974 is a reply to message #23960]
Sun, 23 January 2022 16:07
Bridgette-DHS
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
This would be done with the PR file. The following lines use the "collapse" command to give you the sex distribution and the age distribution (in 5-year intervals) for each district. The program also saves a file with the combined age-sex distribution. You need to specify whether you want the de facto or de jure population. Even the collapsed files are large!
* You need to specify a workspace
cd e:\DHS\DHS_data\scratch
use hv005 hv102-hv105 shdistri using "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\IAPR74FL.DTA" , clear
gen wt=hv005/1000000
* If you want the de jure population
keep if hv102==1
* If, instead, you want the de facto population
* keep if hv103==1
* Drop cases where age (hv105) is missing or unknown (about 0.02% of cases). hv105 is top-coded at 95 for ages 95+
drop if hv105>95
* age5 gives standard five-year age intervals: 1 for ages 0-4, 2 for ages 5-9, etc.
gen age5=1+ int(hv105/5)
collapse (sum) wt, by(hv104 age5 shdistri)
save IAtemp_age_sex_district.dta, replace
* The by() variables in the next two collapse commands are hv104 and age5 with shdistri (originally mistyped as "sex", "age", and "district"; see the follow-up message)
use IAtemp_age_sex_district.dta, clear
collapse (sum) wt, by(hv104 shdistri)
* this file gives the weighted numbers of males and females in each district in the NFHS-4
save IAtemp_sex_district.dta, replace
use IAtemp_age_sex_district.dta, clear
collapse (sum) wt, by(age5 shdistri)
* this file gives the weighted number of persons in each five-year age group in each district in the NFHS-4
save IAtemp_age_district.dta, replace
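The collapsed files hold weighted counts, not the ratio or percentages asked about in the original question. As a possible next step, here is a minimal sketch (not part of the original reply, and untested on the full file) that turns those counts into a district-level sex ratio and a percentage under age 15. It starts from IAtemp_age_sex_district.dta as saved above. The output file names and new variable names (sex_ratio, wt_u15, pct_u15) are hypothetical, and expressing the sex ratio as females per 1,000 males is an assumption about the convention you want.

```stata
* Sketch only: assumes IAtemp_age_sex_district.dta is in the current working directory

* --- District-level sex ratio ---
use IAtemp_age_sex_district.dta, clear
collapse (sum) wt, by(hv104 shdistri)
* hv104 is coded 1 = male, 2 = female, so reshape creates wt1 (males) and wt2 (females)
reshape wide wt, i(shdistri) j(hv104)
* Assumed convention: females per 1,000 males
gen double sex_ratio = 1000*wt2/wt1
* Hypothetical output file name
save IAtemp_sexratio_district.dta, replace

* --- District-level percentage under age 15 ---
use IAtemp_age_sex_district.dta, clear
* age5 values 1, 2, 3 correspond to ages 0-4, 5-9, 10-14
gen double wt_u15 = wt*(age5<=3)
collapse (sum) wt wt_u15, by(shdistri)
gen double pct_u15 = 100*wt_u15/wt
* Hypothetical output file name
save IAtemp_under15_district.dta, replace
```

Each saved file has one line per district. If you prefer males per 100 females, swap wt1 and wt2 and change the multiplier.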
Re: District-level sex ratio and district-level age distribution [message #23988 is a reply to message #23986]
Tue, 25 January 2022 07:42
Bridgette-DHS
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
I just looked again and I see a couple of typos. I didn't actually test the program because the file is so large.
In these lines:
use IAtemp_age_sex_district.dta, clear
collapse (sum) wt, by(sex district)
* this file gives the weighted numbers of males and females in each district in the NFHS-4
save IAtemp_sex_district.dta, replace
use IAtemp_age_sex_district.dta, clear
collapse (sum) wt, by(age district)
"by(sex district)" should be replaced by "by(hv104 shdistri)" and "by(age district)" should be replaced by "by(age5 shdistri)".
I really think it is very clear what each step of the program does. Stata code is very intuitive. There are no complicated commands or options. I'll just clarify one command. "collapse (sum) wt, by(hv104 shdistri)" means that the file is reduced by adding up the values of "wt" within each combination of sex and district. There will be one line for each combination and it will include the total weighted number of cases.
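To see what that collapse command does without loading the full PR file, here is a small illustration on made-up data. The six rows, district codes, and weights below are invented purely for demonstration and are not NFHS-4 values.

```stata
* Toy illustration only: the values below are invented, not taken from the NFHS-4
clear
input hv104 shdistri wt
1 101 0.8
1 101 1.2
2 101 1.0
1 102 0.5
2 102 0.7
2 102 0.9
end

* Add up wt within each combination of sex (hv104) and district (shdistri)
collapse (sum) wt, by(hv104 shdistri)

* Six input rows are reduced to four: one per sex-district combination
list
```

After the collapse, wt on each line is the sum of the weights of the input rows that shared that sex and district.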