* do e:\DHS\programs\MYERS\Myerstest_DHS_data_mlogit_do_24Mar2021.txt

set more off
cd e:\DHS\programs\MYERS
set logtype text
*log using Myerstest_DHS_data_mlogit_log_24Mar2021.txt, replace

************************************************************************
* Myers' blended index
* This version is adapted to an individual-level DHS data file
* This program calculates Myers' Index within a statistical model, multinomial
*   logit. It can be confirmed that this matches the calculation within the
*   framework of demographic techniques.
* Note: Myers' Index is expressed as a percentage.
* Updated on June 7, 2017 and March 24, 2021
* Tom Pullum, tom.pullum@icf.com
************************************************************************

program define index

* Routine to calculate the index of dissimilarity directly from the digit distribution
* The tab command must have been run and a vector T saved

matrix T=100*T/r(N)

* calculate Myers' index from this vector of percentages
scalar index=0
scalar excess_0_5=0
scalar si=1
while si<=10 {
  scalar index=index+abs(T[si,1]-10)/2
  * rows 1 and 6 of T hold final digits 0 and 5
  if si==1 | si==6 {
    scalar excess_0_5=excess_0_5+abs(T[si,1]-10)/2
  }
  scalar si=si+1
}
scalar list index excess_0_5

end

************************************************************************

program define myers

* digit1 is the 1's digit, digit10 is the 10's digit
* we want to check for possible heaping on digit1

gen digit10=int(number/10)
gen digit1=number-10*digit10

save temp.dta, replace

* The observed distribution across the final digit for the age range being used, unweighted.
* The overall distribution of the 1's digit is given in the "Total" column.
gen n=1
collapse (sum) n, by(digit1 digit10)
tab digit1 digit10 [iweight=n], col
collapse (sum) n, by(digit1)
tab digit1 [iweight=n], matcell(T)
index
scalar unblended_index=index
scalar unblended_excess_0_5=excess_0_5

* The observed distribution across the final digit for the age range being used, weighted.
* The overall distribution of the 1's digit is given in the "Total" column.
use temp.dta, clear
collapse (sum) wt, by(digit1 digit10)
tab digit1 digit10 [iweight=wt], col
collapse (sum) wt, by(digit1)
tab digit1 [iweight=wt], matcell(T)
index
scalar blended_index=index
scalar blended_excess_0_5=excess_0_5

scalar list unblended_index unblended_excess_0_5
scalar list blended_index   blended_excess_0_5

* now start the individual-level procedure
use temp.dta, clear

* mwt will be 1 through 9 for start through start+8
* then 10 up to and including end
* otherwise it is . (here, equivalent to 0)

gen mwt=.
scalar si=1
quietly while si<=9 {
  replace mwt=si if number==start+si-1
  scalar si=si+1
}
replace mwt=10 if number>start+8 & number<=end

* Multiply by sampling weight
gen mwtn=mwt*wt

* the blended distribution across the final digit for the age range being used
tab digit1 [iweight=mwtn], matcell(N)

* Now use mlogit. The advantage of mlogit here is that the method can then be applied
*   to individual-level data and can include covariates.
* You can get the index directly from the percentages in the blended distribution;
*   Myers' Index = Index of Dissimilarity

* Here is the individual-level version of the model; can modify for full svyset
* Note! it is essential to use mwtn as a pweight!
* The svy: prefix applies the weights and design declared by svyset.

svyset clusterid [pweight=mwtn], strata(stratumid) singleunit(centered)
svy: mlogit digit1, baseoutcome(0)

* Keep digit 0 as the base outcome; the extraction of e(b) below depends on it.

* Be careful with indices. With digit 0 as the base outcome, element i+1 of b
*   is log[p(i)/p(0)] for digits i=0 through 9 (and elements 1 through 10)
matrix b=e(b)

* Lines to confirm correct extraction of elements of e(b)
matrix list b
local li=1
while `li'<=10 {
  scalar b`li'=b[1,`li']
  scalar list b`li'
  local li=`li'+1
}

* Here b[1,1] corresponds with digit 0 (the base outcome, so it is 0),
*   b[1,2] with digit 1, etc., up to b[1,10] for digit 9.

* Manipulate the coefficients, i.e.
* the cells of b, to get the percentages that
*   were in the blended distribution across the digits

* sum starts at 1 to include the base outcome, p0/p0 = exp(0) = 1
scalar sum=1
local li=1
while `li'<=9 {
  local liplus1=`li'+1
  scalar p`li'overp0=exp(b[1,`liplus1'])
  scalar sum=sum+p`li'overp0
  local li=`li'+1
}

* p0 through p9 are percentages that sum to 100
scalar p0=100/sum
scalar psum=0
scalar index=abs(p0-10)
local li=1
while `li'<=9 {
  scalar p`li'=p0*p`li'overp0
  scalar psum=psum+p`li'
  scalar index=index+abs(p`li'-10)
  local li=`li'+1
}
scalar index=index/2

scalar list p0 p1 p2 p3 p4 p5 p6 p7 p8 p9
scalar list index
scalar excess_0_5=p0+p5-20

drop number digit10 digit1 mwt wt mwtn
erase temp.dta

end

************************************************************************
************************************************************************
************************************************************************
************************************************************************
************************************************************************
************************************************************************
************************************************************************
************************************************************************
* EXECUTION BEGINS HERE

* Open a DHS PR file
use "C:\Users\26216\ICF\Analysis - Shared Resources\Data\DHSdata\PKPR61FL.DTA", clear

keep hv001 hv002 hvidx hv005 hv021 hv023 hv104 hv105

gen number=hv105
gen wt=hv005/1000000

scalar start=0
scalar end=79

keep if number>=start & number<=end

* only need the next lines if using individual-level data and svyset; stratumid may vary
gen clusterid=hv021
gen stratumid=hv023

myers

scalar list unblended_index unblended_excess_0_5
scalar list blended_index   blended_excess_0_5

/*

For age in household survey:
scalar start=0
scalar end=79
excess in 0 and 5

For age in survey of women:
scalar start=15
scalar end=44
excess in 0 and 5

For age at marriage in survey of women:
scalar start=15
scalar end=44

For year of marriage in survey of women:
scalar start=median year of survey-30
scalar end=median year of survey-1
excess in 0 and 5

For years since marriage in survey of women:
scalar start=0
scalar end=30
excess in 0 and 5

For year of birth in household survey:
scalar start=median year of survey-50
scalar end=median year of survey-1 (just ages 0-49)
excess in 0 and 5

For months of age for children age 0-14 in BR file (but must adapt to 12 months):
scalar start=0
scalar end=180
excess in 0 and 6