Re: How to add village level characteristics [message #24910 is a reply to message #24884]
Mon, 01 August 2022 12:27
Janet-DHS
Messages: 888 Registered: April 2022
Senior Member
Following is a response from DHS Research & Data Analysis Director, Tom Pullum:
Cluster-level control variables would usually be interpreted as something like the proportion of the households in the bottom two wealth quintiles or the proportion of women with no schooling, etc. You can also attach cluster-level variables using the geographic covariates data file.
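As a rough sketch of the first suggestion, cluster-level proportions could be built in Stata along the following lines. This assumes the women's (IR) file with the standard recode variables v106 (highest education level, 0 = none) and v190 (wealth quintile, 1 = poorest). The new variable names are only illustrative, the proportions are unweighted, and because they are computed over women they only approximate a household-level proportion.

```stata
* Unique cluster ID, because v001 is nested within v024
egen cluster_ID = group(v024 v001)

* Indicator: woman has no schooling (assumes v106 == 0 means "no education")
gen byte no_school = (v106 == 0) if !missing(v106)

* Indicator: bottom two wealth quintiles (assumes v190 is coded 1 to 5)
gen byte poor = inlist(v190, 1, 2) if !missing(v190)

* Cluster-level proportions, attached to every woman in the cluster
bysort cluster_ID: egen cl_prop_no_school = mean(no_school)
bysort cluster_ID: egen cl_prop_poor = mean(poor)
```

The resulting cl_prop_* variables can then be entered in the model as ordinary covariates.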
Note that the cluster ID numbers (v001) are nested within states (v024). A unique cluster-level ID could be constructed with "egen cluster_ID=group(v024 v001)", and a "fixed effects" model would then include the term "i.cluster_ID". However, you definitely should not use fixed effects for clusters. Apart from the time required to run a model with 5000 clusters (if it would run at all), the model would be seriously over-fitted or over-determined, and your substantive covariates would become insignificant. Instead, I recommend cluster-level variables as described above and/or a multi-level model, which will indicate how much of the total explanatory power is individual-level and how much is cluster-level.
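A minimal sketch of the multi-level alternative, assuming a binary outcome and Stata 13 or later. Here outcome, x1 and x2 are placeholders for your own variables, and cl_prop_poor stands for a cluster-level variable you have constructed yourself. None of these names come from the DHS files.

```stata
* Unique cluster ID, because v001 is nested within v024
* (skip this line if cluster_ID already exists)
egen cluster_ID = group(v024 v001)

* Two-level logit: women (level 1) within clusters (level 2),
* with a random intercept for each cluster
melogit outcome x1 x2 cl_prop_poor || cluster_ID:

* Intraclass correlation: share of the latent-scale variance
* that lies between clusters
estat icc
```

This sketch ignores the sampling weights and the rest of the survey design, which would need separate handling in a real analysis.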
The correspondence between clusters and villages (or neighborhoods, in urban areas) is very loose. Some clusters actually include more than one village, for example. Such a correspondence is often assumed but you can't rely on it.