Mon, 08 July 2013 12:22
 bsayer
The subpop option in Stata has nothing to do with weights and everything to do with intra-cluster correlation, which feeds into the variance estimates. You need at least one observation per stratum-PSU combination to calculate this accurately, and dropping observations can remove some of those combinations. The best thing to do is keep all the observations and use the subpop option.
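
As a toy illustration of the stratum-PSU point, the Python sketch below counts the distinct stratum-PSU combinations before and after subsetting to a subpopulation. The records and the helper name `design_cells` are invented for illustration and are not from any real survey. It shows only that subsetting can silently remove a PSU from the design; it does not reproduce Stata's variance calculation.

```python
# Hypothetical toy survey records: (stratum, psu, in_subpopulation).
# These values are made up purely to illustrate the idea.
records = [
    (1, 1, True), (1, 1, False),
    (1, 2, False), (1, 2, False),   # PSU 2 in stratum 1 has no subpopulation members
    (2, 1, True), (2, 1, True),
    (2, 2, True), (2, 2, False),
]


def design_cells(rows):
    """Return the set of distinct (stratum, PSU) combinations present in rows."""
    return {(stratum, psu) for stratum, psu, _ in rows}


# Design as sampled: every stratum-PSU combination is present.
full_design = design_cells(records)

# Design after dropping everyone outside the subpopulation.
subset_design = design_cells([r for r in records if r[2]])

# Combinations that disappear when the data are subsetted first.
lost_cells = sorted(full_design - subset_design)
print("stratum-PSU combinations lost by subsetting:", lost_cells)
```

In this toy data, stratum 1 is left with a single PSU after subsetting, which is the kind of situation that keeping all observations and flagging the subpopulation is meant to avoid.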

There are some situations where an alternative might work. If you think you are in one of them, you should study the issue carefully. I doubt that you will get a completely correct answer in a forum.

A variable that represents percentiles of an entire population should have been weighted when it was created. If you want to create a new percentile, then you will need to create a weighted version for the population that you are interested in. For example, if you want the percentile of women ages 20 to 25 who have never had a child, you would use that population and the corresponding weight for women. This is because different women have different probabilities of being selected in the survey (typically urban women have a higher probability, for example). So if urban women also have a higher probability of not having had a child, then we need to account for both the probability of selection and the probability of not having had a child.
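
To make the weighting point concrete, here is a minimal Python sketch of a weighted percentile cut point. The function name `weighted_quantile`, the scores, and the weights are all assumptions made up for illustration; the weights stand in for sample weights, so the oversampled group gets the smaller weight. This is one simple definition of a weighted quantile, not necessarily the one any particular package uses.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Hypothetical scores for four sampled women in the subpopulation of interest.
scores = [10, 20, 30, 40]

# Hypothetical sample weights: the first three women come from an oversampled
# group (small weight), the fourth represents many more women (large weight).
sample_weights = [1, 1, 1, 5]

# Ignoring the weights treats every sampled woman as equally representative.
unweighted_median = weighted_quantile(scores, [1, 1, 1, 1], 0.5)

# Using the weights moves the median toward the under-sampled group.
weighted_median = weighted_quantile(scores, sample_weights, 0.5)

print("unweighted median:", unweighted_median)
print("weighted median:", weighted_median)
```

The two medians differ because the unweighted version lets the oversampled group dominate, which is the distortion the weights are there to correct.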

I would suggest something like small area estimation for these types of problems.

Bryan Sayer
Statistician
Social & Scientific Systems, Inc.

