Poisson models with sample selection

Poisson regression is often used to model count outcomes, such as the number of patents granted to firms, the number of times people visited the doctor, or the number of unfortunate Prussian soldiers killed by horse kicks.

With observational data, we do not always see the outcome for all subjects. This is different from observing zero events; we simply have no information at all about the outcome. Why? Surveys have nonresponse. Firms may prefer trade secrets to patent applications. And so on. We might expect the outcomes of those we observe and those we do not observe to be different, because whatever keeps a subject out of the data may also be related to the outcome itself. This kind of missingness is called sample selection, or more correctly, endogenous sample selection. It is also called missing not at random (MNAR).
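
To see why this matters, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the function name simulate_selected_counts, the coefficient values, and the choice to link selection and outcome through correlated normal errors. It only shows that when the unobserved factors driving selection are correlated with those driving the count, the observed sample stops looking like the population.

```python
import numpy as np

def simulate_selected_counts(n=200_000, rho=0.7, sigma=0.5, seed=0):
    """Simulate counts whose observation depends on a correlated error.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # covariate in the outcome equation
    z = rng.normal(size=n)  # covariate that only affects selection
    # Errors of the outcome and selection equations, with correlation rho.
    cov = [[sigma**2, rho * sigma],
           [rho * sigma, 1.0]]
    eps = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Count outcome: Poisson with a log-linear mean.
    y = rng.poisson(np.exp(0.5 + 0.3 * x + eps[:, 0]))
    # The outcome is seen only when the selection index is positive.
    observed = (0.2 + 0.5 * z + eps[:, 1]) > 0
    return y, observed

y, observed = simulate_selected_counts()
population_mean = y.mean()          # what we would like to learn about
observed_mean = y[observed].mean()  # what the selected sample shows
print(f"population mean: {population_mean:.3f}")
print(f"observed mean:   {observed_mean:.3f}")
```

With a positive correlation, subjects with unusually high counts are more likely to be observed, so the observed mean overstates the population mean. Setting rho=0.0 in this sketch makes the two means agree up to sampling noise.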

The new heckpoisson command fits models to count data and produces estimates as though the sample selection had not occurred. That is to say, it fits models that let you make inferences about the whole population, not just those who would be observed.
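
For readers who want the idea in symbols, one common way to write a Poisson model with endogenous sample selection is sketched below. The notation is an assumption introduced here for illustration and does not come from the text above: y_j is the count, s_j indicates whether the count is observed, x_j and w_j are covariates for the outcome and for selection, and rho is the correlation between the two error terms.

```latex
\begin{align*}
y_j \mid x_j, \varepsilon_{1j} &\sim \mathrm{Poisson}(\mu_j),
  \qquad \mu_j = \exp(x_j\beta + \varepsilon_{1j}) \\
s_j &= \mathbf{1}\left(w_j\gamma + \varepsilon_{2j} > 0\right) \\
(\varepsilon_{1j}, \varepsilon_{2j}) &\sim
  N\!\left(0,\ \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}\right)
\end{align*}
```

In this formulation y_j is seen only when s_j = 1. If rho is zero, selection is unrelated to the unobserved part of the outcome; if rho is nonzero, fitting a model to the observed subjects alone will generally not describe the whole population.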


Timberlake Consultants