CES Technical Notes (Restricted Access)
This technical note describes the development of a method for estimating the probability that a person observed in administrative data around a reference date is residing in the U.S. The analysis uses the 2020 extract of the Demographic Frame, which compiles person-place records from more than 20 administrative data sources. I augment the Demographic Frame with signs of living outside the U.S., such as foreign addresses, drawn from the contributing sources. I also incorporate additional predictors, such as citizenship status, from the Census Numident. U.S. residence is proxied by appearance in the 2020 Census. I describe the data and the design of a logistic regression model that predicts U.S. residency, and I evaluate the model's performance on a testing dataset. I also document the importance of incorporating foreign address information into the model. I then measure how applying foreign address business rules and the U.S. residency probability measure affects estimates of national population statistics. I conclude with a discussion of current limitations and the steps needed to improve the model design.
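As a rough illustration of the modeling setup summarized above, the following minimal sketch fits a logistic regression to purely synthetic records and scores a testing split. It is not the note's model: the two predictors (`foreign_address`, `citizen`), the data-generating coefficients, the sample sizes, and the gradient-descent fitting routine are all assumptions made for illustration, and the binary outcome stands in for appearance in the 2020 Census.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    # w[0] is the intercept; w[1:] align with the predictors in x.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=1.0, epochs=300):
    """Fit weights by batch gradient descent on the average log loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def make_synthetic(n, seed=0):
    """Hypothetical records: predictors are (foreign_address, citizen)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        foreign_address = 1 if rng.random() < 0.1 else 0
        citizen = 1 if rng.random() < 0.9 else 0
        # Assumed data-generating process, chosen only for illustration.
        p_resident = sigmoid(2.0 - 3.0 * foreign_address + 1.0 * citizen)
        X.append([foreign_address, citizen])
        y.append(1 if rng.random() < p_resident else 0)
    return X, y


X, y = make_synthetic(2000)
split = 1500  # first 1500 records train the model, the rest test it
weights = fit_logistic(X[:split], y[:split])

# Predicted U.S. residency probabilities for the testing records.
test_probs = [predict_proba(weights, xi) for xi in X[split:]]
accuracy = sum(
    (p >= 0.5) == bool(yi) for p, yi in zip(test_probs, y[split:])
) / len(test_probs)
```

In this sketch the predicted probabilities in `test_probs` play the role of the U.S. residency probability measure; they could be summed or thresholded when tabulating population totals, though how the note itself uses them is not specified here.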