Education
PhD in Economics, 2023 (expected)
Massachusetts Institute of Technology
BA in Economics, Applied Mathematics, and Statistics, 2016
University of California, Berkeley
Working Papers
This paper studies consumers' demand for quality in the nursing home market, where information frictions are a source of concern. Using administrative data on the universe of nursing home residents, I estimate the quality of nursing homes in California and use these estimates as inputs into a structural demand model. I find substantial variation in nursing home quality: one standard deviation higher quality is associated with a 2 percent lower risk-adjusted 90-day mortality rate. Yet, despite the high stakes for residents, average demand for quality is very low, even after accounting for unobserved supply-side constraints arising from selective admissions practices by nursing homes. Patterns of demand heterogeneity highlight information frictions as a major reason for this low demand: residents who were younger, highly educated, free from dementia, and who made their choices after the introduction of the star rating system were more responsive to quality. Counterfactual simulations based on estimates of the structural demand model and a competing risks model suggest that eliminating information frictions could reduce deaths by at least 8 to 28 percent, and potentially even more once supply-side responses are considered.
In regression discontinuity designs with multiple running variables (MRD designs), units are assigned to treatment based on whether their values on several observed running variables exceed known thresholds. In this design, applied work commonly analyzes each running variable separately: for example, when financial aid eligibility depends on GPA and family income, researchers consider the sample of students with low enough family income and study the GPA threshold, and separately consider the sample of students with high enough GPA and study the income threshold. However, this approach does not fully exploit the richness of the data. I propose a new estimator for MRD designs using thin plate splines that improves upon applied practice in two ways. First, the estimator provides efficiency gains by using the entire sample, and second, it can be used to estimate the conditional average treatment effect at every point on the boundary separating treated and untreated units. I provide an automated undersmoothing procedure that eliminates the asymptotic bias of the estimator, as well as Bayesian confidence intervals, and derive theoretical results justifying the use of these methods. In a simulation study, I find that the estimator is approximately unbiased in finite samples and that the confidence intervals have close-to-nominal empirical coverage. Finally, I demonstrate the performance of my estimator in an empirical application from Londoño-Vélez, Rodríguez, and Sánchez (2020), which studies the effect of a large financial aid program on higher education in Colombia. R code for estimation and inference is available.
Research in Progress
Assessing the Relative Importance and Potential Interactions Between Common Explanations for Racial Segregation: Evidence from Nursing Homes
Racial segregation is a pervasive phenomenon in a number of important settings, such as school, neighborhood, and nursing home choice. Past work has found evidence supporting a number of explanations for these patterns, including in-group preferences, discrimination, and location. However, since most of these factors have been studied independently, it is difficult to make precise statements about the relative importance of these explanations and potential interactions between them. In this project, I take advantage of an administrative data set on the universe of nursing home residents to study a number of explanations simultaneously using a two-sided matching model. The estimation results indicate that both in-group preferences and discrimination contribute to the observed pattern of minorities being disproportionately concentrated in lower-quality nursing homes, whereas location is unlikely to play a major role. Moreover, lower minority demand for quality also contributes to segregation, with further analysis suggesting that this may be due to information frictions. In simulations, I quantify the relative importance and potential interactions between these factors.
Selective Admissions and Discharges by Nursing Homes
Previous research has shown that as a consequence of capacity constraints, nursing homes selectively choose which types of residents to admit (Gandhi, 2019; Cheng, 2022) and when to discharge residents (Hackmann, Pohl, and Ziebarth, 2020). I provide a microfoundation for a structural model in which arrivals of different types of potential residents and the evolution of “discharge readiness” of existing residents follow certain stochastic processes, and nursing homes choose optimal admission and discharge policies that maximize the expected present discounted value of future profits. The solution to this problem yields testable implications and sheds light on identification of the structural model: intuitively, nursing homes’ admission and discharge policies are identified by differences in the characteristics of residents they admit and discharge during times of high and low occupancy. I estimate the model using an extension of the Gibbs sampler in Agarwal and Somaini (2022) and Cheng (2022), with data augmentation on residents’ indirect utility and on the latent variables that determine nursing homes’ admission decisions for potential residents and discharge decisions for existing residents.
Selection on Unobservables in Discrete Choice Models
Selection on unobservables is an important concern for causal inference in observational studies, and accordingly, previous papers have developed methods for sensitivity analysis for OLS (Altonji, Elder, and Taber 2005; Oster 2019), as well as epidemiological models (Ding and VanderWeele 2016). In this paper, I develop methods for sensitivity analysis in discrete choice models, deriving bounds for the omitted variables bias under an assumption about how much the consumer values the omitted variable(s) relative to the included control variables, and optionally, about the relationship between the omitted variable and the variable of interest. After providing theoretical results and demonstrating the performance of my bounding procedure in simulations, I show that the procedure produces economically meaningful bounds in an empirical application.
Past Work
Cigarette Consumption and Tax Salience
This paper studies how cigarette consumption responds over time to changes in tax rates. Using a panel of state data, I estimate that the cumulative effect of an excise tax rise on consumption is larger than the cumulative effect of an increase in sales tax, in line with a theory of tax salience. In addition, I find that consumption falls in advance of an excise tax hike, whereas it only falls in the year after a sales tax increase. The pattern of consumption response to sales taxes is also consistent with consumer learning over time.
The issue of fake news has been hotly debated in recent years, with some commentators claiming that it played a role in US presidential elections and the Brexit vote. Despite these claims, there has been limited evidence to date linking fake news directly to voting behavior. In this project, I seek to provide credible evidence on this question by using big college football games as an instrument for fake news consumption. I find that search volumes for pro-Trump fake news terms were lower in counties close to college football teams that played a big game shortly before the election, and that these counties were also less likely to vote for Trump. The magnitude of these estimates suggests that a one-standard-deviation increase in search volume for pro-Trump fake news terms increased Trump’s vote share by about 4.5 percent. Finally, I do not find evidence that fake news affected overall turnout rates or that it had down-ballot effects.