- Built a predictive model to identify which hazardous waste generators in New York are at greatest risk of committing violations of the Resource Conservation and Recovery Act.
- Designed a complete analysis pipeline (in Python and PostgreSQL) from ETL through feature generation, model fitting, and error analysis, achieving a lift of 1.87 with a random forest model.
- GitHub repo: http://github.com/dssg/rcra
- Collaborated with researchers from the University of Tromsø, Norway, to design and implement a novel continuous-time reinforcement learning algorithm for artificial pancreas management of blood glucose levels in type 1 diabetics.
- Wrote Stata programs to analyze migrants’ remittance flows, private/public-sector compensation, and characteristics of the Supplemental Nutrition Assistance Program population using generalized linear models.
- Wrote Fortran 90 code for the Long-Term Modeling Group’s microsimulation model to analyze the distributional effects of Social Security reform proposals.
- Analyzed the costs to the private sector incurred by health-, education-, and labor-related legislation.
Change point detection in network models: preferential attachment and long range dependence
- To appear in Annals of Applied Probability.
We developed a changepoint variant of preferential attachment and then showed how to detect the changepoint using a functional central limit theorem for the number of leaves in the graph. Along the way we also proved an interesting result regarding the effect of the changepoint on the exponent of the degree distribution. This paper forms the bulk of my dissertation.
Predictive Enforcement of Pollution and Hazardous Waste Violations in New York State
We built a predictive model to help the New York State Department of Environmental Conservation better target inspections of hazardous waste generators. This was my team's main project during the 2016 Data Science for Social Good summer fellowship.
GitHub repo: http://github.com/dssg/rcra
Technologies: Python, PostgreSQL, bash scripting, git
- I shamelessly stole this beautiful template from my friend Lin Taylor.
PhD Statistics (2017)
- University of North Carolina at Chapel Hill
BA Economics (2010)
- Swarthmore College
- Python (NumPy, SciPy, pandas)