Jimmy Jin



Data Science for Social Good May - Aug 2016
  • Built a predictive model to identify which hazardous waste generators in New York are at greatest risk of committing violations of the Resource Conservation and Recovery Act.
  • Designed a complete analysis pipeline (in Python and PostgreSQL), from ETL through feature generation, model fitting, and error analysis, achieving a lift of 1.87 with a random forest model.
  • github repo: http://github.com/dssg/rcra

Visiting Researcher

Norwegian Centre for E-health Research May - Aug 2014
  • Collaborated with researchers from the University of Tromsø, Norway, to design and implement a novel continuous-time reinforcement learning algorithm for artificial pancreas management of blood glucose levels in people with type 1 diabetes.

Assistant Analyst

Congressional Budget Office Jul 2010 - Jul 2012
  • Wrote Stata programs to analyze migrants’ remittance flows, private- and public-sector compensation, and characteristics of the Supplemental Nutrition Assistance Program population using generalized linear models.
  • Wrote Fortran 90 code for the Long-Term Modeling Group’s microsimulation model to analyze the distributional effects of Social Security reform proposals.
  • Analyzed the costs to the private sector incurred by health-, education-, and labor-related legislation.

Selected projects

Change point detection in network models: preferential attachment and long range dependence

  • To appear in Annals of Applied Probability.

    We developed a changepoint variant of preferential attachment and then showed how to detect the changepoint using a functional central limit theorem for the number of leaves in the graph. Along the way we also proved an interesting result regarding the effect of the changepoint on the exponent of the degree distribution. This paper forms the bulk of my dissertation.
  • Technologies: Math

Predictive Enforcement of Pollution and Hazardous Waste Violations in New York State

  • We built a predictive model to help the New York State Department of Environmental Conservation better target inspections of hazardous waste generators. This was my team's main project during the summer 2016 session of Data Science for Social Good.
  • github repo: http://github.com/dssg/rcra
  • Technologies: Python, PostgreSQL, bash scripting, git


PhD Statistics (2017)

BA Economics (2010)

