Jimmy Jin


Data Scientist

Stripe July 2019 - Present
  • Payment infrastructure at Stripe.


Optimizely Nov 2017 - May 2019
  • Responsible for all statistics operations at Optimizely and assumed acting Product Manager duties for the Analytics team in Feb 2019.
  • Created Epoch Stats Engine, a novel stratification-based estimator for sequential testing designed to eliminate "Simpson's Paradox" due to dynamic traffic allocation. Blog post, Technical writeup
  • Authored Optimizely's first-ever Statistics Roadmap, outlining the company's short- and long-term strategy for statistics R&D.


Data Science for Social Good May - Aug 2016
  • Built a predictive model to identify which hazardous waste generators in New York are at greatest risk of committing violations of the Resource Conservation and Recovery Act.
  • Designed a complete analysis pipeline (in Python and PostgreSQL) from ETL through feature generation, model fitting, and error analysis, obtaining a lift of 1.87 with a random forest model.
  • github repo: http://github.com/dssg/rcra
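
    For readers unfamiliar with the metric, lift is commonly computed as the positive rate among the highest-scored cases divided by the overall positive rate. The sketch below is illustrative only: the function name lift_at_k, the cutoff fraction, and the toy scores and labels are assumptions, and it is not taken from the project repository.

```python
def lift_at_k(scores, labels, fraction):
    """Lift among the top `fraction` of cases ranked by model score.

    scores: predicted risk per case (higher means riskier)
    labels: 1 if a violation was observed, else 0
    """
    n = len(scores)
    k = max(1, int(n * fraction))  # size of the top-ranked group
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k  # rate in top group
    base_rate = sum(labels) / n  # rate across all cases
    return top_rate / base_rate


# Hypothetical toy data, not results from the project.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1]
toy_lift = lift_at_k(toy_scores, toy_labels, 0.2)
```

    A lift above 1 means that inspecting the top-ranked cases finds violations more often than inspecting cases at random.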

Visiting Researcher

Norwegian Centre for E-health Research May - Aug 2014
  • Collaborated with researchers from the University of Tromsø, Norway to design and implement a novel continuous-time reinforcement learning algorithm for artificial pancreas management of blood glucose levels in people with type 1 diabetes.

Assistant Analyst

Congressional Budget Office Jul 2010 - Jul 2012
  • Wrote Stata programs to analyze migrants’ remittance flows, private- and public-sector compensation, and characteristics of the Supplemental Nutrition Assistance Program population using generalized linear models.
  • Wrote Fortran 90 code for the Long-Term Modeling Group’s microsimulation model to analyze the distributional effects of Social Security reform proposals.
  • Analyzed the costs to the private sector incurred by health-, education-, and labor-related legislation.

Selected projects

Epoch Stats Engine

  • Joint work with Leo Pekelis at Optimizely.
  • We developed a new estimator to make A/B testing compatible with dynamic traffic allocation policies such as those induced by a multi-armed bandit algorithm. The estimator rests on a stratification idea that is easy to implement, requires minimal assumptions, and is compatible with most central limit theorem-based methods.
  • Blog post: https://blog.optimizely.com/2018/11/27/stats-accelerator-acceleration-time-varying-signals/
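
    The stratification idea can be illustrated with a small sketch. This is not the Epoch Stats Engine itself: it assumes the data are split into epochs of constant traffic allocation, takes the treatment-minus-control difference within each epoch, and averages those differences weighted by epoch size. The function names and the toy data are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def pooled_difference(epochs):
    """Naive estimate: pool all observations, ignoring epochs."""
    control = [x for c, _ in epochs for x in c]
    treatment = [x for _, t in epochs for x in t]
    return mean(treatment) - mean(control)


def stratified_difference(epochs):
    """Weighted average of within-epoch differences in means.

    epochs: list of (control_outcomes, treatment_outcomes) pairs, one per
    period of constant allocation. Weighting by epoch size is an assumption.
    """
    total = sum(len(c) + len(t) for c, t in epochs)
    estimate = 0.0
    for c, t in epochs:
        weight = (len(c) + len(t)) / total
        estimate += weight * (mean(t) - mean(c))
    return estimate


# Hypothetical data with no true effect: the baseline rises between epochs
# while the allocation shifts from mostly control to mostly treatment.
toy_epochs = [
    ([1.0] * 8, [1.0] * 2),  # epoch 1: 80% control, baseline 1.0
    ([3.0] * 2, [3.0] * 8),  # epoch 2: 20% control, baseline 3.0
]
naive = pooled_difference(toy_epochs)
stratified = stratified_difference(toy_epochs)
```

    On this toy data the pooled estimate is about 1.2 even though the two arms are identical within every epoch, which is the Simpson's Paradox effect; the stratified estimate is 0.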

Change point detection in Network models: preferential attachment and long range dependence

  • To appear in Annals of Applied Probability.

    We developed a changepoint variant of preferential attachment and then showed how to detect the changepoint using a functional central limit theorem for the number of leaves in the graph. Along the way we also proved an interesting result regarding the effect of the changepoint on the exponent of the degree distribution. This paper forms the bulk of my dissertation.
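
    As a rough illustration of the setting, the sketch below simulates a preferential attachment tree whose attachment rule changes at a fixed time and records the number of leaves after each step. It is a simplified stand-in, not the model or detection procedure from the paper: the linear attachment weights (degree plus a shift), the parameter names alpha, beta and change_at, and the starting graph are all assumptions.

```python
import random


def simulate_leaf_counts(n, change_at, alpha, beta, seed=0):
    """Grow a tree to n nodes and return the leaf count after each step.

    A new node attaches to an existing node with probability proportional
    to (degree + alpha) before time change_at and (degree + beta) after.
    Both shifts must exceed -1 so that every weight is positive.
    """
    rng = random.Random(seed)
    degree = [1, 1]  # start from a single edge between nodes 0 and 1
    leaves = 2
    leaf_counts = [leaves]
    for t in range(2, n):
        shift = alpha if t < change_at else beta
        weights = [d + shift for d in degree]
        target = rng.choices(range(len(degree)), weights=weights, k=1)[0]
        if degree[target] == 1:
            leaves -= 1  # the chosen node stops being a leaf
        degree[target] += 1
        degree.append(1)  # the new node enters as a leaf
        leaves += 1
        leaf_counts.append(leaves)
    return leaf_counts


# Hypothetical parameters: the attachment rule changes halfway through.
counts = simulate_leaf_counts(n=500, change_at=250, alpha=0.0, beta=5.0)
leaf_fraction = [c / (k + 2) for k, c in enumerate(counts)]
```

    Plotting leaf_fraction against time gives the kind of leaf-count trajectory that a changepoint detector would monitor; the detector itself is not implemented here.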


PhD Statistics (2017)

BA Economics (2010)


A lot