Frontiers of Causal Inference in Data Science: Perspectives from Leaders in Tech and Academia | May 28, 2021

The Center for Causal Inference hosted the symposium Frontiers of Causal Inference in Data Science: Perspectives from Leaders in Tech and Academia.

In three exciting sessions on May 28, 2021, starting at 11:00 am and ending at 5:20 pm EDT, we heard from researchers working at the intersection of causal inference, machine learning, and data science. 



Welcoming Remarks

11:00-11:10 am

Nandita Mitra
University of Pennsylvania
CCI Co-Director
Jason Roy
Rutgers University
CCI Co-Director



11:10-11:35 am
Susan Murphy, Harvard University
Susan Murphy is Professor of Statistics at Harvard University, Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences, and Radcliffe Alumnae Professor at the Radcliffe Institute, Harvard University. Her lab works on clinical trial designs and online learning algorithms for developing personalized mobile health interventions. She developed the micro-randomized trial for use in constructing mobile health interventions; this trial design is in use across a broad range of health-related areas. She is a 2013 MacArthur Fellow and a member of the National Academy of Sciences and the National Academy of Medicine, both of the US National Academies.

11:35-12:00 pm
Eytan Bakshy, Facebook 
Eytan Bakshy is a principal scientist and director on the Facebook Core Data Science Team, where he leads the Adaptive Experimentation group. Dr. Bakshy is particularly interested in developing practical and robust methods for sequential experimentation and reinforcement learning for real-world applications. Much of his work has been motivated by the use of randomized experiments for understanding social behavior, including influence and information diffusion in networks.

12:00-12:25 pm
Eric Tchetgen Tchetgen, University of Pennsylvania
Eric Tchetgen Tchetgen is the Luddy Family President’s Distinguished Professor and Professor of Biostatistics at The Wharton School, University of Pennsylvania. His primary area of interest is in semi-parametric efficiency theory with application to causal inference, missing data problems, statistical genetics and mixed model theory. In general, Dr. Tchetgen Tchetgen works on the development of statistical and epidemiologic methods that make efficient use of the information in data collected by scientific investigators, while avoiding unnecessary assumptions about the underlying data generating mechanism.

12:25-12:45 pm 

Danielle Belgrave, Microsoft 
Danielle Belgrave is a machine learning researcher in the Healthcare Intelligence group at Microsoft Research in Cambridge (UK), where she works on Project Talia. Dr. Belgrave's research focuses on integrating medical domain knowledge, probabilistic graphical modeling, and causal modeling frameworks to help develop personalized treatment and intervention strategies for mental health. Mental health presents one of the most challenging and under-investigated domains of machine learning research. In Project Talia, she explores how a human-centric approach to machine learning can meaningfully assist in the detection, diagnosis, monitoring, and treatment of mental health problems.

Edward Kennedy, Carnegie Mellon University
Edward Kennedy is Assistant Professor of Statistics & Data Science at Carnegie Mellon University. He joined the department after graduating with a PhD in biostatistics from the University of Pennsylvania. Prior to that he received an MA in statistics from The Wharton School, an MS in biostatistics from the University of Michigan, and a BA in mathematics from the University of Pennsylvania. Dr. Kennedy's research interests include causal inference, machine learning, and nonparametric theory, especially in settings involving high dimensional and otherwise complex data. He is particularly interested in applications in criminal justice, health services, medicine, and public policy. Dr. Kennedy is a recipient of the NSF CAREER award, the David P Byar Young Investigator award, and the Thomas Ten Have Award for exceptional research in causal inference.

12:45-1:05 pm



1:15-1:40 pm
Caroline Uhler, MIT
Caroline Uhler is an Associate Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society at MIT. She holds an MSc in mathematics, a BSc in biology, and an MEd from the University of Zurich. She obtained her PhD in statistics from UC Berkeley in 2011 and then spent three years as an assistant professor at IST Austria before joining MIT in 2015. She is a Simons Investigator, a Sloan Research Fellow and an elected member of the International Statistical Institute. In addition, she received an NSF Career Award, a Sofja Kovalevskaja Award from the Humboldt Foundation, and a START Award from the Austrian Science Foundation. Her research focuses on machine learning and statistics, in particular on graphical models and causal inference, and applications to genomics.

1:40-2:05 pm
Mark van der Laan, UC Berkeley
Mark van der Laan is the Jiann-Ping Hsu/Karl E. Peace Professor of Biostatistics and Statistics at the University of California, Berkeley. He has made contributions to survival analysis, semiparametric statistics, multiple testing, and causal inference. He also developed the targeted maximum likelihood methodology and general theory for super-learning. He is a founding editor of the Journal of Causal Inference and International Journal of Biostatistics. He has authored four books on targeted learning, censored data and multiple testing, authored over 300 publications, and graduated 50 PhD students.  He received the COPSS Presidents' Award in 2005, the Mortimer Spiegelman Award in 2004, and the van Dantzig Award in 2005. 

2:05-2:30 pm
Justin Dyer, LinkedIn
Justin S. Dyer is a Senior Staff Data Scientist at LinkedIn, where he co-leads the Data Science Tech Leadership Group, supporting an organization of 300+ data scientists, and guides a variety of data-science initiatives spanning multiple business lines. Prior to his position at LinkedIn, he was the head of research and a quantitative researcher in the systematic trading space for a number of years, as well as a data scientist in the Search Ads group at Google. He holds a PhD in statistics and an MS in financial mathematics from Stanford, in addition to a BSEE and an MS in electrical engineering from Kansas State University. He has published in top-tier journals and conferences in statistics and applied probability, machine learning, wireless communications, signal processing, and instrumentation and measurement.

2:30-2:50 pm
Francesca Dominici, Harvard University 
Francesca Dominici is the Director of the Harvard Data Science Initiative at Harvard University and the Clarence James Gamble Professor of Biostatistics, Population and Data Science at the Harvard T.H. Chan School of Public Health. She is an elected member of the National Academy of Medicine and of the International Society of Mathematical Statistics. She leads an interdisciplinary group of scientists with the ultimate goal of addressing important questions in environmental health science, climate change, and health policy. Her productivity and contributions to the field have been remarkable. Dr. Dominici has provided the scientific community and policy makers with robust evidence on the adverse health effects of air pollution, noise pollution, and climate change. Her studies have directly and routinely impacted air quality policy. Dr. Dominici has published more than 220 peer-reviewed publications and was recognized in Thomson Reuters' 2019 list of the most highly cited researchers, ranking in the top 1% of cited scientists in her field. Her work has been covered by the New York Times, Los Angeles Times, BBC, the Guardian, CNN, and NPR. In April 2020, she was awarded the Karl E. Peace Award for Outstanding Statistical Contributions for the Betterment of Society by the American Statistical Association. Dr. Dominici is an advocate for the career advancement of women faculty. Her work on the Johns Hopkins University Committee on the Status of Women earned her the campus Diversity Recognition Award in 2009. At the T.H. Chan School of Public Health, she has led the Committee for the Advancement of Women Faculty.

Elizabeth Ogburn, Johns Hopkins University
Elizabeth (Betsy) Ogburn is Associate Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. She is also a member of the Institute for Data-Intensive Engineering and Science at Johns Hopkins University and affiliated faculty of the Center for Causal Inference at the University of Pennsylvania. She has worked on measurement error, semiparametric estimation, instrumental variables methods, and causal mediation analysis. Currently she is excited about causal inference in the presence of unmeasured confounding, causal and statistical inference using data with complex dependence, and the efficient use of randomized trial data to find effective COVID treatments. Betsy completed her PhD in Biostatistics at Harvard University and is a 2016 National Academy of Sciences Kavli Fellow.

2:50-3:10 pm



3:20-3:45 pm
Sean Taylor, Lyft
Sean J. Taylor is head of Rideshare Labs at Lyft, where he works on causal inference, experimentation, forecasting, and structural modeling. Previously, he led the applied statistics team for Facebook’s Core Data Science team.  He earned his PhD in Information Systems from NYU’s Stern School of Business as well as a BS in Economics from The Wharton School. He specializes in using machine learning, statistics, and randomized experiments for measurement, forecasting, and policy decisions. Sean’s research spans a wide range of topics: online social influence, social networks, applied statistics, causal inference, and Bayesian modeling.  He is also an avid engineer who enjoys putting research into practice by building software, such as his forecasting library Prophet.

3:45-4:10 pm
Sherri Rose, Stanford University
Sherri Rose is an Associate Professor at Stanford University in the Center for Health Policy and Center for Primary Care and Outcomes Research. She is also Co-Director of the Health Policy Data Science Lab. Her methodological research focuses on machine learning for prediction and causal inference. Within health policy, Dr. Rose works on risk adjustment, algorithmic fairness, comparative effectiveness, and health program evaluation. She was recently named a fellow of the American Statistical Association and her other honors include the ISPOR Bernie J. O’Brien New Investigator Award, an NIH Director’s New Innovator Award, and Mid-Career Awards from the American Statistical Association and Penn-Rutgers Center for Causal Inference.

4:10-4:35 pm
Alexander D’Amour, Google
Alexander D’Amour is a Research Scientist at Google Brain in Cambridge, MA. He works primarily on problems in causal inference and fairness. More generally, he is interested in problems where simple prediction is not enough.  Dr. D'Amour's research and consulting projects have included working in social network analysis, sports, healthcare, education, marketing, finance, microfinance, and entertainment.

4:35-4:55 pm
Debashis Ghosh, University of Colorado
Debashis Ghosh is Professor and Chair of the Department of Biostatistics and Informatics at the Colorado School of Public Health.  He was previously at Penn State University and the University of Michigan.  His research interests are in machine learning methods and their application to problems in biostatistics and bioinformatics.  He has over 230 publications in the scientific and statistical literature and is currently funded as a principal investigator by the National Science Foundation and the National Institutes of Health.


Alan Hubbard, UC Berkeley
Alan Hubbard is a Professor of Biostatistics at UC Berkeley. Dr. Hubbard's research focuses on the application of statistics to population studies, with particular expertise in semiparametric models and the use of machine learning in causal inference, as well as applications in high-dimensional biology. His applied work spans the molecular biology of aging, wildlife biology, social epidemiology, infectious disease, and acute trauma. He is particularly interested in harnessing machine-learning algorithms and advances in semiparametric causal inference towards machines for optimizing the estimation of parameters related to causal inference and variable importance, with particular emphasis on discovering and estimating the impact of treatment rules. In addition, he is currently exploring the application of data-adaptive target parameter approaches to formalize asymptotics for exploratory data analysis, allowing for a lack of a priori specified hypotheses while still providing estimates of meaningful parameters and estimators with predictable sampling distributions.

4:55-5:15 pm



The Center for Causal Inference (CCI) is a research center that is operating under a partnership between Penn’s Center for Clinical Epidemiology and Biostatistics (CCEB), the Department of Biostatistics and Epidemiology, Rutgers School of Public Health, and Penn’s Wharton School. The mission of the CCI is to be a leading center for research and training in the development and application of causal inference theory and methods.


6th Floor Blockley Hall 
423 Guardian Drive 
Philadelphia, PA 19104 

Email us with general inquiries