Wednesday, July 27 • 4:00pm - 4:20pm
The tidysynthesis R package

Society benefits when leaders make more evidence-based decisions, but growing privacy concerns hamper researchers’ ability to understand and improve the world. Fully synthetic data, pseudo data generated by models, can protect confidentiality while still supporting statistically valid analyses. This talk shares how the Urban Institute collaborates with the IRS to create fully synthetic tax data for tax policy research. We built an R package called tidysynthesis to create machine learning models for each variable in the data. tidysynthesis leverages the power of tidymodels and allows users to run a sequence of machine learning models with different recipes, engines, and samplers while adding noise and enforcing logical constraints.
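
To make the idea of sequential synthesis concrete, here is a minimal sketch of the general technique the abstract describes: synthesize one variable at a time, model each later variable on the ones already synthesized, add noise, and enforce a logical constraint. It is written in Python using only the standard library and is not the tidysynthesis API. The function names (`fit_line`, `synthesize`), the two variables, the toy values, and the constraint that tax must fall between zero and wages are all assumptions made up for illustration. A simple linear regression stands in for the machine learning models the talk refers to.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def synthesize(wages, tax, n, noise_sd, seed=0):
    """Hypothetical two-variable sequential synthesis (illustrative only)."""
    rng = random.Random(seed)

    # Step 1: synthesize the first variable by resampling observed values
    # and adding Gaussian noise, floored at zero.
    syn_wages = [
        max(0.0, rng.choice(wages) + rng.gauss(0, noise_sd)) for _ in range(n)
    ]

    # Step 2: model the second variable conditional on the first,
    # using the confidential data to fit the model.
    a, b = fit_line(wages, tax)
    residuals = [t - (a + b * w) for w, t in zip(wages, tax)]

    syn_tax = []
    for w in syn_wages:
        # Sampler: predicted value plus a randomly drawn residual.
        draw = a + b * w + rng.choice(residuals)
        # Assumed logical constraint: 0 <= tax <= wages.
        syn_tax.append(min(max(draw, 0.0), w))

    return list(zip(syn_wages, syn_tax))


# Made-up toy records standing in for confidential data.
wages = [20000.0, 35000.0, 50000.0, 80000.0, 120000.0]
tax = [1500.0, 3200.0, 6000.0, 12500.0, 24000.0]

synthetic = synthesize(wages, tax, n=100, noise_sd=1000.0)
```

Each synthetic record here is drawn from models rather than copied from a confidential record, and the clamp in the last step guarantees the assumed constraint holds for every row regardless of what the model predicts.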

Speakers
Aaron R. Williams

Urban Institute
Aaron R. Williams is a senior data scientist at the Urban Institute, where he works on microsimulation models, data imputation methods, and expanding access to administrative data with formal privacy and synthetic data. Williams leads Urban’s R Users Group and teaches Intro to Data...


Wednesday July 27, 2022 4:00pm - 4:20pm EDT
2. Potomac D