Data Management and Visualization I 

EPID 5360
Spring term (second half of term)
0.5 CU
Core
Prerequisite
Permission of Instructor; EPID 5100, EPID 5260, EPID 5270

The objective of this two-course series is to enhance MSCE students' comfort and acumen in all aspects of clinical epidemiological data management and presentation, particularly the graphical representation of results. The course progresses from best practices in data collection and database use to advanced data management, summarization of results, and data visualization, all grounded in efficient and reproducible research processes. The course will develop skills in: basic data collection, harmonization, and integration with Stata software; best practices for deriving and creating variables; assessing and handling missing data; merging and appending datasets; managing dates and times; assessing free-text data; handling specific data types such as ICD-9 and ICD-10 codes, cost data, and longitudinal and time-to-event data; producing descriptive and regression tables (for all regression types); visualizing descriptive results and regression models; and using Stata Markdown files so that research reports can be created directly from Stata. By the end of the two-course series, students will be fluent in the Stata statistical language and well positioned to advance their independent clinical epidemiological careers through best practices in research and data presentation.