SuperUROP, launched in 2012, is an expanded version of MIT’s Undergraduate Research Opportunities Program (UROP). Hosted by the Department of Electrical Engineering and Computer Science and now open to students throughout the School of Engineering, the yearlong program provides students the opportunity to partner with faculty on publication-worthy research. In 2015, Erin Hong and Aaron Zalewski worked with the MIMIC database—which archives clinical data from nearly 60,000 patient stays in intensive care units at Boston’s Beth Israel Deaconess Medical Center—under faculty advisor Roger Mark. Now seniors, they revisit the goals they stated going into last year’s SuperUROP experience, and reflect on what they learned.

Erin Hong ’17
Actifio Undergraduate Research and Innovation Scholar

Before: “I plan to expand the database… While merging data sets, I will have to account for gaps in the database due to flawed data and different hospital information formatting, all while finding the best way to capture and communicate implicit information to researchers… I hope to learn how to study and manipulate data through the eyes of a clinical technician and with the mind of an engineer.”

After: “Before working toward building a federated data model for heterogeneous electronic health records (EHR), I naïvely dedicated a few weeks of my work to read over system documentation and understand the organization of several EHR schemas. Soon enough, I was daunted by the low quality in electronic health system documentation, erroneous data, and vast amounts of unmapped implicit knowledge. These observations not only pointed to the lack of standardized health record systems but also indicated how difficult consolidating various data sets was. Building a federated data model and fully understanding the distinctions of individual data sets therefore necessitated a more stepwise approach… Since the end of my SuperUROP program, I have joined a team of oncologists, roboticists, computational biologists, and engineers as an intern at Driver, a cancer genomics company whose mission is to provide patients access to clinical trials.”

Aaron Zalewski ’17
Angle Undergraduate Research and Innovation Scholar

Before: “We will analyze data from physiological time series by using machine learning algorithms to predict patient mortality and the likelihood of the patient developing sepsis. We will model physiological time series using machine learning techniques to discover ‘clusters’ of time series segments with similar trajectories and transient dynamics. Our goal is to identify prototypical temporal patterns from vital sign time series that we can use to generate early warning signs to alert doctors of patients with worsening conditions.”

After: “While the MIMIC database has done a great deal to simplify working with ICU data, dealing with inconsistent frequencies in the measurement of patient data was a particular challenge. It’s difficult to determine the best way to standardize and then analyze patient data when the raw data aren’t uniform. I ended up solving this problem by using interpolation and some custom-made feature vectors to generate the input for my clustering algorithms… The biggest thing I learned is that how you process the data beforehand is just as important, if not more important, than the algorithms that you use to analyze those data.”
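For readers curious what that kind of preprocessing can look like, here is a minimal sketch, not Zalewski's actual code. It assumes a single vital sign recorded at irregular times, resamples it onto an even grid by linear interpolation, and then reduces it to a small summary vector that a clustering algorithm could take as input. The function names, the sample readings, and the choice of summary features are all hypothetical.

```python
from bisect import bisect_left


def resample_linear(times, values, step):
    """Linearly interpolate irregular samples onto a uniform time grid.

    `times` must be strictly increasing. The grid starts at times[0] and
    advances by `step` without going past times[-1].
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) samples")
    n_steps = int((times[-1] - times[0]) // step)
    resampled = []
    for k in range(n_steps + 1):
        t = times[0] + k * step
        i = bisect_left(times, t)  # first index with times[i] >= t
        if times[i] == t:
            resampled.append(values[i])
        else:
            # Interpolate between the neighbouring measurements.
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            resampled.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return resampled


def summary_features(series):
    """Hypothetical feature vector: mean, min, max, and net change."""
    mean = sum(series) / len(series)
    return [mean, min(series), max(series), series[-1] - series[0]]


# Illustrative heart-rate readings taken at uneven minute marks (made-up data).
minutes = [0, 7, 10, 25]
heart_rate = [80, 94, 100, 70]

uniform = resample_linear(minutes, heart_rate, step=5)
features = summary_features(uniform)
```

Once every patient's series sits on the same grid, segments become directly comparable, which is the precondition for clustering them by trajectory.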

