The DataMine Research Group

Deriving actionable insights from real-world data


Research in the DataMine laboratory focuses on developing novel tools and approaches for data-centric artificial intelligence (DCAI). Unlike model-centric artificial intelligence (MCAI), which treats training data as auxiliary to the learning process and devotes most of its effort to optimizing model parameters, DCAI aims to achieve better outcomes by keeping the model and its parameters unchanged and spending more time on improving the quality of the training data. In our past research, we showed that, with non-trivial, automated data selection approaches, accurate models can be trained with significantly less data, thus requiring fewer computing resources and reducing the carbon footprint of model training.

We are currently recruiting motivated and creative undergraduate and graduate students for two projects:

  • Green algorithms for core-set selection problems.
  • Efficient data selection for training a predictor of acute myeloid leukemia (AML) relapse after transplant.


Current Group Members

Natalia Khuri, Principal Investigator

link to personal page

Michael Wang, Undergraduate Researcher

Michael is developing novel approaches for core-set selection for machine learning.

Shelton Zhao, Undergraduate Researcher

Shelton is working on evolutionary multi-objective optimization.

Past Members

Han Bao, Undergraduate Researcher (2019-2020)
Sapan Bhandari, MSCS Thesis and Summer Research Assistantship (2020-2021)
Andrew Greene, Early-College Undergraduate Research and URECA Undergraduate Scholar (2019-2020)
Andrew Knox, Undergraduate Researcher (Summer 2020)
Tianen Liu, CS Honor's Project (2019-2020)
Caitlyn Marsac, URECA Undergraduate Scholar (Summer 2020)
Joshua Mannion, MSCS Thesis (2020-2021)
Esteban Murillo Burford, MSCS Thesis (2019-2020)
Sarah Parsons, Staff Research Scientist (2020-2021)
Anish Prasanna, Undergraduate Researcher (2019-2020)
Jackson Shapiro, CS Honor's Project (2019-2020)
Mitchell Topaloglu, CS Honor's Project (2021-2022)
Xiaochen Wang, CS Honor's Project (Fall 2020)
Nathan Whitener, Early-College Undergraduate Research, URECA Undergraduate Scholar, CS Honor's Project (2019-2023)
Reyna Wu, CS Honor's Project and Undergraduate Researcher (2020-2021)
Ria Xia, Undergraduate Researcher (2021-2022)
Jasmine Xu, CS Honor's Project (2020-2021)
Tian Yun, CS Honor's Project and Undergraduate Researcher (2019-2020)

Get in touch if you are interested in collaborating, have project ideas, or want to discuss our research.