Deriving actionable insights from real-world data
Our research focuses on understanding machine learning models and real-world data through quantitative metrics and analyses. We work on interdisciplinary and cross-disciplinary research projects in the biomedical, life, health, and social sciences and in computer science education. Below are the ongoing projects in our research group.
Data-driven modeling and machine learning are rapidly changing the process of scientific discovery and the development of solutions to real-world problems. We are interested in understanding and quantifying the value of the real-world data used to train such models. We are applying data valuation techniques to improve virtual screening of ligands and to acquire better fitness tracking data from IoT cycling devices. Funding: 2020 WFU Pilot Research Grant.
Skills needed: Python programming, machine learning, game theory, high performance computing, computer vision.
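As a rough illustration of what data valuation with game theory can look like, the sketch below treats each training point as a player in a coalition game and scores it by its Shapley value. This is an assumption made for illustration: the project description does not name a specific valuation rule, and the toy data, the 1-nearest-neighbour utility, and the function names `knn_accuracy` and `shapley_values` are all hypothetical.

```python
from itertools import permutations


def knn_accuracy(train, valid):
    """Utility of a training subset: 1-nearest-neighbour accuracy on `valid`."""
    if not train:
        return 0.0
    correct = 0
    for x, y in valid:
        # Predict the label of the closest training point.
        _, pred = min(train, key=lambda p: abs(p[0] - x))
        correct += pred == y
    return correct / len(valid)


def shapley_values(train, valid, utility=knn_accuracy):
    """Exact Shapley value of each training point (feasible only for tiny sets)."""
    n = len(train)
    values = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        subset, prev = [], utility([], valid)
        for i in order:
            subset.append(train[i])
            curr = utility(subset, valid)
            values[i] += curr - prev  # marginal contribution of point i
            prev = curr
    # Average the marginal contributions over all orderings.
    return [v / len(orders) for v in values]


# Toy 1-D data as (feature, label); the last training point is mislabelled.
train = [(0.0, 0), (1.0, 0), (4.0, 1), (0.5, 1)]
valid = [(0.2, 0), (0.8, 0), (3.8, 1), (4.2, 1)]
values = shapley_values(train, valid)
for point, value in zip(train, values):
    print(point, round(value, 3))
```

On this toy set the mislabelled point receives the lowest value, and the values sum to the accuracy of the full training set. Enumerating every ordering grows factorially with the number of points, so a realistic setting would need sampling or another approximation.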
To better understand the safety and efficacy of approved drugs in pediatric populations, we are building a PediatricDB portal powered by state-of-the-art natural language processing and machine learning. We are developing text classifiers and hybrid topic modeling approaches to accurately screen millions of unstructured biomedical texts and extract drug-patient relations.
We are developing recurrent neural networks to predict targets for CRISPR-Cas9 genome editing in mammalian genomes. Additionally, we are building a web-based application, powered by deep learning, to predict CRISPR arrays in newly sequenced bacterial and archaeal genomes.
Single-cell RNA-sequencing is increasingly used in biomedical domains. Computational analyses of these high-dimensional, zero-inflated datasets focus mostly on gene expression. We are developing a software package, NLPSeq, that supports the analysis of gene expression data together with clinical annotations. Funding: 2019 WFU Biomedical Informatics Pilot Research Grant.
Skills needed: bioinformatics, biomedical informatics, deep learning, high performance computing, cloud computing.
Meta-research is the study of how scientific research is organized, produced, and communicated. Its overall aim is to contribute to the scientific ecosystem by identifying gaps in knowledge as well as in transparency, rigor, and reproducibility. We are particularly interested in understanding the trends and interconnections between CS education research and scholarly works, and we are developing open-source computational approaches to studying the education literature at a large scale. Funding: 2019 NCWIT Academic Alliance Seed Award; 2020 WFU Leadership and Character Course Development Grant.
Skills needed: natural language processing, machine learning, high performance computing.
Sarah is developing a multi-stage data valuation method for machine learning and hybrid methods for generative topic modeling.
Sapan is developing hierarchical deep learning models for cell type prediction from heterogeneous single-cell RNA-sequencing data.
Joshua is building a data-driven predictor of target sites for CRISPR-Cas9 genome editing.
Reyna is evaluating the utility of generative adversarial networks (GANs) for dimensionality reduction of single-cell RNA-sequencing data.
Jasmine is studying how coalition game theory can be used to improve data acquisition from fitness tracking IoT devices.
Nathan is developing computational approaches for the discovery of novel biomarkers of cancer immunotherapies.
Get in touch if you are interested in collaborating, have project ideas, or want to discuss our research.