I am an Assistant Professor of Computer Science at Virginia Tech. Prior to VT, I spent a year as a Research Scientist (Research Staff Member) at IBM Almaden Research Center, working on Machine Learning, Natural Language Processing and Information Retrieval problems. I obtained my Ph.D. from the Computer Science Department at the University of Illinois at Urbana-Champaign, under the supervision of Prof. ChengXiang Zhai in the Text Information Management and Analysis Group (TIMAN).

My research interests lie at the intersection of Data Science, Big Data, Machine Learning, Natural Language Processing and Information Retrieval. My work revolves around machine learning challenges related to data, e.g., learning with limited and imperfect supervision, self-supervision, uncertainty calibration, outlier detection and adversarial training, all under the broader theme of Data Quality in Machine Learning. My current projects involve active and semi-supervised learning, contrastive multi-modal learning, graph adversarial learning, sequential decision making and interpretability mechanisms. Overall, I am interested in building intelligent task assistants that augment human intelligence. I have collaborated extensively on interdisciplinary projects with a societal dimension, in areas ranging from Health Informatics and Genomics to Psychology, Education and Social Computing. You can learn more by looking at my publications. I am excited about opportunities to collaborate in areas such as Reinforcement Learning, Multi-agent RL and Meta-learning.

  • If you are a prospective student interested in working with me, please make sure to apply and mention me in your research statement. Unfortunately, due to email overload, my responses to such requests have become sporadic lately, and there is a high likelihood that I will miss an email.


  • April, 2021: Paper accepted to SIGIR’21 and workshop paper accepted to PerInt@PETRA’21.
  • April, 2021: Will be teaching CS5604: Information Storage and Retrieval in Fall 2021 (more details soon).
  • March, 2021: Our multimodal chest X-ray dataset was recently published in Nature Scientific Data, accompanied by a short blog post describing the work. We show that utilizing eye-gaze information can lead to improved performance and guide the model to produce more accurate activation maps.
  • December, 2020: Will be teaching a new seminar course, CS6604: Data Challenges in Machine Learning.
  • December, 2020: Happy to be recognized as an Outstanding Reviewer for EMNLP 2020.
  • November, 2020: Will be joining the Computer Science Department at Virginia Tech as an Assistant Professor.
  • November, 2019: Defended my Ph.D. thesis, "Data Quality in the Deep Learning Era", and joined IBM Research.
  • October, 2019: Attending ISWC 2019 in Auckland, NZ (travel award).
  • August, 2019: Selected to participate in the EECS Rising Stars 2019 workshop.