Project:  Contamination Model for Spacecraft Associated Surfaces
Disciplines:  Data Science, Biology
Mentor:  Ashish Mahabal, Lead Computational Scientist (PMA), aam@astro.caltech.edu, Phone: 6263954201
Mentor URL:  http://www.astro.caltech.edu/~aam
Background:  The NASA GeneLab Data System (GLDS) is a repository of "omics" data generated by NASA-funded projects; it does not itself provide computational tools for analyzing these complex datasets. Microbiome Analysis of NASA GeneLab Omics (MANGO) enables the visualization of microbial next-generation sequencing (NGS) data that are available on GLDS but not yet curated. MANGO is thus a Systems Biology Informatics (SBI) research tool that extends GLDS investigations. The tool can also help in developing new computational algorithms and techniques for novel informatics research, and in producing new informatics products that enhance the value of GLDS for future open-science investigations. It provides easy-to-interpret insights into the microbiomes of the International Space Station (ISS), revealing potential changes in microbial community dynamics.
Description:  The MANGO team analyzed samples from 289 locations on the ISS and the spacecraft assembly facility, which makes this currently the most extensive metagenomic study in GeneLab associated with any NASA project. The samples consist of controls and PMA-treated samples, which indicate the viable microbiome population on spacecraft-associated surfaces. We will use the dataset generated from these ~300 locations, covering ~3000 organisms, to investigate overlaps between locations as well as possible levels of contamination with species that can be detrimental from the perspective of planetary protection. The project will involve applying trait-based machine learning models and unsupervised clustering to create a “Contamination model” for spacecraft-associated surfaces. The techniques will be applicable to many other metagenomic datasets.
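Illustration:  As a rough sketch of what "investigating overlaps" and "unsupervised clustering" could look like in practice, the Python snippet below computes Jaccard overlap between the organism sets of sampling locations, groups locations by single-linkage clustering on that overlap, and flags organisms of concern per location. Everything in it is an assumption made for illustration: the function names, the 0.5 overlap threshold, and the toy data are hypothetical and are not taken from the MANGO dataset or from the project's actual methods.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two organism sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_locations(presence, min_overlap=0.5):
    """Single-linkage clustering of locations.

    Two locations are linked when their Jaccard overlap is at least
    `min_overlap` (a hypothetical threshold); clusters are the connected
    groups of linked locations, found with a small union-find.
    """
    parent = {loc: loc for loc in presence}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in combinations(presence, 2):
        if jaccard(presence[u], presence[v]) >= min_overlap:
            parent[find(u)] = find(v)

    clusters = {}
    for loc in presence:
        clusters.setdefault(find(loc), set()).add(loc)
    return sorted(sorted(members) for members in clusters.values())


def flag_locations(presence, organisms_of_concern):
    """Map each location to the organisms of concern detected there."""
    concern = set(organisms_of_concern)
    return {loc: sorted(set(orgs) & concern) for loc, orgs in presence.items()}


# Made-up toy data: location -> set of detected organisms (placeholders only).
toy_presence = {
    "loc_A": {"org1", "org2", "org3"},
    "loc_B": {"org1", "org2", "org4"},
    "loc_C": {"org7", "org8"},
    "loc_D": {"org7", "org8", "org9"},
}

clusters = cluster_locations(toy_presence, min_overlap=0.5)
flags = flag_locations(toy_presence, {"org3", "org9"})
print(clusters)
print(flags)
```

In a real analysis the presence sets would come from the study's taxonomic assignments, and the list of organisms of concern could be derived from trait data such as that referenced below; a trait-based supervised model would be a separate step not shown here.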
References:  NASA-GeneLab: https://genelab.nasa.gov
MANGO super study: https://www.ebi.ac.uk/metagenomics/super-studies/3
BacDive: https://bacdive.dsmz.de/api/bacdive/
Trait Synthesis: https://www.nature.com/articles/s41597-020-0497-4
Student Requirements:  Proficiency in Python, Jupyter notebooks, and Git. Familiarity with basic statistics and with Linux/Unix. Knowledge of the basics of machine learning. Basic biology knowledge is a plus.
