Amgen Scholars: Announcements of Opportunity

Below are Announcements of Opportunity posted by Caltech faculty for the Amgen Scholars program.

Announcements of Opportunity are posted as they are received. Please check back regularly for new AO submissions! Remember: This is just one way that you can go about identifying a suitable project and/or mentor. For additional tips on identifying a mentor click here.

Please remember:

  • Students pursuing Amgen must be U.S. citizens, U.S. permanent residents, or students with DACA status.
  • Students pursuing Amgen must complete the 10-week program from June 18 - August 23, 2024. Students must commit to these dates. No exceptions will be made.
  • Accepted students must live in provided Caltech housing.

Project:  Using machine learning to understand the fundamentals of glycosaminoglycan-protein interactions
Disciplines:  Data Science, Biochemistry, Biology, Chemistry, Computer Science
Mentor:  Linda Hsieh-Wilson, Milton and Rosalind Chang Professor of Chemistry (CCE), Phone: 626-395-6101
AO Contact:  Hailan Yu
Background:  Glycosaminoglycans (GAGs) are linear sugar polymers consisting of repeating disaccharide (two-sugar) units anchored to the cell surface. GAGs interact with more than 3000 proteins and contribute to a variety of biological processes, such as embryonic development, cancer metastasis, and pathogenic infection. Their immense structural diversity is what enables GAGs to bind proteins with drastically different binding sites. However, this diversity also makes it extremely challenging to isolate structurally defined GAGs in appreciable quantities from natural sources or to synthesize them in the lab. As a result, a systematic understanding of the structure-activity relationship between GAGs and GAG-binding proteins has not yet been achieved, despite its high biological relevance and therapeutic potential. To address this gap, our lab has synthesized a comprehensive library of 64 heparan sulfate (one type of GAG) tetrasaccharides, encompassing all commonly found modification patterns of natural heparan sulfate tetrasaccharides. We have also collected binding affinity data for various fibroblast growth factors (a class of mitogens indispensable to development and homeostasis) against each of those 64 compounds.
Description:  We aim to use and develop machine learning algorithms to uncover the fundamental rules of how GAGs recognize proteins and to provide the field with new workflows for collecting and analyzing GAG-protein interaction data. In this project, the student will a) refine the code of the existing algorithms the lab uses to analyze protein binding data, b) help develop code that converts the analyzed data into clear visual outputs, and c) develop new algorithms for mining sequence information from the data.
References:  Previous work on synthesizing the 64-compound library and collecting protein binding data for those compounds:
Student Requirements:  Experience with and interest in applying machine learning to biological data sets are strongly recommended (languages include Python and R). Interest in biochemistry and biology is expected. A background in biology and chemistry would be a plus.
Programs:  This AO can be done under the following programs:

  Program    Available To
       SURF    Caltech students only 
