
Announcements of Opportunity


Below are Announcements of Opportunity posted by Caltech faculty and JPL technical staff for the SURF program.

Each AO indicates whether or not it is open to non-Caltech students. If an AO is NOT open to non-Caltech students, please DO NOT contact the mentor.

Announcements of Opportunity are posted as they are received. Please check back regularly for new AO submissions! Remember: This is just one way that you can go about identifying a suitable project and/or mentor. Click here for more tips on finding a mentor.

Announcements for external summer programs are listed here.

New for 2021: Students applying for JPL projects should complete a SURF@JPL application instead of a "regular" SURF application.

Students pursuing opportunities at JPL must be U.S. citizens or U.S. permanent residents.



Project:  Learning to Extrapolate - SURF@Newcastle
Disciplines:  Computation and Neural Systems, Math, Computer Science, ACM, Physics; other majors are also possible.
Mentor:  Pablo Moscato, Professor, (EAS), pablo.moscato@newcastle.edu.au, Phone: +61 2 434216209 (mobile)
Mentor URL:  https://www.newcastle.edu.au/profile/pablo-moscato  (opens in new window)
Background:  NOTE: This project is being offered by a Caltech alum and will be conducted at the University of Newcastle in Newcastle, Australia. Only Caltech students are eligible for this project.

Do you know that many of the machine learning methods currently in practice have notorious problems when they need to produce results “outside the domain” defined by their training sets?

This project aims to explore the extent of these problems and to find ways to remedy them.
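As a toy illustration of the "outside the domain" problem (not taken from the project's code), the sketch below trains a 1-nearest-neighbour regressor on points in [0, 1] and queries it far outside that interval. The function names and the target y = 2x are assumptions chosen only for the example.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Return the target of the training point closest to `query` (1-NN)."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[best]


def true_function(x):
    """Hypothetical ground truth used to generate the training data."""
    return 2.0 * x


# Training set covers only the interval [0, 1].
train_x = [i / 10 for i in range(11)]
train_y = [true_function(x) for x in train_x]

# A query inside the training domain is answered well...
inside = nearest_neighbor_predict(train_x, train_y, 0.5)
inside_error = abs(inside - true_function(0.5))

# ...but a query far outside it just repeats the nearest boundary value.
outside = nearest_neighbor_predict(train_x, train_y, 5.0)
outside_error = abs(outside - true_function(5.0))

print("error inside domain: ", inside_error)
print("error outside domain:", outside_error)
```

The model cannot return any value it did not see during training, so its error grows without bound as the query moves away from the training interval.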

The project stems from a collaboration with Caltech SURF students in 2018, 2019, and 2020, so this would be the fourth year of a successful enterprise in this area. We are developing truly innovative new machine learning methods.

As of January 2021, several publications have arisen from the work with Caltech SURF students and some manuscripts have been submitted [1,2,3,4]. In addition, a new manuscript is being prepared in the area of computational stylistics (involving work by Shakespeare and his peers) [5].

That said, the project has a wide range of applications and is flexible enough to accommodate a problem domain of interest to the student. It also offers the possibility of further extensions and research collaborations with the mentors.

This project involves continuing this work [1,2,3,4] and accelerating the codes (currently one in Matlab, another in C++), with emphasis on improving several aspects of the existing memetic algorithm [3,6,7]. If the student has skills in those areas, implementations of core components on GPUs, TPUs and future hardware systems based on neural networks would also be of interest to speed up the computations. At the moment we also need to improve the performance of the non-linear optimization components of our algorithm. Significant experimentation and coding will be part of the project and, ideally, familiarity with the existing tools and methods should be gained before coming to work at Newcastle.

Students will explore the limitations of nearly 40 different machine learning regression methods and explore the approach based on analytic continued fractions. Students will work on individual subprojects, but also as a team (in close collaboration with other Caltech SURF students in 2021) and with other Caltech students still interested in the project, current postdocs and PhD students in Newcastle, and partners in Italy, Spain and Australia.
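For readers new to the idea, here is a minimal sketch of how a truncated continued fraction can be evaluated as a regression model. It assumes the form f(x) = g0(x) + h0(x) / (g1(x) + h1(x) / (g2(x) + ...)) with each g_i and h_i a linear function of the inputs. That form, and every name in the code, is an assumption made for illustration and is not the project's actual implementation.

```python
def linear(coeffs, bias, x):
    """Evaluate the linear function coeffs . x + bias."""
    return sum(c * xi for c, xi in zip(coeffs, x)) + bias


def eval_continued_fraction(g_terms, h_terms, x):
    """Evaluate a truncated continued fraction at the point x.

    g_terms holds depth + 1 (coeffs, bias) pairs and h_terms holds depth
    pairs. Evaluation runs bottom-up, from the innermost term outward.
    """
    value = linear(*g_terms[-1], x)
    for g, h in zip(reversed(g_terms[:-1]), reversed(h_terms)):
        value = linear(*g, x) + linear(*h, x) / value
    return value


# Depth 1 with g0(x) = x, h0(x) = 1, g1(x) = x gives f(x) = x + 1/x.
simple = eval_continued_fraction(
    g_terms=[([1.0], 0.0), ([1.0], 0.0)],
    h_terms=[([0.0], 1.0)],
    x=[2.0],
)

# With every g_i = 1 and h_i = 1 the fraction converges to the golden ratio.
golden = eval_continued_fraction(
    g_terms=[([0.0], 1.0)] * 31,
    h_terms=[([0.0], 1.0)] * 30,
    x=[0.0],
)

print(simple, golden)
```

Evaluating from the innermost term outward keeps the cost linear in the depth. A real implementation would also need to guard against denominators close to zero.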

Additional information from mentor:
The University of Newcastle, City Campus, Newcastle, Australia.

Students will have access to the facilities of the NewSpace building located in front of it:

https://www.newcastle.edu.au/about-uon/our-environments/new-space

On some days of the week, students will also conduct their research at the Callaghan Campus of The University of Newcastle, accompanying the mentors while they teach and do research there. A shuttle provides an easy commute between the two campuses.

If COVID-19 still limits travel, the project can be executed online. As in 2020, daily meetings of around one hour are expected at the end of each day to evaluate progress and decide on activities for the next day.

Check out @mynewcastle or @pablomoscato on Instagram for photos of the city and its natural beauty. The NewSpace building is close to some beaches and attractions.
Description:  The student will continue the ongoing development of open source codes for memetic algorithms for machine learning problems, mainly in regression but with extension to classification, based on a representation that exploits the power of analytic continued fractions.

This is likely to lead to a powerful new method to address the problem in which some variables are selected and a non-linear optimization problem needs to be solved to identify the contribution of these variables to fitting a particular function given experimental data.
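The two-stage problem described above (select some variables, then solve a non-linear fit for their coefficients) can be sketched as follows. This is only an illustration under stated assumptions: the model is a depth-1 continued fraction g0(x) + h0(x) / g1(x) with linear g0, h0 and g1, the selection mask is fixed by hand, and a simple compass search stands in for whatever optimizer the project actually uses. All names and the synthetic data are hypothetical.

```python
def split_params(params, k):
    """Split a flat parameter list into (coeffs, bias) pairs for g0, h0, g1."""
    size = k + 1
    parts = [params[i * size:(i + 1) * size] for i in range(3)]
    return [(p[:k], p[k]) for p in parts]


def predict(params, mask, row):
    """Evaluate g0(x) + h0(x) / g1(x) using only the selected variables."""
    x = [v for v, keep in zip(row, mask) if keep]
    (g0c, g0b), (h0c, h0b), (g1c, g1b) = split_params(params, len(x))

    def dot(coeffs):
        return sum(c * xi for c, xi in zip(coeffs, x))

    denom = dot(g1c) + g1b
    if abs(denom) < 1e-9:  # undefined prediction, reject these parameters
        return None
    return dot(g0c) + g0b + (dot(h0c) + h0b) / denom


def mse(params, mask, rows, targets):
    """Mean squared error, or infinity if any prediction is undefined."""
    total = 0.0
    for row, y in zip(rows, targets):
        p = predict(params, mask, row)
        if p is None:
            return float("inf")
        total += (p - y) ** 2
    return total / len(rows)


def compass_search(loss, start, step=0.5, min_step=1e-6, max_sweeps=2000):
    """Derivative-free local search: try +/- step on each coordinate."""
    best, best_loss = list(start), loss(start)
    sweeps = 0
    while step >= min_step and sweeps < max_sweeps:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_loss = loss(trial)
                if trial_loss < best_loss:
                    best, best_loss, improved = trial, trial_loss, True
        if not improved:
            step /= 2  # shrink the step once no direction helps
        sweeps += 1
    return best, best_loss


# Synthetic data: y depends on x0 only; x1 is an irrelevant variable.
rows = [[i / 5, (i * 7) % 3] for i in range(11)]
targets = [r[0] + 1.0 / (1.0 + r[0]) for r in rows]

mask = [True, False]  # selection stage: keep x0, drop x1
start = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  # g0 = 0, h0 = 0, g1 = 1


def loss(p):
    return mse(p, mask, rows, targets)


initial_loss = loss(start)
best_params, final_loss = compass_search(loss, start)
print(initial_loss, final_loss)
```

In a memetic algorithm, a local search of this kind would typically be applied to individuals within a population-based search that also explores different variable selections. The compass search here only accepts improving moves, so it can stall in a local minimum.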

The method will be tested on a number of datasets of interest that are available for experimentation. A comparison with other machine learning approaches is expected, so the deliverables may help the team continue the collaboration after SURF and take part in ongoing competitions at international events dedicated to this area, or in those sponsored by Kaggle and other international groups.

We expect that candidates could continue developing this research area after returning to Caltech, if interested in an ongoing collaboration with the mentors (as has happened in the past). A number of other approaches could be explored during the SURF project, including implementing the algorithms on GPUs, TPUs and future hardware systems (such as Intel’s Nervana or Graphcore IPUs) and running the method on them. We expect to get access to some of these systems soon. The internship may provide the necessary time to communicate effectively what the core problems are and to find a first solution, which may result in at least one journal publication.
References:  1) A memetic algorithm for symbolic regression, H. Sun and P. Moscato, in Proc. of IEEE Conference on Evolutionary Computation 2019, pp. 2167-2174, (2019)
https://ieeexplore.ieee.org/document/8789889
2) Analytic Continued Fractions for Regression: Results on 352 datasets from the physical sciences, P. Moscato, H. Sun, M.N. Haque, in Proc. of IEEE Conference on Evolutionary Computation 2020, pp. 1-8, (2020)
https://ieeexplore.ieee.org/abstract/document/9185564
3) Analytic Continued Fractions for Regression: A Memetic Algorithm Approach, P. Moscato, H. Sun and M.N. Haque, (2020), https://arxiv.org/abs/2001.00624
4) Learning to extrapolate using continued fractions: Predicting the critical temperature of superconductor materials, P. Moscato, M.N. Haque, K. Huang, J. Sloan and J.C. de Oliveira, (2020)
https://arxiv.org/abs/2012.03774
5) Continued fractions meet the classics or ‘My kingdom for a continued fraction!’, P. Moscato, H. Craig, G. Egan, M.N. Haque, K. Huang, J. Sloan and J.C. de Oliveira (to appear, 2021).
6) Handbook of Memetic Algorithms, F. Neri, C. Cotta and P. Moscato (Eds.), Springer, 2012.
https://www.springer.com/gp/book/9783642232466
7) Memetic Algorithms for Business Analytics and Data Science: A Brief Survey, Pablo Moscato and Luke Mathieson, in Business and Consumer Analytics: New Ideas, Pablo Moscato and Natalie Jane de Vries (Eds), pp 545-608, https://link.springer.com/chapter/10.1007/978-3-030-06222-4_13
8) Padé approximant, by G.A. Baker Jr. in Scholarpedia, http://www.scholarpedia.org/article/Padé_approximant
9) Distilling Freeform Natural Laws from Experimental Data, https://www.youtube.com/watch?v=lmiAugo1CJI
10) John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.
11) John R. Koza. Human-competitive results produced by genetic programming. Genetic Programming and Evolvable Machines, 11(3/4):251–284, September 2010. Tenth Anniversary Issue: Progress in Genetic Programming and Evolvable Machines.
12) Gene Expression Programming: A Survey, Jinghui Zhong, Liang Feng, Yew-Soon Ong, http://ieeexplore.ieee.org/abstract/document/7983467/
13) Machine-assisted discovery of relationships in astronomy, Graham, Matthew J., et al. arXiv preprint arXiv:1302.5129 Mon. Not. R. Astron. Soc. (2013).
14) www.genetic-programming.org

Student Requirements:  High-level programming skills and an interest in scientific computing, machine learning or artificial intelligence. Experience in HPC and GPU computing and knowledge of symbolic regression and its applications are also a plus.
Programs:  This AO can be done under the following programs:

  Program    Available To
       SURF    Caltech students only 


