
SURF: Announcements of Opportunity

Below are Announcements of Opportunity posted by Caltech faculty and JPL technical staff for the SURF program. Additional AOs for the Amgen Scholars program can be found here.

Specific GROWTH projects being offered for summer 2018 can be found here.

Each AO indicates whether or not it is open to non-Caltech students. If an AO is NOT open to non-Caltech students, please DO NOT contact the mentor.

Announcements of Opportunity are posted as they are received. Please check back regularly for new AO submissions! Remember: This is just one way that you can go about identifying a suitable project and/or mentor.

Announcements for external summer programs are listed here.

Students pursuing opportunities at JPL must be U.S. citizens or U.S. permanent residents.



Project:  Memetic Programming for Symbolic Regression - SURF@Newcastle
Disciplines:  Mathematics, Computer Science, Applied and Computational Mathematics, Computation and Neural Systems
Mentor:  Pablo Moscato, Professor, (EAS), pablo.moscato@newcastle.edu.au, Phone: +61 2 434216209 (mobile)
Mentor URL:  https://www.newcastle.edu.au/profile/pablo-moscato
Background:  NOTE: This project will be conducted at the University of Newcastle in Newcastle, Australia.

How can we use problem domain knowledge to deliver a truly memetic programming approach for symbolic regression?

In general, methods for symbolic regression, such as genetic programming (GP), represent solutions as tree structures. Following standard graph notation, edges connect pairs of vertices; some vertices of these structures are leaves and the others are internal vertices. Each internal vertex has an associated operator. These operators depend on the problem domain. They could be mathematical, e.g. addition (‘+’), subtraction (‘-’), multiplication (‘*’), division (‘/’), etc. In other problem domains, the building blocks associated with the internal vertices could be logical operators (e.g. AND, NOT, OR, XOR), mathematical functions (e.g. trigonometric functions, exponentiation), or even primitive algorithms, among many others. Each leaf has an associated predictor variable from the study for which data are available.
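As a minimal illustration of the tree representation described above, the following Python sketch stores an expression as nested tuples and evaluates it on one data row. The operator table, the variable names x1 and x2, and the use of numeric constants as leaves are assumptions made for this example only; they are not part of the project description.

```python
import operator

# Hypothetical operator table for the internal vertices of the tree.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node, row):
    """Evaluate an expression tree on one data row (dict: variable name -> value)."""
    if isinstance(node, str):            # leaf holding a predictor variable
        return row[node]
    if isinstance(node, (int, float)):   # leaf holding a constant (an assumption of this sketch)
        return float(node)
    op, left, right = node               # internal vertex: (operator, left child, right child)
    return OPS[op](evaluate(left, row), evaluate(right, row))

# Tree for (x1 + x2 * x2) / (x1 - 3)
tree = ("/", ("+", "x1", ("*", "x2", "x2")), ("-", "x1", 3.0))
value = evaluate(tree, {"x1": 5.0, "x2": 2.0})
print(value)  # 4.5
```

A GP system would apply crossover and mutation to such trees; this sketch only covers the representation and its evaluation.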

Again, in general, research on symbolic regression methods such as GP has not sought to exploit problem-domain information. In a Memetic Programming approach, we could exploit mathematical knowledge built over many decades in the area of function approximation. For instance, Padé approximants are known to provide good models for fitting one-dimensional functions, yet they are hardly ever observed as the result of GP-evolved solutions. Padé approximants are considered to be the 'best' approximation of a function by a rational function of given order. The power series of a Padé approximant agrees with the power series of the function it approximates up to order m + n, where m and n are the degrees of its numerator and denominator.
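To make the power-series matching property concrete, here is a small Python sketch that computes the coefficients of an [m/n] Padé approximant from Taylor coefficients by solving the standard linear system in exact rational arithmetic. The helper names solve and pade are hypothetical, the denominator is normalized so that its constant term is 1, and the sketch assumes the linear system is nonsingular.

```python
from fractions import Fraction
from math import factorial

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square and nonsingular)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Raises StopIteration if no nonzero pivot exists (singular system).
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def pade(c, m, n):
    """Return (a, b), the numerator and denominator coefficients of the [m/n] approximant.

    c holds the Taylor coefficients c_0..c_{m+n}; b[0] is fixed to 1.
    """
    get = lambda i: c[i] if i >= 0 else Fraction(0)
    # Require the coefficients of x^(m+1)..x^(m+n) in f(x)*q(x) - p(x) to vanish.
    A = [[get(k - j) for j in range(1, n + 1)] for k in range(m + 1, m + n + 1)]
    rhs = [-get(k) for k in range(m + 1, m + n + 1)]
    b = [Fraction(1)] + (solve(A, rhs) if n else [])
    a = [sum(b[j] * get(i - j) for j in range(min(i, n) + 1)) for i in range(m + 1)]
    return a, b

# Taylor coefficients of exp(x) up to order 4, enough for the [2/2] approximant.
taylor_exp = [Fraction(1, factorial(k)) for k in range(5)]
num, den = pade(taylor_exp, 2, 2)
print(num, den)
```

For exp(x) this gives the numerator 1 + x/2 + x^2/12 and the denominator 1 - x/2 + x^2/12, the well-known [2/2] approximant.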

This indicates that within a memetic computing approach even the representation should be reconsidered and revisited with new proposal methods. For instance, the functional forms of the Padé approximants can be evolved and the parameters adjusted via ad-hoc non-linear optimization algorithms.
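One possible reading of this idea is sketched below in Python: the structure (the numerator and denominator degrees of a one-dimensional rational function) is changed by mutation, and after every structural change the coefficients are refined by a local search. This is a deliberately simplified single-parent loop, not the project's method; the coordinate pattern search stands in for whatever non-linear optimizer would actually be used, and all function names and the synthetic data are assumptions of this example.

```python
import random

def rational(params, m, x):
    """Evaluate p(x)/q(x); params = a_0..a_m followed by b_1..b_n, with b_0 = 1."""
    num = sum(a * x**i for i, a in enumerate(params[: m + 1]))
    den = 1.0 + sum(b * x**(j + 1) for j, b in enumerate(params[m + 1:]))
    return num / den if abs(den) > 1e-9 else float("inf")

def mse(params, m, data):
    """Mean squared error of the rational model on (x, y) pairs."""
    return sum((rational(params, m, x) - y) ** 2 for x, y in data) / len(data)

def local_search(params, m, data, step=0.5, rounds=40):
    """Coordinate pattern search: try +/- step on each parameter, halve the step when stuck."""
    params = list(params)
    best = mse(params, m, data)
    for _ in range(rounds):
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params[:]
                trial[i] += delta
                err = mse(trial, m, data)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2
    return params, best

def memetic_fit(data, generations=6, seed=0):
    """Alternate structural mutation (degree growth) with parameter refinement."""
    rng = random.Random(seed)
    m, n = 0, 0
    params, err = local_search([0.0], m, data)
    history = [err]
    for _ in range(generations):
        if rng.random() < 0.5:   # grow the numerator degree
            cm, cn = m + 1, n
            child = params[: m + 1] + [0.0] + params[m + 1:]
        else:                    # grow the denominator degree
            cm, cn = m, n + 1
            child = params + [0.0]
        child, child_err = local_search(child, cm, data)
        if child_err < err:      # keep the child only if it fits better
            m, n, params, err = cm, cn, child, child_err
        history.append(err)
    return m, n, params, history

# Synthetic data from (1 + 2x) / (1 + 0.5x) on [0, 2].
data = [(x / 10, (1 + 2 * x / 10) / (1 + 0.5 * x / 10)) for x in range(21)]
m, n, params, history = memetic_fit(data)
print(m, n, history[0], history[-1])
```

Because a new coefficient starts at zero, a child begins with exactly its parent's error, so the recorded error never increases. A population-based variant with recombination would be closer to a full memetic algorithm.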

Implementations of core components on GPUs, TPUs and future hardware systems based on neural networks would also be of interest to speed up the computations.

This project has a vast range of applications. It is flexible enough to accommodate a problem domain of interest to the student, and it offers the possibility of further extensions and research collaborations with the mentors.
Description:  The student will develop open-source code for Memetic Programming for symbolic regression, based on a representation via ratios of polynomials and multivariate Padé approximants.
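A ratio-of-polynomials representation could be stored, for example, as two dictionaries mapping exponent tuples to coefficients. The Python sketch below is only an illustration of that data structure; the names poly and rational and the example function are hypothetical, and the code does not guard against a zero denominator.

```python
from math import prod

def poly(terms, point):
    """Evaluate a multivariate polynomial given as {exponent tuple: coefficient}."""
    return sum(
        coeff * prod(v ** e for v, e in zip(point, exps))
        for exps, coeff in terms.items()
    )

def rational(numerator, denominator, point):
    """Evaluate the ratio of two multivariate polynomials at a point."""
    return poly(numerator, point) / poly(denominator, point)

# (1 + 2*x1*x2) / (1 + x2**2), with exponent tuples ordered as (x1, x2)
numerator = {(0, 0): 1.0, (1, 1): 2.0}
denominator = {(0, 0): 1.0, (0, 2): 1.0}
value = rational(numerator, denominator, (3.0, 2.0))
print(value)  # 13 / 5
```

In this encoding, a variable counts as selected when it appears with a nonzero exponent in at least one term, and the coefficients are the quantities a non-linear optimizer would adjust.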

This is likely to lead to a heuristic to address the problem in which some variables are selected and a non-linear optimization problem needs to be solved to identify the contribution of these variables to fitting a particular function given experimental data.

The method will be tested on a number of datasets of interest that are available for experimentation. A comparison with other Genetic Programming approaches is expected; thus the deliverables may help to compete in international events dedicated to this area.

Candidates could continue developing this research area after returning to Caltech, if interested in an ongoing collaboration with the mentors. A number of other approaches could be explored during the SURF project, including implementing the algorithms on GPUs, TPUs and future hardware systems (such as Intel’s Nervana) and running the Memetic Programming method on them. The internship may provide the necessary time to communicate effectively what the core problems are and to find a first solution, which may result in at least one journal publication.
References:  1) John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.

2) John R. Koza. Human-competitive results produced by genetic programming. Genetic Programming and Evolvable Machines, 11(3/4):251–284, September 2010. Tenth Anniversary Issue: Progress in Genetic Programming and Evolvable Machines.

3) Handbook of Memetic Algorithms, F. Neri, C. Cotta and P. Moscato (Eds.), Springer, 2012.

4) Gene Expression Programming: A Survey, Jinghui Zhong, Liang Feng, Yew-Soon Ong, http://ieeexplore.ieee.org/abstract/document/7983467/

5) Machine-assisted discovery of relationships in astronomy, Graham, Matthew J., et al. arXiv preprint arXiv:1302.5129; Mon. Not. R. Astron. Soc. (2013).

6) www.genetic-programming.org

7) Padé approximant, by G.A. Baker Jr. in Scholarpedia, http://www.scholarpedia.org/article/Padé_approximant

8) Distilling Freeform Natural Laws from Experimental Data, https://www.youtube.com/watch?v=lmiAugo1CJI
Student Requirements:  High-level programming skills and an interest in scientific computing, machine learning or artificial intelligence. Experience in HPC and GPU computing and knowledge of symbolic regression and its applications are also a plus.
Programs:  This AO can be done under the following programs:

  Program    Available To
       SURF    Caltech students only 

Click on a program name for program info and application requirements.

