SURF: Announcements of Opportunity
Below are Announcements of Opportunity posted by Caltech faculty and JPL technical staff for the SURF program.
Each AO indicates whether or not it is open to non-Caltech students. If an AO is NOT open to non-Caltech students, please DO NOT contact the mentor.
Announcements of Opportunity are posted as they are received. Please check back regularly for new AO submissions! Remember: this is just one way to identify a suitable project and/or mentor. See the SURF website for more tips on finding a mentor.
Announcements for external summer programs are listed here.
New for 2021: Students applying for JPL projects should complete a SURF@JPL application instead of a "regular" SURF application.
Students pursuing opportunities at JPL must be U.S. citizens or U.S. permanent residents.
|Project:||Learning and Optimization: How can we succeed being "Lazy but Wise?" - SURF@Newcastle|
|Disciplines:||Computation and Neural Systems, Ma, CS, ACM, Ph, other majors are possible|
|Mentor URL:||https://www.newcastle.edu.au/profile/pablo-moscato (opens in new window)|
NOTE: This project is being offered by a Caltech alum and will be conducted at The University of Newcastle in Newcastle, Australia. Only Caltech students are eligible for this project.
Believe it or not, it is sometimes a good strategy to “just think” for a bit, find “a lazy way” to reformulate a computational problem, and solve it with minimum effort. But what does that really mean in practice?
Like many of us, you were probably taught that, basically, “algorithms rule the world”. While this is largely true, in some domains we could just as well say that “heuristics rule the world”.
One of the most important heuristic choices happens at the initial stage of solving any problem by computational means. Even if you want to solve a problem exactly, you need to consider the problem's mathematical formulation. The choice of formulation can have huge consequences for the final computation time and, in learning problems, for the capacity to generalize.
This project aims to explore how we can heuristically reformulate some important problems in learning and optimization, and how we can accelerate, by orders of magnitude, the time needed to find a near-optimal, or even optimal, solution.
The project stems from a collaboration with Caltech SURF students in 2018, 2019, and 2020, so this would be the fourth year of a successful enterprise in this area. We are developing truly innovative new machine learning methods.
As of January 2021, several publications have arisen from the work with Caltech SURF students, and some manuscripts have been submitted [1,2,3,4]. In addition, a new manuscript is being prepared in the area of computational stylistics (involving works by Shakespeare and his peers).
This subproject is motivated by our group's recent research in machine learning [6,7] and by the role that objective-function reformulation also plays in optimization.
That said, the project has a vast range of applications and is flexible enough to accommodate a problem domain of interest to the student, offering the possibility of further extensions and research collaborations with the mentors.
Students will explore the limitations of nearly 40 different machine learning regression methods and investigate the approach based on analytic continued fractions. Students will work on individual subprojects, but also as a team (in close collaboration with other Caltech SURF students in 2021) and with other Caltech students still interested in the project, current postdocs and PhD students in Newcastle, and partners in Italy, Spain, and Australia.
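As a toy illustration of the kind of model involved (the exact representation used in the cited papers may differ), a depth-n analytic continued fraction with linear terms can be evaluated from the innermost term outward:

```python
def cf_eval(x, a, b):
    """Evaluate a depth-n analytic continued fraction of the form
        f(x) = a[0] + b[0]*x + x / (a[1] + b[1]*x + x / (... + x / (a[n] + b[n]*x)))
    where every term is linear in the input x.
    NOTE: this linear-term convention is an illustrative assumption only.
    """
    val = a[-1] + b[-1] * x              # innermost term
    for ai, bi in zip(a[-2::-1], b[-2::-1]):
        val = ai + bi * x + x / val      # wrap one more level around it
    return val

# depth-2 example: f(x) = 1 + 0.5*x + x / (2 + 0*x)
print(cf_eval(1.0, [1.0, 2.0], [0.5, 0.0]))   # -> 2.0
```

Nested rational forms like this can approximate functions that a polynomial of comparable size cannot, which is what makes them an interesting representation for regression.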
Additional information from mentor:
The University of Newcastle, City Campus, Newcastle, Australia.
Students will have access to the facilities of the NewSpace building located in front of it.
On some days of the week, students will also conduct their research at the Callaghan Campus of The University of Newcastle, accompanying the mentors while they teach and research there. A shuttle allows an easy commute between the two campuses.
If COVID-19 still limits travel in 2021, the project can be conducted online. As in 2020, meetings of around one hour are expected at the end of each day to evaluate progress, brainstorm, and reach decisions about the next day's activities.
Check out @mynewcastle or @pablomoscato on Instagram for photos of the city and its natural beauty. The NewSpace building is close to some beaches and attractions.
The student will continue the ongoing development of open-source code for memetic algorithms for machine learning problems, mainly in regression but with extensions to classification, based on a representation that exploits the power of analytic continued fractions.
This is likely to lead to a powerful new method for the problem in which a subset of variables is selected and a non-linear optimization problem is solved to identify the contribution of those variables to fitting a particular function to given experimental data.
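A minimal sketch of that fitting step, under two stated assumptions: the continued fraction uses a simple linear-term form (the papers' representation may differ), and a generic nonlinear least-squares solver from SciPy stands in for the mentors' memetic algorithm, which also evolves the set of selected variables:

```python
import numpy as np
from scipy.optimize import least_squares

def cf_eval(x, a, b):
    # f(x) = a[0] + b[0]*x + x/(a[1] + b[1]*x + x/(...)) -- illustrative form only
    val = a[-1] + b[-1] * x
    for ai, bi in zip(a[-2::-1], b[-2::-1]):
        val = ai + bi * x + x / val
    return val

def fit_cf(x, y, depth, seed=0):
    # Fit the 2*depth coefficients by generic nonlinear least squares
    # (a stand-in for the memetic algorithm used in the actual project).
    rng = np.random.default_rng(seed)
    theta0 = rng.normal(size=2 * depth)

    def residuals(theta):
        return cf_eval(x, theta[:depth], theta[depth:]) - y

    return least_squares(residuals, theta0).x

# Synthetic data from a known linear law; at depth 1 the model reduces to a + b*x.
x = np.linspace(0.5, 2.0, 50)
y = 1.0 + 2.0 * x
theta = fit_cf(x, y, depth=1)
```

At greater depths the objective becomes non-convex, which is exactly where population-based methods such as memetic algorithms become attractive over a single local solver.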
The method will be tested on a number of datasets of interest that are available for experimentation. A comparison with other machine learning approaches is expected; the deliverables may thus help the team continue the collaboration after SURF and engage in ongoing competitions at international events dedicated to this area, such as those sponsored by Kaggle and other international groups.
We expect that candidates could continue developing this research area after returning to Caltech, if interested in an ongoing collaboration with the mentors (as has happened in the past). A number of other approaches could also be explored during the SURF project, including implementing the algorithms on GPUs, TPUs, and future hardware systems (such as Intel's Nervana or Graphcore IPUs) and running the method on them; we expect to get access to some of these systems soon. The internship may provide the time necessary for effective communication of the core problems and for finding a first solution, which may result in at least one journal publication.
1) A memetic algorithm for symbolic regression, H. Sun and P. Moscato, in Proc. of the IEEE Congress on Evolutionary Computation 2019, pp. 2167-2174 (2019).
2) Analytic Continued Fractions for Regression: Results on 352 datasets from the physical sciences, P. Moscato, H. Sun, and M.N. Haque, in Proc. of the IEEE Congress on Evolutionary Computation 2020, pp. 1-8 (2020).
3) Analytic Continued Fractions for Regression: A Memetic Algorithm Approach, P. Moscato, H. Sun, and M.N. Haque (2020), https://arxiv.org/abs/2001.00624
4) Learning to extrapolate using continued fractions: Predicting the critical temperature of superconductor materials, P. Moscato, M.N. Haque, K. Huang, J. Sloan, and J.C. de Oliveira (2020).
5) Continued fractions meet the classics or ‘My kingdom for a continued fraction!’, P. Moscato, H. Craig, G. Egan, M.N. Haque, K. Huang, J. Sloan, and J.C. de Oliveira (to appear, 2021).
6) The Unexpected Virtue of Problem Reductions or How to Solve Problems Being Lazy but Wise, L. Mathieson and P. Moscato, in Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), https://ieeexplore.ieee.org/abstract/document/9308295
7) Target curricula via selection of minimum feature sets: a case study in Boolean networks, S. Fenn and P. Moscato, Journal of Machine Learning Research, https://dl.acm.org/doi/abs/10.5555/3122009.3176858
8) Target Curricula for Multi-Target Classification: The Role of Internal Meta-Features in Machine Teaching, S. Fenn, PhD Thesis, The University of Newcastle, 2019.
9) Handbook of Memetic Algorithms, F. Neri, C. Cotta, and P. Moscato (Eds.), Springer, 2012.
10) Memetic Algorithms for Business Analytics and Data Science: A Brief Survey, P. Moscato and L. Mathieson, in Business and Consumer Analytics: New Ideas, P. Moscato and N.J. de Vries (Eds.), pp. 545-608, https://link.springer.com/chapter/10.1007/978-3-030-06222-4_13
11) Padé approximant, G.A. Baker Jr., in Scholarpedia, http://www.scholarpedia.org/article/Padé_approximant
12) Distilling Freeform Natural Laws from Experimental Data, https://www.youtube.com/watch?v=lmiAugo1CJI
|Student Requirements:||High-level programming skills and an interest in scientific computing, machine learning, and artificial intelligence. Experience in HPC and GPU computing and knowledge of symbolic regression and its applications are also a plus.|
This AO can be done under the following programs: