Student-Faculty Programs Office
Summer 2024 Announcements of Opportunity



Project:  SURF@Newcastle, Australia: Memetic Algorithms and Continued Fraction Regression for Precision Machine Learning
Disciplines:  Computation and Neural Systems, Astronomy, Math, CS, ACM, Physics
Mentor:  Pablo Moscato, Professor, (EAS), pablo.moscato@newcastle.edu.au, Phone: +61 2 434216209 (mobile)
Mentor URL:  https://www.newcastle.edu.au/profile/pablo-moscato
Background:  NOTE: This project is being offered by a Caltech alum and is open only to Caltech students. The project will take place at University of Newcastle Australia in Newcastle, Australia. See additional notes in last field below.

To address the need for more interpretable models in AI, we started a collaboration that involved several Caltech SURF students in 2018, 2019, and 2020.

In this work we have pioneered the use of analytic continued fractions.

As of January 2021, several publications have arisen from the work with Caltech SURF students, and some manuscripts have been accepted [1,2,3,4,5]. In addition, new manuscripts in areas such as nuclear physics [6,7], approximating unknown potentials given limited data [8], rock mechanics [9], and the famous Thomson problem [10] have been submitted and/or published. Links to these publications can be found by searching for Prof. Moscato on Google Scholar.

What characterizes our work is the search for models that not only approximate the data well but also generalize well when extrapolating.

This is indeed a project for pioneers in new techniques that deliver better generalization from a limited number of examples.

Students will explore the limitations of different machine learning regression methods and help to develop a new methodology based on analytic continued fractions. SURF students will work on individual subprojects, but also as a team and potentially with other Caltech students still interested in the project, current postdocs and PhD students in Newcastle, and partners in Italy, Spain and Australia.
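As a rough illustration of what a continued fraction regression model can look like, the sketch below evaluates a truncated continued fraction f(x) = g0(x) + h0(x) / (g1(x) + h1(x) / (g2(x) + ...)). It assumes that every g_i and h_i is a linear function of the input features, which is our reading of the published work and is not spelled out in this announcement. The names Linear, linear and eval_continued_fraction are hypothetical and are not taken from the group's code.

```python
from typing import List, Sequence, Tuple

# Assumption: each term is linear, value(x) = sum(w_i * x_i) + bias.
Linear = Tuple[Sequence[float], float]


def linear(term: Linear, x: Sequence[float]) -> float:
    """Evaluate one linear term (weights, bias) at the feature vector x."""
    weights, bias = term
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def eval_continued_fraction(
    g: List[Linear], h: List[Linear], x: Sequence[float]
) -> float:
    """Evaluate g0 + h0/(g1 + h1/(g2 + ...)), truncated at depth len(h).

    Raises ZeroDivisionError if a denominator evaluates to exactly zero.
    """
    if len(g) != len(h) + 1:
        raise ValueError("need exactly one more g term than h terms")
    # Start from the innermost term and fold outwards.
    value = linear(g[-1], x)
    for gi, hi in zip(reversed(g[:-1]), reversed(h)):
        value = linear(gi, x) + linear(hi, x) / value
    return value


# Depth-2 example with constant terms: 1 + 1/(2 + 1/2)
g_terms = [([0.0], 1.0), ([0.0], 2.0), ([0.0], 2.0)]
h_terms = [([0.0], 1.0), ([0.0], 1.0)]
example_value = eval_continued_fraction(g_terms, h_terms, [5.0])
```

The depth (the number of h terms) controls model complexity: depth 0 reduces to an ordinary linear model, and each extra level adds one more ratio of linear terms.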
Description:  The student will continue the ongoing development of open-source code for memetic algorithms [3,11] for machine learning problems, based on a representation that exploits the power of analytic continued fractions.
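For readers new to the term, a memetic algorithm combines a population-based evolutionary search with a local search applied to individual candidate solutions. The following is a minimal sketch of that idea, not the group's open-source code: it fits the six coefficients of a depth-1 continued fraction in one variable, f(x) = a0*x + b0 + (c0*x + d0) / (a1*x + b1). The function names, population size, mutation scales and the hill-climbing local search are all illustrative assumptions.

```python
import random
from typing import List, Sequence, Tuple

Params = List[float]  # [a0, b0, c0, d0, a1, b1] (hypothetical layout)
Data = Sequence[Tuple[float, float]]


def model(p: Params, x: float) -> float:
    """Depth-1 continued fraction; returns inf near a pole."""
    denom = p[4] * x + p[5]
    if abs(denom) < 1e-9:
        return float("inf")
    return p[0] * x + p[1] + (p[2] * x + p[3]) / denom


def mse(p: Params, data: Data) -> float:
    """Mean squared error of the model on (x, y) pairs."""
    total = 0.0
    for x, y in data:
        err = model(p, x) - y
        total += err * err
    return total / len(data)


def local_search(p: Params, data: Data, rng: random.Random,
                 steps: int = 30, scale: float = 0.1) -> Params:
    """Hill climbing: keep a one-coefficient perturbation only if it helps."""
    best, best_err = list(p), mse(p, data)
    for _ in range(steps):
        cand = list(best)
        cand[rng.randrange(len(cand))] += rng.gauss(0.0, scale)
        cand_err = mse(cand, data)
        if cand_err < best_err:
            best, best_err = cand, cand_err
    return best


def memetic_fit(data: Data, pop_size: int = 20,
                generations: int = 40, seed: int = 0) -> Params:
    """Evolutionary loop (selection, crossover, mutation) plus local search."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: mse(p, data))
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by a single-coefficient mutation.
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            child[rng.randrange(6)] += rng.gauss(0.0, 0.5)
            # The local search step is what makes the algorithm memetic.
            children.append(local_search(child, data, rng))
        pop = parents + children
    return min(pop, key=lambda p: mse(p, data))


# Toy data generated from y = 1 + 1/(x + 2) for x = 0..9.
toy_data = [(float(x), 1.0 + 1.0 / (x + 2.0)) for x in range(10)]
best_params = memetic_fit(toy_data, seed=1)
```

Because the better half of each generation is carried over unchanged, the best error in the population never increases from one generation to the next. The fixed seed makes runs repeatable.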

In this particular project, the student will look at datasets of current interest to our group, and we envision that we can also look at other datasets of interest to the student and/or collaborators at Caltech.

The method will be tested on a number of datasets of interest that are available for experimentation. A comparison with other machine learning approaches is expected; the deliverables may thus help the team to continue the collaboration after SURF and to take part in ongoing competitions at international events dedicated to this area, or in those sponsored by Kaggle and other international groups.

We expect that candidates could continue developing this research area after returning to Caltech, if they are interested in an ongoing collaboration with the mentors (as has happened in the past).

The internship may provide the time needed to communicate effectively what the core problems are and to find a first solution, which may result in at least one journal publication.
References:  1) A memetic algorithm for symbolic regression,
H. Sun and P. Moscato, in Proc. of IEEE Conference on Evolutionary Computation 2019, pp. 2167-2174, (2019)
https://ieeexplore.ieee.org/document/8789889

2) Analytic Continued Fractions for Regression: Results on 352 datasets from the physical sciences, P. Moscato, H. Sun, M.N. Haque, in Proc. of IEEE Conference on Evolutionary Computation 2020, pp. 1-8, (2020)
https://ieeexplore.ieee.org/abstract/document/9185564

3) Analytic continued fractions for regression: A memetic algorithm approach
P Moscato, H Sun, MN Haque
Expert Systems with Applications 179, 115018

4) Learning to extrapolate using continued fractions: Predicting the critical temperature of superconductor materials, P Moscato, MN Haque, K Huang, J Sloan, J Corrales de Oliveira, Algorithms 16 (8), 382, (2023), https://www.mdpi.com/1999-4893/16/8/382

5) Multiple regression techniques for modelling dates of first performances of Shakespeare-era plays,
Pablo Moscato, Hugh Craig, Gabriel Egan, Mohammad Nazmul Haque, Kevin Huang, Julia Sloan, Jonathon Corrales de Oliveira, Expert Systems with Applications 200, 116903, (2022), https://www.sciencedirect.com/science/article/abs/pii/S0957417422003414

6) Approximating the Nuclear Binding Energy Using Analytic Continued Fractions, P Moscato, R Grebogi (preprint, Dec. 2023, available online), https://www.researchsquare.com/article/rs-3703198/v1

7) Approximating the Boundaries of Unstable Nuclei Using Analytic Continued Fractions, P Moscato, R Grebogi, Proceedings of the Genetic and Evolutionary Computation Conference, 2023. https://dl.acm.org/doi/10.1145/3583133.3590638

8) New alternatives to the Lennard-Jones potential, P Moscato, MN Haque, (preprint, Dec. 2023, available online), https://www.researchsquare.com/article/rs-3633448/v1

9) Mathematical modelling of peak and residual shear strength of rough rock discontinuities using continued fractions
O Buzzi, M Jeffery, P Moscato, RB Grebogi, MN Haque
Rock Mechanics and Rock Engineering, 1-15, 2023. https://link.springer.com/article/10.1007/s00603-023-03548-0

10) Continued fractions and the Thomson problem, P Moscato, MN Haque, A Moscato, Scientific Reports 13 (1), 7272 (2023). https://www.nature.com/articles/s41598-023-33744-5

11) Dynamic Depth for Better Generalization in Continued Fraction Regression, P Moscato, A Ciezak, N Noman, Proceedings of the Genetic and Evolutionary Computation Conference, 520-528, 2023. https://dl.acm.org/doi/abs/10.1145/3583131.3590461

12) Handbook of Memetic Algorithms, F. Neri, C. Cotta and P. Moscato (Eds.), Springer, 2012.
https://www.springer.com/gp/book/9783642232466

13) Memetic Algorithms for Business Analytics and Data Science: A Brief Survey, P. Moscato and L. Mathieson, in Business and Consumer Analytics: New Ideas, Pablo Moscato and Natalie Jane de Vries (Eds), pp 545-608, https://link.springer.com/chapter/10.1007/978-3-030-06222-4_13
Student Requirements:  High-level programming skills and an interest in scientific computing, machine learning, or artificial intelligence. Experience in HPC and GPU computing and knowledge of symbolic regression and its applications are also a plus.

Additional Notes: Students will have access to the facilities of the NewSpace building located in front of it:

https://www.newcastle.edu.au/about-uon/our-environments/new-space

On some days of the week, students will also conduct their research at the Callaghan Campus of The University of Newcastle, accompanying the mentors while they teach or do research there. A shuttle allows an easy commute between the two campuses.

Check out @mynewcastle or @pablomoscato on Instagram for photos of the city and its natural beauty. The NewSpace building is close to some beaches and attractions.

A video on continued fractions for regression problems in physics can be found on Prof. Moscato’s YouTube channel.
https://www.youtube.com/channel/UC10nC9lebye6tLm-xO2ASnw
Programs:  This AO can be done under the following programs:

  Program    Available To
       SURF    Caltech students only 

Problems with or questions about submitting an AO?  Call Alexandra Katsas of the Student-Faculty Programs Office at (626) 395-2885.
 