For a full overview of my publications, please refer to Google Scholar.
Below are short summaries, with links, of projects I am working on or have worked on.
## Recent Projects
### AnomalyMatch
Identifying Astrophysical Anomalies in 99.6 Million Cutouts from the Hubble Legacy Archive Using AnomalyMatch
David O'Ryan, Pablo Gómez - arXiv:2505.03508
We systematically searched approximately 100 million image cutouts from the entire Hubble Legacy Archive using AnomalyMatch, a semi-supervised anomaly detection method. This comprehensive search uncovered 138 new candidate gravitational lenses, 18 jellyfish galaxies, and 417 mergers or interacting galaxies. The method can trawl the complete archive within just 2-3 days, demonstrating its potential for large-scale astronomical surveys including upcoming Euclid data releases.
AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning
Pablo Gómez, David O'Ryan - arXiv:2505.03509
AnomalyMatch is an anomaly detection framework that combines the semi-supervised FixMatch algorithm with active learning for efficient discovery of rare objects. The method achieves strong performance on astronomical and natural image benchmarks and can produce predictions for 100 million images within three days on a single GPU. Integrated into ESA's Datalabs platform, it facilitates targeted discovery of scientifically valuable anomalies in vast astronomical datasets.
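To illustrate the FixMatch idea mentioned above, here is a minimal NumPy sketch of the generic unlabeled-data loss term: predictions on a weakly augmented image become a pseudo-label only when they are confident enough, and that pseudo-label then supervises the prediction on a strongly augmented view of the same image. This is not AnomalyMatch's actual code. The function names and the 0.95 threshold are assumptions for illustration, and the active-learning loop is not shown.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Generic FixMatch-style loss on a batch of unlabeled samples.

    weak_logits / strong_logits: (batch, classes) model outputs for the weakly
    and strongly augmented views. The threshold value is an assumption here.
    """
    weak_probs = np.exp(log_softmax(weak_logits))
    confidence = weak_probs.max(axis=1)
    pseudo_labels = weak_probs.argmax(axis=1)

    # Only confident weak predictions contribute to the loss.
    mask = confidence >= threshold

    # Cross-entropy of the strong view against the pseudo-labels.
    strong_log_probs = log_softmax(strong_logits)
    per_sample_ce = -strong_log_probs[np.arange(len(pseudo_labels)), pseudo_labels]

    # Average over the whole batch; masked-out samples contribute zero.
    loss = float((per_sample_ce * mask).mean())
    return loss, mask
```

In a full training loop this term would be added, with a weighting factor, to the ordinary supervised loss on the few labeled examples.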
### Cutana
Cutana: A High-Performance Tool for Astronomical Image Cutout Generation at Petabyte Scale
Pablo Gómez, Laslo Erik Ruhberg, Kristin Anett Remmelgas, David O'Ryan
Memory-efficient software for batch processing astronomical images at petabyte scale. Uses vectorized NumPy operations for simultaneous cutout extraction with automated memory-aware scheduling, processing thousands of cutouts per second with near-linear scaling. Built for Euclid's data releases, it processes all 30 million Q1 sources in under 4 hours.
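As a rough illustration of what simultaneous cutout extraction with vectorized NumPy operations can look like, the sketch below pulls many square cutouts out of a single 2D image with one fancy-indexing call instead of a Python loop. It is not Cutana's implementation. The function name, the zero padding at the image border, and the use of integer pixel centres are assumptions, and real survey data would also involve FITS I/O, sky-coordinate handling, and the memory-aware scheduling described above.

```python
import numpy as np


def extract_cutouts(image, centers_yx, size):
    """Extract N square cutouts from a 2D image in one indexing operation.

    image:      2D array (rows, cols)
    centers_yx: integer array of shape (N, 2) with (row, col) pixel centres
    size:       side length of each cutout in pixels
    Pixels outside the image are filled with zeros (an assumption).
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant", constant_values=0)

    # After padding by `half`, a cutout centred on (y, x) starts at (y, x).
    offsets = np.arange(size)
    rows = centers_yx[:, 0, None, None] + offsets[None, :, None]  # (N, size, 1)
    cols = centers_yx[:, 1, None, None] + offsets[None, None, :]  # (N, 1, size)

    # Broadcasting the two index arrays yields an (N, size, size) stack.
    return padded[rows, cols]
```

For scale, 30 million sources in under 4 hours corresponds to 30,000,000 / (4 × 3600 s), or more than about 2,000 cutouts per second, which is consistent with the throughput quoted above.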
### Gaia Exoplanet Orbits
Machine learning-based identification of Gaia astrometric exoplanet orbits
Johannes Sahlmann, Pablo Gómez - Monthly Notices of the Royal Astronomical Society, Volume 537, Issue 2, February 2025
ML approach using only Gaia DR3 orbital solutions to identify the best candidates for exoplanets and brown-dwarf companions. Combines semi-supervised anomaly detection methods with extreme gradient boosting and random forest classifiers to determine likely low-mass outliers among ~170,000 astrometric orbit solutions, producing a list of 20 best candidates including confirmed brown dwarfs.
### Euclid Anomaly Detection
Machine learning-driven Anomaly Detection and Forecasting for Euclid Space Telescope Operations
Pablo Gómez, Roland D. Vavrek, Guillermo Buenadicha, John Hoar, Sandor Kruk, Jan Reerink - Presented at IAC 2024
Analyses temperature anomalies in Euclid's telemetry from February to August 2024, focusing on eleven temperature parameters and 35 covariates. Uses a predictive XGBoost model to forecast temperatures and detect anomalies as deviations from predictions, with SHAP analysis for understanding complex parameter relationships. Demonstrates how ML can enhance telemetry monitoring for space missions.
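The "anomalies as deviations from predictions" step can be sketched independently of the forecasting model. The example below takes observed and predicted temperatures and flags samples whose residual is unusually large compared with a robust spread estimate. The predictions would come from the trained model (XGBoost in the paper), which is not reproduced here. The function name, the median/MAD scaling, and the 5-sigma cut are assumptions for illustration, not the thresholds used in the paper.

```python
from statistics import median


def flag_anomalies(observed, predicted, n_sigmas=5.0):
    """Flag samples whose prediction residual is an outlier.

    observed / predicted: equal-length sequences of temperatures.
    Uses the median absolute deviation (MAD) as a robust spread estimate;
    both this choice and n_sigmas are illustrative assumptions.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    centre = median(residuals)
    mad = median([abs(r - centre) for r in residuals])

    # 1.4826 * MAD approximates the standard deviation for Gaussian noise.
    scale = 1.4826 * mad
    return [abs(r - centre) > n_sigmas * scale for r in residuals]


# Hypothetical telemetry: one reading departs strongly from the forecast.
observed = [20.0, 20.1, 19.9, 20.2, 25.0, 20.0]
predicted = [20.0, 20.0, 20.0, 20.0, 20.0, 20.1]
flags = flag_anomalies(observed, predicted)
```

A robust spread estimate is used so that the anomalies themselves do not inflate the threshold.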
### XAMI
XAMI — A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images
Elisabeta-Iulia Dima, Pablo Gómez, Sandor Kruk, Peter Kretschmar, Simon Rosen, Călin-Adrian Popa - Accepted for oral presentation at SPAICE 2024
A dataset of 1000 hand-annotated images from the XMM-Newton Optical Monitor showing different types of artefacts (hot pixels, cosmic rays, satellite trails). Demonstrates a hybrid instance segmentation approach combining CNNs and transformer-based models for accurate artefact detection and masking.
### ChatGPT in Astronomy
Delving into the Utilisation of ChatGPT in Scientific Publications in Astronomy
Simone Astarita, Sandor Kruk, Jan Reerink, Pablo Gómez - Accepted at SPAICE 2024
Extracts words that ChatGPT uses more frequently than humans and searches 1 million articles tracked by the NASA Astrophysics Data System since 2000. Finds a statistically significant increase in ChatGPT-favoured words in 2024 astronomy publications, suggesting widespread adoption of large language models in academic writing.
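The counting step behind this kind of analysis can be sketched in a few lines: for each publication year, compute the share of all tokens that belong to a list of favoured words. The function name, the tokenisation rule, the example word list, and the two toy abstracts below are hypothetical. The paper's actual word list, corpus handling, and significance test are not reproduced here.

```python
import re
from collections import Counter


def favoured_word_rate(abstracts_by_year, favoured_words):
    """Return, per year, the fraction of tokens that are favoured words.

    abstracts_by_year: dict mapping year -> list of abstract strings
    favoured_words:    iterable of words to track (case-insensitive)
    """
    favoured = {w.lower() for w in favoured_words}
    rates = {}
    for year, abstracts in abstracts_by_year.items():
        # Simple lowercase tokenisation; a real pipeline may need more care.
        tokens = [t for text in abstracts for t in re.findall(r"[a-z']+", text.lower())]
        counts = Counter(tokens)
        hits = sum(counts[w] for w in favoured)
        rates[year] = hits / len(tokens) if tokens else 0.0
    return rates


# Toy corpus with a hypothetical word list, for illustration only.
corpus = {
    2019: ["We study galaxy mergers in deep images"],
    2024: ["We delve into intricate galaxy mergers"],
}
rates = favoured_word_rate(corpus, ["delve", "intricate"])
```

Comparing these per-year rates against the pre-ChatGPT baseline is what would then be tested for statistical significance.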
### MCTED
Paper | GitHub | HuggingFace
MCTED: A Machine-Learning-Ready Dataset for Digital Elevation Model Generation From Mars Imagery
Rafał Osadnik, Pablo Gómez, Eleni Bohacek, Rickbir Bahia
Mars CTX Terrain-Elevation Dataset containing 80,898 samples derived from Mars Reconnaissance Orbiter imagery for DEM generation from single orbital images. A small U-Net trained on this dataset outperforms the DepthAnythingV2 foundation model on elevation prediction tasks.
## Other Activities
### Workshop / Competition Organisation
- ESA Datalabs Ariel Hackathon 2025
- Space Optimisation Competition (SpOC) at GECCO 2022
- AI for Space Workshop at CVPR 2021
- Health Hackers Malaria Challenge
- Health Hackers Machine Learning Workshop Series
### Supervised Theses and Internships
#### 2025
- Machine Learning-based Anomaly Detection in XMM-Newton Instrument Telemetry Data
Tristan Dijkstra - M.Sc. thesis at TU Delft - Report
- Gravitational Lens Detection in JWST Imagery using AnomalyMatch
Elisabeta-Iulia Dima - Archival Research Visitor Programme (ARVP) at ESA
- MCTED: Mars CTX Terrain-Elevation Dataset
Rafał Osadnik - ESA National Trainee - Paper, Code
#### 2024
- Instance Segmentation of Space Telescope Artefacts Using Convolutional and Transformer-Based Architecture
Elisabeta-Iulia Dima - B.Sc. thesis at Politehnica University of Timişoara - Paper, Code
- Periodicity in billions of Gaia astrometric timeseries
Anicet Nougaret - ESA Internship
#### 2023
- Trajectory Optimization for a Spacecraft Swarm Orbiting P67/Churyumov-Gerasimenko
Rasmus Maråk - Master's Thesis at KTH - Code, Paper
- Investigation of the Robustness of Neural Density Fields
Jonas Schuhmacher - TUM Guided Research - Code, Report, Paper
- Preliminary Investigation of Alternative Reality Environments for Accessibility Testing in Spacecraft Design
Antonia Sattler - Bachelor's Thesis at Technical University Ingolstadt of Applied Sciences
#### 2022
- Accessibility on a Lunar Base
Antonia Sattler - ESA Internship - Report
- Neural Inverse Design of Nanostructures (NIDN)
Torbjørn Bogen-Storø - ESA Internship - Code, Paper
- Neural Network Approximators of Linear Algebra Operations
Tove Ågren - ESA Internship - Code
- Efficient Polyhedral Gravity Modeling in Modern C++
Jonas Schuhmacher - TUM Interdisciplinary Project - Code, Report
- Modeling Upcoming Megaconstellations in Space Debris Environment Simulations
Albert Noswitz - TUM Bachelor's Thesis - Code, Thesis
- Integrating Numerical Backend Modularity into Torchquad Using Autoray
Fritz Hofmeier - TUM Master's Thesis - Code, Thesis
#### 2021
- Efficient Trajectory Modelling for Space Debris Evolution
Oliver Bösing - TUM Bachelor's Thesis - Code, Thesis
- Efficient Implementation and Evaluation of the NASA Breakup Model in Modern C++
Jonas Schuhmacher - TUM Bachelor's Thesis - Code, Thesis
#### 2018
- Airflow in the Two-Mass Model
F. W. - FAU Master's Research Internship in Electrical Engineering