Table of Contents
  1. Recent Projects
  2. Other Activities

For a full overview of my publications, please refer to Google Scholar.

Below are short summaries, with links, of things I am working on or have worked on.

## Recent Projects

### AnomalyMatch

Six astrophysical anomalies discovered in the Hubble archive — Credit: ESA/Hubble & NASA, D. O'Ryan, P. Gomez, M. Zamani

GitHub

Identifying Astrophysical Anomalies in 99.6 Million Cutouts from the Hubble Legacy Archive Using AnomalyMatch. David O'Ryan, Pablo Gómez. arXiv:2505.03508

We systematically searched approximately 100 million image cutouts from the entire Hubble Legacy Archive using AnomalyMatch, a semi-supervised anomaly detection method. This comprehensive search uncovered 138 new candidate gravitational lenses, 18 jellyfish galaxies, and 417 mergers or interacting galaxies. The method can trawl the complete archive within just 2-3 days, demonstrating its potential for large-scale astronomical surveys including upcoming Euclid data releases.

AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning. Pablo Gómez, David O'Ryan. arXiv:2505.03509

AnomalyMatch is an anomaly detection framework combining semi-supervised FixMatch algorithms with active learning for efficient discovery of rare objects. The method achieves strong performance on astronomical and natural image benchmarks, processing predictions for 100 million images within three days on a single GPU. Integrated into ESA's Datalabs platform, it facilitates targeted discovery of scientifically valuable anomalies in vast astronomical datasets.
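As a rough illustration of the FixMatch idea mentioned above, the unlabeled part of the loss can be sketched in NumPy. This is the generic FixMatch consistency term, not AnomalyMatch's actual code: the function names, the 0.95 confidence threshold, and the toy logits are assumptions made for the example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Cross-entropy of strong-view predictions against confident weak-view pseudo-labels."""
    weak_probs = softmax(weak_logits)
    pseudo_labels = weak_probs.argmax(axis=-1)
    # Only pseudo-labels predicted with high confidence contribute to the loss.
    mask = weak_probs.max(axis=-1) >= threshold
    strong_log_probs = np.log(softmax(strong_logits) + 1e-12)
    per_sample = -strong_log_probs[np.arange(len(pseudo_labels)), pseudo_labels]
    # Average over the whole batch, so masked-out samples count as zero.
    return float((per_sample * mask).mean())

# Hypothetical two-class logits (e.g. nominal vs. anomaly) for two unlabeled images.
weak = np.array([[6.0, 0.0], [0.2, 0.0]])    # first image confident, second not
strong = np.array([[2.0, 0.0], [0.0, 1.0]])  # predictions on strongly augmented views
loss = fixmatch_unlabeled_loss(weak, strong)
```

Here only the first image passes the confidence threshold, so the second image has no influence on the loss regardless of its strong-view prediction.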

### Cutana

Cutana — astronomical image cutout generation

GitHub | Paper

Cutana: A High-Performance Tool for Astronomical Image Cutout Generation at Petabyte Scale. Pablo Gómez, Laslo Erik Ruhberg, Kristin Anett Remmelgas, David O'Ryan

Memory-efficient software for batch processing astronomical images at petabyte scale. Uses vectorized NumPy operations for simultaneous cutout extraction with automated memory-aware scheduling, processing thousands of cutouts per second with near-linear scaling. Built for Euclid's data releases, it processes all 30 million Q1 sources in under 4 hours.
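To give a flavour of what vectorized cutout extraction can look like, here is a minimal NumPy sketch that pulls many square cutouts out of one image with a single advanced-indexing operation. It is not Cutana's implementation: the function name `extract_cutouts`, the pixel-coordinate inputs, and the clipping at the image border are simplifying assumptions.

```python
import numpy as np

def extract_cutouts(image, centers, size):
    """Extract square cutouts of side `size` around (row, col) centers in one indexing call."""
    centers = np.asarray(centers)
    half = size // 2
    offsets = np.arange(size) - half
    # Shift centers inward so every cutout lies fully inside the image (a simplification).
    rows = np.clip(centers[:, 0], half, image.shape[0] - size + half)
    cols = np.clip(centers[:, 1], half, image.shape[1] - size + half)
    # Broadcast to index grids of shape (n, size, 1) and (n, 1, size).
    row_idx = rows[:, None, None] + offsets[None, :, None]
    col_idx = cols[:, None, None] + offsets[None, None, :]
    return image[row_idx, col_idx]  # shape: (n_cutouts, size, size)

image = np.arange(100).reshape(10, 10)  # toy image standing in for a survey tile
cutouts = extract_cutouts(image, [(5, 5), (0, 0)], size=4)
```

The second center sits on the image corner, so its cutout is shifted inward rather than padded. A production tool would also need to handle sky coordinates, padding, and memory limits.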

### Gaia Exoplanet Orbits

Gaia colour-magnitude and mass function diagrams for exoplanet orbit candidates

Paper | arXiv

Machine learning-based identification of Gaia astrometric exoplanet orbits. Johannes Sahlmann, Pablo Gómez. Monthly Notices of the Royal Astronomical Society, Volume 537, Issue 2, February 2025

ML approach using only Gaia DR3 orbital solutions to identify the best candidates for exoplanets and brown-dwarf companions. Combines semi-supervised anomaly detection methods with extreme gradient boosting and random forest classifiers to determine likely low-mass outliers among ~170,000 astrometric orbit solutions, producing a list of 20 best candidates including confirmed brown dwarfs.

### Euclid Anomaly Detection

SHAP analysis of Euclid temperature anomaly detection

Paper

Machine learning-driven Anomaly Detection and Forecasting for Euclid Space Telescope Operations. Pablo Gómez, Roland D. Vavrek, Guillermo Buenadicha, John Hoar, Sandor Kruk, Jan Reerink. Presented at IAC 2024

Analyses temperature anomalies in Euclid's telemetry from February to August 2024, focusing on eleven temperature parameters and 35 covariates. Uses a predictive XGBoost model to forecast temperatures and detect anomalies as deviations from predictions, with SHAP analysis for understanding complex parameter relationships. Demonstrates how ML can enhance telemetry monitoring for space missions.
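The "anomaly as deviation from a prediction" idea can be sketched in a few lines. To keep the example dependency-free, an ordinary least-squares fit stands in for the XGBoost model used in the paper. The synthetic data, the function names, and the 5-sigma threshold based on the median absolute deviation are all assumptions for illustration.

```python
import numpy as np

def fit_linear_forecaster(covariates, temperature):
    # Ordinary least squares with an intercept; a stand-in for the XGBoost model.
    design = np.column_stack([covariates, np.ones(len(covariates))])
    coef, *_ = np.linalg.lstsq(design, temperature, rcond=None)
    return coef

def flag_anomalies(covariates, temperature, coef, n_sigma=5.0):
    design = np.column_stack([covariates, np.ones(len(covariates))])
    residuals = temperature - design @ coef
    # Robust scale estimate from the median absolute deviation of the residuals.
    mad = np.median(np.abs(residuals - np.median(residuals)))
    sigma = 1.4826 * mad
    return np.abs(residuals) > n_sigma * sigma

# Synthetic telemetry: three covariates drive one temperature parameter.
rng = np.random.default_rng(0)
covariates = rng.normal(size=(500, 3))
temperature = covariates @ np.array([0.5, -1.0, 2.0]) + 20.0
temperature += rng.normal(scale=0.05, size=500)
temperature[123] += 2.0  # inject a synthetic temperature excursion

coef = fit_linear_forecaster(covariates, temperature)
flags = flag_anomalies(covariates, temperature, coef)
```

In practice the forecaster would be trained on a period of nominal operations and applied to later telemetry, rather than fitted and evaluated on the same samples as in this toy example.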

### XAMI

Examples of artefacts in XMM-Newton Optical Monitor images

Paper | Code | Dataset

XAMI — A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images. Elisabeta-Iulia Dima, Pablo Gómez, Sandor Kruk, Peter Kretschmar, Simon Rosen, Călin-Adrian Popa. Accepted for oral presentation at SPAICE 2024

A dataset of 1000 hand-annotated images from the XMM-Newton Optical Monitor showing different types of artefacts (hot pixels, cosmic rays, satellite trails). Demonstrates a hybrid instance segmentation approach combining CNNs and transformer-based models for accurate artefact detection and masking.

### ChatGPT in Astronomy

Analysis of ChatGPT word usage in astronomy publications

Paper

Delving into the Utilisation of ChatGPT in Scientific Publications in Astronomy. Simone Astarita, Sandor Kruk, Jan Reerink, Pablo Gómez. Accepted at SPAICE 2024

Extracts words that ChatGPT uses more frequently than humans and searches 1 million articles tracked by the NASA Astrophysics Data System since 2000. Finds a statistically significant increase in ChatGPT-favoured words in 2024 astronomy publications, suggesting widespread adoption of large language models in academic writing.
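One simple way to find words that one corpus uses more often than another is to compare relative word frequencies. The sketch below is only an illustration of that idea, not the procedure from the paper: the function names, the add-one smoothing, the ratio threshold of 2, and the toy sentences are assumptions.

```python
from collections import Counter
import re

def count_words(texts):
    # Lowercase alphabetic tokens, counted across all texts.
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

def overused_words(candidate_texts, reference_texts, min_ratio=2.0):
    """Words whose relative frequency in the candidate corpus exceeds the reference by `min_ratio`."""
    cand = count_words(candidate_texts)
    ref = count_words(reference_texts)
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    ratios = {}
    for word, n in cand.items():
        cand_rate = n / cand_total
        # Add-one smoothing avoids division by zero for words absent from the reference.
        ref_rate = (ref[word] + 1) / (ref_total + 1)
        ratio = cand_rate / ref_rate
        if ratio >= min_ratio:
            ratios[word] = ratio
    return ratios

# Toy corpora standing in for machine-written and human-written abstracts.
machine_texts = ["we delve into the intricate data", "we delve into the results"]
human_texts = ["we look into the data", "we study the results in detail"]
favoured = overused_words(machine_texts, human_texts)
```

The resulting word list could then be counted per publication year to look for a change in usage over time.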

### MCTED

Mars terrain elevation model generation

Paper | GitHub | HuggingFace

MCTED: A Machine-Learning-Ready Dataset for Digital Elevation Model Generation From Mars Imagery. Rafał Osadnik, Pablo Gómez, Eleni Bohacek, Rickbir Bahia

Mars CTX Terrain-Elevation Dataset containing 80,898 samples derived from Mars Reconnaissance Orbiter imagery for DEM generation from single orbital images. A small U-Net trained on this dataset outperforms the DepthAnythingV2 foundation model on elevation prediction tasks.

## Other Activities

### Workshop / Competition Organisation

### Supervised Theses and Internships

#### 2025

#### 2024

#### 2023

#### 2022

#### 2021

#### 2018