Galaxy clusters in the XMM-Euclid FornaX deep field
The XMM Heritage project on the Euclid Deep Field Fornax aims to characterize distant galaxy clusters by comparing X-ray and optical/IR detections. The two methods rely on very different cluster properties; ultimately, their combination will make it possible to set the free parameters of the Euclid cluster selection function over the entire Wide survey, thus constituting a fundamental ingredient of the Euclid cosmological analysis.
The targeted redshift range (1 < z < 2) has never been systematically explored, despite being critical for the use of clusters in cosmology.
With FornaX, we will for the first time have access to a large volume at these redshifts, enabling us to quantify the evolution of clusters statistically: what role do AGN play in shaping the properties of the intracluster gas? Are there massive gas-deficient clusters? What are the respective biases of X-ray and optical detection?
The thesis work will involve (1) building and validating the X-ray cluster catalog; (2) correlating it with the optical/IR catalogs obtained by Euclid; and (3) studying the combined X-ray and optical evolution of the clusters.
All the algorithms for detecting and characterizing clusters in XMM images already exist, but we'll be pushing detection even further by using artificial intelligence techniques (combining spatial and spectral information on sources).
The complex problem of spatial correlation between XMM and Euclid cluster catalogs will also involve AI.
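As a concrete baseline for such a correlation, a purely positional nearest-neighbour match on the sphere is the natural starting point that any AI-based method will have to improve upon. In the sketch below, the input format, the 30-arcsecond matching radius and the function name are illustrative assumptions, not project code:

```python
# Minimal baseline for cross-matching an XMM cluster catalogue with a
# Euclid optical/IR catalogue by sky position only. The 30" radius and
# the input format are illustrative assumptions.
import numpy as np
from astropy.coordinates import SkyCoord
import astropy.units as u

def crossmatch(xmm_ra, xmm_dec, euclid_ra, euclid_dec, max_sep=30.0):
    """Nearest-neighbour match; returns, for each XMM cluster, the index
    of the closest Euclid cluster (-1 if none within max_sep arcsec)."""
    xmm = SkyCoord(ra=xmm_ra * u.deg, dec=xmm_dec * u.deg)
    euclid = SkyCoord(ra=euclid_ra * u.deg, dec=euclid_dec * u.deg)
    idx, sep2d, _ = xmm.match_to_catalog_sky(euclid)
    matched = sep2d < max_sep * u.arcsec
    return np.where(matched, idx, -1), sep2d.to(u.arcsec)
```

An AI-based matcher could, for instance, additionally weigh redshift estimates, richness and detection probabilities rather than relying on position alone.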
Project website: https://fornax.cosmostat.org/
Fast parameter inference of gravitational waves for the LISA space mission
Context
In 2016, the announcement of the first direct detection of gravitational waves ushered in an era in which the universe can be probed in an unprecedented way. At the same time, the complete success of the LISA Pathfinder mission validated key technologies selected for the LISA (Laser Interferometer Space Antenna) project. The year 2024 started with the adoption of the LISA mission by the European Space Agency (ESA) and NASA. This unprecedented gravitational-wave space observatory will consist of three satellites 2.5 million kilometres apart and will enable the direct detection of gravitational waves at frequencies undetectable by terrestrial interferometers. ESA plans a launch in 2035.
In parallel with the technical aspects, the LISA mission introduces several data analysis challenges that need to be addressed for the mission's success. The mission must demonstrate, using simulations, that the scientific community will be able to identify and characterise the detected gravitational wave signals. Data analysis involves various stages, one of which is the rapid analysis pipeline, whose role is to detect new events and characterise them. The latter task involves the rapid estimation of the sky position of the gravitational-wave source and of characteristic times, such as the coalescence time for a black hole merger.
These analysis tools form the low-latency analysis pipeline. Beyond its interest for LISA itself, this pipeline also plays a vital role in enabling multi-messenger astronomy, in which detected events are rapidly followed up by electromagnetic observations (ground-based or space-based observatories, from radio waves to gamma rays).
PhD project
The PhD project focuses on the development of event detection and identification tools for the low-latency alert pipeline (LLAP) of LISA. This pipeline will be an essential part of the LISA analysis workflow, providing rapid detection of massive black hole binaries, together with fast and accurate estimates of the sources' sky localization and coalescence time. This is key information for multi-messenger follow-up as well as for the global analysis of the LISA data.
While rapid analysis methods have been developed for ground-based interferometers, the case of space-based interferometers such as LISA remains largely unexplored. Adapted data processing will have to account for the fact that data are transmitted in packets, making it necessary to detect events from incomplete data. Working with data marred by artefacts such as glitches or missing data packets, these methods should enable the detection, discrimination and analysis of various sources: black hole mergers, EMRIs (extreme-mass-ratio inspirals), bursts, and compact-object binaries. A final and crucial element of complexity is the speed of analysis, which places a strong constraint on the methods to be developed.
To this end, this thesis will tackle the following problems:
1. The fast inference of gravitational-wave parameters, notably the sky position and the coalescence time. Two of the main difficulties reside in the multimodality of the posterior probability distribution of the target parameters and the stringent computing-time requirements. To that end, we will consider several advanced inference strategies (a minimal sampling sketch follows this list), including:
(a) Using gradient-based sampling algorithms like Langevin diffusions or Hamiltonian Monte Carlo methods adapted to LISA’s gravitational wave problem,
(b) Using machine learning-assisted methods to accelerate the sampling (e.g. normalising flows),
(c) Using variational inference techniques.
2. The early detection of black hole mergers.
3. The increasing complexity of LISA data, including, among others, realistic noise, realistic instrument response, glitches, data gaps, and overlapping sources.
4. The online handling of the incoming 5-minute data packets with the developed fast-inference framework.
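To make strategy (a) concrete, here is a minimal Metropolis-adjusted Langevin (MALA) sampler on a toy bimodal two-dimensional posterior; the target density, step size and chain length are arbitrary stand-ins for the actual LISA likelihood. It illustrates why multimodality is hard for gradient-based samplers: a single chain tends to stay in the mode where it starts, which is what the flow-based and variational strategies (b) and (c) are meant to alleviate.

```python
# Toy MALA sampler on a bimodal 2-D target (an illustrative stand-in for
# a multimodal gravitational-wave posterior).
import numpy as np

MU = np.array([[-2.0, 0.0], [2.0, 0.0]])  # two well-separated modes

def log_post(x):
    # equal-weight mixture of two unit-covariance Gaussians (up to a constant)
    d2 = ((x - MU) ** 2).sum(axis=-1)
    m = d2.min()
    return -0.5 * m + np.log(np.exp(-0.5 * (d2 - m)).sum())

def grad_log_post(x):
    d2 = ((x - MU) ** 2).sum(axis=-1)
    w = np.exp(-0.5 * (d2 - d2.min()))
    w /= w.sum()
    return -(w[:, None] * (x - MU)).sum(axis=0)

def mala(x0, n_steps=5000, eps=0.6, seed=0):
    rng = np.random.default_rng(seed)
    x, samples = np.asarray(x0, dtype=float), []
    for _ in range(n_steps):
        # Langevin proposal: drift along the gradient, plus Gaussian noise
        mean_fwd = x + 0.5 * eps**2 * grad_log_post(x)
        y = mean_fwd + eps * rng.standard_normal(2)
        mean_bwd = y + 0.5 * eps**2 * grad_log_post(y)
        # Metropolis-Hastings correction for the asymmetric proposal
        log_q_fwd = -((y - mean_fwd) ** 2).sum() / (2 * eps**2)
        log_q_bwd = -((x - mean_bwd) ** 2).sum() / (2 * eps**2)
        if np.log(rng.uniform()) < log_post(y) - log_post(x) + log_q_bwd - log_q_fwd:
            x = y
        samples.append(x.copy())
    return np.array(samples)

chain = mala([-2.0, 0.0])
print("fraction of samples in the right-hand mode:", (chain[:, 0] > 0).mean())
```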
This thesis will rely on Bayesian and statistical methods for data analysis and machine learning. However, an effort on the physics side is also necessary, both to understand the simulations and the different waveforms considered (with their underlying hypotheses) and to interpret the results regarding the detectability of black hole merger signals in the context of the rapid analysis of LISA data.
Machine-learning methods for the cosmological analysis of weak gravitational lensing images from the Euclid satellite
Weak gravitational lensing, the distortion of the images of high-redshift galaxies by foreground large-scale matter structures, is one of the most promising tools of cosmology to probe the dark sector of the Universe. The statistical analysis of lensing distortions can reveal the dark-matter distribution on large scales. The European space satellite Euclid will measure cosmological parameters to unprecedented accuracy. To achieve this ambitious goal, a number of sources of systematic error have to be quantified and understood. One of the main origins of bias is the detection of galaxies: it depends strongly on the local number density and on whether a galaxy's light emission overlaps with nearby objects. If not handled correctly, such "blended" galaxies will strongly bias any subsequent measurement of weak-lensing image distortions.
The goal of this PhD is to quantify and correct weak-lensing detection biases, in particular those due to blending. To that end, modern machine- and deep-learning algorithms, including auto-differentiation techniques, will be used. These techniques allow for a very efficient estimation of the sensitivity of biases to galaxy and survey properties without the need to create a vast number of simulations. The student will carry out cosmological parameter inference from Euclid weak-lensing data. Bias corrections developed during this thesis will be included either a priori, in galaxy shape measurements, or a posteriori, as nuisance parameters. This will lead to measurements of cosmological parameters with the reliability and robustness required for precision cosmology.
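As a cartoon of the auto-differentiation idea, the snippet below (using JAX) differentiates a deliberately simplified shear "measurement", in which blending dilutes the recovered signal, with respect to the blend fraction; the toy response model is an assumption for illustration, not the Euclid pipeline. The same mechanism, applied to a differentiable image simulation, yields bias sensitivities in a single pass instead of requiring grids of simulations.

```python
# Sensitivity of a (toy) multiplicative shear bias to the blend fraction,
# obtained by automatic differentiation instead of finite differences
# over many simulations. The linear dilution model is illustrative only.
import jax

def measured_shear(g_true, blend_fraction):
    # toy response: blended sources dilute the recovered shear
    response = 1.0 - 0.4 * blend_fraction
    return response * g_true

def multiplicative_bias(blend_fraction, g_true=0.02):
    return measured_shear(g_true, blend_fraction) / g_true - 1.0

# d(bias)/d(blend fraction) in a single autodiff pass
print(jax.grad(multiplicative_bias)(0.15))  # -0.4 for this toy model
```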
Bayesian Inference with Differentiable Simulators for the Joint Analysis of Galaxy Clustering and CMB Lensing
The goal of this PhD project is to develop a novel joint analysis of the DESI galaxy clustering and Planck PR4/ACT CMB lensing data, based on numerical simulations of the surveys and state-of-the-art machine learning and statistical inference techniques. The aim is to overcome many of the limitations of traditional approaches and improve the recovery of cosmological parameters. The joint galaxy clustering and CMB lensing inference will significantly improve constraints on the growth of structure over DESI-only analyses and further refine tests of general relativity.
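The gain can be previewed with a deliberately simplified Fisher forecast: take the galaxy auto-spectrum amplitude to scale as (b*sigma8)^2 and the galaxy x CMB-lensing cross-spectrum as b*sigma8^2, with placeholder noise levels. Clustering alone leaves galaxy bias and growth amplitude perfectly degenerate; adding the cross-spectrum breaks the degeneracy.

```python
# Toy Fisher forecast: galaxy clustering alone vs. clustering + CMB
# lensing cross-correlation. Amplitudes and noise levels are placeholders;
# only the parameter scalings matter here.
import numpy as np

b, s8 = 1.5, 0.8  # fiducial galaxy bias and growth amplitude (sigma8)
obs = {
    "gg": (lambda b, s8: (b * s8) ** 2, 0.05),  # galaxy auto-spectrum
    "gk": (lambda b, s8: b * s8 ** 2, 0.05),    # galaxy x CMB lensing
}

def fisher(keys, eps=1e-6):
    F = np.zeros((2, 2))
    for key in keys:
        model, sigma = obs[key]
        # numerical derivatives of the observable w.r.t. (b, sigma8)
        dO = np.array([
            (model(b + eps, s8) - model(b - eps, s8)) / (2 * eps),
            (model(b, s8 + eps) - model(b, s8 - eps)) / (2 * eps),
        ])
        F += np.outer(dO, dO) / sigma ** 2
    return F

print(np.linalg.det(fisher(["gg"])))        # ~0: b and sigma8 degenerate
print(np.linalg.det(fisher(["gg", "gk"])))  # > 0: degeneracy broken
```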
Source clustering impact on Euclid weak lensing high-order statistics
In the coming years, the Euclid mission will provide measurements of the shapes and positions of billions of galaxies with unprecedented precision. As the light from the background galaxies travels through the Universe, it is deflected by the gravity of cosmic structures, distorting the apparent shapes of galaxies. This effect, known as weak lensing, is the most powerful cosmological probe of the next decade, and it can answer some of the biggest questions in cosmology: What are dark matter and dark energy, and how do cosmic structures form?
The standard approach to weak lensing analysis is to fit the two-point statistics of the data, such as the correlation function of the observed galaxy shapes. However, this data compression is sub-optimal and discards large amounts of information. This has led to the development of several approaches based on high-order statistics, such as third moments, wavelet phase harmonics and field-level analyses. These techniques provide more precise constraints on the parameters of the cosmological model (Ajani et al. 2023). However, with their increasing precision, these methods become sensitive to systematic effects that were negligible in the standard two-point statistics analyses.
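As a minimal example of such a statistic, the snippet below measures the third moment of a smoothed convergence map at several scales. A Gaussian map is used as a placeholder; its third moments vanish on average, which is precisely the non-Gaussian information that two-point statistics cannot capture and that a real lensing map would exhibit.

```python
# Third moments of a smoothed (mock) convergence map at several scales,
# one of the simplest high-order statistics. The Gaussian map is a
# placeholder for a real, non-Gaussian lensing map.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(1)
kappa = rng.standard_normal((512, 512))  # placeholder convergence map

for scale in (2, 4, 8):  # smoothing scales in pixels
    s = gaussian_filter(kappa, sigma=scale)
    print(scale, "px:", np.mean((s - s.mean()) ** 3))
```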
One of these systematics is source clustering, which refers to the non-uniform distribution of the galaxies observed in weak lensing surveys. Rather than being uniformly distributed, the observed galaxies trace the underlying matter density. This clustering causes a correlation between the lensing signal and the galaxy number density, leading to two effects: (1) it modulates the effective redshift distribution of the galaxies, and (2) it correlates the galaxy shape noise with the lensing signal. Although this effect is negligible for two-point statistics (Krause et al. 2021, Linke et al. 2024), it significantly impacts the results of high-order statistics (Gatti et al. 2023). Therefore, accurate modelling of source clustering is critical to applying these new techniques to Euclid’s weak lensing data.
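The mechanism can be seen in miniature in the sketch below, where source counts are Poisson-sampled either uniformly or from a biased density field; the Gaussian field, bias value and mean counts are illustrative assumptions. In the clustered case the counts, and therefore the local shape noise, become visibly correlated with the matter field.

```python
# Source clustering in miniature: galaxy counts sampled from a biased
# density field correlate with the matter field, unlike uniform sampling.
# Field, bias and number density are illustrative placeholders.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(2)
delta = gaussian_filter(rng.standard_normal((256, 256)), 4)
delta /= delta.std()   # unit-variance matter-density proxy
bias, nbar = 1.5, 5.0  # galaxy bias and mean counts per pixel

n_uniform = rng.poisson(nbar, delta.shape)
n_clustered = rng.poisson(nbar * np.clip(1 + 0.2 * bias * delta, 0, None))

# shape noise per pixel scales as 1/sqrt(N), so it now traces the field
for n, label in ((n_uniform, "uniform"), (n_clustered, "clustered")):
    corr = np.corrcoef(delta.ravel(), n.ravel())[0, 1]
    print(label, "count-matter correlation:", round(corr, 3))
```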
In this project, we will develop an inference framework to model source clustering and assess its impact on cosmological constraints from high-order statistics. The objectives of the project are:
1. Develop an inference framework that populates dark matter fields with galaxies, accurately modelling the non-uniform distribution of background galaxies in weak lensing surveys.
2. Quantify the source clustering impact on the cosmological parameters from wavelet transforms and field-level analyses.
3. Incorporate source clustering in emulators of the matter distribution to enable accurate data modelling in the high-order statistics analyses.
With these developments, this project will improve the accuracy of cosmological analyses and the realism of the data modelling, making high-order statistics analyses possible for Euclid data.
Detecting the first clusters of galaxies in the Universe in the maps of the cosmic microwave background
Galaxy clusters, located at the nodes of the cosmic web, are the largest gravitationally bound structures in the Universe. Their abundance and spatial distribution are very sensitive to cosmological parameters, such as the matter density of the Universe. Galaxy clusters thus constitute a powerful cosmological probe. They have proven to be an efficient probe in recent years (Planck, South Pole Telescope, XXL, etc.) and they are expected to make great progress in the coming years (Euclid, Vera Rubin Observatory, Simons Observatory, CMB-S4, etc.).
The cosmological power of galaxy clusters increases with the size of the redshift (z) range covered by the catalogue. Planck detected the most massive clusters in the Universe in the redshift range 0<z<1. SPT and ACT are more sensitive but covered less sky: they detected tens of clusters between z=1 and z=1.5, and a few clusters between z=1.5 and z=2. The next generation of instruments (Simons Observatory starting in 2025 and CMB-S4 starting in 2034) will routinely detect clusters in 1<z<2 and will observe the first clusters formed in the Universe in 2<z<3.
Only experiments studying the cosmic microwave background will be able to observe the hot gas in these first clusters at 2<z<3, thanks to the SZ effect, named after its discoverers Sunyaev and Zel’dovich. This effect is due to the high-energy electrons of the gas, which distort the frequency spectrum of the cosmic microwave background, and it is detectable with current experiments. But the gas is not the only component emitting in galaxy clusters: galaxies inside the clusters can also emit in the radio or the infrared, contaminating the SZ signal. This contamination is weak at z<1 but increases drastically with redshift. The emission from radio and infrared galaxies in clusters is expected to be of the same order of magnitude as the SZ signal at 2<z<3.
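For reference, the non-relativistic thermal SZ distortion takes the closed form Delta T / T_CMB = y * f(x), with f(x) = x*coth(x/2) - 4 and x = h*nu / (k_B * T_CMB), where y is the Compton parameter. The short snippet below evaluates f at typical observing bands and recovers the characteristic decrement below, and increment above, the ~217 GHz null.

```python
# Non-relativistic thermal SZ spectral function f(x) = x*coth(x/2) - 4,
# with x = h*nu / (k_B * T_CMB); negative below ~217 GHz, positive above.
import numpy as np

h, k_B, T_CMB = 6.62607015e-34, 1.380649e-23, 2.7255  # SI units, T_CMB in K

def tsz_spectral_function(nu_ghz):
    x = h * nu_ghz * 1e9 / (k_B * T_CMB)
    return x / np.tanh(x / 2.0) - 4.0

for nu in (95, 150, 217, 353):  # GHz, typical SPT/ACT/Planck bands
    print(nu, "GHz:", round(tsz_spectral_function(nu), 2))
```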
One thus needs to understand and model the emission of the gas as a function of redshift, but also the emission of radio and infrared galaxies inside the clusters to be ready to detect the first clusters in the Universe. Irfu/DPhP developed the first tools for detecting clusters of galaxies in cosmic microwave background data in the 2000s. These tools have been used successfully on Planck data and on ground-based data, such as the data from the SPT experiment. They are efficient at detecting clusters of galaxies whose emission is dominated by the gas, but their performance is unknown when the emission from radio and infrared galaxies is significant.
This thesis will first study and model the radio and infrared emission from galaxies in the clusters detected in the cosmic microwave background data (Planck, SPT and ACT) as a function of redshift.
Secondly, we will quantify the impact of these emissions on existing cluster detection tools, in the redshift range currently probed (0<z<2) and then in the future range (2<z<3).
Finally, based on our knowledge of these radio and infrared emissions from cluster galaxies, we will develop a new extraction tool for high-redshift clusters (2<z<3) that maximizes the detection efficiency and controls the selection function, that is, the number of detected clusters compared to the total number of clusters.
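For context, the classical core of such extraction tools is the matched filter. The single-frequency sketch below, with a toy Gaussian cluster profile and white noise standing in for a realistic multi-frequency pipeline, recovers the position of an injected source that is invisible pixel by pixel.

```python
# Minimal single-map matched filter: the Fourier-space filter is the
# cluster template divided by the noise power, normalised to return an
# unbiased amplitude at the cluster position. Profile and noise are toys.
import numpy as np

def matched_filter(image, template_ft, noise_power=1.0):
    psi = np.conj(template_ft) / noise_power
    psi /= (np.abs(template_ft) ** 2 / noise_power).sum() / image.size
    return np.fft.ifft2(psi * np.fft.fft2(image)).real

n = 256
yy, xx = np.indices((n, n)) - n // 2
profile = np.exp(-(xx**2 + yy**2) / (2 * 4.0**2))  # toy cluster profile
template_ft = np.fft.fft2(np.fft.ifftshift(profile))

rng = np.random.default_rng(3)
data = np.roll(profile, (40, -30), axis=(0, 1)) + rng.standard_normal((n, n))
amp_map = matched_filter(data, template_ft)
print("recovered position:", np.unravel_index(amp_map.argmax(), amp_map.shape))
# expected near (168, 98), i.e. the injected cluster position
```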
The PhD student will join the Simons Observatory and CMB-S4 collaborations.
The biased Cosmic web, from theoretical modelling to observations
The study of the filamentary cosmic web is a paramount aspect of modern research in cosmology. With the advent of extremely large and precise cosmological datasets, now arriving notably from the Euclid space mission, it becomes feasible to study in detail the formation of cosmic structures through gravitational instability. In particular, fine non-linear aspects of this dynamics can be studied theoretically, with the hope of detecting their signatures in real observations. One of the major difficulties in this regard is to link the observed distribution of galaxies along filaments to the underlying matter distribution, for which first-principles models are known. Building on recent, state-of-the-art theoretical developments in gravitational perturbation theory and constrained random field theory, the successful candidate will develop first-principles predictions for statistical observables of the cosmic web (extrema counts, topological estimators, extrema correlation functions; e.g. Pogosyan et al. 2009, MNRAS 396, or Ayçoberry, Barthelemy & Codis 2024, A&A 686), applied to the actual discrete field of galaxies, which traces the total matter only in a biased manner. This model will then be applied to the analysis of Euclid data.
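As a purely numerical stand-in for one of these observables, the snippet below counts extrema (local maxima above a set of thresholds) in a mock Gaussian random field; the thesis aims at first-principles predictions of such counts for biased tracers, for which this kind of measurement provides the observational counterpart.

```python
# Extrema counts on a mock 2-D Gaussian random field: count local maxima
# above increasing thresholds (in units of the field's standard deviation).
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

rng = np.random.default_rng(4)
field = gaussian_filter(rng.standard_normal((512, 512)), 3)
field /= field.std()

# a pixel is a local maximum if it equals the max over its 3x3 neighbourhood
is_peak = field == maximum_filter(field, size=3)
for nu in (0.0, 1.0, 2.0):
    print("threshold", nu, ":", int((is_peak & (field > nu)).sum()), "peaks")
```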