Building a first-in-class host-based multi-omic risk model for breast cancer using long-read sequencing
Programme: HER-CARE Marie Skłodowska-Curie Doctoral Network
Host institution: Gustave Roussy, INSERM U981, Villejuif / Grand Paris, France
PhD-awarding institution: Université Paris-Saclay
Duration: 36 months
Supervision: Dr Bruno Duso, with Dr Suzette Delaloge and an interdisciplinary team in genomics, biostatistics, computational oncology and cancer prevention
Secondments: University of Cambridge Early Cancer Institute and Oxford Nanopore Technologies
Application deadline: 31/07/2026
Expected start date: 05/10/2026
A next generation of cancer risk prediction
We are looking for an exceptional doctoral candidate to help build a new class of risk model.
This project is not another tumour-detection test. It is not a conventional liquid biopsy, and it does not compete with single-cancer or multi-cancer early detection assays. Instead, the ambition is to ask: can we read the biology of the host before malignancy is clinically visible, and identify individuals whose molecular landscape indicates a higher risk of developing breast cancer?
Current breast cancer risk prediction relies heavily on clinical variables, family history, breast density and germline genetics, including polygenic scores. These tools are important, but they remain incomplete. They capture only part of the biological reality that determines whether cancer will emerge. This PhD will push beyond static genotype-based risk prediction towards models that combine clinical, genetic and molecular information into a more dynamic estimate of risk.
Using saliva/oral samples and Oxford Nanopore long-read sequencing, the candidate will help build models that integrate:
- Established clinical and genetic risk information;
- Long-read-derived genomic and epigenomic profiles;
- Metagenomic and host-microbial information;
- Additional molecular features available from long-read data.
The goal is to move breast cancer risk prediction beyond demographics and inherited risk alone, towards models informed by measurable host biology.
The scientific setting
The project is embedded in HER-CARE, a European Commission-funded Marie Skłodowska-Curie Doctoral Network dedicated to hereditary and early-onset breast cancer. The successful candidate will be based at Gustave Roussy, Europe’s leading comprehensive cancer centre, within INSERM U981, the Interception Programme and the Interception Translational Research Laboratory. The work will bring together cancer prevention, long-read genomics, computational biology, biostatistics and translational oncology around one question: can measurable biology in the pre-diagnostic window refine breast cancer risk prediction?
The project will use the MyPeBS trial, the largest European randomised study of risk-based breast cancer screening, as its clinical discovery setting. MyPeBS enrolled more than 53,000 women aged 40-74 to compare standard national screening with a personalised strategy based on five-year invasive breast cancer risk. Within this context, the candidate will work with prospectively collected non-invasive samples and rich clinical/risk information from women followed over time, asking whether molecular information measured before diagnosis can refine current approaches to breast cancer risk stratification.
The PhD will be developed alongside a dedicated postdoctoral/data-science position working on the same programme. The candidate will therefore join an active local effort with complementary expertise in sequencing, bioinformatic processing, prediction modelling, validation and biological interpretation, rather than carrying the project alone. The shared objective is not to add another molecular signature to the literature, but to build a risk model that can be validated, interpreted and eventually tested as part of a new logic for cancer prevention.
Your PhD project
The candidate will develop and validate computational approaches to connect long-read-derived molecular information with future breast cancer risk. The work will require rigorous bioinformatics, careful statistical thinking and a constant connection between model performance and biological meaning.
You will:
- Process and analyse Oxford Nanopore long-read sequencing data from saliva samples;
- Develop reproducible pipelines to extract and harmonise long-read-derived molecular features from minimally invasive samples;
- Integrate molecular data with established clinical and genetic risk information;
- Build machine-learning and statistical models for breast cancer risk stratification;
- Evaluate whether multi-layer biological features improve on current demographic/genotype-based risk models;
- Explore the biological interpretation of predictive signals, with careful attention to what they may reveal about host susceptibility, early disease biology and prevention-relevant risk states;
- Work with experts in cancer prevention, computational oncology, genomics, biostatistics and applied mathematics;
- Produce publishable, reproducible analyses and present the work within HER-CARE and at international scientific meetings.
The project will be anchored in the Interception Translational Research Laboratory and the broader Interception Programme at Gustave Roussy. The candidate will work with the Gustave Roussy Genomics Platform on long-read sequencing strategy, sample processing and ONT data generation; with Computational Oncology and Clinical Discovery Bioinformatics on long-read bioinformatic workflows and biological interpretation; with the Biostatistics / Oncostat teams on study design, performance assessment and validation; and with MICS at CentraleSupélec on advanced modelling and multi-layer data integration.
International and industry training
This is a Marie Skłodowska-Curie Doctoral Network position, designed to train a new generation of scientists across disciplines, sectors and countries.
The PhD trajectory will include:
- A research secondment at the Early Cancer Institute, University of Cambridge, with Prof Rebecca Fitzgerald’s team, learning from a programme that has combined fundamental work on early carcinogenesis with the development and clinical translation of practical early-detection assays;
- An industry secondment / summer placement at Oxford Nanopore Technologies (ONT), the developer of the sequencing platform used in the project. This will give the candidate access to nanopore sequencing not only as a user, but also from within the environment where its applications, workflows and strategic direction are shaped. ONT’s vision (to “enable the analysis of anything, by anyone, anywhere”) is directly aligned with the ambition of this project to move long-read-enabled risk profiling towards accessible, scalable and clinically useful prevention;
- Network-wide HER-CARE training in breast cancer biology, epidemiology, applied AI, multi-omics, entrepreneurship, communication and transferable skills;
- Mentorship from clinicians, computational scientists, data scientists, geneticists and industry partners.
Who we are looking for
This PhD is for a candidate who wants to turn complex molecular data into clinically credible risk prediction, someone who is excited by the challenge of building models that are not only accurate, but also biologically interpretable and useful for prevention.
Essential profile
You should have:
- A Master’s degree or equivalent in data science, bioinformatics, computational biology, biostatistics, computer science, biomedical engineering, genomics, systems biology, or a closely related field;
- Strong programming skills in Python, R, or equivalent;
- A serious interest in machine learning, statistical modelling and predictive modelling;
- Motivation to work with complex, high-dimensional biological data;
- Ability to work in a multidisciplinary environment with clinicians, biologists, engineers and data scientists;
- Excellent written and spoken English.
Highly desirable
Experience or strong interest in one or more of the following will be a major advantage:
- Long-read sequencing or next-generation sequencing data analysis;
- Cancer genomics, epigenomics or metagenomics;
- Multi-omic data integration;
- Machine learning, statistical modelling or clinical prediction;
- Epidemiological study designs, especially analyses nested within prospective cohorts or trials;
- Model validation, calibration, discrimination and interpretability;
- Reproducible workflows, Git, Snakemake / Nextflow, containers or HPC;
- Breast cancer biology, cancer prevention, risk stratification or personalised screening.
Wet-lab experience is welcome but not mandatory. The project includes exposure to sequencing workflows and experimental/translational environments, but the core of the PhD is computational: data analysis and model building.
What you will gain
The training will prepare the candidate for potential careers in academia, biotech, diagnostics, pharma, AI-enabled health technology, precision prevention and translational oncology. It will also place them within the HER-CARE MSCA network, with access to an international community of doctoral candidates, senior scientists, clinicians and industry partners working on hereditary and early-onset breast cancer.
Eligibility
Applicants must comply with Marie Skłodowska-Curie Doctoral Network eligibility rules:
- You must not already hold a doctoral degree at the date of recruitment;
- You must not have resided or carried out your main activity (work, studies, etc.) in France for more than 12 months during the 36 months immediately before recruitment;
- You must be able to start on the agreed date and commit to full-time employment for the duration of the contract;
- Good spoken and written English is required.
Supporting documentation for the MSCA mobility rule will be requested.
Application
Please submit a single PDF including:
- A motivation letter explaining why this project excites you and how your background fits;
- A detailed CV;
- Academic transcripts;
- Contact details for at least two referees;
- A short statement confirming your MSCA eligibility, including current/previous countries of residence or main activity during the last 36 months.
Applications should be sent to:
bruno.achutti-duso@gustaveroussy.fr and suzette.delaloge@gustaveroussy.fr.
Email subject:
MSCA HER-CARE DC8 Application – Host-Based Multi-Omic Risk Prediction
You will help build a new way of thinking about cancer risk by asking not simply whether a tumour can be detected, but whether measurable biology in the pre-diagnostic window can improve risk prediction. You will work with unique prospective samples, long-read sequencing, world-class clinical and computational teams, and partners at Cambridge and Oxford Nanopore Technologies.
This is a chance to help define the next generation of precision cancer prevention.