A Hidden Bias in Template RVs Echoes Through Exoplanet Science

The hunt for planets beyond our solar system rides on tiny, almost invisible nudges from starlight. Astronomers measure a star’s wobble—its radial velocity (RV)—to infer a planet’s presence, mass, and orbit. Over the past decades, template-based RV methods have become a central tool for turning a spectrum into a velocity signal, especially for cooler stars where the spectrum is rich with lines. But a new study led by A. M. Silva and colleagues, carried out at the Instituto de Astrofísica e Ciências do Espaço (University of Porto) and partners across Europe, reveals a subtle trap inside these very methods. When researchers build a high-SNR stellar template from observations gathered within a short window—think a single night—the RVs they extract can drift in a quasi-linear way over the night. Not a planetary signal, but a spurious drift born from how the template is crafted. This is a warning shot about the precision we demand from our instruments and the data-driven models we trust to interpret them.

The team behind the work is a collaboration anchored at Portugal’s Instituto de Astrofísica e Ciências do Espaço (IA, the Institute of Astrophysics and Space Sciences) and the University of Porto, with coauthors across Switzerland, Spain, Italy, and Canada. The lead author, A. M. Silva, and colleagues used data from ESPRESSO and HARPS, two of the world’s most precise optical spectrographs, to show that template-based RV extraction, including both template-matching (TM) and line-by-line (LBL) approaches, can pick up a bias of several meters per second per hour when the template is built from a short baseline of observations. The bias does not appear in RVs from the traditional cross-correlation function (CCF) method, which has important consequences for how we interpret intra-night measurements and, more broadly, how we design observations for certain science goals. The paper is a clear reminder that the devil is in the data preparation as much as in the data analysis.

What the bias looks like

Picture a night of observations where every spectrum is piled into a single, shared stellar template. The authors report a quasi-linear drift in the template-based RVs that can range from a fraction of a meter per second per hour up to tens of meters per second per hour, depending on the star and the data quality. In their sample, slopes ranged roughly from -0.3 m s⁻¹ h⁻¹ to -52 m s⁻¹ h⁻¹. That means the longer you observe within a single night, the further the extracted RV crawls downward. The cause is not that the star is actually accelerating, but that the template was born from data that share a time window and, crucially, a shared pattern of telluric lines and detector quirks.
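To make the idea of an intra-night slope concrete, here is a minimal Python sketch of how such a drift could be quantified: a weighted straight-line fit to one night of RVs. It is not the authors’ pipeline. The function name `intra_night_slope`, the synthetic data, and the noise level are all assumptions chosen for illustration, with the injected slope set to the small end of the range quoted above.

```python
import numpy as np

def intra_night_slope(times_h, rvs, rv_errs):
    """Weighted least-squares line fit.

    Returns the slope (m/s per hour) and its 1-sigma uncertainty.
    """
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(rvs, dtype=float)
    w = 1.0 / np.asarray(rv_errs, dtype=float) ** 2
    # Centering on the weighted mean time decouples the slope from the offset.
    dt = t - np.sum(w * t) / np.sum(w)
    slope = np.sum(w * dt * y) / np.sum(w * dt**2)
    slope_err = np.sqrt(1.0 / np.sum(w * dt**2))
    return slope, slope_err

# Synthetic night (assumed values): 40 exposures over 6 hours, 0.3 m/s errors.
rng = np.random.default_rng(0)
times_h = np.linspace(0.0, 6.0, 40)
true_slope = -0.3                       # m/s per hour, injected by hand
rv_errs = np.full(times_h.size, 0.3)
rvs = true_slope * times_h + rng.normal(0.0, rv_errs)

slope, slope_err = intra_night_slope(times_h, rvs, rv_errs)
print(f"slope = {slope:.2f} +/- {slope_err:.2f} m/s per hour")
```

Running the same fit on CCF RVs and on template-based RVs from the same night would be one simple way to see whether only the latter shows a significant slope.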

The bias shows up across multiple TM pipelines and a prominent LBL pipeline, and it travels with observations from different spectrographs. In sharp contrast, the CCF RVs do not display this trend. The implication is not that template-based RVs are broken; rather, it’s a reminder that the method’s data-driven heart can carry residual footprints of the environment and instrument if not carefully controlled. It also means that a science case relying on rapid, intra-night RVs—such as asteroseismology, transit spectroscopy, or atmospheric studies of exoplanets—could misinterpret a spurious drift as a physical signal if the template is built from a narrow temporal slice.

Where the bias comes from

The authors conducted a meticulous, multi-pronged exploration to identify the culprit. One of the most telling clues is that when templates are built from observations spanning many nights or even years, the spurious drift largely vanishes. In striking contrast, templates assembled from a single night consistently produced the suspicious linear trend. This points to the template itself as the contaminant, not the star or the velocity signal of interest. The team then peeled back another layer: the effect is stronger in the red part of the spectrum and appears even in LBL methods that do not rely on a continuum in the same way as TM methods. All of this points toward a common hypothesis: micro-telluric features (tiny, shallow absorption features from Earth’s atmosphere) and other patterns internal to the detector that stay coherent over a night leave a subtle imprint on the template when it’s built from short-baseline data. When those faint telluric stamps combine with the detector’s own stable patterns, the template ends up “carrying” those features into the RV alignment process, biasing the RV time series for that night.
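The dilution argument can be illustrated with a toy model. The Python sketch below is not the authors’ analysis: it stacks synthetic spectra containing one stellar line and one shallow telluric line, and compares how much of the telluric survives in a single-night template versus a multi-year one. Every number in it (line depths, widths, the BERV ranges, the helper names) is an assumption made for illustration.

```python
import numpy as np

def gaussian_line(v, center, depth, sigma):
    """Normalized flux of a single Gaussian absorption line on a velocity grid."""
    return 1.0 - depth * np.exp(-0.5 * ((v - center) / sigma) ** 2)

def build_template(berv_kms, v_grid, tel_depth=0.01, tel_sigma=2.0):
    """Mean-stack toy spectra in the stellar rest frame.

    The stellar line stays at v = 0, while one shallow telluric line
    (hypothetical depth and width) lands at a position set by each BERV.
    """
    star = gaussian_line(v_grid, 0.0, 0.5, 3.0)
    spectra = [star * gaussian_line(v_grid, 10.0 + b, tel_depth, tel_sigma)
               for b in berv_kms]
    return np.mean(spectra, axis=0), star

def residual_telluric_depth(template, star):
    """Deepest telluric imprint left in the template, as a fraction of the continuum."""
    return 1.0 - np.min(template / star)

v_grid = np.linspace(-60.0, 80.0, 2801)       # km/s, toy velocity grid
single_night = np.linspace(5.0, 5.3, 30)      # BERV changes by ~0.3 km/s (assumed)
multi_year = np.linspace(-25.0, 25.0, 30)     # BERV sampled across the year (assumed)

for label, berv in [("single night", single_night), ("multi-year", multi_year)]:
    template, star = build_template(berv, v_grid)
    depth = residual_telluric_depth(template, star)
    print(f"{label}: residual telluric depth = {depth:.4f}")
```

In this toy setup the single-night template keeps the telluric line at nearly full depth, because it barely moves relative to the stellar line, while the multi-year stack smears it over a wide velocity range. The sketch shows only the contamination of the template, not the resulting RV drift.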

The paper also digs into the role of the barycentric Earth radial velocity (BERV): the component of Earth’s motion around the solar system barycenter along the line of sight to the star, which shifts the stellar spectrum across the detector over the course of the year. By grouping observations into bins of BERV and comparing RVs, the authors show that the bias persists even when you account for the spectral lines moving across the detector. The red detector bears the larger bias, consistent with telluric lines tending to be deeper there, but the blue side isn’t entirely free of it. The conclusion is sobering: the bias is not a fluke of a single instrument or dataset; it appears to be a consequence of stacking data into a single template, in a regime where telluric and detector systematics are coherent on timescales of hours.
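As a rough illustration of what grouping by BERV can look like in practice, the Python sketch below splits observations into equal-width BERV bins and reports the mean RV in each. It is a generic binning recipe under assumed inputs, and may well differ from the paper’s actual procedure; the function name `mean_rv_per_berv_bin` and the sample values are hypothetical.

```python
import numpy as np

def mean_rv_per_berv_bin(berv_kms, rvs, n_bins=4):
    """Group RVs into equal-width BERV bins.

    Returns a list of (bin_low, bin_high, mean_rv, n_obs) for non-empty bins.
    """
    berv = np.asarray(berv_kms, dtype=float)
    rv = np.asarray(rvs, dtype=float)
    edges = np.linspace(berv.min(), berv.max(), n_bins + 1)
    # digitize returns 1-based bin indices; clip so the maximum BERV
    # falls in the last bin instead of overflowing past it.
    idx = np.clip(np.digitize(berv, edges) - 1, 0, n_bins - 1)
    summary = []
    for k in range(n_bins):
        sel = idx == k
        if sel.any():
            summary.append((edges[k], edges[k + 1], rv[sel].mean(), int(sel.sum())))
    return summary

# Hypothetical observations: BERV in km/s, RVs in m/s.
berv_kms = [-20.0, -10.0, 10.0, 20.0]
rvs = [1.0, 3.0, 5.0, 7.0]
for low, high, mean_rv, n_obs in mean_rv_per_berv_bin(berv_kms, rvs, n_bins=2):
    print(f"BERV {low:+.1f} to {high:+.1f} km/s: mean RV = {mean_rv:.2f} m/s (n = {n_obs})")
```

Comparing the per-bin means, or building one template per bin, is one way to check whether a trend follows the position of the lines on the detector.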

Why this matters for science—and for what kinds of science it hurts

The most immediate implication is practical: template-based RVs are not automatically immune to time-correlated systematics on intra-night scales. For planet hunting, this may be manageable because exoplanet signals are spread over longer timescales, and broad campaigns naturally sample many BERVs and nights. The authors show that for exoplanet detection and characterization, the bias is effectively averaged out when data are collected across a reasonable span of BERV, which is the common practice in dedicated RV surveys. However, other science cases are more fragile. Asteroseismology studies, which chase the star’s solar-like oscillations on short timescales, and transmission spectroscopy, where one analyzes spectra during a planet’s transit to tease out an atmosphere’s fingerprint, could be misled if a quasi-linear drift mimics a real signal within a single night. Even subtle biases can masquerade as real signals when you’re listening for flickers in starlight at the level of a few meters per second or less.

What’s remarkable is not just that a bias exists, but that it reveals something deeper about how we construct templates from real, messy data. The template is meant to be a faithful, high-SNR model of the star’s spectrum, built by stacking many observations. But stacking builds a mirror of the data’s own flaws, and if those flaws are coherent within a narrow time window, they bleed into the model. In other words, our best proxy for the star—the template—can become a little too good at echoing Earth’s atmospheric quirks and instrument wiggles when we build it with too few nights of data. The result is a bias that is real, measurable, and not captured by the usual activity indicators used in CCF analyses.

How to fix it, and what the authors recommend

The authors do not claim template-based RVs are broken beyond repair. Instead, they offer practical guidance for future work and observational planning. The simplest fix is to create templates from data gathered over many nights, preferably across a wide span of BERV, so that telluric and detector systematics are averaged out rather than reinforced. In their τ Ceti analysis, templates built from observations spanning multiple nights showed no meaningful intra-night slope, while single-night templates did. The lesson is not about throwing away templates; it’s about how we curate the data that goes into those templates. The longer the temporal baseline of observations used to train the stellar model, the more the night’s systematic quirks get diluted.

Masking and telluric correction strategies can help, but the study finds they do not completely eliminate the effect. In fact, relaxing telluric masks tends to amplify the bias, while applying corrections still leaves room for residual micro-telluric features to slip through into the template. A robust mitigation, then, combines careful telluric treatment with deliberate temporal separation between the data used to build the template and the observations it’s used to model. The authors also hint at other clever ideas: using templates derived from a different, but similar, star; or engineering template construction workflows that explicitly model and marginalize not just flux differences but also potential systematic flux contaminants that drift with the detector over hours.

Another practical takeaway is methodological: plan short-baseline campaigns with an awareness of how sensitive the results are to template construction. If a project’s science goals require ultra-high precision on intra-night timescales, you’ll want to ensure that templates are less likely to inherit a night’s idiosyncrasies. This may mean integrating data from broader, multi-night campaigns or employing alternative RV extraction strategies as cross-checks. The authors are candid that this does not invalidate template-based RVs, but it does demand a more nuanced view of how templates are built and how their biases might creep into short-timescale measurements.

A broader view: a reminder, not a retreat from template methods

This work sits at an interesting crossroads for astronomy. Template-based RVs have propelled many discoveries, from nearby Neptune-sized worlds to Earth-like candidates around cool dwarfs. The new findings do not so much dethrone templates as reframe how we use them. They remind us that data-driven models inherit the world they learn from—the Earth’s atmosphere, the detector’s quirks, and the observational cadence. The goal is not to abandon template methods but to refine how we assemble the data that builds them, especially when we’re chasing subtle signals on tight time scales.

In the end, Silva and colleagues offer more than a caution. They chart a practical path forward: prioritize longer, multi-night templates; scrutinize how tellurics and fixed-pattern noise can seep into a stellar model; and remain open to cross-checks with alternative RV extraction methods. As exoplanet science marches toward ever-smaller planets and ever-more precise characterizations, such vigilance will become an essential partner to innovation. The study is a reminder that precision in astronomy is a dance between instrument design, data processing, and thoughtful observing strategies—and that sometimes the most profound insights come from the quiet, almost invisible biases that emerge when we look too closely at the templates we trust to tell the truth about distant worlds.

Lead researchers and affiliations: The work was led by A. M. Silva of the Instituto de Astrofísica e Ciências do Espaço, CAUP, Universidade do Porto, Portugal, with collaborators from Universidade do Porto, Instituto de Astrofísica de Lisboa, Univ. de Genève, and partner institutions in Europe and Canada. The study sits within the broader network of European exoplanet and asteroseismology programs that rely on ESPRESSO and HARPS data.