The quest for truly honest 2D data maps reshapes data science.

In the age of big data, we love a good map. A scatter of numbers becomes a landscape you can walk through, recognize, or argue about. Nonlinear dimension reduction (NLDR) is the toolbox that makes that landscape by squashing high-dimensional data into a friendly two-dimensional view. It promises clarity: clusters emerge, patterns glow, and complex relationships simplify into something we can discuss over coffee. The catch is that the maps aren’t neutral. Different NLDR methods — the famous tSNE, UMAP, PHATE, and newer cousins — can spit out wildly different pictures from the same data. It’s not just a matter of taste; it’s a reliability issue. Are you looking at a genuine structure, or a clever illusion crafted by your chosen method and its settings?

That was the central worry tackled by a team of researchers at Monash University (Econometrics & Business Statistics) led by Jayani P. Gamage, with coauthors including Dianne Cook, Paul Harrison, Michael Lydeamore, and Thiyanga S. Talagala, plus collaborators in Sri Lanka. Their paper Stop Lying to Me: New Visual Tools to Choose the Most Honest Nonlinear Dimension Reduction doesn’t promise to replace NLDR. Instead, it offers a way to check whether the 2-D view you’re using truly reflects the structure of the original high-dimensional space. Think of it as a “model-in-the-data-space” approach: you treat the NLDR layout as a provisional model, then lift it back into the original space to see how well it fits, where it fits, and where it inevitably misfits.

Why does this matter beyond nerdy diagnostics? Because NLDR shapes how we communicate data stories to others — scientists, policymakers, and curious readers alike. If a visualization exaggerates cluster separations or hides subtler connections, it can steer decisions in directions that aren’t warranted. The Monash team doesn’t just criticize a style of visualization; they hand us a principled toolkit to rate, compare, and even predict how a given NLDR representation behaves in the full data space. They also show that the problem isn’t just theoretical: in real-world datasets like single-cell gene expression (PBMC3k) and even handwritten digits (MNIST), the choice of NLDR layout can materially shift what we think we’re seeing. The goal, loudly and clearly, is honesty in the visuals.

NLDR can lie to you in plain sight

Nonlinear dimension reduction aims to reveal the shape of data that lives in far more dimensions than our eyes can handle. Methods like tSNE and UMAP are designed to reveal local neighborhoods and global structure in creative, high-contrast ways. But that design comes at a price: different methods, or even different parameter choices for the same method, can paint very different pictures of the same data. The paper shows, in striking fashion, how eight different NLDR representations of the same data can look like wildly different universes. Some layouts scream with clearly separated clusters; others sketch long, dangling strings or half-empty spaces where data is sparse. The lesson is not that NLDR is broken, but that it can paint the data with a brush that exaggerates, demotes, or even invents structure.

To make matters trickier, these representations are often taken as evidence of real groupings or trajectories. When a researcher says, “these are three cell types,” or “this is a developmental path,” the 2-D view may be doing a lot of interpretive work that the data don’t actually warrant. The paper invites us to pause and test: are these clusters genuine, or are they artifacts of the method used to generate the 2-D map? The authors emphasize a practical point that resonates with anyone who’s tried to compare multiple visualizations: you may end up trusting the prettiest map rather than the most truthful one. That is a dangerous habit in any data story.

The team also foregrounds a powerful companion idea: the “tour” of linear projections. A tour is essentially a movie of all possible 2-D views along different directions in the data space. It is a way to see what the NLDR map is hiding or exaggerating by comparing the nonlinear layout against the faithful, linear slices of the same data. When you overlay a wireframe model that lives in the p-dimensional space, you stop looking at a static picture and start watching the data breathe. This blend of an honest, data-space check with a 2-D narrative is, in the authors’ words, a meaningful step toward stopping the lying in NLDR visualizations.

A wireframe map that travels back to the data

The core technical move is elegant in its simplicity and its clarity: treat the NLDR layout as a fitted model and then lift that model back into the original high-dimensional space. The authors formalize the NLDR mapping as a function g that takes the high-dimensional data (p dimensions) and returns a 2-D layout. But rather than leaving it at “this is the 2-D map,” they augment the map with neighborhood information so it can be reinterpreted in p-D.

How does that work in practice? First, they scale the data and then lay a hexagonal grid over the 2-D NLDR layout. Why hexagons? They’re efficient at packing space evenly and avoid some of the distortions a square grid introduces in curved or elongated shapes. Each hexagon has a centroid, and every observation in the 2-D space is assigned to the nearest centroid. This reduces the data to a manageable number of bins while preserving the overall geometry of the layout.
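To make the binning step concrete, here is a minimal sketch in Python (the paper's own implementation is the R package quollr; the function names `hex_centroids` and `assign_bins` are illustrative, not from the package). It lays a staggered hexagonal grid of centroids over the unit square and assigns each 2-D point to its nearest centroid by brute force:

```python
import numpy as np

def hex_centroids(bin_width, x_range=(0.0, 1.0), y_range=(0.0, 1.0)):
    """Centroids of a hexagonal grid covering the given range.
    Rows are spaced bin_width * sqrt(3)/2 apart; odd rows shift right
    by half a bin width, which produces the staggered hex packing."""
    dy = bin_width * np.sqrt(3) / 2
    centroids, row, y = [], 0, y_range[0]
    while y <= y_range[1] + dy:
        x = x_range[0] + ((bin_width / 2) if (row % 2) else 0.0)
        while x <= x_range[1] + bin_width:
            centroids.append((x, y))
            x += bin_width
        y += dy
        row += 1
    return np.array(centroids)

def assign_bins(layout_2d, centroids):
    """Assign each 2-D observation to its nearest centroid (brute force)."""
    d2 = ((layout_2d[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)
```

Brute-force nearest-centroid search is quadratic but fine at this scale, since the whole point of binning is to reduce thousands of observations to a few hundred centroids.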

Next comes the neighborhood wiring. Delaunay triangulation connects the bin centroids, creating a lightweight “wireframe” of neighboring bin centers. This is the backbone of the 2-D model that will be lifted back into high-dimensional space. The lifting step is where the cleverness shines: for each bin, you compute the p-D mean of all observations that fell into that bin. This gives you a high-dimensional centroid corresponding to each 2-D hexagon. Those high-D centroids tell you where the 2-D layout is pointing in p-D.
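The lifting step can be sketched in a few lines (again a simplified illustration, not the quollr implementation; in practice the wireframe edges would come from a Delaunay triangulation of the non-empty bin centroids, e.g. via `scipy.spatial.Delaunay`). Each 2-D bin's high-dimensional centroid is just the mean of the p-D observations that landed in it:

```python
import numpy as np

def lift_to_pD(data_pD, bin_ids, n_bins):
    """For each 2-D bin, the lifted p-D centroid is the mean of the
    high-dimensional observations assigned to that bin.
    Empty bins are left as NaN and dropped from the wireframe."""
    p = data_pD.shape[1]
    centroids = np.full((n_bins, p), np.nan)
    for b in np.unique(bin_ids):
        centroids[b] = data_pD[bin_ids == b].mean(axis=0)
    return centroids
```

The resulting array of p-D centroids, joined by the triangulation edges, is the wireframe that can be overlaid on a tour of the original data space.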

With this bridge in place, you can see whether the 2-D layout faithfully encodes actual structure in the data, or whether it’s distortive hype. If the NLDR view is faithful, the bin’s high-D centroid should line up nicely with the spread of points within that region of the original space. If not, the mismatch lives in the residuals — and the residuals are what you want to measure and understand, not pretend they don’t exist.

There’s more. The authors describe how to predict a new observation’s 2-D location by simply mapping it to the nearest 2-D bin’s centroid. This opens a path to uncertainty checks and train/test splits in a world where 2-D maps routinely serve as baseline narratives for complex data.
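A hedged sketch of that prediction step, under the assumption that "nearest bin" for a new observation is judged in the original p-D space (the function name `predict_2d` is illustrative):

```python
import numpy as np

def predict_2d(new_obs_pD, centroids_pD, centroids_2d):
    """Place a new p-D observation at the 2-D centroid of the bin
    whose lifted p-D centroid is closest to it. Assumes the empty
    bins have already been dropped from both centroid arrays."""
    d2 = ((centroids_pD - new_obs_pD) ** 2).sum(axis=1)
    return centroids_2d[d2.argmin()]
```

Because prediction needs only the centroids, you can fit the model on a training split and score held-out observations, which is what makes train/test-style uncertainty checks possible for a 2-D map.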

Measuring truth in a 2-D dream

The method isn’t just a pretty overlay. It turns the 2-D NLDR representation into a concrete statistical model, one that you can diagnose and compare across methods. The authors adopt a familiar structure from factor analysis and multidimensional scaling: the 2-D layout explains a portion of the variance in the original data, and the discrepancy between the 2-D layout’s predictions and the real p-D coordinates is the residual, measured in p-D space.

Crucially, they quantify fit with a straightforward, interpretable statistic: the root-mean-squared error (RMSE) of these residuals across all bins and observations. The lower the RMSE, the closer the 2-D representation is to faithfully recapitulating the high-dimensional structure. But the twist is in the tuning: bin width matters. If you make bins too large, you gloss over detail and distort the fit; if you make them too small, each bin captures only a handful of points and the statistic becomes volatile. The authors don’t pretend there’s a single magic bin size. Instead, they advocate exploring a range of bin widths, tracking RMSE, the fraction of non-empty bins, and the density distribution across bins. The best representation is not the one with the prettiest clusters, but the one that yields consistently low RMSE across a sensible spectrum of bin sizes.
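The residual statistic itself is simple; here is one plausible reading of it as a sketch (the exact aggregation in the paper may differ, and `rmse_in_pD` is an illustrative name). Each observation's residual is its p-D coordinates minus the lifted centroid of its 2-D bin, and the RMSE averages the squared residuals over all points and dimensions:

```python
import numpy as np

def rmse_in_pD(data_pD, bin_ids, centroids_pD):
    """RMSE of p-D residuals: each observation minus the lifted
    p-D centroid of its assigned 2-D bin, with the mean taken
    over all observations and all p dimensions."""
    residuals = data_pD - centroids_pD[bin_ids]
    return float(np.sqrt((residuals ** 2).mean()))
```

In practice you would recompute this statistic for each candidate bin width and each NLDR layout, then compare the RMSE curves rather than any single number.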

They also tie in the human-in-the-loop idea of interactivity. The langevitour tool, along with their R package quollr, lets researchers flip between NLDR representations, then watch how the same data behaves when the model is projected into p-D space. You can brush a region in the 2-D map and immediately see which data points in high dimensions occupy that region, along with their residuals. It’s a kind of spatial debugging for data visualization, and it makes the abstract battle of “which map is best” tangible.

Real-world tests reveal biases

The paper doesn’t live in the land of thought experiments. It opens the door to concrete verdicts about widely used visualizations. The authors showcase two classic datasets that data scientists know well: PBMC3k, a single-cell RNA sequencing dataset with thousands of cells and gene expressions, and MNIST’s handwritten digits, a high-profile benchmark where 28-by-28 pixel images live in a 784-dimensional space before you squint them into 2-D. In both cases, they apply several NLDR methods—tSNE, UMAP, PHATE, TriMAP, PaCMAP, and variants of two standard methods with different settings—and then test them with their RMSE-in-space diagnostic.

In the PBMC3k example, the team re-creates a familiar claim: the published layout suggested three very separated clusters. But when they push the representations through their diagnostic lens, the RMSE tells a different story. Some layouts with sizable separations in 2-D turn out to be less faithful to the 9-D or 3-D structure of the data when lifted back to p-D. In other words, what looked like a clean triad of cell types in 2-D wasn’t as clearly separated in the high-dimensional reality, and the “best” view shifts depending on how you measure the warp between spaces. The authors argue that the optimal choice, constrained by their diagnostics, is not the one that screams the loudest separation but the one whose wireframe tucks into the data in the most honest way. They identify layout e as a strong candidate, even if it trades dramatic separations for a more faithful representation of the underlying geometry.

In MNIST, the digits famously lie on a nonlinear manifold. The 2-D renderings of the number 1 vary a lot depending on the method and tuning. The authors show that certain layouts collapse a broad, curved manifold into a flatter, pancake-like form in 2-D. When you lift that back into 10-D (the 10 principal components they use for visualization), the model reveals that some methods exaggerate the continuity of the shape, while others preserve a more nuanced curvature. The result isn’t merely academic: it highlights how a seemingly simple 2-D sketch can smooth over misfit patterns that matter for downstream tasks like clustering, recognition, or anomaly detection.

The paper also flags a particularly intriguing byproduct of their analysis: a tendency for some NLDR methods to exhibit a “pancake” effect within clusters, especially PaCMAP in certain projections. In some data sets, the model that best fits the data space looks very different from the 2-D view you might be used to. And it isn’t that one method is universally wrong; it’s that the interpretation of the 2-D layout without the data-space check risks arriving at a false narrative about the data’s structure.

Why this matters for data storytelling

If NLDR is how we tell stories about high-dimensional data, then this paper gives us a way to tell better stories — not merely prettier ones. The authors are careful to emphasize that the goal isn’t to dethrone NLDR. Rather, it’s to inoculate the practice with a robust diagnostic framework: a way to compare multiple NLDR layouts, quantify their honesty, and choose representations that reflect real structure in the data rather than the quirks of a particular algorithm or parameter choice.

There’s a practical value for researchers who want reproducible visuals. The new toolkit (the quollr package) not only computes the fit and RMSE but also enables interactive exploration with tours. Researchers can compare the same data across methods and parameter settings, revealing where a chosen map overstates separation, hides density variations, or creates artificial gaps. In fields where visual interpretation guides crucial decisions — biology, medicine, materials science, and beyond — this kind of honesty can prevent overconfident claims that later crumble under deeper inspection.

Beyond the specific datasets in the paper, the broader implication feels almost like a call to arms for the data science community: let your visuals justify themselves against the data-space reality. The authors’ mantra—stop lying to me—resonates not as a harsh critique but as a reminder that visuals are arguments, and every argument deserves a fair test. The approach invites a more transparent dialogue about what a 2-D map can and cannot reveal, and it gives scientists a concrete way to communicate the limits of their representations to a wider audience.

From PBMCs to digits, a practical takeaway

What should a curious reader take away from this work? First, that NLDR is not a magic mirror but a distorting lens shaped by method, settings, and data density. Second, that there is a rigorous, replicable way to check whether a given 2-D layout matches the high-dimensional geography it’s intended to summarize. Third, that this check can be performed with tools that overlay a wireframe model onto the data space, letting researchers probe where the representation is faithful and where it lies. Finally, that this approach encourages a healthier culture around data visualization: one where the goal is not to produce the most dramatic map, but the most honest one.

And there’s a social dimension, too. The work explicitly ties back to institutions and people: Monash University’s Econometrics & Business Statistics group, including lead author Jayani P. Gamage and coauthors Dianne Cook, Paul Harrison, Michael Lydeamore, and Thiyanga S. Talagala (with collaboration from the University of Sri Jayewardenepura in Sri Lanka), demonstrates how a focused, methodical critique can yield practical tools for everyone who builds, shares, and interprets data visualizations. The study’s transparency—open code, a devoted R package, and interactive demos—reflects a model of openness that data storytelling needs more of in a noisy, hyper-visual age.

In short, the paper doesn’t end with a verdict on which NLDR tool is best; it gives us a framework to judge honestly, to understand when a map is telling a true story, and to recognize when the storytelling deserves a second glance. If you care about seeing data as it really is, this is a toolkit you’ll want to keep close as you navigate the next wave of high-dimensional visualization.