Tropical forests aren’t just green canopies rustling in the breeze. They are living archives of life, carbon storage factories, and weather-makers that shape climates far beyond their own borders. Yet counting the individual trees inside those vaults of green has long been a daunting task. Ground surveys are slow, dangerous, and patchy; satellites struggle to see through persistent cloud cover and the dense weave of tropical canopies; and most open datasets didn’t come close to capturing the diversity of tropical crowns: their sizes, their shapes, and the way they overlap. The new SELVABOX dataset, created by a collaboration spanning Mila – Quebec AI Institute, Université de Montréal, McGill University, and several partner institutions, is changing the game. Its lead author, Hugo Baudchon, and a team of researchers have assembled the largest open-access collection yet of high-resolution drone imagery annotated with individual tropical tree crowns.
The numbers set the scale: 83,137 crowns annotated across three neotropical countries (Brazil, Ecuador, and Panama), spread over 14 RGB orthomosaics captured at razor-sharp ground sampling distances of 1.2 to 5.1 centimeters per pixel. That scale is hard to imagine: it’s not just a few trees mapped here and there, but thousands of crowns across landscapes that vary from primary forests to secondary regrowth and even native plantations. The project is a public resource built to accelerate the kind of machine-learning work ecology needs to meaningfully quantify forests at scale. The dataset is backed by real institutions and real researchers, and it comes with preprocessing utilities and model benchmarks so others can pick up where the team left off. The authors emphasize that this isn’t merely an impressive pile of labels; it’s a springboard for robust, cross-ecosystem detection of individual crowns.
As with many ambitious scientific endeavors, the people behind SELVABOX are intertwined with institutions that make it possible: Mila – Quebec AI Institute and Université de Montréal anchor the project, with collaborators including McGill University, Polytechnique Montréal, Rubisco AI, and Colorado Mesa University. The work demonstrates how modern field ecology increasingly runs on a shared infrastructure of data and code. In short, this is biology meeting engineering, with the forest as the proving ground.
Unlocking tropical secrets with high-res drone data
The SELVABOX team didn’t just collect pretty pictures; they built a dataset engineered for a very specific problem: detecting and delineating the crowns of individual trees in tropical forests from high-resolution drone imagery. The density and variety of tropical crowns pose a twofold challenge: trees come in a wide spectrum of sizes, from towering emergents to compact understory residents, and their crowns often overlap so densely that a single tree can be partly hidden behind another. The dataset captures that complexity in vivid detail: crowns ranging from less than 2 meters to more than 50 meters in diameter, and images gathered under different lighting, weather, and canopy structures.
One line in the paper that’s worth pausing on is its explicit scale: this is the largest open-access tropical tree crown dataset of its kind. The authors report more than 83,000 manually labeled crowns across a dense tapestry of forest types, with imagery collected from three countries and multiple drone platforms. That kind of scale matters because, in machine learning, models learn not just from a few examples but from the diversity of those examples. It’s the difference between recognizing a crown that dominates the canopy in Panama and one that hides in Ecuador’s sun-dappled understory. The data collection is paired with a careful split strategy: train, validation, and test areas are spatially separated to minimize geospatial leakage, so researchers aren’t fooled by trivial splits that inflate performance.
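For readers who think in code, here is a minimal sketch of what a spatially separated split can look like, assuming each tile’s footprint is available as a bounding box in the raster’s coordinate system. The AOI rectangles, buffer distance, and use of shapely are illustrative assumptions, not SELVABOX’s actual split geometry.

```python
# Sketch: assign tiles to train/val/test by geographic area rather than at
# random, so crowns from one place never end up in two splits. The AOIs and
# buffer below are made-up values for illustration only.

from shapely.geometry import box

# Hypothetical areas of interest: (minx, miny, maxx, maxy) in map units.
SPLIT_AOIS = {
    "train": box(0, 0, 4000, 6000),
    "val":   box(4200, 0, 5200, 6000),
    "test":  box(5400, 0, 7000, 6000),
}
BUFFER = 100  # buffer zone around each AOI to limit spatial leakage

def assign_split(tile_bounds):
    """Return the split whose buffered AOI fully contains the tile, else None."""
    tile = box(*tile_bounds)
    for split, aoi in SPLIT_AOIS.items():
        # Shrink the AOI so tiles hugging a split boundary get dropped.
        if aoi.buffer(-BUFFER).contains(tile):
            return split
    return None  # inside a buffer zone or outside every AOI: discard

print(assign_split((200, 200, 500, 500)))    # -> "train"
print(assign_split((4150, 100, 4450, 400)))  # -> None (too close to a boundary)
```

Discarding the tiles near split boundaries trades away a little data for confidence that test scores reflect genuinely unseen terrain.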
From the outset, the authors emphasize something essential: annotation quality and coverage matter. The six biologists who labeled the crowns ran into a practical limit: crowns can be near-impossible to distinguish in densely shaded patches, so some areas ended up with sparser annotations. Rather than penalizing models for those gaps, the team masked the affected pixels during training, so a model isn’t punished for detecting a crown that simply went unlabeled. It’s a pragmatic acknowledgment that real-world data rarely behaves like a perfectly labeled textbook, and that robust models should be able to cope with imperfect annotations as they learn to map real forests.
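The general idea of loss masking is easy to sketch. The snippet below applies it to a dense, per-pixel crown objective purely for illustration; the function name, the binary cross-entropy loss, and the mask convention are assumptions, not the paper’s actual training code.

```python
# Sketch: ignore sparsely annotated regions when computing a dense training
# loss, so the model isn't punished for predicting crowns where labels are
# simply missing. Generic illustration, not SELVABOX's exact objective.

import torch
import torch.nn.functional as F

def masked_bce_loss(logits, targets, valid_mask):
    """
    logits:     (B, 1, H, W) raw crown-presence scores from the model
    targets:    (B, 1, H, W) binary labels derived from annotated crowns
    valid_mask: (B, 1, H, W) 1.0 where annotations are trusted, 0.0 where sparse
    """
    per_pixel = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    per_pixel = per_pixel * valid_mask                       # zero out unlabeled regions
    return per_pixel.sum() / valid_mask.sum().clamp(min=1)   # mean over valid pixels only

# Toy usage: a 4x4 tile whose right half is treated as unannotated.
logits = torch.randn(1, 1, 4, 4)
targets = torch.zeros(1, 1, 4, 4)
mask = torch.ones(1, 1, 4, 4)
mask[..., 2:] = 0
print(masked_bce_loss(logits, targets, mask))
```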
And the scale isn’t just about raw counts. The project introduces a pipeline with practical, open tools to preprocess rasters, tile them for training, and aggregate predictions back into a raster-wide map. In a field where the operational need is often a wall-to-wall map of crown locations for a large landscape, that end-to-end consideration—from data capture to forest-wide predictions—matters a lot more than a tidy table of numbers.
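The tiling half of that pipeline follows a familiar windowed-read pattern. Here is a rough sketch using rasterio; the tile size, overlap, and function name are illustrative choices, not geodataset’s actual API.

```python
# Sketch: cut a large orthomosaic into overlapping tiles for training or
# inference. Tile size and overlap are illustrative values.

import rasterio
from rasterio.windows import Window

TILE = 2048     # tile edge in pixels
OVERLAP = 256   # pixels shared between neighbouring tiles

def iter_tiles(path):
    """Yield (col_off, row_off, image_array) for every tile of the raster."""
    with rasterio.open(path) as src:
        step = TILE - OVERLAP
        for row_off in range(0, src.height, step):
            for col_off in range(0, src.width, step):
                window = Window(col_off, row_off,
                                min(TILE, src.width - col_off),
                                min(TILE, src.height - row_off))
                yield col_off, row_off, src.read(window=window)  # (bands, h, w)

# for col, row, tile in iter_tiles("orthomosaic.tif"):
#     detections = detector(tile)  # hypothetical model call
```

The overlap matters: a crown that straddles a tile edge is fully visible in at least one neighbouring tile, at the cost of duplicate detections that have to be reconciled later.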
From crowded canopies to climate models
Why does this matter beyond the cool-factor of a new dataset? Because the distribution and demography of tree crowns feed straight into how forests store carbon and how they respond to climate change. In tropical forests, the largest trees aren’t just architectural wonders; they are carbon powerhouses. The paper notes that the largest 1% of trees can account for about half of a forest’s carbon stock even though they comprise a tiny slice of the population. Accurately counting and characterizing those crowns is crucial to understanding carbon budgets, biodiversity, and the resilience of forests under warming temperatures and shifting precipitation patterns.
The authors also advance a core methodological point: you can’t rely on a purely tile-based evaluation when you want to understand performance at the scale that matters for forest inventories. They introduce RF175, a raster-level F1 score that evaluates model predictions after aggregating tile-level detections across an entire raster, using a strict IoU threshold of 75% to count a crown as a match. It’s a reminder that scoring methods shape what we think “success” looks like in ecological ML—tools must align with practical goals like accurate tree counting and reliable crown delineation across landscapes, not just per-tile accuracy.
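To make the metric concrete: RF175 is described here as a raster-level F1 at an IoU threshold of 75%, and the sketch below illustrates that logic with axis-aligned boxes and a greedy matching rule. The paper’s exact matching procedure may differ, so treat this as an illustration of the metric’s spirit rather than a reimplementation.

```python
# Sketch: raster-level F1 at IoU >= 0.75, computed after all tile detections
# have been merged into one set of raster-wide boxes.

def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area(a) + area(b) - inter) if inter else 0.0

def f1_at_iou(preds, truths, thresh=0.75):
    matched, tp = set(), 0
    for p in preds:                          # each prediction claims at most one crown
        best, best_iou = None, thresh
        for i, t in enumerate(truths):
            if i not in matched and iou(p, t) >= best_iou:
                best, best_iou = i, iou(p, t)
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(truths) if truths else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# One good match, one spurious box, one missed crown -> F1 = 0.5
print(f1_at_iou([(0, 0, 10, 10), (40, 40, 55, 55)],
                [(1, 1, 10, 10), (100, 100, 120, 120)]))
```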
In their experiments, the SELVABOX team paints a clear picture: higher-resolution inputs consistently improve detection accuracy, and transformer-based detectors with a large Swin backbone outperform traditional CNNs. More surprisingly, models trained solely on SELVABOX achieve strong zero-shot performance on unseen tropical datasets. In other words, the crown-detection skills learned from one tropical forest can transfer to others without re-training, a kind of cross-forest generalization that’s been hard to achieve in tropical remote sensing. When they extended training to include other datasets spanning resolutions from 3 to 10 cm per pixel, a unified multi-resolution pipeline rose to the top across all evaluated datasets. The model wasn’t just good in one place; it looked capable across many tropical forests and even across non-tropical forests in some tests.
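One way to read “multi-resolution pipeline” is as an augmentation strategy: during training, each tile is resampled to simulate a different ground sampling distance, so the detector sees the 3 to 10 cm-per-pixel spread it will meet at test time. The sketch below is a plausible version of that idea, with an assumed GSD range and function name, not the authors’ actual recipe.

```python
# Sketch: simulate different ground sampling distances (GSDs) during training
# by randomly resampling each tile and rescaling its boxes to match.

import random
import torch
import torch.nn.functional as F

def random_gsd_resample(image, boxes, native_gsd_cm, gsd_range=(3.0, 10.0)):
    """
    image:  (C, H, W) float tensor
    boxes:  (N, 4) tensor of crown boxes in pixel coordinates
    Returns the tile resampled to a random target GSD, plus rescaled boxes.
    """
    target_gsd = random.uniform(*gsd_range)
    scale = native_gsd_cm / target_gsd   # <1 downsamples (coarser), >1 upsamples
    resized = F.interpolate(image.unsqueeze(0), scale_factor=scale,
                            mode="bilinear", align_corners=False).squeeze(0)
    return resized, boxes * scale, target_gsd

# Toy usage with a fake 3-band tile captured at 2.5 cm/px.
img = torch.rand(3, 512, 512)
boxes = torch.tensor([[100.0, 120.0, 220.0, 260.0]])
out, out_boxes, gsd = random_gsd_resample(img, boxes, native_gsd_cm=2.5)
print(out.shape, gsd)
```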
What makes that result especially exciting is not just a better detector but a blueprint for how researchers and practitioners could build scalable monitoring systems. In practice, you’d load up a new drone mosaic from a tropical site, run a robust, resolution-aware detector, and obtain a wall-to-wall crown map that can feed biomass estimates, biodiversity assessments, and management decisions. The team even released Python libraries named geodataset and CanopyRS to standardize raster preprocessing, inference, and benchmarking, lowering barriers for ecologists who want to test ideas without reinventing the wheel each time.
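The last step of that workflow, turning per-tile detections into a single raster-wide crown map, can be sketched as a coordinate shift followed by non-maximum suppression to remove the duplicates created by overlapping tiles. The code below uses torchvision’s NMS as a stand-in and is an assumption-laden illustration, not CanopyRS’s actual aggregation logic.

```python
# Sketch: merge tile-level detections into one raster-wide crown map by
# shifting boxes into raster coordinates and suppressing duplicates.

import torch
from torchvision.ops import nms

def aggregate_detections(tile_results, iou_threshold=0.5):
    """
    tile_results: iterable of (col_off, row_off, boxes, scores), where boxes is
                  an (N, 4) float tensor in tile-local pixel coordinates.
    Returns (boxes, scores) in raster coordinates with duplicates removed.
    """
    all_boxes, all_scores = [], []
    for col_off, row_off, boxes, scores in tile_results:
        offset = torch.tensor([col_off, row_off, col_off, row_off],
                              dtype=boxes.dtype)
        all_boxes.append(boxes + offset)      # back into raster coordinates
        all_scores.append(scores)
    boxes = torch.cat(all_boxes)
    scores = torch.cat(all_scores)
    keep = nms(boxes, scores, iou_threshold)  # drop overlap-zone duplicates
    return boxes[keep], scores[keep]

# Toy usage: the same crown detected in two adjacent, overlapping tiles.
tiles = [
    (0,    0, torch.tensor([[1900.0, 100.0, 2000.0, 200.0]]), torch.tensor([0.9])),
    (1792, 0, torch.tensor([[108.0,  100.0,  208.0, 200.0]]), torch.tensor([0.8])),
]
boxes, scores = aggregate_detections(tiles)
print(boxes, scores)  # a single surviving box near x = 1900-2000
```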
There’s a layer of humility in these results, too. The authors discuss the limitations of current methods and datasets, especially when transferring to very different forest types or to island forestry contexts. The zero-shot results, while promising, aren’t a guarantee that a single model will perfectly map every tropical canopy. But they are a clear signal: with enough diverse data and careful training strategies, you can edge closer to a generalized capability for crown detection that scales with our growing fleet of high-resolution drones.
State-of-the-art on forests of every stripe
The paper doesn’t stop at tropical forests. It also tests whether the same models can generalize to temperate and urban forests. The results are striking: the multi-dataset approach with SELVABOX achieves strong, sometimes state-of-the-art performance on non-tropical datasets as well. This isn’t just a tropical story; it’s a demonstration that a well-designed, multi-resolution training regime can bridge gaps between forest types that historically required separate models and datasets. The upshot is a future where a single, robust detector could help forest managers, conservationists, and researchers compare crown maps across continents and forest types with a shared standard of evaluation.
Beyond the numbers, the SELVABOX project hints at a broader trend in environmental machine learning: the push toward data-centric, open, and reproducible science. The authors publish their datasets, code, and pre-trained weights, making it easier for others to validate results or push the idea further. It’s the kind of openness that helps ecologists and computer scientists speak the same language, accelerating progress toward actionable forest monitoring and climate insight.
Training machines to understand trees, not just images
One of the paper’s most practical contributions is not a single model but a whole ecosystem for doing forest ML well. The SELVABOX team released two Python libraries that matter for real-world work. geodataset is a preprocessing toolkit that slices large rasters into trainable tiles with intelligent masking and AOI management. CanopyRS provides an end-to-end pipeline for training detectors, running inference, and benchmarking models against raster-level metrics like RF175. The authors also emphasize that their best-performing detector, a transformer-based model with a Swin-L backbone and multi-scale inputs, benefits from multi-resolution augmentation that mimics the variety of real-world data: different flight heights, different cameras, and different canopy structures. In other words, you train the model to be robust to what it will see in the wild rather than train it to memorize a single dataset.
Why does this approach work so well? It mirrors a principle in perception: humans don’t rely on a single gaze or a single kind of eye to understand a crowded scene. We generalize by taking in scenes at multiple scales, from the broad horizon to the tiny details. The techniques used here, transformer-style detectors that capture long-range dependencies and multi-scale features, appeal to that same intuition. The SELVABOX study shows that, when you pair such architectures with thoughtful data curation and cross-dataset training, you don’t just improve accuracy in a lab setting; you build tools with real ecological and policy relevance.
The cultural and scientific takeaway is empowering: with open data, scalable tools, and a mindset that embraces multiple resolutions and forest types, we can begin to quantify the tropical planet at a level of granularity that was once the domain of field crews marching through the undergrowth. The result isn’t just better numbers; it’s a language for talking about forest structure, carbon, and biodiversity in a way that can inform conservation strategies, climate models, and sustainable management decisions across borders.
And yet the authors are keen to acknowledge caveats. The dataset, while vast, isn’t perfect. Some areas had sparser annotations or more challenging visual conditions. They deliberately mask sparse annotation zones during training to keep models honest about what’s actually labeled. They also confront ethical considerations: any technology capable of pinpointing crown locations could, in theory, be exploited for illegal logging. Their stance is transparent and principled—open access to benefit research and conservation, paired with a sober eye toward misuse and a commitment to responsible deployment.
Ultimately, SELVABOX is more than a dataset; it’s a blueprint for how to measure, compare, and improve tree-crown detection across the world’s forests. It’s a map that can guide us toward better understanding of carbon stocks, forest health, and biodiversity under a changing climate. The researchers behind it—led by Hugo Baudchon, with collaborators from Mila, Université de Montréal, and partner institutions—have given the world a robust tool and a clear challenge: keep expanding the data, keep refining the models, and keep asking what it means to know a forest crown as precisely as we know a building’s footprint in a city.