The page you see when you search for something online is more than a neat list of links. It’s a stage where attention is won or lost, where a single character or a tiny image can tilt what you notice first and what you scroll past. A team at Radboud University in Nijmegen, led by Norman Knyazev and Harrie Oosterhuis, asks a question that feels obvious only in retrospect: what if the way we present each result—how much space it gets, how long its snippet is, how many items fit on the screen—is part of the ranking itself? What if the order of results and the length of their presentation are not separate steps but two sides of the same decision?
Their study, presented at SIGIR 2025, formalizes a setting called the variable presentation length ranking problem. In plain terms, the researchers want to decide not just which document should sit at which position, but also how much vertical space that document should occupy on the page. The idea is simple to imagine: give one result more room to tell a richer story, and you can attract attention that a shorter snippet might miss. But the consequence is not obvious. More space for one item pushes others down and reduces how many items you can show at once. The math of attention becomes tangled. Yet the payoff could be big: better balancing of exposure across all the results, not just the top few.
What makes this especially human is that it mirrors real online behavior. People scrolling through a long list of products, articles, or answers don’t just read in order; they skim, they pause, they linger on a picture, they skip a paragraph, they notice something because it looks different. The Radboud team argues that a truly effective ranking system should embrace this reality rather than pretend every result is a fixed, equal block of information. In their framing, every decision about which items appear and how much space they claim is a decision that nudges the user’s attention and—even more subtly—who gets noticed at all. The result is not a rebranding of search, but a rethinking of what the search results page is allowed to be.
And there’s a human story behind the math. Lead author Norman Knyazev and co-author Harrie Oosterhuis bring a practical curiosity to a theoretical gap: standard ranking models assume that all results are the same size, so the objective is only about who’s on top. Real-world pages are not like that. The team’s work suggests that the best strategy might look like a choreographed performance, where the size of each piece (a text snippet, an image, or a richer card) is chosen in concert with its place in the lineup.
The puzzle of variable lengths
In traditional learning-to-rank systems, the goal is straightforward: rank documents in order of decreasing relevance. The score you assign to each document, and the way you discount the value of lower positions, tells you how to place them. The theory behind this, often summarized as the probability ranking principle (PRP), favors putting the most relevant results first and assumes each result is presented in a fixed, equal format. But what if you could change the format on the fly? If you can give some results extra space to speak, does that make the top slot more valuable, or does it backfire by pushing down too many other candidates?
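To make the classical picture concrete, here is a minimal sketch of the fixed-length view in Python (the relevance scores and the logarithmic position discount are invented for illustration): sort documents by score, then evaluate the ranking with a position-discounted sum in which every result is an equal-sized block, so only its rank matters.

```python
import math

# Toy relevance scores for five documents (illustrative values only).
scores = {"d1": 0.9, "d2": 0.4, "d3": 0.7, "d4": 0.2, "d5": 0.6}

# The classic recipe behind the PRP: sort by score, highest first.
ranking = sorted(scores, key=scores.get, reverse=True)

def discounted_gain(ranking, scores):
    """DCG-style value of a fixed-length ranking: relevance discounted by position."""
    return sum(scores[d] / math.log2(rank + 2) for rank, d in enumerate(ranking))

print(ranking)                                   # ['d1', 'd3', 'd5', 'd2', 'd4']
print(round(discounted_gain(ranking, scores), 3))
```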
That is exactly the tension Knyazev and Oosterhuis formalize. They introduce a slot-based view of rankings where each placement is a pair: a document and the length of its presentation. A full ranking now consists of a budget of slots (think of a fixed vertical height for the page), and each (document, length) pair consumes some of those slots. The first piece of the ranking starts at slot s1 = 1, the next at s2 = 1 + length(d1), and so on. The catch is that the same document should not appear more than once, and you must respect the total slot budget. In this world, the per-position discount factors and the expected reward depend on both the position and the length of the item—a subtle but crucial shift from fixed-length rankings.
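A small sketch makes that bookkeeping concrete (plain Python; the particular lengths and the budget of eight slots are invented for the example): each placement is a (document, length) pair, the start slot of each placement is one plus the sum of the lengths placed before it, and a ranking is only valid if no document repeats and the total length stays within the budget.

```python
# A ranking is a sequence of (document, length) placements, where length is the
# number of vertical slots the presentation occupies (illustrative values).
ranking = [("d3", 1), ("d1", 2), ("d7", 1), ("d2", 3)]
slot_budget = 8

def start_slots(ranking):
    """Start slot of each placement: s1 = 1, s2 = 1 + length(d1), and so on."""
    slots, s = [], 1
    for _doc, length in ranking:
        slots.append(s)
        s += length
    return slots

def is_valid(ranking, slot_budget):
    """No repeated documents, and the placements fit within the slot budget."""
    docs = [d for d, _ in ranking]
    return len(docs) == len(set(docs)) and sum(l for _, l in ranking) <= slot_budget

print(start_slots(ranking))            # [1, 2, 4, 5]
print(is_valid(ranking, slot_budget))  # True: total length 7 fits in 8 slots
```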
One striking theoretical upshot is that the old PRP—the idea that the best ranking simply sorts by relevance—does not hold when you allow variable lengths. The authors prove that optimal strategies cannot be decomposed into a separate ranking policy and a fixed length-selection policy. In other words, “rank first, then decide length” is not the right mindset. The best solution must be a joint policy that decides where to place a result and how much space it should take, all at once. It’s a reminder that the architecture of a user interface can quietly rewrite the math of optimization.
VLPL: a joint distribution for place and space
To tackle this, the paper introduces VLPL—the Variable Length Plackett-Luce distribution. It’s an extension of the classic Plackett-Luce (PL) model, which already treats ranking as a sequence of choices drawn from softmax distributions without replacement. VLPL upgrades this by sampling document-length pairs (d, l) for each slot, while ensuring that a document isn’t picked twice and that there are enough slots left to place a chosen length. The outcome is a probabilistic model over sequences of (document, length) pairs, not just documents.
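To give a feel for what such a distribution looks like in practice, here is a minimal sampling sketch in Python with NumPy (the score matrix is random and the masking scheme is my reading of the description above, not the paper's implementation, which uses a more efficient construction described a bit further down): at every step, take a softmax over the (document, length) pairs that are still allowed, meaning documents not yet placed and lengths that still fit in the remaining budget.

```python
import numpy as np

rng = np.random.default_rng(0)

n_docs, max_length, slot_budget = 6, 3, 8
scores = rng.normal(size=(n_docs, max_length))   # stand-in logits per (doc, length) pair

def sample_vlpl_like(scores, slot_budget):
    """Sample (document, length) pairs sequentially: softmax without replacement."""
    n_docs, max_length = scores.shape
    placed, used, remaining = [], set(), slot_budget
    while remaining > 0:
        # A pair is allowed if its document is unused and its length still fits.
        allowed = np.array([[d not in used and (l + 1) <= remaining
                             for l in range(max_length)]
                            for d in range(n_docs)])
        if not allowed.any():
            break
        logits = np.where(allowed, scores, -np.inf).ravel()
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        choice = int(rng.choice(len(probs), p=probs))
        doc, l = divmod(choice, max_length)
        placed.append((doc, l + 1))              # lengths run from 1 to max_length
        used.add(doc)
        remaining -= l + 1
    return placed

print(sample_vlpl_like(scores, slot_budget))     # a budget-respecting list of (doc, length) pairs
```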
Crucially, the authors don’t stop at theory. They derive gradient estimators—VLPL-1 and a more sample-efficient VLPL-2—that let you train models to maximize the expected attractiveness of a ranking. Here, attractiveness is a blend of two factors: how relevant a document is and how attractive it appears when given a certain length. The gradient decomposes into a future reward after a placement, an immediate reward, and a risk term that accounts for what you’ve given up by placing something now. It’s a policy-gradient feel, but tailored to the quirks of variable-length presentations.
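The VLPL-1 and VLPL-2 estimators themselves are the paper's contribution and are not reproduced here; the sketch below only illustrates the generic score-function (REINFORCE-style) idea they refine, using PyTorch with a toy attractiveness table and a toy position bias, all of which are assumptions for illustration. The recipe: sample rankings from the current policy, weight each sample's log-probability by the attractiveness it achieved, and follow that gradient.

```python
import torch

n_docs, max_length, slot_budget = 6, 3, 8
scores = torch.zeros(n_docs, max_length, requires_grad=True)    # learnable logits per (doc, length)
attract = torch.rand(n_docs, max_length)                        # toy attractiveness of each pair
position_bias = 1.0 / torch.arange(1, slot_budget + 1).float()  # toy chance a slot gets examined

def sample_and_score():
    """Sample one ranking slot by slot; return its reward and total log-probability."""
    log_prob, reward = torch.zeros(()), torch.zeros(())
    used, slot = set(), 0
    while slot < slot_budget:
        remaining = slot_budget - slot
        mask = torch.tensor([[d in used or (l + 1) > remaining
                              for l in range(max_length)]
                             for d in range(n_docs)])
        if mask.all():
            break
        dist = torch.distributions.Categorical(
            logits=scores.masked_fill(mask, float("-inf")).flatten())
        choice = dist.sample()
        log_prob = log_prob + dist.log_prob(choice)
        doc, l = divmod(int(choice), max_length)
        reward = reward + position_bias[slot] * attract[doc, l]  # attention times appeal
        used.add(doc)
        slot += l + 1
    return reward, log_prob

optimizer = torch.optim.Adam([scores], lr=0.05)
for _ in range(200):
    samples = [sample_and_score() for _ in range(16)]
    rewards = torch.stack([r for r, _ in samples])
    log_probs = torch.stack([lp for _, lp in samples])
    baseline = rewards.mean()                          # crude variance reduction
    loss = -((rewards - baseline) * log_probs).mean()  # score-function gradient estimate
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```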
Sampling, which could seem daunting given the combinatorial explosion of possible (document, length) choices, is handled with a neat trick. Start by sampling an invalid ranking in which each document appears with every possible length, then map that onto a valid ranking under the slot budget. This mapping preserves the distribution you care about, so you don't pay a steep computational price just to explore possibilities. The authors also show how to reuse samples to improve efficiency, a boon for training on large datasets where each ranking carries a lot of information.
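One plausible reading of that mapping, sketched in plain Python with invented inputs below: draw a complete ordering over every (document, length) pair as if all of them could be shown, then walk down that ordering and keep only the pairs that are still valid, skipping documents already placed and lengths that no longer fit. (The paper's actual construction and its distributional guarantees are more careful than this illustration; the uniform shuffle here is just a stand-in for a Plackett-Luce draw.)

```python
import random

random.seed(0)
n_docs, max_length, slot_budget = 4, 3, 6

# Step 1: an "invalid" ranking, i.e. a full ordering over every (doc, length)
# pair, in which each document appears once per possible length.
all_pairs = [(d, l) for d in range(n_docs) for l in range(1, max_length + 1)]
invalid_ranking = random.sample(all_pairs, k=len(all_pairs))

# Step 2: map it onto a valid ranking under the slot budget.
def to_valid(invalid_ranking, slot_budget):
    placed, used, remaining = [], set(), slot_budget
    for doc, length in invalid_ranking:
        if doc in used or length > remaining:
            continue                     # skip duplicates and over-budget lengths
        placed.append((doc, length))
        used.add(doc)
        remaining -= length
        if remaining == 0:
            break
    return placed

print(to_valid(invalid_ranking, slot_budget))
```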
VLPL isn’t just a single algorithm. It’s a family of methods that can be applied in two modes: in-processing, where you train a model to optimize the joint ordering-and-length policy, and post-processing, where you adjust the scores after a model has been trained to maximize the same objective. The latter is especially appealing in practice because it lets you squeeze value from existing systems without a full retraining cycle.
What this means for the real world
The researchers tested VLPL on two large real-world learning-to-rank benchmarks: Yahoo! Webscope and MSLR-WEB30k. They ran a battery of experiments to answer four questions, and the answers resonate with everyday browsing behavior. When the true attractiveness values for documents are known (the oracle setting), VLPL-based methods consistently beat all baselines by a meaningful margin. In both the Yahoo! and MSLR settings the gains are substantial: think double-digit percentage improvements in the EA metric, which captures both how relevant a result is and how likely users are to notice it given its length. The gains persist, albeit at smaller scales, when attractiveness is learned rather than given. In other words, even with imperfect guesses about what users will find attractive, VLPL holds up and still nudges rankings toward better overall exposure and engagement.
One especially telling pattern comes from looking at the learned lengths themselves. In the oracle setting, VLPL tends to place shorter presentations at the top and progressively lengthens results deeper in the ranking. This aligns with a practical intuition: you want to grab the user’s attention with concise, high-signal entries near the top, while longer previews can be used later to give more context to highly relevant items that would benefit from extra visibility. The effect is amplified when the underlying position bias is strong—when people are more sensitive to the top few results and gradually skim further down the page.
Beyond the numbers, there’s a longer-term implication for how search and discovery interfaces are designed. If you can jointly optimize order and length, you can tailor experiences to different devices, contexts, or even individual users. On mobile, where screen real estate is scarce, shorter previews might be preferred, while desktop experiences could gracefully deploy richer previews for a subset of results without sacrificing the overall set shown above the fold. The takeaway is not that “more space is always better,” but that space, when allocated intelligently, can unlock more relevant items without crowding others out or overwhelming the user.
From a product perspective, this line of work reframes a stubborn truth: the classic ten blue links model is a simplification. Real pages are dynamic canvases, and a successful ranking system may need to be comfortable with some of the complexity of allocating space. VLPL makes that comfort into a practical tool—an explicit, learnable way to balance the tug-of-war between attracting attention and showing a broader set of candidates.
Why this matters for the future of search and discovery
At its heart, this work is about a simple, human insight: attention is not a fixed resource. It flows, it shifts with context, and it’s easier to notice something that looks richer or larger—even if its intrinsic relevance is only modest. The VLPL framework makes that dynamic precise, so systems can learn how to distribute attention across a batch of candidates in a principled, data-driven way. In practice, that means better performance with the same screen real estate, or the same performance with cleverer use of space.
The study also highlights an important methodological point. The probability ranking principle, a cornerstone of classical information retrieval theory, can fail in the real world once you allow presentation length to influence attention and visibility. When you admit length as a design dimension, you need new tools, new intuitions, and, crucially, a different notion of what constitutes a good ranking. VLPL delivers that toolkit and demonstrates it on tasks that resemble the messy, scrolling reality of modern search results pages.
For developers and researchers, the takeaway is twofold. First, if you’re building a system that shows results with adjustable previews, you should consider joint optimization of order and length; treating them as separate steps can leave valuable strategies on the table. Second, even if you don’t deploy a full VLPL-style model, the core idea—embed presentation decisions into learning objectives and evaluation metrics—is portable. A little more thought about how much space each item gets can translate into meaningful improvements in engagement, recall, and satisfaction.
As we move toward more adaptive and context-aware interfaces, a future that blends layout with ranking seems not only plausible but increasingly necessary. Content platforms, e-commerce pages, and knowledge portals will be faced with choices about how much detail to show at a glance, how many items to reveal before users scroll, and how those decisions ripple through the user’s journey. The Radboud team’s work is a thoughtful, rigorous step toward systems that understand the texture of attention well enough to guide it without taking agency away from the user. It’s a fascinating reminder that in the age of intelligent interfaces, presentation is not a cosmetic afterthought; it’s a form of communication, and when done well, it can help users find what truly matters.
Institutional note: The research was conducted at Radboud University in Nijmegen, The Netherlands, with Norman Knyazev as the lead author and Harrie Oosterhuis as a co-author. The work is presented as part of SIGIR 2025 and explores a fundamental rethinking of how ranking and presentation interact on modern information portals.