Your keystrokes reveal how students learn to code

The act of writing code is usually private, a sprint through a personal problem space. Yet a team of researchers decided to treat the act of writing software as a kind of diagnostic data stream. They built KOALA, a configurable tool that lives inside JetBrains IDEs and quietly records what students do as they solve programming tasks. The result isn’t just a fancy plugin; it’s a new way to see how beginners approach problems, where they stumble, and what tools actually help them move forward. The project comes from JetBrains Research in Berlin, with collaborators at Utrecht University, and it centers on lead author Daniil Karol and a team of researchers who care about education as much as code.

KOALA’s promise is straightforward, even a little radical: let educators choose exactly what to measure, when to snapshot code, and which IDE features to enable or disable during a task. Instead of a one-size-fits-all data dump, KOALA lets researchers tailor data collection to the questions they care about—how students refactor, how often they press a hotkey, which files they switch to, and how the environment itself shapes solving strategies. It’s akin to swapping from a static exam to a dynamic, instrumented lab where every keystroke and window focus becomes evidence about learning.

A new window into how students solve problems

At its core, KOALA is a three-part system: a plugin inside the IDE, a server that collects and stores data, and a dashboard plus a converter that make the data usable for researchers. What makes KOALA different from earlier tools is the level of configurability. Researchers can specify, via YAML files, the exact set of files to monitor, how frequently to snapshot code, which IDE settings to toggle, and which questions to ask students through optional surveys. This means you can study, for example, how first-year students approach a basic coding task in a Kotlin course, and then compare that to how third-year students tackle a refactoring challenge in the same environment—without rewriting the tool each time.
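
To make that concrete, here is a sketch of what such a scenario configuration might look like in YAML. The key names are illustrative, not KOALA’s actual schema:

```yaml
# Hypothetical KOALA-style scenario config; key names are illustrative,
# not the tool's actual schema.
scenario:
  tasks:
    - id: refactoring-warmup
      files:                        # the only files the plugin may read
        - src/main/kotlin/Library.kt
      snapshotInterval: keystroke   # or a coarser interval, e.g. "30s"
  ide:
    enabledActions: [Run, Debug, ReformatCode]
    disabledActions: [GenerateCode]
  surveys:
    - afterTask: refactoring-warmup
      question: "How confident are you in your solution?"
```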

In a case study embedded in the paper, 28 students from two universities solved problems in two JetBrains Academy courses. The dataset dwarfs those of most classroom studies: more than 127,000 code snapshots and 585,000 IDE events, including hotkeys, focus shifts between files, runs, debugs, and window interactions. The scale matters because subtle learning signals—how often a student switches from editing to running code, or how a particular refactor is invoked—can reveal misperceptions or strategic gaps that aren’t obvious from a final submission alone.

KOALA’s design philosophy is openness and adaptability. The plugin runs inside multiple JetBrains IDEs—IntelliJ IDEA, PyCharm, CLion—while the server and dashboard are designed to be deployed in standard cloud environments. The project’s authors emphasize privacy: data collection requires explicit user consent, and the system is careful to read only the files specified in the configuration, excluding personal information unless a researcher explicitly requests it. This balance—rich data with consent-aware boundaries—offers a concrete path toward ethically studying how people learn to program.

From IDE to insight

KOALA isn’t just a data collector; it’s a pipeline. The YAML configurations define a scenario: the order of tasks, the files students work in, the IDE settings to enable or disable, and the surveys to present. When students engage, KOALA captures code snapshots—potentially after every keystroke, or at a coarser interval depending on the study’s needs—along with a stream of IDE actions: running tests, debugging, opening the build window, or switching focus between files. It even records hotkeys, a detail that often reveals how students prefer to work at speed, bypassing menus in favor of keyboard shortcuts.
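
As a rough mental model of what that stream contains, the events might be represented along these lines in Kotlin. The types here are hypothetical, not KOALA’s actual internals:

```kotlin
import java.time.Instant

// Hypothetical event model; KOALA's real schema may differ.
sealed interface TrackedEvent {
    val timestamp: Instant
}

// A snapshot of a monitored file's full contents at a point in time.
data class CodeSnapshot(
    override val timestamp: Instant,
    val filePath: String,
    val content: String,
) : TrackedEvent

// A discrete IDE action: running tests, invoking a hotkey,
// switching focus between files, opening a tool window, etc.
data class IdeAction(
    override val timestamp: Instant,
    val actionId: String,      // e.g. "Run", "ReformatCode"
    val viaHotkey: Boolean,    // keyboard shortcut vs. menu invocation
) : TrackedEvent
```

Modeling snapshots and actions as one time-ordered stream is what lets researchers later reconstruct a full solving session, keystroke by keystroke.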

On the back end, a server—built with the Ktor framework—receives and organizes the data, storing it in a database and preserving raw data for safety. Researchers can then use a visualization dashboard to glimpse high-level patterns: how much time students spend on each task, which windows they rely on, and which actions dominate a solution path. Significantly, KOALA offers a ProgSnap2 converter, aligning its data with a widely used format for programming-process data. In other words, what KOALA learns can slot into existing research ecosystems, letting scientists compare studies across courses, institutions, or languages without reformatting every dataset from scratch.
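
For a sense of the shape of such a server, here is a minimal Ktor ingestion endpoint. The /events route and the storeRaw helper are assumptions for illustration; KOALA’s real server is considerably more elaborate:

```kotlin
import io.ktor.http.HttpStatusCode
import io.ktor.server.application.call
import io.ktor.server.engine.embeddedServer
import io.ktor.server.netty.Netty
import io.ktor.server.request.receiveText
import io.ktor.server.response.respond
import io.ktor.server.routing.post
import io.ktor.server.routing.routing

fun main() {
    embeddedServer(Netty, port = 8080) {
        routing {
            // Hypothetical ingestion route for batches of plugin events.
            post("/events") {
                val rawPayload = call.receiveText()
                // A real deployment would validate consent, parse the
                // payload into structured records, and write to a database,
                // while also preserving the raw data as a safeguard.
                storeRaw(rawPayload)
                call.respond(HttpStatusCode.Accepted)
            }
        }
    }.start(wait = true)
}

// Placeholder persistence; a real server writes to durable storage.
fun storeRaw(payload: String) { /* append to raw-data store */ }
```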

The tool is deliberately ambitious about integration and reuse. It’s open source under the MIT license, designed to be dropped into a classroom with minimal friction, and engineered to connect with other analytics tools and datasets. That openness isn’t just convenience; it matters for building a shared empirical foundation for how people learn programming in real environments, not just in controlled labs.

What it could change about teaching coding

If you’ve ever suspected that “how you write code” mirrors “how you think about the problem,” KOALA gives that suspicion a more empirical edge. The data could illuminate questions that educators have wrestled with for years: which ideas students actually grasp early, where misconceptions quietly take root, and which features of the IDE help or hinder thinking. For example, by analyzing hotkeys and refactoring patterns, instructors can see which advanced techniques novices actually adopt—and which ones overwhelm them, suggesting where curricula should emphasize fundamentals or gradually introduce complexity.

KOALA’s case study includes a peek at hotkey usage during a Kotlin refactoring course. It highlights the most popular shortcuts, the keystrokes students use to reorganize code quickly. The data show students leaning on actions like Reformat, Inline, Move, and Introduce Constant. That snapshot of workflow isn’t just trivia; it speaks to how learners negotiate the affordances of a professional tool. Do they rely on the IDE to enforce style, or do they push against it, discovering misalignments between their mental model and the tool’s expectations? This kind of insight can inform how educators design guided experiences that align with actual coding practices.

The measurable, fine-grained nature of KOALA’s data also invites longer-term possibilities. Researchers could build adaptive learning aids that respond to a student’s current pattern: if a learner consistently underuses a particular refactoring, the system could offer targeted hints or practice in that lane. If a student’s focus drifts away from the code they’re solving, a gentle nudge could steer attention back to the critical task. In a broader sense, KOALA hints at a future where classrooms feel more like development environments, where teaching and tooling converge to scaffold thinking rather than merely deliver content.
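
As a toy illustration of that idea, here is a sketch that flags refactoring actions a student never invokes. It is entirely hypothetical, not a KOALA feature, and the action names are assumptions:

```kotlin
// Refactorings an instructor hopes to see in a student's workflow
// (hypothetical set for illustration).
val expectedRefactorings = setOf("Inline", "Move", "IntroduceConstant")

// Given a student's logged action IDs, return the expected
// refactorings that never appear: candidates for a targeted hint.
fun underusedRefactorings(actionIds: List<String>): Set<String> {
    val counts = actionIds.groupingBy { it }.eachCount()
    return expectedRefactorings.filter { (counts[it] ?: 0) == 0 }.toSet()
}

fun main() {
    val log = listOf("Run", "ReformatCode", "Inline", "Run")
    println(underusedRefactorings(log))  // [Move, IntroduceConstant]
}
```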

There’s also a practical benefit for researchers who study programming education as a discipline. By standardizing data through ProgSnap2, KOALA helps researchers compare apples to apples across experiments. The same student-facing setup could yield different kinds of data depending on the research question, and then be re-packaged to join global datasets. It’s a modest ambition with big implications: a shared, big-picture view of how people learn to code across styles, languages, and institutions.
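
For readers unfamiliar with ProgSnap2, the format centers on a main event table. A sketch of flattening one event into such a row might look like this, with the columns abbreviated and the exact mapping an assumption rather than KOALA’s converter:

```kotlin
// A handful of ProgSnap2 main-table columns; see the ProgSnap2
// spec for the full set. Field names follow the spec's columns.
data class ProgSnap2Row(
    val eventId: String,
    val subjectId: String,       // anonymized student identifier
    val eventType: String,       // e.g. "File.Edit", "Run.Program"
    val codeStateId: String,     // links the event to a code snapshot
    val clientTimestamp: String,
)

fun toCsvLine(row: ProgSnap2Row): String =
    listOf(row.eventId, row.subjectId, row.eventType,
           row.codeStateId, row.clientTimestamp).joinToString(",")

fun main() {
    val row = ProgSnap2Row("e1", "s042", "Run.Program", "cs17",
                           "2024-05-01T10:15:00Z")
    println(toCsvLine(row))  // e1,s042,Run.Program,cs17,2024-05-01T10:15:00Z
}
```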

Of course, the vision comes with caveats. The more granular the data, the heavier the responsibility to protect privacy. The KOALA team foregrounds consent and restricts data collection to only configured files, but any broad deployment in university settings will need robust governance, transparent opt-ins, and ongoing oversight. The authors acknowledge this, not as a rhetorical flourish but as a practical constraint that guides what kind of questions researchers can ask at any given moment. This is learning technology with moral weight, not a flashy gadget.

The KOALA project also nudges the broader ecosystem toward more humane research practices. By enabling teachers to tailor data collection to their pedagogy, KOALA avoids the trap of turning students into raw data generators. It treats learning as a collaborative venture between people and tools, where both sides can adapt to each other. In that sense, KOALA is less about surveillance and more about storytelling: a way to narrate a student’s growth through the actual steps they take, the mistakes they correct, and the strategies they reuse or abandon.

Looking ahead, the authors acknowledge that YAML, while powerful for configuration, can become unwieldy as experiments scale. Their wishlist includes a graphical interface for configuring KOALA’s settings, which would lower barriers for instructors who aren’t comfortable editing text-based configuration files. They also envision widening the study to multiple universities to build a richer, more diverse dataset of fine-grained IDE interactions. If that happens, the field could gain a robust empirical map of how novices become proficient programmers, not through anecdotes or one-off assessments, but through a compendium of real-world problem-solving.

In short, KOALA turns the IDE into a kind of observatory for learning. It captures not only what code a student writes, but how the student moves through the tools that professional developers rely on. It’s a move from silent submission to living, analyzable practice. And if done with care, it could make teaching programming as dynamic, data-informed, and humane as the act of writing code itself.