Collecting Data in Gaming: The Hunt for Riftbound’s Second Expansion
How a volunteer community, telemetry, and research-grade data management practices converged during the Riftbound expansion campaign — and what researchers and designers can learn about reproducibility, consent, and community engagement.
Introduction: Why this case matters to researchers and designers
Riftbound’s expansion as a living lab
The announcement of Riftbound’s second expansion transformed a fandom into an active data-generating ecosystem. Players began logging builds, sharing playtests, and coalescing around balance hypotheses. That grassroots data collection mirrored many features of formal research projects: protocols, version control, and iterative analysis. In this guide we use Riftbound as a concrete case study to examine how gaming communities collect, manage, and reuse data, and how those practices align with established research data management (RDM) and reproducibility principles.
Why gaming data is high-value and high-risk
Data produced by games includes telemetry, chat logs, forum polls, and user-submitted playtests. This combination is valuable for designers, academics studying user experience, and community managers aiming to improve engagement. At the same time, it brings privacy, security, and trust issues that researchers routinely face. Lessons from digital security reporting, such as the analysis of the WhisperPair vulnerability, are directly relevant when community datasets contain personal identifiers or sensitive behavioral traces.
Who should read this guide
This deep dive is for game designers, community managers, academic researchers, and advanced players who run playtests. If you want reproducible results, ethical consent workflows, or a practical step-by-step to run a community-driven data collection campaign, the next sections are a practical blueprint grounded in real-world examples.
Section 1 — Motivations: Why communities collect data
Balancing and meta discovery
Players collect data to discover overperforming builds, nerf/buff candidates, and emergent strategies. In Riftbound’s expansion campaign, competitive clans ran systematic queues to estimate win rates and power curves, echoing formal experiments with control arms and repeated measures. These grassroots analytics often use spreadsheets, shared trackers, and communal dashboards to aggregate tens of thousands of game outcomes.
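To make that aggregation concrete, here is a minimal sketch, assuming Python and illustrative numbers, of the kind of win-rate estimate a shared tracker might compute. A Wilson score interval is one reasonable choice because it stays sensible at the small sample sizes a single clan's queue produces; the function and values below are hypothetical, not the community's actual tooling.

```python
import math

def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for an observed win rate.

    Unlike the naive plus/minus interval, it behaves sensibly for the
    small samples typical of a single clan's playtest queue.
    """
    if games == 0:
        return (0.0, 1.0)
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2))
    return (center - margin, center + margin)

# e.g., a build that won 61 of 100 logged matches (hypothetical counts)
low, high = wilson_interval(61, 100)
print(f"win rate {61 / 100:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```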
Content creation and visibility
Streams and videos build narratives around balance shifts. Community members use collected statistics to craft persuasive content — think patch breakdowns or hero tier lists — which in turn drive engagement. Streaming and review dynamics influence which data projects gain traction, a phenomenon reminiscent of how live reviews shape audience attention in other domains.
Research curiosity and modding
Some players approach game data as an experimental platform. Researchers and modders test hypotheses about reward schedules, user retention, and emergent cooperative systems. Community-powered experimentation is fertile ground for insights that inform design, and it often leverages practices from other creative domains, such as cross-disciplinary lessons on crafting impactful experiences from the art world.
Section 2 — Anatomy of Riftbound’s data collection campaign
Kickoff and community organization
Shortly after the leak of expansion patch notes, Riftbound community leads posted organized playtest schedules in Discord and Reddit. They set explicit playtest windows, variables to change (e.g., item X’s cooldown), and data templates for logging match outcomes. That first step — agreeing on a protocol — is the single most important activity for ensuring comparable, interpretable results.
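As a concrete illustration, the agreed logging template might look like the following minimal Python sketch. The field names, file name, and example values are assumptions for illustration, not Riftbound's actual schema.

```python
from dataclasses import dataclass, asdict
import csv
import datetime
import os

# Hypothetical template mirroring the fields agreed before testing began.
@dataclass
class MatchRecord:
    match_id: str
    playtest_window: str       # e.g. "window-1-evening"
    variable_under_test: str   # e.g. "item_x_cooldown=8s"
    player_class: str
    outcome: str               # "win" | "loss" | "draw"
    logged_at: str             # ISO-8601 UTC timestamp

def append_record(path: str, record: MatchRecord) -> None:
    """Append one match to a shared CSV tracker, writing a header
    the first time the file is created."""
    row = asdict(record)
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if new_file:
            writer.writeheader()
        writer.writerow(row)

append_record("playtest_log.csv", MatchRecord(
    match_id="m-0001",
    playtest_window="window-1-evening",
    variable_under_test="item_x_cooldown=8s",
    player_class="warden",
    outcome="win",
    logged_at=datetime.datetime.now(datetime.timezone.utc).isoformat(),
))
```

Agreeing on a fixed record shape up front is what lets hundreds of volunteers contribute rows that remain comparable weeks later.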
Channels and artifacts
Data artifacts included annotated spreadsheets, telemetry extracts from client-side logs, VOD timestamps for verification, and user surveys about subjective experience. Forums aggregated qualitative reports while automated scripts parsed log files to extract structured events. Community channels also exhibited familiar platform dynamics: conversations about feature fatigue and update churn mirrored the ways social platforms navigate feature overload.
Roles and governance
Volunteers took on roles such as data steward, scraper, and statistician. They enforced a modest governance model: a transparent changelog, version-coded datasets, and a central repository. This mirrors academic projects where data curators and PIs set policies for dataset access and reuse.
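One lightweight way to implement version-coded datasets is a checksum manifest written alongside each release. The sketch below is an assumption about how such a manifest could be built, not a description of the community's actual repository; the directory layout and file names are hypothetical.

```python
import datetime
import hashlib
import json
import pathlib

def make_manifest(dataset_dir: str, version: str) -> dict:
    """Build a version-coded manifest so a later analyst can verify
    they hold exactly the files the changelog describes."""
    entries = {}
    for path in sorted(pathlib.Path(dataset_dir).glob("*.csv")):
        entries[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return {
        "version": version,
        "created": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "files": entries,
    }

# Assumes a playtests/ directory of CSV extracts exists.
manifest = make_manifest("playtests", "v2.3.0")
pathlib.Path("playtests/MANIFEST.json").write_text(json.dumps(manifest, indent=2))
```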
Section 3 — Methods: How the community gathered data
Automated telemetry and client logging
Community contributors wrote parsers for Riftbound’s local logs and aggregated anonymized telemetry. These parsers converted raw events into schemas suitable for analysis (e.g., match_id, player_class, timestamp, outcome). Converting to a stable schema is crucial for reproducibility; without it, later analysts cannot compare datasets reliably.
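A minimal parser sketch makes the idea concrete. Riftbound's real log syntax is not public, so the line format, regex, and file names below are assumptions; the point is the conversion from free-form log lines into the stable schema (match_id, player_class, timestamp, outcome).

```python
import csv
import re

# Hypothetical line format, assumed for illustration:
#   [2024-06-01T19:04:12Z] MATCH_END match=m-0042 class=warden result=win
LINE_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\s+MATCH_END\s+"
    r"match=(?P<match_id>\S+)\s+class=(?P<player_class>\S+)\s+result=(?P<outcome>\S+)"
)

def parse_log(log_path: str, out_path: str) -> int:
    """Convert raw client-log lines into the community's stable schema,
    returning the number of matches extracted."""
    fields = ["match_id", "player_class", "timestamp", "outcome"]
    count = 0
    with open(log_path) as src, open(out_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fields)
        writer.writeheader()
        for line in src:
            m = LINE_RE.search(line)
            if m:
                writer.writerow({k: m.group(k) for k in fields})
                count += 1
    return count
```

Pinning the schema in one shared script means later contributors can re-run the conversion on archived logs and obtain directly comparable outputs.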
Structured playtests and A/B-style matches
Players ran structured matches that varied one factor at a time — a practical analog of A/B testing. They recorded player composition, target metrics, and environmental settings. Systematic replication of these small, single-variable experiments is what turns scattered match results into effect estimates the community can act on.
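For analysis, a two-proportion z-test is one simple way to judge whether an observed win-rate gap between two match configurations is larger than chance would explain. The sketch below, with hypothetical counts, assumes Python; communities often reach for heavier statistical tools, but the logic is the same.

```python
import math

def two_proportion_z(wins_a: int, n_a: int, wins_b: int, n_b: int) -> float:
    """Z statistic for the difference in win rate between arm A
    (e.g., item X at its old cooldown) and arm B (the changed value)."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# e.g., 540/1000 wins on the old cooldown vs. 480/1000 on the new one
z = two_proportion_z(540, 1000, 480, 1000)
print(f"z = {z:.2f}")  # |z| > 1.96 suggests significance at the 5% level
```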