GameVibe: A multimodal affective game corpus

What makes people interested in games? What if we could capture aspects of a viewer's experience just by looking at the gameplay screen? We could then learn to design better games and accelerate research towards general AI models of player experience.
We present GameVibe, a novel dataset containing human-annotated labels of viewer engagement for video clips of 30 different first-person shooter games.

Why FPS games?

First-person shooter (FPS) games are hugely popular on live-streaming platforms such as Twitch, attracting hundreds of millions of viewers every month. This makes such games an ideal choice for content creators, streamers and game designers. The FPS genre has been evolving for over 30 years, so FPS games are also diverse in visual style, content, and game modes; this diversity poses a significant challenge for general AI research. These factors make FPS games an attractive platform for our research, motivating us to collect gameplay videos from 30 popular games published between 1991 and 2023. In total, the GameVibe dataset contains 120 videos, including in-game audio, of approximately 1 minute each.

Examples of games falling into the four categories defined in our corpus, based on graphical style (stylized vs. realistic) and game era (retro vs. modern).

What type of data do we provide?

We provide the original unprocessed video clips in the repository, including the in-game audio. We also include latent representations extracted from pre-trained foundation models: Video Masked Autoencoders and Masked Video Distillation representations for visuals, as well as BEATS and MFCC representations for audio. The engagement labels were collected using the PAGAN platform and take the form of unbounded, time-continuous signals. We provide the raw signals, as well as various stages of signal processing (aggregation, normalization, and outlier filtering) in the data repository.
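As a rough illustration of how one might load and combine these modalities, here is a minimal Python sketch. The file names, array shapes, and column names below are assumptions for illustration only; consult the repository's documentation for the actual layout.

```python
# Minimal sketch of loading a clip's precomputed latents and raw annotation
# traces. All file names, shapes, and column names here are assumed for
# illustration; the repository's actual layout may differ.
import numpy as np
import pandas as pd

# Visual and audio latents: one vector per time window of the clip (assumed shapes).
visual = np.load("latents/visual/clip_001_videomae.npy")   # (T, D_v), hypothetical path
audio = np.load("latents/audio/clip_001_beats.npy")        # (T, D_a), hypothetical path

# Raw PAGAN traces: unbounded, time-continuous engagement values per annotator.
traces = pd.read_csv("annotations/raw/clip_001.csv")       # hypothetical columns:
#   participant_id, timestamp_ms, engagement_value

# Align the two modalities on a common number of windows before fusing them.
T = min(len(visual), len(audio))
fused = np.concatenate([visual[:T], audio[:T]], axis=1)     # (T, D_v + D_a)
```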

Example annotation provided through PAGAN using the RankTrace annotation
tool. The top portion depicts a clip from one of the games in the corpus whilst
the bottom shows a live render of the annotation trace provided so far.

How do we reliably collect this data?

Reliably collecting data on human emotion is one of the fundamental challenges in AI and human-feedback research. Emotions are subjective, which makes it difficult to assess the quality of the data we collect. Are disagreements between annotators exposed to the same stimulus due to actual differences in their emotional responses, to noise in the annotation tool, or to a misunderstanding of the annotation process? With this challenge in mind, we developed a quality-assurance pipeline that assesses the reliability of individual participants and, in turn, the quality of the collected data. To validate this approach, we conducted a study comparing the reliability of crowdsourced labels against data collected in the lab. This helped us ensure that our labels were as representative of the participants’ internal state as possible.

To ensure there is enough data on each video to approximate the ground truth accurately, every clip in the dataset is annotated by 5 human participants. As part of our repository, we provide scripts to process these labels and extract the ground truth signals for each clip, including methods for removing outliers based on their degree of disagreement with other participants. This flexibility allows the dataset to cater to research projects which want to model the subjectivity of the labels, as well as those which solely seek to model the consensus of participants.
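The sketch below illustrates the general idea behind such processing: per-trace normalization of the unbounded RankTrace signals, removal of annotators whose traces disagree strongly with the rest, and averaging of the remaining traces into a consensus signal. The agreement measure and threshold here are illustrative assumptions; the repository scripts define the actual pipeline.

```python
# Hedged sketch of agreement-based outlier filtering and consensus extraction.
# Assumes all traces have been resampled to a common timeline; the correlation
# measure and the 0.2 threshold are illustrative choices, not GameVibe's own.
import numpy as np

def normalize(trace: np.ndarray) -> np.ndarray:
    """Min-max normalize an unbounded annotation trace to [0, 1]."""
    lo, hi = trace.min(), trace.max()
    return (trace - lo) / (hi - lo) if hi > lo else np.zeros_like(trace)

def consensus(traces: list[np.ndarray], min_mean_corr: float = 0.2) -> np.ndarray:
    """Drop annotators who disagree with the rest, then average the remainder."""
    traces = [normalize(t) for t in traces]
    kept = []
    for i, t in enumerate(traces):
        others = [u for j, u in enumerate(traces) if j != i]
        mean_corr = np.mean([np.corrcoef(t, u)[0, 1] for u in others])
        if mean_corr >= min_mean_corr:
            kept.append(t)
    return np.mean(kept if kept else traces, axis=0)
```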

Visualization of GameVibe's annotation processing pipeline and its effect on the session data.

Who is this for?

Using the latent representations provided in the repository, our initial studies show that affect models can reliably predict viewer engagement for unseen clips and unseen annotators when trained on clips of the same game. This can help game developers and content creators assess the quality of new content. One key strength of the dataset is that it comprises 30 different games, enabling studies on how well multimodal affect models generalize to predicting labels in unseen games. We believe this can help progress towards the ultimate goal of general AI models of human affect.
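As an illustration of how the dataset supports such generalizability studies, the sketch below evaluates a simple engagement classifier under a leave-one-game-out protocol. The feature and label files, and the binarized high/low engagement labels, are hypothetical stand-ins rather than GameVibe's own baselines.

```python
# Illustrative leave-one-game-out evaluation of an engagement classifier.
# The .npy files and the binarized labels are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import LeaveOneGroupOut

X = np.load("features.npy")   # (N, D) fused audiovisual latents per window, assumed
y = np.load("labels.npy")     # (N,) e.g. high/low engagement per window, assumed
games = np.load("games.npy")  # (N,) game id of each sample, assumed

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=games):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
print(f"Mean accuracy on held-out games: {np.mean(scores):.3f}")
```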

Where to find it?

The GameVibe corpus, including the audiovisual stimuli and the human-annotated labels, can be found at https://doi.org/10.17605/OSF.IO/P4NGX. The repository also contains processing scripts and examples of latent representations extracted using well-established foundation models such as VideoMAEv2 (for video) and BEATS (for audio), as described above.
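For programmatic access, public OSF projects can be mirrored with the third-party osfclient package; the sketch below is one possible way to download the project locally. The project id p4ngx comes from the DOI above, and the API usage reflects our understanding of osfclient rather than anything GameVibe-specific.

```python
# Hedged sketch: mirroring the public GameVibe OSF project with osfclient
# (pip install osfclient). Local paths are illustrative; the equivalent CLI
# command should be roughly `osf -p p4ngx clone gamevibe`.
import os
from osfclient.api import OSF

project = OSF().project("p4ngx")          # anonymous access works for public projects
for storage in project.storages:
    for remote_file in storage.files:
        local_path = os.path.join("gamevibe", remote_file.path.lstrip("/"))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with open(local_path, "wb") as fp:
            remote_file.write_to(fp)
```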
