The Second Perception Test Challenge

Overview

Following the successful 2023 iteration, we organise the second Perception Test Challenge with the goal of benchmarking multimodal perception models on the Perception Test (blog, github) - a diagnostic benchmark created by Google DeepMind to comprehensively probe the abilities of multimodal models across:

video, audio, and text modalities
four skill areas: Memory, Abstraction, Physics, Semantics
four types of reasoning: Descriptive, Explanatory, Predictive, Counterfactual
six computational tasks: multiple-choice video-QA, grounded video-QA, object tracking, point tracking, action localisation, sound localisation

You can try the Perception Test yourself here.

Check the Perception Test github repo for details about the data and annotations format, baselines, and metrics.

Check the Computer Perception workshop at ECCV2022 for recorded talks and slides introducing the Perception Test benchmark.

Check the First Perception Test challenge for details of the previous challenge.

Challenge

Details about the challenge tracks are coming soon. Prizes totalling 15k EUR are available.

Timeline

May 15th - June 15th, 2024: Challenge server goes live with data from the validation split

End of June, 2024: Held-out test split released

September 14th, 2024: Deadline for submissions

September 21st, 2024: Winners announced

Early October, 2024: Challenge-workshop at ECCV2024, Milan

Workshop at ECCV 2024

Workshop

Provisional agenda

Speakers

Abhinav Gupta

Josh Tenenbaum

Organizers

Joe Heyward

Joao Carreira

Dima Damen

Andrew Zisserman

Viorica Pătrăucean