
About

I am a PhD researcher in Computer Vision at the University of Bristol, where I am supervised by Prof. Dima Damen and am a member of the MaVi group. My research focuses on multi-modal video understanding, particularly in egocentric video, including audio-visual deep learning, action recognition/detection, and 3D eye-gaze interaction priming. I am currently a PhD intern with the Visual Representation Learning team at Naver Labs Europe.

Prior to my PhD, I earned a First Class Honours MEng in Computer Science from the University of Bristol, where my dissertation on "Video GANs for Human-Object Interactions" was highly graded. Alongside research, I've gained teaching experience across multiple undergraduate modules, contributing to both coursework design and lab-based support.

I've worked across a range of projects involving large-scale multi-modal datasets and model development, contributing to research outputs such as HD-EPIC, EPIC-Sounds, TIM, and OSNOM. My hands-on experience spans dataset construction, baseline benchmarking, multi-modal model design, and open-source codebase development.

My technical strengths lie in deep learning, computer vision, and audio-visual modelling, with strong practical proficiency in Python (PyTorch) and capabilities with C++ and JavaScript.

Funded by the Engineering and Physical Sciences Research Council (EPSRC).

Email: jacob.chalk@bristol.ac.uk


News


Research

Current list of all research projects:

HD-EPIC: A Highly-Detailed Egocentric Video Dataset
Toby Perrett*, Ahmad Darkhalil*, Saptarshi Sinha*, Omar Emara*, Sam Pollard*, Kranti Parida*, Kaiting Liu*, Prajwal Gatti*, Siddhant Bansal*, Kevin Flanagan*, Jacob Chalk*, Zhifan Zhu*, Rhodri Guerrier*, Fahd Abdelazim*, Bin Zhu, Davide Moltisanti, Michael Wray, Hazel Doughty, Dima Damen
*: Equal Contribution
Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Webpage] [arXiv] [Code]

Spatial Cognition from Egocentric Video: Out of Sight, Not Out of Mind
Chiara Plizzari, Shubham Goel, Toby Perrett, Jacob Chalk, Angjoo Kanazawa, Dima Damen
International Conference on 3D Vision (3DV), 2025
[Webpage] [arXiv] [Code]

TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk*, Jaesung Huh*, Evangelos Kazakos, Andrew Zisserman, Dima Damen
*: Equal Contribution
Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Webpage] [arXiv] [Code]

EPIC-Sounds: A Large-scale Dataset of Actions That Sound
Jaesung Huh*, Jacob Chalk*, Evangelos Kazakos, Dima Damen, Andrew Zisserman
*: Equal Contribution
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
[Webpage] [arXiv] [Code]


Teaching


Miscellaneous

Presentations
Conference Reviewer
Journal Reviewer
Honours and Awards