VideoMCC: a New Benchmark for Video Comprehension

Dataset description

To assess models trained for the VideoMCC task, we created a large-scale dataset comparable in size to the largest available datasets. The VideoMCC dataset was built from TV news show videos donated by the Internet Archive, using a semi-automatic procedure described in detail in the paper.

The VideoMCC dataset is available free of charge to research groups, individual researchers and non-profit organisations for research purposes. For commercial use, please contact the Archive at

VideoMCC dataset description:

For dataset specifications, please refer to the README file.