To assess models trained for the VideoMCC task, we created a large-scale dataset comparable in size to the largest available datasets. The VideoMCC dataset was created from TV news show videos donated by the Internet Archive, using a semi-automatic procedure described in detail in the paper.
The VideoMCC dataset is available free of charge to research groups, individual researchers, and non-profit organisations for research purposes. For commercial use, please contact the Archive at firstname.lastname@example.org.
VideoMCC dataset description: