This project was mostly a test of capabilities, but it may be of interest to anyone exploring the video data or working on NLP.
The data can be downloaded here.
The data is split into 4 folders, one for each riotarchive node. Each node folder contains a set of location folders, and each location folder contains the JSON files that make up the model results.
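As a rough sketch of how that folder layout could be walked, the Python below yields every result file together with its node and location. It assumes the download has been unpacked so that the layout is root/node/location/file.json; the function name `iter_result_files` and the root folder name are made up for illustration and are not part of the dataset.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_result_files(root: str) -> Iterator[Tuple[str, str, Path]]:
    """Yield (node, location, json_path) for every result file under root.

    Assumption: files live exactly two folders deep, i.e.
    root/<node>/<location>/<file>.json. Files at other depths are skipped.
    """
    for path in sorted(Path(root).glob("*/*/*.json")):
        location_dir = path.parent
        node_dir = location_dir.parent
        yield node_dir.name, location_dir.name, path
```

Calling `iter_result_files("riot_results")` (a hypothetical folder name) and looping over the result gives one tuple per JSON file, which is handy for counting files per location before doing anything heavier.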
The structure of these results is simple: each file is a JSON object with 3 main keys.
• url - the location of the file on riotarchive
• text - the raw text output from the model
• segments - the text broken up into time intervals, with probabilities
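For anyone who wants to poke at a single file, here is a minimal Python sketch that loads one result and checks for the three keys listed above. It assumes `segments` is a JSON array; the field names inside each segment are not described here, so the sketch only counts segments rather than guessing at their contents. The names `load_result` and `summarize` are hypothetical helpers, not something shipped with the data.

```python
import json

# The three top-level keys described above.
EXPECTED_KEYS = ("url", "text", "segments")


def load_result(path):
    """Load one result file and check that the three documented keys exist."""
    with open(path, encoding="utf-8") as f:
        result = json.load(f)
    missing = [key for key in EXPECTED_KEYS if key not in result]
    if missing:
        raise KeyError(f"{path}: missing keys {missing}")
    return result


def summarize(result):
    """Return a small summary without assuming anything about segment fields."""
    return {
        "url": result["url"],
        "text_chars": len(result["text"]),
        # Assumption: segments is a list (JSON array) of per-interval entries.
        "segment_count": len(result["segments"]),
    }
```

Printing one entry of `result["segments"]` is probably the quickest way to see which timing and probability fields are actually present.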
If there are other datasets that you would like to see speech recognition applied to, please reach out to us.