Speech Recognition

This project uses Whisper for speech recognition on videos from left-wing riots and protests. Whisper offers several models; the results here were produced with the "small.en" model.

This project was mostly a test of capabilities, but it may be of interest to those exploring the video data or working on natural language processing (NLP).

The data can be downloaded here.

The data is split into 4 folders, one for each riotarchive node. Each node folder contains a set of location folders, and each location folder contains a number of JSON files holding the model results.
The structure of these results is simple: each file is a JSON object with 3 main keys.
•url - the location of the file on riotarchive
•text - the raw text output from the model
•segments - the text broken up into time intervals, with probabilities
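
As a rough illustration of the layout described above, the following Python sketch walks a node/location/file tree and prints the segments of each result. It is a minimal sketch under stated assumptions, not the project's own tooling: the function names and the root folder name are hypothetical, and the "start", "end" and "text" fields inside each segment are assumed to follow Whisper's usual output format, which this page does not confirm. Missing fields are tolerated rather than treated as errors.

```python
import json
from pathlib import Path


def iter_results(root):
    """Yield (node, location, record) for each JSON file under root/<node>/<location>/.

    Assumes the layout described above: node folders, then location folders,
    then JSON result files. The function name is hypothetical.
    """
    for path in sorted(Path(root).glob("*/*/*.json")):
        with path.open(encoding="utf-8") as fh:
            record = json.load(fh)
        # The last three path components are node, location and file name.
        yield path.parts[-3], path.parts[-2], record


def format_segments(record):
    """Return one readable line per segment of a result record.

    The "start", "end" and "text" keys are an assumption based on Whisper's
    usual segment format; absent keys fall back to None or an empty string.
    """
    lines = []
    for seg in record.get("segments", []):
        start = seg.get("start")
        end = seg.get("end")
        text = seg.get("text", "").strip()
        lines.append(f"[{start} - {end}] {text}")
    return lines


if __name__ == "__main__":
    # "whisper_results" is a placeholder for wherever the download was unpacked.
    for node, location, record in iter_results("whisper_results"):
        print(node, location, record.get("url"))
        for line in format_segments(record):
            print("  " + line)
```

Reading through the "segments" key rather than "text" keeps the timing information attached to each phrase, which is useful when matching transcript lines back to the video.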


If there are other datasets that may be of interest to apply speech recognition to, please reach out to us.