Today I read a paper titled “Segmentation, Indexing, and Visualization of Extended Instructional Videos”.
The abstract is:
We present a new method for segmenting, and a new user interface for indexing and visualizing, the semantic content of extended instructional videos.
Given a series of key frames from the video, we generate a condensed view of the data by clustering frames according to media type and visual similarities.
Using various visual filters, key frames are first assigned a media type (board, class, computer, illustration, podium, and sheet).
Key frames of media type board and sheet are then clustered based on contents via an algorithm with near-linear cost.
A novel user interface, the result of two user studies, displays related topics using icons linked topologically, allowing users to quickly locate semantically related portions of the video.
We analyze the accuracy of the segmentation tool on 17 instructional videos, each of which is from 75 to 150 minutes in duration (a total of 40 hours); the classification accuracy exceeds 96%.
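
To make the two-stage idea in the abstract concrete for myself, here is a minimal Python sketch. It is not the paper's method: the abstract does not say which visual filters or which clustering algorithm are used. I assume each key frame already carries a media type label and a normalized histogram (the `KeyFrame` class, the `features` field, the histogram-intersection `similarity`, and the `0.8` threshold are all hypothetical). The clustering is a simple single pass over time-ordered frames, which is linear in the number of frames and stands in for the "near-linear cost" algorithm mentioned above.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Media types listed in the abstract.
MEDIA_TYPES = ("board", "class", "computer", "illustration", "podium", "sheet")
# Only these two types are clustered by content, per the abstract.
CLUSTERED_TYPES = {"board", "sheet"}


@dataclass
class KeyFrame:
    index: int                 # position of the key frame in the video
    media_type: str            # assumed to be assigned already by some classifier
    features: Sequence[float]  # hypothetical normalized histogram


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Histogram intersection; an assumed stand-in for the paper's visual similarity."""
    return sum(min(x, y) for x, y in zip(a, b))


def cluster_sequential(frames: List[KeyFrame], threshold: float = 0.8) -> List[List[KeyFrame]]:
    """Group board/sheet key frames in one pass over the time-ordered list.

    A frame joins the most recent cluster if it has the same media type as
    that cluster's last frame and is similar enough to it; otherwise it
    starts a new cluster. Frames of other media types are skipped.
    """
    clusters: List[List[KeyFrame]] = []
    for frame in frames:
        if frame.media_type not in CLUSTERED_TYPES:
            continue
        if clusters:
            last = clusters[-1][-1]
            if (last.media_type == frame.media_type
                    and similarity(last.features, frame.features) >= threshold):
                clusters[-1].append(frame)
                continue
        clusters.append([frame])
    return clusters
```

Because each frame is compared only with the last frame of the most recent cluster, this sketch cannot link a board that reappears later in the lecture to its earlier cluster. The topologically linked icons described in the abstract suggest the real system does capture such relations, so that part would need something beyond this simplification.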