|
Vu Digital's service uses predictive analytics, object recognition, and audio/image detection to extract metadata from video, including transcripts of the audio and time-tagged references to on-screen text and to appearances of persons or images of interest, such as objects or logos. The core technology splits a video into two components: the audio and the video frames. Both components are then processed using speech-to-text transcription, text extraction from images, facial recognition, and image recognition. The output is not only the transcript of the video and its image frames, but also metadata that is timestamped with frame references. For example, if a brand logo is found an hour and a half into a video, the metadata would record that time reference as 01:30. This approach makes possible video classification and clustering, search engine indexing, and content personalization, including targeted advertisements. {{Citation needed}} |
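The time-tagging step described above can be illustrated with a minimal sketch. It is not Vu Digital's implementation: the `Detection` record, the `frame_to_timestamp` and `tag_detections` helpers, the constant frame rate, and the sample logo hit are all hypothetical and chosen only to show how a frame reference might be turned into a timestamp.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One time-tagged metadata record (hypothetical structure)."""
    kind: str         # e.g. "logo", "face", "screen_text"
    label: str        # what was recognized
    frame_index: int  # frame in which it was found
    timestamp: str    # HH:MM:SS offset from the start of the video


def frame_to_timestamp(frame_index: int, fps: float) -> str:
    """Convert a frame index to an HH:MM:SS offset, assuming a constant frame rate."""
    total_seconds = int(frame_index / fps)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def tag_detections(raw_hits, fps):
    """Attach timestamps to (kind, label, frame_index) tuples from a recognizer."""
    return [
        Detection(kind, label, idx, frame_to_timestamp(idx, fps))
        for kind, label, idx in raw_hits
    ]


# Assumed input: a logo found at frame 135000 of a 25 fps video,
# which is 5400 seconds (an hour and a half) into the video.
hits = tag_detections([("logo", "ExampleBrand", 135_000)], fps=25.0)
print(hits[0].timestamp)  # prints 01:30:00
```

Keeping both the frame index and the formatted timestamp in each record lets downstream indexing or search use whichever reference is more convenient.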