Clustering and Retrieval of Video Content Using Speech and Text Information
Video lectures are becoming increasingly popular because of their advantages over classroom learning, and many institutes and organizations now use them for teaching. As a result, an enormous amount of material is available in video lecture form. Finding the right video, and the right information within it, in such a large collection is a tedious task. In this paper we introduce techniques for automatically extracting information from video files and storing it as metadata for those files. Text shown in the video is extracted with an OCR (Optical Character Recognition) tool, and speech information is extracted with an ASR (Automatic Speech Recognition) tool. First, we segment and classify the video files to extract keyframes. The OCR and ASR tools are then used to collect the information, which is stored as metadata for the file. Finally, we provide efficient browsing of these videos using clustering and ontology concepts.
OCR, ASR, Content retrieval, Tesseract OCR
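
The final stages of the pipeline described in the abstract (storing OCR and ASR text as metadata, then clustering and retrieving videos) can be illustrated with a minimal sketch. It rests on several assumptions: the OCR and ASR outputs are hard-coded strings standing in for what the real tools would return, the function names (build_metadata, cluster, search) are hypothetical, and the greedy cosine-similarity clustering is a simple stand-in chosen for illustration, not necessarily the clustering method used in this work. The ontology step is not modeled.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_metadata(ocr_text, asr_text):
    """Merge the OCR and ASR text of one video into a term-frequency bag."""
    return Counter(tokenize(ocr_text) + tokenize(asr_text))


def cosine(a, b):
    """Cosine similarity between two term-frequency bags."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(metadata, threshold=0.3):
    """Greedy single-pass clustering (an assumed stand-in technique).

    Each video joins the first cluster whose first member is similar
    enough to it; otherwise it starts a new cluster.
    """
    clusters = []
    for video_id, bag in metadata.items():
        for members in clusters:
            if cosine(bag, metadata[members[0]]) >= threshold:
                members.append(video_id)
                break
        else:
            clusters.append([video_id])
    return clusters


def search(metadata, query):
    """Return the ids of matching videos, most similar to the query first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, bag), video_id) for video_id, bag in metadata.items()]
    return [video_id for score, video_id in sorted(scored, reverse=True) if score > 0]


# Illustrative data only: these strings are placeholders for real OCR/ASR output.
videos = {
    "lec1": build_metadata("Sorting algorithms quicksort",
                           "today we discuss quicksort and merge sort algorithms"),
    "lec2": build_metadata("Sorting merge sort",
                           "merge sort is a stable sorting algorithm"),
    "lec3": build_metadata("Photosynthesis in plants",
                           "plants convert light into chemical energy"),
}

print(cluster(videos))
print(search(videos, "merge sort"))
```

In this toy collection the two sorting lectures share enough vocabulary to fall into one cluster, while the biology lecture forms its own. A real system would likely replace the raw term counts with a weighted representation and the threshold with a tuned value.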