Sparse Representation for Crowd Attributes Recognition
Human behavior analysis has become a critical area of research in the computer vision and artificial intelligence communities. In recent years, demand for video surveillance of crowd scenes has grown in applications such as safety, security, entertainment, and personal mental health. Although many methods have been proposed, they have limitations, and many issues remain unresolved. In this study, we propose a novel spatio-temporal sparse coding representation, based on sparse coded features learned with K-means singular value decomposition (K-SVD), for robust classification of crowd behaviors. Extensive experiments show that dictionary learning with sparsely coded features captures the vital structures of video scenes and yields more discriminative descriptors for classification than the conventional bag-of-visual-features. Relying on measurable features of crowd scenes and their motion characteristics, we can represent different attributes of the crowd scenes. Experiments on hundreds of video scenes were carried out on publicly available datasets. Quantitative evaluation indicates that the proposed model, paired with a linear Support Vector Machine (SVM), displays superior accuracy, precision, and recall in classifying human behaviors when compared with state-of-the-art methods. The proposed method is conceptually simple and easy to train, achieving an accuracy of 93.50%, a precision of 93.40%, and a recall of 95.96%.
Human behavior, Crowd scenes, Histogram of optical flow, Histogram of oriented gradients, Artificial intelligence, Sparse coding.
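To make the sparse coding idea in the abstract concrete, the following is a minimal sketch of how a single descriptor can be encoded over a fixed dictionary. It is an illustration under stated assumptions, not the pipeline used in this work: it uses plain matching pursuit (a simpler relative of the orthogonal matching pursuit step commonly paired with K-SVD), the dictionary is a hand-written toy rather than a learned one, and the function name `matching_pursuit` and all numeric values are hypothetical.

```python
import math


def matching_pursuit(signal, dictionary, n_nonzero):
    """Greedy sparse code of `signal` over unit-norm `dictionary` atoms.

    Returns (code, residual), where `code` has at most `n_nonzero`
    nonzero coefficients. Plain matching pursuit, used here only as an
    illustrative stand-in for the sparse coding stage.
    """
    residual = list(signal)
    code = [0.0] * len(dictionary)
    for _ in range(n_nonzero):
        # Correlation of every atom with the current residual.
        corrs = [sum(a * r for a, r in zip(atom, residual))
                 for atom in dictionary]
        # Pick the atom that explains the most residual energy.
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        if abs(corrs[best]) < 1e-12:
            break  # residual is already (numerically) zero
        code[best] += corrs[best]
        residual = [r - corrs[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return code, residual


# Toy dictionary with three unit-norm atoms in R^3 (hypothetical values).
s = 1 / math.sqrt(2)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [s, s, 0.0]]

# Stands in for a motion or appearance descriptor extracted from video.
descriptor = [2.0, 0.0, 0.0]
code, residual = matching_pursuit(descriptor, atoms, n_nonzero=1)
```

In a full system of the kind the abstract describes, the dictionary would be learned from training descriptors, and the resulting sparse codes would be pooled into a feature vector for the linear SVM.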