DeepAct: A Deep Neural Network Model for Activity Detection in Untrimmed Videos

Yeongtaek Song and Incheol Kim
Volume: 14, No: 1, Pages: 150-161, Year: 2018
DOI: 10.3745/JIPS.04.0059
Keywords: Activity Detection, Bi-directional LSTM, Deep Neural Networks, Untrimmed Video

Abstract
We propose a novel deep neural network model for detecting human activities in untrimmed videos. Activity detection in a video involves two steps: extracting features that are effective for recognizing human activities in a long untrimmed video, and then detecting activities from those extracted features. To extract rich features from video segments that capture the unique patterns of each activity, we employ two different convolutional neural network models, C3D and I-ResNet. To detect human activities from the sequence of extracted feature vectors, we use BLSTM, a bi-directional recurrent neural network model. Through experiments on ActivityNet 200, a large-scale benchmark dataset, we show that the proposed DeepAct model achieves high performance.
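The pipeline described in the abstract (per-segment CNN features from C3D and I-ResNet, concatenated and fed to a bidirectional recurrent model that scores each segment) can be sketched in outline. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the two CNN extractors are replaced by random stand-ins, the LSTM cells by a plain tanh recurrence, and all dimensions (4096-d C3D features, 2048-d I-ResNet features, hidden size 256, 200 activity classes) are assumed, not taken from the paper.

```python
import numpy as np

# Hypothetical dimensions (assumptions, not from the paper).
C3D_DIM, IRESNET_DIM, HIDDEN, N_CLASSES = 4096, 2048, 256, 200

rng = np.random.default_rng(0)

def extract_segment_features(n_segments):
    """Stand-in for the two CNN extractors: in the real model each video
    segment would pass through C3D and I-ResNet, and the two resulting
    feature vectors would be concatenated per segment."""
    c3d = rng.standard_normal((n_segments, C3D_DIM))
    iresnet = rng.standard_normal((n_segments, IRESNET_DIM))
    return np.concatenate([c3d, iresnet], axis=1)

def recurrent_pass(x, W, U, b):
    """One tanh recurrence over the segment sequence (a deliberate
    simplification of an LSTM cell). Returns hidden states per step."""
    h = np.zeros(W.shape[1])
    states = []
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ W + h @ U + b)
        states.append(h)
    return np.stack(states)

def bidirectional_scores(x):
    """Run the sequence forward and backward, concatenate the hidden
    states, and project to per-segment softmax class scores."""
    d = x.shape[1]
    Wf, Wb = (rng.standard_normal((d, HIDDEN)) * 0.01 for _ in range(2))
    Uf, Ub = (rng.standard_normal((HIDDEN, HIDDEN)) * 0.01 for _ in range(2))
    b = np.zeros(HIDDEN)
    fwd = recurrent_pass(x, Wf, Uf, b)
    bwd = recurrent_pass(x[::-1], Wb, Ub, b)[::-1]  # backward direction
    h = np.concatenate([fwd, bwd], axis=1)          # (T, 2*HIDDEN)
    Wo = rng.standard_normal((2 * HIDDEN, N_CLASSES)) * 0.01
    logits = h @ Wo
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

feats = extract_segment_features(n_segments=8)   # (8, 6144)
probs = bidirectional_scores(feats)              # (8, 200)
print(probs.shape)
```

Each row of `probs` is a per-segment distribution over activity classes; in a real detector, contiguous runs of high-scoring segments would then be merged into activity intervals.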



Cite this article
IEEE Style
Yeongtaek Song and Incheol Kim, "DeepAct: A Deep Neural Network Model for Activity Detection in Untrimmed Videos," Journal of Information Processing Systems, vol. 14, no. 1, pp. 150-161, 2018. DOI: 10.3745/JIPS.04.0059.

ACM Style
Yeongtaek Song and Incheol Kim, "DeepAct: A Deep Neural Network Model for Activity Detection in Untrimmed Videos," Journal of Information Processing Systems, 14, 1, (2018), 150-161. DOI: 10.3745/JIPS.04.0059.