Gonuguntla, N, Mandal, B and Puhan, NB (2019) Enhanced Deep Video Summarization Network. In: 30th British Machine Vision Conference, 9-12 Sep 2019, Cardiff.



Video summarization aims to provide an abstract view of the original video sequence by concatenating keyframes that represent the video's highlights. In this work, we propose an enhanced deep summarization network (EDSN) to summarize videos. We train the EDSN within a reinforcement learning framework, designing a novel reward function that accounts for the spatial and temporal features of the original video to be included in the summary. The reward function is formulated using spatial and temporal scores obtained for each frame of the video via temporal segment networks. During training, the reward function encourages summaries that include frames with high spatial and temporal scores, while the EDSN strives to earn higher rewards by learning to produce more diverse summaries. The method is fully unsupervised, since no labels are required during training. Extensive experiments on two benchmark datasets show that the proposed approach achieves state-of-the-art performance.
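The abstract describes a reward that combines per-frame spatial and temporal importance scores with a diversity term over the selected keyframes. A minimal sketch of such a reward is shown below; the function name, the equal weighting of the two terms, and the cosine-dissimilarity diversity measure are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def summary_reward(selected, spatial_scores, temporal_scores, features):
    """Hypothetical reward for a candidate summary (illustrative only).

    selected        : indices of the frames chosen for the summary
    spatial_scores  : per-frame spatial importance (e.g. from a TSN branch)
    temporal_scores : per-frame temporal importance (e.g. from a TSN branch)
    features        : per-frame feature vectors used for the diversity term
    """
    sel = np.asarray(selected)

    # Importance term: mean of the spatial and temporal scores of the
    # selected frames (equal weighting is an assumption).
    importance = 0.5 * (spatial_scores[sel].mean() + temporal_scores[sel].mean())

    # Diversity term: mean pairwise cosine dissimilarity among the
    # selected frames' feature vectors.
    f = features[sel]
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    sim = f @ f.T
    n = len(sel)
    if n < 2:
        diversity = 0.0
    else:
        diversity = (1.0 - sim[np.triu_indices(n, k=1)]).mean()

    return importance + diversity
```

In a REINFORCE-style setup, this scalar reward would be backpropagated through the log-probabilities of the network's frame-selection actions, pushing the policy toward summaries that score high on both terms.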

Item Type: Conference or Workshop Item (Paper)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Natural Sciences > School of Computing and Mathematics
Depositing User: Symplectic
Date Deposited: 25 Oct 2019 08:21
Last Modified: 25 Oct 2019 11:47
URI: https://eprints.keele.ac.uk/id/eprint/7089
