top of page
Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection
Author
Jiao, Y. et al.
2018
|
IEEE Transactions on Multimedia
Publication type
Artículo de revista
Language
Inglés
Keywords
Summary
The video highlight detection task is to localize key elements (moments of user's major or special interest) in a video. Most of existing highlight detection approaches extract features from the video segment as a whole without considering the difference of local features both temporally and spatially. Due to the complexity of video content, this kind of mixed features will impact the final highlight prediction. In temporal extent, not all frames are worth watching because some of them only contain the background of the environment without human or other moving objects. In spatial extent, it is similar that not all regions in each frame are highlights especially when there are lots of clutters in the background. To solve the above problem, we propose a novel three-dimensional (3-D) (spatial+temporal) attention model that can automatically localize the key elements in a video without any extra supervised annotations. Specifically, the proposed attention model produces attention weights of local regions along both the spatial and temporal dimensions of the video segment
Url
bottom of page