Title VSRNet: End-to-end video segment retrieval with text query
Authors Sun, Xiao
Long, Xiang
He, Dongliang
Wen, Shilei
Lian, Zhouhui
Affiliation Peking Univ, Wangxuan Inst Comp Technol, Beijing 100871, Peoples R China
Meituan Inc, Beijing 100102, Peoples R China
Baidu Inc, Dept Comp Vis VIS Technol, Beijing 100085, Peoples R China
Baidu Inc, Beijing 100085, Peoples R China
Issue Date Nov-2021
Publisher PATTERN RECOGNITION
Abstract Users are sometimes interested in specific segments of an untrimmed video when using a video search engine. To address this demand, we explore a novel research topic: text-query-based video segment retrieval (VSR). Unlike conventional video retrieval or the localization of a text description within a single video, VSR requires retrieving the most relevant video from a large collection and localizing the start and end timestamps of the segment in that video that best matches the text query. A direct solution is to perform video-level matching first and then apply description localization to the resulting video candidates. Such two-stage methods cannot exploit the complementary information of each stage and are time-consuming at inference. In this paper, we propose VSRNet, an end-to-end framework that efficiently retrieves video at segment granularity with two branches. In the first branch, individual videos and texts are mapped to a common space for stand-alone ranking. In the second branch, we propose a supervised text-aligned attention mechanism and calculate the response of every frame to the text query; frames with high scores are then aggregated into segment proposals. Extensive experiments on ActivityNet Captions and DiDeMo verify the effectiveness of our method and show that our solution significantly outperforms the state of the art. (C) 2021 Elsevier Ltd. All rights reserved.
URI http://hdl.handle.net/20.500.11897/623853
ISSN 0031-3203
DOI 10.1016/j.patcog.2021.108027
Indexed EI
SCI(E)
Appears in Collections: Unclaimed

Files in This Work
There are no files associated with this item.

License: See PKU IR operational policies.