Open Access
Large Model based Sequential Keyframe Extraction for Video Summarization
Author(s)
Kailong Tan,
Yuxiang Zhou,
Qianchen Xia,
Rui Liu,
Yong Chen
Publication year: 2024
Keyframe extraction aims to sum up a video's semantics with the minimum number of its frames. This paper puts forward a Large Model based Sequential Keyframe Extraction method for video summarization, dubbed LMSKE, which consists of three stages. First, we use the large model "TransNetV2" to cut the video into consecutive shots, and employ the large model "CLIP" to generate each frame's visual feature within each shot. Second, we develop an adaptive clustering algorithm to yield candidate keyframes for each shot, with each candidate keyframe being the frame located nearest to a cluster center. Third, we further reduce these candidate keyframes via redundancy elimination within each shot, and finally concatenate them in shot order to form the final sequential keyframes. To evaluate LMSKE, we curate a benchmark dataset and conduct extensive experiments, whose results show that LMSKE performs much better than several SOTA competitors, with an average F1 of 0.5311, an average fidelity of 0.8141, and an average compression ratio of 0.9922.
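
The second and third stages can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the adaptive clustering rule or the redundancy criterion, so the sketch assumes plain k-means with a fixed `k` per shot and a cosine-similarity threshold `sim_threshold`, both hypothetical parameters. It also assumes shot boundaries and per-frame feature vectors (which the paper obtains from TransNetV2 and CLIP) are already available as plain lists. All function names are illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(features, k, iters=20):
    """Plain k-means with deterministic, evenly spaced initial centers.
    Assumption: stands in for the paper's adaptive clustering."""
    n = len(features)
    k = min(k, n)
    centers = [list(features[(i * n) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            j = min(range(k), key=lambda c: sq_dist(f, centers[c]))
            groups[j].append(f)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def shot_keyframes(features, k=2, sim_threshold=0.95):
    """Return frame indices (relative to the shot) kept as keyframes."""
    centers = kmeans(features, k)
    # Candidate keyframe = the frame nearest to each cluster center.
    candidates = sorted({
        min(range(len(features)), key=lambda i: sq_dist(features[i], c))
        for c in centers
    })
    # Redundancy elimination (assumed rule): drop a candidate that is
    # too similar to a candidate already kept in the same shot.
    kept = []
    for i in candidates:
        if all(cosine(features[i], features[j]) < sim_threshold for j in kept):
            kept.append(i)
    return kept

def sequential_keyframes(shots, k=2, sim_threshold=0.95):
    """shots: list of (start_frame_index, per-frame features) in shot order.
    Returns global frame indices, concatenated in shot order."""
    result = []
    for start, features in shots:
        result.extend(start + i for i in shot_keyframes(features, k, sim_threshold))
    return result

# Toy example with 2-D features: shot 1 has two distinct groups of frames,
# shot 2 has two nearly identical groups.
shot1 = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
shot2 = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.1], [1.0, 1.1]]
keyframes = sequential_keyframes([(0, shot1), (6, shot2)])
print(keyframes)
```

In the toy input, the first shot keeps one keyframe per cluster, while the second shot's two candidates are near-duplicates, so the assumed similarity threshold keeps only one of them.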
Language(s): English
