
Blind video quality assessment via spatiotemporal statistical analysis of adaptive cube size 3D‐DCT coefficients
Author(s) -
Cemiloglu Enes,
Yilmaz Gokce Nur
Publication year - 2020
Publication title -
IET Image Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.401
H-Index - 45
eISSN - 1751-9667
pISSN - 1751-9659
DOI - 10.1049/iet-ipr.2019.0275
Subject(s) - discrete cosine transform, computer science, cube (algebra), artificial intelligence, quality assessment, video quality, computer vision, pattern recognition (psychology), statistical analysis, quality (philosophy), statistics, mathematics, image (mathematics), medicine, metric (unit), philosophy, operations management, external quality assessment, epistemology, pathology, combinatorics, economics
There is an urgent need for a robust video quality assessment (VQA) model that can efficiently evaluate the quality of video content across different distortion and content types when no reference video is available. To address this need, this study proposes a novel no‐reference (NR) model that relies on the spatiotemporal statistics of the distorted video in the three‐dimensional discrete cosine transform (3D‐DCT) domain. As the first contribution, the video content is adaptively segmented into cubes of different sizes and spatiotemporal contents, in line with human visual system (HVS) properties. The 3D‐DCT is then applied to these cubes. As the second contribution, several efficient features (i.e. spectral behaviour, energy variation, distances between spatiotemporal frequency bands, and DC variation) associated with the contents of these cubes are extracted. These features are then related to the subjective scores of the EPFL‐PoliMi video database using linear regression analysis to build the model. The evaluation results show that the proposed model, unlike many top‐performing NR‐VQA models (e.g. V‐BLIINDS, VIIDEO, and SSEQ), achieves high and stable performance across videos with different contents and distortions.
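
The cube transform described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a fixed cube size (the adaptive, HVS-driven sizing rule is not given in the abstract), uses an orthonormal separable DCT-II, and computes only two simple illustrative quantities (the DC coefficient and the total AC energy) in place of the paper's actual features. The function names dct_1d, dct_3d, and cube_features are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_3d(cube):
    """Separable 3D DCT-II of cube[t][y][x], given as nested lists."""
    T, H, W = len(cube), len(cube[0]), len(cube[0][0])
    # Pass 1: transform every row (x axis).
    a = [[dct_1d(cube[t][y]) for y in range(H)] for t in range(T)]
    # Pass 2: transform every column (y axis).
    b = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for x in range(W):
            col = dct_1d([a[t][y][x] for y in range(H)])
            for y in range(H):
                b[t][y][x] = col[y]
    # Pass 3: transform every temporal tube (t axis).
    c = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for y in range(H):
        for x in range(W):
            tube = dct_1d([b[t][y][x] for t in range(T)])
            for t in range(T):
                c[t][y][x] = tube[t]
    return c


def cube_features(coeffs):
    """Illustrative features only: (DC coefficient, total AC energy)."""
    dc = coeffs[0][0][0]
    total = sum(v * v for plane in coeffs for row in plane for v in row)
    return dc, total - dc * dc


# Usage: a 2x2x2 cube (2 frames of 2x2 pixels) with some temporal change.
example_cube = [
    [[10.0, 12.0], [11.0, 13.0]],
    [[10.0, 15.0], [11.0, 16.0]],
]
dc_value, ac_energy = cube_features(dct_3d(example_cube))
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared pixel values, so the AC energy is the signal energy not carried by the cube's mean. In a full model along the lines of the abstract, per-cube features of this kind would be pooled over the video and fitted to subjective scores by linear regression; that step is omitted here.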