Cracking BING and Beyond
Author(s) - Qiyang Zhao, Zhibin Liu
Publication year - 2014
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5244/c.28.23
Subject(s) - computer science , window (computing) , pascal (unit) , weighting , template , rectangle , set (abstract data type) , object (grammar) , artificial intelligence , mathematics , programming language , operating system , medicine , geometry , radiology
Generic objectness proposal aims to reduce the number of candidate windows for object detection tasks. The popular evaluation criterion for such methods is detection rate versus number of windows (DR-#WIN), where DR is the percentage of ground-truth objects covered by proposal windows. An object is considered “covered” by a window only if the strict PASCAL overlap criterion [3] is satisfied: the intersection of a proposal window and the object rectangle is not smaller than half of their union (we call this the “0.5-criterion” for short). Under the DR-#WIN evaluation framework, BING [2], published at CVPR 2014, obtains the best performance on the VOC 2007 test set: it recalls 96.2% of objects with only 1,000 proposal windows. More surprisingly, the method runs in real time.

The authors of BING suggest that, after being resized to a fixed size (8 × 8), almost all annotated rectangle regions share a common characteristic in their gradients [2]. This commonness is captured by a template W learned from training images with a linear SVM. In addition, the subtle differences between diverse width/height configurations are captured by a re-weighting model. BING therefore consists of two stages: calculating W in stage I, and learning the re-weighting model in stage II. Furthermore, BING uses bitwise operations to calculate the inner product of W and candidate windows, which improves efficiency.

To verify whether templates play a key role in BING, we designed several templates by hand to substitute for W. These templates are progressively less correlated with W, yet their performance on the VOC 2007 test set is very close (see Fig. 1a). Next, we discarded templates altogether and directly assigned uniformly random values as the stage I scores (we call this method RAND-SCORE). Surprisingly, the performance of RAND-SCORE is also very close to that of BING, as shown in Fig. 1b. Clearly, these templates are not as significant as suggested in [2].
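The 0.5-criterion and the DR measure described above can be illustrated with a minimal sketch. The box layout `(x_min, y_min, x_max, y_max)`, the continuous-coordinate area computation, and the function names `iou` and `detection_rate` are assumptions made for illustration; they are not taken from the paper or from the PASCAL development kit, which uses its own pixel conventions.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_rate(objects: List[Box], proposals: List[Box],
                   thresh: float = 0.5) -> float:
    """Fraction of ground-truth objects covered by at least one proposal.

    An object counts as covered when some proposal reaches IoU >= thresh,
    which corresponds to the 0.5-criterion for thresh = 0.5.
    """
    if not objects:
        return 0.0
    covered = sum(
        1 for obj in objects
        if any(iou(obj, p) >= thresh for p in proposals)
    )
    return covered / len(objects)
```

Plotting `detection_rate` against the number of proposals kept per image would give a DR-#WIN curve of the kind the text refers to.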
What, then, makes BING perform so well? To gain deeper insight, we carried out a theoretical analysis from the viewpoint of combinatorial geometry. We try to construct a small set of windows that “covers” all legal rectangles (we call it a full cover set). This is an atypical covering problem in combinatorial geometry [1]. We propose four lemmas to solve it in the full paper. In conclusion, for an image of width $M$ and height $N$, we can use $s(i, j)$ windows of width $2^i \sqrt{2}$ and height $2^j \sqrt{2}$ to cover all rectangle regions with $2^i \le w \le 2^{i+1}$ and $2^j \le h \le 2^{j+1}$, where

$$s(i, j) = \left\lceil \frac{M - 2^i}{\left(1 - \frac{\sqrt{2}}{2}\right) \cdot 2^i \cdot \sqrt{2}} \right\rceil \cdot \left\lceil \frac{N - 2^j}{\left(1 - \frac{\sqrt{2}}{2}\right) \cdot 2^j \cdot \sqrt{2}} \right\rceil.$$

Suppose the image size
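As a rough numerical illustration of the window count s(i, j), the sketch below assumes the formula is a product of two ceilings, each of the form (L − 2^k) divided by (1 − √2/2) · 2^k · √2, with L the image width or height. That reading of the formula, the helper names `windows_per_axis` and `s`, and the example image size are assumptions for illustration only, not the paper's own code.

```python
import math


def windows_per_axis(length: float, k: int) -> int:
    """Ceiling term of s(i, j) along one axis.

    Assumes length > 2**k, i.e. the image is larger than the smallest
    rectangle side in the scale band [2**k, 2**(k + 1)].
    """
    base = 2 ** k
    # (1 - sqrt(2)/2) * 2^k * sqrt(2), which simplifies to (sqrt(2) - 1) * 2^k
    step = (1 - math.sqrt(2) / 2) * base * math.sqrt(2)
    return math.ceil((length - base) / step)


def s(M: float, N: float, i: int, j: int) -> int:
    """Window count for scale band (i, j) on an M x N image."""
    return windows_per_axis(M, i) * windows_per_axis(N, j)


# Example with a hypothetical 512 x 256 image and bands i = 8, j = 7.
count = s(512, 256, 8, 7)
```

Under this reading, the count for a band depends only on the ratio between the image side and 2^k, so coarser bands need far fewer windows than finer ones.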
