DeepBlue Technology | Two Decades of Object Detection
(3) Fast RCNN
In 2015, R. Girshick proposed the Fast RCNN detector [19], a further improvement over R-CNN and SPPNet. Fast RCNN makes it possible to train the detector and the bounding-box regressor simultaneously under the same network configuration. On the VOC07 dataset, Fast RCNN raised the mAP from 58.5% (R-CNN) to 70.0%, with a detection speed more than 200 times that of R-CNN.
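How this joint training works can be seen from the loss: Fast RCNN optimizes a single multi-task objective that couples softmax classification of each region of interest (RoI) with smooth-L1 regression of its box. Below is a minimal PyTorch-style sketch of such a loss under assumed tensor shapes; it illustrates the idea from [19] rather than reproducing the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fast_rcnn_loss(cls_scores, bbox_deltas, labels, reg_targets, lam=1.0):
    """Multi-task loss: classification + box regression in one backward pass.

    cls_scores  -- (N, K+1) class logits per RoI (index 0 = background)
    bbox_deltas -- (N, K, 4) per-class box regression outputs
    labels      -- (N,) ground-truth class indices, 0 for background
    reg_targets -- (N, 4) regression targets for each RoI's true class
    """
    cls_loss = F.cross_entropy(cls_scores, labels)
    fg = labels > 0  # the regression term is only applied to foreground RoIs
    if fg.any():
        deltas = bbox_deltas[fg, labels[fg] - 1]  # deltas of each RoI's true class
        reg_loss = F.smooth_l1_loss(deltas, reg_targets[fg])
    else:
        reg_loss = bbox_deltas.sum() * 0.0  # keeps the graph when no foreground
    return cls_loss + lam * reg_loss
```

Because both terms backpropagate through the same shared features, one optimization step updates the classifier and the box regressor together, which is what made R-CNN's separate training stages unnecessary.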
Although Fast RCNN successfully combines the advantages of R-CNN and SPPNet, its detection speed is still limited by proposal detection. A question then naturally arises: "Can we generate object proposals with a CNN model?" Faster R-CNN, which came later, answered this question.
(4) Faster RCNN
In 2015, shortly after Fast RCNN, S. Ren et al. proposed the Faster RCNN detector [20]. Faster RCNN is the first end-to-end and the first near-real-time deep learning detector (COCO mAP@.5=42.7%, COCO mAP@[.5,.95]=21.9%, VOC07 mAP=73.2%, VOC12 mAP=70.4%). Its main contribution is the introduction of the Region Proposal Network (RPN), which makes region proposals nearly cost-free. From RCNN to Faster RCNN, most of the individual blocks of an object detection system, such as proposal detection, feature extraction, and bounding-box regression, have gradually been integrated into one unified, end-to-end learning framework.
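The RPN itself is a small fully convolutional head: a 3x3 convolution slides over the shared feature map, and two sibling 1x1 convolutions predict an objectness score and four box offsets for each of k reference anchors at every location. The PyTorch sketch below follows that structure; the layer widths and the single-logit objectness are our own simplifying assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Slides a small network over the feature map shared with the detector."""

    def __init__(self, in_channels=512, num_anchors=9):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 512, kernel_size=3, padding=1)
        self.objectness = nn.Conv2d(512, num_anchors, kernel_size=1)       # object vs. background
        self.bbox_deltas = nn.Conv2d(512, num_anchors * 4, kernel_size=1)  # box offsets per anchor

    def forward(self, features):
        t = torch.relu(self.conv(features))
        return self.objectness(t), self.bbox_deltas(t)

# For a (1, 512, 38, 50) feature map: scores (1, 9, 38, 50), deltas (1, 36, 38, 50).
scores, deltas = RPNHead()(torch.randn(1, 512, 38, 50))
```

Since the head reuses the backbone features that detection computes anyway, proposals cost only these few extra convolutions, which is why they are described as nearly free.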
Although Faster RCNN broke through the speed bottleneck of Fast RCNN, computational redundancy remains in the subsequent detection stage. A variety of improvements were later proposed, including R-FCN and Light-Head RCNN.
(5) Feature Pyramid Networks(FPN)
In 2017, T.-Y. Lin et al. proposed the Feature Pyramid Network (FPN) [21] on top of Faster RCNN. Before FPN, most deep-learning-based detectors ran detection only on the top layer of the network. Although features in a CNN's deeper layers are beneficial for category recognition, they are not conducive to localizing objects. To this end, a top-down architecture with lateral connections was developed to build high-level semantics at all scales. Since a CNN naturally forms a feature pyramid through its forward propagation, FPN shows great advances in detecting objects of various scales. Using an FPN backbone in a basic Faster RCNN system achieves state-of-the-art single-model results on the MS-COCO dataset without bells and whistles (COCO mAP@.5=59.1%, COCO mAP@[.5,.95]=36.2%). FPN has since become a basic building block of many of the latest detectors.
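The core of FPN is that top-down pathway with lateral connections: 1x1 convolutions project each backbone stage to a common channel width, coarser maps are upsampled and added to finer ones, and a 3x3 convolution smooths each merged map. A minimal PyTorch sketch follows; the channel sizes assume a ResNet-50-style backbone, and the code is illustrative rather than the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])
        self.smooth = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
             for _ in in_channels])

    def forward(self, feats):  # feats = [C2, C3, C4, C5], fine to coarse
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        # Top-down pathway: upsample the coarser map and add it to the finer one.
        for i in range(len(laterals) - 2, -1, -1):
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [conv(l) for conv, l in zip(self.smooth, laterals)]  # [P2..P5]
```

Each output level then carries high-level semantics at its own resolution, so a detection head can be run on every level instead of only on the topmost feature map.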
Single-Stage Detectors Based on Convolutional Neural Networks
Figure: the development of single-stage detection and the structures of the various detectors [2]
(1) You Only Look Once (YOLO)
YOLO was proposed by J. Redmon et al. in 2015 [22]. It is the first single-stage detector of the deep learning era. YOLO is extremely fast: a fast version of YOLO runs at 155 fps with VOC07 mAP=52.7%, while its enhanced version runs at 45 fps with VOC07 mAP=63.4% and VOC12 mAP=57.9%. YOLO stands for "You Only Look Once". As the name suggests, the authors completely abandoned the earlier "proposal detection + verification" paradigm. Instead, YOLO follows a totally different design: apply a single neural network to the full image. The network divides the image into regions and predicts bounding boxes and class probabilities for every region simultaneously. A series of improvements were later built on YOLO, including replacing FPN with a Path Aggregation Network (PAN) and defining new loss functions, leading to the v2 and v3 versions by J. Redmon and the v4 version by A. Bochkovskiy et al. (as of July 2020, when this article was written, Ultralytics had also released "YOLO v5", which was not officially recognized); these further improved detection accuracy while keeping a high detection speed.
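To make the "single look" concrete, the sketch below decodes a YOLO-style output tensor: each of the S x S grid cells predicts B boxes of the form (x, y, w, h, confidence) plus C class probabilities. The values S=7, B=2, C=20 follow the original VOC setup, but the decoding itself is a simplified illustration, not the released implementation.

```python
import torch

S, B, C = 7, 2, 20
raw = torch.randn(S * S * (B * 5 + C))       # stand-in for the network's final output
pred = raw.view(S, S, B * 5 + C)

boxes = pred[..., :B * 5].view(S, S, B, 5)   # per cell: B boxes of (x, y, w, h, conf)
class_probs = pred[..., B * 5:]              # per cell: C conditional class probabilities
# Class-specific confidence of each box = box confidence * class probability.
scores = boxes[..., 4:5] * class_probs.unsqueeze(2)   # shape (S, S, B, C)
```

Because all boxes come from one forward pass over the full image, there is no separate proposal stage to wait for, which is where YOLO's speed advantage comes from.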
It must be noted that although YOLO greatly improves detection speed over two-stage detectors, it sacrifices localization accuracy, especially for small objects. YOLO's later versions, and the SSD proposed after it, paid more attention to this problem.
(2) Single Shot MultiBox Detector (SSD)
SSD was proposed by W. Liu et al. in 2015 [23]. It is the second single-stage detector of the deep learning era. The main contribution of SSD is the introduction of multi-reference and multi-resolution detection techniques, which significantly improve the detection accuracy of single-stage detectors, especially for small objects. SSD has advantages in both detection speed and accuracy (VOC07 mAP=76.8%, VOC12 mAP=74.9%, COCO mAP@.5=46.5%, mAP@[.5,.95]=26.8%, with a fast version running at 59 fps). The main difference between SSD and earlier detectors is that the former detects objects of different scales on different layers of the network, while the latter run detection only on their top layer.
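The sketch below illustrates SSD's multi-resolution idea in PyTorch: a small 3x3 predictor is attached to several feature maps of decreasing resolution, so the large, shallow maps handle small objects while the small, deep maps handle large ones. The channel and anchor counts here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MultiScaleHeads(nn.Module):
    """One detection head per selected feature map, as in SSD-style detectors."""

    def __init__(self, channels=(512, 1024, 512, 256), num_anchors=4, num_classes=21):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Conv2d(c, num_anchors * (num_classes + 4), kernel_size=3, padding=1)
             for c in channels])

    def forward(self, feature_maps):
        # Each head predicts (num_classes + 4 box offsets) per anchor per location.
        return [head(f) for head, f in zip(self.heads, feature_maps)]
```

The per-map anchors (the "multi-reference" part) are matched to ground truth at training time, so each resolution specializes in the object sizes its receptive field covers.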
(3) RetinaNet
Single-stage detectors have the advantages of high speed and simple structure, but for years their accuracy lagged behind that of two-stage detectors. T.-Y. Lin et al. identified the reason behind this and proposed RetinaNet in 2017 [24]. In their view, the accuracy gap is caused by the extreme foreground-background class imbalance encountered during the training of dense detectors. To address this, they introduced a new loss function in RetinaNet called "focal loss", which reshapes the standard cross-entropy loss so that the detector puts more focus on hard, misclassified examples during training. Focal loss enables single-stage detectors to achieve accuracy comparable to two-stage detectors while maintaining a very high detection speed (COCO mAP@.5=59.1%, mAP@[.5,.95]=39.1%).
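Focal loss reshapes the standard cross entropy as FL(p_t) = -α_t (1 - p_t)^γ log(p_t), so that the many easy background examples are down-weighted and the rare hard ones dominate the gradient. A minimal PyTorch sketch is given below; it follows the formula in [24], but the function itself is our own illustrative code.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss.

    logits  -- raw predictions, shape (N,)
    targets -- binary labels as floats in {0.0, 1.0}, shape (N,)
    """
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)             # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # (1 - p_t) ** gamma is near 0 for easy examples and near 1 for hard ones.
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```

With γ=0 and α=0.5 this reduces to a scaled ordinary cross entropy; γ=2 is the paper's default and is what lets a dense detector train on all anchors without the background overwhelming the loss.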
References:
[1] Zhengxia Zou, Zhenwei Shi, Yuhong Guo, and Jieping Ye, "Object detection in 20 years: A survey," arXiv:1905.05055, 2019.
[2] Xiongwei Wu, Doyen Sahoo, and Steven C. H. Hoi, "Recent advances in deep learning for object detection," arXiv:1908.03673, 2019.
[3] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in CVPR, 2016.
[4] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in CVPR, 2014.
[5] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," in ICCV, 2017.
[6] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "Semantic image segmentation with deep convolutional nets and fully connected CRFs," arXiv:1412.7062, 2014.
[7] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, p. 436, 2015.
[8] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in CVPR, vol. 1, 2001, pp. I–I.
[9] P. Viola and M. J. Jones, "Robust real-time face detection," International Journal of Computer Vision, vol. 57, no. 2, pp. 137–154, 2004.
[10] C. Papageorgiou and T. Poggio, "A trainable system for object detection," International Journal of Computer Vision, vol. 38, no. 1, pp. 15–33, 2000.
[11] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in CVPR, vol. 1, 2005, pp. 886–893.
[12] P. Felzenszwalb, D. McAllester, and D. Ramanan, "A discriminatively trained, multiscale, deformable part model," in CVPR, 2008, pp. 1–8.
[13] P. F. Felzenszwalb, R. B. Girshick, and D. McAllester, "Cascade object detection with deformable part models," in CVPR, 2010, pp. 2241–2248.
[14] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, "Object detection with discriminatively trained part-based models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1627–1645, 2010.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[16] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-based convolutional networks for accurate object detection and segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2016.
[17] K. E. van de Sande, J. R. Uijlings, T. Gevers, and A. W. Smeulders, "Segmentation as selective search for object recognition," in ICCV, 2011, pp. 1879–1886.
[18] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," in ECCV, 2014, pp. 346–361.
[19] R. Girshick, "Fast R-CNN," in ICCV, 2015, pp. 1440–1448.
[20] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," in Advances in Neural Information Processing Systems, 2015, pp. 91–99.
[21] T.-Y. Lin, P. Dollar, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie, "Feature pyramid networks for object detection," in CVPR, 2017.
[22] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in CVPR, 2016, pp. 779–788.
[23] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in ECCV, 2016, pp. 21–37.
[24] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, "Focal loss for dense object detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
