Original paper title)
Rich feature hierarchies for accurate object detection and semantic segmentation
This post is a personal study record of the paper.
It summarizes the main points only, so please note that it may not cover
every detail of the paper.
[Training]
Supervised pre-training)
In this paper, the CNN is pre-trained on a large auxiliary dataset (ILSVRC 2012)
using image-level annotations only (i.e., no bounding-box labels).
Pre-training was performed with the open-source Caffe CNN library.
Domain-specific fine-tuning)
To adapt the CNN to the new task (detection) and the new domain, SGD training of the CNN parameters
is continued using only warped region proposals from VOC.
All region proposals with an IoU overlap of at least 0.5 (>= 0.5) with a ground-truth box
are treated as positives for that box's class, and the rest as negatives.
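The labeling rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `iou` and `label_for_finetuning`, the `(x1, y1, x2, y2)` box format, the use of class id 0 for background, and matching each proposal to its highest-overlap ground-truth box are all assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_for_finetuning(proposal, gt_boxes, gt_classes, threshold=0.5, background=0):
    """Return the class of the best-overlapping ground-truth box if its IoU
    reaches the threshold, otherwise the (assumed) background id."""
    best_iou, best_class = 0.0, background
    for box, cls in zip(gt_boxes, gt_classes):
        overlap = iou(proposal, box)
        if overlap > best_iou:
            best_iou, best_class = overlap, cls
    return best_class if best_iou >= threshold else background
```

For example, with one ground-truth box `(0, 0, 10, 10)` of class 7, the proposal `(0, 0, 10, 8)` has IoU 0.8 and is labeled 7, while `(5, 0, 15, 10)` has IoU 1/3 and is labeled as background.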
Object category classifiers)
<Example : Detection of cars>
A background region should be a negative example.
(Since it contains no part of a car)
Q : However, how should we label a region
that partially overlaps a car?
A : Use an IoU overlap threshold: regions whose overlap is below the threshold are defined as negatives.
Choosing this threshold carefully is important, because it noticeably affects mAP.
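A minimal sketch of this labeling rule for the per-class SVMs follows. The function name `svm_label`, the +1 / -1 / `None` return convention, and the idea of passing in a precomputed maximum IoU are assumptions for illustration. The default threshold of 0.3 and the choice to use only ground-truth boxes as positives while ignoring the in-between proposals follow what the R-CNN paper reports, not anything stated in this summary.

```python
def svm_label(max_iou_with_class, is_ground_truth, neg_threshold=0.3):
    """Label one region for a single class's SVM.

    max_iou_with_class: highest IoU between the region and any ground-truth
                        box of this class (0.0 if the class is absent).
    Returns +1 (positive), -1 (negative) or None (left out of training).
    """
    if is_ground_truth:
        return 1          # ground-truth boxes are the positives
    if max_iou_with_class < neg_threshold:
        return -1         # little or no overlap: negative
    return None           # partial overlap above the threshold: ignored
```

With this rule a pure background proposal (`max_iou_with_class=0.0`) becomes a negative, while a proposal covering most of a car but not equal to its ground-truth box is simply not used for SVM training.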
Once features are extracted and training labels are applied, one linear SVM is optimized per class.
Since the training data is too large to fit in memory, the standard
hard negative mining method is adopted.
Hard negative mining converges quickly, and in practice mAP stops increasing
after only a single pass over all images.
Q : Then, what is the hard negative mining method?
A : A hard negative is a negative example that is hard to recognize as negative, so the model tends to misclassify it as positive.
If we mine these tricky examples and add them to the training set, the number of false positive errors decreases.
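The idea can be sketched as a small loop in plain Python. This is a toy illustration under stated assumptions, not the procedure used in the paper: `train_linear_svm` is a simple hinge-loss subgradient trainer written for this example, and the names `mine_hard_negatives`, `n_initial`, `max_rounds`, and the rule "a negative is hard if its score is above -1" are all hypothetical choices.

```python
import random


def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Toy hinge-loss linear SVM fitted by subgradient descent."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violated = y * score(w, b, x) < 1.0
            w = [wi * (1.0 - lr * reg) for wi in w]  # L2 shrinkage
            if violated:  # only margin violators move the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def mine_hard_negatives(positives, negative_pool, n_initial=2, max_rounds=5, seed=0):
    """Train on a small negative cache, then repeatedly add the negatives
    the current model still scores too high, and retrain."""
    pool = list(negative_pool)
    random.Random(seed).shuffle(pool)
    cache, remaining = pool[:n_initial], pool[n_initial:]

    def fit(negatives):
        samples = list(positives) + negatives
        labels = [1] * len(positives) + [-1] * len(negatives)
        return train_linear_svm(samples, labels)

    w, b = fit(cache)
    for _ in range(max_rounds):
        # Hard negatives: not yet pushed beyond the negative margin.
        hard = [x for x in remaining if score(w, b, x) > -1.0]
        if not hard:
            break
        remaining = [x for x in remaining if score(w, b, x) <= -1.0]
        cache = cache + hard
        w, b = fit(cache)
    return w, b, cache
```

The point of the loop is that the full negative pool never has to be in the training set at once: easy negatives that the model already scores well below the margin stay out of the cache, and only the confusing ones are added.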
[Conclusion]
This paper proposed an object detection algorithm named "R-CNN", which achieved a 30% relative improvement
over the best previous results on PASCAL VOC 2012.
Two insights made this possible.
First, applying high-capacity CNNs to bottom-up region proposals in order to
localize and segment objects.
Second, a paradigm for training large CNNs when labeled training data is scarce
-> The network is pre-trained with supervision on an auxiliary task with abundant data, then fine-tuned on the target task where data is scarce.
The paper conjectures that this supervised pre-training / domain-specific fine-tuning paradigm will be highly effective
for a variety of data-scarce vision problems.
The paper is also significant in that it achieved these results by combining classical computer vision tools with CNNs.
This concludes the R-CNN summary post.
Thank you as always ^^