Original paper title)
Rich feature hierarchies for accurate object detection and semantic segmentation
This post is a personal study record of the paper.
It summarizes the main points overall, so please note that
it may not cover every detail of the paper.
[Abstract]
This paper proposes a simple and scalable detection algorithm called "R-CNN".
R-CNN combines region proposals with a CNN.
(Regions with CNN features)
R-CNN improves mAP by more than 30% relative to
the previous best result on VOC 2012.
[Introduction]
This paper shows that
a CNN can achieve much higher object detection
performance on the PASCAL VOC dataset
than systems based on simpler HOG-like features.
Getting this result required solving 2 problems.
1. localizing objects with a deep network
2. training a high-capacity model with scarce labeled data.
Problem-1 Localization)
CNNs have been used as "sliding-window detectors" for at least two decades.
In order to maintain high spatial resolution, those CNNs typically have only 2 conv and pooling layers.
The authors of this paper also considered the sliding-window method.
However, units high up in their network (5 conv layers) have large receptive fields (195x195 pixels) and strides (32x32 pixels) in the input image,
so accurate localization would be challenging!
So, they adopted the "recognition using regions" method as a solution to the CNN localization problem.
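The 195x195 receptive field and 32x32 stride quoted above can be reproduced with the standard layer-by-layer recurrence. The sketch below is only an illustration: the layer list is an assumed AlexNet-style layout (kernel sizes and strides are not given in this post), and the names ALEXNET_LIKE_LAYERS and receptive_field are made up for this example.

```python
# Assumed AlexNet-style stack: (layer name, kernel size, stride).
# These values are an assumption for illustration, not taken from this post.
ALEXNET_LIKE_LAYERS = [
    ("conv1", 11, 4),
    ("pool1", 3, 2),
    ("conv2", 5, 1),
    ("pool2", 3, 2),
    ("conv3", 3, 1),
    ("conv4", 3, 1),
    ("conv5", 3, 1),
    ("pool5", 3, 2),
]


def receptive_field(layers):
    """Return (receptive field, effective stride) of one output unit, in input pixels."""
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between neighbouring units of the current layer
    for _name, kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap widens the view by one jump
        jump *= stride             # strides multiply through the stack
    return rf, jump


print(receptive_field(ALEXNET_LIKE_LAYERS))  # (195, 32)
```

With a 32-pixel stride, neighbouring units at the top of the network look at windows that are 32 pixels apart, which is why a plain sliding-window readout would be coarse.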
Overall flow)
1. The method generates around 2000 category-independent (class-agnostic)
region proposals for the input image.
(By applying the Selective Search algorithm)
2. It extracts a fixed-length feature vector from each proposal using a CNN.
3. Category-specific linear SVMs classify each region.
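The three steps above can be sketched as a data-flow skeleton. This is not the authors' implementation: propose_regions and extract_features are hypothetical stand-ins that return random values instead of running Selective Search or a CNN, and the 4096-dimensional feature length is the value reported in the paper, not something stated in this post. The point is only to show what goes in and out of each stage.

```python
import random

FEATURE_DIM = 4096    # per-region feature length reported in the paper
NUM_PROPOSALS = 2000  # approximate number of proposals per image


def propose_regions(width, height, n=NUM_PROPOSALS, seed=0):
    """Hypothetical stand-in for Selective Search: n random (x1, y1, x2, y2) boxes."""
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        x1 = rng.randrange(0, width - 1)
        y1 = rng.randrange(0, height - 1)
        x2 = rng.randrange(x1 + 1, width)
        y2 = rng.randrange(y1 + 1, height)
        boxes.append((x1, y1, x2, y2))
    return boxes


def extract_features(box, dim=FEATURE_DIM):
    """Hypothetical stand-in for the CNN: a fixed-length vector for any box shape."""
    rng = random.Random(hash(box))
    return [rng.random() for _ in range(dim)]


def svm_scores(feature, svms):
    """Score one feature vector with one linear SVM (weights, bias) per category."""
    return {
        name: sum(w * f for w, f in zip(weights, feature)) + bias
        for name, (weights, bias) in svms.items()
    }


def detect(width, height, svms, n=NUM_PROPOSALS):
    """Run steps 1 to 3 and return a list of (box, {category: score})."""
    results = []
    for box in propose_regions(width, height, n):  # step 1: region proposals
        feature = extract_features(box)            # step 2: fixed-length feature
        results.append((box, svm_scores(feature, svms)))  # step 3: per-category SVMs
    return results
```

Calling detect(64, 48, svms, n=5) with a dict such as {"cat": (weights, bias)} returns five boxes, each with one score per category. Choosing the final detections from these scores is outside this sketch.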
During step 2 above,
"Affine Image Warping (Resizing)" is applied,
because the CNN requires a fixed input size for every region proposal
regardless of the region's shape.
So the warping operation distorts the region (its aspect ratio changes), and some image information is lost.
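A minimal sketch of the warping step, assuming nearest-neighbour sampling and a 227x227 target size (the input size the paper reports for its network; this post does not state it). The function name warp_region is made up for this example, and the extra context padding around the box that the paper describes is left out.

```python
def warp_region(image, box, out_size=227):
    """Crop box = (x1, y1, x2, y2) from image and stretch it to out_size x out_size.

    image is a list of rows, each row a list of pixels; x2 and y2 are exclusive.
    Width and height are scaled independently, so the aspect ratio is not preserved.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    warped = []
    for oy in range(out_size):
        src_row = image[y1 + oy * h // out_size]  # nearest source row for this output row
        warped.append(
            [src_row[x1 + ox * w // out_size] for ox in range(out_size)]
        )
    return warped


# Usage: an 80x20 region (4:1) becomes a square 227x227 input.
image = [[(x, y) for x in range(100)] for y in range(40)]
patch = warp_region(image, (10, 5, 90, 25))
print(len(patch), len(patch[0]))  # 227 227
```

Because an 80x20 box and a 20x80 box both end up as 227x227, the network sees stretched or squashed content, which is the distortion mentioned above.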
Problem-2 Scarce data)
Reference URL : eehoeskrap.tistory.com/186
#Background information#
Pre-training
-> To initialize the weights and biases of a multi-layer perceptron well before the main training.
Fine-tuning (slightly adjusting the parameters of an existing model)
-> Give a custom dataset as the training set to a pre-trained model (e.g. VGG16, ResNet, etc.)
and update its weights.
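The conventional solution to this problem (too little data to train a large CNN)
is "unsupervised pre-training" followed by supervised fine-tuning.
This paper instead uses supervised pre-training on a large auxiliary dataset (ILSVRC), followed by domain-specific fine-tuning on the small dataset (PASCAL).
Fine-tuning for detection improves mAP by 8 percentage points.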
For the rest of the content,
please refer to the R-CNN Summary (2) post.
I plan to upload it soon.
Thank you, as always.