This paper was written by Ross Girshick (Microsoft Research).
Notes and a summary of the paper's main text (this is a personal study post).
[2. Fast R-CNN architecture and training]
Fast R-CNN Procedure)
1. Input image -> conv & max pooling layers -> create a feature map
2. From the feature map of step 1 -> for each ROI, extract a fixed-length feature vector
3. Feature vector -> fully connected (FC) layers -> split into 2 outputs
4. First output : ROI classification (softmax over the K object classes plus a background class)
Second output : bounding box regression
(4 real-valued numbers for each of the K object classes)
==> Adjusts the position of the bounding box
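To make the shapes of the two outputs in step 4 concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names (`linear`, `softmax`, `random_layer`, `sibling_heads`) are hypothetical, and random weights stand in for trained parameters. It only shows that one feature vector yields K+1 class probabilities and K sets of 4 box values.

```python
import math
import random


def linear(x, weights, biases):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Random weights standing in for trained parameters (illustration only)."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def sibling_heads(feature, num_classes, rng):
    """Map one ROI feature vector to (class probabilities, per-class box values)."""
    cls_w, cls_b = random_layer(num_classes + 1, len(feature), rng)  # K classes + background
    box_w, box_b = random_layer(4 * num_classes, len(feature), rng)  # 4 numbers per class
    class_probs = softmax(linear(feature, cls_w, cls_b))
    flat = linear(feature, box_w, box_b)
    box_values = [flat[4 * k:4 * k + 4] for k in range(num_classes)]
    return class_probs, box_values


rng = random.Random(0)
feature = [rng.random() for _ in range(16)]  # assumed 16-dim feature, for brevity
probs, boxes = sibling_heads(feature, num_classes=20, rng=rng)
print(len(probs), len(boxes), len(boxes[0]))
```

With K = 20, the classification output has 21 entries and the regression output has 20 groups of 4 numbers.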
[2.1 The ROI pooling layer]
Main Concept)
=> This layer uses max pooling to convert the features inside any valid ROI into
a small feature map with a fixed spatial extent of H x W
(H & W are layer hyper-parameters, independent of any particular ROI)
Structure of ROI)
=> (r,c,h,w)
(r,c) : top-left corner coordinates
(h,w) : height and width values
ROI pooling Operation)
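h & w : height and width of the ROI window
H & W : height and width of the output grid (the number of sub-windows along each axis)
1. Divide the h x w ROI window into an H x W grid of sub-windows, each of approximate size h/H x w/W
2. Apply max pooling in each sub-window and write the result to the corresponding output grid cell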
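The two steps above can be sketched for a single-channel feature map stored as a list of lists. This is only an illustration under assumptions: `roi_max_pool` is a hypothetical name, the ROI is assumed to lie fully inside the feature map, and the floor/ceil rounding of sub-window boundaries is one common choice that these notes do not specify.

```python
import math


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the ROI (r, c, h, w) of a 2-D feature map into an out_h x out_w grid."""
    r, c, h, w = roi
    pooled = []
    for i in range(out_h):
        # Rows covered by the i-th sub-window (approximate height h / out_h).
        row_start = r + math.floor(i * h / out_h)
        row_end = r + math.ceil((i + 1) * h / out_h)
        out_row = []
        for j in range(out_w):
            # Columns covered by the j-th sub-window (approximate width w / out_w).
            col_start = c + math.floor(j * w / out_w)
            col_end = c + math.ceil((j + 1) * w / out_w)
            out_row.append(max(
                feature_map[y][x]
                for y in range(row_start, row_end)
                for x in range(col_start, col_end)
            ))
        pooled.append(out_row)
    return pooled


# 4 x 4 feature map holding the values 0..15 in row-major order.
feature_map = [[4 * y + x for x in range(4)] for y in range(4)]
print(roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2))  # [[5, 7], [13, 15]]
print(roi_max_pool(feature_map, (1, 1, 3, 3), 2, 2))  # [[10, 11], [14, 15]]
```

In the second call h/H = 1.5 is not an integer, so neighbouring sub-windows overlap by one row or column. This is why the grid size is only "approximate".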
Figure source: www.researchgate.net/figure/Illustration-of-the-RoI-pooling-operation_fig4_333521857
[2.2 Initializing from pre-trained networks]
The experiments use 3 pre-trained ImageNet networks.
Three transformations are applied when a pre-trained network initializes Fast R-CNN)
1. The last max pooling layer -> replaced by an ROI pooling layer
2. Last FC layer & softmax -> replaced by two sibling layers
(softmax classification + bounding box regression)
3. Network inputs changed -> 2 data inputs
(a list of images + a list of ROIs in those images)
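As a toy illustration of the three transformations, the sketch below treats a network as a plain list of layer names. Everything here is hypothetical: the layer names, the `TOY_PRETRAINED` structure, and the `adapt_for_detection` helper are made up for this note and do not correspond to any real framework. The 7 x 7 pooling size is only an example value.

```python
# Hypothetical description of a pre-trained classification network.
TOY_PRETRAINED = {
    "inputs": ["images"],
    "layers": ["conv", "max_pool", "conv", "max_pool",
               "fc", "fc", "fc_1000way", "softmax"],
}


def adapt_for_detection(pretrained, pool_h, pool_w):
    """Apply the three transformations to a toy network description."""
    layers = list(pretrained["layers"])

    # 1. Replace the last max pooling layer with an ROI pooling layer.
    last_pool = max(i for i, name in enumerate(layers) if name == "max_pool")
    layers[last_pool] = f"roi_pool({pool_h}x{pool_w})"

    # 2. Drop the final FC layer and softmax; two sibling layers take their place.
    layers = layers[:-2]
    heads = ["fc_cls + softmax", "fc_bbox_regression"]

    # 3. The network now takes two inputs: images and the ROIs in those images.
    return {"inputs": ["images", "rois"], "layers": layers, "heads": heads}


detector = adapt_for_detection(TOY_PRETRAINED, 7, 7)
print(detector["layers"])
print(detector["heads"])
```

The earlier layers are left untouched, so their pre-trained weights can be reused as the starting point for fine-tuning.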
[2.3 Fine-tuning for detection]
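SPP net is unable to update the weights below its spatial pyramid pooling layer. Why?
Root cause : back-propagation through the SPP layer is highly inefficient when each training sample (ROI) comes from a different image.
It is inefficient because the training inputs are large.
-> This is because an ROI's receptive field can be very large, sometimes covering the entire image.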
But, Fast R-CNN training is different!
Fast R-CNN training Strategy)
Sample N images hierarchically, then sample R/N ROIs from each image.
So multiple ROIs are extracted from a single image, and they share computation & memory.
Therefore, a small number of images (N) is preferred!
(A smaller N decreases mini-batch computation!)
For example, with 2 images (N=2) and 128 ROIs (R=128),
training is roughly 64x faster than sampling 1 ROI from each of 128 different images (N=128).
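The sampling strategy can be sketched as follows. This is a minimal illustration under assumptions: ROIs are stored in a dict keyed by image id, `sample_minibatch` and `images_to_process` are hypothetical names, and ROIs are drawn uniformly at random, which is a simplification. Counting the distinct images in a mini-batch shows where the factor of 64 comes from, assuming the cost is dominated by running the conv layers once per image.

```python
import random


def sample_minibatch(rois_by_image, num_images, num_rois, rng):
    """Pick N images first, then R/N ROIs from each of them."""
    per_image = num_rois // num_images
    image_ids = rng.sample(sorted(rois_by_image), num_images)
    batch = []
    for image_id in image_ids:
        chosen = rng.sample(rois_by_image[image_id], per_image)
        batch.extend((image_id, roi) for roi in chosen)
    return batch


def images_to_process(batch):
    """Number of distinct images whose conv feature map must be computed."""
    return len({image_id for image_id, _ in batch})


# Toy dataset: 200 images, each with 100 placeholder ROIs (r, c, h, w).
rng = random.Random(0)
dataset = {img: [(k, k, 10, 10) for k in range(100)] for img in range(200)}

shared = sample_minibatch(dataset, num_images=2, num_rois=128, rng=rng)
spread = sample_minibatch(dataset, num_images=128, num_rois=128, rng=rng)
print(len(shared), images_to_process(shared))  # 128 2
print(len(spread), images_to_process(spread))  # 128 128
print(images_to_process(spread) // images_to_process(shared))  # 64
```

Both mini-batches contain 128 ROIs, but the first one needs only 2 image-level forward passes instead of 128.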
Closing summary)
Fast R-CNN uses a streamlined training process with one fine-tuning stage that jointly optimizes
a softmax classifier and bounding-box regressors.
I plan to cover the rest of the paper in follow-up posts.
Thank you for reading.