Fast R-CNN Summary (2)

The paper was written by Ross Girshick (Microsoft Research).

Notes and a summary of the paper's main text (this is a personal study post)

[2. Fast R-CNN architecture and training]

Fast R-CNN Procedure)

1. Input image -> conv & max pooling layers -> create a feature map

2. From the feature map of step 1 -> for each ROI, extract a fixed-length feature vector

3. Feature vector -> fully connected (FC) layers -> split into 2 outputs

4. First output : ROI classification (softmax over the K object classes plus background)

Second output : bounding-box regression

(4 real-valued numbers for each of the K object classes)

==> These numbers refine the position of the bounding box
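The two outputs of step 4 can be sketched in plain Python. This is only an illustration of the output shapes, not the paper's implementation: the helper names (`linear`, `make_layer`, `detection_heads`), the sizes `K` and `FEATURE_DIM`, and the random weights are all assumptions made up for the example.

```python
import math
import random

def linear(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_layer(n_out, n_in, rng):
    """Random weights stand in for trained parameters (illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def detection_heads(feature, cls_layer, box_layer, num_classes):
    """Map one ROI feature vector to class probabilities and per-class boxes."""
    probs = softmax(linear(feature, *cls_layer))  # K + 1 values (background included)
    flat = linear(feature, *box_layer)            # 4 * K values
    boxes = [flat[4 * k: 4 * k + 4] for k in range(num_classes)]
    return probs, boxes

rng = random.Random(0)
K, FEATURE_DIM = 3, 8  # hypothetical sizes, chosen only for the example
feature = [rng.random() for _ in range(FEATURE_DIM)]
cls_layer = make_layer(K + 1, FEATURE_DIM, rng)
box_layer = make_layer(4 * K, FEATURE_DIM, rng)
probs, boxes = detection_heads(feature, cls_layer, box_layer, K)
print(len(probs), len(boxes), len(boxes[0]))  # prints: 4 3 4
```

Each ROI therefore yields K + 1 class probabilities and one 4-number box refinement per object class.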

[2.1 The ROI pooling layer]

Main Concept)

=> This layer uses max pooling to convert the features inside an ROI into

a small feature map with a fixed spatial extent of H x W

(H & W are layer hyper-parameters, independent of any particular ROI)

Structure of ROI)

=> (r,c,h,w)

(r,c) : top-left corner coordinates

(h,w) : height and width values

ROI pooling Operation)

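h & w : height and width of the ROI window

H & W : height and width of the output grid (the number of sub-windows along each axis)

1. Divide the h x w ROI window into an H x W grid of sub-windows, each of approximate size h/H x w/W

2. Apply max pooling within each sub-window to produce the corresponding output cell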
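The two steps above can be sketched for a single-channel feature map as follows. This is a minimal sketch under assumptions: the function name `roi_max_pool` is made up, and the floor/ceil rule for sub-window boundaries is one possible way to realize the "approximate size h/H x w/W" (neighbouring sub-windows may overlap by one row or column when h or w is not divisible).

```python
import math

def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the ROI (r, c, h, w) of a 2D feature map into an out_h x out_w grid."""
    r, c, h, w = roi
    pooled = []
    for i in range(out_h):
        # Row range of sub-window i; floor/ceil keep every sub-window non-empty.
        r0 = r + math.floor(i * h / out_h)
        r1 = r + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            c0 = c + math.floor(j * w / out_w)
            c1 = c + math.ceil((j + 1) * w / out_w)
            row.append(max(feature_map[y][x]
                           for y in range(r0, r1)
                           for x in range(c0, c1)))
        pooled.append(row)
    return pooled

# A 4 x 6 feature map holding the values 1..24 in row-major order.
feature_map = [[6 * y + x + 1 for x in range(6)] for y in range(4)]

# ROI with top-left corner (r, c) = (0, 1), height 4 and width 4, pooled to 2 x 2.
print(roi_max_pool(feature_map, (0, 1, 4, 4), 2, 2))  # prints: [[9, 11], [21, 23]]
```

Whatever the ROI size, the output is always an H x W grid, which is what lets the following FC layers take a fixed-length input.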

Image source: www.researchgate.net/figure/Illustration-of-the-RoI-pooling-operation_fig4_333521857

Figure 6. Illustration of the RoI pooling operation.

[2.2 Initializing from pre-trained networks]

The experiments use three pre-trained ImageNet networks.

Three transformations are applied during initialization)

1. The last max pooling layer -> replaced by an ROI pooling layer

2. The last FC layer & softmax -> replaced by two sibling layers

(softmax classification + bounding-box regression)

3. The network input changes -> 2 data inputs

(a list of images + a list of ROIs in those images)

[2.3 Fine-tuning for detection]

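SPPnet is unable to update the weights below its spatial pyramid pooling layer. Why?

Root cause : back-propagation through the SPP layer is highly inefficient when each training sample (ROI) comes from a different image.

It is inefficient because the training inputs are large.

-> An ROI's receptive field can be very large, sometimes spanning the entire image.

Fast R-CNN training, however, is different!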

Fast R-CNN training Strategy)

Sample N images hierarchically, then sample R/N ROIs from each image.

So multiple ROIs are extracted from a single image (they share computation & memory)

Therefore, a small number of images (N) is preferable!

(A smaller N decreases the mini-batch computation!)

For example, with 2 images (N=2) and 128 ROIs (R=128),

training is roughly 64x faster than sampling 1 ROI from each of 128 different images (N=128, R=128)
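The sampling strategy and the 64x figure can be illustrated with a short sketch. The function `sample_minibatch` and the synthetic `dataset` are assumptions made up for this example, and counting one conv pass per distinct image is only a rough proxy for the training cost, not a measurement.

```python
import random

def sample_minibatch(rois_per_image, num_images, num_rois, rng):
    """Hierarchical sampling: pick N images, then R/N ROIs from each one.

    `rois_per_image` maps an image id to its list of candidate ROIs.
    """
    per_image = num_rois // num_images
    image_ids = rng.sample(sorted(rois_per_image), num_images)
    return [(image_id, roi)
            for image_id in image_ids
            for roi in rng.sample(rois_per_image[image_id], per_image)]

# Hypothetical dataset: 200 images with 300 candidate ROIs (r, c, h, w) each.
rng = random.Random(0)
dataset = {
    image_id: [(rng.randrange(100), rng.randrange(100),
                rng.randrange(1, 50), rng.randrange(1, 50))
               for _ in range(300)]
    for image_id in range(200)
}

hierarchical = sample_minibatch(dataset, num_images=2, num_rois=128, rng=rng)
one_per_image = sample_minibatch(dataset, num_images=128, num_rois=128, rng=rng)

# Each distinct image in the mini-batch needs one pass through the conv layers.
passes_hier = len({image_id for image_id, _ in hierarchical})
passes_flat = len({image_id for image_id, _ in one_per_image})
print(passes_hier, passes_flat, passes_flat // passes_hier)  # prints: 2 128 64
```

Both mini-batches contain 128 ROIs, but the hierarchical one touches only 2 images instead of 128, which is where the factor of 64 comes from.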

Image source: https://m.blog.naver.com/PostView.nhn?blogId=laonple&logNo=220776743537&proxyReferer=https:%2F%2Fwww.google.com%2F

Closing summary)

Fast R-CNN uses a streamlined training process with one fine-tuning stage that jointly optimizes

a softmax classifier and bounding-box regressors.
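As a rough sketch of what "jointly optimizes" means, the paper's Section 2.3 defines a multi-task loss for each ROI: a log loss for the true class u plus, for non-background ROIs (u >= 1), a smooth L1 loss on the 4 box numbers, weighted by lambda. The function names and example values below are assumptions for illustration; class index 0 is taken to be background.

```python
import math

def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for |x| >= 1."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def multitask_loss(probs, true_class, pred_box, target_box, lam=1.0):
    """Classification log loss plus box regression loss for one ROI."""
    cls_loss = -math.log(probs[true_class])
    if true_class == 0:
        # Background ROIs have no ground-truth box, so only the class loss counts.
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_box, target_box))
    return cls_loss + lam * loc_loss

# Hypothetical ROI: 2 classes (background + 1 object class), true class is 1.
loss = multitask_loss([0.5, 0.5], 1, [0.5, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, 0.0])
print(round(loss, 4))  # log(2) + 0.125 + 1.5
```

Because both terms are summed into one loss, a single backward pass updates the classifier and the box regressor together.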

The remaining content will be covered in follow-up posts.

Thank you for reading today.
