
Fast R-CNN Summary (2)

by Candy Lee 2021. 1. 3.

This paper was written by Ross Girshick (Microsoft Research).

 

 

 

Notes and a summary of the paper's main text (this is a personal study post)

 

 

 

[2. Fast R-CNN architecture and training]

Fast R-CNN architecture

 

Fast R-CNN Procedure)

 

1. Input image -> conv & max pooling layers -> feature map

2. For each ROI, the ROI pooling layer extracts a fixed-length feature vector from the feature map of step 1

3. Feature vector -> fully connected (FC) layers -> two sibling output branches

4.  First branch : ROI classification

    (softmax probabilities over the K object classes plus a background class)

    Second branch : bounding box regression

    (4 real-valued numbers for each of the K object classes)

    ==> These 4 numbers refine the bounding box position for that class
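The two output branches in step 4 can be sketched in plain Python. This is only a minimal illustration, not the paper's implementation: the helper names (`linear`, `softmax`, `detection_heads`), the random weights, and the sizes `D` and `K` are all assumptions made up for the example.

```python
import math
import random

def linear(x, weights, bias):
    """Apply one fully connected layer; weights holds one row per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def detection_heads(feature, cls_w, cls_b, box_w, box_b):
    """Return (class probabilities, per-class box offsets) for one ROI feature vector."""
    probs = softmax(linear(feature, cls_w, cls_b))            # K + 1 values
    flat = linear(feature, box_w, box_b)                      # 4 * K values
    boxes = [flat[i:i + 4] for i in range(0, len(flat), 4)]   # K rows of 4 numbers
    return probs, boxes

random.seed(0)
D, K = 8, 3  # illustrative sizes only (hypothetical, not from the paper)
feature = [random.gauss(0, 1) for _ in range(D)]
cls_w = [[random.gauss(0, 0.1) for _ in range(D)] for _ in range(K + 1)]
box_w = [[random.gauss(0, 0.1) for _ in range(D)] for _ in range(4 * K)]
probs, boxes = detection_heads(feature, cls_w, [0.0] * (K + 1), box_w, [0.0] * (4 * K))
print(len(probs), len(boxes), len(boxes[0]))
```

With K = 3 the classification branch yields 4 probabilities (3 classes + background) and the regression branch yields 3 rows of 4 numbers, one row per object class.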

 

 

 

 

[2.1 The ROI pooling layer]

Main Concept)

=> This layer uses max pooling to convert the features inside any valid ROI into

    a small feature map with a fixed spatial extent of H x W

    (H & W are layer hyper-parameters, independent of any particular ROI; e.g. 7 x 7)

 

Structure of ROI)

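=> Each ROI is a rectangular window in the conv feature map, defined by a four-tuple (r, c, h, w)

(r, c) : coordinates of the top-left corner

(h, w) : height and width of the window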

 

 

ROI pooling Operation)

 

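h & w : height and width of the ROI window

H & W : height and width of the output grid (the number of sub-windows along each side)

1. Divide the h x w ROI window into an H x W grid of sub-windows, each of approximate size h/H x w/W

2. Apply max pooling within each sub-window to fill the corresponding output grid cell (independently for each feature map channel)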
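The two steps above can be sketched for a single feature map channel as follows. This is a minimal sketch under assumptions: the function name `roi_max_pool` is hypothetical, and the floor/ceil rounding of sub-window boundaries is one common choice (it lets neighbouring sub-windows overlap by a row or column); the description above only says the sub-windows have approximate size h/H x w/W.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one ROI of a single-channel feature map into an out_h x out_w grid.

    feature_map: list of rows (one channel).
    roi: (r, c, h, w), with (r, c) the top-left corner and (h, w) the window size.
    """
    r, c, h, w = roi
    pooled = []
    for i in range(out_h):
        # Rows of the i-th sub-window: floor for the start, ceil for the end,
        # so every sub-window covers at least one row (assumed rounding rule).
        r0 = r + (i * h) // out_h
        r1 = r + ((i + 1) * h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            c0 = c + (j * w) // out_w
            c1 = c + ((j + 1) * w + out_w - 1) // out_w
            row.append(max(feature_map[y][x]
                           for y in range(r0, r1)
                           for x in range(c0, c1)))
        pooled.append(row)
    return pooled

# Toy 8 x 8 feature map whose value at (row, col) is row * 8 + col.
feature_map = [[row * 8 + col for col in range(8)] for row in range(8)]
pooled = roi_max_pool(feature_map, roi=(1, 2, 5, 4), out_h=2, out_w=2)
print(pooled)  # [[27, 29], [43, 45]]
```

Whatever the ROI size (here 5 x 4), the output always has the fixed size H x W (here 2 x 2), which is what lets the following FC layers accept ROIs of any shape.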

 

 

ROI pooling operation picture

 

 

Image source: www.researchgate.net/figure/Illustration-of-the-RoI-pooling-operation_fig4_333521857

 


 

 

 

 

[2.2 Initializing from pre-trained networks]

The experiments use three networks pre-trained on ImageNet.

 

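A pre-trained network undergoes three transformations when it initializes a Fast R-CNN network)

1. The last max pooling layer -> replaced by an ROI pooling layer

   (H and W are set to be compatible with the network's first FC layer)

2. The last FC layer & softmax -> replaced by two sibling layers

   (softmax classification over K+1 categories + category-specific bounding box regressors)

3. The network is modified to take two data inputs

   (a list of images + a list of ROIs in those images)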

 

 

 

 

[2.3 Fine-tuning for detection]

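SPPnet cannot update the weights below its spatial pyramid pooling layer. Why?

Root cause : back-propagation through the SPP layer is highly inefficient when each training sample (i.e. ROI) comes from a different image.

The inefficiency comes from large training inputs:

-> Each ROI can have a very large receptive field, often spanning the entire image, and the forward pass must process that whole receptive field.

But Fast R-CNN training is different!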

 

 

 

Fast R-CNN training Strategy)

Mini-batches are sampled hierarchically: first sample N images, then sample R/N ROIs from each image.

ROIs taken from the same image share computation and memory in the forward and backward passes.

Therefore, a small number of images (N) is preferable!

(For a fixed R, a smaller N reduces mini-batch computation!)

For example, with 2 images (N=2) and 128 ROIs (R=128),

training is roughly 64x faster than sampling 1 ROI from each of 128 different images (N=128, R=128)
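The sampling strategy and the 64x figure can be illustrated with a small sketch. The dataset below is entirely made up (hypothetical image ids and random ROIs), `sample_minibatch` is a name chosen for this example, and counting distinct images is only a rough proxy for training cost: it counts how many passes through the shared conv layers one mini-batch needs.

```python
import random

def sample_minibatch(rois_per_image, num_images, batch_rois, rng):
    """Pick num_images images, then batch_rois // num_images ROIs from each.

    rois_per_image: dict mapping an image id to its list of candidate ROIs.
    Returns a list of (image_id, roi) pairs.
    """
    per_image = batch_rois // num_images
    image_ids = rng.sample(sorted(rois_per_image), num_images)
    batch = []
    for image_id in image_ids:
        for roi in rng.sample(rois_per_image[image_id], per_image):
            batch.append((image_id, roi))
    return batch

rng = random.Random(0)
# Hypothetical dataset: 200 images with 300 candidate ROIs (r, c, h, w) each.
dataset = {
    f"img_{i}": [(rng.randrange(50), rng.randrange(50), 10, 10) for _ in range(300)]
    for i in range(200)
}

hierarchical = sample_minibatch(dataset, num_images=2, batch_rois=128, rng=rng)
one_per_image = sample_minibatch(dataset, num_images=128, batch_rois=128, rng=rng)

# Each distinct image in a mini-batch needs one pass through the shared conv layers.
passes_hier = len({image_id for image_id, _ in hierarchical})
passes_flat = len({image_id for image_id, _ in one_per_image})
print(passes_hier, passes_flat, passes_flat // passes_hier)  # 2 128 64
```

Both mini-batches contain 128 ROIs, but the hierarchical one touches only 2 images instead of 128, which is where the roughly 64x saving comes from.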

 

 

Image source: https://m.blog.naver.com/PostView.nhn?blogId=laonple&logNo=220776743537&proxyReferer=https:%2F%2Fwww.google.com%2F

 

 

Final summary)

Fast R-CNN uses a streamlined training process with one fine-tuning stage that jointly optimizes

a softmax classifier and bounding box regressors.
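As a rough illustration of what "jointly optimizes" means, the paper defines a multi-task loss for each ROI: a log loss for classification plus a smooth L1 loss for box regression, where the regression term is skipped for background ROIs. The sketch below follows that definition, but the function names, the argument layout (class 0 as background, box rows indexed by class 1..K), and the sample numbers are assumptions made for this example.

```python
import math

def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for |x| >= 1."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def multitask_loss(probs, true_class, pred_boxes, target_box, lam=1.0):
    """Classification log loss plus box regression loss for one ROI.

    probs: K + 1 softmax probabilities, index 0 is background (assumed layout).
    pred_boxes: K rows of 4 predicted offsets, row k - 1 belongs to class k.
    target_box: the 4 regression targets for the true class.
    """
    loss_cls = -math.log(probs[true_class])
    if true_class == 0:
        # Background ROIs have no ground-truth box, so only the class loss counts.
        return loss_cls
    pred = pred_boxes[true_class - 1]
    loss_loc = sum(smooth_l1(p - t) for p, t in zip(pred, target_box))
    return loss_cls + lam * loss_loc

probs = [0.1, 0.7, 0.2]                        # made-up probabilities, K = 2
pred_boxes = [[0.5, 0.0, 0.0, 2.0], [0.0] * 4]
loss = multitask_loss(probs, 1, pred_boxes, [0.0, 0.0, 0.0, 0.0])
print(round(loss, 4))
```

Because both terms are summed into one value, a single backward pass updates the classifier, the regressors, and the shared layers together in the same fine-tuning stage.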

 

 

 

The rest of the paper will be covered in follow-up posts.

 

Thank you for reading, as always.
