
Fast R-CNN Summary (3)

by Candy Lee 2021. 1. 4.

This paper was written by Ross Girshick (Microsoft Research).

 

 

 

Notes on the body of the paper (this is a personal study post).

 

Picking up from Fine-tuning for detection, the last topic of the previous post, Fast R-CNN Summary (2).

Let's continue from there.

 

 

[2.3 Fine-tuning for detection] - continued from the previous post..

SPPnet is unable to update the weights below the spatial pyramid pooling layer... Then why??

 

Root cause: back-propagation through the SPP net is highly inefficient!

 

 

Not efficient = inefficient! (because the training inputs are large)

-> an RoI's receptive field can sometimes span the entire input image, so each forward pass must process almost the whole image.

 

But Fast R-CNN training is different!

 

 

 

Fast R-CNN training strategy)

Sample N images hierarchically & sample R/N RoIs from each image.

So, from a single image, multiple RoIs are extracted (they share computation & memory).

 

Therefore, a small number of images (N) is needed!

(A smaller N decreases mini-batch computation!)

 

For example, with 2 images (N = 2) and 128 RoIs (R = 128),

training is roughly 64x faster than sampling only 1 RoI each from 128 different images (N = 128).
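A minimal sketch of this hierarchical sampling, assuming a dataset of dicts that each carry a "proposals" list (the helper and field names are mine, not from the paper):

import random

def make_minibatch(dataset, N=2, R=128):
    """Hierarchical sampling: first pick N images, then R/N RoIs per image,
    so RoIs from the same image share one forward/backward pass."""
    images = random.sample(dataset, N)      # level 1: sample N images
    rois_per_image = R // N                 # level 2: R/N RoIs from each
    batch = []
    for img in images:
        rois = random.sample(img["proposals"], rois_per_image)
        batch.append((img, rois))           # 64 RoIs reuse img's conv features
    return batch

With N = 2, the conv features of each image are computed once and shared by 64 RoIs, instead of running 128 separate full-image passes (one per RoI); that is where the roughly 64x figure comes from.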

 

 

Image source: https://m.blog.naver.com/PostView.nhn?blogId=laonple&logNo=220776743537&proxyReferer=https:%2F%2Fwww.google.com%2F

 

 

Fast R-CNN uses a streamlined training process with one fine-tuning stage that jointly optimizes

a softmax classifier and bounding-box regressors.

 

This procedure has several components:

(the loss, mini-batch sampling strategy, back-propagation through RoI pooling, and SGD hyper-parameters)

 

 

 

 

[Multi-task Loss]

Fast R-CNN has two sibling output layers.

 

The two output layers of Fast R-CNN

 

First output)

A discrete probability distribution over K+1 classes (computed by a softmax), per RoI.

 

Second output)

bbox regression offsets for each of the K object classes (a 4-tuple t^k = (t^k_x, t^k_y, t^k_w, t^k_h) per class).

 

Format of the bbox regression offsets)

The second layer outputs an adjustment of an object proposal

(a scale-invariant translation & a log-space height/width shift).
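For concreteness, a small sketch of this offset parameterization (inherited from R-CNN); the function name and the center/width/height box format are my own choices:

import math

def bbox_offsets(proposal, gt):
    """Scale-invariant translation plus log-space width/height shift.
    Boxes are (center_x, center_y, width, height)."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    tx = (gx - px) / pw     # translation normalized by proposal size
    ty = (gy - py) / ph     #   -> invariant to the proposal's scale
    tw = math.log(gw / pw)  # log-space size shift
    th = math.log(gh / ph)
    return tx, ty, tw, th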

 

 

Formula Definition) 

 

1. u : the ground-truth class; each RoI is labeled with this "u"

2. v : the ground-truth bbox regression target (the ideal position of the bbox)

This paper uses a multi-task loss L on each labeled RoI to jointly train for

classification & bbox regression.

 

 

 

Equation Summary)

L(p, u, t^u, v) = L_cls(p, u) + λ[u >= 1] L_loc(t^u, v),  where L_cls(p, u) = -log p_u

Iverson bracket [u >= 1])

Catch-all background class -> u = 0 -> L_loc is ignored (there is no ground-truth box for background RoIs).
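A minimal NumPy sketch of this loss with lambda = 1 (the paper's setting); the function names are mine:

import numpy as np

def smooth_l1(x):
    """Robust L1 loss: quadratic near zero, linear outside."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

def multi_task_loss(p, u, t_u, v, lam=1.0):
    """L = L_cls + lam * [u >= 1] * L_loc.
    p: (K+1,) softmax probabilities, u: true class,
    t_u: predicted offsets for class u, v: regression targets."""
    l_cls = -np.log(p[u])                               # L_cls = -log p_u
    l_loc = smooth_l1(np.asarray(t_u) - np.asarray(v)).sum()
    return l_cls + (lam * l_loc if u >= 1 else 0.0)     # u = 0: box loss off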

 

 

 

 

 

[Mini-batch Sampling]

N = 2 (number of images)

R = 128 (number of RoIs)

 

With those two settings, this paper

takes 25% of the RoIs from object proposals that have IoU of at least 0.5 with a ground-truth box

& labels them with a foreground object class (u >= 1, i.e. not the background u = 0).

The remaining RoIs are sampled from object proposals as background (label u = 0)

(IoU range: [0.1, 0.5)).
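A rough sketch of this sampling rule, assuming each proposal's max IoU with the ground truth has been precomputed (names and shapes are my assumptions):

import numpy as np

def sample_rois(max_ious, proposal_classes, R=128, fg_frac=0.25):
    """Pick 25% foreground (IoU >= 0.5) and fill the rest with
    background proposals whose IoU lies in [0.1, 0.5)."""
    fg = np.where(max_ious >= 0.5)[0]
    bg = np.where((max_ious >= 0.1) & (max_ious < 0.5))[0]
    n_fg = min(int(R * fg_frac), len(fg))
    fg = np.random.choice(fg, n_fg, replace=False)
    bg = np.random.choice(bg, R - n_fg, replace=len(bg) < R - n_fg)
    labels = np.concatenate([proposal_classes[fg],            # u >= 1
                             np.zeros(R - n_fg, dtype=int)])  # u = 0
    return np.concatenate([fg, bg]), labels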

 

 

 

 

[Back-propagation through ROI pooling layers]

x_i = the i-th activation input into the RoI pooling layer

y_rj = the j-th output of the r-th RoI

A single value x_i can be assigned to several different outputs y_rj (one per overlapping RoI).

 

 

Equation 4 in the paper:

∂L/∂x_i = Σ_r Σ_j [i = i*(r, j)] ∂L/∂y_rj,  where i*(r, j) = argmax_{i' ∈ R(r, j)} x_i'

RoI pooling layer backward function)

-> computes the partial derivative ∂L/∂x_i as above.

∂L/∂y_rj is accumulated into ∂L/∂x_i whenever i is the argmax selected for y_rj by max pooling

(that is the condition inside the Iverson bracket).
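A minimal sketch of this gradient routing, assuming a flattened input and a precomputed argmax table (this is my illustration, not the paper's implementation):

import numpy as np

def roi_maxpool_backward(x, argmax, d_y):
    """Route each dL/dy_rj back to the winning input x_i.
    argmax[r, j] stores i*(r, j); overlapping RoIs accumulate."""
    d_x = np.zeros_like(x)
    n_rois, n_outputs = d_y.shape
    for r in range(n_rois):
        for j in range(n_outputs):
            d_x[argmax[r, j]] += d_y[r, j]  # the [i = i*(r, j)] condition
    return d_x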

 

 

[SGD hyper-parameters]

Fully connected (FC) layers -> used for softmax classification & bbox regression

-> initialized from zero-mean Gaussian distributions

with standard deviations 0.01 (classification) and 0.001 (regression); biases are set to 0.
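A short PyTorch-style sketch of that initialization; the 4096-d input and K = 20 (PASCAL VOC) are placeholder choices for illustration:

import torch.nn as nn

K = 20                                  # e.g. PASCAL VOC object classes
cls_score = nn.Linear(4096, K + 1)      # softmax classification head
bbox_pred = nn.Linear(4096, 4 * K)      # per-class box offsets

nn.init.normal_(cls_score.weight, mean=0.0, std=0.01)   # zero-mean Gaussian
nn.init.normal_(bbox_pred.weight, mean=0.0, std=0.001)
nn.init.zeros_(cls_score.bias)          # biases start at 0
nn.init.zeros_(bbox_pred.bias)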

 

 

 

 

 

 

[2.4 Scale Invariance]

How can scale invariance be achieved?

There are two ways~

 

1. Brute-force learning

During both training & testing, each image is resized to a pre-defined pixel size; the network must learn scale invariance directly from the data.

 

 

2. Multi-scale approach <- uses image pyramids

Image pyramid ==> approximately scale-normalizes each object proposal (at test time, each proposal is assigned the pyramid scale that brings it closest to 224 x 224 pixels).

Multi-scale training ==> tried only with smaller networks! (due to GPU memory limits)
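A small sketch contrasting the two strategies, assuming OpenCV for resizing (the helper is mine; s = 600 with a 1000-pixel cap and the five pyramid scales are the settings used in the paper):

import cv2

def resize_shorter_side(img, target=600, max_size=1000):
    """Brute-force single scale: shorter side -> a pre-defined size,
    capping the longer side so it never exceeds max_size."""
    h, w = img.shape[:2]
    scale = min(target / min(h, w), max_size / max(h, w))
    return cv2.resize(img, None, fx=scale, fy=scale), scale

# Multi-scale alternative: build an image pyramid and, at test time,
# assign each proposal the scale that makes it closest to 224x224.
PYRAMID_SCALES = (480, 576, 688, 864, 1200)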

 

 

 

More paper write-ups will be posted later.

Thank you, as always.

