This paper was written by Ross Girshick (Microsoft Research).
These are my notes on the body of the paper (a personal study post).
Continuing from Fine-tuning for detection, the last topic of the previous post, Fast R-CNN 정리(2).
[2.3 Fine-tuning for detection] - continued from the previous post..
SPP net is unable to update the weights below the SPP layer... Then why??
Root cause: back-propagation through the SPP net is inefficient!
(Because the training inputs are large)
-> An RoI's receptive field can sometimes span the entire input image, so the forward pass has to process almost the whole image for a single RoI.
But Fast R-CNN training is different!
Fast R-CNN training strategy)
Sample N images hierarchically & sample R/N RoIs from each image.
So, multiple RoIs are extracted from a single image (and they share computation & memory in the forward and backward passes).
Therefore, a small number of images (N) is desirable!
(A smaller N decreases the per-mini-batch computation!)
For example, with 2 images (N=2) and 128 RoIs (R=128),
training is roughly 64x faster than sampling one RoI per image from 128 different images (N=128); see the sketch below.
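Here is a minimal Python sketch of this hierarchical sampling idea (the helper name sample_minibatch and the toy data are mine, not from the paper): pick N images first, then draw R/N RoIs from each, so that all RoIs from one image can reuse that image's conv feature map.

```python
# A minimal sketch of Fast R-CNN's hierarchical sampling (assumed helper
# names; not the author's code). With N = 2 images and R = 128 RoIs,
# 64 RoIs per image share one forward/backward pass through the conv layers.
import random

def sample_minibatch(image_ids, rois_per_image, N=2, rois_per_batch=128):
    """Pick N images, then R/N RoIs from each (hierarchical sampling)."""
    chosen_images = random.sample(image_ids, N)
    rois_per_img = rois_per_batch // N  # 128 / 2 = 64
    minibatch = []
    for img_id in chosen_images:
        candidates = rois_per_image[img_id]          # proposals for this image
        picked = random.sample(candidates, min(rois_per_img, len(candidates)))
        minibatch.extend((img_id, roi) for roi in picked)
    return minibatch

# Usage (toy data): every RoI from the same image reuses that image's conv
# feature map, which is where the ~64x speed-up over 1 RoI/image comes from.
rois = {f"img{i}": [f"roi{j}" for j in range(200)] for i in range(10)}
batch = sample_minibatch(list(rois.keys()), rois, N=2, rois_per_batch=128)
print(len(batch))  # 128 RoIs drawn from only 2 images
```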
Fast R-CNN uses a streamlined training process with one fine-tuning stage that jointly optimizes
a softmax classifier and bounding-box regressors.
This procedure has several components:
(the loss, the mini-batch sampling strategy, back-propagation through the RoI pooling layer, and the SGD hyper-parameters)
[Multi-task Loss]
Fast R-CNN has two sibling output layers.
First output)
A discrete probability distribution over K+1 classes per RoI, computed by a softmax.
Second output)
Bounding-box regression offsets t^k = (t^k_x, t^k_y, t^k_w, t^k_h) for each of the K object classes (a 4-tuple per class).
t^k encodes an adjustment of an object proposal
(a scale-invariant translation & a log-space height/width shift).
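As a quick illustration of the two sibling heads, a hedged PyTorch sketch (the names cls_score and bbox_pred and feat_dim = 4096 are my assumptions; the paper only specifies the output shapes):

```python
# Sketch of the two sibling output layers (PyTorch used for illustration;
# layer names cls_score / bbox_pred are mine, not from the paper).
import torch
import torch.nn as nn

K = 20            # number of object classes (e.g., PASCAL VOC)
feat_dim = 4096   # assumed dimension of the last FC feature vector

cls_score = nn.Linear(feat_dim, K + 1)   # softmax over K classes + background
bbox_pred = nn.Linear(feat_dim, 4 * K)   # (tx, ty, tw, th) for each of K classes

roi_feat = torch.randn(64, feat_dim)              # 64 RoI feature vectors
p = torch.softmax(cls_score(roi_feat), dim=1)     # (64, K+1) class probabilities
t = bbox_pred(roi_feat).view(64, K, 4)            # (64, K, 4) per-class offsets
```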
Definitions)
1. u : the ground-truth class; each RoI is labeled with this u (u = 0 is the catch-all background class).
2. v : the ground-truth bbox regression target (the ideal position of the bbox).
The paper uses a multi-task loss L on each labeled RoI to jointly train for
classification & bbox regression.
Equation Summary)
L(p, u, t^u, v) = L_cls(p, u) + λ[u ≥ 1] L_loc(t^u, v), where L_cls(p, u) = -log p_u.
The Iverson bracket [u ≥ 1] is 1 when u ≥ 1 and 0 otherwise.
Catch-all background -> u = 0 -> L_loc is ignored (there is no ground-truth box for background RoIs).
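A minimal sketch of that loss, assuming PyTorch and λ = 1 (the function name and tensor shapes are mine; L_loc is the smooth-L1 loss the paper uses for the regression offsets):

```python
# Minimal sketch of the multi-task loss
#   L(p, u, t_u, v) = L_cls(p, u) + lam * [u >= 1] * L_loc(t_u, v)
# L_cls is cross-entropy (-log p_u); L_loc is smooth-L1 over the 4 offsets.
# Shapes and lam = 1 follow my reading of the paper; treat them as assumptions.
import torch
import torch.nn.functional as F

def multi_task_loss(p_logits, t_pred, u, v, lam=1.0):
    """p_logits: (R, K+1) class scores, t_pred: (R, K, 4) per-class offsets,
       u: (R,) ground-truth class labels, v: (R, 4) regression targets."""
    loss_cls = F.cross_entropy(p_logits, u)          # -log p_u (averaged over RoIs)
    fg = u >= 1                                      # Iverson bracket [u >= 1]
    if fg.any():
        # pick the predicted offsets for each RoI's ground-truth class u
        t_u = t_pred[fg, u[fg] - 1]                  # class k = 1..K -> index k-1
        loss_loc = F.smooth_l1_loss(t_u, v[fg])
    else:
        loss_loc = torch.tensor(0.0)                 # all-background mini-batch
    return loss_cls + lam * loss_loc
```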
[Mini-batch Sampling]
N=2 (number of images)
R = 128 (number of RoIs per mini-batch, i.e. 64 per image)
With these two settings, the paper
takes 25% of the RoIs from object proposals whose IoU with a ground-truth box is at least 0.5;
these are labeled with a foreground object class (u ≥ 1, i.e. not the background u = 0).
The remaining RoIs are sampled from object proposals as background (label u = 0)
(IoU range: [0.1, 0.5))
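A sketch of this labeling rule in numpy (the helper name label_rois and the input format are assumptions; the 0.5 threshold, the [0.1, 0.5) background range, and the 25% foreground fraction come from the text above):

```python
# Sketch of the per-image RoI labeling rule: 25% foreground with IoU >= 0.5,
# the rest background with max IoU in [0.1, 0.5).
import numpy as np

def label_rois(max_iou, gt_class, rois_per_image=64, fg_fraction=0.25):
    """max_iou: (P,) each proposal's best IoU with any ground-truth box,
       gt_class: (P,) class (1..K) of that best-matching ground-truth box."""
    fg_inds = np.where(max_iou >= 0.5)[0]
    bg_inds = np.where((max_iou >= 0.1) & (max_iou < 0.5))[0]

    n_fg = min(int(rois_per_image * fg_fraction), len(fg_inds))   # up to 16
    n_bg = min(rois_per_image - n_fg, len(bg_inds))
    fg_pick = np.random.choice(fg_inds, n_fg, replace=False)
    bg_pick = np.random.choice(bg_inds, n_bg, replace=False)

    keep = np.concatenate([fg_pick, bg_pick])
    labels = np.concatenate([gt_class[fg_pick], np.zeros(n_bg, dtype=int)])  # u = 0 for bg
    return keep, labels
```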
[Back-propagation through ROI pooling layers]
x_i = the i-th activation input into the RoI pooling layer
y_rj = the j-th output of the r-th RoI
A single value x_i may be assigned to several different outputs y_rj (one per RoI/bin that selects it).
RoI pooling layer backward function)
-> compute the partial derivative ∂L/∂x_i = Σ_r Σ_j [i = i*(r, j)] ∂L/∂y_rj
That is, ∂L/∂y_rj is accumulated into ∂L/∂x_i if i is the argmax i*(r, j) selected for y_rj by max pooling
(this is the condition inside the Iverson bracket).
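To make the argmax-accumulation rule concrete, a toy 1-D numpy sketch (not the real 2-D RoI pooling layer; bin edges and helper names are mine): the backward pass adds ∂L/∂y_rj onto ∂L/∂x_i only where i was selected by max pooling, and overlapping RoIs accumulate into the same x_i.

```python
# Toy 1-D illustration of the RoI pooling backward rule:
# dL/dx_i accumulates dL/dy_rj for every output y_rj whose max-pool argmax was i.
import numpy as np

def roi_maxpool_forward(x, rois, bins=2):
    """x: (W,) 1-D feature vector; rois: list of (start, end) index ranges."""
    outputs, argmaxes = [], []
    for start, end in rois:
        edges = np.linspace(start, end, bins + 1).astype(int)
        y, amax = [], []
        for j in range(bins):
            window = slice(edges[j], max(edges[j] + 1, edges[j + 1]))
            idx = window.start + int(np.argmax(x[window]))
            y.append(x[idx]); amax.append(idx)
        outputs.append(y); argmaxes.append(amax)
    return np.array(outputs), np.array(argmaxes)

def roi_maxpool_backward(dL_dy, argmaxes, input_size):
    """Accumulate dL/dy_rj into dL/dx_i wherever i was the selected argmax."""
    dL_dx = np.zeros(input_size)
    for r in range(dL_dy.shape[0]):              # sum over RoIs r
        for j in range(dL_dy.shape[1]):          # sum over bins j
            dL_dx[argmaxes[r, j]] += dL_dy[r, j]  # the [i = i*(r, j)] condition
    return dL_dx

x = np.array([0.1, 0.9, 0.3, 0.7, 0.2, 0.8])
y, amax = roi_maxpool_forward(x, rois=[(0, 4), (2, 6)])   # overlapping RoIs share x_i
dx = roi_maxpool_backward(np.ones_like(y), amax, len(x))
print(dx)  # gradients pile up at the argmax positions shared by both RoIs
```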
[SGD hyper-parameters]
The fully connected (FC) layers used for softmax classification & bbox regression
are initialized from zero-mean Gaussian distributions
with standard deviations 0.01 and 0.001, respectively.
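A sketch of this initialization in PyTorch (reusing the hypothetical cls_score / bbox_pred layers from the earlier sketch; the zero bias initialization is an assumption based on common practice):

```python
# Zero-mean Gaussian initialization of the two output FC layers.
import torch.nn as nn

cls_score = nn.Linear(4096, 21)      # K = 20 classes + background
bbox_pred = nn.Linear(4096, 4 * 20)

nn.init.normal_(cls_score.weight, mean=0.0, std=0.01)   # zero-mean Gaussian, std 0.01
nn.init.normal_(bbox_pred.weight, mean=0.0, std=0.001)  # zero-mean Gaussian, std 0.001
nn.init.zeros_(cls_score.bias)   # biases set to 0 (assumed, common practice)
nn.init.zeros_(bbox_pred.bias)
```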
[2.4 Scale Invariance]
How is scale invariance achieved?
There are two ways~
1. Brute-force learning (single scale)
During both training & testing, each image is resized to a pre-defined pixel size, and the network must learn scale invariance directly from the data.
2. Multi-scale approach (image pyramids)
An image pyramid is used to approximately scale-normalize each object proposal.
Multi-scale training is only done for the smaller networks (due to GPU memory limits); a resize sketch for the single-scale case follows below.
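For the brute-force (single-scale) case, a resize sketch, assuming shorter side = 600 px with the longer side capped at 1000 px (these exact numbers come from the paper's experiments, not from this section; treat them as assumptions here):

```python
# Sketch of the brute-force single-scale preprocessing: resize each image so
# its shorter side matches a fixed target. Pillow is used here for resizing.
from PIL import Image

def resize_shorter_side(img, target=600, max_side=1000):
    w, h = img.size
    scale = target / min(w, h)
    # cap the longer side so very elongated images still fit in memory
    if max(w, h) * scale > max_side:
        scale = max_side / max(w, h)
    return img.resize((round(w * scale), round(h * scale)), Image.BILINEAR), scale

img = Image.new("RGB", (500, 375))   # dummy image standing in for real data
resized, scale = resize_shorter_side(img)
print(resized.size, scale)           # (800, 600), 1.6
```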
I plan to keep posting notes on more papers.
Thank you again today.