CS231 Lec.02 Image Classification

CS231 Summary & Free Interpretation

은하철도119 2021. 1. 4. 16:43

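Image Detection: an explanation of how a computer perceives an image

A computer sees an image as a grid of numbers in the range [0, 255].

Pixels: height x width x RGB (3 color channels)
Whenever the camera direction changes, the pixel values change too.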
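
To make the "grid of numbers" idea concrete, here is a minimal sketch using NumPy. The tiny 2 x 3 image and the one-pixel shift (a crude stand-in for moving the camera) are made up for illustration and are not from the lecture.

```python
import numpy as np

# Hypothetical tiny "image": 2 pixels high, 3 pixels wide, 3 color channels (RGB).
# Every value is an integer in [0, 255].
image = np.array(
    [
        [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
        [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
    ],
    dtype=np.uint8,
)

print(image.shape)  # (height, width, channels)
print(image[0, 0])  # RGB values of the top-left pixel

# Shift every column one step to the right (with wrap-around).
# The picture content is the same, but the numbers at each grid position change.
shifted = np.roll(image, shift=1, axis=1)
num_changed = int(np.sum(np.any(image != shifted, axis=2)))
print(num_changed, "of", image.shape[0] * image.shape[1], "pixels changed")
```

This is why comparing raw pixel values is fragile: a small change in viewpoint changes almost every number in the grid.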

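Image Classification: assigning an image to one of a set of categories

Semantic Gap: the gap between the meaning we want to assign to an image (e.g. "cat") and the raw pixel values the computer actually sees

However, classifying images comes with several challenges.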

Challenges: Viewpoint Variation

Variation caused by the viewing angle

Challenges: Intraclass Variation

Variation within a single category, e.g. recognizing that a baby kitten and an adult cat both belong to the same class

Challenges: Fine-Grained Categories

Different types of cats (e.g. telling different cat breeds apart)

Challenges: Background Clutter

Blend into the background (the cat has to be recognized regardless of what is behind it)

Challenges: Illumination Changes

Images affected by lighting conditions

Challenges: Deformation

Cat's weird posture (unusual changes in the cat's shape)

Challenges: Occlusion

The cat is hidden by an object (e.g. a cat hiding behind a cushion or under a tissue)

Image Classification: Very Useful!

Medical Imaging, Whale Recognition, Galaxy Classification
Classification is put to good use in cases like the ones above.

Building Block for other tasks!

Ex1) By classifying sub-regions of a picture, we can do object detection (finding the objects inside an image).
Ex2) Playing Go: image recognition can also be applied to Go (by reading the stone positions from an image of the board, the state of the game can be tracked).

An Image Classifier

Machine Learning Data-Driven Approach

Collect a dataset of images and labels
Use Machine Learning to train a classifier
Evaluate the classifier on new images

-> In other words, machine learning can be used to build the image classifier above.

Top) giving a label to each image. Bottom) testing the labels with prediction?

Image Classification Datasets: MNIST

Used for recognizing handwritten digit images; a simple dataset

Image Classification Datasets: CIFAR10 & CIFAR100

The number of classes to distinguish keeps growing (10 in CIFAR10, 100 in CIFAR100)

Image Classification Datasets: ImageNet

A super important dataset for benchmarking.

Much larger and harder than MNIST and CIFAR, with many more categories

Image Classification Datasets: MIT Places (Places365?)

Image Classification Datasets: Omniglot

Handwritten characters from written languages, 1623 categories

Meant to test low-shot learning (= few-shot learning)

First Classifier: Nearest Neighbor
A distance metric is used to compare images

<Top> memorizes the training data,

<Bottom> finds the training image closest to each test image and returns its label

Prediction algorithm
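
The memorize-then-look-up procedure described above can be sketched as follows. This is a minimal illustration, not the lecture's exact code: the class name, the use of L1 distance, and the tiny made-up dataset are all assumptions.

```python
import numpy as np


class NearestNeighbor:
    """1-nearest-neighbor classifier using L1 distance (illustrative sketch)."""

    def train(self, images, labels):
        # "Training" only memorizes the data; no computation happens here.
        self.images = np.asarray(images, dtype=np.float64)
        self.labels = np.asarray(labels)

    def predict(self, test_images):
        test_images = np.asarray(test_images, dtype=np.float64)
        predictions = []
        for test_image in test_images:
            # L1 distance from this test image to every memorized training image.
            distances = np.sum(np.abs(self.images - test_image), axis=1)
            # Copy the label of the closest training image.
            predictions.append(self.labels[np.argmin(distances)])
        return np.array(predictions)


# Made-up data: each "image" is flattened into a vector of 4 pixel values.
train_images = [[0, 0, 0, 0], [255, 255, 255, 255], [10, 20, 10, 20]]
train_labels = ["dark", "bright", "dark"]

classifier = NearestNeighbor()
classifier.train(train_images, train_labels)
predicted = classifier.predict([[240, 250, 245, 255], [5, 5, 5, 5]])
print(predicted)
```

Note the cost profile: training is instant, while every prediction has to scan the whole training set, which is the opposite of what we usually want.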

Nearest Neighbor Decision Boundaries

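To make the decision boundaries smoother, use more neighbors.

This reduces the effect of outliers (points that fall far from the rest of their class).

When K > 1, there can be ties between classes (classes that receive the same number of votes). In that case, the ties have to be broken somehow.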
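
A minimal sketch of majority voting with K neighbors, including one possible way to break ties. The function name `knn_predict`, the tie-breaking rule (the tied class whose member is closest to the query wins), and the toy points are assumptions for illustration; the notes do not say which rule to use.

```python
from collections import Counter

import numpy as np


def knn_predict(train_points, train_labels, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    train_points = np.asarray(train_points, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)
    distances = np.sqrt(np.sum((train_points - query) ** 2, axis=1))
    # Indices of the k closest points, nearest first (stable sort for determinism).
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(train_labels[i] for i in nearest)
    top_count = max(votes.values())
    tied = {label for label, count in votes.items() if count == top_count}
    # Tie-break (one possible rule): among the tied labels, return the one
    # that owns the neighbor closest to the query.
    for i in nearest:
        if train_labels[i] in tied:
            return train_labels[i]


# Three "cat" points, two far-away "dog" points, and one "dog" outlier
# sitting in the middle of the cats.
points = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [0.6, 0.6]]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
query = [0.5, 0.5]

for k in (1, 2, 3):
    print(k, knn_predict(points, labels, query, k))
```

With K=1 the outlier decides the answer, with K=3 the surrounding cats outvote it, and K=2 produces a 1-1 tie that the tie-breaking rule has to resolve.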

K-Nearest Neighbors: Distance Metric

L1 (Manhattan) distance
L2 (Euclidean) distance
With a well-chosen distance metric, K-nearest neighbor can be applied to any type of data
In practice, though, L2 distance is the one used most often
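
For reference, the two metrics can be written as small functions. This is a sketch with hypothetical function names: L1 sums the absolute differences, and L2 takes the square root of the summed squared differences.

```python
import numpy as np


def l1_distance(a, b):
    """Manhattan distance: sum of absolute element-wise differences."""
    # Cast to float first so uint8 pixel values cannot wrap around on subtraction.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sum(np.abs(a - b)))


def l2_distance(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sqrt(np.sum((a - b) ** 2)))


# Made-up two-pixel "images".
image_a = [0, 0]
image_b = [3, 4]
print(l1_distance(image_a, image_b))
print(l2_distance(image_a, image_b))
```

Swapping one function for the other is all it takes to change the metric of a nearest-neighbor classifier, which is why the metric is treated as a hyperparameter.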

Hyperparameters (settings of the machine learning model, such as K and the distance metric)

Idea 1: choose hyperparameters that work best on the data

Bad: K=1 always works perfectly on training data

Idea 2: split data into train and test, choose hyperparameters that work best on test data

Bad: No idea how algorithm will perform on new data

Idea 3: split data into train, val, and test; choose hyperparameters on val and evaluate on test --> Better

Idea 4: Cross-Validation: split data into folds, try each fold as validation and average the results

Useful for small datasets, but not used too frequently in deep learning

This is the most ideal and accurate approach, but training is expensive, so it is used only on small datasets
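
The ideas above can be sketched on synthetic data. Everything here is an assumption for illustration: the two-cluster dataset, the 160/40 and 120/40 splits, the candidate values of K, and the helper name `knn_accuracy`.

```python
import numpy as np


def knn_accuracy(train_x, train_y, eval_x, eval_y, k):
    """Accuracy of a k-NN classifier (squared L2 distance, majority vote)."""
    correct = 0
    for x, y in zip(eval_x, eval_y):
        distances = np.sum((train_x - x) ** 2, axis=1)
        nearest_labels = train_y[np.argsort(distances, kind="stable")[:k]]
        predicted = np.bincount(nearest_labels).argmax()
        correct += int(predicted == y)
    return correct / len(eval_y)


# Synthetic two-class data: two Gaussian blobs in 2D.
rng = np.random.default_rng(0)
points = np.concatenate([rng.normal(0.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
labels = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])
order = rng.permutation(len(points))
points, labels = points[order], labels[order]

# Hold the test set out first; it is used exactly once, at the end.
dev_x, test_x = points[:160], points[160:]
dev_y, test_y = labels[:160], labels[160:]
candidate_ks = [1, 3, 5, 7, 9]

# Idea 1 pitfall: K=1 scored on its own training data is always perfect.
train_accuracy_k1 = knn_accuracy(dev_x, dev_y, dev_x, dev_y, 1)

# Idea 3: a single train/val split; choose K on the validation set.
train_x, val_x = dev_x[:120], dev_x[120:]
train_y, val_y = dev_y[:120], dev_y[120:]
val_scores = {k: knn_accuracy(train_x, train_y, val_x, val_y, k) for k in candidate_ks}
best_k_holdout = max(val_scores, key=val_scores.get)

# Idea 4: 5-fold cross-validation; each fold serves as validation once.
num_folds = 5
fold_indices = np.array_split(np.arange(len(dev_x)), num_folds)
cv_scores = {}
for k in candidate_ks:
    fold_accuracies = []
    for fold in fold_indices:
        mask = np.ones(len(dev_x), dtype=bool)
        mask[fold] = False  # train on everything except this fold
        fold_accuracies.append(
            knn_accuracy(dev_x[mask], dev_y[mask], dev_x[fold], dev_y[fold], k)
        )
    cv_scores[k] = float(np.mean(fold_accuracies))
best_k_cv = max(cv_scores, key=cv_scores.get)

# Final evaluation on the untouched test set with the chosen K.
test_accuracy = knn_accuracy(dev_x, dev_y, test_x, test_y, best_k_cv)
print(train_accuracy_k1, best_k_holdout, best_k_cv, test_accuracy)
```

Cross-validation evaluates every candidate K five times instead of once, which is cheap for K-nearest neighbor on 160 points but quickly becomes costly when each evaluation means training a large model.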

K-nearest Neighbor: Universal Approximation

As the number of training samples goes to infinity, nearest neighbor can represent any function!

The more training samples we have, the more closely nearest neighbor can represent any function

Problem: Curse of Dimensionality (gee)

For uniform coverage of space, the number of training points needed grows exponentially with the number of dimensions
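
A quick back-of-the-envelope sketch of that exponential growth. The choice of 4 points per axis and the 32 x 32 binary-image example are assumptions picked for illustration.

```python
def points_for_uniform_coverage(points_per_axis, num_dimensions):
    """Grid points needed to keep the same sampling density along every axis."""
    return points_per_axis ** num_dimensions


# Keeping 4 points per axis, the total needed explodes as dimensions are added.
for num_dimensions in (1, 2, 3, 10):
    print(num_dimensions, points_for_uniform_coverage(4, num_dimensions))

# Even for tiny 32 x 32 images with binary (black/white) pixels, the number of
# distinct possible images is 2 ** (32 * 32).
num_binary_images = 2 ** (32 * 32)
num_digits = len(str(num_binary_images))
print("digits in the number of possible 32x32 binary images:", num_digits)
```

No realistic dataset can cover a space of that size densely, so in high dimensions the "nearest" neighbor of a test image is usually not very near at all.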