💡 This post summarizes '[Dataset] Object Detection/Segmentation Open Dataset: COCO Dataset'.
The COCO dataset is the most basic dataset to know for any object detection or segmentation task. This post covers it from its structure through to how to use it, so please use it as a reference.
1. COCO Dataset
COCO dataset structure
COCO dataset annotations are stored in JSON format, and the basic structure must contain the following required keys.
'images': [
    {
        'file_name': 'COCO_val2014_000000001268.jpg',
        'height': 427,
        'width': 640,
        'id': 1268
    },
    ...
],
'annotations': [
    {
        'segmentation': [[192.81, 247.09, ..., 219.03, 249.06]],  # if you have mask labels
        'area': 1035.749,
        'iscrowd': 0,
        'image_id': 1268,
        'bbox': [192.81, 224.8, 74.73, 33.43],
        'category_id': 16,
        'id': 42986
    },
    ...
],
'categories': [
    {'id': 0, 'name': 'car'},
]
The JSON file has three required keys:
- images: a list of images with their information, such as file_name, height, width, and id.
- annotations: a list of instance annotations.
- categories: a list of category names and their IDs.
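As a rough illustration of the three keys above, the following sketch builds a minimal COCO-style dictionary and reports any required key that is missing. The helper name `missing_coco_keys`, the category name, and the trimmed sample record are made up for illustration; they are not part of any COCO tooling.

```python
import json

REQUIRED_KEYS = ("images", "annotations", "categories")


def missing_coco_keys(dataset: dict) -> list:
    """Return the required top-level keys that are absent from `dataset`."""
    return [key for key in REQUIRED_KEYS if key not in dataset]


# Hypothetical one-image dataset; the category name is a placeholder.
sample = {
    "images": [
        {"file_name": "COCO_val2014_000000001268.jpg", "height": 427, "width": 640, "id": 1268}
    ],
    "annotations": [
        {
            "area": 1035.749,
            "iscrowd": 0,
            "image_id": 1268,
            "bbox": [192.81, 224.8, 74.73, 33.43],
            "category_id": 16,
            "id": 42986,
        }
    ],
    "categories": [{"id": 16, "name": "example_class"}],
}

# Round-trip through JSON text, as if the dictionary had been read from a file.
restored = json.loads(json.dumps(sample))
print(missing_coco_keys(restored))
print(missing_coco_keys({"images": []}))
```

Running a check like this before training can catch a malformed annotation file early, before a framework fails with a less obvious error.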
After data pre-processing, training on a custom dataset in an existing format (e.g. COCO format) takes two steps:
- Modify the config file to use the custom dataset.
- Check the annotations of the custom dataset.
The MMDetection tutorial listed in the references below illustrates both steps with a custom 5-class dataset in COCO format, used to train an existing Cascade Mask R-CNN R50 FPN detector.
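One possible reading of the "check the annotations" step is a consistency pass over the JSON before training. The sketch below is an assumption about what such a check could look like, not MMDetection's own procedure: the function name `check_annotations` and the sample data are hypothetical. It relies only on the fact that a COCO bbox is stored as [x, y, width, height].

```python
def check_annotations(dataset: dict) -> list:
    """Collect human-readable problems found in a COCO-style dataset dict."""
    images = {img["id"]: img for img in dataset["images"]}
    category_ids = {cat["id"] for cat in dataset["categories"]}
    problems = []
    for ann in dataset["annotations"]:
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        inside = x >= 0 and y >= 0 and x + w <= img["width"] and y + h <= img["height"]
        if w <= 0 or h <= 0 or not inside:
            problems.append(f"annotation {ann['id']}: bbox {ann['bbox']} is empty or outside the image")
    return problems


# Hypothetical data: the first annotation is valid, the second has an unknown category.
example = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 100, "height": 80}],
    "categories": [{"id": 0, "name": "car"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 0, "bbox": [5, 5, 20, 20]},
        {"id": 11, "image_id": 1, "category_id": 7, "bbox": [5, 5, 20, 20]},
    ],
}

for problem in check_annotations(example):
    print(problem)
```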
2. Downloading the COCO dataset
The COCO dataset can normally be downloaded from the link below, but no matter how many times I clicked a dataset, the download would not start.
- coco dataset download: https://cocodataset.org/#download
Downloading with wget did not work either, possibly because the links had changed. I found that the paths had changed to the following.
# images
wget http://images.cocodataset.org/zips/train2017.zip # train dataset
wget http://images.cocodataset.org/zips/val2017.zip # validation dataset
wget http://images.cocodataset.org/zips/test2017.zip # test dataset
wget http://images.cocodataset.org/zips/unlabeled2017.zip
# annotations
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/image_info_test2017.zip
wget http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip
In the paths above, for both images and annotations, replace the dataset name (such as train2017) with the one you want (e.g. train2014) and download it. In addition, downloading and unzipping annotations_trainval2017.zip gives you the annotation files.
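The same downloads can also be scripted from Python. The sketch below only assumes the URL pattern visible in the wget commands above; the helper names (`image_zip_url`, `annotation_zip_url`, `download_and_extract`) are hypothetical, and nothing is downloaded until `download_and_extract` is called explicitly.

```python
import urllib.request
import zipfile
from pathlib import Path

BASE_URL = "http://images.cocodataset.org"


def image_zip_url(split: str) -> str:
    """URL of an image archive, e.g. split='val2017'."""
    return f"{BASE_URL}/zips/{split}.zip"


def annotation_zip_url(name: str) -> str:
    """URL of an annotation archive, e.g. name='annotations_trainval2017'."""
    return f"{BASE_URL}/annotations/{name}.zip"


def download_and_extract(url: str, dest_dir: str) -> Path:
    """Download the zip at `url` into `dest_dir` (skipped if already present) and unzip it there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / url.rsplit("/", 1)[-1]
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


print(image_zip_url("val2017"))
print(annotation_zip_url("annotations_trainval2017"))
```

For example, `download_and_extract(annotation_zip_url("annotations_trainval2017"), "coco")` would fetch and unzip the annotation archive into a local `coco` directory (the directory name is arbitrary).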
3. Using the COCO API
The downloaded COCO dataset can be used from Python through the API, which lets you visualize both the images and their annotation information.
1) Initializing the COCO API
%matplotlib inline
# (Jupyter magic: omit the line above when running as a plain Python script)
from pycocotools.coco import COCO
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (8.0, 10.0)  # default figure size in inches
dataDir = '..'        # root directory that contains the annotations/ folder
dataType = 'val2017'  # dataset split to load
# initialize COCO api for person keypoints annotations
annFile = '{}/annotations/person_keypoints_{}.json'.format(dataDir, dataType)
coco = COCO(annFile)
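When the annotation file is loaded, the API builds lookup tables so that images and annotations can be fetched by ID. The following is a simplified sketch of that idea using only the standard library. It is not the pycocotools implementation, and the names `load_annotation_file`, `build_index`, and the tiny sample data are hypothetical.

```python
import json
from collections import defaultdict


def load_annotation_file(path: str) -> dict:
    """Read a COCO-style annotation JSON file into a dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def build_index(dataset: dict):
    """Return (images by id, annotations grouped by image_id)."""
    images = {img["id"]: img for img in dataset["images"]}
    anns_by_image = defaultdict(list)
    for ann in dataset["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)
    return images, anns_by_image


# Tiny made-up dataset standing in for the result of load_annotation_file(annFile).
tiny = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 1},
        {"id": 12, "image_id": 2, "category_id": 1},
    ],
    "categories": [{"id": 1, "name": "person"}],
}

images, anns_by_image = build_index(tiny)
print(images[1]["file_name"], [ann["id"] for ann in anns_by_image[1]])
```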
2) Printing the COCO categories and supercategories
# with the person_keypoints file loaded above, the only category is 'person'
cats = coco.loadCats(coco.getCatIds())
nms = [cat['name'] for cat in cats]
print('COCO categories: \n{}\n'.format(' '.join(nms)))
nms = set([cat['supercategory'] for cat in cats])
print('COCO supercategories: \n{}'.format(' '.join(nms)))
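The two printouts above list names and supercategories separately. To see which category belongs to which supercategory, the records returned by `coco.loadCats(...)` can be grouped as in the sketch below. The function name `group_by_supercategory` is hypothetical, and the three sample records are a small hand-written stand-in for real `loadCats` output.

```python
from collections import defaultdict


def group_by_supercategory(cats: list) -> dict:
    """Map each supercategory to the sorted names of its categories."""
    groups = defaultdict(list)
    for cat in cats:
        groups[cat.get("supercategory", "unknown")].append(cat["name"])
    return {name: sorted(members) for name, members in groups.items()}


# Hand-written stand-in for the list returned by coco.loadCats(coco.getCatIds()).
sample_cats = [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 18, "name": "dog", "supercategory": "animal"},
    {"id": 17, "name": "cat", "supercategory": "animal"},
]

for supercategory, names in group_by_supercategory(sample_cats).items():
    print(supercategory, names)
```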
3) Visualizing an image
# get all images containing the given categories
# (only 'person' matches when the person_keypoints file is loaded)
catIds = coco.getCatIds(catNms=['person', 'dog', 'skateboard'])
imgIds = coco.getImgIds(catIds=catIds)
# pin the selection to one known image; drop this line to pick at random from all matches
imgIds = coco.getImgIds(imgIds=[324158])
img = coco.loadImgs(imgIds[np.random.randint(0, len(imgIds))])[0]
# load and display image
# I = io.imread('%s/images/%s/%s'%(dataDir,dataType,img['file_name']))
# use url to load image
I = io.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()
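The code above reads the image from `coco_url`, with the local-file variant left commented out. A small helper could choose between the two automatically. This is only a sketch: `resolve_image_source` is a hypothetical name, the directory layout follows the commented-out line (`<dataDir>/images/<dataType>/<file_name>`), and the image record is a placeholder rather than a real entry.

```python
import tempfile
from pathlib import Path


def resolve_image_source(img: dict, data_dir: str, data_type: str) -> str:
    """Prefer <data_dir>/images/<data_type>/<file_name>; otherwise use img['coco_url']."""
    local_path = Path(data_dir) / "images" / data_type / img["file_name"]
    if local_path.is_file():
        return str(local_path)
    return img["coco_url"]


# Placeholder image record; real records come from coco.loadImgs(...).
example_img = {"file_name": "example.jpg", "coco_url": "http://example.com/example.jpg"}

with tempfile.TemporaryDirectory() as tmp:
    # No local file exists yet, so the URL is returned.
    source_without_file = resolve_image_source(example_img, tmp, "val2017")
    # Create an empty local file and resolve again; now the local path is returned.
    local_file = Path(tmp) / "images" / "val2017" / "example.jpg"
    local_file.parent.mkdir(parents=True)
    local_file.write_bytes(b"")
    source_with_file = resolve_image_source(example_img, tmp, "val2017")

print(source_without_file)
```

The returned string can be passed to `io.imread(...)` in either case, since it accepts both file paths and URLs.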
4) Visualizing annotations
# load and display keypoints annotations
plt.imshow(I); plt.axis('off')
ax = plt.gca()
# `coco` was initialized with the person keypoints file above, so it is used directly here
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)
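To work with the loaded annotations numerically rather than only drawing them, it helps to unpack two fields. In COCO, `bbox` is [x, y, width, height], and `keypoints` is a flat list of (x, y, v) triples where v is 0 for not labeled, 1 for labeled but not visible, and 2 for labeled and visible. The helper names and the sample annotation below are made up for illustration.

```python
def xywh_to_xyxy(bbox: list) -> list:
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def decode_keypoints(flat: list) -> list:
    """Split a flat COCO keypoints list into (x, y, visibility) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Made-up annotation with three keypoints: visible, not labeled, labeled but hidden.
sample_ann = {
    "bbox": [10, 20, 30, 40],
    "keypoints": [12, 25, 2, 0, 0, 0, 30, 50, 1],
}

corners = xywh_to_xyxy(sample_ann["bbox"])
triples = decode_keypoints(sample_ann["keypoints"])
labeled = [kp for kp in triples if kp[2] > 0]    # v is 1 or 2
visible = [kp for kp in triples if kp[2] == 2]   # v is exactly 2
print(corners, len(labeled), len(visible))
```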
References
- [Github] coco API : https://github.com/cocodataset/cocoapi
- [Github] coco API Demo: https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoDemo.ipynb
- [mmdetection] Customize Datasets: https://mmdetection.readthedocs.io/en/v2.11.0/tutorials/customize_dataset.html