💡 This post summarizes '[Data] Segmentation data compression algorithm: Run Length Encoding(RLE) - from coco mask to rle through verifying rle to mask'.
It covers Run Length Encoding (RLE), an algorithm often used to compress mask data in segmentation, and walks through the encoder and decoder algorithms at the code level for reference.
1. What is Run Length Encoding(RLE)?
RLE stands for "Run Length Encoding" and is one of the compression algorithms used for image and video data. When consecutive pixels in an image share the same value, the algorithm stores that value together with the number of times it repeats, which reduces the size of the data.
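As a minimal sketch of the idea (not one of the mask formats discussed later), the snippet below stores each run as a (value, count) pair and expands it back. The function names `run_length_encode` and `run_length_decode` are hypothetical and chosen only for this illustration.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


row = [0, 0, 0, 0, 255, 255, 255, 0, 0]   # one row of pixel values
encoded = run_length_encode(row)
print(encoded)                             # [(0, 4), (255, 3), (0, 2)]
print(run_length_decode(encoded) == row)   # True
```

Nine pixel values become three pairs here; the saving grows with the length of the runs, which is why the scheme suits segmentation masks with large uniform regions.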
RLE Variation
By now you will have understood that the RLE algorithm itself simply stores repeated runs of characters. In practice, however, there are several implementation variations, so each variant is described here. Assume we are compressing the string 00111011.
- 3372: write the 1-based position where a run of 1s starts, followed by how many 1s appear consecutively (start 3, length 3; start 7, length 2)
- 2312: list the lengths of consecutive runs in the order 0, 1, 0, 1, ... (two 0s, three 1s, one 0, two 1s)
Let's now go through the encoder and decoder code for each method.
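The two variants can be reproduced on the string 00111011 with a short pure-Python sketch. The helper names `start_length_pairs` and `alternating_counts` are hypothetical and exist only for this illustration.

```python
from itertools import groupby


def start_length_pairs(bits):
    """Variant 1: 1-based start position and length of every run of '1'."""
    out, pos = [], 1
    for value, group in groupby(bits):
        n = len(list(group))
        if value == "1":
            out += [pos, n]
        pos += n
    return out


def alternating_counts(bits):
    """Variant 2: run lengths in 0, 1, 0, 1, ... order."""
    out = []
    for i, (value, group) in enumerate(groupby(bits)):
        if i == 0 and value == "1":
            out.append(0)  # the list always opens with a (possibly empty) run of 0s
        out.append(len(list(group)))
    return out


bits = "00111011"
print(start_length_pairs(bits))  # [3, 3, 7, 2]
print(alternating_counts(bits))  # [2, 3, 1, 2]
```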
2. RLE Converter
1) 3372: write the position where a run of 1s starts, followed by how many 1s appear consecutively
mask to rle
The RLE encoding function mask2rle takes an image mask as input and returns the mask compressed in RLE format.
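# ref.: https://www.kaggle.com/stainsby/fast-tested-rle
import numpy as np

def mask2rle(img):
    """
    img: numpy array, 1 - mask, 0 - background
    Returns run length as string formatted
    """
    pixels = img.flatten()  # row-major 1-D copy of the mask
    pixels = np.concatenate([[0], pixels, [0]])  # pad so runs touching either edge are detected
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1  # 1-based positions where the value changes
    runs[1::2] -= runs[::2]  # turn each run end into a run length
    return ' '.join(str(x) for x in runs)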
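- Flatten the image mask into a 1-D array
- Add a 0 at the start and at the end of the array
- Find the indices where the value changes, which mark where each run begins
- Compute the length of each run
- Convert the result to RLE form: start positions and lengths are recorded alternately, and each length is obtained as the difference between the position where the run ends and its start position
- Return the RLE information as a string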
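To make these steps concrete, here is a small trace of the intermediate arrays for the 00111011 example. It repeats the same NumPy operations outside the function; the variable names `padded` and `changes` are introduced only for this sketch.

```python
import numpy as np

mask = np.array([0, 0, 1, 1, 1, 0, 1, 1], dtype=np.uint8)  # the 00111011 example

padded = np.concatenate([[0], mask, [0]])
# padded  -> [0 0 0 1 1 1 0 1 1 0]

changes = np.where(padded[1:] != padded[:-1])[0] + 1
# changes -> [3 6 7 9]
# 3 and 7 are the 1-based starts of the runs of 1s,
# 6 and 9 are the positions just past the end of each run

changes[1::2] -= changes[::2]
# changes -> [3 3 7 2], i.e. (start, length) pairs

print(' '.join(str(x) for x in changes))  # 3 3 7 2
```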
rle to mask
The RLE decoding function rle2mask takes a mask compressed in RLE format as input and returns the image mask decoded back to its original form.
# Assumption: DEFAULT_IMAGE_SHAPE is not defined in the original snippet;
# (720, 1280) is borrowed from the decoding example later in this post.
DEFAULT_IMAGE_SHAPE = (720, 1280)

def rle2mask(mask_rle: str, label=1, shape=DEFAULT_IMAGE_SHAPE):
    """
    mask_rle: run-length as string formatted (start length)
    shape: (height,width) of array to return
    Returns numpy array, 1 - mask, 0 - background
    """
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1  # RLE positions are 1-based; convert to 0-based indices
    ends = starts + lengths
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = label
    return img.reshape(shape)  # row-major reshape, matching img.flatten() in mask2rle
- Split the RLE string on whitespace to obtain the start positions (starts) and the lengths (lengths)
- Subtract 1 from the start positions to convert them to 0-based indices
- Compute the end positions (ends) by adding the lengths to the start positions (starts)
- Create a zero-initialized array sized to the given shape to represent the image mask
- Use the start positions (starts) and end positions (ends) to set the corresponding regions of the mask to label (1 by default)
- Finally, reshape the 1-D array to the given shape and return the image mask.
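One way to check that the encoder and decoder agree is a round trip: encode a mask, decode the string, and compare with the original. The sketch below restates both functions in compact form so it runs on its own; here `shape` is a required argument rather than a default, and the random test mask is an assumption made only for this check.

```python
import numpy as np


def mask2rle(img):
    """Encode a 0/1 mask as a '(start length) ...' string, row-major, 1-based."""
    pixels = np.concatenate([[0], img.flatten(), [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)


def rle2mask(mask_rle, shape, label=1):
    """Decode a '(start length) ...' string back into a mask of the given shape."""
    s = mask_rle.split()
    starts = np.asarray(s[0::2], dtype=int) - 1  # to 0-based indices
    lengths = np.asarray(s[1::2], dtype=int)
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, n in zip(starts, lengths):
        img[lo:lo + n] = label
    return img.reshape(shape)


rng = np.random.default_rng(0)
mask = (rng.random((4, 6)) > 0.5).astype(np.uint8)  # arbitrary test mask
rle = mask2rle(mask)
restored = rle2mask(rle, mask.shape)
print(np.array_equal(mask, restored))  # True when the round trip is lossless
```

The decoder needs the original shape because the string records only positions and lengths, not the image dimensions.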
2) 2312: list the lengths of consecutive runs in the order 0, 1, 0, 1, ...
mask to rle
import numpy as np
from itertools import groupby
from pycocotools.coco import COCO

def binary_mask_to_rle(binary_mask):
    rle = {'counts': [], 'size': list(binary_mask.shape)}
    counts = rle.get('counts')
    # Walk the mask column by column (Fortran order)
    for i, (value, elements) in enumerate(groupby(binary_mask.ravel(order='F'))):
        if i == 0 and value == 1:
            counts.append(0)  # counts always begin with a run of 0s, so emit an empty one
        counts.append(len(list(elements)))
    return rle

coco = COCO('/home/avs/dataset/train_exd/con2.json')
annids = coco.getAnnIds()
anns = coco.loadAnns(annids)
mask = coco.annToMask(anns[0])  # binary mask of the first annotation
rle = binary_mask_to_rle(mask)
print(rle)
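For readers without the COCO annotation file at hand, the encoder can be tried on a tiny hand-made mask. The sketch below restates the function so it runs without pycocotools; the 2x3 mask is an assumption chosen to show both the column-major traversal and the leading 0.

```python
import numpy as np
from itertools import groupby


def binary_mask_to_rle(binary_mask):
    rle = {'counts': [], 'size': list(binary_mask.shape)}
    for i, (value, elements) in enumerate(groupby(binary_mask.ravel(order='F'))):
        if i == 0 and value == 1:
            rle['counts'].append(0)  # mask starts with 1, so the first run of 0s is empty
        rle['counts'].append(len(list(elements)))
    return rle


mask = np.array([[1, 0, 0],
                 [1, 1, 0]], dtype=np.uint8)

# Column-major order reads the mask as 1 1 | 0 1 | 0 0
print(binary_mask_to_rle(mask))  # {'counts': [0, 2, 1, 1, 2], 'size': [2, 3]}
```

Because the traversal is column by column, the same mask read row by row (1 0 0 1 1 0) would give different counts, so the decoder has to use the same order.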
rle to mask
Put the array stored under counts in the RLE segmentation into np.array and run the code below.
from PIL import Image
enco1_arr = np.array([773632, 50, 669, 52, 668, 52, 668, 52, 668, 52, 668, 52, 668, 52, 668, 52, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 667, 53, 666, 54, 666, 54, 666, 54, 666, 54, 666, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 55, 665, 54, 667, 53, 667, 52, 668, 52, 668, 52, 669, 51, 669, 51, 670, 50, 671, 48, 673, 47, 675, 45, 677, 43, 678, 41, 679, 40, 682, 36, 684, 36, 685, 37, 683, 38, 681, 39, 681, 5, 1, 33, 681, 5, 1, 33, 681, 5, 1, 33, 681, 4, 3, 32, 688, 32, 689, 31, 693, 23, 101844, 0])
print(enco1_arr.sum())
enco1_pos = enco1_arr[0::2]
enco1_len = enco1_arr[1::2]
print(enco1_pos)
print(enco1_len)
mask_1d = np.zeros(720 * 1280, dtype=np.uint8)
start = 0
end = 0
for po, le in zip(enco1_pos, enco1_len):
    start = end + po
    end = start + le
    mask_1d[start:end]=255
print(mask_1d)
mask2d = mask_1d.reshape(720, 1280, order='F')
print(mask2d)
im = Image.fromarray(mask2d)
im.save("your_file.jpeg")
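The same decoding logic can be wrapped in a reusable function that takes the whole RLE dictionary. This is a sketch under the assumption that the input has the {'counts': [...], 'size': [height, width]} layout produced by the encoder; the name `rle_to_binary_mask` is hypothetical. It also turns the sum check from the script into an explicit error.

```python
import numpy as np


def rle_to_binary_mask(rle):
    """Decode {'counts': [...], 'size': [height, width]} into a 0/1 mask."""
    height, width = rle['size']
    flat = np.zeros(height * width, dtype=np.uint8)
    pos = 0
    for i, count in enumerate(rle['counts']):
        if i % 2 == 1:  # odd entries are runs of 1s, even entries runs of 0s
            flat[pos:pos + count] = 1
        pos += count
    if pos != height * width:
        raise ValueError("counts must sum to height * width")
    # The counts were produced in column-major order, so reshape the same way
    return flat.reshape((height, width), order='F')


rle = {'counts': [0, 2, 1, 1, 2], 'size': [2, 3]}
print(rle_to_binary_mask(rle))
# [[1 0 0]
#  [1 1 0]]
```

Multiplying the result by 255 gives an image that can be saved with PIL in the same way as in the script.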
References
- [StackOverFlow] Encode numpy array using uncompressed RLE for COCO dataset: https://stackoverflow.com/questions/49494337/encode-numpy-array-using-uncompressed-rle-for-coco-dataset
- [Kaggle] Even Faster Run Length Encoder: https://www.kaggle.com/code/hackerpoet/even-faster-run-length-encoder/script
- [StackOverFlow] How to create mask images from COCO dataset?: https://stackoverflow.com/questions/50805634/how-to-create-mask-images-from-coco-dataset
- [Github] How to convert Polygon to binary mask dynamically.: https://github.com/cocodataset/cocoapi/issues/229#issuecomment-441405898
- [Blog] Image Segmentation(3) - Data compression algorithm: RLE encoding/decoding: https://velog.io/@hajieun02/Image-Segmentation3-%EB%8D%B0%EC%9D%B4%ED%84%B0-%EC%95%95%EC%B6%95-%EC%95%8C%EA%B3%A0%EB%A6%AC%EC%A6%98-RLE-%EC%9D%B8%EC%BD%94%EB%94%A9%EB%94%94%EC%BD%94%EB%94%A9