DrawingProcess

[Paper Review] 4D Gaussian Splatting: 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering (CVPR 2024)

2024. 11. 29. 22:00
๋ฐ˜์‘ํ˜•
💡 This post summarizes the paper '4D Gaussian Splatting: 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering (CVPR 2024)'.
The paper is about rendering scenes from video of moving content. The "4D" in the title means the usual 3D model plus one extra dimension, the time axis. A model built for static scenes produces blur as soon as the scene moves. A model built for dynamic scenes renders as a function of time, so a moving subject still renders naturally. This post introduces a work that models dynamic scenes with the Gaussian Splatting technique.
 - Project: https://guanjunwu.github.io/4dgs/
 - Paper: https://arxiv.org/abs/2303.09553
 - Github: https://github.com/hustvl/4DGaussians

Abstract

The main contributions of the paper can be summarized as follows.

  • Better dynamic scene rendering: each Gaussian is designed to move independently of any ray, which greatly improves both the quality and the rendering speed of dynamic scenes.
  • Efficient encoding and modeling: a Neural Voxel Encoding based on a multi-resolution HexPlane represents spatial and temporal features efficiently and models the change of each Gaussian (position, rotation, etc.) in detail.
  • Efficient training: a two-stage training scheme that learns the static and dynamic parts together improves training efficiency, and the loss design speeds up optimization.

Related Works

3D Gaussian Splatting (previous post link), which drew attention in the NeRF field as a fast rendering technique, appeared on arXiv in August, and in less than two months follow-up works on rendering dynamic scenes were published.

  • Dynamic 3DGS (link) fixes the number of 3D Gaussians and models a dynamic scene by tracking the position and variance of each 3D Gaussian over time. Its drawbacks are that it needs dense multi-view input images, that a poor result for an earlier frame degrades overall performance, and that memory usage grows with the number of time steps t because Gaussians are stored per time step.
  • Deformable 3DGS (link) models the motion of a dynamic scene with an MLP-based deformation network. The 4D Gaussian Splatting introduced here is similar, but is said to make training more efficient.

Methods

Overview

4D Gaussian์˜ ํ•ต์‹ฌ์€ Staticํ•œ 3D Gaussian์„ ๋งŒ๋“  ํ›„, ์‹œ๊ฐ„์— ๋”ฐ๋ฅธ ๊ฐ 3D Gaussian๋“ค์˜ Position, Rotation, Scaling๋ณ€ํ™”๋Ÿ‰์„ ๋ชจ๋ธ๋งํ•œ๋‹ค๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์ด ๋ณ€ํ™”๋Ÿ‰์„ Deformation Field๋กœ ํ‘œํ˜„ํ•˜๋ฉฐ, ์œ„ ๊ทธ๋ฆผ์—์„œ input๊ณผ output๋ฅผ ๋จผ์ € ๋ณด๋ฉด, 3D Gaussian์„ ์ž…๋ ฅ์œผ๋กœ ์–ผ๋งˆ๋‚˜ ๋ณ€ํ˜• ๋˜์—ˆ๋Š”์ง€์— ๋Œ€ํ•ด position(x',y',z'), Covariance(r',s')์œผ๋กœ ์ถœ๋ ฅํ•ฉ๋‹ˆ๋‹ค. ์ด์ œ ์•ˆ์ชฝ์„ ๋ณด๋ฉด, 1์ฐจ์ ์œผ๋กœ 6๊ฐ€์ง€์˜ matrix์œผ๋กœ ๋ณ€ํ˜•๋˜๊ณ , 2์ฐจ์ ์œผ๋กœ ํŒŒ๋ž€์ƒ‰์œผ๋กœ ํ‘œ์‹œ๋œ feature vector๋กœ ํ•ฉ์ณ์ง€๋ฉฐ, MLP๋ฅผ ํ†ต๊ณผํ•˜์—ฌ ์ตœ์ข… ๊ฒฐ๊ณผ ๊ฐ’์„ ํš๋“ํ•จ์„ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ด์ „ NeRF๊ธฐ๋ฐ˜์˜ Dynamic Model (์™ผ์ชฝ๊ทธ๋ฆผ) ์—์„œ๋Š” ray์œ„์— point๋“ค์„ deformationํ–ˆ๊ธฐ์—, ๊ฐ point์˜ ์„œ๋กœ ๋‹ค๋ฅธ ์†๋„๋ฅผ ์ž˜ ๋ชจ๋ธ๋งํ•˜์ง€ ๋ชปํ•˜์—ฌ ํ€„๋ฆฌํ‹ฐ ํ•˜๋ฝ์ด ์žˆ์—ˆ์ง€๋งŒ, 4D Gaussian Splatting์—์„œ๋Š” ๊ฐ Gaussian์ด ray์— ์˜์กดํ•˜์ง€ ์•Š๊ณ  ์„œ๋กœ ๋‹ค๋ฅธ ์†๋„๋กœ ์ด๋™์ด ๊ฐ€๋Šฅํ•˜๊ธฐ ๋•Œ๋ฌธ์—,  t์‹œ๊ฐ„์— ๋”ฐ๋ผ Gaussian์˜ ์œ„์น˜๊ฐ€ ์ด๋™ํ•˜์˜€์„์ง€๋ผ๋„, ๋‹ค๋ฅธ ray๋ฅผ ํ†ตํ•ด ์ด๋™๋œ Gaussian์„ renderingํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

3D Gaussian Neural Voxel Encoding

Overview์—์„œ 6์ข…๋ฅ˜์˜ matrix๋กœ ๊ตฌ์„ฑ๋œ ๋ถ€๋ถ„์ด๋ฉฐ, MLP์˜ ์ž…๋ ฅ ๊ฐ’์„ ์ƒ์„ฑํ•ด์ฃผ๊ธฐ ๋•Œ๋ฌธ์— encoder๋กœ ํ‘œํ˜„ํ•œ ๊ฒƒ์œผ๋กœ ๋ณด์ž…๋‹ˆ๋‹ค. ์ด ๋ถ€๋ถ„์€ ์‹ค์ œ์ ์œผ๋กœ TensoRF(์ด์ „ ํฌ์ŠคํŠธ link, ECCV2022) ๊ฐœ๋…์ด ๋“ค์–ด๊ฐ„ HexPlane(link, CVPR2023)์ด๋ผ๋Š” ์—ฐ๊ตฌ๊ฐ€ ์ฐธ์กฐ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. 

์œ„ ๊ทธ๋ฆผ์€ TensoRF์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๋„์‹ํ™”ํ•œ ๊ทธ๋ฆผ์ด๋ฉฐ, 3์ฐจ์›(3๊ฐœ์˜ ์„ )์„ 2์ฐจ์›(1๊ฐœ์˜ ํ‰๋ฉด๊ณผ 1๊ฐœ์˜ ์„ )์œผ๋กœ dimension reduction์ด ์ด๋ฃจ์–ด์ง„ ๊ฒƒ์„ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. 6๊ฐœ ๊ทธ๋ฆผ์ค‘ ์œ„์— ์ฒซ๋ฒˆ์งธ์ค„์€ Color, ๋‘๋ฒˆ์งธ ์ค„์€ density์— ๋Œ€ํ•œ ๋ชจ๋ธ๋ง์ด๋ฉฐ, ์‹ค์ œ์ ์œผ๋กœ XYํ‰๋ฉด-Z์„ , XZํ‰๋ฉด-Y์„ , ZYํ‰๋ฉด-X์„  ์ด 3๊ฐœ์˜ ํƒ€์ž…์˜ Rank๋กœ ๋ชจ๋ธ๋ง๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

์œ„ ๊ทธ๋ฆผ์€ HexPlane์—์„œ ๊ฐ€์ ธ์˜จ ๊ทธ๋ฆผ์ด๋ฉฐ, 3์ฐจ์›์— time t๊ฐ€ ์ถ”๊ฐ€๋˜์–ด 4๊ฐœ ์ฐจ์›์„ ๋ชจ๋ธ๋งํ•˜์˜€์œผ๋ฉฐ, 4์ฐจ์›(4๊ฐœ์˜ ์„ )์„ 2์ฐจ์›(2๊ฐœ์˜ ํ‰๋ฉด)์ธ XYํ‰๋ฉด-ZTํ‰๋ฉด, XZํ‰๋ฉด-YTํ‰๋ฉด, YZํ‰๋ฉด-XTํ‰๋ฉด์ด 3๊ฐœ ํƒ€์ž…์˜ Rank๋กœ ๋ชจ๋ธ๋ง๋ฉ๋‹ˆ๋‹ค. ์ด ๊ฐ’๋“ค์€ Outer Products์˜ Sum์ด ๋˜๊ณ  density๋Š” network์—†์ด ํ‘œํ˜„๋˜๊ณ , color๋Š” MLP๋ฅผ ํ†ต๊ณผํ•˜์—ฌ Color๋ฅผ ๊ณ„์‚ฐํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

๋‹ค์‹œ 4D Gaussian Splatting์„ ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ์ด์ œ ํฐ ์œค๊ณฝ์ด ๋ณด์ด์‹ค ๊ฒ๋‹ˆ๋‹ค.

๋…ผ๋ฌธ์ƒ์—์„œ๋Š” Encoding์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์„ค๋ช…์€ ์—†๊ณ  ๋‹จ์ˆœํžˆ Multi-resolution HexPlane์œผ๋กœ 3D Gaussian์˜ spatial, temporal ๊ฐ’์„ encodingํ•˜์˜€๋‹ค๊ณ  ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ Multi-resolution๋ผ๋Š” ํ‚ค์›Œ๋“œ์— ๋Œ€ํ•ด ์„ค๋ช…ํ•˜์ž๋ฉด, TensoRF์—์„œ Grid Resolution์ด๋ผ๋Š” ๊ฐœ๋…์ด ๋“ค์–ด๊ฐ€๋Š”๋ฐ, Grid Resolution์ด ์ปค์งˆ์ˆ˜๋ก, ์„  ๋˜๋Š” ๋ฉด์ด ๋” ์ด˜์ด˜ํ•ด์ง€๋ฉด์„œ 3D ์ขŒํ‘œ์— ๋Œ€ํ•œ high frequency feature๋ฅผ ๋” ์ž˜ ์žก์•„๋‚ด๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. ๋•Œ๋ฌธ์— TensoRF์—์„œ๋Š” iteration์ด ์ง„ํ–‰๋จ์— ๋”ฐ๋ผ, Grid Resolution์„ ์ ์  ์ฆ๊ฐ€์‹œํ‚ค๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. 4D Gaussian Splatting์—์„œ๋Š” Grid Resolution์„ ์ฆ๊ฐ€์‹œํ‚ค์ง€ ์•Š๊ณ , ์—ฌ๋Ÿฌ๊ฐœ(multi)์˜ Resolution์œผ๋กœ Rank๋ฅผ ๊ตฌ์„ฑํ•˜๊ณ  ์ด๋ฅผ MLP์˜ input์˜ feature๋กœ ์‚ฌ์šฉํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

์œ„๋Š” ๋…ผ๋ฌธ ์ˆ˜์‹์ด๋ฉฐ, i,j์— ๋Œ€ํ•œ ๊ฒƒ์€ ๊ฐ ํ‰๋ฉด์˜ ์ฐจ์›์„ ์˜๋ฏธํ•˜๊ณ , R์€ ๊ทธ ์ฐจ์›์œผ๋กœ ๊ตฌ์„ฑ๋œ Rank๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. interpolation์€ ํƒ€๊ฒŸ ์ขŒํ‘œ์˜ ์ฃผ๋ณ€์˜ Tensor ๊ฐ’๋“ค๋กœ ๋ณด๊ฐ„(interpolation)ํ•œ๋‹ค๋Š” ์˜๋ฏธ๋ฅผ ๊ฐ–๊ณ  ์žˆ์œผ๋ฉฐ, ๊ธฐ์ดˆ ๊ฐœ๋…์— ๋Œ€ํ•ด์„  TensoRF(์ด์ „ ํฌ์ŠคํŠธ link) ์˜ Tensor Decomposition์ด๋ž€? ํŒŒํŠธ๋ฅผ ์ฐธ๊ณ  ๋ฐ”๋ž๋‹ˆ๋‹ค. ๊ฐ ์ฐจ์›์— ๋Œ€ํ•ด interpolationํ•œ ๊ฐ’์„ concatํ•˜์—ฌ voxel์— ๋Œ€ํ•œ feature๊ฐ’์œผ๋กœ ๋งŒ๋“ค๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
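The lookup described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the planes hold random placeholder values, `FEATURE_DIM` and the small `RESOLUTIONS` are hypothetical stand-ins for the real settings, and the way planes are combined (element-wise product inside one resolution level, concatenation across levels) is my reading of a HexPlane-style encoder rather than something this post confirms.

```python
import random

# The six plane pairs over (x, y, z, t), as index pairs into a 4-tuple.
PLANE_AXES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]


def make_plane(res, dim, rng):
    """A res x res grid of dim-dimensional feature vectors (random placeholders)."""
    return [[[rng.uniform(0.5, 1.5) for _ in range(dim)] for _ in range(res)]
            for _ in range(res)]


def bilinear(plane, u, v):
    """Bilinearly interpolate a plane at normalized coordinates u, v in [0, 1]."""
    res = len(plane)
    fu, fv = u * (res - 1), v * (res - 1)
    i0, j0 = int(fu), int(fv)
    i1, j1 = min(i0 + 1, res - 1), min(j0 + 1, res - 1)
    a, b = fu - i0, fv - j0
    dim = len(plane[0][0])
    return [
        (1 - a) * (1 - b) * plane[i0][j0][k]
        + a * (1 - b) * plane[i1][j0][k]
        + (1 - a) * b * plane[i0][j1][k]
        + a * b * plane[i1][j1][k]
        for k in range(dim)
    ]


def hexplane_feature(levels, point):
    """Multiply the six plane samples within a level, concatenate across levels."""
    feature = []
    for planes in levels:
        level_feat = [1.0] * len(planes[0][0][0])
        for plane, (i, j) in zip(planes, PLANE_AXES):
            sample = bilinear(plane, point[i], point[j])
            level_feat = [f * s for f, s in zip(level_feat, sample)]
        feature.extend(level_feat)
    return feature


rng = random.Random(0)
FEATURE_DIM = 4            # hypothetical feature width per plane
RESOLUTIONS = [8, 16]      # small stand-ins for the multi-resolution levels
levels = [[make_plane(r, FEATURE_DIM, rng) for _ in PLANE_AXES] for r in RESOLUTIONS]
feat = hexplane_feature(levels, (0.2, 0.7, 0.5, 0.1))   # (x, y, z, t) in [0, 1]
print(len(feat))           # FEATURE_DIM values per resolution level
```

Because neighbouring query points share the same grid corners, nearby positions and nearby times get similar features, which is the smoothness property the post relies on.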

As mentioned, passing these features through the MLP outputs the change of each Gaussian. With this modeling, voxels that are adjacent in space (the xy plane) have similar features, and voxels that are adjacent in time (the xt plane) also have similar features. Because of this property, as optimization progresses the covariance of the Gaussians shrinks and the 3D Gaussians become denser.

Now let's look at the MLP part.

The overview figure shows position, rotation, and scale as the MLP outputs. In practice, adding the change of the covariance values (size and rotation) increases both training time and rendering time, so the authors say they omitted them and optimized position only. (In the earlier Dynamic 3DGS work, size, color, and opacity were fixed while position and rotation were updated.)

Gaussians Deformation Computation

Expressed as equations, the above looks as follows.

Feeding the concatenated feature f into the MLP g gives the change of the Gaussian as output,

and this change is simply added to the current value to represent the Gaussian at the target time. (The last line of page 4 of the paper says "we decide to omit the 3D Gaussian’s rotation and scaling", yet page 5 immediately mentions the rotation and scale parameters again, which is confusing.)
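A minimal sketch of this step, assuming a two-layer MLP with ReLU and random weights: the position offset is g(f), and the deformed position is the original position plus that offset. The layer sizes, the helper names, and the random feature are hypothetical and only serve the illustration.

```python
import random


def linear(x, weights, bias):
    """y = W x + b, with the weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng, scale=0.1):
    """Random weights and zero bias (placeholder for trained parameters)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def deform_position(position, feature, layers):
    """Return (X + dX, dX), where dX = g(f) is the MLP output."""
    h = feature
    for weights, bias in layers[:-1]:
        h = relu(linear(h, weights, bias))
    delta = linear(h, *layers[-1])        # final layer has 3 outputs: dx, dy, dz
    return [p + d for p, d in zip(position, delta)], delta


rng = random.Random(0)
FEATURE_DIM, HIDDEN = 8, 64               # hypothetical sizes for this sketch
layers = [init_layer(FEATURE_DIM, HIDDEN, rng), init_layer(HIDDEN, 3, rng)]
feature = [rng.uniform(0.0, 1.0) for _ in range(FEATURE_DIM)]   # stands in for f
new_pos, delta = deform_position([0.0, 1.0, 2.0], feature, layers)
print(new_pos, delta)
```

Rotation and scale offsets, if used, would follow the same pattern with their own output heads.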

Optimization

Several earlier papers are said to train the static part and the dynamic part independently. Here, the static scene is first trained in the same way as proposed in 3D Gaussian Splatting to raise quality, and the dynamic scene is then trained as a fine-tuning step.
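The two-stage schedule can be pictured with a small sketch. The warm-up length `STATIC_ITERS` is a hypothetical value, and the offsets here are hard-coded where a real run would take them from the deformation network.

```python
def gaussian_positions(canonical, deltas, iteration, static_iters):
    """Positions used for rendering at a given training iteration."""
    if iteration < static_iters:
        # Stage 1: plain static 3D Gaussians, no deformation applied.
        return [list(p) for p in canonical]
    # Stage 2: fine-tuning, with the deformation offsets added.
    return [[c + d for c, d in zip(p, dp)] for p, dp in zip(canonical, deltas)]


STATIC_ITERS = 3000                              # hypothetical warm-up length
canonical = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
deltas = [[0.1, 0.0, 0.0], [0.0, -0.2, 0.0]]     # placeholder deformation offsets
print(gaussian_positions(canonical, deltas, 100, STATIC_ITERS))
print(gaussian_positions(canonical, deltas, 5000, STATIC_ITERS))
```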

 

For the loss, the authors are said to use an L2 loss on color together with a TV (total variation) loss.
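As a rough illustration of these two terms, the sketch below computes a mean squared color error and a total variation penalty on a 2D grid of scalars. The weight `tv_weight` and the exact form of the TV term (mean squared difference between neighbouring cells) are assumptions; the post does not give either.

```python
def l2_color_loss(rendered, target):
    """Mean squared error over flat lists of pixel values."""
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)


def tv_loss(plane):
    """Mean squared difference between vertically and horizontally adjacent cells."""
    rows, cols = len(plane), len(plane[0])
    total, count = 0.0, 0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                total += (plane[i + 1][j] - plane[i][j]) ** 2
                count += 1
            if j + 1 < cols:
                total += (plane[i][j + 1] - plane[i][j]) ** 2
                count += 1
    return total / count


def total_loss(rendered, target, planes, tv_weight=0.01):
    # tv_weight is a hypothetical value chosen for this sketch.
    return l2_color_loss(rendered, target) + tv_weight * sum(tv_loss(p) for p in planes)


flat_plane = [[0.0, 0.0], [0.0, 0.0]]     # perfectly smooth grid
edge_plane = [[0.0, 1.0], [0.0, 1.0]]     # grid with a sharp vertical edge
print(tv_loss(flat_plane), tv_loss(edge_plane))
print(total_loss([0.5, 0.5], [0.5, 1.0], [edge_plane]))
```

The TV term penalizes abrupt changes between neighbouring grid cells, which pushes the feature planes toward smooth variation.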

Experimental Setting

At initialization the point cloud is limited to at most 8,000 points, and during training it is allowed to grow to at most 20,000. The multi-resolution planes use four levels: 64x64, 128x128, 256x256, and 512x512. The MLP uses a hidden size of 64.

Evaluation

For an explanation of PSNR, see: [Evaluation Metrics] PSNR / SSIM / LPIPS
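For quick reference, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below assumes pixel values in [0, 1] given as flat lists. It is not the evaluation code used in the paper.

```python
import math


def psnr(rendered, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flat lists of pixel values."""
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


print(psnr([0.0, 0.0], [0.1, 0.1]))   # roughly 20 dB for a uniform error of 0.1
```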

Measurements were taken on an RTX 3090 GPU, using the D-NeRF synthetic dataset at 800x800 resolution.

Because this is a dynamic scene dataset, the PSNR of 3DGS (3D Gaussian Splatting) is low. Compared with earlier dynamic scene methods, quality is higher and rendering is faster.

Below are the results evaluated on the Nerfies real-world dataset.

Below are the results evaluated on the DyNeRF real-world dataset.

Below are rendering results on the HyperNeRF dataset.

Below are rendering results on the DyNeRF dataset.

 

Ablation Study

Neural Voxel Encoder (= Ours w/o Voxel Grid)

When the scene is modeled explicitly in the manner of Dynamic 3DGS, instead of implicitly with an MLP, rendering quality drops while rendering speed improves and memory usage decreases.

Model Capacity (= Ours w/ Larger MLP)

Increasing the voxel plane resolution, or enlarging the MLP from a hidden size of 64 to 256, gave a considerable quality improvement, but rendering speed decreased.

Two Stage Training (= Ours w/o static Stage)

D-NeRF and DyNeRF provide no point cloud information, which is said to make the task harder. Splitting training into two stages, a static stage and a fine-tuning stage, improved performance.

Fast Training (= Ours-7k)

Training for only 7k iterations still gave a decent PSNR, and training time dropped to 7 minutes.

Image-based Loss(=Ours w/ Image Loss)

Applying image-based losses such as LPIPS loss and SSIM loss made training take twice as long. Rendering also became slower and quality dropped. The stated reason is that optimizing the motion part (the fine-tuning stage) with these losses is difficult and complex.

Closing..

After NeRF first came out, follow-up papers tackling other problems poured in, and 3D Gaussian Splatting seems to be following the same path. Dynamic scene rendering and text-to-3D appear to be the first areas to arrive, and I look forward to other work as well. As training speed for dynamic scenes keeps improving, I expect services that use NeRF in place of photos to appear soon on smartphones and social media.

๋ฐ˜์‘ํ˜•
์ €์ž‘์žํ‘œ์‹œ ๋น„์˜๋ฆฌ ๋ณ€๊ฒฝ๊ธˆ์ง€ (์ƒˆ์ฐฝ์—ด๋ฆผ)

'Study: Artificial Intelligence(AI) > AI: 3D Vision' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

[๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] Ha-NeRF: NeRFwithRealWorld + CNN Appearance Embedding - Hallucinated Neural Radiance Fields in the Wild  (0) 2024.12.01
[Survey] Urban Driving Scene Reconstruction ๊ด€๋ จ ๋‚ด์šฉ ์ •๋ฆฌ  (0) 2024.12.01
[๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] HyperNeRF : A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields (ACM TG 2021)  (0) 2024.11.29
[๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] Nerfies: Deformable Neural Radiance Fields (ICCV 2021)  (0) 2024.11.28
[๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] D-NeRF: Neural Radiance Fields for Dynamic Scenes (CVPR 2021)  (0) 2024.11.26
    'Study: Artificial Intelligence(AI)/AI: 3D Vision' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€
    • [๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] Ha-NeRF: NeRFwithRealWorld + CNN Appearance Embedding - Hallucinated Neural Radiance Fields in the Wild
    • [Survey] Urban Driving Scene Reconstruction ๊ด€๋ จ ๋‚ด์šฉ ์ •๋ฆฌ
    • [๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] HyperNeRF : A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields (ACM TG 2021)
    • [๋…ผ๋ฌธ ๋ฆฌ๋ทฐ] Nerfies: Deformable Neural Radiance Fields (ICCV 2021)
    DrawingProcess
    DrawingProcess
    ๊ณผ์ •์„ ๊ทธ๋ฆฌ์ž!

    ํ‹ฐ์Šคํ† ๋ฆฌํˆด๋ฐ”