Deep Learning Inference Optimization Engineer


About the team you'll join

Vueron's DL team explores a wide range of problems in the world of Lidar and researches Deep Learning Systems in order to chart a direction for explainable technology.

For the DL team, deep learning is not about proposing networks by simply combining Operations; the system itself is the object of research and the subject of exploration. On that basis, so as not to compromise between performance and real-time operation, we observe networks and Operations from many angles and build our research on what we observe.

For us, deep learning is both a problem in its own right and a tool that helps us solve problems. Our Deep Learning is empirical, practical technology that is deployed directly on the Edge. Our team members are fiercer than anyone in the world of kernels and compromise on nothing in pursuit of research results. Outside the kernel, we are looking for colleagues who can also help build a world where understanding and compromise coexist for the sake of the team.

What you'll work on
  • Developing inference optimization techniques for deep learning models (quantization, pruning, knowledge distillation, ...)
  • Developing Custom Operators (Plugins)
  • Developing inference-optimized models tailored to the target platform
Who we're looking for
  • A deep understanding of C++ and Python and strong development skills in both
  • Experience developing Linux-based application or system software
  • Development skills with deep learning frameworks such as PyTorch
Nice to have
  • Project experience in Lidar or vision-sensor perception (classification, etc.) using deep learning or machine learning
  • OpenCL or CUDA development experience and knowledge of GPU architecture
  • Research or development experience in quantization, compression, or kernel optimization
  • Optimization development experience with frameworks such as TensorFlow Lite, TensorRT, or OpenVINO
  • Experience with On-Device AI optimization work
  • Experience porting low-precision deep learning models to embedded devices
Working conditions
  • Employment type: Full-time (3-month probation period applies)
  • Working days: 5 days a week (Monday - Friday)
  • Location: 19F, 311 Gangnam-daero, Seocho-gu, Seoul
  • Salary: Annual salary (determined after negotiation)
Documents to submit
  • Résumé or career description (free format): required
  • Portfolio: optional
Vueron hiring process
Team: DL team
Experience: Any level
Employment type: Full-time (3-month probation)
Alternative military service: Technical Research Personnel (transfers accepted)
Location: Gangnam-daero 311-gil, Seocho-gu, Seoul, Republic of Korea