KPConv: Flexible and Deformable Convolution for Point Clouds (Paper Review)

Anssony 2024. 1. 11. 13:00

Paper title: KPConv: Flexible and Deformable Convolution for Point Clouds

Introduction

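The paper proposes Kernel Point Convolution (KPConv), a convolution operator defined directly on points.

KPConv consists of local 3D filters and overcomes limitations of earlier point convolutions, such as difficult network convergence and complexity.

Its weights are carried by kernel points in Euclidean space and are applied to the input points close to each kernel point, which makes it more flexible than a grid convolution.

KPConv is inspired by image-based convolution.

Two versions are proposed: rigid KPConv, and deformable KPConv, which adds deformation of the kernel points to the rigid version.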

rigid KPConv
- Performs well on simpler tasks (object classification or small segmentation datasets)
Deformable KPConv
- Performs well on harder tasks (segmentation datasets containing a wide variety of objects)
- More robust when only a few kernel points are used

Kernel Point Convolution

Kernel Function Defined by Points

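The kernel function is defined as follows.

$ (F \ast g)(x) = \sum_{x_{i} \in N_{x}} g(x_{i} - x) f_{i} $

$ g(y_{i}) = \sum_{k < K} h(y_{i}, \tilde{x}_{k}) W_{k} $

$ h(y_{i}, \tilde{x}_{k}) = \max \left( 0, 1 - \frac{\|y_{i} - \tilde{x}_{k}\|}{\sigma} \right) $

KPConv uses a spherical kernel with a fixed radius: the neighborhood $ N_{x} $ contains the input points within that radius of $ x $.

$ y_{i} = x_{i} - x $ is the position of the neighbor $ x_{i} $ relative to the center $ x $. $ h $ is the correlation between $ y_{i} $ and the kernel point $ \tilde{x}_{k} $; it decreases linearly with their distance and reaches zero at distance $ \sigma $. The kernel function $ g $ is then the sum of the weight matrices $ W_{k} $, each multiplied by its $ h $.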
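
As a rough illustration of the three formulas above, the following plain-Python sketch evaluates the convolution at a single center point. The function names (`influence`, `kpconv_at_point`), the argument layout, and the toy values are assumptions made for this example and are not taken from the paper's code. Explicit loops are used so that each loop corresponds to one summation.

```python
import math

def influence(y, kernel_point, sigma):
    """Linear correlation h = max(0, 1 - ||y - x_k|| / sigma)."""
    return max(0.0, 1.0 - math.dist(y, kernel_point) / sigma)

def kpconv_at_point(x, points, features, kernel_points, weights, radius, sigma):
    """Evaluate (F * g)(x) for one center x.

    points:        input points x_i (3D coordinates)
    features:      input feature vectors f_i, each of length D_in
    kernel_points: K kernel point positions
    weights:       K matrices W_k, each with D_in rows and D_out columns
    """
    d_out = len(weights[0][0])
    out = [0.0] * d_out
    for x_i, f_i in zip(points, features):
        if math.dist(x_i, x) > radius:           # x_i is not in the neighborhood N_x
            continue
        y_i = [a - b for a, b in zip(x_i, x)]    # relative position y_i = x_i - x
        for x_k, w_k in zip(kernel_points, weights):
            h = influence(y_i, x_k, sigma)
            if h == 0.0:                         # this kernel point is too far from y_i
                continue
            for c_in, f in enumerate(f_i):       # accumulate h * (f_i W_k)
                for c_out in range(d_out):
                    out[c_out] += h * f * w_k[c_in][c_out]
    return out

# Toy input: two kernel points, 1-dimensional features, 1x1 weight matrices.
kernel_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
weights = [[[1.0]], [[2.0]]]
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
features = [[1.0], [1.0], [1.0]]
out = kpconv_at_point((0.0, 0.0, 0.0), points, features,
                      kernel_points, weights, radius=1.0, sigma=0.5)
print(out)
```

In this toy input each of the two in-range neighbors coincides with exactly one kernel point, so the result is simply the sum of the two weights; the third point lies outside the radius and is ignored.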

Rigid or Deformable Kernel

Rigid Kernel
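- The kernel points are placed in the arrangement that minimizes the sum of a repulsive potential and an attractive potential.
  - Repulsive potential: becomes smaller as the kernel points move farther apart
  - Attractive potential: becomes smaller as a kernel point gets closer to the center of the sphere
  - In short, the optimal arrangement keeps the kernel points far from each other while keeping them close to the center of the sphere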
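
The rigid arrangement described above can be sketched as a small energy minimization. The notes here only describe the two potentials qualitatively, so the concrete forms below are assumptions chosen for illustration: the attractive term is $ \|\tilde{x}_{k}\|^{2} $ and the repulsive term is $ 1 / \|\tilde{x}_{k} - \tilde{x}_{l}\| $, counted once per pair. The function name `optimize_kernel_points`, the step size, and the step clipping are likewise hypothetical, and this is not claimed to be the authors' exact procedure.

```python
import math
import random

def optimize_kernel_points(num_points, steps=2000, lr=0.01, max_step=0.1, seed=0):
    """Gradient descent on sum_k ||x_k||^2 + sum_{k<l} 1 / ||x_k - x_l||."""
    rng = random.Random(seed)
    pts = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(num_points)]
    for _ in range(steps):
        grads = []
        for k, p in enumerate(pts):
            grad = [2.0 * c for c in p]              # attractive term pulls p toward the center
            for l, q in enumerate(pts):
                if l == k:
                    continue
                diff = [a - b for a, b in zip(p, q)]
                dist = math.sqrt(sum(c * c for c in diff)) + 1e-9
                for axis in range(3):                # repulsive term pushes p away from q
                    grad[axis] -= diff[axis] / dist ** 3
            grads.append(grad)
        for p, grad in zip(pts, grads):
            norm = math.sqrt(sum(c * c for c in grad))
            # Clip the update so two points that start very close do not fly apart.
            scale = lr if lr * norm <= max_step else max_step / norm
            for axis in range(3):
                p[axis] -= scale * grad[axis]
    return pts

two_points = optimize_kernel_points(2)
separation = math.dist(two_points[0], two_points[1])
```

With only two points, the minimum of this particular energy can be checked by hand: the points settle on opposite sides of the center at distance 0.5 from it, which makes a convenient sanity check before trying a larger number of kernel points.
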
Deformable Kernel
- Applies the deformable convolution concept to point clouds: each kernel point $ \tilde{x}_{k} $ is shifted by an offset $ \Delta_{k}(x) $ that depends on the location $ x $
- $ (F \ast g)(x) = \sum_{x_{i} \in N_{x}} g_{deform}(x_{i} - x, \Delta(x)) f_{i} $
- $ g_{deform}(y_{i}, \Delta(x)) = \sum_{k < K} h(y_{i}, \tilde{x}_{k} + \Delta_{k}(x)) W_{k} $
- However, applying this directly is difficult because point clouds are sampled non-uniformly
  - To handle this problem, a fitting regularization and a repulsive regularization are applied, and training minimizes their sum
  - $ L_{reg} = \sum_{x} \left( L_{fit}(x) + L_{rep}(x) \right) $
  - fitting regularization:
    - $ L_{fit}(x) = \sum_{k < K} \min_{y_{i}} \left( \frac{\|y_{i} - (\tilde{x}_{k} + \Delta_{k}(x))\|}{\sigma} \right)^{2} $
    - Smaller when each kernel point is close to its nearest neighbor
  - repulsive regularization:
    - $ L_{rep}(x) = \sum_{k < K} \sum_{l \neq k} h(\tilde{x}_{k} + \Delta_{k}(x), \tilde{x}_{l} + \Delta_{l}(x))^{2} $
    - Smaller when the kernel points are farther apart
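
To make the two regularization terms concrete, here is a minimal sketch that evaluates them for one center point. The function name `regularization_terms`, the argument names, and the toy values are assumptions for this example; in a real network the offsets would be predicted by the model rather than passed in by hand.

```python
import math

def influence(a, b, sigma):
    """Linear correlation h = max(0, 1 - ||a - b|| / sigma)."""
    return max(0.0, 1.0 - math.dist(a, b) / sigma)

def regularization_terms(neighbors_rel, kernel_points, offsets, sigma):
    """Return (L_fit(x), L_rep(x)) for one center x.

    neighbors_rel: relative neighbor positions y_i
    kernel_points: rest positions of the K kernel points
    offsets:       one 3D shift per kernel point
    """
    # Deformed kernel points: rest position plus shift.
    deformed = [[p + d for p, d in zip(x_k, d_k)]
                for x_k, d_k in zip(kernel_points, offsets)]
    # Fitting term: squared normalized distance from each kernel point to its closest neighbor.
    fit = sum(min((math.dist(y, q) / sigma) ** 2 for y in neighbors_rel)
              for q in deformed)
    # Repulsive term: squared influence between every ordered pair of distinct kernel points.
    rep = sum(influence(q_k, q_l, sigma) ** 2
              for k, q_k in enumerate(deformed)
              for l, q_l in enumerate(deformed) if l != k)
    return fit, rep

neighbors_rel = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
kernel_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

no_shift = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
collapsing = [(0.0, 0.0, 0.0), (-0.5, 0.0, 0.0)]   # second kernel point moves toward the first

fit_a, rep_a = regularization_terms(neighbors_rel, kernel_points, no_shift, sigma=1.0)
fit_b, rep_b = regularization_terms(neighbors_rel, kernel_points, collapsing, sigma=1.0)
```

In the unshifted case both terms are zero, because every kernel point sits on a neighbor and the kernel points are a full $ \sigma $ apart. The shifted case is penalized by both terms: the moved kernel point no longer sits on a neighbor, and its influence region now overlaps with the other kernel point.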

Figure: Model architectures using KPConv for segmentation (KP-FCNN) and classification (KP-CNN)

Figure: Internal structure of the convolutional blocks, rigid KPConv (top) and deformable KPConv (bottom)

Kernel Point Network Architectures

KP-CNN
- A 5-layer classification convolutional network, where each layer contains two convolutional blocks
- After the last layer, the features are aggregated by global average pooling and then passed through fully connected (FC) and softmax layers for classification
KP-FCNN
- A fully convolutional network for segmentation
- The encoder is the same as in KP-CNN, and the decoder uses nearest upsampling to obtain the final point-wise features
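
The two aggregation steps mentioned above, global average pooling in KP-CNN and nearest upsampling in the KP-FCNN decoder, can be sketched in a few lines. The function names and the brute-force nearest-neighbor search are assumptions made for readability; a practical implementation would more likely use a spatial index or precomputed neighbor indices.

```python
import math

def global_average_pool(features):
    """Average the per-point feature vectors into one global feature vector."""
    n = len(features)
    return [sum(channel) / n for channel in zip(*features)]

def nearest_upsample(coarse_points, coarse_features, fine_points):
    """Give each fine point a copy of the feature of its closest coarse point."""
    upsampled = []
    for p in fine_points:
        nearest = min(range(len(coarse_points)),
                      key=lambda j: math.dist(p, coarse_points[j]))
        upsampled.append(list(coarse_features[nearest]))
    return upsampled

# Classification head input: one vector summarizing the whole point cloud.
pooled = global_average_pool([[1.0, 3.0], [3.0, 5.0]])

# Decoder step: carry coarse-layer features back to a denser set of points.
coarse_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
coarse_features = [[1.0, 0.0], [0.0, 1.0]]
fine_points = [(0.1, 0.0, 0.0), (0.9, 0.0, 0.0), (0.4, 0.0, 0.0)]
dense = nearest_upsample(coarse_points, coarse_features, fine_points)
```

Here `pooled` would be fed to the FC and softmax layers for classification, while `dense` holds one feature vector per fine point, which is the shape a segmentation head needs.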