Plane Fitting using PCA

Published: 18 Oct 2015 | Updated: 25 Mar 2016 | Tags: math note

Given a set of 3D points $\{ p_i \}$, we want to find out a plane (i.e., unit normal $n$ and center $q$) that describes this set of points as

\begin{equation} \label{eq_plane} n^\intercal (p_i-q)=0, \forall i \end{equation}

Of course real world data points usually won’t strictly satisfy equation $\eqref{eq_plane}$. Thus we define function

\begin{equation} \label{eq_ppdist} \text{dist}(p_i;n,q) \triangleq n^\intercal (p_i-q) \end{equation}

which represents the signed point-plane distance between the plane $(n,q)$ and data point $p_i$ (recall that $n$ is a unit normal vector).

Now the problem of plane fitting can be formulated as a least square problem with the following cost function

where $A(q) \triangleq [\cdots, p_i-q, \cdots]$.

If $n$ is fixed to some value, to minimize the cost function $\eqref{eq_cost}$, we need to zero the following partial derivative (check matrix calculus)

\begin{equation} \label{eq_cost_dq} \mathbf{0} = \frac{\partial \text{cost}(n,q)}{\partial q} \equiv \sum_i (2nn^\intercal q - 2nn^\intercal p_i). \end{equation}

Solving equation $\eqref{eq_cost_dq}$ leads to the optimal plane center $ q^* $ as (no matter what $n$ is)

\begin{equation} \label{eq_bestq} q^* = \frac{1}{|\{p_i\}|} \sum_i p_i. \end{equation}

Now to solve the optimal plane normal $n$, we can plug equation $\eqref{eq_bestq}$ back to the cost function $\eqref{eq_cost}$ as

\begin{equation} \label{eq_costn} \text{cost}(n; q^* ) \triangleq n^\intercal A( q^* ) A( q^* )^\intercal n = n^\intercal B( q^* ) n \end{equation}

where $ q^* $ becomes a fixed parameter of this new cost function, and . Recall that $A(q)$ is a $ 3 \times |\{p_i\}| $ matrix, thus $B(q)$ should be a $ 3 \times 3 $ positive-definite matrix. Now the optimal plane normal $ n^* $ needs to be solved by

which reminds us a classic problem that can be solved by the singular value decomposition $A(q) \equiv U(q)S(q)V(q)^\intercal$, leading to (omitting $(q)$ in $B$ and other relevant matrices for clarity)

Since $S=[\text{diag}(\sigma_1,\sigma_2,\sigma_3), \mathbf{0},\cdots,\mathbf{0}]$.

If we give each data point a corresponding weight $w_i$, then the distance equation $\eqref{eq_ppdist}$ can be changed to a weighted version as

\begin{equation} \label{eq_wdist} \text{wdist}(p_i,w_i;n,q) \triangleq w_i n^\intercal (p_i-q) \end{equation}