Given a set of 3D points $\{ p_i \}$, we want to find out a plane (i.e., unit normal $n$ and center $q$) that describes this set of points as
\begin{equation} \label{eq_plane} n^\intercal (p_i-q)=0, \forall i \end{equation}
Of course real world data points usually won’t strictly satisfy equation $\eqref{eq_plane}$. Thus we define function
\begin{equation} \label{eq_ppdist} \text{dist}(p_i;n,q) \triangleq n^\intercal (p_i-q) \end{equation}
which represents the signed point-plane distance between the plane $(n,q)$ and data point $p_i$ (recall that $n$ is a unit normal vector).
Now the problem of plane fitting can be formulated as a least square problem with the following cost function
where $A(q) \triangleq [\cdots, p_i-q, \cdots]$.
If $n$ is fixed to some value, to minimize the cost function $\eqref{eq_cost}$, we need to zero the following partial derivative (check matrix calculus)
\begin{equation} \label{eq_cost_dq} \mathbf{0} = \frac{\partial \text{cost}(n,q)}{\partial q} \equiv \sum_i (2nn^\intercal q - 2nn^\intercal p_i). \end{equation}
Solving equation $\eqref{eq_cost_dq}$ leads to the optimal plane center $ q^* $ as (no matter what $n$ is)
\begin{equation} \label{eq_bestq} q^* = \frac{1}{|\{p_i\}|} \sum_i p_i. \end{equation}
Now to solve the optimal plane normal $n$, we can plug equation $\eqref{eq_bestq}$ back to the cost function $\eqref{eq_cost}$ as
\begin{equation} \label{eq_costn} \text{cost}(n; q^* ) \triangleq n^\intercal A( q^* ) A( q^* )^\intercal n = n^\intercal B( q^* ) n \end{equation}
where $ q^* $ becomes a fixed parameter of this new cost function, and . Recall that $A(q)$ is a $ 3 \times |\{p_i\}| $ matrix, thus $B(q)$ should be a $ 3 \times 3 $ positive-definite matrix. Now the optimal plane normal $ n^* $ needs to be solved by
which reminds us a classic problem that can be solved by the singular value decomposition $A(q) \equiv U(q)S(q)V(q)^\intercal$, leading to (omitting $(q)$ in $B$ and other relevant matrices for clarity)
Since $S=[\text{diag}(\sigma_1,\sigma_2,\sigma_3), \mathbf{0},\cdots,\mathbf{0}]$.
If we give each data point a corresponding weight $w_i$, then the distance equation $\eqref{eq_ppdist}$ can be changed to a weighted version as
\begin{equation} \label{eq_wdist} \text{wdist}(p_i,w_i;n,q) \triangleq w_i n^\intercal (p_i-q) \end{equation}
TO BE CONTINUED…