
Saturday, 14 August 2021

The Group Inverse

In this post, I will discuss another type of generalized inverse, called the group inverse. Unlike the Moore-Penrose inverse, this pseudo-inverse is defined only for square matrices (the interesting case being singular ones) and not for rectangular matrices. Hence we restrict ourselves to square matrices.

Consider the matrix $A=\begin{pmatrix}1 &0 &1\\ 0 &1 &1\\ 1 &0 &1\end{pmatrix}$. Then the spectrum of $A$ is $\sigma(A)=\{0,1,2\}$. Further, $A^{\dagger}=\begin{pmatrix}1/3 &-1/3 &1/3 \\ -1/6 &2/3 &-1/6\\ 1/6 &1/3  &1/6  \end{pmatrix}$ and $\sigma(A^{\dagger})=\{0,2/3,1/2\}.$ So, $1\in  \sigma(A)$, but $1^{-1}=1\notin  \sigma(A^{\dagger}).$
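A quick MATLAB check of this example (pinv and eig are built-ins; the order in which eig lists the eigenvalues may vary):

    % Sanity check of the spectra of A and its Moore-Penrose inverse
    A  = [1 0 1; 0 1 1; 1 0 1];
    Ad = pinv(A);            % the Moore-Penrose inverse displayed above
    disp(eig(A)')            % 0, 1, 2
    disp(eig(Ad)')           % 0, 2/3, 1/2 -- the reciprocal of 1 is missing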

In our previous discussion of the Moore-Penrose inverse, we remarked that the Moore-Penrose inverse of a matrix does not retain many properties that the inverse of a nonsingular matrix has. For example, for a singular $A\in \mathbb{R}^{n\times n}$ with $\lambda \in \sigma(A)$, it may happen that $\lambda^{\dagger} \notin \sigma(A^{\dagger})$. However, for a nonsingular matrix $A$, if $\lambda \in \sigma(A)$, then it always holds that $\lambda^{-1} \in \sigma(A^{-1})$. The same is true for eigenvectors: every eigenvector $x$ of a nonsingular $A$ is also an eigenvector of $A^{-1}$, whereas an eigenvector of a singular $A$ need not be an eigenvector of $A^{\dagger}.$

Similarly, $B=X^{-1}AX$ does not imply that $B^{\dagger}=X^{-1}A^{\dagger}X$, although this property does hold when $A$ is nonsingular. So the eigenvalue properties are not preserved when we pass to $A^{\dagger}$. This problem is resolved by introducing another interesting generalized inverse, called the group inverse.

Definition: Let $A\in \mathbb{R}^{n\times n}$, and suppose there is a matrix $X$ satisfying the following three matrix equations:

(i) $AXA=A$

(ii) $XAX=X$

(iii) $AX=XA.$

This $X$ is called the group inverse of the matrix $A.$ It exists precisely for those matrices $A$ whose index is 1, i.e., $rank(A)=rank(A^2)$ or, equivalently, $R(A)\cap N(A)=\{0\}$. The group inverse of $A$ is denoted by $A^{\#},$ and it is unique whenever it exists.
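The index condition is easy to test numerically; a minimal MATLAB sketch for the matrix from the opening example:

    % Index-1 test: the group inverse exists iff rank(A) == rank(A^2)
    A = [1 0 1; 0 1 1; 1 0 1];
    rank(A) == rank(A^2)     % logical 1 (true): A has index 1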

Proof of uniqueness: Let $X$ and $Y$ both satisfy the above three equations. First note that $AY=(AXA)Y=(AX)(AY)$, and therefore

$AX=XA=X(AYA)=(XA)(YA)=(AX)(AY)=AY.$

Using $AX=AY$ together with the commutation relations,

$X=XAX=X(AX)=X(AY)=(XA)Y=(AY)Y=(YA)Y=YAY=Y.$ Hence $X=Y.$

The name `group' is used because the set $\{\cdots,(A^{\#})^{2},A^{\#},AA^{\#},A,A^{2},\cdots\}$, together with the usual matrix multiplication, forms an abelian group. It is not difficult to notice that the identity element of this group is $AA^{\#}.$

Example: Consider the nilpotent matrix $A=\begin{pmatrix}0 &1\\0 &0 \end{pmatrix}$, then $A^{2}=\begin{pmatrix}0 &0\\0 &0 \end{pmatrix}$. Clearly, $1=rank(A)\neq rank(A^2)=0$. Alternatively, we can see that $(1,0)^T\in R(A)\cap N(A)$.

From the above example, it is easy to observe that the group inverse of a nonzero nilpotent matrix cannot exist, because the range and null space of a nonzero nilpotent matrix are never complementary. From elementary linear algebra, we also know that a nonzero nilpotent matrix is not diagonalizable. Below we first investigate the group inverse of those matrices which are diagonalizable.

Suppose $A$ is a diagonalizable matrix of index 1 (in fact, every diagonalizable matrix has index at most 1). Then there exist a nonsingular matrix $P$ and a diagonal matrix $D=diag(\lambda_1,\lambda_2,\cdots,\lambda_n)$ ($n$ is the order of $A$) such that $A=PDP^{-1}$. Then we can check that $A^{\#}=PD^{\dagger}P^{-1}$.

Recall that $D^{\dagger}=diag(\lambda_1^{\dagger},\lambda_2^{\dagger},\cdots,\lambda_n^{\dagger}).$
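A minimal MATLAB sketch of this construction for the matrix from the opening example (eig and pinv are built-ins; numerically, pinv's default tolerance treats the tiny computed zero eigenvalue as exactly zero):

    % Group inverse of a diagonalizable index-1 matrix via A = P*D*inv(P)
    A = [1 0 1; 0 1 1; 1 0 1];
    [P, D] = eig(A);             % columns of P are eigenvectors of A
    Ag = P * pinv(D) / P;        % A# = P * D^dagger * P^{-1}
    % Sanity checks: the three defining equations (all norms ~0)
    norm(A*Ag*A - A), norm(Ag*A*Ag - Ag), norm(A*Ag - Ag*A)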

An explicit formula for calculating the group inverse of a matrix $A$ is obtained in three steps:

step 1: Find the full rank decomposition of $A$ as $A=FG$.

step 2: Check whether $(GF)^{-1}$ exists. If $GF$ is singular, then the group inverse of $A$ does not exist.

step 3: Finally, $A^{\#}=F(GF)^{-2}G.$
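A minimal MATLAB sketch of these three steps; here, for illustration, the full rank decomposition is built from the SVD, but any full rank decomposition of $A$ works:

    % Group inverse via a full rank decomposition A = F*G
    A = [1 0 1; 0 1 1; 1 0 1];
    r = rank(A);
    [U, S, V] = svd(A);
    F = U(:, 1:r) * S(1:r, 1:r);   % n-by-r, full column rank
    G = V(:, 1:r)';                % r-by-n, full row rank; A = F*G
    if rank(G*F) < r
        disp('GF is singular: the group inverse does not exist.')
    else
        Ag = F * ((G*F)^2 \ G)     % A# = F*(GF)^{-2}*G
    end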

Properties of the group inverse:

(i) Whenever $A$ is nonsingular, $A^{\#}=A^{-1}$

(ii) $(A^{\#})^{\#}=A$

(iii) $(A^T)^{\#}=(A^{\#})^{T}$, where $T$ stands for the transpose

(iv) If $\lambda(\neq 0)\in \sigma(A)$, then $\lambda^{-1}\in \sigma(A^{\#})$

(v) If $x$ is an eigenvector corresponding to the eigenvalue $\lambda (\neq 0)$ of $A$, then $x$ is also an eigenvector of $A^{\#}$ corresponding to the eigenvalue $\lambda^{-1}.$
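Properties (iv) and (v) can be checked on the running example; a short sketch reusing the diagonalization computation above:

    % Properties (iv)-(v) on the running example
    A = [1 0 1; 0 1 1; 1 0 1];
    [P, D] = eig(A);
    Ag = P * pinv(D) / P;      % group inverse, as in the earlier sketch
    disp(eig(Ag)')             % 0, 1, 1/2: reciprocals of 1 and 2 appear
    x = [1; 1; 1];             % eigenvector of A for lambda = 2
    norm(Ag*x - x/2)           % ~0: x is an eigenvector of A# for 1/2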


Theorem 1. Let $K$ be a square matrix of index 1, and let $L$ be such that $R(KL)\subset R(L)$. Then, $R(KL)=R(K)\cap R(L)$.

SOURCES: 

[1] Ben-Israel, A.; Greville, T. N. E., Generalized Inverses. Theory and Applications, Springer-Verlag, New York, 2003.

Friday, 23 July 2021

The Moore-Penrose Inverse (Pseudoinverse)

What is a pseudo-inverse? Why is it called 'pseudo'?

We know that a square nonsingular matrix $A$ has a unique inverse, denoted $A^{-1}$ (called the usual inverse). But what can we say about an inverse of a 'singular' or 'rectangular' matrix? The answer to this question has a deep connection with the 'pseudo-inverse', as follows:
  • The idea of an inverse is generalized to a larger class of matrices (rectangular or singular). So it is an extension, or generalization, of the usual inverse.
  • It has some properties similar to those of the usual inverse. Still, it does not have all the properties of the usual matrix inverse (e.g., $AX=XA=I$ is not always true for a pseudo-inverse $X$ of $A$), and that is why it is called a pseudo-inverse.

History: In many fields of applied mathematics in the early 20th century, where a problem reduces to a system of linear equations $Ax=b$ whose coefficient matrix $A$ is rectangular (or singular), it was necessary to have some kind of general reciprocal (what we now call a generalized inverse) of $A$ that enjoys properties similar to the usual matrix inverse. The first successful attempt to find a reciprocal of a matrix of arbitrary order was made by the mathematician Moore [2] in 1920. However, to deal with this problem, Moore used complicated notation and concepts, which kept his work relatively unknown in those years.
         Unaware of Moore's work, Penrose [3] rediscovered the definition and properties of a general reciprocal in 1955. He proved that a general reciprocal of a matrix is always unique and satisfies the four matrix equations given below:
                            $AXA=A$, $XAX=X$, $(AX)^T=AX$, $(XA)^T=XA,$
where $Z^T$ denotes the transpose of the matrix $Z$. This definition was much easier for general readers to understand. The two different definitions of these mathematicians were shown to be equivalent by Rado [4] just one year after Penrose's work. So, in honour of these two mathematicians (Moore and Penrose), we name the definition of a general reciprocal given by Penrose 'The Moore-Penrose Inverse'.

NOTE: Penrose recently received the Nobel Prize in Physics for the year 2020.

Definition given by Penrose: Let $A\in \mathbb{R}^{m\times n}$, then the Moore-Penrose inverse of $A$ is the unique matrix $X\in \mathbb{R}^{n\times m}$ that satisfies these four matrix equations
1. $AXA=A$, 
2. $XAX=X$, 
3. $(AX)^T=AX$, 
4. $(XA)^T=XA.$

We write $A^{\dagger}$ to represent the matrix $X$ that satisfies the above four equations.
One may notice that for a nonsingular $A\in \mathbb{R}^{n\times n}$, the Moore-Penrose inverse of $A$ coincides with the usual inverse $A^{-1}$.

Example: Let $A=\begin{pmatrix}1 &0 &1\\1 &1 &1 \end{pmatrix}$, then the Moore-Penrose inverse is $X=\begin{pmatrix}0.5 &0 \\-1 &1\\0.5 &0\end{pmatrix}$.
One can easily verify that $X$ satisfies all the above four Penrose equations.
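One can also let MATLAB carry out this verification (using the built-in pinv, discussed below):

    % Verifying the example: X satisfies all four Penrose equations
    A = [1 0 1; 1 1 1];
    X = pinv(A)                % returns [0.5 0; -1 1; 0.5 0]
    norm(A*X*A - A)            % ~0
    norm(X*A*X - X)            % ~0
    norm((A*X)' - A*X)         % ~0
    norm((X*A)' - X*A)         % ~0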

Definition given by Moore: Given $A$, there is exactly one $X$ such that, for suitable $Y$ and $Z$,
                                              $AXA=A$, $X=YA^T=A^TZ$.
This $X$ satisfies 
                               $XAX=X$, $(AX)^T=AX$, $(XA)^T=XA.$


One more equivalent form of the above definition using orthogonal projection is given below:

For every matrix $A$, there exists a unique matrix $X:R(A)\rightarrow R(A^{*})$ such that
                                           $AX=P_{R(A)}$ and $XA=P_{R(A^{*})}$.


One thing that is important to point out here is that the definitions above are modern reformulations: Moore's original definition was very complicated, and it was in fact stated for the more general case of matrices over a division ring with certain properties.

MATLAB command for computing the Moore-Penrose inverse: Computing the Moore-Penrose inverse by hand is difficult, especially when the matrix is very large. In that case one can seek the help of MATLAB: to compute the Moore-Penrose inverse $X$ of a matrix $A$, simply write 'pinv(A)' in the command window.


Is the Moore-Penrose inverse the same as the pseudo-inverse?
  • The simple and straightforward answer is 'NO'.
  • The Moore-Penrose inverse of a matrix $A$ is the matrix $X$ that satisfies all four Penrose equations, and it is always unique. In contrast, a matrix $X$ is called a pseudo-inverse if it satisfies any one (or any combination) of the four Penrose equations. A matrix may have many pseudo-inverses. This is clear from the next example.
  • The Moore-Penrose inverse is thus one particular pseudo-inverse of a matrix $A$. So in general the two notions are not the same, but in many textbooks the authors call the Moore-Penrose inverse 'the' pseudo-inverse. The reader should not get confused.
Consider $A=\begin{pmatrix}0 &1 &1\\1 &0 &1\\0 &0 &0 \end{pmatrix}$ and $X=\begin{pmatrix}0 &1 &0\\1 &0 &0\\0 &0 &1 \end{pmatrix}$. Then
                         $AXA=A$, but $XAX\neq X$, $(AX)^T\neq AX$ and $(XA)^T\neq XA$.
This shows that $X$ (which is not unique) is a pseudo-inverse of $A$ but not the Moore-Penrose inverse. However, we can easily check that the Moore-Penrose inverse is $A^{\dagger}=\begin{pmatrix}-1/3 &2/3 &0\\2/3 &-1/3 &0\\1/3 &1/3 &0 \end{pmatrix}$.
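A short MATLAB check of which Penrose equations this $X$ satisfies:

    % Only the first Penrose equation holds for this X
    A = [0 1 1; 1 0 1; 0 0 0];
    X = [0 1 0; 1 0 0; 0 0 1];
    norm(A*X*A - A)            % ~0: equation 1 holds
    norm(X*A*X - X)            % nonzero: equation 2 fails
    norm((A*X)' - A*X)         % nonzero: equation 3 fails
    norm((X*A)' - X*A)         % nonzero: equation 4 fails
    pinv(A)                    % the unique Moore-Penrose inverse shown above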

For any $\lambda\in \mathbb{R}$, $\lambda^{\dagger}=\begin{cases}\frac{1}{\lambda}; ~\lambda\neq 0\\0; ~\lambda=0.\end{cases}$

Let $\sigma(A)$ be the set of all eigenvalues of $A\in \mathbb{R}^{n\times n}$ and $A^{\dagger}$ be the Moore-Penrose inverse of $A$, then we have the following comparison table that reflects the properties and behavior of the usual inverse and the Moore-Penrose inverse.


(Rectangular/Singular $A$)   (Nonsingular $A$)

$(A^{\dagger})^T=(A^{T})^{\dagger}$   $(A^{-1})^T=(A^{T})^{-1}$
$(A^{\dagger})^{\dagger}=A$   $(A^{-1})^{-1}=A$
$A^{\dagger}=(A^TA)^{\dagger}A^T$   $A^{\dagger}=A^{-1}$
$(AB)^{\dagger}\neq B^{\dagger}A^{\dagger}$ in general   $(AB)^{-1}= B^{-1}A^{-1}$
$A^{\dagger}A \neq AA^{\dagger}\neq I$ in general   $A^{-1}A= AA^{-1}= I$
$\lambda\in \sigma(A)\nRightarrow \lambda^{\dagger}\in \sigma(A^{\dagger})$   $\lambda\in \sigma(A)\Rightarrow \lambda^{-1}\in \sigma(A^{-1})$
$B=S^{-1}AS\nRightarrow B^{\dagger}=S^{-1}A^{\dagger}S$   $B=S^{-1}AS\Rightarrow B^{-1}=S^{-1}A^{-1}S$

For nonsingular matrices $A$ and $B$, we know that the reverse order law $(AB)^{-1}=B^{-1}A^{-1}$ holds. But this is not true in general for the Moore-Penrose inverse: there exist matrices $A$ and $B$ such that $AB$ is defined but $(AB)^{\dagger}\neq B^{\dagger}A^{\dagger}$. Consider the following example:

Let $A=\begin{pmatrix}0.9572 &0.8003 &0.4218\\0.4854 &0.1419 &0.9157 \end{pmatrix}$ and $B=\begin{pmatrix}0.7922 &0.0357\\ 0.9595 &0.8491\\ 0.6557 &0.9340\end{pmatrix}$. Then,
$(AB)^{\dagger}=\begin{pmatrix}1.8108 &-2.0196\\-2.0442 &3.2869 \end{pmatrix}\neq \begin{pmatrix}1.0482 &-0.6218\\-0.7919 &0.9914 \end{pmatrix}=B^{\dagger}A^{\dagger}$.
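This counterexample is easy to reproduce numerically:

    % Reverse order law failing for the matrices above
    A = [0.9572 0.8003 0.4218; 0.4854 0.1419 0.9157];
    B = [0.7922 0.0357; 0.9595 0.8491; 0.6557 0.9340];
    norm(pinv(A*B) - pinv(B)*pinv(A))    % clearly nonzero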


Theorem 1. Let $A$ and $B$ be any two matrices such that $AB$ is defined. Then, the reverse order law $(AB)^{\dagger}=B^{\dagger}A^{\dagger}$ holds if and only if $R(A^TAB)\subset R(B)$ and $R(BB^TA^T)\subset R(A^T)$.
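Since $R(M)\subset R(N)$ holds if and only if $rank([N~M])=rank(N)$, one numerical way to test the two inclusions of Theorem 1 is a rank comparison; a sketch with the matrices $A$ and $B$ from the example above:

    % Rank test for the two inclusions of Theorem 1
    A = [0.9572 0.8003 0.4218; 0.4854 0.1419 0.9157];
    B = [0.7922 0.0357; 0.9595 0.8491; 0.6557 0.9340];
    rank([B, A'*A*B]) == rank(B)       % logical 0 (false) here
    rank([A', B*B'*A']) == rank(A')    % logical 0 (false) here
    % At least one inclusion must fail, since the reverse order law fails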

SOURCES: 
[1] Ben-Israel, A.; Greville, T. N. E., Generalized Inverses. Theory and Applications, Springer-Verlag, New York, 2003.

[2] Moore, E. H., On the reciprocal of the general algebraic matrix, Bull. Amer. Math. Soc. 26 (1920), 394–395.

[3] Penrose, R., A generalized inverse for matrices, Proc. Cambridge Philos. Soc. 51 (1955), 406–413.

[4] Rado, R., Note on generalized inverses of matrices, Proc. Cambridge Philos. Soc. 52 (1956), 600–601.
