Theory of Determinant

1. Definition of determinant

Let V be a vector space. Let D(v1, v2, ..., vn) be a function with n vectors of V as variables.

D is multilinear if it is linear in each vector variable:

D( ..., u + v, ... ) = D( ..., u, ... ) + D( ..., v, ... ),
D( ..., cu, ... ) = c D( ..., u, ... ).

D is alternating if switching two variables introduces a negative sign:

D( ..., u, ..., v, ... ) = - D( ..., v, ..., u, ... ).

Note that taking u = v in the alternating property gives D( ..., u, ..., u, ... ) = - D( ..., u, ..., u, ... ), so that D( ..., u, ..., u, ... ) = 0. Conversely, if D( ..., u, ..., u, ... ) = 0 for all u, then by the multilinearity of D, we have

0 = D( ..., u + v, ..., u + v, ... )
= D( ..., u, ..., u + v, ... ) + D( ..., v, ..., u + v, ... )
= D( ..., u, ..., u, ... ) + D( ..., u, ..., v, ... ) + D( ..., v, ..., u, ... ) + D( ..., v, ..., v, ... )
= D( ..., u, ..., v, ... ) + D( ..., v, ..., u, ... ),

which is the alternating property. Thus we have proved the following.

If D is multilinear, then

D( ..., u, ..., v, ... ) = - D( ..., v, ..., u, ... ) ⇔ D( ..., u, ..., u, ... ) = 0.
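The equivalence can be tried out on a small example. The following Python sketch is only an illustration: it assumes a function of two vectors in the plane, built from a hypothetical table `c[i][j]` that plays the role of D(ei, ej). The helper name `bilinear` and the sample tables are not part of the notes.

```python
def bilinear(c, u, v):
    """Evaluate D(u, v) = sum over i, j of u[i] * v[j] * c[i][j].

    c[i][j] stands for D(e_i, e_j); any table c gives a multilinear D.
    """
    return sum(u[i] * v[j] * c[i][j]
               for i in range(len(u)) for j in range(len(v)))

# Hypothetical table with c[i][j] = -c[j][i]: D is alternating.
alt = [[0, 5], [-5, 0]]
# Hypothetical table for the dot product: multilinear but not alternating.
dot = [[1, 0], [0, 1]]

u, v = [3, -1], [2, 7]

print(bilinear(alt, u, v), bilinear(alt, v, u))  # 115 -115
print(bilinear(alt, u, u))                       # 0
print(bilinear(dot, u, v), bilinear(dot, v, u))  # -1 -1
print(bilinear(dot, u, u))                       # 10
```

For the first table both sides of the equivalence hold, and for the dot product both fail, as the statement predicts.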

A function D(A) of n by n matrices may be considered as a function with n column vectors of A as variables. Assuming D is multilinear and alternating, we will derive a formula for D.

First, for a 2 by 2 matrix

A = [ a  b ] = [ae1 + ce2, be1 + de2],
    [ c  d ]

the multilinearity and the alternating property give us

D(A) = D(ae1 + ce2, be1 + de2)
= aD(e1, be1 + de2) + cD(e2, be1 + de2) (D is linear in the first variable)
= abD(e1, e1) + adD(e1, e2) + cbD(e2, e1) + cdD(e2, e2) (D is linear in the second variable)
= adD(e1, e2) + cbD(e2, e1) (D(e1, e1) = D(e2, e2) = 0 by the alternating property)
= D(e1, e2) (ad - cb) (D(e2, e1) = - D(e1, e2) by the alternating property)
= D(I2) detA,

which is the determinant multiplied by the constant D(I2).
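The 2 by 2 computation can be checked numerically. The Python sketch below assumes a hypothetical value `k` for D(e1, e2) and builds the corresponding multilinear alternating function; the names `D`, `det2` and the sample entries are chosen only for illustration.

```python
def D(k, col1, col2):
    """Multilinear alternating function of two columns with D(e1, e2) = k.

    Expands D(col1, col2) over the basis using the table
    D(e1, e1) = 0, D(e1, e2) = k, D(e2, e1) = -k, D(e2, e2) = 0.
    """
    table = [[0, k], [-k, 0]]
    return sum(col1[i] * col2[j] * table[i][j]
               for i in range(2) for j in range(2))

def det2(a, b, c, d):
    """Determinant of the matrix with rows (a, b) and (c, d)."""
    return a * d - c * b

k = 7                      # hypothetical value of D(I2)
a, b, c, d = 2, 3, 5, 11   # sample entries

# The columns of A are (a, c) and (b, d).
print(D(k, (a, c), (b, d)))   # 49
print(k * det2(a, b, c, d))   # 49
```

Changing `k` rescales both numbers by the same factor, which is the content of D(A) = D(I2) detA.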

For the 3 by 3 case

A = [ a11  a12  a13 ]
    [ a21  a22  a23 ]
    [ a31  a32  a33 ]
  = [a11e1 + a21e2 + a31e3, a12e1 + a22e2 + a32e3, a13e1 + a23e2 + a33e3],

the multilinearity gives us

D(A) = D(a11e1 + a21e2 + a31e3, a12e1 + a22e2 + a32e3, a13e1 + a23e2 + a33e3)
= a11a12a13D(e1, e1, e1) + a11a12a23D(e1, e1, e2) + a11a12a33D(e1, e1, e3) + ...
= ∑_{1 ≤ i, j, k ≤ 3} ai1aj2ak3D(ei, ej, ek),

where i, j, k can be any of 1, 2, 3. By the alternating property, if any two of i, j, k are the same, then D(ei, ej, ek) = 0. This leaves only the terms with distinct i, j, k

D(A) = a11a22a33D(e1, e2, e3) + a11a32a23D(e1, e3, e2) + a21a12a33D(e2, e1, e3) + ...
= ∑_{i, j, k distinct} ai1aj2ak3D(ei, ej, ek).
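To see how many terms are removed at this step, the index triples can simply be enumerated. This is a small Python sketch; the variable names are chosen for illustration.

```python
from itertools import product

indices = (1, 2, 3)

# Every choice of (i, j, k) gives one term of the multilinear expansion.
all_terms = list(product(indices, repeat=3))

# Terms with a repeated index vanish by the alternating property.
surviving = [t for t in all_terms if len(set(t)) == 3]

print(len(all_terms))   # 27
print(len(surviving))   # 6
print(surviving)
```

Of the 27 terms in the expansion, only the 6 with distinct i, j, k remain.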

Since i, j, k are distinct numbers chosen from 1, 2, 3, they must be a permutation (rearrangement of positions) of (1, 2, 3). A permutation (i, j, k) can be changed to (1, 2, 3) by a sequence of transpositions (exchange of two positions). For example, (2, 3, 1) is changed to (1, 2, 3) by

(2, 3, 1) → (2, 1, 3) → (1, 2, 3),

where each arrow is a transposition: the first exchanges the second and third positions, and the second exchanges the first and second positions. By the alternating property, the sequence of transpositions tells us

D(e2, e3, e1) = - D(e2, e1, e3) = D(e1, e2, e3).

In general, we always have D(ei, ej, ek) = ± D(e1, e2, e3), where the sign is positive if the number of transpositions is even, and is negative if the number is odd. Now we can list all the terms with distinct (i, j, k) and how the signs are changed.

permutation   transpositions to (1, 2, 3)   term in D(A)               equal to
(1, 2, 3)     = (1, 2, 3)                   a11a22a33D(e1, e2, e3)     a11a22a33D(e1, e2, e3)
(2, 3, 1)     → (2, 1, 3) → (1, 2, 3)       a21a32a13D(e2, e3, e1)     a21a32a13D(e1, e2, e3)
(3, 1, 2)     → (1, 3, 2) → (1, 2, 3)       a31a12a23D(e3, e1, e2)     a31a12a23D(e1, e2, e3)
(3, 2, 1)     → (1, 2, 3)                   a31a22a13D(e3, e2, e1)     -a31a22a13D(e1, e2, e3)
(2, 1, 3)     → (1, 2, 3)                   a21a12a33D(e2, e1, e3)     -a21a12a33D(e1, e2, e3)
(1, 3, 2)     → (1, 2, 3)                   a11a32a23D(e1, e3, e2)     -a11a32a23D(e1, e2, e3)
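The signs in the last column can be reproduced by counting transpositions. The Python sketch below uses one particular strategy (put 1 in the first position, then 2 in the second, and so on), so the sequence of transpositions it finds may differ from the one listed above, although the parity is the same. The function name `sign_by_transpositions` is an assumption made for this illustration.

```python
from itertools import permutations

def sign_by_transpositions(perm):
    """Return (sign, steps) for a permutation of (1, ..., n).

    steps lists the intermediate arrangements obtained by exchanging
    two positions at a time until (1, ..., n) is reached.
    """
    p = list(perm)
    steps = []
    for pos in range(len(p)):
        target = pos + 1
        if p[pos] != target:
            other = p.index(target)          # where the wanted value sits
            p[pos], p[other] = p[other], p[pos]
            steps.append(tuple(p))
    sign = 1 if len(steps) % 2 == 0 else -1
    return sign, steps

for perm in permutations((1, 2, 3)):
    sign, steps = sign_by_transpositions(perm)
    print(perm, steps, sign)
```

For example, (2, 3, 1) is handled as (2, 3, 1) → (1, 3, 2) → (1, 2, 3), which is again two transpositions and gives a positive sign.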

Originally, D(A) is the sum of all the terms in the third column. By using the alternating property, this is equal to the sum of all the terms in the fourth column. Comparing with the definition of determinant for 3 by 3 matrices, we get

D(A) = D(e1, e2, e3) detA = D(I3) detA,

which is the determinant multiplied by the constant D(I3).
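The sum over permutations obtained above can be written out as a short program. The Python sketch below is an illustration under stated assumptions: the names `sign` and `D_by_permutations` and the parameter `d_identity` (standing for D(I)) are chosen here, and the sign is computed by counting inversions, which is assumed to give the same parity as counting transpositions.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, ..., n-1), computed from its inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def D_by_permutations(A, d_identity=1):
    """Sum of sign * a_{i1} a_{j2} a_{k3} ... over permutations, times D(I)."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]   # row index from the permutation
        total += term
    return d_identity * total

A = [[2, 0, 1],
     [1, 3, -1],
     [4, 1, 5]]

print(D_by_permutations(A))                 # 21
print(D_by_permutations(A, d_identity=4))   # 84
```

With `d_identity=1` the value agrees with the 3 by 3 determinant of the sample matrix; any other value of D(I) only rescales the result.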

The pattern we saw in the computation of D(A) leads us to the following definition.

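The determinant is a function det of n by n matrices, such that

  1. det is multilinear in the column vectors;
  2. det is alternating in the column vectors;
  3. det(I) = 1.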

Several issues need to be settled in order to rigorously justify the definition:

  1. The function as described in the definition really exists;
  2. The function is also unique (so there is no ambiguity in the definition);
  3. The function has all the properties we gave to the determinant earlier.
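Before settling these issues in general, the properties used in the derivation can at least be spot-checked for the familiar 3 by 3 formula. The Python sketch below tests linearity in the first column, the alternating property for the first two columns, and the value on the identity matrix, on a few sample vectors. It is a numerical check on chosen examples, not a proof, and the names `det3` and `from_columns` are assumptions made for this illustration.

```python
def det3(A):
    """The usual six-term formula for the 3 by 3 determinant."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a*e*i + b*f*g + c*d*h - c*e*g - b*d*i - a*f*h

def from_columns(c1, c2, c3):
    """Build the matrix (as a list of rows) whose columns are c1, c2, c3."""
    return [[c1[r], c2[r], c3[r]] for r in range(3)]

u, v, w, x = (1, 2, 3), (0, -1, 4), (2, 2, 5), (3, 0, -2)
t = 6

u_plus_x = tuple(p + q for p, q in zip(u, x))
t_times_u = tuple(t * p for p in u)

additive = (det3(from_columns(u_plus_x, v, w))
            == det3(from_columns(u, v, w)) + det3(from_columns(x, v, w)))
homogeneous = (det3(from_columns(t_times_u, v, w))
               == t * det3(from_columns(u, v, w)))
alternating = (det3(from_columns(v, u, w))
               == -det3(from_columns(u, v, w)))
normalized = det3(from_columns((1, 0, 0), (0, 1, 0), (0, 0, 1))) == 1

print(additive, homogeneous, alternating, normalized)  # True True True True
```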
