matrix multiplication
Tuesday, August 25th, 2009I never can remember exactly how to multiply matrices. So, I’m going to explain it to you, and hopefully, that will explain it to me.
One of the first questions you might have is: “What is a matrix, and why would I want to multiply it?”
When we talk about a matrix in computer science, and probably in engineering, physics, chemistry, and probably in a movie starring Keanu Reaves and an absurd quantity of latex, we’re usually talking about an array of numbers. Like this:
1010
1001
1111
That’s a 4×3 array of numbers, a two dimensional matrix. A matrix is just an m dimensional group of numbers.
Typically, in comsci, we start labeling dimensions at i. So the array in Fig. 1 has two dimensions, i and j. If it were a three dimensional array, I would add a dimension k and so on.
A one dimensional matric is called a vector.
So why do we want to multiply matrices? There are a lot of reasons, but the one that is simplest for me to keep in my head is that a point on a screen is represented by a matrix, and to move that point through a rotation, translation, or scale, you multiply it times a transformation matrix. In other words, if you are playing a video game, using an illustration program like inkscape, or rendering an animation, there’s a whole lot of matrix multiplication going on!
Because matrix multiplication is fundamental to so many problems that computers have to solve, it’s a popular problem for discussing algorithm design and models of computation. It acts as a baseline measure, much like sorting. In parallel computation this is very important, because there is no settled and accepted general model for parallel computation and because parallel computation for graphics processing is so common. Your computer almost certainly has a graphics card in it processes in parallel.
And that’s why computer scientists should know how to multiply matrices. Because otherwise, you can’t talk intelligently with other computer scientists about important things, and you won’t understand papers that discuss those problems either.
There’s a great scene in Stand and Deliver in which James Olmos recites “A negative times a negative is a positive, a positive times a positive is a positive, a negative times a positive is a positive” and all the kids chant with him. After a few cycles, he asks “Why?”
The rule for vector multiplication are: a vector times a vector is a matrix. A vector times a vector is a scalar. A matrix times a matrix is a matrix.
This is confusing. The confusion is because there are two ways to multiply vectors in linear algebra, the inner product (also called the dot product) and the outer product. So, given an vector x and a vector y, there are two ways we can multiply it together. x · y will yeild a single (scalar) value. x × y will yield a matrix value.
Given two vectors of length n, such as
| [ | 1 | 2 | 3 | 4 | 5 | ] | [ | 6 | ||
| 7 | ||||||||||
| 8 | ||||||||||
| 9 | ||||||||||
| 0 | ] |
How do we multiply them together? If we want the dot product (x ⋅ y or xTy), we do this: (1 × 6)+(2 × 7)+(3 × 8)+(4 × 9)+(5 × 0), or 6+14+24+36+0, or [80]. A vector times a vector is a scalar.
The outer product is a little different. When we take the outer product, written as xyT or x × y, we would get a matrix:
| 1*6 | 1*7 | 1*8 | 1*9 | 1*0 |
| 2*6 | 2*7 | 2*8 | 2*9 | 2*0 |
| 3*6 | 3*7 | 3*8 | 3*9 | 3*0 |
| 4*6 | 4*7 | 4*8 | 4*9 | 4*0 |
| 5*6 | 5*7 | 5*8 | 5*9 | 5*0 |
Which brings us to the question of multiplying matrices. So, if you multiply a matrix times a matrix, you don’t get a super matrix, or a vector… you just get another matrix. How does this work?
Well, if you have a matrix with i rows and j columns, you can multiply it times a matrix with j rows and k columns to get a matrix with i rows and k columns. So to find the value of a cell i,kin that matrix, you just take the dot product of the ith row with the kth column.
Easy as pie.
Why this is so… honestly I’m not sure. The theory of linear algebra is a closed book to me. But I think I’ll remember this for qualifiers now.