Game programmers deal with homogeneous coordinates on a regular basis . They are a neat extension of standard three dimensional vectors and allow us to simplify various transforms. But do you really know, what they mean?
Usually gamedevs describe homogeneous coordinates like this: “You add a fourth coordinate to your Vec3 and call it w. Oh, and w is usually set to 1. Now you can do a rotation and translation using a single matrix*vector operation. Cool, isn’t it? 8) And when you do a projection, then the result will have w != 1. To get the projected point you have to divide by w so that in (x/w, y/w, z/w, 1) w equals 1 again. That’s called normalization!” Sometimes they might even add: “If w equals 0 then (x, y, z, 0) is called a ‘vector’ and if w equals 1 then it is called a ‘point’. Vectors are directions and won’t be affected by translations and points will.
Since my background is mathematics and not computer science, these explanations always made me cringe a little. Because they are kind of true and to be honest they are probably all you’ll ever need as a game programmer. If that’s fine with you then you can stop reading now. But if you want a bit more background information, then please continue.
You are still reading? Good. Homogeneous coordinates were introduced in linear algebra to describe projective spaces. Here’s what Mathworld has to say about projective spaces. If you are now thinking “WTF?!” then I agree. Mathematicians tend to describe simple things in a way so that no one but them can understand it (similar to lawyers and doctors.)
Let me translate this into English. To further simplify it, I will explain the concept using a 2D plane. This allows me to draw some diagrams to illustrate a projective plane . Homogeneous coordinates add another dimension, namely the w-coordinate. Since my 4D drawing skills are not that great it’s just easier to look at homogeneous coordinates for a 2D plane.
Step 1: Why was projective geometry invented? Answer: mathematicians do not like exceptions from rules. They literally despise them. In a plane the following statement is true: “Two distinct lines intersect at exactly one point, except when they are parallel.” Your average mathematician will now be running round “Oh, those damned parallel lines. I wish they wouldn’t exist!!”
Step 2: But one day a mathematician had an idea: “Wait a second, what if we can generalize Euclidean geometry? Let’s embed the 2D plane in 3D space. We’ll do this by calling all lines passing through origin (0,0,0) ‘points’ and then we’ll call all planes passing through origin ‘lines'” So basically what he suggested was to increase the dimension of primitives by one dimension. Points -> lines, lines -> planes. The whole 3D space is called the projective plane.
Step 3: If you think: “What are you talking about?” then don’t worry. It will all become clearer soon. (Oh and I am pointing the z axis down, so that the pictures later on will be easier to draw.)
Just consider this: every line through the origin can be written as k*(x,y,z) where (x,y,z) != (0,0,0). This also means that e.g. k*(1,2,3) represents the same line as k*(5,10,15) because (1,2,3) and (5,10,15) are collinear (which is math speech for being on the same line)
Step 4: Now let’s intersect some “points” (ie. green lines) and a “line” (ie. a red plane) with the plane z=1 (blue in this diagram)
As you can see, the green lines through the origin intersect the plane in points. The red plane intersects the plane in a line. The plane z=1 is called an affine view of the projective plane.
Step 5: So why did we do all this? Look at the “points” (ie. green lines) we talked about. All “points” which are lines of the form k*(x,y,z) where z is not zero will intersect our affine view, the z=1 plane.
k*(x,y,z) are called the homogeneous coordinates (aha!) of a “point” on a projective “plane” (which is actually the whole 3D space). For example if k*(12,8,4) are the coordinates for one “point” then (12,8,4) is one representative and (3,2,1) is its normalized form.
Now let’s remove the coordinate axes from the drawing. Then all that’s left is our affine view which looks just like a bog standard 2D plane with green points (which are actual points) and a red line (which is a real line).
Step 6: So what about points where z=0? Points with coordinates k*(x,y,0) are called “vanishing points” (or points at infinity) and they do not intersect the plane z=1.
What’s so nice about them? Let’s revisit intersections of lines and our panicking mathematicians. If you take two projective “lines” (which are planes through the origin) then they will always(!) intersect in a line through origin which as we now know is a “point”. If the “lines” are not parallel (ie. the intersection of their planes with z=1 are not parallel) then the resulting “point” will also intersect the plane z=1. If that sounds confusing then I hope this picture will clear things up.
If the “lines” are parallel this means that they will intersect in a “point”=line that is parallel to z=1, so it’ll be a “vanishing point”. E.g. these two red planes (=”lines”) intersect in k*(1,0,0) which is the x-axis.
All “vanishing points” together form a line: the plane z=0 which is called the “vanishing line”.
Step 7: Conclusion: In a projective plane we can finally say that “all lines intersect in exactly one point”. For mathematicians that’s super nice, it makes them sleep better.
Step 8: Why are lines parallel to z=1 called “vanishing points”? Remember that z=1 is just one affine view of our projective plane. We could take any other plane (not going through the origin) and intersect it with our “points” and “lines” to get another affine view. For example if you tilt the plane z=1 a bit so that it intersects the x-axis then you would be able to “see” the vanishing point which is the intersection of the two parallel lines in the picture above.
You can see this effect everyday. Look at train tracks. They are parallel, but when you look at them vanishing into the distance they seem to intersect in an inifinitely distant point. So projective geometry is actually the study of euclidean geometry when it is being projected.
Step 9: Another way of visualizing a projective plane would be imagining a sphere centered around the origin. Then all “lines” would result in circles and “points” in a pair of matching dots on the sphere. The advantage of this method is that there wouldn’t be be any difference between normal and vanishing points. But this model is less intuitive than the affine view.
Step 10: Summary: Homogeneous 3D coordinates (which are 4D) are just a generalization of the things I’ve explained above. Coordinates are extended with a fourth coordinate w. Two vectors (x,y,z,w) and (x’,y’,z’,w’) represent the same point in 3D space if one is a multiple of the other. (x,y,z,1) is called the normalized form. Points of the form (x,y,z,0) are again called “vanishing points”.
The distinction point vs vector is something that I’ve only heard in the context of computer science. But I admit that it does make sense to say that (x,y,z,0) are “directions” and (x,y,z,1) are “points”. When you think back at our projective plane then every “line” (which is a plane through origin) can be described by a point p=(x,y,1) and a point d=(dx,dy,0). Those two vectors certainly won’t be collinear and therefore always span a “line” (=plane). s*p will be a “point” that intersects z=1, and t*d is a vanishing point that can be interpreted as the direction of the line which is exactly how we define normal affine lines: p+k*d.
Projective geometry lets you write affine transforms using a single matrix operation (but you already knew that). It also let’s you write projections using a matrix operation. The simplest being (x,y,z,1) -> (x,y,z,z) . After renormalization we get (x/z,y/z,1,1). But keep in mind: In projective space (x/z,y/z,1,1) and (x,y,z,z) represent the same point.
The same analogy for affine views applies for a 3D space. Basically our world is only one single 3D affine view of a 3D projective space embedded in 4D! Try that next time to impress people with CSI techno babble!
(For the interested: What mathematicians also like about projective planes is that they fulfill the principle of duality. Simplified this means that any theorem about points and lines can be converted into a “dual” theorem by exchanging the words line and point and change “intersect” to “contained on”. The simplest would be “For two distinct points there is always a line that contains them.” The dual statement is “Two distinct lines always intersect in one point”. Obviously this is not true for affine lines because of parallel lines, but for a projective plane this statement is true. And it can be proven that for all statements that are true in a projective plane, the dual statement is also true. Why is this important? Because if a theorem is difficult to prove then maybe its dual counterpart is easier to prove which would then result in proving the dual version as well.)
Ok, that’s it. I hope I could demystify homogeneous coordinates and didn’t confuse you even more. It won’t change the way you’ll use them in your 3D engine, but maybe you’ll look at them in a different way now. Homogeneous coordinates are used in the field of projective geometry which generalizes affine geometry.
By the way: quaternions also haven’t been invented to let game developers do nice rotations. But that’s a different story …
Update (28th Jan 2013): The term “normalized” can be confusing when it comes to homogeneous coordinates. To clear this up: When we deal with homogeneous coordinates (x, y, z, w), then the normalized form means w==1. When talking about ordinary 3d coordinates, then normalized means length==1. In the above article whenever I use the term normalized, I mean normalized in the sense of normalized homogeneous coordinates, NOT unit length.