Formulas \begin{equation} d = \frac{|\ve{n} \cdot \ve{r} - a|}{|\ve{n}|} \label{plane_distance} \end{equation} and \begin{equation} d = \frac{|\ve{b} \times (\ve{r} - \ve{a})|}{|\ve{b}|} \label{line_distance} \end{equation} are useful for calculating the shortest distance between a point and a line or a plane in 3D space, but their uses of the dot and cross products seem arbitrary and do not generalize in any obvious way to higher dimensions. This article discusses an interpretation of the two formulas, and hence provides a generalization. It also serves as an introduction to the exterior product, which, the author believes, is the very idea behind these formulas.
There is something interesting about \eqref{plane_distance}, where \(\ve{r}\) is the position vector of a point, \(\ve{n} \cdot \ve{r} = a\) is the equation of a plane, and \(d\) is the shortest distance between them. Note that the dot product \(\ve{n} \cdot \ve{r}\) appears in both the plane equation and the distance formula. A simple way to think of this is that when a point is on the plane, the distance between the point and the plane must be zero, that is to say \[ \ve{n} \cdot \ve{r} - a = 0, \] which is equivalent to our plane equation.
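As a quick sanity check with arbitrarily chosen numbers: take the plane \(\ve{n} \cdot \ve{r} = 3\) with \(\ve{n} = (1, 2, 2)^T\) and the point \(\ve{r} = (3, 1, 1)^T\). Then \[ d = \frac{|\ve{n} \cdot \ve{r} - a|}{|\ve{n}|} = \frac{|3 + 2 + 2 - 3|}{\sqrt{1 + 4 + 4}} = \frac{4}{3}. \]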
Now let's consider the case of a point and a line. With the position vector of a point being \(\ve{r}\), and the equation of a line being \(\ve{r} = \ve{a} + \lambda \ve{b}\), the distance is given by \eqref{line_distance}, which means, following the same idea as above, the line must have another equation \begin{equation} \ve{b} \times \ve{r} = \ve{b} \times \ve{a}. \label{line_eq} \end{equation} In fact, if we have a vector equation of a plane, just like a vector equation of a line, \[ \ve{r} = \ve{a} + \lambda \ve{b} + \mu \ve{c}, \] the Cartesian equation that corresponds to it would be \begin{equation} (\ve{b} \times \ve{c}) \cdot \ve{r} = (\ve{b} \times \ve{c}) \cdot \ve{a}, \label{plane_eq} \end{equation} which might suggest a different way of understanding the equations.
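Both claims can be checked by direct substitution. For the line, \[ \ve{b} \times \ve{r} = \ve{b} \times (\ve{a} + \lambda \ve{b}) = \ve{b} \times \ve{a} + \lambda (\ve{b} \times \ve{b}) = \ve{b} \times \ve{a}, \] since the cross product of a vector with itself vanishes; and for the plane, \((\ve{b} \times \ve{c}) \cdot (\ve{a} + \lambda \ve{b} + \mu \ve{c}) = (\ve{b} \times \ve{c}) \cdot \ve{a}\), since \(\ve{b} \times \ve{c}\) is perpendicular to both \(\ve{b}\) and \(\ve{c}\).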
Firstly, the expression \((\ve{b} \times \ve{c}) \cdot \ve{a}\) is the same thing as the determinant of the matrix with \(\ve{b}\), \(\ve{c}\), and \(\ve{a}\) as its column vectors. This will become obvious later, but you can readily prove it algebraically. Also, a determinant is not only an enlargement factor, but also the signed volume of the parallelepiped (i.e. the 3D analogue of a parallelogram) formed by the column vectors; this is the core geometric idea behind determinants. With that said, \eqref{plane_eq} is effectively saying that the volume formed by vectors \(\ve{b}\), \(\ve{c}\), and \(\ve{r}\) equals the volume formed by vectors \(\ve{b}\), \(\ve{c}\), and \(\ve{a}\). For the two volumes to be equal, vector \(\ve{r}\) must have the same height \(h\) as vector \(\ve{a}\), relative to the shared base formed by vectors \(\ve{b}\) and \(\ve{c}\).
The difference vector \(\ve{r} - \ve{a}\) therefore has zero height relative to that base exactly when the point lies on the plane, and in general its height is the distance \(d\): the parallelepiped spanned by \(\ve{b}\), \(\ve{c}\), and \(\ve{r} - \ve{a}\) has volume \(|(\ve{b} \times \ve{c}) \cdot (\ve{r} - \ve{a})|\), and dividing by the base area \(|\ve{b} \times \ve{c}|\) recovers \eqref{plane_distance} with \(\ve{n} = \ve{b} \times \ve{c}\).
Let's look at lines now. Equation \eqref{line_eq} uses the cross product alone, which is something like a directional area, compared to the volumes calculated by determinants. A cross product of two vectors has its magnitude equal to the area of the parallelogram formed by the two vectors, and its direction normal to both vectors. Here, since \(\ve{b} \times \ve{r}\) has the same direction as \(\ve{b} \times \ve{a}\), the plane of vectors \(\ve{b}\) and \(\ve{r}\) is actually the same as the plane of vectors \(\ve{b}\) and \(\ve{a}\); that is to say, vectors \(\ve{r}\), \(\ve{a}\), \(\ve{b}\) are coplanar. The magnitudes of the two cross products are also equal. Since they are areas of two parallelograms on the same plane with the same base \(\ve{b}\), the heights of \(\ve{r}\) and \(\ve{a}\) above \(\ve{b}\) must be equal, so the difference \(\ve{r} - \ve{a}\) has zero height exactly when the point is on the line; in general that height is the distance \(d\), and dividing the area \(|\ve{b} \times (\ve{r} - \ve{a})|\) by the base \(|\ve{b}|\) gives \eqref{line_distance}.
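A quick numeric check with arbitrarily chosen values: take the line \(\ve{r} = \ve{a} + \lambda \ve{b}\) with \(\ve{a} = (1, 0, 0)^T\), \(\ve{b} = (0, 1, 0)^T\), and the point \(\ve{r} = (1, 0, 2)^T\). Then \(\ve{b} \times (\ve{r} - \ve{a}) = (0, 1, 0)^T \times (0, 0, 2)^T = (2, 0, 0)^T\), so \(d = 2 / 1 = 2\), as expected for a point sitting two units above the line.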
Note that we essentially used the idea of volumes and directional areas to derive the equation of a linear object (i.e. a line, a plane, or any higher-dimensional flat object), and to calculate the distance from a point to that object. It is therefore potentially helpful to consider the properties that directional hypervolumes might have.
Let's denote the volume of the parallelepiped formed by vectors \(\ve{a}\), \(\ve{b}\), and \(\ve{c}\) as \(V(\ve{a}, \ve{b}, \ve{c})\), and the area formed by vectors \(\ve{a}\) and \(\ve{b}\) as \(A(\ve{a}, \ve{b})\). It seems that instead of calculating the volume \(V(\ve{a}, \ve{b}, \ve{c})\) directly, we could first calculate the directional area \(A(\ve{a}, \ve{b})\), and then combine the magnitude of that area as well as its direction with the third vector \(\ve{c}\), namely taking vectors \(\ve{a}\) and \(\ve{b}\) as the base and multiplying it by the height. We could also take vectors \(\ve{b}\) and \(\ve{c}\) or vectors \(\ve{a}\) and \(\ve{c}\) as the base, multiplied by the height of the remaining vector, and get the same volume. This fact suggests that the directional hypervolume of vectors obeys something like an associative law, in other words, \(V(A(\ve{a}, \ve{b}), \ve{c}) = V(\ve{a}, A(\ve{b}, \ve{c}))\); but strictly speaking, associativity only makes sense for binary operations, so let's try to redefine the calculation of hypervolumes as a binary operation, and see what happens by considering a few properties of this operation.
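In the familiar three-dimensional notation, this is just the scalar triple product identity \[ (\ve{a} \times \ve{b}) \cdot \ve{c} = \ve{a} \cdot (\ve{b} \times \ve{c}), \] where either side computes the same volume \(V(\ve{a}, \ve{b}, \ve{c})\) from a different choice of base.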
Definition In an \(n\)-dimensional vector space, denote the directional hypervolume of the parallelotope (a higher-dimensional generalization of parallelograms and parallelepipeds) formed by the vectors \(\ve{a}_1, \ve{a}_2, \ldots, \ve{a}_m\) as \((\ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m)\), where \(m \le n\). The operation \(\wedge\) is known as exterior product, or wedge product.
Property 1 Exterior product satisfies the associative law, i.e. \[ (\ve{a} \wedge \ve{b}) \wedge \ve{c} = \ve{a} \wedge (\ve{b} \wedge \ve{c}). \] This is implicit whenever we write \[ \ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m. \]
Property 2 Exterior product is negated when any two vectors are swapped, i.e. \[ \ldots \wedge \ve{a} \wedge \ldots \wedge \ve{b} \wedge \ldots = - (\ldots \wedge \ve{b} \wedge \ldots \wedge \ve{a} \wedge \ldots). \]
If we swap any two vectors, we still have the same collection of vectors, but in a different orientation, so we get the same magnitude with the exact opposite direction. An important implication of this property is that \begin{equation} \ldots \wedge \ve{a} \wedge \ldots \wedge \ve{a} \wedge \ldots = 0, \label{alternating_to_zero} \end{equation} since swapping the two equal vectors changes nothing, yet negates the product, which is only possible if the product is zero.
Property 3 Exterior multiplication distributes over addition, i.e. \[ \ldots \wedge (\alpha + \beta) \wedge \ldots = (\ldots \wedge \alpha \wedge \ldots) + (\ldots \wedge \beta \wedge \ldots), \] where \(\alpha\) and \(\beta\) can be either vectors or exterior products of the same number of vectors.
This property cannot be proved directly, but the discussion below attempts to give you an idea of where it comes from. Let's first consider a simpler case, where \(m\) equals \(n\), in other words, the dimension of the space equals the number of vectors, and we will only consider addition between vectors, not between exterior products, since we haven't defined addition for those yet. With that given, an exterior product is basically an \(n\)-dimensional hypervolume in an \(n\)-dimensional space, with no direction but a sign representing the orientation of the vectors. Then if you split one of the vectors into two vectors, the height of the original vector simply equals the sum of the heights of the two new vectors, relative to the base formed by the rest of the vectors.
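In this top-dimensional case the property is just the familiar multilinearity of determinants. For instance, in two dimensions, \[ \begin{vmatrix} a_1 + b_1 & c_1 \\ a_2 + b_2 & c_2 \end{vmatrix} = (a_1 + b_1) c_2 - (a_2 + b_2) c_1 = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix} + \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}, \] which is exactly \((\ve{a} + \ve{b}) \wedge \ve{c} = (\ve{a} \wedge \ve{c}) + (\ve{b} \wedge \ve{c})\) read in terms of signed areas.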
A determinant may be considered a scalar, but generally speaking, most exterior products have directions, even though these are not simply vector directions. That leaves us with a question: how do we know whether two directions are the same, and what exactly makes two exterior products equal to each other? We may define two exterior products of the same number of vectors, \(\alpha\) and \(\beta\), to be equal if and only if \[ \alpha \wedge \gamma = \beta \wedge \gamma, \] where \(\gamma\) is an arbitrary exterior product. Since we already have determinants defined, an easy way to apply this definition is to choose the number of vectors that make up \(\gamma\) so that \(\alpha \wedge \gamma\) and \(\beta \wedge \gamma\) are both determinants, which we already know how to calculate.
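For example, Property 2 claims that \(\ve{i} \wedge \ve{j} = -(\ve{j} \wedge \ve{i})\) in three dimensions, and the definition above lets us verify this: wedging both sides with \(\gamma = \ve{k}\) gives the determinants \(\det(\ve{i}, \ve{j}, \ve{k}) = 1\) and \(-\det(\ve{j}, \ve{i}, \ve{k}) = 1\), which agree.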
After being able to verify whether two exterior products are equal, we still have to find a way to determine the sum of two exterior products. Following the same idea, we restate Property 3 in an equivalent form: the exterior products \(\alpha\) and \((\beta + \gamma)\) are equal if and only if \[ \alpha \wedge \delta = (\beta \wedge \delta) + (\gamma \wedge \delta), \] where \(\delta\) is again an arbitrary exterior product. Note that this definition is consistent with vector addition, since, as we argued above, exterior multiplication already distributes over vector addition.
With its three properties in place, the exterior product has now been properly defined. Let's look at an example of how the calculation is done.
Example Calculate \[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} \wedge \begin{bmatrix} d \\ e \\ f \end{bmatrix}. \]
Rewrite each vector in the form \(a \ve{i} + b \ve{j} + c \ve{k}\) and expand using Property 3; of the nine resulting terms, the three of the form \(\ve{i} \wedge \ve{i}\) vanish by \eqref{alternating_to_zero}, and the remaining six pair up via Property 2, giving \[ (ae - bd) (\ve{i} \wedge \ve{j}) + (bf - ce) (\ve{j} \wedge \ve{k}) + (cd - af) (\ve{k} \wedge \ve{i}), \] which has exactly the same components as the cross product of the two vectors. This makes sense because we are aiming to calculate directional hypervolumes, and the cross product is basically a directional area.
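As a numeric cross-check (a quick sketch of ours, not part of the article's development), the three minors computed above can be compared directly against \texttt{numpy.cross}:
\begin{verbatim}
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # (a, b, c)
v = np.array([4.0, 5.0, 6.0])   # (d, e, f)

def minor(i, j):
    # coefficient of e_i ^ e_j in u ^ v: a 2x2 minor of the matrix [u v]
    return u[i] * v[j] - u[j] * v[i]

cross = np.cross(u, v)  # components (bf - ce, cd - af, ae - bd)
assert np.isclose(minor(1, 2), cross[0])  # j ^ k  component
assert np.isclose(minor(2, 0), cross[1])  # k ^ i  component
assert np.isclose(minor(0, 1), cross[2])  # i ^ j  component
\end{verbatim}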
We still have to figure out how to find the magnitude of an exterior product, i.e. its hypervolume. In the case of cross products, the norm of the resulting vector is the magnitude of the directional area. Therefore, we might make a reasonable inference that adding the squares of all the individual components of an exterior product gives the square of that hypervolume, in the same way as taking the norm of a vector. An informal proof is provided below, but you may also take this result for granted and skip the proof.
Define the square root of the sum of squares of all components of an exterior product as its norm, written as \(\norm{\alpha}\). For instance, the norm of the exterior product in the last example is \[ \sqrt{(ae - bd)^2 + (bf - ce)^2 + (cd - af)^2}. \] Denote the hypervolume formed by vectors \(\ve{a}_1\) to \(\ve{a}_m\) as \(V(\ve{a}_1, \ve{a}_2, \ldots, \ve{a}_m)\); we might also use this notation in a subtler way, as we already have. We will prove that the magnitude of a hypervolume equals the norm of the exterior product that corresponds to it, in other words, \[ \norm{\ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m} = V(\ve{a}_1, \ve{a}_2, \ldots, \ve{a}_m), \] or equivalently, as \(V\) is positive, \begin{equation} \label{norm_vol} \norm{\ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m}^2 = V^2(\ve{a}_1, \ve{a}_2, \ldots, \ve{a}_m). \end{equation}
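Before the proof, a numeric spot-check of \eqref{norm_vol} may be reassuring. The following sketch (our own Python code, not part of the article's argument) compares the sum of squared components against the Gram-determinant expression \(\det(A^T A)\) for the squared hypervolume, where \(A\) is the matrix with the vectors as columns:
\begin{verbatim}
import numpy as np
from itertools import combinations

def wedge_norm_sq(vectors):
    # Sum of squared components of a_1 ^ ... ^ a_m: one m x m minor
    # for each choice of m coordinate axes out of n.
    A = np.column_stack(vectors)  # n x m
    n, m = A.shape
    return sum(np.linalg.det(A[list(rows), :]) ** 2
               for rows in combinations(range(n), m))

rng = np.random.default_rng(0)
vs = [rng.standard_normal(5) for _ in range(3)]  # three vectors in 5D

A = np.column_stack(vs)
vol_sq = np.linalg.det(A.T @ A)  # Gram determinant, equal to V^2
assert np.isclose(wedge_norm_sq(vs), vol_sq)
\end{verbatim}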
Projections We will use the idea of projection quite often, so it is worth first mentioning a few ideas about it.
To project an object along some direction, the easiest way is to drop the components that are parallel to it, since all those components will be zero after the projection. Components of an exterior product can be considered as projections. For an exterior product of \(m\) vectors in an \(n\)-dimensional space, it is not hard to see that there are \(^nC_m\) components in total (for instance, a product of two vectors in four dimensions has \(^4C_2 = 6\) components). Each of them involves \(m\) base vectors, and may be considered as the projection of the object onto the subspace spanned by those base vectors, i.e. the result of dropping all the other directions.
We will use mathematical induction. Suppose that in a vector space of dimension \((n - 1)\), equation \eqref{norm_vol} always holds. We are going to show that the same equation holds in an \(n\)-dimensional vector space.
When there are \(n\) vectors, there is only one component, the determinant, in the exterior product, and it equals the hypervolume \(V(\ve{a}_1, \ve{a}_2, \ldots, \ve{a}_n)\), so the relation \eqref{norm_vol} is obviously true.
When there are fewer than \(n\) vectors, suppose there are \(m\) vectors, \(\ve{a}_1\) to \(\ve{a}_m\), where \(m\) is less than \(n\). Given one of the standard unit vectors, \(\ve{e}\), each vector can be resolved into the form \(\ve{a}'_i + a_i \ve{e}\), where \(\ve{a}'_i\) is a vector with no \(\ve{e}\) component; then, the left-hand side of \eqref{norm_vol} can be rewritten as \[ \norm{\ve{a}'_1 \wedge \ve{a}'_2 \wedge \ldots \wedge \ve{a}'_m + \omega \wedge \ve{e}}^2, \] where \(\omega\) collects all the terms with exactly one factor of \(\ve{e}\), and is itself a combination of exterior products of \((m - 1)\) of the primed vectors. Since the two terms have no components in common, the squared norm splits, and we have \begin{equation} \label{two_proj} \norm{\ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m}^2 = \norm{\ve{a}'_1 \wedge \ve{a}'_2 \wedge \ldots \wedge \ve{a}'_m}^2 + \norm{\omega \wedge \ve{e}}^2. \end{equation}
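For instance, with \(m = 2\) in three dimensions and \(\ve{e} = \ve{k}\), \[ (\ve{a}'_1 + a_1 \ve{k}) \wedge (\ve{a}'_2 + a_2 \ve{k}) = \ve{a}'_1 \wedge \ve{a}'_2 + (a_2 \ve{a}'_1 - a_1 \ve{a}'_2) \wedge \ve{k}, \] so \(\omega = a_2 \ve{a}'_1 - a_1 \ve{a}'_2\) here; the first term has only an \(\ve{i} \wedge \ve{j}\) component, while the second has only \(\ve{i} \wedge \ve{k}\) and \(\ve{j} \wedge \ve{k}\) components, which is why the squared norms add with no cross terms.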
The expression \(\ve{a}'_1 \wedge \ve{a}'_2 \wedge \ldots \wedge \ve{a}'_m\) has no \(\ve{e}\) components, so it effectively lives in dimension \((n - 1)\), and after applying our induction hypothesis, we get \begin{equation} \label{proj_h_volume} \norm{\ve{a}'_1 \wedge \ve{a}'_2 \wedge \ldots \wedge \ve{a}'_m}^2 = V^2(\ve{a}'_1, \ve{a}'_2, \ldots, \ve{a}'_m). \end{equation} The vector \(\ve{a}'_i\) can be obtained by dropping the \(\ve{e}\) component of vector \(\ve{a}_i\), which is equivalent to projecting vector \(\ve{a}_i\) onto the space perpendicular to \(\ve{e}\).
Following the same idea, the expression \(\norm{\omega \wedge \ve{e}}^2\) can be considered as the sum of squares of a number of projections. Denote the factor of the \(i\)-th projection as \(k_i\), so that the \(i\)-th projection equals \(k_i V\), and then we have \begin{align*} \norm{\omega \wedge \ve{e}}^2 & = \sum_i \text{projection}_i^2 \\ & = \sum_i (k_i V)^2 \\ & = V^2 \sum_i k_i^2. \end{align*}
If we cut the object into slices, then since the slices are all parallel to each other, the factors \(k_i\) are identical for all of them, so it suffices to find \(\sum k_i^2\) for one specific slice. Consider a slice obtained by two cuts perpendicular to the \(\ve{e}\) direction, with an infinitesimally small distance between them, so it is almost of dimension \((m - 1)\), one dimension lower than the original object. Denote its height in the \(\ve{e}\) direction as \(\delta h\), its width perpendicular to the cuts as \(\delta w\), its \((m - 1)\)-dimensional hypervolume as \(\lambda\), and the projections of \(\lambda\) as \(l_i\). Each projection of the slice that involves \(\ve{e}\) can be written in two ways: as \(k_i \lambda \, \delta w\), the common factor times the hypervolume \(\lambda \, \delta w\) of the slice, and as \(l_i \, \delta h\), the corresponding projection of \(\lambda\) times the height of the slice. The cross-section \(\lambda\) lies in a hyperplane perpendicular to \(\ve{e}\), i.e. in a space of dimension \((n - 1)\), so the induction hypothesis gives \(\sum l_i^2 = \lambda^2\), and therefore \[ \sum_i k_i^2 = \frac{\delta h^2}{\delta w^2} = \sin^2 \theta, \] where \(\theta\) is the angle between the direction of \(\delta w\) within the object and the hyperplane perpendicular to \(\ve{e}\), so that \(\delta h = \delta w \sin \theta\). Substituting back, \begin{equation} \label{sin_2} \norm{\omega \wedge \ve{e}}^2 = V^2 \sin^2 \theta. \end{equation} On the other hand, projecting the object perpendicular to \(\ve{e}\) leaves each cross-section \(\lambda\) unchanged and shrinks the width \(\delta w\) to \(\delta w \cos \theta\), so \(V(\ve{a}'_1, \ve{a}'_2, \ldots, \ve{a}'_m) = V \cos \theta\), and \eqref{proj_h_volume} becomes \begin{equation} \label{cos_2} \norm{\ve{a}'_1 \wedge \ve{a}'_2 \wedge \ldots \wedge \ve{a}'_m}^2 = V^2 \cos^2 \theta. \end{equation}
Finally, adding \eqref{cos_2} and \eqref{sin_2} and substituting into \eqref{two_proj}, we obtain \[ \norm{\ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m}^2 = V^2(\ve{a}_1, \ve{a}_2, \ldots, \ve{a}_m). \] Since \eqref{norm_vol} clearly holds in the lowest dimension \(n = m\), where the exterior product has the single determinant component, mathematical induction shows that \eqref{norm_vol} is true in all dimensions.
■
We are now able to write the distance formulas in a more general way. For an object with a vector equation \[ \ve{r} = \ve{a} + \sum_{i = 1}^{m} s_i \ve{a}_i, \] denote \((\ve{a}_1 \wedge \ve{a}_2 \wedge \ldots \wedge \ve{a}_m)\) as \(\alpha\), so the equation can be rewritten equivalently in wedge notation as \[ \alpha \wedge \ve{r} = \alpha \wedge \ve{a}, \] and the distance between a point with position vector \(\ve{r}\) and the object is \[ \frac{\norm{\alpha \wedge (\ve{r} - \ve{a})}}{\norm{\alpha}}. \]
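To make the generalization concrete, here is a small computational sketch of the final formula, in the same spirit as the earlier checks (again our own code; the function names are ours, not anything defined by the article):
\begin{verbatim}
import numpy as np
from itertools import combinations

def wedge_norm_sq(vectors):
    # Same helper as before: sum of squared components of the wedge.
    A = np.column_stack(vectors)  # n x m
    n, m = A.shape
    return sum(np.linalg.det(A[list(rows), :]) ** 2
               for rows in combinations(range(n), m))

def distance(r, a, spans):
    # Distance from the point r to the object r = a + sum_i s_i a_i,
    # computed as norm(alpha ^ (r - a)) / norm(alpha).
    diff = np.asarray(r) - np.asarray(a)
    return np.sqrt(wedge_norm_sq(list(spans) + [diff])
                   / wedge_norm_sq(list(spans)))

# Sanity check against the classic point-to-line formula in 3D:
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
r = np.array([1.0, 0.0, 2.0])
assert np.isclose(distance(r, a, [b]),
                  np.linalg.norm(np.cross(b, r - a)) / np.linalg.norm(b))
\end{verbatim}
The same \texttt{distance} call handles a plane, or any higher-dimensional flat object in any ambient dimension, simply by passing more spanning vectors.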