Generalization
We begin with the most concrete form of vector spaces, one that is closely in tune with what we learned when we were first introduced to two- and three-dimensional vectors using real numbers as scalars. However, we have seen that the complex numbers are a perfectly legitimate and useful field of numbers to work with. Therefore, our concept of a vector space must include the selection of a field of scalars. The requirements for such a field are that it have binary operations of addition and multiplication that satisfy the usual arithmetic laws: both operations are closed, commutative, and associative; both have identities and satisfy the distributive laws; and there exist additive inverses for all elements and multiplicative inverses for nonzero elements. Although other fields are possible, for our purposes the only fields of scalars are F = R and F = C. Unless there is some indication to the contrary, the field of scalars will be assumed to be the default, the real numbers R. However, it should be noted that there are other fields of importance, such as Q, the field of rational numbers, or the finite field F_p of integers modulo p, where p is a prime number. The latter has significant applications in coding theory and cryptology.
A formal definition of vector space will come later. For now we describe a "vector space" over a field of scalars F as a nonempty set V of vectors of the same size, together with the binary operations of scalar multiplication and vector addition, subject to the following laws: for all vectors u, v ∈ V and scalars a ∈ F,
(a) (Closure of vector addition) u + v ∈ V.
(b) (Closure of scalar multiplication) av ∈ V.
For vectors u, v, we define vector negatives and subtraction by
−u = (−1)u and u − v = u + (−v).
Very simple examples are R2 and R3, which we discuss below.
Another is any line through the origin in R2, which takes the form V = {c(x0, y0) | c ∈ R}.
Geometrical vector spaces. We may have already seen the vector idea in geometry or calculus. In those contexts, a vector was supposed to represent a direction and a magnitude in two- or three-dimensional space, which is not the same thing as a point, that is, a location in space. At first, one had to deal with these intuitive definitions until they could be turned into something more explicitly computational, namely the displacements of a vector in coordinate directions. This led to the following two vector spaces over the field of real numbers:
R2 = {(x, y) | x, y ∈ R}, R3 = {(x, y, z) | x, y, z ∈ R}.
The distinction between vector spaces and points becomes a little hazy here.
Once we have set up a coordinate system, we can identify each point in two- or three-dimensional space with its coordinates, which we write in the form of a tuple, i.e., a vector. The arithmetic of these two vector spaces is just the standard coordinatewise vector addition and scalar multiplication. One can visualize the direction represented by a vector (x, y) by drawing an arrow, i.e., a directed line segment, from the origin of the coordinate system to the point with coordinates (x, y). The magnitude of this vector is the length of the arrow, which is just √(x² + y²). The arrows that we draw only represent the vector we are thinking of. More than one arrow could represent the same vector, as in Figure 3.1. The definitions of vector arithmetic can be represented geometrically too. For example, to get the sum of vectors u and v, one places a representative of vector u in the plane, then places a representative of v whose tail is at the head of u, and the vector u + v is then represented by the third leg of this triangle, with base at the base of u. To get a scalar multiple of a vector w, one scales w in accordance with the coefficient. See Figure 3.1. Though instructive, this version of vector addition is not practical for calculations.
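The coordinatewise arithmetic and the magnitude formula can be sketched in a few lines of Python (a minimal illustration using only the standard library; the helper names `vec_add`, `vec_scale`, and `magnitude` are ours, not standard):

```python
import math

def vec_add(u, v):
    """Coordinatewise vector addition in R^2 or R^3."""
    return tuple(a + b for a, b in zip(u, v))

def vec_scale(c, v):
    """Scalar multiplication: scale every coordinate by c."""
    return tuple(c * a for a in v)

def magnitude(v):
    """Length of the arrow representing v: sqrt of the sum of squared coordinates."""
    return math.sqrt(sum(a * a for a in v))

u, v = (1.0, 2.0), (3.0, 1.0)
print(vec_add(u, v))             # (4.0, 3.0)
print(vec_scale(2, (1.0, 1.5)))  # (2.0, 3.0)
print(magnitude((3.0, 4.0)))     # 5.0
```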
Fig. 3.1: Displacement vectors and graphical vector operations.
As a practical matter, it is also convenient to draw directed line segments connecting points; such a vector is called a displacement vector. For example, see Figure 3.1 for representatives of a displacement vector w = −−→PQ from the point P with coordinates (1, 2) to the point Q with coordinates (3, 3). One of the first nice outcomes of vector arithmetic is that this displacement vector can be deduced from a simple calculation,
w = (3, 3) − (1, 2) = (3 − 1, 3 − 2) = (2, 1).
A displacement vector of the form w = −−→OR, where O is the origin, is called a position vector.
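The displacement-vector calculation w = Q − P is itself a one-line computation. A small Python sketch (the helper name `displacement` is ours):

```python
def displacement(P, Q):
    """Displacement vector from point P to point Q: Q - P, coordinatewise."""
    return tuple(q - p for p, q in zip(P, Q))

P, Q = (1, 2), (3, 3)
w = displacement(P, Q)
print(w)  # (2, 1), as in the calculation above

# A position vector is the displacement from the origin O:
O = (0, 0)
print(displacement(O, Q))  # (3, 3)
```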
Geometrical vector spaces look a lot like the objects we studied in Chapter 2 with the tuple notation as a shorthand for column vectors. The arithmetic of R2 and R3 is the same as the standard arithmetic for column vectors. Now, even though we can't draw real geometrical pictures of vectors with four or more coordinates, we have seen that larger vectors are useful in our search for solutions of linear systems. So the question presents itself: why stop at three? The answer is that we won't! We will use the familiar pictures of R2 and R3 to guide our intuition about vectors in higher-dimensional spaces, which we now define.
Definition 3.1 (Standard Real Vector Space). The standard vector space of dimension n, where n is a positive integer, over the reals is the set of vectors
Rn = {(x1, x2, . . . , xn) | x1, x2, . . . , xn ∈ R}
together with the standard vector addition and scalar multiplication. (Recall that (x1, x2, . . . , xn) is shorthand for the column vector [x1, x2, . . . , xn]T.)
We see immediately from the definition that the required closure properties of vector addition and scalar multiplication hold, so these really are vector spaces in the sense defined above. The standard real vector spaces are often called the real Euclidean vector spaces once the notion of a norm (a notion of length covered in the next chapter) is attached to them.
Homogeneous vector spaces. Graphics specialists and others find it important to distinguish between geometrical vectors and points (locations) in three-dimensional space. They want to be able to simultaneously manipulate these two kinds of objects, in particular, to do vector arithmetic and operator manipulation that reduces to the ordinary vector arithmetic when applied to geometrical vectors.
Here's the idea that neatly does the trick: set up a coordinate system and identify geometrical vectors in the usual way, that is, by their coordinates x1, x2, x3. Do the same with geometrical points. To distinguish between the two, embed them as vectors x = (x1, x2, x3, x4) ∈ R4 with the understanding that if x4 = 0, then x represents a geometrical vector, and if x4 ≠ 0, then x represents a geometrical point. The vector x is called a homogeneous vector, and R4 with the standard vector operations is called homogeneous space. If x4 ≠ 0, then x represents a point whose coordinates are x1/x4, x2/x4, x3/x4, and this point is said to be obtained from the vector x by normalizing the vector. Notice that the line through the origin that passes through the point P = (x1, x2, x3, 1) consists of vectors of the form (tx1, tx2, tx3, t), where t is any real number. Conversely, any such nonzero vector (t ≠ 0) is normalized to (tx1/t, tx2/t, tx3/t, t/t) = P. In this way, such lines through the origin correspond to points. (Readers who have seen projective spaces before may recognize this correspondence as identifying finite points in projective space with lines through the origin in R4. The ideas of homogeneous space actually originate in projective geometry.)
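Normalization is easy to express in code. The following Python sketch (helper names are illustrative, not from the text) shows that every nonzero multiple (tx1, tx2, tx3, t), t ≠ 0, of a point normalizes back to the same point:

```python
def normalize(x):
    """Normalize a homogeneous vector (x1, x2, x3, x4) with x4 != 0
    to the point (x1/x4, x2/x4, x3/x4, 1)."""
    x1, x2, x3, x4 = x
    if x4 == 0:
        raise ValueError("x4 == 0: a geometrical vector, not a point")
    return (x1 / x4, x2 / x4, x3 / x4, 1.0)

# Two nonzero multiples of the same point normalize identically,
# reflecting the line-through-the-origin picture:
print(normalize((2, 4, 6, 2)))  # (1.0, 2.0, 3.0, 1.0)
print(normalize((3, 6, 9, 3)))  # (1.0, 2.0, 3.0, 1.0)
```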
Now the standard vector arithmetic for R4 allows us to do arithmetic on geometrical vectors, for if x = (x1, x2, x3, 0) and y = (y1, y2, y3, 0) are such vectors, then as elements of R4 we have
x + y = (x1, x2, x3, 0) + (y1, y2, y3, 0) = (x1 + y1, x2 + y2, x3 + y3, 0),
cx = c(x1, x2, x3, 0) = (cx1, cx2, cx3, 0),
which result in geometrical vectors.
Example 3.1. Interpret the result of adding a point and a vector in homogeneous space.
Solution. Notice that we can't add two points and obtain a point without some extra normalization; however, addition of a point x = (x1, x2, x3, 1) and a vector y = (y1, y2, y3, 0) yields
x + y = (x1, x2, x3, 1) + (y1, y2, y3, 0) = (x1 + y1, x2 + y2, x3 + y3, 1).
This has a rather elegant interpretation as the translation of the point x by the vector y to another point x + y. It reinforces the idea that geometrical vectors are simply displacements from one point to another.
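This point-plus-vector translation can be checked numerically. A small Python sketch (names are illustrative):

```python
def homog_add(x, y):
    """Standard vector addition in R^4, applied to homogeneous vectors."""
    return tuple(a + b for a, b in zip(x, y))

point = (1, 2, 3, 1)  # fourth coordinate 1: a point
vec = (0, 0, 5, 0)    # fourth coordinate 0: a geometrical vector

# Point + vector is again a point (fourth coordinate stays 1):
# the original point translated by the vector.
print(homog_add(point, vec))  # (1, 2, 8, 1)

# Point + point has fourth coordinate 2 -- not a point without renormalizing:
print(homog_add(point, point))  # (2, 4, 6, 2)
```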
We can't draw pictures of R4, of course. But we can get an intuitive feeling for how homogenization works by moving down one dimension. Regard R3 as homogeneous space for the plane that consists of points (x1, x2, 1). Figure 3.2 illustrates this idea.
Fig. 3.2: Homogeneous space for planar points and vectors.
As in Chapter 2, we don't have to stop at the reals. For those situations in which we want to use complex numbers, we have the following vector spaces:
Definition 3.2 (Standard Complex Vector Space). The standard vector space of dimension n, where n is a positive integer, over the complex numbers is the set of vectors
Cn = {(x1, x2, . . . , xn) | x1, x2, . . . , xn ∈ C}
together with the standard vector addition and scalar multiplication.
The standard complex vector spaces are also sometimes called Euclidean spaces. It’s rather difficult to draw honest spatial pictures of complex vectors.
The space C1 isn't too bad: complex numbers can be identified with points in the complex plane. What about C2? Where can we put (1 + 2i, 3 − i)? It seems that we need four real coordinates, namely the real and imaginary parts of two independent complex numbers, to keep track of the point. This is too big to fit in real three-dimensional space, where we have only three independent coordinates. We don't let this technicality deter us. We can still draw fake vector pictures of elements of C2 to help our intuition, but do the algebra of vectors exactly from the definition.
Example 3.2. Find the displacement vector from the point P with coordinates (1 + 2i, 1 − 2i) to the point Q with coordinates (3 + i, 2i).
Solution. We compute
−−→PQ = (3 + i, 2i) − (1 + 2i, 1 − 2i)
= (3 + i − (1 + 2i), 2i − (1 − 2i))
= (2 − i, −1 + 4i).
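Python's built-in complex type lets us reproduce this computation directly (an illustrative check, not part of the text's formal development; Python writes the imaginary unit as `j`):

```python
# Points in C^2 as tuples of Python complex numbers.
P = (1 + 2j, 1 - 2j)
Q = (3 + 1j, 2j)

# Displacement vector from P to Q, coordinatewise.
PQ = tuple(q - p for p, q in zip(P, Q))
print(PQ)  # ((2-1j), (-1+4j))
```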
Abstraction
We can see hints of a problem with the coordinate way of thinking about geometrical vectors. Suppose the vector in question represents a force. In one set of coordinates the force might have coordinates (1, 0, 1). In another, it could have coordinates (0, 1, 1). Yet the force doesn't change, only its representation. This suggests an idea: why not think about geometrical vectors as independent of any coordinate representation? From this perspective, geometrical vectors are really more abstract than the row or column vectors we have studied so far.
This line of thought leads us to consider an abstraction of our concept of vector space. First we have to identify the essential vector space properties, enough to make the resulting structure rich, but not so much that it is tied down to an overly specific form. We saw in Chapter 2 that many laws hold for the standard vector spaces. The essential laws were summarized in Section 2.1.
These laws become the basis for our definition of an abstract vector space.
About notation: just as in matrix arithmetic, for vectors u, v, we understand that u − v = u + (−v). We also suppress the dot (·) of scalar multiplication and usually write au instead of a · u.
Abstract Vector Space. An (abstract) vector space is a nonempty set V of elements called vectors, together with operations of vector addition (+) and scalar multiplication (·), such that the following laws hold for all vectors u, v, w ∈ V and scalars a, b ∈ F:
(1) (Closure of vector addition) u + v ∈ V.
(2) (Commutativity of addition) u + v = v + u.
(3) (Associativity of addition) u + (v + w) = (u + v) + w.
(4) (Additive identity) There exists an element 0 ∈ V such that u + 0 = u = 0 + u.
(5) (Additive inverse) There exists an element −u ∈ V such that u + (−u) = 0 = (−u) + u.
(6) (Closure of scalar multiplication) a · u ∈ V.
(7) (Distributive law) a · (u + v) = a · u + a · v.
(8) (Distributive law) (a + b) · u = a · u + b · u.
(9) (Associative law) (ab) · u = a · (b · u).
(10) (Monoidal law) 1 · u = u.
Examples of these abstract vector spaces are the standard spaces just introduced, and these will be our main focus in this section. Yet, if we squint a bit, we can see vector spaces everywhere. There are other, entirely nonstandard examples that make the abstraction worthwhile. Here are just a few such examples. Our first example is closely related to the standard spaces, though strictly speaking it is not one of them. It blurs the distinction between matrices and vectors in Chapter 2, since it makes matrices into "vectors" in the abstract sense of the preceding definition.
Example 3.3. Let Rm,n denote the set of all m × n matrices with real entries. Show that this set, with the standard matrix addition and scalar multiplication, forms a vector space.
Solution. We know that any two matrices of the same size can be added to yield a matrix of that size. Likewise, a scalar times a matrix yields a matrix of the same size. Thus, the operations of matrix addition and scalar multiplication are closed. Indeed, these laws and all the other vector space laws are summarized in the laws of matrix addition and scalar multiplication of page 70.
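The closure claims can be illustrated concretely. A minimal Python sketch with matrices as nested lists (the helper names `mat_add` and `mat_scale` are ours):

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (nested lists)."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1, 0], [1, 0, 1]]
print(mat_add(A, B))    # [[1, 3, 3], [5, 5, 7]] -- again a 2x3 matrix
print(mat_scale(2, A))  # [[2, 4, 6], [8, 10, 12]] -- again a 2x3 matrix
```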
The next example is important in many areas of higher mathematics and is quite different from the standard vector spaces. Yet it is a perfectly legitimate vector space. All the same, at first it seems odd to think of functions as
“vectors” even though this is meant in the abstract sense.
Example 3.4. Let C[0,1] denote the set of all real-valued functions that are continuous on the interval [0, 1], and use the standard function addition and scalar multiplication for these functions. That is, for f(x), g(x) ∈ C[0,1] and real number c, we define the functions f + g and cf by
(f + g)(x) = f(x) + g(x) and (cf)(x) = c(f(x)).
Show that C[0,1] with these operations is a vector space.
Solution. We set V = C[0,1] and check the vector space axioms for this V. For the rest of this example, we let f, g, h be arbitrary elements of V. We know from calculus that the sum of any two continuous functions is continuous and that any constant times a continuous function is also continuous. Therefore, the closure of addition and that of scalar multiplication hold. Now for all x such that 0 ≤ x ≤ 1, we have from the definition and the commutative law of real number addition that
(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).
Since this holds for all x, we conclude that f + g = g + f, which is the commutative law of vector addition. Similarly,
((f + g) + h)(x) = (f + g)(x) + h(x) = (f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x)) = (f + (g + h))(x).
Since this holds for all x, we conclude that (f + g) + h = f + (g + h), which is the associative law for addition of vectors.
Next, if 0 denotes the constant function with value 0, then for any f ∈ V we have that for all 0 ≤ x ≤ 1,
(f + 0)(x) = f(x) + 0 = f(x).
(We don't write the zero element of this vector space in boldface because it's customary not to write functions in bold.) Since this is true for all x, we have that f + 0 = f, which establishes the additive identity law. Also, we define (−f)(x) = −(f(x)), so that for all 0 ≤ x ≤ 1,
(f + (−f))(x) = f(x) − f(x) = 0,
from which we see that f + (−f) = 0. The additive inverse law follows. For the distributive laws, note that for real numbers a, b and continuous functions f, g ∈ V, we have that for all 0 ≤ x ≤ 1,
a(f + g)(x) = a(f(x) + g(x)) = af(x) + ag(x) = (af + ag)(x),
which proves the first distributive law. For the second distributive law, note that for all 0 ≤ x ≤ 1,
((a + b)g)(x) = (a + b)g(x) = ag(x) + bg(x) = (ag + bg)(x),
and the second distributive law follows. For the scalar associative law, observe that for all 0 ≤ x ≤ 1,
((ab)f)(x) = (ab)f(x) = a(bf(x)) = (a(bf))(x),
so that (ab)f = a(bf), as required. Finally, we see that
(1f)(x) = 1f(x) = f(x),
from which we have the monoidal law 1f = f. Thus, C[0,1] with the prescribed operations is a vector space.
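The "functions as vectors" idea translates naturally into code, since functions are first-class values in many languages. A minimal Python sketch of the pointwise operations (the helper names `f_add` and `f_scale` are ours):

```python
import math

def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(c, f):
    """Pointwise scalar multiple: (cf)(x) = c * f(x)."""
    return lambda x: c * f(x)

f = math.sin
g = lambda x: x * x

# h = f + 2g, a new "vector" in the function space:
h = f_add(f, f_scale(2.0, g))
print(h(0.5))  # sin(0.5) + 2*(0.5)**2
```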
The preceding example could just as well have been C[a, b], the set of all continuous functions on the interval a ≤ x ≤ b, where a < b. Indeed, most of what we say about C[0,1] is equally applicable to the more general space C[a, b]. We usually stick to the interval 0 ≤ x ≤ 1 for simplicity. The next example is also based on the "functions as vectors" idea.
Example 3.5. One of the two sets V = {f(x) ∈ C[0,1] | f(1/2) = 0} and W = {f(x) ∈ C[0,1] | f(1/2) = 1}, with the operations of function addition and scalar multiplication as in Example 3.4, forms a vector space over the reals, while the other does not. Determine which.
Solution. Notice that we don't have to check the commutativity of addition, associativity of addition, distributive laws, associative law, or monoidal law. The reason is that we already know from the previous example that these laws hold when the operations of the space C[0,1] are applied to any elements of C[0,1], whether they belong to V or W or not. So the only laws to be checked are the closure laws and the identity laws.
First let f(x), g(x) ∈ V and let c be a scalar. By definition of the set V we have that f(1/2) = 0 and g(1/2) = 0. Adding these equations together, we obtain
(f + g)(1/2) = f(1/2) + g(1/2) = 0 + 0 = 0.
It follows that V is closed under addition with these operations. Furthermore, if we multiply the identity f(1/2) = 0 by the real number c, we obtain
(cf)(1/2) = c · f(1/2) = c · 0 = 0.
It follows that V is closed under scalar multiplication. Now the zero function definitely belongs to V, since this function has value 0 at any argument.
Therefore, V contains an additive identity element. Finally, we observe that the negative of a function f(x) ∈ V is also an element of V, since
(−f)(1/2) = −1 · f(1/2) = −1 · 0 = 0.
Therefore, the set V with these operations satisfies all the vector space laws and is an (abstract) vector space in its own right.
When we examine the set W in a similar fashion, we run into a roadblock at the closure of addition law. If f(x), g(x) ∈ W, then by definition of the set W we have that f(1/2) = 1 and g(1/2) = 1. Adding these equations together, we obtain
(f + g)(1/2) = f(1/2) + g(1/2) = 1 + 1 = 2.
This means that f + g is not in W, so the closure of addition fails. We need go no further. If only one of the vector space axioms fails, then we do not have a vector space. Hence, W with these operations is not a vector space.
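We can watch the closure test succeed for V and fail for W numerically. A small Python sketch (the sample functions are ours, chosen so that each satisfies f(1/2) = 0 or f(1/2) = 1):

```python
# Sample elements of V (value 0 at 1/2) and of W (value 1 at 1/2):
v1 = lambda x: x - 0.5
v2 = lambda x: (x - 0.5) ** 2
w1 = lambda x: 2 * x   # w1(1/2) = 1
w2 = lambda x: 1.0     # w2(1/2) = 1

add = lambda f, g: (lambda x: f(x) + g(x))

print(add(v1, v2)(0.5))  # 0.0 -- the sum still vanishes at 1/2, so it stays in V
print(add(w1, w2)(0.5))  # 2.0 -- the sum has value 2, not 1, so it leaves W
```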
There is a certain economy in this example. A number of laws did not need to be checked by virtue of the fact that the sets in question were subsets of existing vector spaces with the same vector operations. Here are two more examples that utilize this economy.
Example 3.6. Show that the set P2 of all polynomial functions of degree at most two, with the standard function addition and scalar multiplication, forms a vector space.
Solution. Polynomial functions are continuous functions. As in the preceding example, we don't have to check the commutativity of addition, associativity of addition, distributive laws, associative law, or monoidal law, since we know that these laws hold for all continuous functions. Let f, g ∈ P2, say f(x) = a1 + b1x + c1x² and g(x) = a2 + b2x + c2x². Let c be any scalar. Then we have both
(f + g)(x) = f(x) + g(x) = (a1 + a2) + (b1 + b2)x + (c1 + c2)x² ∈ P2
and
(cf)(x) = cf(x) = c(a1 + b1x + c1x²) = ca1 + cb1x + cc1x² ∈ P2.
Hence, P2 is closed under the operations of function addition and scalar multiplication. Furthermore, the zero function is a constant, hence a polynomial of degree at most two. Also, the negative of a polynomial of degree at most two is also a polynomial of degree at most two. So all of the laws for a vector space are satisfied and P2 is an (abstract) vector space.
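Since a polynomial of degree at most two is determined by its three coefficients, the closure computations above amount to coordinatewise arithmetic on coefficient triples. A minimal Python sketch (the coefficient-triple representation and helper names are ours):

```python
# Represent p(x) = a + b*x + c*x^2 by its coefficient triple (a, b, c).
def poly_add(p, q):
    """Sum of two polynomials: add coefficients coordinatewise."""
    return tuple(a + b for a, b in zip(p, q))

def poly_scale(c, p):
    """Scalar multiple of a polynomial: scale every coefficient."""
    return tuple(c * a for a in p)

f = (1, 2, 3)    # 1 + 2x + 3x^2
g = (0, -2, 1)   # -2x + x^2
print(poly_add(f, g))    # (1, 0, 4): still three coefficients, degree <= 2
print(poly_scale(3, f))  # (3, 6, 9)
```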
Example 3.7. Show that the set Sn of all n × n real symmetric matrices, with the standard matrix addition and scalar multiplication, forms a vector space.
Solution. Just as in the preceding example, we don't have to check the commutativity of addition, associativity of addition, distributive laws, associative law, or monoidal law, since we know that these laws hold for any matrices, symmetric or not. Now let A, B ∈ Sn. This means by definition that A = AT and B = BT. Let c be any scalar. Then we have both
(A + B)T = AT + BT = A + B
and
(cA)T = cAT = cA.
It follows that the set Sn is closed under the operations of matrix addition and scalar multiplication. Furthermore, the zero n × n matrix is clearly symmetric, so the set Sn has an additive identity element. Finally, (−A)T = −AT = −A, so each element of Sn has an additive inverse as well. Therefore, all of the laws for a vector space are satisfied, so Sn together with these operations is an (abstract) vector space.
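The symmetry computations can be checked directly. A small Python sketch with matrices as nested lists (helper names are illustrative):

```python
def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]

def is_symmetric(A):
    """A is symmetric exactly when A equals its transpose."""
    return A == transpose(A)

A = [[1, 2], [2, 3]]
B = [[0, 5], [5, 1]]

# The entrywise sum of two symmetric matrices is again symmetric:
S = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
print(S)                # [[1, 7], [7, 4]]
print(is_symmetric(S))  # True
```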
One of the virtues of abstraction is that it allows us to cover many cases with one statement. For example, there are many simple facts that are deducible from the vector space laws alone. With the standard vector spaces, these facts seem transparently clear. For abstract spaces, the situation is not quite so obvious. Here are a few examples of what can be deduced from the definition.
Example 3.8. Let v ∈ V, a vector space, and let 0 be the zero vector. Deduce from the vector space properties alone that 0v = 0.
Solution. Multiply both sides of the scalar identity 0 + 0 = 0 on the right by the vector v to obtain
(0 + 0)v = 0v.
Now use the distributive law to obtain
0v + 0v = 0v.
Next add −(0v) to both sides (remember, we don't know it's 0 yet), use the associative law of addition to regroup, and obtain
0v + (0v + (−0v)) = 0v + (−0v).
Now use the additive inverse law to obtain 0v + 0 = 0. Finally, use the identity law to obtain
0v = 0,
which is what we wanted to show.
Example 3.9. Show that the vector space V has only one zero element.
Solution. Suppose that both 0 and 0∗ act as zero elements in the vector space. Use the additive identity property of 0 to obtain 0∗ + 0 = 0∗, while the additive identity property of 0∗ implies that 0 + 0∗ = 0. By the commutative law of addition, 0∗ + 0 = 0 + 0∗. It follows that 0∗ = 0, whence there can be only one zero element.
There are several other such arithmetic facts that we want to identify, along with the one of this example. In the case of standard vectors, these facts are obvious, but for abstract vector spaces, they require a proof similar to the one we have just given. We leave these as exercises.