Affine Transformations - Mathematical Association of America

12

Affine Transformations

Chaotic features of the World erase And you will see its Beauty.

-- Alexander A. Block (1880–1921)¹

12.1 Introduction

Suppose we are struggling with a geometric problem concerning an arbitrary triangle or an arbitrary parallelogram. How often we would wish for the triangle to be an equilateral or 45-90-45 triangle, or for the parallelogram to be a square! The solution is so easy in these cases. But we know that these would be just very particular instances of the problem. Solving them will make us feel better, but not much better. Well, the good news is that for some problems, solving just a particular instance turns out to be sufficient to claim that the problem is solved in complete generality! In this chapter we learn how to recognize some of these problems, and we justify such an approach.

We start by reviewing some familiar concepts. Let $A$ and $B$ be sets. A function or mapping $f$ from $A$ to $B$, denoted $f : A \to B$, is a set of ordered pairs $(a, b)$, where $a \in A$ and $b \in B$, with the following property: for every $a \in A$ there exists a unique $b \in B$ such that $(a, b) \in f$. The fact that $(a, b) \in f$ is usually denoted by $f(a) = b$, and we say that $f$ maps $a$ to $b$. Another way to denote that $f$ maps $a$ to $b$ is $f : a \mapsto b$; if it is clear which function is being discussed, we will often just write $a \mapsto b$. We also say that $b$ is the image of $a$ (in $f$), and that $a$ is a preimage of $b$ (in $f$). The set $A$ is called the domain of $f$ and the set $B$ is the codomain of $f$. The set $f(A) = \{f(a) : a \in A\}$ is a subset of $B$, called the range of $f$.

A function $f : A \to B$ is surjective (or onto) if $f(A) = B$; that is, $f$ is surjective if every element of $B$ is the image of at least one element of $A$. A function $f : A \to B$ is injective (or one-to-one) if each element in the range of $f$ is the image of exactly one element of $A$; that is, $f$ is injective if $f(x) = f(y)$ implies $x = y$. A function $f : A \to B$ is bijective if it is both surjective and injective.
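For finite sets, these definitions can be checked mechanically. The following Python sketch is only an illustration: it stores a function as a dictionary of pairs, and the helper names (`is_injective`, `is_surjective`, `is_bijective`) and the sample sets are our own assumptions, not notation from the text.

```python
def is_injective(f):
    """f is a dict mapping each a in A to f(a); injective if no image repeats."""
    images = list(f.values())
    return len(images) == len(set(images))


def is_surjective(f, codomain):
    """Surjective if the range f(A) equals the whole codomain B."""
    return set(f.values()) == set(codomain)


def is_bijective(f, codomain):
    """Bijective means both injective and surjective."""
    return is_injective(f) and is_surjective(f, codomain)


# Hypothetical example: A = {1, 2, 3}, B = {"p", "q", "r", "s"}
f = {1: "p", 2: "q", 3: "r"}
B = {"p", "q", "r", "s"}

print(is_injective(f))      # distinct inputs have distinct images
print(is_surjective(f, B))  # "s" has no preimage, so f is not onto B
```

Shrinking the codomain to the range of `f` turns this sample function into a bijection, which matches the remark that the range is a subset of the codomain.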

1 Translated from the Russian by Vera Zubareva.

FIGURE 12.1. Mappings from $A$ to $B$: (i) $f$ is injective; (ii) $g$ is surjective; (iii) $h$ is bijective.

If $f : A \to B$ and $g : B \to C$ are functions, then the composition of $f$ and $g$, denoted $g \circ f$, is a function from $A$ to $C$ such that $(g \circ f)(a) = g(f(a))$ for any $a \in A$. The proof of Theorem 12.1 is left to the reader and can be found in many texts.

Theorem 12.1. A composition of two bijections is a bijection.

If $f : A \to B$, then $f^{-1} : B \to A$ is the inverse of $f$ if $(f^{-1} \circ f)(a) = a$ for any $a \in A$ and $(f \circ f^{-1})(b) = b$ for any $b \in B$. A function $f$ has an inverse if and only if $f$ is a bijection.
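Composition and inversion can be tried out on small finite examples. The sketch below is an illustration under our own assumptions: functions are dictionaries, the codomain of each sample function is taken to be exactly its set of values, and the names `compose` and `inverse` are hypothetical helpers.

```python
def compose(g, f):
    """Return g after f as a dict, so that compose(g, f)[a] == g[f[a]]."""
    return {a: g[f[a]] for a in f}


def inverse(f):
    """Invert a dict-based function; fails if two inputs share an image."""
    inv = {}
    for a, b in f.items():
        if b in inv:
            raise ValueError("function is not injective, so it has no inverse")
        inv[b] = a
    return inv


# Hypothetical bijections f : A -> B and g : B -> C
f = {1: "p", 2: "q", 3: "r"}
g = {"p": 30, "q": 10, "r": 20}

gf = compose(g, f)      # the composition, a function from A to C
f_inv = inverse(f)      # the inverse, a function from B to A

print(gf)
print(compose(f_inv, f))   # identity on A
print(compose(f, f_inv))   # identity on B
```

In this example the composition of the two bijections is again a bijection, in line with Theorem 12.1, although a single example is of course not a proof.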

Let $E^2$ denote the Euclidean plane. Introducing a coordinate system² $OXY$ on $E^2$, we can identify every point $P$ with the ordered pair of its coordinates $(x_P, y_P)$; alternatively, $P$ can be identified with its position vector, $\overrightarrow{OP} = \langle x_P, y_P \rangle$. The collection of all such vectors forms a vector space,³ namely $\mathbb{R}^2$. If $\vec{x}$ represents the vector with initial point at the origin and terminal point at $(x_P, y_P)$, then $\overrightarrow{OP}$, $\langle x_P, y_P \rangle$, and $\mathbf{x}$ can also be used to denote $\vec{x}$.

A transformation of a set is a bijection of the set to itself. It is easy to see that any transformation $f : E^2 \to E^2$ corresponds to a bijection $\tilde{f} : \mathbb{R}^2 \to \mathbb{R}^2$, in that $\tilde{f}(\langle x_P, y_P \rangle) = \langle x_{P'}, y_{P'} \rangle$ whenever $f(P) = P'$. Since $f$ and $\tilde{f}$ uniquely define one another within a fixed coordinate system, we will also refer to $\tilde{f}$ as a transformation of the plane, and we will write $f$ to denote either a mapping of $E^2$ to $E^2$ or a mapping of $\mathbb{R}^2$ to $\mathbb{R}^2$. It will be clear from the context which of the two mappings $f$ represents.

Just as any point $P$ in $OXY$ corresponds to a unique vector $\overrightarrow{OP}$, each figure $\Phi$ in $E^2$ uniquely corresponds to a set of vectors $\overrightarrow{OP}$ of $\mathbb{R}^2$, where $P \in \Phi$. We say that this set of vectors is a figure in $\mathbb{R}^2$, and we denote it again by $\Phi$. The set $f(\Phi)$ is defined as $\{f(P) : P \in \Phi \subseteq E^2\}$, or $\{f(\overrightarrow{OP}) : \overrightarrow{OP} \in \Phi \subseteq \mathbb{R}^2\}$. It is not hard to make the relationship between point spaces and vector spaces more precise, but we will not do it here.⁴ In fact, we freely interchange the representations of point and vector, $(x, y)$ and $\langle x, y \rangle$, when they are domain elements of a function $f$.

Transformations of the plane and their application to solving geometry problems form the focus of this chapter. The transformations we study will be of two types, illustrated by the following examples:

\[
f(\langle x, y \rangle) = \langle 2x - 3y,\; x + y \rangle \quad \text{and} \quad g(\langle x, y \rangle) = \langle 2x - 3y + 1,\; x + y - 4 \rangle.
\]

2 Recall that $OXY$ denotes a coordinate system (not necessarily Cartesian) with axes $OX$ and $OY$.

3 Students who have studied some linear algebra may recall that a vector space is a collection of objects on which an "addition" operation may be performed in such a way that nice properties like commutativity and the existence of additive inverses hold, but a precise definition of vector space is not necessary in order to continue reading.

4 See, for example, [34], [50], or [65] for rigorous expositions.


At this point it is not obvious that $f$ and $g$ are bijections, but this will be verified later in the chapter. To get a more concrete sense of what $f$ and $g$ do, consider how they "transform" the vectors $\langle 0, 0 \rangle$, $\langle 0, 1 \rangle$, $\langle 1, 0 \rangle$, and $\langle 1, 1 \rangle$.

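\[
\begin{array}{c|c|c}
\mathbf{x} & f(\mathbf{x}) & g(\mathbf{x}) \\ \hline
\langle 0, 0 \rangle & \langle 0, 0 \rangle & \langle 1, -4 \rangle \\
\langle 0, 1 \rangle & \langle -3, 1 \rangle & \langle -2, -3 \rangle \\
\langle 1, 0 \rangle & \langle 2, 1 \rangle & \langle 3, -3 \rangle \\
\langle 1, 1 \rangle & \langle -1, 2 \rangle & \langle 0, -2 \rangle
\end{array}
\]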

Notice that the origin, $\mathbf{0}$, is fixed under $f$, while $g(\langle 0, 0 \rangle) = \langle 1, -4 \rangle$. Notice also that $f(\langle 0, 1 \rangle + \langle 1, 0 \rangle) = f(\langle 0, 1 \rangle) + f(\langle 1, 0 \rangle)$; again, this is not true of $g$. These properties of $f$ are indicative of the linearity of that mapping. A function $T : \mathbb{R}^2 \to \mathbb{R}^2$ is called linear if $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y})$ for any vectors $\mathbf{x}$ and $\mathbf{y}$, and $T(k\mathbf{x}) = kT(\mathbf{x})$ for any vector $\mathbf{x}$ and scalar $k$. The reader can verify that these properties hold for $f$ but not for $g$.
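The two linearity conditions can be spot-checked numerically. The Python sketch below is an illustration only: the helper `looks_linear` and the choice of sample vectors and scalars are our own assumptions. Passing the check on finitely many samples does not prove linearity, but a single failure is enough to show that a mapping is not linear.

```python
def f(v):
    x, y = v
    return (2 * x - 3 * y, x + y)


def g(v):
    x, y = v
    return (2 * x - 3 * y + 1, x + y - 4)


def add(u, v):
    """Componentwise sum of two vectors."""
    return (u[0] + v[0], u[1] + v[1])


def scale(k, v):
    """Scalar multiple k*v."""
    return (k * v[0], k * v[1])


def looks_linear(T, samples, scalars):
    """Check T(x + y) = T(x) + T(y) and T(kx) = kT(x) on the samples only."""
    additive = all(T(add(u, v)) == add(T(u), T(v))
                   for u in samples for v in samples)
    homogeneous = all(T(scale(k, v)) == scale(k, T(v))
                      for k in scalars for v in samples)
    return additive and homogeneous


samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
scalars = [-2, 0, 3]

print(looks_linear(f, samples, scalars))  # f passes every sampled check
print(looks_linear(g, samples, scalars))  # g already fails at the origin
```

Integer inputs are used so that the equality tests are exact; with floating-point data a tolerance would be needed.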

As will be shown later in this chapter, both $f$ and $g$ map a line segment to a line segment. Therefore, knowing where $f$ and $g$ map the points corresponding to the vectors $\langle 0, 0 \rangle$, $\langle 0, 1 \rangle$, $\langle 1, 1 \rangle$, and $\langle 1, 0 \rangle$ is sufficient for determining the image of the unit square, $S$, having vertices at these four points. Figure 12.2 shows $S$ together with $f(S)$ and $g(S)$. Notice that both $f(S)$ and $g(S)$ are parallelograms; Theorem 12.7 will prove that this is not a coincidence.

12.2 Matrices

Transformations of $E^2$ or $\mathbb{R}^2$ are often studied via another type of mathematical object, the matrix. Though the benefits of using the language of matrices are not striking when we study $E^2$, matrices turn out to be very convenient when generalizing geometric notions of the plane to spaces of higher dimensions.⁵

FIGURE 12.2. The unit square $S$ with vertices $O$, $(1,0)$, $(1,1)$, and $(0,1)$, together with its images $f(S)$ and $g(S)$.

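An $m \times n$ matrix $A$ is a rectangular array of real numbers,

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]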

The entry in the $i$th row and the $j$th column is denoted $a_{ij}$, and we often write $A = [a_{ij}]$. Two matrices $A = [a_{ij}]$ and $A' = [a'_{ij}]$ are called equal if they have an equal number of rows, an equal number of columns, and $a_{ij} = a'_{ij}$ for all $i$ and $j$. When the matrix is $n \times n$, so that there are an equal number of rows and columns, the matrix is called a square matrix. Notice that a vector $\mathbf{v} = \langle v_1, v_2 \rangle$ can be thought of as the $1 \times 2$ matrix $[v_1 \;\; v_2]$, called a "row vector." It can also be thought of as a "column vector" by writing $\mathbf{v}$ as the $2 \times 1$ matrix $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$.

If $A = [a_{ij}]$ and $B = [b_{ij}]$ are both $m \times n$ matrices, then the sum $A + B$ is the $m \times n$ matrix $C = [c_{ij}]$ in which $c_{ij} = a_{ij} + b_{ij}$. If $A = [a_{ij}]$ is an $m \times n$ matrix and $c \in \mathbb{R}$, then the scalar multiple of $A$ by $c$ is the $m \times n$ matrix $cA = [ca_{ij}]$. (That is, $cA$ is obtained by multiplying each entry of $A$ by $c$.)

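The product $AB$ of two matrices is defined when $A = [a_{ij}]$ is an $m \times n$ matrix and $B = [b_{ij}]$ is an $n \times p$ matrix. Then $AB = [c_{ij}]$, where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$. For example, if $A$ is a $2 \times 2$ matrix, and $B$ is a $2 \times 1$ matrix, then

\[
AB = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} b_{11} \\ b_{21} \end{bmatrix}
= \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} \\ a_{21}b_{11} + a_{22}b_{21} \end{bmatrix}.
\]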

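We say that here we multiply $A$ by a (column) vector. Notice that $BA$ is not defined in this case. If $A$ and $B$ are both $2 \times 2$ matrices,

\[
AB = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
= \begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}
\end{bmatrix}.
\]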

Although $BA$ is defined in this case, in general $BA$ is not equal to $AB$. So matrix multiplication is not commutative. These two instances of matrix multiplication (when $A$ is a $2 \times 2$ matrix and $B$ is a $2 \times 1$ or a $2 \times 2$ matrix) are the only ones we will need in this book. In what follows, no matter whether $\mathbf{x}$ is a $1 \times 2$ vector or $2 \times 1$ vector, when it is used in the expression $A\mathbf{x}$, it is always understood as a column vector, i.e., as a $2 \times 1$ matrix.
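The entry formula for the product translates directly into code. The following Python sketch is an illustration under our own assumptions: matrices are stored as lists of rows, `matmul` is a hypothetical helper name, and the sample entries are chosen only to exhibit a pair with different products in the two orders.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows).

    Entry (i, j) of the result is the sum over k of A[i][k] * B[k][j].
    """
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


# Sample 2 x 2 matrices (illustrative values)
A = [[2, -3], [1, 1]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)        # [[-3, 2], [1, 1]]
print(BA)        # [[1, 1], [2, -3]]
print(AB == BA)  # False: the two products differ

# Multiplying A by a column vector, stored as a 2 x 1 matrix
print(matmul(A, [[1], [1]]))  # [[-1], [2]]
```

Calling `matmul([[1], [1]], A)` raises an error, which mirrors the remark that the product in the reverse order is not defined when one factor is a $2 \times 1$ matrix.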

Theorem 12.2 summarizes some of the most useful properties of matrix operations. Its proof can easily be produced by the reader (part (4) is the most difficult) or may be found in a standard linear algebra text.

5 Here, when we say "language," we mean the objects, their notation, operations on the objects, and properties of those operations, similar to the "languages" of trigonometry, algebra, logic, and calculus.


Theorem 12.2.

(1) If $A$ and $B$ are $m \times n$ matrices, then $A + B = B + A$.

(2) If $A$, $B$, and $C$ are $m \times n$ matrices, then $A + (B + C) = (A + B) + C$.

(3) Given an $m \times n$ matrix $A$, there exists a unique $m \times n$ matrix $B$ such that $A + B = B + A$ is the zero matrix (that is, the matrix with 0 in every entry).

(4) If $A$ is an $m \times n$ matrix, $B$ is an $n \times p$ matrix, and $C$ is a $p \times q$ matrix, then $A(BC) = (AB)C$.

(5) If $A$ and $B$ are $m \times n$ matrices, $C$ is an $n \times p$ matrix, and $D$ is a $q \times m$ matrix, then $(A + B)C = AC + BC$ and $D(A + B) = DA + DB$.

(6) If $r, s \in \mathbb{R}$, $A$ is an $m \times n$ matrix, and $B$ is an $n \times p$ matrix, then
    (a) $r(sA) = (rs)A = s(rA)$, and
    (b) $A(rB) = r(AB)$.

(7) If $r, s \in \mathbb{R}$, and $A$ and $B$ are $m \times n$ matrices, then
    (a) $(r + s)A = rA + sA$, and
    (b) $r(A + B) = rA + rB$.
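Several parts of Theorem 12.2 can be sanity-checked on concrete matrices. The Python sketch below is an illustration, not a proof: the helper names `matmul`, `matadd`, and `scalar`, and the sample $2 \times 2$ integer matrices, are our own assumptions.

```python
def matmul(A, B):
    """Matrix product for lists of rows; assumes compatible sizes."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scalar(c, A):
    """Scalar multiple cA."""
    return [[c * a for a in row] for row in A]


# Illustrative integer matrices, so all comparisons are exact
A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 2]]
C = [[2, 0], [1, -3]]
r, s = 3, -2

checks = {
    "(1) A + B = B + A": matadd(A, B) == matadd(B, A),
    "(4) A(BC) = (AB)C": matmul(A, matmul(B, C)) == matmul(matmul(A, B), C),
    "(5) (A + B)C = AC + BC": matmul(matadd(A, B), C) == matadd(matmul(A, C), matmul(B, C)),
    "(6b) A(rB) = r(AB)": matmul(A, scalar(r, B)) == scalar(r, matmul(A, B)),
    "(7a) (r + s)A = rA + sA": scalar(r + s, A) == matadd(scalar(r, A), scalar(s, A)),
}
for name, ok in checks.items():
    print(name, ok)
```

Each check compares both sides of one identity for this particular choice of matrices and scalars; the general statements still require the entrywise argument the theorem refers to.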

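Using the notation of matrices, we can represent the functions $f(\langle x, y \rangle) = \langle 2x - 3y,\; x + y \rangle$ and $g(\langle x, y \rangle) = \langle 2x - 3y + 1,\; x + y - 4 \rangle$ using matrix multiplication as follows. First, let $\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}$, and let

\[
A = \begin{bmatrix} 2 & -3 \\ 1 & 1 \end{bmatrix}.
\]

Then

\[
f(\mathbf{x}) = A\mathbf{x} = \begin{bmatrix} 2 & -3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.
\]

One way to think about the matrix $A$ corresponding to the transformation $f$ is that the columns of $A$ specify the images of the vectors $\mathbf{i} = \langle 1, 0 \rangle$ and $\mathbf{j} = \langle 0, 1 \rangle$. Using matrix multiplication, we see that

\[
A\mathbf{i} = \begin{bmatrix} 2 & -3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\quad \text{and} \quad
A\mathbf{j} = \begin{bmatrix} 2 & -3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ 1 \end{bmatrix},
\]

as illustrated in Figure 12.3.

If we let $\mathbf{b} = \begin{bmatrix} 1 \\ -4 \end{bmatrix}$, then the same $2 \times 2$ matrix $A$ gives

\[
g(\mathbf{x}) = A\mathbf{x} + \mathbf{b}
= \begin{bmatrix} 2 & -3 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 1 \\ -4 \end{bmatrix}
= \begin{bmatrix} 2x - 3y \\ x + y \end{bmatrix} + \begin{bmatrix} 1 \\ -4 \end{bmatrix}
= \begin{bmatrix} 2x - 3y + 1 \\ x + y - 4 \end{bmatrix}.
\]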
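The matrix forms of $f$ and $g$ can be compared against the coordinate formulas on a few vectors. The Python sketch below is an illustration under our own assumptions: vectors are stored as two-element lists, and `apply_matrix` is a hypothetical helper for multiplying a $2 \times 2$ matrix by a column vector.

```python
A = [[2, -3], [1, 1]]
b = [1, -4]


def apply_matrix(A, x):
    """Compute Ax for a 2 x 2 matrix A, treating x as a column vector."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]


def f(x):
    """f(x) = Ax."""
    return apply_matrix(A, x)


def g(x):
    """g(x) = Ax + b."""
    Ax = apply_matrix(A, x)
    return [Ax[0] + b[0], Ax[1] + b[1]]


# The columns of A are the images of i = <1, 0> and j = <0, 1> under f
print(f([1, 0]), f([0, 1]))

# Compare with the coordinate formulas on the vertices of the unit square
for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert f([x, y]) == [2 * x - 3 * y, x + y]
    assert g([x, y]) == [2 * x - 3 * y + 1, x + y - 4]
    print([x, y], f([x, y]), g([x, y]))
```

The printed rows reproduce the table of values for the four vectors considered earlier in the chapter.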
