Graph Theory and Topology in Chemistry, A Collection of Papers Presented at an International Conference held at the University of Georgia, Athens, Georgia, U.S.A.,
16-20 March 1987, R.B. King and D.H. Rouvray (Eds)
Studies in Physical and Theoretical Chemistry, Volume 51, pages 91-105
© 1987 Elsevier Science Publishers B.V., Amsterdam — Printed in The Netherlands
NEW DEVELOPMENTS IN REACTION TOPOLOGY
Paul G. Mezey
Department of Chemistry and Department of Mathematics, University of Saskatchewan, Saskatoon, Canada, S7N 0W0
ABSTRACT
Chemical reactions and conformational changes can be represented by topological relations on potential energy hypersurfaces. These relations are of importance in computer-based synthesis design and molecular engineering. The properties of the reduced nuclear configuration space, a metric space M, allow for a concise representation of such chemical processes. This (3N-6)-dimensional space M has some counterintuitive properties; for example, some reaction paths show "reflections" at certain internal nuclear configurations. Space M has boundaries which constrain the possibilities for providing set M with coordinate systems, which are essential for a detailed study of chemical processes. In this study a fundamental property of space M is demonstrated, one that is a condition for establishing coordinate systems and differentiability on M: set M can be converted into a manifold with boundary.
INTRODUCTION
For most organic molecules and for typical molecules of biochemical importance the calculation and analysis of extensive potential energy hypersurfaces is a rather complicated task. It is customary to carry out a partial analysis, restricted either to subspaces or to small domains of a nuclear configuration space nR. These domains are usually neighborhoods of equilibrium nuclear geometries (minima) or of transition structures (saddle points with a single negative canonical curvature).
However, it is not always a trivial task to select the chemically most important domains of nR for such an analysis. The general topological properties of energy
hypersurfaces, defined over nR, may aid the selection of these domains.
Similarities between chemical structures, the occurrence of intermediate species, interrelations between reaction mechanisms, and most other problems of theoretical synthesis design may be analysed in terms of the topological properties of potential surfaces and those of the underlying nuclear configuration space. If chemically identifiable open sets of nR, for example the catchment regions C(λ,i), are given [2], then these sets may be provided with additional mathematical structure; for example, it may appear advantageous to introduce local coordinate systems into open sets of nR representing various molecular species. The theory of such local coordinate systems is manifold theory; this retains some features of a geometrical model within the framework of a topological space.
In this study we shall utilize some of the topological results of earlier works [1-5]
in order to introduce a set of local coordinate systems in the neighbourhoods of various critical points, such as energy minima and saddle points of transition structures. First we shall review the manifold structure of a general configuration space nR that may be, in fact, only a subspace or a subset of a larger nuclear configuration space. One should note that the term "space" is used in a topological sense and it may apply to sets which have no vector space properties at all.
In earlier studies it has been shown that a rather general internal nuclear configuration space of (3N-6) dimensions is indeed a metric space, denoted by M, which has a global metric, valid for all nuclear configurations [4]. Space M also preserves continuity properties of potential energy hypersurfaces [5]. However, M is not in general a vector space. It has boundaries, essentially of two types: type L, where reflections of reaction paths occur (typically at linear nuclear configurations), and type D, whose points correspond to coincident nuclear positions (associated with formal nuclear reactions). The existence of these boundaries presents difficulties for the introduction of a family of compatible local coordinate systems. However, it is possible to convert space M into a manifold with boundary, which allows for the introduction of compatible local coordinate systems and also ensures appropriate differentiability properties.
THE REDUCED NUCLEAR CONFIGURATION SPACE M AS A MANIFOLD WITH BOUNDARY
Review of the Manifold Theory of a General Nuclear Configuration Space nR
Let us assume that a Born-Oppenheimer energy hypersurface E(r) of a molecular system of a specified electronic state is given, defined over a nuclear configuration space nR that may be a subspace of a larger space. Space nR may be an n-dimensional space only in the topological sense, and it may, in fact, be only a subset of a larger nuclear configuration space. Space nR may be provided with a metric, and the ε-neighborhoods of points r ∈ nR define the metric topology T on nR.
We are interested in providing this space with a set of local coordinate systems, and we shall use the methods of manifold theory to this end. A topological space (X,T) is an n-dimensional topological manifold if for any two distinct points x and y of X there are disjoint open sets G_x and G_y containing x and y, respectively (that is, X is a Hausdorff space), and if X is covered by domains of n-dimensional coordinate systems. Informally, a set is a manifold if each of its points has a neighborhood locally similar to a neighborhood in a Euclidean space. In order to obtain a manifold suitable for coordinate representations of various chemically identifiable subsets of nR, one may utilize the T_C-open sets of topological space (nR,T_C) and an underlying metric space model [1-5]. A general T_C-open set is not necessarily open in the metric topology T of nR. However, a metric topological space (nR,T) is a Hausdorff space and also a normal topological space, fulfilling the following separation axiom for any two T-closed sets C(i) and C(j) of nR: if
$$ C(i) \cap C(j) = \emptyset \tag{1} $$
holds, then there exist T-open sets G(i) and G(j) such that
$$ C(i) \subset G(i) \in T, \tag{2} $$
$$ C(j) \subset G(j) \in T, \tag{3} $$
and
$$ G(i) \cap G(j) = \emptyset. \tag{4} $$
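In a metric space, sets G(i) and G(j) satisfying eqs. (2)-(4) can be written down explicitly by comparing distances to the two closed sets. The Python sketch below illustrates this standard construction under simplifying assumptions that are not part of the text: the closed sets are finite point sets in the plane, the Euclidean distance stands in for the metric of nR, and the names `dist_to_set`, `in_G`, `C_i` and `C_j` are hypothetical.

```python
import math

def dist_to_set(p, closed_set):
    """Distance from point p to a finite (hence closed) point set."""
    return min(math.dist(p, q) for q in closed_set)

def in_G(p, own, other):
    """True if p lies in the open set {p : d(p, own) < d(p, other)}."""
    return dist_to_set(p, own) < dist_to_set(p, other)

# Hypothetical disjoint closed sets standing in for C(i) and C(j).
C_i = [(0.0, 0.0), (1.0, 0.0)]
C_j = [(4.0, 0.0), (5.0, 1.0)]

# A probe point can belong to at most one of the two open sets,
# because the two strict inequalities exclude each other.
probe = (2.0, 0.5)
print(in_G(probe, C_i, C_j), in_G(probe, C_j, C_i))
```

Each closed set lies inside its own open set because its points have zero distance to it and positive distance to the other set.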
Denote the T-closure of catchment region C(λ,i) by C(i),
$$ C(i) = T\text{-closure}\,[C(\lambda,i)], \tag{5} $$
where we assume that all catchment regions have unique i indices and where the λ index is suppressed in the notation. Let us assume that a class {G(i)} of T-open sets fulfilling condition (2) is given. These T-open sets G(i) can then be used to define diffeomorphisms φ(i) (one-to-one, onto, continuous, and infinitely differentiable functions with continuous inverses) between subsets of nR and subsets of the n-dimensional Euclidean space nE. We shall assume that for every index i a diffeomorphism
$$ \varphi(i) : G(i) \to H(i) \subset {}^{n}E \tag{6} $$
is given, where H(i) is open in the usual metric of nE.
The Euclidean space nE is provided with the usual coordinate functions {u_j}, which are compatible with the usual (Euclidean) metric and are interpreted by
$$ u_j(t) = t_j, \tag{7} $$
where
$$ t = (t_1, \ldots, t_n) \in {}^{n}E. \tag{8} $$
Space (nR,T) is a metric topological space and hence also a Hausdorff space, and function φ(i), being a diffeomorphism, is also a homeomorphism (a one-to-one, onto and continuous function with continuous inverse) from an open set of nR to an open set of nE. Consequently, φ(i) is an n-dimensional coordinate system in nR.
Since the catchment regions C(λ,i) generate a partitioning of the nuclear configuration space nR,
$$ {}^{n}R = \bigcup_{i} C(\lambda,i), \tag{9} $$
it follows that the open sets {G(i)} form an open cover of nR. That is, the nuclear configuration space nR is covered by domains of n-dimensional coordinate systems {φ(i)}; consequently, nR is an n-dimensional topological manifold.
The G(i) domain of coordinate system φ(i) is called the coordinate neighborhood of φ(i), and if r ∈ G(i) then φ(i) is said to be the coordinate system at point r.
The composition of coordinate function u_j and diffeomorphism φ(i) will be denoted by x_j,
$$ x_j = u_j \circ \varphi(i), \tag{10} $$
and the same term, "coordinate system", is used both for function φ(i) and for the set of functions
$$ (x_1, x_2, \ldots, x_n) = (u_1 \circ \varphi(i),\; u_2 \circ \varphi(i),\; \ldots,\; u_n \circ \varphi(i)). \tag{11} $$
We say that two coordinate systems φ(i) and φ(j) of the nuclear configuration space nR are C^∞-related if both
$$ \varphi(i) \circ (\varphi(j))^{-1} \in C^{\infty} \tag{12} $$
and
$$ \varphi(j) \circ (\varphi(i))^{-1} \in C^{\infty}, \tag{13} $$
where C^∞ denotes the class of infinitely differentiable functions. The above compositions are infinitely differentiable.
The open sets H(i) of Euclidean space nE are diffeomorphic images of the G(i) supersets of those sets C(i) which represent chemical species in nR. Consequently, relations among chemical species of the general nuclear configuration space nR (which is in general non-Euclidean) can be studied in a Euclidean space nE.
Relations between sets G(i) and G(j), where
$$ G(i) \cap G(j) \neq \emptyset, \tag{14} $$
may be obtained by defining a homeomorphism
$$ \varphi(ij) = \varphi(i) \circ (\varphi(j))^{-1}, \tag{15} $$
where
$$ \varphi(ij) : \varphi(j)\,(G(i) \cap G(j)) \to \varphi(i)\,(G(i) \cap G(j)). \tag{16} $$
In the special case when the class {G(i)} of domains of coordinate systems is countable and if each mapping φ(ij) is differentiable, then nR is an n-dimensional differentiable manifold.
If the set
$$ G(i) \cap G(j) \subset {}^{n}R \tag{17} $$
is non-empty, then at least two coordinate systems, φ(i) and φ(j), are given within this set. The Jacobian determinant of the corresponding coordinate transformation φ(i) → φ(j) is defined by virtue of eq. (11) as
$$ \det\left[ \frac{\partial x_k(j)}{\partial x_\ell(i)} \right], \tag{18} $$
where k is the row index and ℓ is the column index of the determinant. In set G(i) ∩ G(j) the homeomorphism φ(ij) always has an inverse, which is φ(ji),
$$ (\varphi(ij))^{-1} = \varphi(ji). \tag{19} $$
Consequently, the Jacobian determinant (18) is nonzero at each point of set G(i) ∩ G(j).
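The step from eq. (19) to a nonvanishing determinant rests on the chain rule: when φ(ij) and φ(ji) are both differentiable, the product of their Jacobian matrices is the unit matrix, so the product of the two determinants is 1. The Python sketch below checks this numerically for a stand-in pair of charts that is purely an assumption of this illustration (Cartesian coordinates as chart i and polar coordinates as chart j on a region away from the origin); the function names and the finite-difference step are hypothetical choices.

```python
import math

def phi_ji(p):
    """Assumed transition map: Cartesian (chart i) to polar (chart j)."""
    x, y = p
    return (math.hypot(x, y), math.atan2(y, x))

def phi_ij(q):
    """Inverse transition map: polar (chart j) back to Cartesian (chart i)."""
    r, theta = q
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(f, p, step=1e-6):
    """Determinant of the 2x2 Jacobian of f at p, by central differences."""
    cols = []
    for k in range(2):
        plus, minus = list(p), list(p)
        plus[k] += step
        minus[k] -= step
        f_plus, f_minus = f(plus), f(minus)
        # cols[k][r] approximates d f_r / d p_k
        cols.append([(f_plus[r] - f_minus[r]) / (2 * step) for r in range(2)])
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]

point = (1.0, 2.0)
det_ji = jacobian_det(phi_ji, point)
det_ij = jacobian_det(phi_ij, phi_ji(point))
print(det_ji * det_ij)  # close to 1, so neither determinant vanishes
```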
If eq. (2) holds for all catchment regions of nR, then continuity and differentiation of functions given over the nuclear configuration space may be defined in terms of local coordinate systems in T_C-open sets, representing various chemical structures. Clearly, the potential energy hypersurface E and properties of its derivatives can also be analysed in terms of local coordinate systems φ(i) over G(i) supersets of catchment regions C(λ,i), using in fact the (very convenient) Euclidean coordinates in open subsets H(i) of nE to label the corresponding points in G(i).
Take an arbitrary real-valued function f defined over a T_C-open subset G of the nuclear configuration space nR. Take a point r ∈ G which is also an element of both sets G(i) and G(j),
$$ r \in G(i), \tag{20} $$
$$ r \in G(j). \tag{21} $$
Let us consider the composed functions f ∘ (φ(i))^{-1} and f ∘ (φ(j))^{-1}, defined over φ(i)(G ∩ G(i)) and φ(j)(G ∩ G(j)), respectively. These functions may be thought of as function f expressed in terms of local coordinates generated in sets G(i) and G(j) by diffeomorphisms φ(i) and φ(j), respectively. An important property of these functions is that f ∘ (φ(i))^{-1} is differentiable in a neighborhood of φ(i)(r) if and only if f ∘ (φ(j))^{-1} is differentiable in a neighbourhood of φ(j)(r), which follows directly from the identity
$$ f \circ (\varphi(i))^{-1} = f \circ (\varphi(j))^{-1} \circ \varphi(j) \circ (\varphi(i))^{-1} = f \circ (\varphi(j))^{-1} \circ \varphi(ji). \tag{22} $$
If the function f ∘ (φ(j))^{-1} is differentiable, then f ∘ (φ(i))^{-1} is a composition of differentiable functions and it is also differentiable.
The above differentiability property guarantees that the Euclidean coordinate representations of functions defined over nR behave "properly" in the overlapping regions of coordinate domains. That is, when reference to a given chemical species C(λ,i) is switched to reference to another chemical species C(λ',j), the change of coordinate systems does not interfere with the continuity and differentiability properties of function f. Functions defined over the entire nuclear configuration space nR, such as the energy hypersurface E itself, may be treated locally [in coordinate neighborhoods G(i), containing a set C(λ,i) of a chemical species] as functions defined on an ordinary Euclidean space. The global interpretation of configuration changes is also preserved by ensuring an orderly switch of coordinate systems.
For convenience, it is useful to choose the φ(i) coordinate systems in a special form, which assigns the origin of the Euclidean space,
$$ (0, 0, \ldots, 0) \in {}^{n}E, \tag{23} $$
to the critical point c(i) in C(λ,i),
$$ c(i) \in C(\lambda,i). \tag{24} $$
The large amplitude motion formalism shows some analogies with the manifold representation of nR. A manifold representation of a multidimensional potential energy surface E is a combination of a purely topological model of relations among chemical structures and a geometric model based on the metric space properties of nR. If (x_1, x_2, ..., x_n) are the local coordinates around point r of the n-dimensional manifold, then an m-dimensional submanifold has the local equations
$$ x_{m+1} = x_{m+2} = \cdots = x_n = 0, \tag{25} $$
and around point r the local coordinates are (x_1, x_2, ..., x_m) in the submanifold. A submanifold of a manifold may be thought of as a generalization of a cross-section.
Manifold Structure of the Reduced Nuclear Configuration Space M: a Manifold with Boundary
The manifold theoretical model outlined above is applicable to any metric nuclear configuration space nR for which catchment regions, their T-closures (5) and their T-open coordinate neighborhoods G(i) (eq. (2)) can be given. One condition for the existence of such open sets G(i) for all catchment regions C(λ,i) is that the set nR itself is open in the metric topology T; hence, either it has no boundaries or it does not contain them. Whereas for some choices of nR this condition is automatically satisfied, this is not in general the case for metric space M. Nonetheless, analogous methods are applicable to space M as well, leading to a description based on the theory of manifolds with boundary.
The basic condition for a set being an n-dimensional manifold is the existence of a homeomorphism of a neighborhood of each point of the set to an open set of the n-dimensional Euclidean space. If the set contains some of its boundary points, then no open set of a Euclidean space can be assigned by a homeomorphism to neighborhoods of such boundary points. However, a consistent development of differentiability on the set is still possible using properties of a Euclidean space. It is sufficient if a homeomorphism exists for neighborhoods of each point of the set to a Euclidean "half-space" nH,
$$ {}^{n}H = \{ r : r \in {}^{n}E,\; r_n \ge 0 \}. \tag{26} $$
Such a half-space is a subset of a Euclidean space nE containing all points of nE except those having negative values for the last coordinate. This Euclidean half-space has a boundary,
$$ \partial({}^{n}H) = \{ r : r \in {}^{n}E,\; r_n = 0 \}, \tag{27} $$
that is, the hyperplane for which the last coordinate is equal to zero. If a homeomorphism exists for a neighborhood of each point of the object to a subset open within nH, then the object can be turned into a manifold with boundary.
For proper differentiability the boundary itself must be smooth: it cannot contain sharp edges and peaks, and this is the very condition ensured by homeomorphisms to subsets open in nH.
The reflection properties of those points K of M which represent linear internal nuclear configurations lead to boundary points of type L. Furthermore, formal boundary points in M are also present as a consequence of "loss of dimension" when two or more nuclei become coincident; these points are denoted as type D. Such configurations (which involve formal nuclear reactions) are considered excluded from the family of chemically realistic configurations, and the very exclusion of these configurations and their close neighborhoods will allow the conversion of M into a manifold with boundary.
Without a modification, such as the exclusion of "chemically impossible" points, space M is not in general a manifold with boundary. Since the boundaries of the two types, L and D, usually meet at a formal "edge", their differentiability cannot be assured without a modification. Take a triatomic system ABC, and consider all possible internal configurations, with atom A fixed at the origin of the laboratory frame coordinate system. The above choice eliminates the three translational degrees of freedom. If for noncoincident A and C nuclei the AC line segment is aligned with the X axis, then for the position of atom B it is sufficient to consider only the points of the XY plane. In fact, even some of these configurations are equivalent by rigid rotations within the laboratory frame.
The following coordinates are zero by definition:
$$ X_A = Y_A = Z_A = Z_B = Y_C = Z_C = 0, \tag{28} $$
and the following four choices, (i), (ii), (iii), and (iv), are equivalent, representing the same internal configuration:
$$ \text{(i)} \quad X_B,\; Y_B,\; X_C \tag{29} $$
$$ \text{(ii)} \quad -X_B,\; Y_B,\; -X_C \tag{30} $$
$$ \text{(iii)} \quad X_B,\; -Y_B,\; X_C \tag{31} $$
$$ \text{(iv)} \quad -X_B,\; -Y_B,\; -X_C. \tag{32} $$
Consequently, all possible internal configurations of the ABC triatomic system can be represented uniquely by the following "quarter" of a three-dimensional space of coordinates X_B, Y_B and X_C:
$$ -\infty < X_B < \infty, \tag{33} $$
$$ 0 \le Y_B, \tag{34} $$
$$ 0 \le X_C. \tag{35} $$
Considering the special case of triatomic systems, there is a one-to-one correspondence between points of the subset of 3R satisfying constraints (33)-(35) and points K of the reduced nuclear configuration space M. Evidently, the boundary of this subset has an "edge" along axis X_B; hence this subset is not a manifold with boundary. The X_B X_C boundary plane contains all collinear nuclear configurations with Y_B = 0, whereas the boundary plane X_B Y_B contains all nuclear configurations with coincident A and C nuclei; hence, these points represent nuclear reactions. Note that those points of the X_B X_C plane for which X_B = X_C also correspond to nuclear reactions, obtained for coincident B and C nuclei.
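The reduction of the four equivalent choices (29)-(32) to one representative in the region (33)-(35) can be sketched in a few lines of Python. This is only an illustration: the function name `canonical` is hypothetical, and the sketch makes no attempt to identify equivalent points on the plane X_C = 0, where nuclei A and C coincide.

```python
def canonical(x_b, y_b, x_c):
    """Map (X_B, Y_B, X_C) to its representative with Y_B >= 0 and X_C >= 0.

    Choices (ii) and (iv) change the sign of both X coordinates, so a
    negative X_C is removed by flipping X_B and X_C together.
    Choice (iii) changes only the sign of Y_B, so Y_B is replaced by |Y_B|.
    """
    if x_c < 0:
        x_b, x_c = -x_b, -x_c
    return (x_b, abs(y_b), x_c)

# The four equivalent choices (i)-(iv) for one sample configuration.
choices = [(1.5, 2.0, 3.0), (-1.5, 2.0, -3.0), (1.5, -2.0, 3.0), (-1.5, -2.0, -3.0)]
print({canonical(*c) for c in choices})
```

All four inputs collapse to a single triple, which is the uniqueness property the "quarter" space is meant to provide away from X_C = 0.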
Our purpose is to turn the chemically significant part of this subset (and in general of any space M) into a manifold with boundary, and to this end we shall exclude points of nuclear reactions in such a way that the edge (in general, the edges of M) is "smoothed" out. Our tools for this task are severely limited: we are allowed to use only the concept of distance. However, by excluding certain open balls of appropriately chosen centers and radii, it is possible to develop a general technique for "slicing off" points of nuclear reactions as well as sharp edges, and to obtain a smooth boundary.
The principal device for this purpose is the smooth "carpeted step function" h(x), having the following properties:
$$ h(x) = 0 \quad \text{if } x \le a, \tag{36} $$
$$ h(x) \text{ is monotonically increasing if } a < x < b, \tag{37} $$
$$ h(x) = 1 \quad \text{if } b \le x. \tag{38} $$
Furthermore, function h(x) is everywhere smooth, that is, it is everywhere continuous and infinitely differentiable. (Note that the term "smooth" is sometimes used with a different meaning, implying differentiability only up to some fixed order. In this study, however, "smooth" will always mean "infinitely differentiable".)
Such a function h(x) can be constructed as follows. Take the smooth function f(y),
$$ f(y) = \begin{cases} 0 & \text{if } y \le 0, \\ \exp(-1/y^{2}) & \text{if } 0 < y, \end{cases} \tag{39} $$
and generate the "bump" function
$$ g(y) = f(y-a)\, f(b-y), \tag{40} $$
that is a smooth (everywhere infinitely differentiable) function, zero outside the (a,b) interval, and finite positive within this interval. Then a "carpeted step function" can be defined as
$$ h(x) = \frac{\int_{-\infty}^{x} g(y)\, dy}{\int_{-\infty}^{\infty} g(y)\, dy}. \tag{41} $$
This function is indeed smooth everywhere, and it fulfills the conditions (36)-(38).
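The construction of eqs. (39)-(41) can be evaluated numerically, as in the following Python sketch. It is an illustration under assumptions not made in the text: the integrals are approximated by a composite trapezoid rule, the integration is restricted to the interval (a,b) where g is nonzero, and the default values a = 0, b = 1 and the number of subintervals are arbitrary choices.

```python
import math

def f(y):
    """Smooth function of eq. (39): 0 for y <= 0, exp(-1/y^2) for y > 0."""
    if y <= 0.0:
        return 0.0
    y_squared = y * y
    if y_squared == 0.0:  # y so small that y*y underflows; the limit is 0
        return 0.0
    return math.exp(-1.0 / y_squared)

def g(y, a, b):
    """Bump function of eq. (40): positive inside (a, b), zero outside."""
    return f(y - a) * f(b - y)

def trapezoid(func, lower, upper, n=2000):
    """Composite trapezoid rule with n subintervals (a numerical assumption)."""
    width = (upper - lower) / n
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, n):
        total += func(lower + k * width)
    return total * width

def h(x, a=0.0, b=1.0):
    """Carpeted step function of eq. (41), normalised by the full bump area."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    bump = lambda y: g(y, a, b)
    return trapezoid(bump, a, x) / trapezoid(bump, a, b)

for x in (-0.5, 0.3, 0.5, 0.7, 1.5):
    print(x, h(x))
```

Because g vanishes outside (a,b), integrating from a instead of from minus infinity does not change the value of h.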
When generating a smooth boundary for the chemically relevant part of space M, we shall use only the metric properties of M. Only the concept of distance d(K,K') will be needed, and no reference will be given to details of particular representations of M. An excluded open set will be specified in terms of a union of open balls; hence the remaining part of M will have a closed boundary. By making use of the "carpeted step function", the location and the radii of the above open balls can be chosen so that this boundary will be smooth, leading, indeed, to a manifold with boundary.
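The exclusion step uses nothing but distances, which the following Python sketch makes concrete. It is a hedged illustration only: the Euclidean distance stands in for the metric d(K,K') of M, the centres and radii are hypothetical values, and the sketch does not reproduce the way the carpeted step function selects them.

```python
import math

def retained(point, balls, distance=math.dist):
    """True if point lies outside every excluded open ball.

    balls is a list of (centre, radius) pairs. Each ball is open, so a
    point at distance exactly equal to the radius is kept, which is why
    the remaining set contains its own boundary.
    """
    return all(distance(point, centre) >= radius for centre, radius in balls)

# Hypothetical excluded balls in a three-coordinate representation.
balls = [((0.0, 0.0, 0.0), 1.0), ((3.0, 0.0, 0.0), 0.5)]

print(retained((1.0, 0.0, 0.0), balls))  # on the sphere of the first ball
print(retained((0.5, 0.0, 0.0), balls))  # inside the first ball
```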
We shall keep the discussion general and topological, and no significance is attributed to the shapes of boundaries D and L in any given representation.
Boundary set D is defined as
$$ D = \bigcup_{\gamma} D(\gamma), \tag{42} $$
the union of those sets D(γ) ⊂ M where two nuclear positions coincide, or where in general for points
$$ d(\gamma) \in D(\gamma) \tag{43} $$
the energy hypersurface E(x),
$$ d(\gamma) = x \in {}^{n}E, \tag{44} $$
is not differentiable.
Set L is defined as the union of all those points K ∈ M which are boundary points of M by some other criterion, called regular boundary points, such as those of linear nuclear configurations in the laboratory frame:
$$ L = \bigcup_{\alpha} K(\alpha), \qquad K(\alpha) = \text{regular boundary point (e.g. linear configuration)}. \tag{45} $$