A new type of Structured Artificial Neural Networks based on the Matrix Model of Computation
Sergio Pissanetzky, Research Scientist, Member, IEEE, The Woodlands, Texas, United States
Abstract – The recently introduced Turing-complete Matrix Model of Computation (MMC) is a connectionist, massively parallel, formal mathematical model that can be set up as a network of artificial neurons and can represent any other ANN. The model is hierarchically structured and has a natural ontology determined by the information stored in the model. The MMC is naturally self-organizing and dynamically stable. The Lyapunov energy function is interpreted as a measure of biological resources; the attractors correspond to the objects in the natural ontology. The Scope Constriction Algorithm (SCA) minimizes the energy by systematically switching the network connections and reveals the ontology. In this paper we consider the MMC as a modeling tool for applications in Neuroscience. We prove as a theorem that the MMC can represent ANNs. We present a new, more efficient version of SCA, discuss the advantages of MMC ANNs, and illustrate with a small example.
Keywords: neural networks, dynamic systems, ontologies,
self-organizing systems, artificial intelligence, semantic web
1 Previous Work
The Matrix Model of Computation was introduced as a natural algorithmic form of mathematical notation amenable to being operated upon by algorithms expressed in that same notation. It is formally defined as a pair of sparse matrices, the rows of which are tuples in a relational database. Since MMC models can be easily created by a parser from existing computer programs, and then refactored by algorithm, the MMC was proposed as a virtual machine for program evolution [1]. Subsequent work [2] proved that any finitely realizable physical system can be modeled by the MMC, and showed that the model is naturally self-organizing by way of an algorithm that organizes the information categorically into weakly-coupled classes of strongly-cohesive objects, an ontology [3]. Finally, applications to very diverse fields such as theoretical Physics, business and UML models, and OO analysis and design were discussed and illustrated with small examples [4]. Relations have been applied to the analysis of programs, and a relational model of computation has been proposed [5] and recently characterized by investigating its connection with the predicate transformer model [6].
In this paper we consider the MMC as a structured, massively parallel, generalized, self-organizing artificial neural network. In Section 2 we define the MMC, introduce terminology, discuss the hierarchical organization and parallelism, examine combinations and conversions between artificial neurons or ANNs and MMC models, training issues, and dynamics, and briefly compare ANNs and the MMC with humans. In Section 3 we prove that any ANN can be described as an MMC model, and in Section 4 we present a new, more efficient and biologically plausible version of the Scope Constriction Algorithm, which gives the MMC its ability to self-organize. We close with a small example.
2 The Matrix Model of Computation
2.1 Definition. The MMC is simple, yet very rich in features. It is defined [1] as a pair of sparse matrices [7] M = (C, Q), where C is the matrix of services and Q is the matrix of sequences. The rows of C are the services, and the columns of C are the variables used by the services. A domain is the set of values allowed for a variable, and there is a domain associated with each variable. Each variable plays a certain role in the service, indicated by A for an input variable or argument, C for an output variable or codomain, and M for a modifiable variable or mutator. The roles A, C and M are the elements of C in that service's row.
The concept of service is very general. A service can represent a neuron, a neural network, a primitive mathematical or logical operation in a standard computer, a method in a class, or an entire MMC. Services can also have their own memory, visible only to the service (e.g. a synaptic weight), and their own firing mechanisms. Variables are also very general: a numerical variable represents a value, and a categorical variable represents an instance of an object in a class. See Eq. (2) below for a small example of a matrix C, or previous publications [1, 2, 4] for more complete examples.
The rows of Q are the sequences. The columns of Q include the actors that initiate sequences, the links between services, and the control variables that activate or inhibit the links.
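To make the definition concrete, the following minimal sketch shows one possible way to hold an MMC model M = (C, Q) in memory, with C stored as a sparse map from (service, variable) pairs to roles. The class and identifier names are illustrative assumptions, not part of the MMC definition.

```python
# A minimal sketch of one way to represent M = (C, Q); names are illustrative.
from dataclasses import dataclass, field

@dataclass
class MMC:
    services: list                           # row labels of C, in order
    variables: list                          # column labels of C
    C: dict = field(default_factory=dict)    # (service, variable) -> 'A' | 'C' | 'M'
    Q: list = field(default_factory=list)    # sequence tuples: (from_service, to_service, control_var)

    def add_role(self, service, variable, role):
        assert role in ("A", "C", "M")
        self.C[(service, variable)] = role

# Usage: a service 'sum' that reads x and y and initializes z.
m = MMC(services=["sum"], variables=["x", "y", "z"])
m.add_role("sum", "x", "A")
m.add_role("sum", "y", "A")
m.add_role("sum", "z", "C")
```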
2.2 The data channel. The scope of a variable is the vertical extent between the C or M where the variable is first initialized and the terminal A where it is used for the last time, in that variable's column. The set of scopes represents a data channel where the data carried by the variables flows from its source, the initializing services, to its destinations, the services that use the data. The sum of all scopes happens to be equal to the vertical profile of C, immediately suggesting the use of profile minimization techniques to make the data channel narrow, a highly desirable feature discussed below.
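As an illustration of these definitions, the sketch below computes the scope of each variable and the resulting vertical profile from a matrix C stored as in the previous sketch. The representation and function names are assumptions made for illustration.

```python
def scopes(C, services, variables):
    """Scope length of each variable: rows between the C (or M) that first
    initializes it and its terminal A (Section 2.2).
    C maps (service, variable) -> role; `services` gives the row order."""
    lengths = {}
    for v in variables:
        init_row, last_use = None, None
        for i, s in enumerate(services):
            role = C.get((s, v))
            if role in ("C", "M") and init_row is None:
                init_row = i                  # first initialization
            if role == "A":
                last_use = i                  # downmost A seen so far
        lengths[v] = (last_use - init_row
                      if init_row is not None and last_use is not None
                      and last_use > init_row else 0)
    return lengths

def profile(C, services, variables):
    # the vertical profile equals the sum of all scopes (Section 2.2)
    return sum(scopes(C, services, variables).values())
```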
2.3 MMC algebra and transformations. The MMC has a rich algebra, which includes matrix operations such as permutations, partitioning and submatricing, relational operations such as joins, projections, normalization and selection, and graph and set operations [1]. Algorithms can be designed based on these operations to induce transformations on the MMC. Of particular interest are refactorings, defined as invariant transformations that preserve the overall behavior of the model. This is a general definition and it applies to all systems; the MMC has been proposed for that purpose [1]. Algorithms can also be designed for training or for self-organization. One of them is discussed below.
2.4 Control flow graph, linear submatrices, and canonical submatrices. A control flow graph (CFG) is a directed graph G = (V, E) where a vertex v ∈ V corresponds to each service in matrix C and an edge e ∈ E corresponds to each tuple in matrix Q. A path in the CFG represents a possible flow of control. The path is said to be linear if its vertices have no additional incoming or outgoing edges except for the end vertices, and the linear path is maximal if it cannot be enlarged without losing the linear property. Given a set of services S, a submatrix of services can be defined by deleting from matrix C all rows with services not in S and all columns with variables not used by the services in S. A linear submatrix is a submatrix of services based on the services contained in a linear path. Linear submatrices are very common in a typical MMC model.
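A small sketch of how the maximal linear paths of a CFG could be extracted, assuming the graph is given as a plain edge list; purely cyclic chains are ignored for simplicity, and the function name is illustrative.

```python
from collections import defaultdict

def maximal_linear_paths(edges):
    """Maximal linear paths of a CFG given as (u, v) edge pairs (a sketch).
    Interior vertices have exactly one incoming and one outgoing edge; the
    end vertices may have additional edges, as in Section 2.4."""
    out_e, in_e = defaultdict(list), defaultdict(list)
    for u, v in edges:
        out_e[u].append(v)
        in_e[v].append(u)
    interior = lambda x: len(in_e[x]) == 1 and len(out_e[x]) == 1
    paths = []
    for u in list(out_e):
        if interior(u):
            continue                          # maximal chains start at sources, forks or joins
        for v in out_e[u]:
            path = [u, v]
            while interior(path[-1]):         # extend while the last vertex is chain-interior
                path.append(out_e[path[-1]][0])
            paths.append(path)
    return paths

# Example: 1->2->3->4 with an extra edge 2->5 gives [1,2], [2,3,4], [2,5].
print(maximal_linear_paths([(1, 2), (2, 3), (3, 4), (2, 5)]))
```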
A service in a general MMC can initialize or modify several variables at once, and a variable can be repeatedly re-initialized or modified. As a result, a submatrix of services can contain many C's and M's in each row or column. However, the following simple refactoring can convert any submatrix of services to a form without M's and with exactly one C in every row and every column: (1) if a service has n > 1 codomains C, expand it into n similar services that initialize one variable at a time, and (2) if a variable is mutated or assigned to more than once, introduce a new local variable for each assignment or mutation. The resulting submatrix is square, and, since there is only one C in every row and every column, a suitable (always legal) column permutation can bring it to a canonical form, where all the C's are on the diagonal, the upper triangle is empty, the lower triangle is sparse and contains only A's, and the lowermost A in each column is the terminal A in that column. Canonical submatrices correspond to the well-known single-assignment representation, a connectionist model directly translatable into circuits. Examples of canonical matrices have been published ([4], figures 1, 2).
2.5 Ontologies. The roles A, C and M in a row of matrix C establish an association between the service in that row and the variables in the columns where the roles are located. Since variables represent attributes and can take values, and services represent the processes and events where the variables participate, the association represents an object in the ontological sense [3]. We refer to this object as a primitive object, and we say that matrix C defines a primitive ontology, of which the primitive objects are the elements and the domains are the classes. Domains can be joined to form super-domains, of which the original domains are the subdomains. Super-domains inherit the services and attributes of their subdomains. Multiple inheritance is possible, and a subdomain can be shared by many super-domains. In the ontology, the super-domains are subclasses and the subdomains are super-classes, and the super-classes subsume the subclasses. The subdomains of a super-domain can be replaced in matrix C with a categorical variable representing that super-domain, and similarly, the associated services can be replaced with a "super-service" declared in an MMC submodel in terms of the subservices, thus reducing the dimension of C by submatricing. The process can be continued on the simplified C, creating a hierarchy of models and submodels that represents an inheritance hierarchy. These features have been previously discussed [1, 4]. Primitive objects do in fact combine spontaneously to form larger objects when the profile is minimized, giving rise to the self-organizing property of the MMC discussed below. In a biological system, an object could represent a cell, a neuron, a neural clique, an organ, or an entire organism.
2.6 Parallelism. A service declaration is the root of a tree, where only the external interface is declared in a row of C, but links present in matrix Q progressively expand it in terms of more and more detailed declarations, down to the deepest levels where declarations are expressed in terms of services provided by the hardware or wetware. To accommodate traditional computational language, we say that services in a level invoke or call those in the lower levels. The service declaration tree also functions as a smooth serial/parallel interface as well as a declarative/imperative interface. The services near the top are sequentially linked by the scopes of the variables, but as the tree expands, many new local variables are introduced and the interdependencies weaken, allowing parallelism to occur. It is in this sense that the MMC is considered a massively parallel model. The smooth transition between the two architectures is a feature of MMC models.
2.7 ANN/MMC conversions and combinations. Structured models entirely based on artificial neurons can be formulated for any system by creating an initial MMC model with serial services down to the level where parallelism begins to appear, and continuing with traditional ANNs from there on. The services in the higher levels are already connected in a network, and the invocations of the lower-level services involve only evaluations of conditionals. Conditionals can, in turn, be translated to ANN models, and at least one example of such translations has been published [8]. In this way, a homogeneous model consisting entirely of artificial neurons is obtained, where collective behavior and robustness are prevalent in the ANNs while a higher level of functional and hierarchical organization is provided by the underlying MMC. Another exciting possibility is to combine the robustness and efficiency of ANNs with the mathematical rigor and accuracy of traditional computers and the interoperability of the MMC by implementing some services as ANNs and the rest as CPUs. The theorem presented in the next Section clarifies some aspects of these conversions.
2.8 Training. MMC operations can be used to design algorithms that add or organize MMC data. SCA is an example: it does not add data, but it creates new information about the data and organizes it into structure, and as such it should be considered training. Direct training is another example: a modified parser can transform a computer program into an MMC, and conversions from other sources such as business models or theories of Physics are possible [4]. There has been a recent resurgence of interest in connectionist learning from existing information structures and processes [8, 9]. In addition, the ANNs in the MMC support all traditional modes of training. Conversely, a trained MMC network will have a high ability to explain its decision-making process, an important feature for safety-critical cases.
2.9 Self-organization. Under certain circumstances, row and column permutations can be applied to C to rearrange the order of the services and variables. The permutations can be designed in such a way that they constrict the data channel by reducing the scopes of the variables, and at the same time cause similar primitive objects to spontaneously come together and coalesce into larger, more cohesive, and mutually uncoupled objects. This process is called scope constriction, and is performed by the Scope Constriction Algorithm discussed below. The transformation is also a refactoring because it preserves the behavior of the model. The process can continue with the larger objects, progressively creating even larger objects out of the smaller ones. The resulting hierarchical structure is the natural ontology of the model. The natural ontology depends on and is determined by the information contained in the model, and is therefore a property of the model. Definitions and properties of cohesion and coupling are well established [10].
2.10 Dynamics. It is possible to imagine a scenario where (1) new information keeps arriving, for example from training or sensory perception, (2) the scope constriction process is ongoing, (3) the resulting natural ontology evolves as a result of the changes in the body of information, and (4) an ability to "reason" in terms of the new objects rather than from the raw information is developed. In such a scenario, some objects will stabilize, others will change, and new objects will be created. This scenario is strongly reminiscent of human learning, where we adapt our mental ontologies to what we learn about the environment. It is also consistent with recent work on neural cliques [11], suggesting that internal representations of external events in the brain do not record exact details but are instead organized in a categorical and hierarchical manner, with collective behavior prevalent inside each clique and a higher level of organization and functionality at the network level. The scenario can find other important applications, such as semantic web development. Some of these ideas are further discussed in Section 4. These ideas are not very well supported by traditional ANNs. For quick reference, Table 1 shows some of the features of ANN and MMC models that we have rated and compared with humans. The comparison suggests that MMC models, particularly MMC/ANN hybrids, may be better suited as models of the brain than ANNs alone, and may help to develop verifiable hypotheses.
Table 1. Ratings of ANN and MMC features compared with humans: 1 = poor, 2 = good, 3 = best. (Column headings: supported feature, humans, ANN, MMC; the individual feature rows did not survive extraction.)
3 Describing Artificial Neural Networks with MMC models
The Theorem of Universality for the MMC states that "Every finitely realizable physical system can be perfectly represented by a Matrix Model of Computation" [2]. In this Section we prove the following theorem:

Any ANN, consisting of interconnected artificial neurons, can be equivalently described by an MMC model where the neurons correspond to services and the connections to scopes in the matrix of services.

This theorem follows from the theorem of universality. However, in order to make the correspondence more explicit, we present the following proof by construction.
In the ANN model, a nonlinear neuron is described by the following equation:

y_k = \varphi\left( \sum_{i=1}^{m} w_{ki} x_{ki} + b_k \right)    (1)

where k identifies the neuron, m is the number of inputs, x_{ki} are the input signals, w_{ki} are the synaptic weights, \varphi is the activation function, b_k is the bias, and y_k is the output signal. Service "neuron k" (nr k) in the following MMC matrix of services C describes equation (1):
C = \begin{array}{l|ccccccccc}
\text{serv} & \varphi & \{x_{ki}\} & \{w_{ki}\} & b_k & y_k & \{x_{\ell i}\} - x_{\ell 1} & \{w_{\ell i}\} & b_\ell & y_\ell \\ \hline
\text{neuron } k    & A & A & A & A & C &   &   &   &   \\
\text{neuron } \ell & A &   &   &   & A & A & A & A & C
\end{array}    (2)

where x_{\ell 1} ≡ y_k, and set notation is used. Sets, functions, etc., are considered objects in the ontological sense, meaning for example that {x_{ki}} stands not only for the elements of the set but also for their respective cardinalities and other properties they may possess. Service "neuron \ell" (nr \ell) in eq. (2) represents a second neuron that has the output signal y_k from neuron k connected as its first input x_{\ell 1}. The scope of variable y_k, extending from the C to the A in that column, represents the network connection. The rest of the proof is by recurrence: to add neurons, the same construction is repeated as needed, and all connections to previous neurons in the model are represented in the same way. This completes the proof.
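The construction used in the proof is mechanical and easy to automate. The following sketch builds the matrix of services C for a small feed-forward network, one service per neuron, with a C in the output column of each neuron and an A wherever that output is used. The variable names (phi, y_k, w_k, b_k) are illustrative assumptions.

```python
def ann_to_mmc(neurons):
    """One service per neuron, as in the proof by construction.
    `neurons` maps a neuron id to its list of inputs, each input being either
    an external signal name or the id of another neuron.
    Returns C as {(service, variable): role}."""
    C = {}
    for k, inputs in neurons.items():
        s = f"neuron {k}"
        C[(s, "phi")] = "A"              # activation function
        C[(s, f"w_{k}")] = "A"           # set of synaptic weights {w_ki}
        C[(s, f"b_{k}")] = "A"           # bias
        for src in inputs:
            # a connection is an A placed in the output column of the source neuron
            var = f"y_{src}" if src in neurons else str(src)
            C[(s, var)] = "A"
        C[(s, f"y_{k}")] = "C"           # output signal y_k, initialized here
    return C

# Two neurons: neuron 2 receives the output of neuron 1 as its first input.
print(ann_to_mmc({1: ["x1", "x2"], 2: [1, "x3"]}))
```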
4 The Scope Constriction Algorithm (SCA)
In this Section, we present a new version of the SCA algorithm with a lower asymptotic complexity than the original version [2]. The algorithm narrows the data channel (§2.2) and reveals the natural ontology of the model (§2.5) by minimizing the profile of the matrix of services C. SCA operates on a canonical submatrix (§2.4) of C, but for simplicity in presentation we shall assume that the entire C is in canonical form. If N is the order of C and j is any of its columns, then C_{jj} = C. If there are any A's in that column, then the downmost A, say in row D_j, is the terminal A, and the length of the scope of the corresponding variable is D_j − j. If there are no A's, the variable is an output variable and D_j = j. The vertical profile of C is:
p(C) = \sum_{j=1}^{N} (D_j - j)    (3)
The variable in column j is initialized by the C in that column. Then, the data travels down the scope to the various A's in column j, and then horizontally from the A's to the C's in the corresponding rows, reaching as far as the C in column D_j, which corresponds to the terminal A in column j. New variables are initialized at the C's, and the process repeats itself. The "conduits of information" that carry the traveling data constitute the data channel, and the lengths of the scopes are a measure of its width. The maximum width W_m(C) and the average width W_a(C) of the data channel are defined as follows:

W_m(C) = \max_j (D_j - j),    W_a(C) = p(C)/N    (4, 5)

SCA's goal is to reduce the lengths of the scopes and the width of the data channel by minimizing p(C).
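For reference, a minimal sketch of how the terminal rows D_j, the profile p(C), and the channel widths could be computed from a canonical C stored as a sparse role map with 0-based indices; the representation is an assumption carried over from the earlier sketches.

```python
def channel_stats(C, N):
    """Profile p(C) and data-channel widths of a canonical matrix of order N.
    C maps (row, col) -> 'C' or 'A', 0-based, with C[(j, j)] == 'C'."""
    D = list(range(N))                        # D_j = j when column j has no A's
    for (r, c), role in C.items():
        if role == "A" and r > D[c]:
            D[c] = r                           # downmost (terminal) A of column c
    p = sum(D[j] - j for j in range(N))        # eq. (3)
    w_max = max(D[j] - j for j in range(N))    # maximum width
    w_avg = p / N                              # average width
    return p, w_max, w_avg
```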
In the canonical C, services are ordered the same as the rows. Matrix Q still applies, but is irrelevant because it simply links each service unconditionally to the service below it. Commuting two adjacent services means reversing their order without affecting the overall behavior of the model. The lengths of the scopes and the value of the profile p(C) depend on the order of the services, hence SCA achieves its goal by systematically seeking commutations that reduce the profile. Since a behavior-preserving transformation is a refactoring, a commutation is an element of refactoring and SCA is a refactoring algorithm.

Commutation is legal if and only if it does not reverse the order of initialization and use of any variable. More specifically, a service in row i initializes the variable in column i, because C_{ii} = C. Since this is the only C in that row, the service in row i and the service in row i + 1 are commutative if and only if C_{i+1,i} is blank. In other words, commutations are legal provided the C's stay at the top of their respective columns. For example, the two services in eq. (2) are not commutative because of the presence of the A under the C in column y_k. Commutation preserves the canonical form of C.
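In code, the legality test and the elementary commutation are one-liners. The sketch below uses the same sparse-map representation as the earlier sketches and is only meant to make the rule explicit.

```python
def commutable(C, i):
    """Services in rows i and i+1 of a canonical C commute iff C[i+1, i] is
    blank, i.e. service i+1 does not use the variable initialized by service i."""
    return C.get((i + 1, i)) is None

def commute(C, i):
    """Swap rows i and i+1 and, symmetrically, columns i and i+1, so that the
    canonical form is preserved."""
    swap = lambda k: i + 1 if k == i else (i if k == i + 1 else k)
    return {(swap(r), swap(c)): role for (r, c), role in C.items()}
```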
Repeated commutation is possible. If service S in row i commutes with the service in row i − 1, the commutation can be effected, causing S to move one row up, and the service originally in row i − 1, one row down. If S, now in row i − 1, commutes with the service in row i − 2, that commutation can be effected as well, and so on. How high can S go? Since there are no A's above the C in column i of S, all commutations will be legal until the rightmost A in row i, say in column R_i, gets to row R_i + 1 and encounters the C in row R_i of that column. Thus, service S can go upwards as far as row R_i + 1 by repeated commutation. Similarly, service S in row i can commute with the service in row i + 1, then with the service in row i + 2, and so on, until the C in column i of S encounters the uppermost A in that column, say in row U_i, namely all the way down to row U_i − 1. The range (R_i + 1, U_i − 1) is the range of commutation for service S in row i.
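The range of commutation can be read directly off the sparse matrix. A sketch follows, with the convention that a row with no A's can rise to the top and a column with no A's can sink to the bottom; this boundary convention is an assumption, since the text does not spell out those cases.

```python
def commutation_range(C, i, N):
    """Rows reachable by the service in row i through repeated commutation:
    upwards to R_i + 1, where R_i is the column of the rightmost A in row i,
    and downwards to U_i - 1, where U_i is the row of the uppermost A in
    column i (Section 4)."""
    a_cols = [c for (r, c), role in C.items() if r == i and role == "A"]
    a_rows = [r for (r, c), role in C.items() if c == i and role == "A"]
    R = max(a_cols) if a_cols else -1          # no inputs: can rise to the top
    U = min(a_rows) if a_rows else N           # no uses: can sink to the bottom
    return R + 1, U - 1
```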
Repeated commutation of services amounts to a permutation of the rows of C. To preserve the canonical form, a symmetric permutation of the columns must follow. Thus:

C' = P C P^T    (6)

where P is a permutation matrix. The symmetric permutation is also behavior-preserving, and it is a refactoring. SCA can be formally described as a procedure that finds P such that p(C) is minimized. The minimization of p(C) is achieved by systematically examining sets of legal permutations and selecting those that reduce p(C) the most. However, SCA does not guarantee a true minimum. In the process, p(C) decreases smoothly, but individual scopes behave in a complicated way as they get progressively constricted against the constraints imposed by the rules of commutation. The refactoring forces related services and variables to coalesce into highly cohesive, weakly coupled clusters, a phenomenon known as encapsulation. The clusters are recognized because few or no scopes cross intercluster boundaries; they correspond to objects, and the term constriction is intended to convey all these ideas. The original version of the algorithm, known as SCA2, operates as follows:
(1) Select a row i of C in an arbitrary order.

(2) Determine the range of commutation R_i, U_i for the service in that row.

(3) For each k, R_i < k < U_i, calculate p(C_k), where C_k is obtained from C by permuting the service from row i to row k, and select any k that minimizes p.

(4) Permute the service to the selected row.

(5) Repeat (1)-(4) until all rows are exhausted.

(6) Repeat the entire procedure until no more reductions are obtained.
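A compact, brute-force sketch of SCA2 as listed above: it recomputes the profile for every candidate row, which is exactly the behavior whose cost is analyzed next. The move_service helper performs the cyclic row shift together with the symmetric column shift; the names and the sparse-map representation are assumptions.

```python
def profile_of(C, N):
    """Vertical profile p(C), eq. (3), of a canonical matrix of order N."""
    D = list(range(N))
    for (r, c), role in C.items():
        if role == "A" and r > D[c]:
            D[c] = r
    return sum(D[j] - j for j in range(N))

def move_service(C, i, k):
    """Reassign the service in row i to row k (rows in between shift by one),
    with the matching symmetric permutation of the columns."""
    if i == k:
        return C
    def new_pos(x):
        if x == i:
            return k
        if min(i, k) <= x <= max(i, k):
            return x - 1 if k > i else x + 1
        return x
    return {(new_pos(r), new_pos(c)): role for (r, c), role in C.items()}

def sca2(C, N):
    """Original SCA2, steps (1)-(6): move each service to the row within its
    range of commutation that minimizes the profile; repeat until stable."""
    improved = True
    while improved:                                   # step (6)
        improved = False
        for i in range(N):                            # steps (1) and (5)
            a_cols = [c for (r, c), role in C.items() if r == i and role == "A"]
            a_rows = [r for (r, c), role in C.items() if c == i and role == "A"]
            R = max(a_cols) if a_cols else -1         # step (2): range of commutation
            U = min(a_rows) if a_rows else N
            best_k, best_p = i, profile_of(C, N)
            for k in range(R + 1, U):                 # step (3): R_i < k < U_i
                p = profile_of(move_service(C, i, k), N)
                if p < best_p:
                    best_k, best_p = k, p
            if best_k != i:                           # step (4)
                C = move_service(C, i, best_k)
                improved = True
    return C
```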
To calculate the asymptotic complexity of SCA2 we assume that C, being sparse, has a small, fixed number of off-diagonal nonzeros per row. Assuming the roles are indexed by service, the calculation of R_i, U_i requires a small, fixed number of operations per row, or O(N) operations for step (2) in total. The calculation of the profile, eq. (3), requires the calculation of D_j for each column j, which takes a small, fixed number of operations per column, or O(N) in total. In a worst case scenario, the range for k in step (3) may be O(N), so step (3) will require O(N^2) operations per row, or a total of O(N^3) for the entire procedure. The rest of the operations are O(N) or less. Thus, the asymptotic complexity of SCA2 is O(N^3), caused by the repeated calculation of the profile. The new version of SCA differs from SCA2 only in step (3), as follows:
(3) (new version) Calculate ∆_{i,k}(C) for each k, R_i < k < U_i, and select the smallest.

∆_{i,k}(C) is the increment in the value of the profile when the service in row i is reassigned to row k, and can be calculated based on the expression:

\Delta_{i,i+1}(C) = q_i + p_i - q_{i+1}    (7)

Let n_i be the number of terminal A's in row i, m_j be the number of terminal A's in column j (0 or 1), and q_i = n_i − m_i be the excess of terminal A's for row/column i. Also let p_i be the number of terminal pairs in row i. We say that a terminal pair exists in row i, column j when C_{i,j} = A and C_{i+1,j} is a terminal A. Equation (7) follows, and ∆_{i,k} is obtained by repeated application of that equation.
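The quantities in eq. (7) involve only the immediate neighborhood of rows i and i + 1, which is what removes the repeated profile calculation. The sketch below implements the stated definitions of n_i, m_i, q_i and p_i literally (and therefore inefficiently); an indexed implementation would keep these counts up to date incrementally instead of rescanning the matrix.

```python
def delta_next(C, i, N):
    """Profile increment Delta_{i,i+1}(C) of eq. (7) when the services in rows
    i and i+1 of a canonical C are commuted, from the definitions in the text."""
    def is_terminal(r, c):
        # C[r, c] is the terminal A of column c: an A with no A below it
        return (C.get((r, c)) == "A"
                and all(C.get((rr, c)) != "A" for rr in range(r + 1, N)))

    def q(r):
        n_r = sum(is_terminal(r, c) for c in range(N))                  # terminal A's in row r
        m_r = 1 if any(is_terminal(rr, r) for rr in range(N)) else 0    # terminal A in column r
        return n_r - m_r                                                # excess q_r = n_r - m_r

    # terminal pairs in row i: an A at (i, j) sitting on the terminal A at (i+1, j)
    p_i = sum(C.get((i, j)) == "A" and is_terminal(i + 1, j) for j in range(N))
    return q(i) + p_i - q(i + 1)                                        # eq. (7)
```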
Assuming as we did before that the roles are indexed by service, and the services by row and column, the calculation of R_i, U_i, q_i, p_i and ∆_{i,i+1} each takes a small, fixed number of operations, and the calculation of ∆_{i,k} for all k takes O(N) operations. Thus, the new step (3) takes O(N) operations, and the asymptotic complexity of SCA is O(N^2). The improvement in complexity is due to the fact that actual values of the profile are never calculated. The new SCA is a second order algorithm because the neutral subsets are properly taken care of as part of the range of commutation [2].

SCA is a natural MMC algorithm in the sense that it modifies the MMC itself and is universal. As such, and since the MMC is a program [1], SCA can be installed as a part of the MMC itself, making the MMC a dynamical system, a self-refactoring MMC where the energy (Lyapunov) function is the profile p(C) and the attractors are the objects that SCA converges to. Since SCA is behavior-preserving, it can run in the background without affecting the operation of the MMC. The dynamical operation is characterized by two well-differentiated but coexisting processes: (1) new information arrives as the
result of some foreground training process and is appended to C, resulting in a large profile, and (2) SCA minimizes the profile and updates the natural ontology by creating new objects or modifying the existing ones in accordance with the information that arrives. The objects are instated as new categorical variables and operation continues, now in terms of the new objects. Such a system allows higher cognition such as abstraction and generalization capabilities, and is strongly reminiscent of the human mind, particularly if the creation of objects representing the natural ontology is interpreted as "understanding", and the recognition of objects for further processing as "reasoning". These views offer a new interpretation of learning and meaning.
The term "energy" used above refers to resources in general, including not just physical energy but also building materials, or some measure of the physical resources needed to implement the system. When neurons form their axons and dendrites they must maximize information storage but minimize resource allocation [12]. The correspondence between the scopes and the network connections discussed in Section 3 suggests a correspondence between their respective lengths as well, in which case there should be a biological SCA-type process that rewires the network by proximity or migrates the neurons to shorten their connections. Either way, the net result is that neurons close in the logical sequence become also geometrically close, creating an association between function and information similar to an OOP object. These observations are consistent with the minimum wiring hypothesis, as well as with Horace Barlow's efficient coding hypothesis, Drescher's schemas [13], and Gell-Mann's schemata [14]. Similar observations may apply to other biological structures such as organs, or to entire organisms.
In comparison with other algorithms such as MDA, we note that SCA uses no arbitrary parameters, is expandable in the sense that new elements and new classes can be added and the model can grow virtually indefinitely, both in size and refinement, and is biologically plausible because it uses local properties, likely to be available in a cell or an organ. MDA, instead, uses mathematical equations, very unlikely to exist in a biological environment.
Applications for SCA can be found in many domains. An example in theoretical Physics was published [4], where the model consists of 18 simple equations with 30 variables, and SCA constructs an ontology consisting of a 3-level multiple-inheritance hierarchy with 3 objects in the most specialized class, that describes an important law of Physics. Here we consider classification.

For classification, associations must be established between some property of the objects to be classified and a suitable discriminant or classifier. Then, SCA finds patterns and classifies the objects dynamically. For example, if the objects are points in some space, then the discriminant is a mesh of cells of the appropriate dimensionality and desired resolution, points are associated with the cells that contain them, and the resulting classes are clusters of points. If the objects are neurons that fire at different times, the discriminant is a mesh of time intervals, neurons are associated with the time intervals where they fire, and the classes would be neural cliques [11]. Table 2 summarizes these observations.

Table 2. Parameters used for SCA classification.
objects   property       discriminant    class
points    position       mesh of cells   cluster of points
neurons   firing event   time mesh       neural clique

Our classification example involves a set of 167 points defined by their coordinates in some space. In the example, the space is two-dimensional, but the number of dimensions is irrelevant. In Figure 1, the points are at the center of the symbols. The discriminant consists of 4 overlapping meshes, simulating the continuity of the space. The basic mesh consists of cells of size 1 × 1, and 3 more meshes are superposed with relative shifts of (0.5, 0), (0, 0.5), and (0.5, 0.5), respectively. The resulting matrix of services C is of order 1433, and is already in canonical form.
Figure 1. The set of points for the example. The given points are at the center of the symbols; the symbols indicate the resulting classes.
The initial 167 services initialize the points (assuming each service knows where to initialize them from, which is irrelevant for our purpose). The next 345 services initialize all the necessary cells. The last 921 services establish the point/cell associations. Each service takes one point and one cell as arguments (indicated with an "A" in that row and the corresponding columns), and initializes one association (a "C" in that association's column). The initial profile is 299,565, and the data channel's average width is 209.1 and maximum width is 1266. SCA converges in two passes, leaving a final profile of 15,642 and a data channel with an average width of only 10.9 and a maximum width of 705. The points are classified into three clusters, as indicated by the symbols in Figure 1. The ontology for this system consists of just one class with three objects, the clusters.
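To make the construction of this example concrete, the sketch below generates the three groups of services (points, cells, associations) for an arbitrary point set, using four unit meshes shifted as described. It is only an illustration of the pattern: the exact rule that yields the 345 cells and 921 associations quoted above (for instance, how boundary points or neighboring cells are treated) is not spelled out in the text, so the counts produced here need not match.

```python
import math

def classification_services(points, shifts=((0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5))):
    """Services for the classification example: point initializers, cell
    initializers for the four overlapping 1x1 meshes, then one association
    service per (point, containing cell) pair.  Names are illustrative."""
    def cell_of(x, y, dx, dy):
        # the 1x1 cell of the mesh shifted by (dx, dy) that contains (x, y)
        return (math.floor(x - dx), math.floor(y - dy), dx, dy)

    pairs = [(p, cell_of(x, y, dx, dy))
             for p, (x, y) in enumerate(points) for dx, dy in shifts]
    cells = sorted(set(c for _, c in pairs))

    services = [(f"init P{p}", [], [f"P{p}"]) for p in range(len(points))]
    services += [(f"init cell {c}", [], [f"cell {c}"]) for c in cells]
    services += [(f"assoc {p}/{c}", [f"P{p}", f"cell {c}"], [f"assoc {p}/{c}"])
                 for p, c in pairs]
    return services       # each tuple: (service, argument variables, codomain)
```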
5 Conclusions

The MMC is a form of mathematical notation designed to express our knowledge about a domain. Any ANN can be represented as an MMC, and ANN/MMC combinations are also possible. The models are formal, have a hierarchical but flexible organization, and are machine-interpretable. Algorithms can be designed to induce transformations, supported by a rich algebra of operations. All modes of training are inherited. In addition, ANN/MMC models can be directly constructed from existing ontologies such as business models, computer programs or scientific theories.
We believe that the MMC offers an excellent opportunity for creating realistic models of the brain and nervous system, particularly when used in combination with traditional ANNs. A model can consist of many submodels representing different subsystems and having different degrees of detail, depending on the extent of the knowledge that is available or of interest for each subsystem. It is possible to start small and then grow virtually indefinitely, or to add fine detail to a particular submodel of interest, while still retaining interoperability. Dynamic, self-organizing submodels will find their own natural ontologies, which can then be compared with observation, an approach that is radically different from the more traditional static man-made ontologies, and has remarkable similarities with human and animal learning. The MMC offers a framework for constructing, combining, sharing, transforming and verifying ontologies.
We conclude that the MMC can serve as an effective tool for neural modeling. But above all, the MMC will serve as a unifying notion for complex systems, by bringing unity to disconnected fields, organizing information, and providing convergence of concepts and interoperability to tools and algorithms.
References
[1] Sergio Pissanetzky. "A relational virtual machine for program evolution." Proc. 2007 Int. Conf. on Software Engineering Research and Practice, Las Vegas, NV, USA, pp. 144-150, June 2007. In this publication, the Matrix Model of Computation was introduced under the name Relational Model of Computation, but was later renamed because of a name conflict.

[2] Sergio Pissanetzky. "The Matrix Model of Computation." Proc. 12th World Multi-Conference on Systemics, Cybernetics and Informatics: WMSCI '08, Orlando, Florida, USA, June 29 - July 2, 2008.

[3] B. Chandrasekaran, J. R. Josephson, and V. R. Benjamins. "What are ontologies, and why do we need them?" IEEE Intelligent Systems, Vol. 14(1), pp. 20-26 (1999).

[4] Sergio Pissanetzky. "Applications of the Matrix Model of Computation." Proc. 12th World Multi-Conference on Systemics, Cybernetics and Informatics: WMSCI '08, Orlando, Florida, USA, June 29 - July 2, 2008.

[5] Jifeng He, C. A. R. Hoare, and Jeff W. Sanders. "Data refinement refined." Lecture Notes in Computer Science, Vol. 213, pp. 187-196 (1986).

[6] Jeff W. Sanders. "Computations and Relational Bundles." Lecture Notes in Computer Science, Vol. 4136, pp. 30-62 (2006).

[7] Sergio Pissanetzky. Sparse Matrix Technology. Academic Press, London, 1984. Russian translation: MIR, Moscow, 1988.

[8] J. P. Neto. "A Virtual Machine for Neural Computers." S. Kollias et al. (Eds.), ICANN 2006, Part I, LNCS 4131, pp. 525-534, 2006.

[9] W. Uwents, G. Monfardini, H. Blockeel, F. Scarselli, and M. Gori. "Two connectionist models for graph processing: an experimental comparison on relational data." Mining and Learning with Graphs Workshop (MLG 2006), ECML/PKDD, Berlin (2006).

[10] S. R. Chidamber and C. F. Kemerer. "A metrics suite for object oriented design." IEEE Trans. on Software Engng., Vol. 22, pp. 476-493 (1994).

[11] L. Lin, R. Osan, and J. Z. Tsien. "Organizing principles of real-time memory encoding: neural clique assemblies and universal neural codes." Trends in Neurosciences, Vol. 29, No. 1, pp. 48-57 (2006).

[12] D. B. Chklovskii, B. W. Mel, and K. Svoboda. "Cortical rewiring and information storage." Nature, Vol. 431, pp. 782-788 (2004).

[13] G. Drescher. Made-up Minds. MIT Press, Cambridge, MA (1991).

[14] M. Gell-Mann. The Quark and the Jaguar. W. H. Freeman and Co., New York (1994).
Acknowledgements. To Dr. Peter Thieberger (BNL, NY) for his generous and unrelenting support, without which this might not have happened.