This dissertation addresses human activity recognition, especially human-human interactions in realistic video material such as movies and surveillance videos. Besides, we believe that o
Independent Subspace Analysis (ISA) for image data
Definition of the ISA and its algorithm
Independent Component Analysis (ICA) [37] is a statistical model defined by a linear transformation of latent independent variables. In particular, let $x'$ denote the grey-scale values in a small image patch. The ICA model expresses $x'$ as a linear superposition of some features $A$:

$$x' = As \qquad (4.1)$$

where $s$ is a vector whose elements are the components (or coefficients). Note that $s$ differs from patch to patch, whereas the matrix $A$ is the same for all patches.
The basic assumption in the ICA model is that the components $s$ are non-Gaussian and statistically independent. Given a sufficient number of observations of image patches, the problem is then to estimate $A$ without knowing the values of the latent components $s$. This problem is restricted to the basic case where $A$ is an invertible matrix. Hence, estimating $A$ in Eq. (4.1) is equivalent to determining $W$ in Eq. (4.2):

$$s = Wx' \qquad (4.2)$$

where $W$ is obtained by inverting the matrix $A$.
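The relation between Eq. (4.1) and Eq. (4.2) can be illustrated with a minimal NumPy sketch. The patch dimension, the number of patches, the Laplace distribution of the components, and the random matrix `A` are all assumptions chosen for illustration. The sketch only checks that $W = A^{-1}$ recovers the components when $A$ is known; it does not perform ICA estimation, where $A$ is unknown.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4            # patch dimension (hypothetical, e.g. a flattened 2x2 patch)
num_patches = 5  # number of observed patches (hypothetical)

# A: feature matrix shared by all patches; a random square matrix stands in
# for it here and is assumed to be invertible.
A = rng.normal(size=(n, n))

# S: one column of components per patch, drawn independently from a
# non-Gaussian (Laplace) distribution as an illustrative choice.
S = rng.laplace(size=(n, num_patches))

# Eq. (4.1): every patch is a linear superposition of the columns of A.
X = A @ S

# Eq. (4.2): with W = A^{-1}, the components follow from the patches.
W = np.linalg.inv(A)
S_recovered = W @ X

print(np.allclose(S_recovered, S))
```

In practice only `X` is observed, so `W` has to be estimated from the statistics of the patches rather than computed by inversion.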
Independent Subspace Analysis (ISA) [35] is a generalization of the basic ICA and has the same model as in Eq. (4.1). In contrast to the ICA, the components $s$ are not all assumed to be statistically independent. In the ISA model, $s$ can be divided into couples, triplets, or in general $\kappa$-tuples, where $\kappa$ is the dimension of the subspace. The ISA model assumes that the components inside a given $\kappa$-tuple may be dependent on each other, but dependencies among different $\kappa$-tuples are not allowed.
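The dependence structure described above can be made concrete with a small simulation. The construction below is an assumption made for illustration and is not taken from the text: the components of one tuple share a random scale factor, which makes their energies (squared values) correlated, while different tuples use independent scale factors. The names `tuple_size`, `num_tuples`, and the uniform scale distribution are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

tuple_size = 2         # dimension of each subspace (hypothetical value)
num_tuples = 2         # number of mutually independent tuples
num_samples = 100_000  # number of simulated component vectors

# One random scale per tuple and per sample (illustrative distribution).
scale = rng.uniform(0.2, 2.0, size=(num_tuples, num_samples))

# Components of the same tuple share the scale; the Gaussian factors
# are independent across all components.
gauss = rng.normal(size=(num_tuples * tuple_size, num_samples))
S = np.repeat(scale, tuple_size, axis=0) * gauss

# Correlation matrix of the squared components (rows are components).
energy_corr = np.corrcoef(S ** 2)

within = energy_corr[0, 1]  # components 0 and 1 belong to the same tuple
across = energy_corr[0, 2]  # components 0 and 2 belong to different tuples
print(f"within-tuple energy correlation: {within:.3f}")
print(f"across-tuple energy correlation: {across:.3f}")
```

Under these assumptions the within-tuple energy correlation is clearly positive, while the across-tuple correlation stays close to zero, which is the pattern the ISA model permits.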
Figure 4.1 represents the ISA as a two-layer network, where the elements of the matrix $W$ in Eq. (4.2) are the weights of the first layer. In this figure, the dimension of the subspace is 2 ($\kappa = 2$). The objective of the ISA is to learn the weights $W$, while the weights $V$ in the second layer are fixed to represent the subspace structure of the units in the first layer.
Let $x' \in \mathbb{R}^{n \times 1}$ again denote the input patch. The response of the $l$-th unit in the first layer is defined by Eq. (4.3):

$$e_l = \Big(\sum_{j=1}^{n} W_{lj}\, x'_j\Big)^2 \qquad (4.3)$$

where $W \in \mathbb{R}^{k \times n}$ is the matrix of connection weights of the first layer, and $n$ and $k$ are the input dimension and the number of units in the first layer, respectively.
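As a small worked example of Eq. (4.3), the first-layer responses are the squared outputs of the linear filters stored in the rows of $W$. The function name `first_layer_responses` and the numerical values of `W` and `x` below are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

def first_layer_responses(W, x):
    """Eq. (4.3): e_l = (sum_j W[l, j] * x[j]) ** 2 for every unit l."""
    return (W @ x) ** 2

# Hypothetical weights for k = 2 units and an input of dimension n = 3.
W = np.array([[1.0, 0.0, -1.0],
              [0.5, 0.5, 0.5]])
x = np.array([2.0, 1.0, 4.0])

# Linear outputs are -2 and 3.5, so the responses are 4 and 12.25.
e = first_layer_responses(W, x)
print(e)
```

Because of the squaring, each response is non-negative and does not depend on the sign of the filter output.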
As illustrated in Figure 4.1, each unit of the second layer pools over a small neighborhood of adjacent first-layer units. Hence, the response of each second-layer unit is defined by Eq. (4.4):

$$f_i(x'; W, V) = \sqrt{\sum_{l=1}^{k} V_{il}\, e_l} = \sqrt{\sum_{l=1}^{k} V_{il} \Big(\sum_{j=1}^{n} W_{lj}\, x'_j\Big)^2} \qquad (4.4)$$
Figure 4.1: The neural network architecture of an ISA network. The blue and red bubbles represent units in the first and second layer, respectively. In this figure, the dimension of the subspace is 2: each red bubble looks at 2 blue bubbles.

where $V \in \mathbb{R}^{m \times k}$ is the matrix of weights connecting units of the first layer to units of the second layer, and $m$ is the number of units in the second layer. The matrix $V$ represents the subspace structure of the units in the first layer, and is defined by Eq. (4.5):
1, if(@-De+1