
Procedia Computer Science, Volume 101, 2016, Pages 187–196

YSC 2016. 5th International Young Scientist Conference on Computational Science

Peer-review under responsibility of organizing committee of the scientific committee of the 5th International Young Scientist Conference on Computational Science

© 2016 The Authors. Published by Elsevier B.V.

On the applicability of spiking neural network models to solve the task of recognizing gender hidden in texts

Alexander Sboev1,2,3,4,5, Tatiana Litvinova2, Danila Vlasov1,4, Alexey Serenko2, and Ivan Moloshnikov2,4

1 MEPhI National Research Nuclear University, Moscow, Russia

2 National Research Center Kurchatov Institute, Moscow, Russia

3 Plekhanov Russian University of Economics, Moscow, Russia

4 JSC “Concern ‘Systemprom’ ”, Moscow, Russia

5 Moscow Technological University (MIREA), Moscow, Russia

Sboev AG@nrcki.ru

Abstract

Two approaches to using spiking neural networks, which are suitable for implementation in neuromorphic hardware with ultra-low power consumption, are analyzed for the task of recognizing the gender of a text's author. The first is to obtain synaptic weights for the spiking network by training a formal network. We show the results obtained with this approach. The second is to create a supervised learning algorithm for spiking networks based on biologically plausible plasticity rules. We discuss possible ways to construct such algorithms.

Keywords: supervised learning, spike-timing-dependent plasticity, artificial neural networks, spiking neural networks

Introduction

Over the last few years, interest in spiking neural networks has grown considerably as a result of the appearance of neuromorphic hardware capable of running such networks. This, in turn, makes it necessary to develop approaches that can be implemented on such hardware to solve practical tasks. Because hardware with ultra-low power consumption makes it possible to solve these tasks on autonomous devices, the problem of learning in spiking neural networks becomes particularly relevant. Predicting the gender of a text's author from linguistic parameters, a task that could be carried out on such devices, is important, in particular, for security and conversational applications.

There are generally two approaches to using spiking networks in a classification task. Since learning algorithms for artificial networks are better developed than those for spiking networks, the direct approach is to convert a trained formal network into a spiking one. In [1] each formal neuron is replaced with several spiking ones. Together with the encoding and decoding machinery, they reproduce its activation function. Alternatively, one can simply transfer synaptic weights from a trained formal network to a spiking network of the same topology [2]. We show in Section 1.2 that after such a transfer the spiking network achieves higher accuracy than the formal one in Fisher's Iris classification task, and in Section 1.3 we apply this approach to the gender recognition task.

Figure 1: Algorithm steps

Another approach is to implement learning in spiking neural networks using biologically inspired learning rules. A number of synaptic plasticity models suitable for supervised learning have been published [3, 4, 5], but none is yet based solely on current knowledge of the operating rules of biological neural systems, namely on the Hebb principle. As the biologically plausible long-term plasticity model we consider spike-timing-dependent plasticity (STDP) [6]. It was shown in [7] to be suitable for unsupervised learning, but a supervised learning protocol based on it has not yet been developed. In Section 2.2 we demonstrate STDP parameters that make it possible to obtain several different synaptic weight distributions. In Section 2.3 we show that any desired weight value can be reached given a proper value of correlation between the input and output spike sequences. Based on this fact, in Section 2.4 we suggest a supervised learning algorithm suitable for the classification of rate-coded binary vectors.

1 ANN to SNN mapping approach

Following [2], we used a combined learning algorithm involving artificial (ANN) and spiking (SNN) neural networks. It consists of the following steps (Fig. 1):

1. Training the artificial neural network using backpropagation. The neurons' activation function was ReLU for hidden layers and Softmax for the output layer. Neuron biases were set to zero. Input data was normalized so that the L2 norm of each vector was 1.

2. Transferring the synaptic weights to the spiking neural network. The integrate-and-fire neuron model was chosen, in which the membrane potential $V$ obeys

$$\frac{dV}{dt} = \sum_i \sum_{s \in S_i} w_i \, \delta(t - s),$$

where $S_i$ is the sequence of spikes (spike train) on the $i$-th input synapse and $w_i$ is the synaptic weight. Whenever the potential exceeds the threshold $\Theta$, it is reset to zero and the neuron fires a spike.

3. Encoding input data into spike trains. Each input vector component $x$ was encoded by a Poisson spike train with mean frequency $x \cdot \nu_{\max}$.

4. Optimizing the spiking network parameters. Besides $\nu_{\max}$ and $\Theta$, the simulation time $T$ and the simulation step $\Delta t$ were adjusted. According to [2],

• the simulation time should be long enough to eliminate the probabilistic influence of the spike trains;

• correct classification is impossible if it requires a neuron to fire several spikes in one simulation step. So, the total input a neuron receives during one simulation step must not exceed the threshold. This condition is confidently fulfilled if

$$\nu_{\max} \cdot \Delta t \cdot \sum_i w_i \le \Theta. \qquad (1)$$

To fulfill (1), all spiking neural network weights are divided by the normalization factor $M$, which is the same for all neurons in a layer but unique for each layer:

$$M = \frac{1}{\Theta} \max_j \Bigl( \sum_i w_{ij} \Bigr), \qquad (2)$$

where $w_{ij}$ is the weight of the $i$-th synapse of the $j$-th neuron in the current layer. The conditions above are necessary but not sufficient, so achieving maximal classification accuracy still requires adjusting $\nu_{\max}$ and $\Theta$.
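The four steps above can be sketched in NumPy as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the discrete-time Poisson approximation (at most one input spike per synapse per step), and the layer-by-layer simulation loop are all choices made here for illustration.

```python
import numpy as np

def normalization_factor(w, theta):
    """Eq. (2): M = (1 / theta) * max_j sum_i w[i, j].
    w has shape (n_inputs, n_neurons): column j holds the synapses of neuron j."""
    return np.max(np.sum(w, axis=0)) / theta

def poisson_spikes(x, nu_max, dt, n_steps, rng):
    """Step 3: component x_i spikes in each step with probability x_i * nu_max * dt
    (discrete-time approximation of a Poisson train, an assumption of this sketch)."""
    p = np.clip(x * nu_max * dt, 0.0, 1.0)
    return rng.random((n_steps, x.size)) < p

def run_if_network(x, weights, theta, nu_max, dt, t_sim, rng):
    """Steps 2-4: feed Poisson-encoded input through layers of integrate-and-fire
    neurons and return the number of spikes fired by each output neuron."""
    n_steps = int(round(t_sim / dt))
    spikes = poisson_spikes(x, nu_max, dt, n_steps, rng)
    for w in weights:
        v = np.zeros(w.shape[1])                       # membrane potentials
        out = np.zeros((n_steps, w.shape[1]), dtype=bool)
        for t in range(n_steps):
            v += spikes[t].astype(float) @ w           # each input spike adds its weight
            fired = v > theta                          # potential exceeds the threshold
            v[fired] = 0.0                             # reset to zero after a spike
            out[t] = fired
        spikes = out                                   # output of this layer feeds the next
    return spikes.sum(axis=0)
```

With hypothetical trained weight matrices `raw_weights`, normalization according to (2) would read `weights = [w / normalization_factor(w, theta) for w in raw_weights]`, and the predicted class would be `np.argmax(run_if_network(x, weights, theta, nu_max, dt, t_sim, rng))`, that is, the output neuron that fired the most spikes.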

1.2 Fisher’s Iris classification

To test the algorithm described above, we solved the popular toy task of Fisher's Iris classification. The network had 4 neurons in the input layer, 4 neurons in the single hidden layer, and 3 neurons in the output layer. Spiking network weights were normalized according to (2). Each input vector was presented for 10 s. The classification result was determined by the output neuron that fired the most spikes during the simulation.

1.2.1 Results

The mean classification error (the ratio of wrongly classified input samples to the total number of samples) of the ReLU network was 0.04 ± 0.01 on the training set and 0.06 ± 0.04 on the test set, averaged over 20 random splits into training and testing sets. The spiking network can achieve higher accuracy than the ReLU one (Fig. 2): with adjusted $\Theta$ and $\nu_{\max}$ the error is reduced to 0.04 ± 0.01. The higher $\Theta$ is, the higher the accuracy, because the neuron has to integrate more input spikes before it fires an output spike.

RusPersonality [8] is the first corpus of Russian-language texts labeled with data on their authors. This free-to-use corpus contains over 1,850 documents, 230 words per document on average, from 1,145 respondents, and is currently expanding. A unique aspect of our corpus is the breadth of the metadata (gender, age, personality, neuropsychological testing data, education level, etc.). Another advantage is that, in contrast to the common approach of retrieving texts from social networks, all our samples were written especially for this corpus. Therefore they do not contain any borrowings or citations. All respondents were given a few themes to write about, the same for male and female participants. This, along with the large number of participants, makes it possible to focus on the peculiarities caused by demographic characteristics of the authors (gender, in the case of the current paper) rather than by their individual styles.

Figure 2: Fisher's Iris classification error of the spiking network on the test set, divided by the error of the ReLU network, as a function of the maximum input frequency $\nu_{\max}$ for different neuron thresholds $\Theta$. The error is averaged over 20 random splits into training and testing sets, and then over 5 independent realizations of the input spike trains. For clarity, deviation bars are not shown for every point.

As the input data for gender prediction, the following set of context-independent features was used to describe a text:

• Morphological features: the number of nouns, numerals, adjectives, prepositions, verbs, pronouns, interjections, articles, conjunctions, participles, infinitives, and finite verbs.

• Syntactic parameters: syntactic relations of different types.

• Derivative coefficients, which are different ratios of parts of speech (Trager index, dynamics coefficient, etc.).

• The number of exclamation marks, question marks, dots, and emoticons.

• The number of words pertaining to a particular “Emotion” group, e.g., “Anxiety” or “Discontent”, 37 categories in total.
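Of the feature groups listed above, only the punctuation counts can be computed without a morphological or syntactic parser. A minimal sketch of that group might look as follows; the function name, the dictionary keys, and in particular the emoticon pattern are assumptions made for illustration, since the text does not specify how emoticons were detected.

```python
import re

# Hypothetical, deliberately simple emoticon pattern such as :) ;-( =D :P
EMOTICON_RE = re.compile(r"[:;=]-?[)(DPp]")

def punctuation_features(text):
    """Count the punctuation-based features of a text."""
    return {
        "exclamation_marks": text.count("!"),
        "question_marks": text.count("?"),
        "dots": text.count("."),
        "emoticons": len(EMOTICON_RE.findall(text)),
    }
```

The remaining groups (parts of speech, syntactic relations, emotion vocabularies) would require language-specific tools and dictionaries that are outside the scope of this sketch.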

The highest gender classification accuracy obtained on our corpus is 0.86 ± 0.05 [9], achieved with a sophisticated combination of learning algorithms. However, we are currently interested in the difference in accuracy between the spiking network and the ReLU network rather than in the absolute accuracy values.

The training set contained 364 texts and the testing set 187. Network topology: 141 input neurons, 81 neurons in the first hidden layer, 19 neurons in the second hidden layer, and 2 neurons in the output layer. Weight mapping was performed both with and without normalization (2).
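To make the topology concrete, the forward pass of the formal network (ReLU hidden layers, Softmax output, zero biases, L2-normalized input) can be sketched as below. The weights here are random placeholders rather than trained values, and the function names are assumptions of this sketch.

```python
import numpy as np

def l2_normalize(x):
    """Scale the input vector so that its L2 norm equals 1."""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

def relu_softmax_forward(x, weights):
    """Forward pass with zero biases: ReLU on hidden layers, Softmax on the output."""
    a = l2_normalize(x)
    for w in weights[:-1]:
        a = np.maximum(a @ w, 0.0)      # ReLU hidden layer
    z = a @ weights[-1]
    z = z - z.max()                     # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

layer_sizes = [141, 81, 19, 2]          # topology quoted in the text
rng = np.random.default_rng(0)
# Random placeholder weights; a real run would use weights trained by backpropagation.
weights = [rng.normal(0.0, 0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
probs = relu_softmax_forward(rng.random(141), weights)
n_weights = sum(w.size for w in weights)
```

With this topology the network has 141·81 + 81·19 + 19·2 = 12,998 synaptic weights to be transferred to the spiking network.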

Figure 3: The dependence of the gender recognition error on the maximum input frequency $\nu_{\max}$ for different neuron thresholds $\Theta$, and also with weight normalization, in which case $\Theta$ was equal to 1.

1.3.1 Results

The classification error of the ReLU neural network was 0.22 on the testing set. The mean classification error of the spiking neural network on the test set for different $\Theta$ and $\nu_{\max}$, without normalization (2) and with normalization, is shown in Fig. 3. Again, as in the Iris classification task, the best accuracy was obtained at high input frequencies and thresholds. The lowest error is 0.22, indicating that no losses took place during mapping.

2 The principal possibility of applying Spike-Timing-Dependent Plasticity to the gender recognition task

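In the Spike-Timing-Dependent Plasticity model, the strength of each synapse is described by a weight $0 \le w \le w_{\max}$, whose change depends on the exact moments $t_{\mathrm{pre}}$ of presynaptic spikes and $t_{\mathrm{post}}$ of postsynaptic spikes:

$$\Delta w = \begin{cases} -W_- \cdot \left(\dfrac{w}{w_{\max}}\right)^{\mu_-} \cdot \exp\left(-\dfrac{t_{\mathrm{pre}} - t_{\mathrm{post}}}{\tau_-}\right), & \text{if } t_{\mathrm{pre}} - t_{\mathrm{post}} > 0; \\[2ex] W_+ \cdot \left(1 - \dfrac{w}{w_{\max}}\right)^{\mu_+} \cdot \exp\left(-\dfrac{t_{\mathrm{post}} - t_{\mathrm{pre}}}{\tau_+}\right), & \text{if } t_{\mathrm{pre}} - t_{\mathrm{post}} < 0, \end{cases} \qquad (3)$$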

where $W_+ = 0.03$, $W_- = 1.035 \cdot W_+$, and $\tau_+ = \tau_- = \tau_{\mathrm{corr}} = 20$ ms. The rule with $\mu_+ = \mu_- = 0$ is called additive STDP and the rule with $\mu_+ = \mu_- = 1$ multiplicative; intermediate values $0 \le \mu \le 1$ are also possible.

In the case of additive STDP, an additional constraint is needed to prevent the weight from falling below zero or exceeding the maximum value $w_{\max} = 1$:

if $w + \Delta w > w_{\max}$, then $\Delta w = w_{\max} - w$; if $w + \Delta w < 0$, then $\Delta w = -w$.

An important part of the STDP rule is the scheme for pairing pre- and postsynaptic spikes when evaluating the weight change according to rule (3). Besides the all-to-all scheme, there exist several nearest-neighbour ones [6]. We used the restricted symmetric scheme (Fig. 4), in which a presynaptic spike is paired with the last preceding postsynaptic spike, and vice versa, but a spike can participate neither in two depression pairs nor in two potentiation pairs.

Figure 4: The restricted symmetric spike pairing scheme. Ticks denote spikes, and a gray line means that the pair of spikes is taken into account in the STDP weight change rule: potentiation in the pre-before-post case and depression in the post-before-pre case.
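The weight change for a single pre/post spike pair, including the clipping constraint, can be sketched as a small Python function. The constant and function names are assumptions of this sketch, times are in milliseconds, and the selection of which spikes form a pair (the restricted symmetric scheme) is deliberately left out.

```python
import math

W_PLUS = 0.03                  # potentiation amplitude W+
W_MINUS = 1.035 * W_PLUS       # depression amplitude W-
TAU_PLUS = TAU_MINUS = 20.0    # STDP time constants, ms
W_MAX = 1.0                    # maximum weight

def stdp_delta_w(w, t_pre, t_post, mu_plus=0.0, mu_minus=0.0):
    """Weight change for one spike pair; mu = 0 is additive, mu = 1 multiplicative."""
    dt = t_pre - t_post
    if dt > 0:      # post before pre: depression
        dw = -W_MINUS * (w / W_MAX) ** mu_minus * math.exp(-dt / TAU_MINUS)
    elif dt < 0:    # pre before post: potentiation
        dw = W_PLUS * (1.0 - w / W_MAX) ** mu_plus * math.exp(dt / TAU_PLUS)
    else:           # simultaneous spikes are not covered by the rule; no change assumed
        dw = 0.0
    # Keep the weight inside [0, W_MAX].
    if w + dw > W_MAX:
        dw = W_MAX - w
    elif w + dw < 0.0:
        dw = -w
    return dw
```

For instance, `stdp_delta_w(0.5, t_pre=0.0, t_post=10.0)` is a pre-before-post pair and yields a positive change, while swapping the two times yields a slightly larger negative one because $W_-$ exceeds $W_+$.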

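As the neuron model we used Leaky Integrate-and-Fire, in which the membrane potential dynamics is

$$\frac{dV}{dt} = -\frac{V(t) - V_{\mathrm{resting}}}{\tau_m} + \frac{I_{\mathrm{syn}}(t)}{C_m} + \frac{I_{\mathrm{ext}}}{C_m};$$

when $V \ge V_{\mathrm{th}} = -54$ mV, $V \to V_{\mathrm{resting}} = -70$ mV, and during the refractory period $\tau_{\mathrm{ref}} = 3$ ms the neuron is insensitive to the synaptic input. The membrane capacity is $C_m = 300$ pF and the membrane leakage time constant is $\tau_m = 10$ ms. The postsynaptic current is of exponential form: a presynaptic spike from synapse $i$ at time $t_{\mathrm{sp}}$ adds

$$\frac{w_i(t_{\mathrm{sp}}) \, q_{\mathrm{syn}}}{\tau_{\mathrm{syn}}} \, e^{-\frac{t - t_{\mathrm{sp}}}{\tau_{\mathrm{syn}}}} \, \Theta(t - t_{\mathrm{sp}})$$

to $I_{\mathrm{syn}}$, where $q_{\mathrm{syn}} = 0.75$ nC, $\tau_{\mathrm{syn}} = 5$ ms, $w_i$ is the synaptic weight, and $\Theta(t)$ is the Heaviside step function.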
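The neuron model can be illustrated with a forward-Euler simulation. This is a sketch under assumptions rather than the simulator the authors used: the integration scheme, the 0.1 ms step, the function name, and the choice to hold the potential at rest during the refractory period are all made here for illustration. Constants are taken from the text and expressed in SI units.

```python
import numpy as np

C_M = 300e-12      # membrane capacity, F
TAU_M = 10e-3      # membrane leakage time constant, s
V_REST = -70e-3    # resting (and reset) potential, V
V_TH = -54e-3      # firing threshold, V
TAU_REF = 3e-3     # refractory period, s
TAU_SYN = 5e-3     # synaptic current decay constant, s
Q_SYN = 0.75e-9    # synaptic charge as quoted in the text, C

def simulate_lif(pre_spikes, weights, i_ext=0.0, dt=1e-4, q_syn=Q_SYN):
    """pre_spikes: bool array (n_steps, n_synapses); weights: float array (n_synapses,).
    Returns a bool array marking the steps at which the neuron fired."""
    n_steps = pre_spikes.shape[0]
    ref_steps = int(round(TAU_REF / dt))
    decay = np.exp(-dt / TAU_SYN)
    v, i_syn, ref_left = V_REST, 0.0, 0
    out = np.zeros(n_steps, dtype=bool)
    for k in range(n_steps):
        # Exponential synaptic current: decays, then jumps by w_i * q_syn / tau_syn per spike.
        i_syn = i_syn * decay + q_syn / TAU_SYN * weights[pre_spikes[k]].sum()
        if ref_left > 0:            # potential held at rest while refractory (assumption)
            ref_left -= 1
            continue
        v += dt * (-(v - V_REST) / TAU_M + (i_syn + i_ext) / C_M)
        if v >= V_TH:
            out[k] = True
            v = V_REST
            ref_left = ref_steps
    return out
```

With no synaptic input, a constant current below $C_m (V_{\mathrm{th}} - V_{\mathrm{resting}}) / \tau_m = 480$ pA never brings the potential to threshold in this sketch, whereas a larger current produces regular firing at a rate bounded by $1/\tau_{\mathrm{ref}}$.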

2.2 The possibility of reaching non-bimodal weight distributions by non-additive STDP

In the case of additive STDP, only 0 and $w_{\max} = 1$ are stable weight values. Non-additive STDP makes a wider range of weight distributions reachable.

To investigate the ability of the weights to converge to a target, we used the protocol of [10]:

1. Preliminarily, the output train of the neuron with the target weights and without STDP is recorded. It is then treated as the desired output.

2. Then the neuron, now with STDP turned on, receives the same input trains and is forced to fire spikes at the desired moments by stimulating it with current impulses.

Fig. 5 shows two examples of target weight distributions that can be reached during learning with the parameters that we found, $\mu_+ = 0.06$ and $\mu_- = 0.01$.

Figure 5: Target synaptic weights and the weights reached after applying the protocol described in Section 2.2. In the left plot the target weights are all equal to 0.5, and in the right plot the target weights are distributed uniformly between 0 and 1.

2.3.1 The correlation measure

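The direction of the average weight change is determined by the amount of correlation between the input and output spike trains. Defining the normed cross-correlation function as

$$\Gamma(\Delta t) = \frac{1}{\sum_k S_{\mathrm{pre}}(k \cdot t_{\mathrm{bin}}) \cdot \sum_k S_{\mathrm{post}}(k \cdot t_{\mathrm{bin}})} \sum_k S_{\mathrm{pre}}(k \cdot t_{\mathrm{bin}}) \, S_{\mathrm{post}}(k \cdot t_{\mathrm{bin}} + \Delta t),$$

where $S_{\mathrm{pre/post}}(t)$ indicates a pre/postsynaptic spike, respectively, at time $t$, and $t_{\mathrm{bin}}$ is the simulation step,

$$I = \sum_{\Delta t = 0}^{\tau_{\mathrm{corr}}} \Gamma(\Delta t)$$

can be used as a rough correlation indicator, where $\tau_{\mathrm{corr}}$ is the STDP time window constant.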
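A possible NumPy reading of this correlation indicator is sketched below for spike trains stored as boolean arrays with one entry per simulation step. The function names are assumptions, and so is the normalization: the coincidence count is divided by the product of the two spike counts, which is how the formula reads in the available text.

```python
import numpy as np

def cross_correlation(s_pre, s_post, lag):
    """Gamma at a lag of `lag` bins: coincidences of a presynaptic spike in bin k
    with a postsynaptic spike in bin k + lag, normalized by the product of spike counts."""
    n = len(s_pre)
    coincidences = np.sum(s_pre[: n - lag] & s_post[lag:])
    norm = s_pre.sum() * s_post.sum()
    return coincidences / norm if norm > 0 else 0.0

def correlation_indicator(s_pre, s_post, t_bin, tau_corr):
    """I: sum of Gamma over lags from 0 up to the STDP window tau_corr."""
    max_lag = int(round(tau_corr / t_bin))
    return sum(cross_correlation(s_pre, s_post, lag) for lag in range(max_lag + 1))
```

Only non-negative lags (post after pre) are summed, so the indicator grows when output spikes tend to follow input spikes within the STDP window.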

2.3.2 Results

Here we artificially generated input and output spike trains with different values of correlation. When STDP is applied to these trains, the weights reach an equilibrium state (Fig. 6A). The obtained weights, in turn, reproduce an output signal with the same level of correlation with the input as the initial artificial signal (see Fig. 6B). STDP was non-additive, with the parameters as in Section 2.2.

So, any desired weight value can be reached by making the neuron generate output with the proper amount of correlation with the corresponding input. Based on this fact, we suggest the following protocol of supervised learning.

Figure 6: Results of applying STDP to artificially generated input and output spike trains. A: the dynamics of the Euclidean norm of the weight vector during weight convergence. B: the dependence of the correlation indicator $I$ for artificially constructed input and output trains on the weights obtained as the result of STDP learning on these trains (“artificial output”), and the correlation of the output that the neuron produces with the established weight (“neuron output”).

As the input data we used 10-dimensional binary vectors, with half of the components equal to 0 and the other half equal to 1. Each vector component of 1 was encoded by 10 synapses of the neuron receiving independent Poisson trains with a mean frequency of 30 Hz, and each component of 0 by 10 independent 2-Hz trains. Let each vector belong to one of two classes: $C^+$, in response to which a high output frequency is expected as proper classification, and $C^-$, vectors from which should produce a low mean output frequency.

Our model consisted of a single neuron with 100 incoming synapses, all excitatory. STDP was additive, with $W_+ = 0.01$ and $W_- = 1.035 \cdot W_+$. Initially all weights were set to 0.4.
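The rate coding of a binary vector onto the 100 synapses can be sketched as follows. The function name, the 1 ms time step, and the Bernoulli-per-step approximation of a Poisson process are assumptions of this sketch; the rates and the number of synapses per component are those given in the text.

```python
import numpy as np

def encode_vector(vec, duration, dt=1e-3, rate_one=30.0, rate_zero=2.0,
                  synapses_per_component=10, rng=None):
    """Return a bool array (n_steps, 10 * len(vec)) of input spikes:
    components equal to 1 drive 10 synapses at 30 Hz, components equal to 0 at 2 Hz."""
    rng = np.random.default_rng() if rng is None else rng
    per_component = np.where(np.asarray(vec) == 1, rate_one, rate_zero)
    rates = np.repeat(per_component, synapses_per_component)   # one rate per synapse
    n_steps = int(round(duration / dt))
    # Independent trains: each synapse spikes in a step with probability rate * dt.
    return rng.random((n_steps, rates.size)) < rates * dt
```

Calling `encode_vector([1, 1, 1, 1, 0, 0, 0, 0, 1, 0], duration=5.0)` would produce a 5 s presentation in which synapses 0 to 39 and 80 to 89 receive the high-frequency trains.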

2.4.1 The learning protocol

Input vectors are presented to the neuron in an alternating manner: a vector from $C^+$ for 5 s, then a vector from $C^-$ for 1.5 s. During the presentation of a vector from $C^-$ the neuron is stimulated with a constant current, high enough to make the mean output rate close to the highest possible, $1/\tau_{\mathrm{ref}}$.

2.4.2 Results

While the neuron is receiving an input vector from the $C^+$ class, a synapse receiving high-frequency input contributes more to the neuron's output, and therefore its weight is rewarded more by STDP. With the parameters we have chosen, weights of synapses receiving components of 1 increase in 66% of cases, and weights of synapses receiving components of 0 decrease in 66% of cases. When a vector from the $C^-$ class is presented, the neuron output is caused by the stimulating current and is poorly correlated with the input. So, all weights decrease (to keep them from falling to zero, a vector from $C^-$ is presented for 1.5 s, in contrast to 5 s for a vector from $C^+$), but weights of high-frequency inputs decrease more due to the higher number of post-before-pre events.

Figure 7: Deviation $\beta$ between actual and target weights during learning.

Figure 8: Mean firing rate of the neuron in response to the input vectors after learning. The first three vectors belong to the $C^+$ class, and the second three to the $C^-$ class. The firing rate was averaged over 5 trials, each with independent 30-s input spike trains.

We took six binary vectors:

$S_1$ = (1 1 1 1 0 0 0 0 1 0),

$S_2$ = (0 1 0 0 1 1 1 0 1 0),

$S_3$ = (1 0 0 1 1 1 1 0 0 1),

$S_4$ = (1 1 0 0 0 1 0 1 0 1),

$S_5$ = (0 1 0 1 1 0 0 1 0 1),

$S_6$ = (0 1 1 0 0 1 0 1 0 0);

three of which are linearly separable from the other three. The target weights which separate them are known:

(1 0 1 0 1 0 1 0 1 0),

so we watched the deviation

$$\beta(t) = \frac{\sum_{i=1}^{100} \left| w_i(t) - w_i^{\mathrm{target}} \right|}{\sum_{i=1}^{100} w_i^{\mathrm{target}}}$$

between actual and target weights during learning (Fig. 7). After 6,045 s of learning (310 cycles of presenting the whole set of vectors) the neuron clearly distinguishes the classes by its mean firing rate, as shown in Fig. 8.
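The separability claim, the deviation measure, and the stated learning time can be checked with a few lines of NumPy. In this sketch the expansion of the 10-component target to 100 synapses (each component repeated for its 10 synapses) and the reading of one cycle as three $C^+$ presentations of 5 s plus three $C^-$ presentations of 1.5 s are assumptions; the variable and function names are illustrative.

```python
import numpy as np

S = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0, 1, 0],   # S1
    [0, 1, 0, 0, 1, 1, 1, 0, 1, 0],   # S2
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 1],   # S3
    [1, 1, 0, 0, 0, 1, 0, 1, 0, 1],   # S4
    [0, 1, 0, 1, 1, 0, 0, 1, 0, 1],   # S5
    [0, 1, 1, 0, 0, 1, 0, 1, 0, 0],   # S6
])
target = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0])

# Weighted input of each vector under the target weights.
scores = S @ target                    # -> [3 3 3 1 1 1]

def deviation(w, w_target):
    """beta: summed absolute deviation from the target, relative to the target sum."""
    return np.abs(w - w_target).sum() / w_target.sum()

w_target = np.repeat(target, 10).astype(float)   # assumed expansion to 100 synapses
beta_initial = deviation(np.full(100, 0.4), w_target)

cycle_duration = 3 * (5.0 + 1.5)       # three C+ and three C- presentations, s
total_time = 310 * cycle_duration      # -> 6045.0
```

Under the target weights the first three vectors score 3 and the last three score 1, so any threshold between the two separates the classes; with all weights at their initial value of 0.4 the deviation starts at 1, and 310 cycles of 19.5 s reproduce the 6,045 s quoted above.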

There is a straightforward way to obtain a spiking network that solves the task of recognizing gender hidden in texts: train a well-studied artificial network and then use the resulting weights in a spiking network implemented on hardware with low energy consumption. Mapping the ANN to the SNN yields the same classification error of 0.22 for both networks, indicating lossless mapping.

It is also possible to implement supervised learning in a spiking network with spike-timing-dependent plasticity, based on controlling the correlation between input and output spike trains. The proposed technique opens the way to practical tasks such as gender identification, which is a question for further research.

Acknowledgements

This work was supported by RSF, project 16-18-10050 “Identifying the Gender and Age of Online Chatters Using Formal Parameters of their Texts”. Simulations were carried out using high-performance computing resources of the federal center for collective usage at NRC “Kurchatov Institute”, http://computing.kiae.ru

References

[1] Chris Eliasmith. How to build a brain: A neural architecture for biological cognition. Oxford University Press, 2013.

[2] Peter U. Diehl, Daniel Neil, Jonathan Binas, Matthew Cook, Shih-Chii Liu, and Michael Pfeiffer. Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In IEEE International Joint Conference on Neural Networks (IJCNN), 2015.

[3] R. Gütig and H. Sompolinsky. The tempotron: a neuron that learns spike timing-based decisions. Nat. Neurosci., 9(3):420–428, 2006.

[4] Joseph M. Brader, Walter Senn, and Stefano Fusi. Learning real-world stimuli in a neural network with spike-driven synaptic dynamics. Neural Computation, 19(11):2881–2912, 2007.

[5] Jan-Moritz P. Franosch, Sebastian Urban, and J. Leo van Hemmen. Supervised spike-timing-dependent plasticity: A spatiotemporal neuronal learning rule for function approximation and decisions. Neural Computation, 25(12):3113–3130, 2013.

[6] A. Morrison, M. Diesmann, and W. Gerstner. Phenomenological models of synaptic plasticity based on spike timing. Biol. Cybern., 98:459–478, 2008.

[7] Peter U. Diehl and Matthew Cook. Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Frontiers in Computational Neuroscience, 2015.

[8] O.V. Zagorovskaya, T.A. Litvinova, and O.A. Litvinova. Elektronnyy korpus studencheskikh esse na russkom yazyke i ego vozmozhnosti dlya sovremennykh gumanitarnykh issledovaniy [Electronic corpus of student essays and its applications in modern humanity studies]. Mir nauki, kultury i obrazovaniya [World of Science, Culture and Education], 3(34):387–9, 2012.

[9] A. Sboev, T. Litvinova, D. Gudovskikh, R. Rybka, and I. Moloshnikov. Machine learning models of text categorization by author gender using topic-independent features. (in review)

[10] R. Legenstein, C. Naeger, and W. Maass. What can a neuron learn with spike-timing-dependent plasticity. Neural Computation, 17:2337–2382, 2005.
