Procedia - Social and Behavioral Sciences 236 (2016) 29–33
ScienceDirect
1877-0428 © 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of the National Research Nuclear University MEPhI (Moscow Engineering Physics Institute).
doi: 10.1016/j.sbspro.2016.12.008
International Conference on Communication in Multicultural Society, CMSC 2015, 6-8 December 2015, Moscow, Russian Federation
Factographic information retrieval for communication in multicultural society
Sergey Kulik*
National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Kashirskoe shosse 31, Moscow 115409, Russian Federation
Abstract
This paper gives detailed consideration to factographic information retrieval with human involvement, which consists of two stages. In the first stage, retrieval is implemented without direct human involvement. In the second stage, retrieval assumes human involvement. The retrieval includes a pattern recognition algorithm and is implemented to retrieve only one document among a variety of similar documents. An analytical model of the retrieval block is developed. This model is represented by an effectiveness indicator: the average length of the recommendatory list provided by the retrieval block, which enables the human operator to take the final decision.
Keywords: Information retrieval; communication; recommendatory list; pattern recognition; factographic information; effectiveness
1 Main text
In practice, factographic information retrieval (FIR) is used in communication technologies for a multicultural society. An Automated Factographic Information Retrieval System (AFIRS) is a special case of question-answer systems (QA systems) (Tomljanovic, Pavlic, and Katic, 2014; Abdullah and Abdel-Kader Rehab, 2011; Sherkat and Farhoodi, 2014; Kulik, 2015) or database fact retrieval systems (Salton, 1968). The structure of this AFIRS (Kulik, 2015) contains a recognition block. To develop this block we used neural networks (Galushkin, 2007).
* Corresponding author. Tel.: +7-495-788-5699; fax: +7-499-324-2111.
E-mail address: sedmik@mail.ru
Informative retrieval (Kulik, 2015; Ampazis and Iakovaki, 2004) plays a great role in information retrieval systems. Effective FIR is very important, and sometimes critical, for an AFIRS in which factographic information retrieval is implemented. Some important problems were considered in earlier works (Salton, 1968; Fukunaga, 1990; Galushkin, 2007; Feller, 1968). The paper and the monograph (Kulik, 2015; Galushkin, 2007) are concerned with neural networks. Neural network technologies (Kulik, 2015; Galushkin, 2007) have been applied successfully to design a biometric system for AFIRS; in particular, Kulik (2015) deals with biometric identification systems for AFIRS. Two further works (Kolmogorov, 1973; Feller, 1968) are concerned with probability theory. For instance, Kolmogorov (1973) deals with the axiomatic basis of modern probability theory.

Generally, the task of searching in the AFIRS is to find the required factographic information connected with the document being searched for. For instance, the task of searching in a multicultural society is to find a Document which Is Identical (DII) to the enquiry among the archive system documents with the help of their descriptions in the search factographic database (FDB) and, if this document is detected, to give the necessary factographic information from it. The automated factographic information retrieval system (see Fig. 1) consists of different blocks.
Fig. 1. Automated factographic information retrieval system.
The automated factographic information retrieval system includes six blocks: block 1, document indexing; block 2, retrieval; block 3, Recommendatory List (RL) processing; block 4, the searching FDB, which includes document descriptions in the form of Searching Documents Patterns (SDP); block 5, recognition; and block 6, the archive of documents.
2 Methods
The research was conducted on the basis of the following methods: information retrieval (Salton, 1968), probability theory (Kolmogorov, 1973; Feller, 1968), neural networks (Kulik, 2015; Galushkin, 2007) and pattern recognition (Fukunaga, 1990).
3 Results and discussion
It is supposed that N documents are stored in the archive and that every enquiry has a Searching Enquiry Pattern (SEP). Each document of the archive is located in a place which is uniquely determined by its registration number. Only one document can be located in any one place. All SDPs are stored in the searching array of the factographic database in the form of a consecutive linear list.

A request for the search (or SEP) includes the description of a document which may be stored in, or absent from, the documents array. It is supposed that, with probability R, the request is a description of a document which is identical to one of the archive documents. Information in the SDP and SEP can be misrepresented because of various hindrances (noise) or errors during document indexing. The comparison of the SEP and an SDP by the AFIRS is realised using a pattern recognition algorithm (Fukunaga, 1990; Galushkin, 2007) and is characterised by the probabilities P1 and P2, where (Kulik, 2015):
• P1 – probability of the correct comparison of two identical documents based on their descriptions (determines the target mission probability);
• P2 – probability of the correct comparison of two non-identical documents based on their descriptions (determines the false alarm probability).
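The role of P1 and P2 in building the RL can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the author's software: the function name `scan_linear_list`, the identification of documents by their position in the list, the independence of the comparisons, and the rule that the scan stops once the RL holds `max_len` entries are all hypothetical choices made for illustration.

```python
import random


def scan_linear_list(n_docs, dii_index, p1, p2, max_len, rng):
    """Simulate stage one: compare the SEP with every SDP in list order.

    dii_index is the position of the identical document (DII), or None
    when the enquiry matches no archived document.
    Returns the positions placed in the recommendatory list (RL).
    Assumptions (hypothetical): comparisons are independent and the scan
    stops as soon as the RL reaches max_len entries.
    """
    rl = []
    for i in range(n_docs):
        if i == dii_index:
            # Identical pair: recognised correctly with probability p1.
            matched = rng.random() < p1
        else:
            # Non-identical pair: rejected correctly with probability p2,
            # so a false alarm occurs with probability 1 - p2.
            matched = rng.random() >= p2
        if matched:
            rl.append(i)
            if len(rl) == max_len:
                break
    return rl


if __name__ == "__main__":
    rng = random.Random(2015)
    rl = scan_linear_list(n_docs=50000, dii_index=123, p1=0.99,
                          p2=0.995, max_len=1000, rng=rng)
    print(len(rl), 123 in rl)
```

In this reading, the human operator of the second stage would inspect the returned list, so its length is what the effectiveness indicator measures.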
It is supposed that L_x is the average length of the RL. An analytical formula was obtained to evaluate the effectiveness of factographic information retrieval with the help of this indicator (see (1), (2) and (3)):

[formula (1) for L_x: not recoverable from the source]

where:
S – average length of the RL during information retrieval in the area which includes the DII, given by [formula (2): not recoverable from the source];
W – average length of the RL during information retrieval in the area which does not include the DII, given by [formula (3): not recoverable from the source].

In (1) the L_x is:

[formula (4): not recoverable from the source]

What is more (see (1) and (4)), if R ≈ 0, the average length of the RL which is given by the AFIRS to the human operator (the person who makes the final decision) is

[formula (5): not recoverable from the source]

Following Feller (1968), we will denote:

[notation definition: not recoverable from the source]

The following analytical expressions for (3) were obtained to evaluate the effectiveness:

[formula (6): not recoverable from the source]
The investigation of (1), (3) and (6) established that L_x in (5) changes as L, N and P2 change. A small part of these results for different L, N and P2 is shown in Tables 1 and 2.
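Table 1. Example of effectiveness for N = 50000 and P2 = 0.995.
L: 10, 70, 170, 240, 242, 245, 247, 250, 800, 5·10^3
L_x: [values not recoverable from the source]
Table 2. Example of effectiveness for N = 80000 and P2 = 0.998.
L: [values not recoverable from the source]
L_x: [values not recoverable from the source]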
According to Table 1, if P2 = 0.995, N = 50000 (N is the number of SDPs) and L = 1000 (L is the maximum length of the RL), the length of the RL of the AFIRS is L_x ≈ 250. Analogously, according to Table 2, if P2 = 0.998 and L = 140, the length of the RL of the AFIRS is L_x ≈ 140.
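The two readings above can be checked roughly under a simple assumed model. The sketch below is not the paper's formula (6), which is not reproduced here. It assumes that the enquiry matches no archived document (R close to 0), that each of the N SDPs enters the RL independently with probability 1 − P2, and that the RL is cut off at L entries, so that the average RL length is E[min(X, L)] with X binomially distributed. The function name `expected_rl_length` is hypothetical.

```python
import math


def expected_rl_length(n_docs, p2, max_len):
    """Return E[min(X, max_len)] for X ~ Binomial(n_docs, 1 - p2).

    Assumed model (not the paper's formula): R is close to 0, every SDP
    enters the RL independently with false-alarm probability 1 - p2,
    and the RL is truncated at max_len entries.
    """
    q = 1.0 - p2
    if q <= 0.0:
        return 0.0                      # no false alarms at all
    if q >= 1.0:
        return float(min(n_docs, max_len))
    log_q, log_p = math.log(q), math.log(1.0 - q)
    head_mass = 0.0   # P(X < max_len)
    head_mean = 0.0   # sum of k * P(X = k) over k < max_len
    for k in range(min(max_len, n_docs + 1)):
        # Binomial probability computed in log space for stability.
        log_pmf = (math.lgamma(n_docs + 1) - math.lgamma(k + 1)
                   - math.lgamma(n_docs - k + 1)
                   + k * log_q + (n_docs - k) * log_p)
        pmf = math.exp(log_pmf)
        head_mass += pmf
        head_mean += k * pmf
    # Every outcome with X >= max_len contributes exactly max_len.
    tail_mass = max(0.0, 1.0 - head_mass)
    return head_mean + max_len * tail_mass


if __name__ == "__main__":
    print(expected_rl_length(50000, 0.995, 1000))
    print(expected_rl_length(80000, 0.998, 140))
```

Under these assumptions the first case is governed by the mean number of false alarms, N(1 − P2) = 250, while in the second case the mean of 160 exceeds L = 140, so the cut-off dominates. Both are in line with the values quoted from the tables, although this agreement does not establish that the assumed model coincides with the author's.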
Let us briefly consider the likely applications of the results.
Firstly, these results allow us to develop an effective Factographic Information Retrieval System. For example, we can create the AFIRS and share it with people from other countries. A wide variety of people representing different cultures could come to use the AFIRS in order to effectively retrieve information which is important to them. For instance, these results allow us to estimate the efficiency of the factographic information retrieval.
Secondly, these results can be helpful in teaching university students. Knowledge of factographic information retrieval can be used as illustrative material in teaching modern information technology at university. Lecturers and students can make presentations on various topics related to communication in a multicultural society. For example, a lecturer can talk about the Automated Factographic Information Retrieval System, the effectiveness of factographic information retrieval, the average length of the Recommendatory List, etc.
4 Conclusion
Thus, as a result of the research, the analytical formula (6), which allows the average length of the RL to be evaluated, was developed. The properties of an important indicator of effectiveness, the average length of the RL for the human operator, were explored. This allows a reasoned analysis of the effectiveness of the factographic information retrieval implemented by the AFIRS. The necessary software for evaluating the effectiveness of the retrieval was created for the developer of the Automated Factographic Information Retrieval System. In future it is planned to obtain and explore the two remaining evaluations of effectiveness: the probability of a correct response to the enquiry of the factographic information retrieval, and the average number of comparison operations performed by the Automated Factographic Information Retrieval System.
References
Abdullah, M.M., and Abdel-Kader Rehab, F. (2011). QASYO: A question answering system for YAGO ontology. International Journal of Database Theory and Application, 4(2), 99–112.
Ampazis, N., and Iakovaki, H. (2004). Cross-language information retrieval using Latent Semantic Indexing and Self-Organizing Maps. International Joint Conference on Neural Networks (IJCNN'2004), (1). Budapest, Hungary, 751–755.
Feller, W. (1968). An introduction to probability theory and its applications (Vol. 1, 3rd ed.). New York: John Wiley & Sons.
Fukunaga, K. (1990). Introduction to statistical pattern recognition. San Diego, San Francisco, New York, Boston, London, Sydney, Tokyo: Elsevier Academic Press.
Galushkin, A.I. (2007). Neural networks theory. Berlin, Heidelberg: Springer.
Kolmogorov, A.N. (1973). Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin: Springer. Reprint: Berlin, Heidelberg, New York: Springer Verlag.
Kulik, S.D. (2015). Neural network model of artificial intelligence for handwriting recognition. Journal of Theoretical and Applied Information Technology, 73(2), 202–211.
Salton, G. (1968). Automatic information organization and retrieval. New York: McGraw-Hill.
Sherkat, E., and Farhoodi, M. (2014). A hybrid approach for question classification in Persian automatic question answering systems. 4th International eConference on Computer and Knowledge Engineering (ICCKE), 29-30 Oct 2014, IEEE, 279–284.
Tomljanovic, J., Pavlic, M., and Katic, M.A. (2014). Intelligent question - answering systems: review of research. 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 26-30 May 2014, IEEE, 1228–1233.