The book Multimedia information retrieval: Theory and techniques focuses on the processing and search tools applicable to the management of new multimedia documents. These matters merge in the methodology of MIR, an organic system composed of the TR, VR, VDR and AR systems.
Series Editor: Ruth Rikowski (Email: Rikowskigr@aol.com)
Chandos’ new series of books is aimed at the busy information professional. They have been specially commissioned to provide the reader with an authoritative view of current thinking. They are designed to provide easy-to-read and (most importantly) practical coverage of topics that are of interest to librarians and other information professionals.
If you would like a full listing of current and forthcoming titles, please visit www.chandospublishing.com or email wp@woodheadpublishing.com or telephone +44 (0) 1223 499140.
New authors: we are always pleased to receive ideas for new titles; if you would like to write a book for Chandos, please contact Dr Glyn Jones on gjones@chandospublishing.com or telephone +44 (0) 1993 848726.
Bulk orders: some organisations buy a number of copies of our books. If you are interested in doing this, we would be pleased to discuss a discount. Please email wp@woodheadpublishing.com or telephone +44 (0) 1223 499140.
Multimedia Information Retrieval
Theory and techniques
Roberto Raieli
Avenue 4, Station Lane, Witney, Oxford OX28 4BN, UK
Tel: +44 (0) 1993 848726
Email: info@chandospublishing.com
www.chandospublishing.com
www.chandospublishingonline.com
Chandos Publishing is an imprint of Woodhead Publishing Limited
Woodhead Publishing Limited
80 High Street, Sawston, Cambridge CB22 3HJ, UK
Tel: +44 (0) 1223 499140
Fax: +44 (0) 1223 832819
www.woodheadpublishing.com
First published in 2013
ISBN: 978-1-84334-722-4 (print)
ISBN: 978-1-78063-388-6 (online)
Chandos Information Professional Series ISSN: 2052-210X (print) and ISSN: 2052-2118 (online)
Library of Congress Control Number: 2013941270
© R. Raieli, 2013
British Library Cataloguing-in-Publication Data.
A catalogue record for this book is available from the British Library.
All rights reserved. No part of this publication may be reproduced, stored in or introduced into a retrieval system, or transmitted, in any form, or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written permission of the publisher. This publication may not be lent, resold, hired out or otherwise disposed of by way of trade in any form of binding or cover other than that in which it is published without the prior consent of the publisher. Any person who does any unauthorised act in relation to this publication may be liable to criminal prosecution and civil claims for damages.
The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this publication and cannot accept any legal responsibility or liability for any errors or omissions.
The material contained in this publication constitutes general guidelines only and does not represent to be advice on any particular matter. No reader or purchaser should act on the basis of material contained in this publication without first taking professional advice appropriate to their particular circumstances. All screenshots in this publication are the copyright of the website owner(s), unless indicated otherwise.
This book was originally published in Italian by Editrice Bibliografica s.r.l., Milan, Italy, with the title Nuovi metodi di gestione dei documenti multimediali.
Revised English edition
Translated by Giles Smith
Typeset by Domex e-Data Pvt Ltd., India.
Printed in the UK and USA.
by Roberto Sicilia b) Content-based search founded on concrete, figurative data on the same painting by
4.1 Example from Grosky: multimedia content-based indexing
4.2 Another example from Grosky: content-based
4.3 Hierarchy of possible representative levels in a document
4.4 Example of ‘collaborative filtering’ from Amazon’s website
5.9 Example of MIR
5.10 a) Formal comparison between search example and
5.11 Selection of a colour range as an example for searching
5.12 Analysis of the constitutive elements of a video
5.13 Video-browsing modes a) Slide show b) Storyboard
5.14 A scheme of representative image selection
5.15 A recapitulative image of the different video processing constraints in relation to their computability
5.16 ‘Talk to Me’ interface, didactic system of Automatic
5.17 ‘AudioFex’ AR module of the MUVIS system
5.18 ‘Beat Histogram’ of different styles of music
6.1 Example of the Scheda F on the Album di Romana site
6.2 ‘Collage summary’ built starting from the descriptive metadata in a video produced by the Informedia II system
6.3 Model of the possible applications of MPEG-7 in
7.4 Search phases in QuickLook a) Browsing and image model choice b) Definition of textual data c) System answer and indications of ‘relevance/non-relevance’ d) Final system response
7.6 Scanner realized during the VASARI project
7.8 The ‘PicToSeek’ search screen a) The system’s selection interface b) Upload from the Web of a search image c) Individuation of a more precise model
7.9 Demo of the ‘Sphere browser’ of the MediaMill system
7.11 Application of AudioID via a cell phone
7.12 QBIC search interface a) Search through colour range
7.13 Example of a colour search of the Hermitage DC a) Definition of the colour range b) Search results
7.14 Example of a colour-formal search of the Hermitage DC a) Definition of the colours and forms b) Search results
7.15 Search interface based on colour histograms on the
7.16 WebSEEK module for defining the search histogram
7.17 Example of a search through sketches using Retrievr
7.18 Interface of the Virage VS LiveMedia system,
7.19 Operating phases of the Informedia II system a) Analysis of a documentary video b) Relationship model of the analyzed elements
7.20 Phases in the demo driven by Sound Fisher a) Similarity search b) Filter of the search with specific varied data c) Addition of music tracks to the system d) Content-based rearrangement of the archive
7.21 Video Mail Retrieval system model a) Browser
7.23 Example of Google Goggles’ functionality
7.24 Module for radiology content-based analysis from a system designed at the National Library of Medicine
7.25 Visual analysis and search screen of the GIS Web
8.1 Example of content-related structural analysis of a visual document a) Segmentation of an image into blocks b) Grey scale calculation for each block c) Complete light-dark
8.2 Automatic analysis model of a multimedia object a) 3D model b) Structural definition c) Calculation
8.3 Low-level characteristics in a document a) Original VO b) Form and skeleton c) Extremities and ‘Centre of Gravity’
8.4 Definitions of the ‘meanings’ of VO using low-level characteristics: models of bowling, ski slalom, golf,
8.5 Figurative example of the square-triangles
8.6 Example of similarity match in a musical search a) Search model b) Bach fugue with elements similar
8.7 Definition and omission of a search sample a) Search via model design b) Modification of retrieved object c) New search using the modified sample
8.8 Searching in the ‘photographs’ archive of PicToSeek via
8.9 Search in the ‘graphics’ archive of PicToSeek via the
8.10 Demo page of the Video Content Description and Exploration tool (ViCoDE), one of the products of the
8.12 Model of fingerprint treatment according to the MPEG-7 module
9.1 Noise example as a consequence of a formal search using Retrievr
9.2 Example of information loss in a colour search using the
9.3 Model of an integrated content-based and ‘concept-based’ system developed during the Sculpteur project,
9.5 Demo of colour-formal query using the QBIC system interface, as applied to the Hermitage Digital Collection
9.6 Example of a search conducted with contentual and textual parameters, run using QBIC’s interface developed
9.7 Relationship model between VR, OPAC and a Virtual
Tables
4.1 Example of relationships between different image
4.2 ‘Concept-based’ and content-based search models
I acknowledge with thanks Maria Teresa Biagetti, for supporting the MIR project during my PhD course, and Giovanni Solimine for following the publication of the book in Italian. A special thank you to Luisa Marquardt for helping me plan the English version of the book, and to Michele Costa, head of Editrice Bibliografica, for granting translation rights of the original edition Nuovi metodi di gestione dei documenti multimediali (Milano, Bibliografica, 2010).
AACR (Anglo-American Cataloging Rules)
AIB (Associazione Italiana Biblioteche)
AIDA (Associazione Italiana Documentazione Avanzata)
CBIR (Content Based Information Retrieval)
IFLA (International Federation of Library Associations and Institutions)
IR (Information Retrieval)
ISBD (International Standard Bibliographic Description)
JPEG (Joint Photographic Experts Group)
LIS (Library and Information Science)
MARC (Machine Readable Cataloging)
MIR (Multimedia Information Retrieval)
MPEG (Moving Picture Experts Group)
NLP (Natural Language Processing)
OPAC (Online Public Access Catalogue)
TREC (Text Retrieval Conference)
TRECVID (TREC Video Retrieval)
VDR (Video Retrieval)
W3C (World Wide Web Consortium)
OWL (Web Ontology Language)
XML (Extensible Mark-up Language)
Multimedia Information Retrieval
Towards improved user access and satisfaction
The production of multimedia works and their increasing availability on the Internet raise the question of how to search for them and retrieve them efficiently and effectively.
Information Retrieval (IR) has usually been considered a mainly library-related issue: in terms of information analysis and processing by librarians (conceptual analysis, content description, indexing, development and application of thesauri etc.); and, from the user’s viewpoint, in terms of searching for information and retrieving it through library catalogues, bibliographic databases etc. In brief, text retrieval has been the main way to retrieve information, understood as textual information or as information described textually. In the second part of the twentieth century, the diffusion of information in electronic form and, since the mid-1990s, the wealth and availability of non-print media such as digital objects, music, images, pictures and videos, have emphasized the user’s independence from the library. This is the so-called disintermediation era, where the intermediary, the ‘middleman’ (Cobo, 2011), is cut out from the production and distribution
(Cobo and Moravec, 2011) in the field of digital literacy or media and information literacy education. They are encompassed by the so-called NBIC paradigm, where nano,1 bio, info and cognitive (NBIC) areas and technologies converge and sometimes merge. These four areas have been identified as key ones in the National Science Foundation Report (NSF, 2003). Creativity and the production of creative works will also benefit from the development of the NBIC as an integrated field (Bainbridge et al., 2003). In a futurist and transhumanist view, by 2020 ‘Engineers, artists, architects, and designers will experience tremendously expanded creative abilities, both with a variety of new tools and through improved understanding of the wellsprings of human creativity’ (Orca, 2012). Transformative technologies will help to create new expressions of arts.2 New forms of creative works will emerge: they will not be related or confined only to current art-forms. For instance, pictures, images and the production of content where images are fundamental (as in many applied sciences, like medicine, engineering etc.) are expected to increase significantly. New technological solutions are flourishing and spreading, like the application of nanotechnologies in the production and application of nanofibrous media. For instance, interest in quantum dot applications is ever increasing in both the research field and the corporate one: quantum dots3 are particularly useful in the STEM sector, e.g., for drug discovery (Rosenthal et al., 2011), or in sectors where images of high quality and definition are required, and specific technological solutions are needed (for instance, aiming at better quality pictures, as developed by InVisage).4
The interesting trends towards a closer integration of media, with a consequent increasing convergence (Jenkins, 2008) of media, technology and humans, remind us that factors – like users’ perspective and behaviour – have to be taken into account rather more than in the past, especially when designing tools and planning services that aim at assisting in retrieving multimedia information. Digital natives (Prensky, 2001a, 2001b) very often use technology in a ‘bricolage’ (tinkering) way (Oblinger and Oblinger, 2005), and show a clear preference for online information, available in digital form and accessible 24/7, rather than a printed version.5 This also affects the way they search for information, process and use it throughout their academic life. They prefer information that can be accessed very easily (Ucak, 2007). They multi-task; actively participate in social media; produce multimedia content; and often have a need for retrieving music and pictures. They need to find media, resources, and multimedia information that are relevant to them.
Visuality: visual information needs and visual skills are not exclusive features of young people today: they are relevant in many professions (e.g., surgeons). Furthermore, visual queries have proved to be more efficient and effective in cross-lingual settings. Images, pictures, music etc. are usually described and indexed in a textual way – their content is forced into a textual form – and are retrieved using text (keywords, descriptors etc.) in traditional IR. The conversion of a text or single words into an effective image would facilitate the search for information in multilingual or cross-lingual contexts (Lin, Chang and Chen, 2007). Multimedia Information Retrieval (MIR) can be crucial to finding media other than in a textual form, so that the user’s multimedia information needs can be accurately addressed (Ren and Bracewell, 2009).
In terms of user perspective (and user satisfaction), improved user access to multimedia content is discussed in many meetings6 and is the aim of research and projects. There are many useful examples in the corporate field: for example, Shazam started as ‘a simple service designed to connect people in the UK with music they heard but didn’t know’ (http://www.shazam.com/music/web/about.html). It has also been the overall goal of PetaMedia (Peer-To-Peer Tagged Media), a network of excellence – comprising four national networks from the Netherlands, Switzerland, the UK and Germany – funded by the EU 7th Framework Programme, and active from March 2008 to September 2011. It aimed at building the foundations of ‘a European virtual centre of excellence’, where multimedia content can be accessed using user-generated annotations and the structures of peer-to-peer and social networks. Among the research projects developed within PetaMedia, one is particularly in tune with the aim of Raieli’s work: Off The Beaten Track (OTBT) is based on a triple synergy: user relationships (i.e., a social network); media interactions (i.e., user-contributed annotations); and a multimedia collection (i.e., material for multimedia analysis). On this basis, a prototype of interest to users, Near2Me, an outdoor tourist guide, was developed.
It incorporated the following PetaMedia technologies:
1 “Geotag-based location recommendation;
2 Place naming based on a geotag and textual tags;
3 Retrieval of diversified images for a location, using image properties and textual tags;
4 Determination of subject-related authority based on comments made by peers on the user’s uploaded content;
5 Tag clustering and cluster naming;
6 UGC/tag propagation using object duplicate detection”
Many research challenges were faced while developing the prototype at different levels – interface, technology integration, evaluation – to get useful information and significant feedback from user-perspective testing. Near2Me functions as a tourist guide that helps the tourist to explore an area and find interesting places to visit, according to his/her (geotagged) location. An animated video also provides the user with an audio-visual overview of attractions, landmarks, cultural places etc. Many field trials, involving over 1,000 users, were carried out in order to test and validate the integration of the triple synergy and the user perspective. Locations, topics and experts were the perspectives most appreciated by the participants in the study. The trial then resulted in a balanced combination of the two goals – the former, technology-oriented, and the latter, user-oriented (PetaMedia, 2012: 12–15). Other projects are also exploring and developing image query and recognition, users’ interaction etc.7
The shift from the technological dimension to the social, interconnected and interactive dimension of media and communication shows how McLuhan’s ideas – the global village and “the medium is the message” – have been realized over the last few years. The still traditional separation between cold and hot media has been overtaken by the predominance of software over hardware, which is now shifting to an increasing range of tools and media. These are characterized by different levels of integration, flexibility and interactivity; features that make them more (or less) relevant and useful to a user. They also carry and transmit lifestyles and values. Furthermore, the content is key. Analysts define four models of content and related scenarios, with different levels of privacy, data protection and exchange: Premium content (with a low level of interactivity and pay-per-view consumption); Interactive immersion (e.g., multimedia content); Social media (peer-to-peer, interactive and social construction and aggregation of content); and the guide’s scenario, where content is aggregated by users who cannot modify it (Valori, 2009: 224). On one hand, the way content is created, aggregated, used etc. is affected by the functionality and the features of the platform(s) where it is made available. On the other hand, the user’s competences in retrieving and processing media and information make a difference. Those competences are defined in many ways: MIL or Media and Information Literacy (UNESCO, s.a.), trans-literacy, multiple literacies, new literacies, cyber-culture etc. Despite the different emphasis on one or another aspect, they are undoubtedly crucial, and not only in personal or individual terms of retrieving multimedia information. They are relevant in educational terms, where user education also means both building up the cultural competences of understanding and producing information and media (UNESCO, 2009; 2011), and raising an active and creative member of society, as recently discussed at ENS – Ecole Normale Superieure, Cachan, Paris (Frau-Meigs, Bruillard and Delamotte, 2012), and in terms of providing librarians (and library and multimedia software developers) with the vital and needed feedback to enhance MIR.
As briefly described above, MIR seems to hold many perspectives and great potential to be developed, as mentioned and taken into account by Roberto Raieli here. Technological solutions and experiences are also explored in his book, even though they are not the main aim of this work, the technological and practical field being a fast growing and changing one: it is honestly hard to keep pace – especially in a book – with its continuous development. Even though Raieli’s work published here is mainly a translation of the Italian edition, it is an accurately revised edition, with substantial adaptation to the international context.8
In this general and dynamic scenario, Raieli’s work is definitely a welcome and useful contribution that provides the international library and information community with foundational knowledge on MIR. The ongoing development of complex multimedia systems for effective web-mining (Ordóñez de Pablos et al., 2013) makes MIR an interesting field for further research, development and enhancement.
Luisa Marquardt, Roma Tre University, Rome
Cobo, C. and Moravec, J. W. (2011) ‘Aprendizaje Invisible: Hacia una Nueva Ecología de la Educación’. Barcelona (Spain): Universitat de Barcelona. Online version available in PDF format at URL: http://www.aprendizajeinvisible.com/download/AprendizajeInvisible.pdf
Cobo, C., Scolari, C. and Pardo Kuklinski, H. (2011) ‘Knowledge Production and Distribution in the Disintermediation Era’. Available at SSRN: http://ssrn.com/abstract=1920766 http://www.slideshare.net/HugoPardoKuklinski/knowledge-production-and-distribution-in-the-disintermediation-era-oii10
Frau-Meigs, D., Bruillard, É. and Delamotte, É. (eds) (2012) ‘Le e-Dossiers de l’Audiovisuel: L’Éducation aux Cultures de l’Information. Support de Réflexion au Colloque Translittératies’. Enjeux de Citoyenneté et de Créativité. ENS-Cachan et Université Sorbonne nouvelle, 7–9 Novembre 2012. http://www.stef.ens-cachan.fr/manifs/translit/colloque_translit.html S.l.: Cachan, Paris: INA.160 Online version available at URL: http://www.ina-sup.com/ressources/dossiers-de-laudiovisuel/les-e-dossiers-de-laudiovisuel/e-dossier-leducation-aux-cultures
Jenkins, H. (2008) ‘Convergence Culture: Where Old and New Media Collide’. New York: NYU Press.
Lin, W. C., Chang, Y. C. and Chen, H. H. (2007) ‘Integrating Textual and Visual Information for Cross-language Image Retrieval: a Trans-media Dictionary Approach’. Information Processing & Management, 43 (2) (March): 488–502. Available in PDF format at URL:
Ordóñez de Pablos, P. et al. (2013) ‘Advancing Information Management through Semantic Web Concepts and Ontologies’. IGI Global, 1-433. Web (27 November 2012). doi: 10.4018/978-1-4666-2494-8
PetaMedia (2008–2011) Research Projects. http://www.petamedia.eu/research-projects-3.html
PetaMedia (2012) ‘Project Final Report’ (February 2012). Available in PDF format at URL: http://www.petamedia.eu/final-report-169.html
Ren, F. and Bracewell, D. B. (2009) ‘Advanced Information Retrieval’. Electronic Notes in Theoretical Computer Science, 225 (8) (2 January): 303–317.
Rosenthal, S. J. et al. (2011) ‘Biocompatible Quantum Dots for Biological Applications’. Chemistry & Biology, 18 (1) (28 January): 10–24.
Tamine-Lechani, L., Boughanem, M. and Daoud, M. (2010) ‘Evaluation of Contextual Information Retrieval Effectiveness: Overview of Issues and Research’. Knowledge and Information Systems, 24 (1): 1–34.
Uçak, Nazan Özenç (2007) ‘Internet Use Habits of Students of the Department of Information Management, Hacettepe University, Ankara’. The Journal of Academic Librarianship, 33 (6): 697–707. Available in PDF format at URL: http://www.bby.hacettepe.edu.tr/yayinlar/dosyalar/internet%20use%20habits.pdf
UNESCO (s.a.) ‘Media and Information Literacy’ (webpages) at URL: http://portal.unesco.org/ci/en/ev.php-URL_ID=15886&URL_DO=DO_TOPIC&URL_SECTION=201.html
UNESCO (2009) ‘Mapping Media Education Policies in the World: Visions, Programmes and Challenges’. Edited by Divina Frau-Meigs and Jordi Torrent. New York, USA; Huelva, Spain: The United Nations – Alliance of Civilizations in collaboration with Grupo Comunicar. Online version available in PDF format at URL: http://unesdoc.unesco.org/images/0018/001819/181917e.pdf
UNESCO (2011) ‘Media and Information Literacy Curriculum for Teachers’. Edited by Alton Grizzle and Carolyn Wilson. Paris: UNESCO. Online version available in PDF format at URL: http://unesdoc.unesco.org/images/0019/001929/192971e.pdf
Valori, G. E. (2009) ‘Il Futuro è già qui: gli Scenari che Determineranno le Vicende del nostro Pianeta’. Milano: Rizzoli.
Notes
1 Nanotechnologies have a high financial potential too, and are seen by venture capitalists (although some are still reluctant to invest in them) ‘as the next “big thing” after the dotcom crash’. See, e.g., Siemon, C. (2010) ‘Financing 6th Kondratieff’s Start ups: A Schumpeterian Problem Reconsidered from an Evolutionary Perspective […]’. Bremen: University of Applied Sciences (SME Working Papers: 2: 25). http://www.hs-bremen.de/internet/einrichtungen/fakultaeten/f1/forschung/kmu/002-sme_working_papers_siemon.pdf Nevertheless, an interesting upward trend from emerging economies (such as the BRICS countries and their institutions) shows how they have been investing an increasing amount of money over the last few years. See: Roco, M. C., Mirkin, C. A. and Hersam, M. C. (2010) ‘Nanotechnology Research Directions for Societal Needs in 2020: Retrospective and Outlook’. NSF, WTEC report. Berlin and Boston: Springer, available in PDF format at http://www.wtec.org/nano2/Nanotechnology_Research_Directions_to_2020/ The content is also available both in hard copy and e-book via the Springer website: http://www.springer.com/materials/nanotechnology/book/978-94-007-1167-9
2 See, for example: the (still discussed) Transhuman Art Gallery that ‘features a select cast of international artists focused on examining the transformative technologies of today. The work presented transcends a multitude of media, including innovations such as 3D printing and virtual reality. The Transhuman Art Gallery is a virtual collection of vanguard artwork, attempting to evoke anticipation for the future. The challenge of defining a transhumanist aesthetic is concerned with an attempt to find new forms of representation’. http://www.transhumanart.com/
3 Defined by Rosenthal et al (2011) as ‘a nanometer-sized crystal of inorganic semiconductor, or semiconductor nanocrystal’.
4 http://www.invisage.com/technology
5 See, for example, the contributions in the thematic issue: ‘New Learners, New Literacies, New Libraries’. School Libraries Worldwide, 14(2) (January 2008). http://www.iasl-online.org/pubs/slw/july08.htm
6 See, for example, the series of ACM Multimedia Systems Conference http://www.mmsys.org/?q=node/68 and the contributions in the annual conference proceedings.
7 Like the indoor mobile museum guide application developed for the Olympic Museum of Lausanne. The application provides audio-visual information concerning the exhibits of the museum and its goal is to make the visit to the museum more interactive and enjoyable, as shown in the video: http://www.youtube.com/watch?NR=1&v=IJy9RatDu3Q&feature=endscreen
8 The more philosophical and conceptual parts have been reduced; terminology and references to facts now obsolete have been updated; the structure of the book has been changed as well, and now shows a different articulation of chapters and paragraphs. The reference list and the illustrations have been updated and radically modified.
Never before could it be said, as it can now, that the purpose of librarianship as a profession is at a turning point.
Traditionally, we have dealt with managing physical objects (documents) and only marginally did our attention wander to their content; it was enough to describe them and make them available. From this point of view, Robert Musil drew a fine, almost caricatured profile of the librarian: ‘the secret of a good librarian’ – he writes in The Man Without Qualities – ‘is that he never reads any of the literature in his charge other than the title and the table of contents. Anyone who lets himself go and starts reading a book is lost as a librarian!’ However, in the digital universe, distinctions between the content and the document that conveys it are becoming increasingly subtle and, while continuing to exist, sometimes lose their meaning. From document handlers we turn into – and have already partly become – content handlers.
Is it possible for us not to deal with content when we handle intangible documents, often not even described as objects? Is it possible for us not to deal with content in the age of FRBR (Functional Requirements for Bibliographic Records), which does not just describe documents, but aims to put them in relation to each other, to go beyond the materiality of the documents, to take care of the works, their expressions, their manifestations?
The horizon of reference is moving from catalogued mediation, to information mediation, to documentation mediation.
And if that were not enough, added to this is another equally significant transformation. I refer to the fact that digital libraries are storing increasingly heterogeneous objects, in content, purpose and format: digitization results in text or image format, digital native documents, moving and still images, audio files and other audio-visual material, teaching aids, hypertexts, 3D paths; and this is without mentioning other types of materials that are built up by the contribution of users and come as a result of using resources, such as the so-called User Generated Content (UGC).
One of the consequences of this multifaceted reality can be seen in the need for libraries and librarians to lay out methods and tools to organize and search content according to criteria consistent with the nature of each specific document. A digital library, wanting firstly to be a library and wishing to offer a high quality mediation service, cannot confine itself to collecting documents without setting itself the objective of making available to users adequate instruments for accessing the richness, complexity and diversity of its collections.
Conceptually, this is not a novelty in an absolute sense, and in traditional librarianship we find many consolidated precedents that take their cue from the same requirement. To mention only one of the most classic examples of the discipline of cataloguing, we can think of how catalogues of antique books were prepared, for which there is usually an analytical description that takes into account the primary interest of those who consult these tools. This interest is usually not the textual content but the book itself as an artefact, the circumstances of its publication and the vicissitudes of its circulation. That is why emphasis is given to the watermark or binding, or the typefaces used for printing, or to persons other than the author equally related to the works, editions and copies: translators, editors, commentators, critics, annotators, prefacers, printers, publishers, booksellers, illustrators, holders and donors, dedicators and dedicatees, and so on. In this case, librarianship uses a mediating language tailored to the particular characteristics of the antique book, even though it always concerns itself with the container and not the content.
In collections of electronic documents, the ability to develop new languages of mediation and description is enhanced by advanced navigation and search systems, allowing us to develop working tools calibrated to the specific needs of the recipients of a specific project. Even in this case, some examples may be helpful. Within the limits of digital libraries dealing with literary texts, if we consult Biblioteca Italiana (BibIt) or LIZ (Letteratura Italiana Zanichelli), we find a very rich search functionality (free search for terms or elements; proximity search; previous word search; footnote search, phrase search, character search, caption search, paragraph search, references search; search for terms in a foreign language; concordances; and indices, to name just a few). This enables scholars to work effectively, locating words in a text or a corpus of texts, generating the contexts in which forms are linked together. It is no coincidence that this type of project is successful when it manages to satisfactorily blend three types of competence: that of the librarian and mediator, that of the technician and computer scientist, and the specialist competence brought by content experts.
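Two of the search functions listed above, proximity search and concordances, can be illustrated with a minimal sketch. This is not how BibIt or LIZ are implemented; the function names, the tokenization rule and the sample line (the opening of Dante's Commedia) are assumptions chosen only for illustration.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def proximity_search(text, term_a, term_b, max_distance=3):
    """Return (i, j) token positions where term_a and term_b occur
    within max_distance words of each other."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    return [(i, j) for i in pos_a for j in pos_b
            if 0 < abs(i - j) <= max_distance]

def concordance(text, term, width=2):
    """List each occurrence of term with `width` words of context
    on either side (a simple keyword-in-context view)."""
    tokens = tokenize(text)
    return [" ".join(tokens[max(0, i - width): i + width + 1])
            for i, t in enumerate(tokens) if t == term.lower()]

# Illustrative sample text, not taken from either digital library.
SAMPLE = "Nel mezzo del cammin di nostra vita mi ritrovai per una selva oscura"

if __name__ == "__main__":
    print(proximity_search(SAMPLE, "cammin", "vita", max_distance=3))
    print(concordance(SAMPLE, "selva", width=2))
```

A production system would typically build a positional index over the whole corpus rather than re-tokenizing each text per query; the sketch only shows the matching logic.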
It is no surprise then, that when textual documents join together with other document types, the need is felt to develop multimedia search methodologies capable of holding together Text Retrieval (TR) methodologies, based on textual information for the treatment and search of textual documents; Visual Retrieval (VR) methodologies for searching visual documents; Video Retrieval (VDR) methodologies for audio-visual video treatment; and Audio Retrieval (AR) methodologies based on sound data for processing and searching audio.
The MIR systems aim to analyze, process and search the objective content of documents with a content-based approach that seeks to overcome traditional search and analysis based on textual equivalents describing the content of a document, that is, term-based systems. Pursuing this path is vital to making concrete a course of repeatedly enunciated actions that so far has been little more than sloganeering: here we refer to the integration between libraries, archives and museums which the European Union has promoted.
Roberto Raieli is no stranger to studies on this issue, and has already demonstrated mastery of the subject. I remember, along with other minor contributions, a collection of essays edited in 2004 along with Perla Innocenti,1 for AIDA; and the volume that presented the proceedings of the seminar promoted by AIB in December of the same year at the Roma Tre University.2 Years and years of analysis and study, broadened by three years of work done as part of a PhD, have now ripened and, thanks to a timely and rigorous description of the state of the art, what begins to emerge is an organic treatment strategy for documents and information that aims to resolve a fascinating matter fraught with difficulties: to give users a research methodology based on the ‘objective’ informative content of documents, which refers to their contents and their forms of expression.
Having clearly identified the direction in which to move does not mean that the search is over. This book shows the prospects of study that must continue to be investigated, enriching the results of theoretical research with the results of experiments extended to a significant body of documents, with contributions coming from users and observations of their behaviour. This volume describes the sectors and contexts in which these methodologies can find a profitable application and recounts the most interesting experiences that have already been made at an international level.
Thanks to the possibilities offered by technology, the revolutionary MIR system can allow document search by applying storage and retrieval techniques that operate directly on the content of digital objects within databases, to search for images, audio-visual material and sound, as well as texts, exploiting the specific language characterizing each type of document.
The scope of this deeply innovative methodology has the eyes of the world on it, with everyone waiting to see it achieved, and to see the fascinating proposals illustrated in this book finally realized and fully operational.
‘For every document there is its retrieval’, Ranganathan would say.
Giovanni Solimine, Sapienza University, Rome
Roberto Raieli is a librarian in the Roma Tre University Arts Library, Italy. Roberto has collaborated with both scientific and humanities libraries, and has been involved in studies on digital libraries and multimedia information, on which he has published. Roberto is on the editorial staff of the Italian LIS journal AIB Studi (formerly Bollettino AIB), and is a member of groups dealing with electronic resources, virtual libraries, and open archives. Roberto has expertise in film direction; he has directed various theatre plays and short films; and he has published on a wide range of subjects, also founding and directing the Italian literary journal Línfera. Roberto holds a degree in Philosophy, and a degree and a PhD in Library and Information Science.
Important earlier publications by Roberto on Multimedia Information Retrieval are the international book MultiMedia Information Retrieval: metodologie ed esperienze internazionali di content-based retrieval per l’informazione e la documentazione (Multimedia Information Retrieval: Methodologies and International Experiences of Content-based Retrieval for the Information and the Documentation), edited by Roberto Raieli and Perla Innocenti (Roma, AIDA, 2004); the book L’informazione multimediale dal presente al futuro: le prospettive del MultiMedia Information Retrieval (The Multimedia Information from the Present to the Future: the Perspectives of the Multimedia Information Retrieval), edited by Roberto Raieli (Roma, AIB Lazio, 2005); and the book Nuovi metodi di gestione dei documenti multimediali: principi e pratica del MultiMedia Information Retrieval (New Methods of Management of Multimedia Documents: Principles and Practice of Multimedia Information Retrieval) (Milano, Editrice Bibliografica, 2010). Moreover, he has published a series of articles in Knowledge Organization, Bollettino AIB, AIDA Informazioni, and other periodicals and books on different subjects.
Abstract: This introduction summarizes the entire book. The book focuses on the processing and search tools applicable to the management of new multimedia documents. These matters merge in the methodology of MIR, an organic system composed of the TR, VR, VDR and AR systems. One of the book’s goals is to demonstrate the limitations of operating within the terms of a generic Information Retrieval (IR) system, through textual language only. MIR offers a better alternative, whereby each type of digital document can be analyzed and searched by language elements appropriate to its nature. MIR’s approach to information search, which directly handles the concrete content of the documents, is defined as content-based. The integration of the content-based, or content-related, conception of information handling with the traditional semantic conception has the potential to provide the advantages of both systems in the accessing of information.
Key words: multimedia information retrieval, content-based information
retrieval, multimedia documents, digital libraries, image and video processing, audio processing, artificial intelligence, meaning, semantic gap.
Euripides, descended here, and spoke to the thieves, the pickpockets, the robbers, the parricidal, the people galore in the Ogre. And when they heard the puns, the retorts, the rambling talks, they lost their heads, and took him for an Ark of science. And he bristled his crest, planting himself on the throne where once sat Aeschylus.
Aristophanes, The Frogs
The first time I happened to see myself with a book in my hands, taken at random without looking from one of the shelves, I felt a thrill of horror. Would I, thus, be reduced like Romitelli, to feel obliged to read, I librarian, for all those who do not come into the library? And I threw the book on the ground. But then I picked it up; and – yes – I also started to read, and also with only one eye, because the other would not see.
Luigi Pirandello, Il fu Mattia Pascal
The famous comedy The Frogs by Aristophanes can in some respects be considered the parody of a library, always hovering between the enrichment of what pleases the owner and the consideration of what is more fashionable; between the procurement of what is really important and what needs to be there anyway; between conserving what is of value produced in the past and attention to innovations that will ensure future development. It could be the library of Aristophanes himself, filled with the writings of the Greek tragedians, whose names are listed and compared, without wanting to eliminate any and even giving some a more privileged place.
The well-known novel by Luigi Pirandello, Il fu Mattia Pascal, is in certain passages an explicit caricature of a librarian, constantly hovering between himself and confrontation with society; between privacy and professional role; between proposing something useful and the futility of doing so, either for himself or for others. It may be a librarian like so many Pirandello must have seen, apprehensive of having and wanting to be a librarian, and clearly stigmatized by an author who does not seek to define whether it is a criticism of those who hold the function, or a criticism of the function itself.
Between the two points of library and librarian that support the ideal line of Library Science, countless other points can be placed, endlessly referencing other citations, each adding and justifying further examples, not only literary ones. What does this line represent in substance? This line is full of positive and negative examples connecting times, places, and decisive figures in the development of human culture, close to every expression of progress in every age. It is possible to infer the ingrained and foremost incarnations of libraries and librarians, and everything connected to them, from the perspective of information and universal knowledge. This has always provided Library Science with continuous stimuli to follow the sometimes swirling progress of knowledge and its tools for development and transmission, proposing itself as the actual science of information. Libraries have become places for the conservation and dissemination of information and culture, with librarians inhabiting the role of the professionals running the libraries’ activities.
So it seems appropriate to start this premise with a talk based on the revolution of some perspectives developed by librarianship and Library and Information Science (LIS), crucially emphasizing that every revolution must take the whole of history into consideration, without breaking the progressive and natural continuity of the cultural and scientific traditions in which the revolution takes place, in this case Library Science.
This book aims to develop the diverse issues linked to MIR and is divided into two main parts: the first, broader, part examines the subject and theory of MIR, while the second part is devoted to presenting MIR systems and techniques. In the international context, these matters are familiar to computer scientists, engineers and mathematicians, but it is necessary that librarians, documentalists, and information managers also gain familiarity with the related technologies. The fields of research that could be interested in MIR’s innovations are constantly increasing, from Medicine to Geography, from Engineering to the Arts and Music, and each discipline presents specific problems and needs.
The discussion focuses on treatment and search tools applied in managing multimedia documents, in particular within the framework of hybrid or digital libraries, whose databases no longer contain principally textual documents, but also documents of various genres: visual, audio, audio-visual, or multimedia in the full sense of the word. This links to the issue of dissemination and document use, which is a key objective in the activity of a library, and highlights the need for new, multimedia procedures to search and retrieve all types of digital documents.
One of the objectives of this book is to demonstrate that information search is operationally limited when it is confined within the terms of generic IR. In traditional practice, all kinds of document treatment are constrained to analysis and search via a textual language only. It is necessary instead to consider wider criteria, such as those of MIR, where each type of digital document can be processed and searched through language elements, or more appropriately, through a meta-language more representative of its nature. It is therefore possible to distinguish, in the new, general and integrated multimedia research methodology, a method of TR based on textual information for the treatment and search of textual documents; a method of VR set to visual data search; one of VDR for audio-visual video processing; and one of AR using sound data for handling audio files.
In databases where the content of the stored documents is substantially textual, it is appropriate that access keys are terms and phrases extracted from within that same content. With multimedia databases, on the other hand, it is somewhat inaccurate to attribute a textual description from outside to a document whose content is based on a different structure of sense. In addition, while in the case of textual documents it can often be appropriate to analyse the concept and give it a descriptor, this is not as effective for multimedia documents, where the subjective limits of analysis are greater, and where the concepts are not always of more interest than the concrete content of shapes, colours, movements, sounds or words.
MIR systems establish a search approach that directly treats the objective content of documents, and for this reason it is defined as being content-based, as opposed to traditional systems of analysis and search, which are based on terms descriptive of such concrete content and are defined as term-based. Thanks to the possibilities offered by technology, MIR systems allow the analysis and search of multimedia documents by applying storage and retrieval techniques that operate largely on the contents of digital objects within the database, so that the search for images, audio-visual material, sound recordings and texts may exploit the characteristics of each document’s specific language. Users can then consult the archive with search strategies based on similarity, or on other modes such as approximation, and measure relationships and values using forms, structures, words, figures, movements, sounds, colours, outlines, etc. as search keys for querying. It is a more useful system, in which queries need not be forced within the boundaries of language but may be sent as they are, using shapes, colours, movements, sounds or words, just as spontaneously and immediately as they come to the user, and can likewise be grasped and satisfied by the system. If the monopoly of textual information is no longer generally accepted, it is contradictory that there remains a monopoly of the typical methods of finding that information. If the search cannot occur on the physicality of the strokes, the sonata, the photograph, the film, or the writing, it takes place on their direct digital correspondence, directly in the domain of the effective formal, structural, spatial, temporal or sound values.
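The idea of querying by similarity, using the content itself as the search key, can be illustrated with a minimal sketch. It is not the method of any system described in this book: it assumes, purely for illustration, that each image is reduced to a coarse colour histogram and that similarity is measured by the cosine of the angle between histograms. The function names (colour_histogram, cosine_similarity, query_by_example) and the toy archive are hypothetical.

```python
from math import sqrt


def colour_histogram(pixels, bins=4):
    """Reduce a list of (r, g, b) pixels (0-255 each) to a normalised histogram."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bucket index in 0..bins-1.
        ri, gi, bi = (r * bins) // 256, (g * bins) // 256, (b * bins) // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query_by_example(query_pixels, archive, top_k=3):
    """Rank archive documents by colour similarity to an example image."""
    query = colour_histogram(query_pixels)
    scored = [
        (cosine_similarity(query, colour_histogram(pixels)), doc_id)
        for doc_id, pixels in archive.items()
    ]
    scored.sort(reverse=True)  # most similar first
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical toy archive: three "images" of uniform colour.
archive = {
    "sunset": [(250, 10, 10)] * 8,
    "forest": [(10, 200, 10)] * 8,
    "sea": [(10, 10, 220)] * 8,
}
print(query_by_example([(240, 20, 5)] * 4, archive, top_k=1))
```

The query here is not a word such as "red" but a sample of pixels; the same pattern would apply, with different feature extraction, to sounds or video frames.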
Traditional textual database interfaces allow users to search an index composed exclusively of terms extracted from documents or inserted into textual metadata. What must happen, then, is that interfaces make it possible to formulate queries in different dimensions, not just through terms, but through images and sounds. In doing so, it will be possible to search indexes substantially different from conventional ones, and they will be much richer, composed of excerpts from the full written or spoken audio-visual material, key images, a sequence of simple geometric figures, melodies, shapes, colours, movements or sounds in general, without excluding the importance of maintaining the terminological, conceptual, or descriptive data, or aspects not specifically related to the concrete content of the documents.
From a computer-centred point of view, without objecting to
user-centred perspectives and considering all matters of interest to the user,
the problem lies in the construction of specific, comprehensive and effective multimedia indexes; in the elaboration of high level search systems with many interrelated options; in developing data analysis algorithms that can compute large numbers of variables; in the development of systems of evaluation and sorting of results that improve the quality of response by interacting with the user’s indications, as well
as in the development of research and analysis paradigms that can relate
the objective automatic machine representations with refined human
intellectual analysis.
MIR is a revolutionary organic system, specialized in the effective treatment of all types of digital multimedia. The complex whole constitutes a new information treatment strategy, ahead of the traditional methodologies that are based on the centrality of human conceptual and terminological approaches. It aims to solve the issue of research based on the actual and objective information content, with an automatic and content-related approach, as well as looking at solutions to the architectural problems posed by the new large multimedia databases. Provision must be made for integrating a revolutionary, content-related conception of the treatment and search of information with the traditional semantic conception, combining the merits and benefits of each approach to accessing information.
Clarifying the sense of cohesion and internal coherence of the MIR complex, it is essential to conclude, therefore, that a good level of effectiveness in document research can be achieved only by using combined methods and techniques based both on a document’s significance, through controlled terms, and on the representation of content, through textual, visual, sound and audio-visual elements. The various content-related systems of TR, VR, VDR and AR must be considered to be in constant harmony with each other and also with semantic systems. The principle of a semantic consultation is based upon a precise method of selecting a part of the large quantities of documents in an archive according to thematic areas, titles or authors. It is also the basis of limiting the various inefficiencies that are specific to a content-based search. Above all, the individual procedures operate best when in continuous interaction, in a single, organic search interface. This allows consultation strategies to combine words, figures, movements, sounds and concepts, useful for a complex document search that is rich at all levels of sense and significance, and is able to overcome many of the risks of falling into the gap of solely semantic or solely content-related document processing.
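The interaction described above, in which a semantic consultation first selects part of the archive and a content-based comparison then orders what remains, can be sketched as follows. This is only an illustrative assumption about how the two steps might be chained; the Record structure, the two-value feature vectors and the combined_search function are hypothetical and do not come from any system discussed in the book.

```python
from dataclasses import dataclass


@dataclass
class Record:
    doc_id: str
    subjects: set   # controlled terms assigned by a cataloguer (semantic access)
    features: list  # numbers extracted from the content itself (content-based access)


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def combined_search(records, subject, query_features, top_k=5):
    """Select by controlled term first, then rank the selection by content similarity."""
    candidates = [r for r in records if subject in r.subjects]
    candidates.sort(key=lambda r: distance(r.features, query_features))
    return [r.doc_id for r in candidates[:top_k]]


# Hypothetical records: "c" resembles the query in content but has another subject.
records = [
    Record("a", {"architecture"}, [0.9, 0.1]),
    Record("b", {"architecture"}, [0.2, 0.8]),
    Record("c", {"botany"}, [0.9, 0.1]),
]
print(combined_search(records, "architecture", [1.0, 0.0]))
```

In this toy case the semantic step removes a document that is visually close but thematically irrelevant, while the content-based step orders the remaining documents in a way that terms alone could not.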
The necessity and efficiency of an organic system does not preclude each particular system having a recognizable specificity, from either a technological or methodological point of view. In IR, then, it promises not just methodological innovation, but a renewal of basic and general principles, not surpassing, but including at a higher level the specific methodology that is still the core of traditional IR. This revolution wants to keep the fields, forms and methods of IR, but organize them alongside other multimedia methods, composing in the most satisfactory way the diversification of semantic and content-related access to information and documents.
Without restrictions, and recomposing the apparent rift with tradition, IR can become a term much richer in meaning, and be practically synonymous with MIR. It may thus contest the need for a new term to indicate the development process which has occurred, and make disputed terminology unnecessary, like the word Multimedia before Information Retrieval. It could also be sufficient to talk about a renewed and re-evaluated Information Retrieval, aware of the different and complete aspects of an up-to-date and efficient methodology of information research.
The first part of the book is devoted to discussing the theories elaborated within the framework of national and international studies on MIR, considering in particular relationships with principles of Library Science and Documentation, marking the possible theoretical position, and addressing the operational position of the new information processing systems. It is possible to outline an overview of the more advanced aspects of IR theories that MIR can, in a way, revolutionize. The discussion also addresses principles of indexing and processing of information and digital multimedia documents.
The setting of the argument offers an epistemological perspective shared between aesthetics, hermeneutics and pragmatics. It provocatively tries on one hand to frame MIR theory in relation to general and stable principles of Library Science and information research, but on the other hand attempts to indicate, liberate and develop the new, the revolutionary, the pragmatic and the dynamic that exist in a modern vision of content-based information processing. Reviewed for these objectives are classic texts, famed as much in Librarianship as in Documentation and IR, as well as recent writings by avant-garde scholars, which also cover technical fields such as linguistic and informatics theory, philosophy, and mathematics.
Then the more advanced principles of the so-called term-based system of indexing and search are discussed, namely the current progress of the indexing principles that lie at the foundation of Information Retrieval. To give a broad spectrum of criticized theories, reference is made to some of the most advanced proposals for effective IR systems in any type of collection. Many of these theories have strained to find a valid conjecture for innovating IR, and even though they apply to textual documents, without considering the specificities of multimedia documents, they are nevertheless valuable for discussing the methodology of TR inside MIR. In contrast, if applied tout court to multimedia documents, these proposals show all the limits of being designed for primarily textual information.
In this direction of research, some essays are presented that have laid the foundations for the theory of handling content-based multimedia documents. Through the discussion of issues relating to exclusively terminological representations of the multimedia content of a document, they also show that the resulting linguistic representation of the search is limited by the criticality of the term-based treatment. Many lines of enquiry point to a definition of search that goes beyond the information objectively related to specific words or constructs, aiming at the qualities of the multimedia content itself. If the simplest of such requests, unrefined in relation to time, place, action, to forms of expression or technical characteristics, can still be met by term-based storage and search systems, more complex search categories require a further operation, not always leading to the desired result.
After a quick history of MIR’s principles to assess the specific contribution of these research projects and their progress, there follows a series of clarifications on some of the problems left unresolved by such studies. The conclusion identifies some important future directions of study and research. One of the first issues looked at is the need to move rapidly from the academic and experimental phase of the MIR systems to commercial application, promoting cooperation between research and industry. The next fundamental perspective, rich in implications, is indicated in the developments of Web 2.0 and the Semantic Web, and in particular in ontologies and folksonomies as practical directions of development for the many issues related to the treatment and search of multimedia documentation. Finally, with regard to the usefulness of MIR’s methods, it is pointed out that research and experimentation aimed at the improvement of the structure and methodology of systems should always be based on the actual needs of users, and their responses to new operational proposals.
In the survey on the actual functionality of MIR, and its role in the context of information search, the limits of what was positively proposed as content-relatedly objective, mathematically factual access to documents are pointed out. The gap between this and conceptual-interpretative access, known internationally as the semantic gap, is then discussed. The semantic gap is defined as the non-coincidence between the objective information that can be extracted directly from a document and the different interpretations that the same data can receive from each user in each specific situation. Given that the meaning of a multimedia document is rarely explicit, the purpose of the system must be to provide support to overcome this gap, between the simplicity of handling content offered by the machine, and the rich semantics of user expectation. A satisfactory proposal for a solution is to integrate the method of the new guides for navigation in the Semantic Web, or ontologies. The aim is to establish an organic approach to multimedia search, able to take into account the concrete and the conceptual, the content-related and the semantic, in the representation of the documents. In fact, whatever their nature, documents can always be placed in spaces that are logically related to each other, while remaining searchable within those semantic positions, without interference, using MIR methods.
Added to all of this is the consideration of imagination and creativity as a way of conducting research. Once the use of ontologies is accepted as being compatible with MIR, in order to escape their rigours, which seem to bring back the problem of the rigidity of classification schemes typical of IR, it has been suggested that ontology be combined with folksonomy. This would provide free systems of collaborative content categorization of documents based on labels assigned directly by end-users. The integration between the semantic tools of ontologies and folksonomies, which are in turn integrated with content-based instruments, can reconcile the principles of semantic-interpretative treatment and content-relatedly objective treatment within the general organic nature of MIR instruments.
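One way to picture the combination of ontology and folksonomy is sketched below, under strong simplifying assumptions: the ontology is reduced to a small hand-written hierarchy of broader and narrower concepts, and the folksonomy to a list of (user, document, tag) triples. All names (ONTOLOGY, narrower, folksonomy, search) and the data are hypothetical illustrations, not a description of an existing tool.

```python
from collections import Counter

# Hypothetical toy ontology: each concept maps to its narrower concepts.
ONTOLOGY = {
    "animal": ["bird", "mammal"],
    "bird": ["owl"],
    "mammal": ["cat"],
    "owl": [],
    "cat": [],
}


def narrower(concept, ontology):
    """Return the concept together with all of its descendants."""
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(ontology.get(current, []))
    return found


def folksonomy(taggings):
    """Aggregate free (user, doc_id, tag) labels into tag counts per document."""
    tags = {}
    for _user, doc_id, tag in taggings:
        tags.setdefault(doc_id, Counter())[tag.lower()] += 1
    return tags


def search(concept, taggings, ontology=ONTOLOGY):
    """Expand a concept through the ontology, then match it against user tags."""
    wanted = narrower(concept, ontology)
    scores = {}
    for doc_id, counts in folksonomy(taggings).items():
        score = sum(n for tag, n in counts.items() if tag in wanted)
        if score:
            scores[doc_id] = score
    # Documents with more matching tags first; ties broken by identifier.
    return sorted(scores, key=lambda d: (-scores[d], d))


taggings = [
    ("u1", "img1", "Owl"), ("u2", "img1", "night"),
    ("u3", "img2", "cat"), ("u1", "img2", "cat"),
    ("u2", "img3", "car"),
]
print(search("animal", taggings))
```

The ontology supplies the logical structure (an owl is a bird, a bird is an animal), while the free tags supply the vocabulary actually used by end-users; neither alone would retrieve both images for the query "animal".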
The definition of the aims and objectives of MIR is finally depicted through a pattern of difference in principle between IR as is known to all, and MIR as a methodology evolved towards amplifying the documentary referential as a whole. IR is a search system using terms for recovering textual information, also applied to sound, visual and audio-visual documents. MIR proposes itself as a search system using text, images and sounds, for documents of any type, or those that are multimedia in the full sense of the word.
The fundamentals of a possible MIR theory are not introduced through the rejection of any possibility of interpretation and conceptual representation of the content documented or the document. In considering the semantic limits of the content-based system, an appropriate intellectual intervention in the search is often necessary to clarify strategies and increase chances of retrieval. This need to integrate content-related and semantic access to documents must therefore lead to the definition of a single, organic system of multimedia information processing; one which is able to simultaneously consider research needs that are both semantic-interpretative in character and content-relatedly objective.
With respect above all to the organic complexity of MIR, out of the four specific methodologies, TR, VR, VDR and AR, it is emphasized that reaching a good level of precision in document retrieval from a multimedia database requires the presence of all modes. All types of search (textual, visual, audio and audio-visual) can operate in continuous and organic interaction in one system, with one search screen, in the form of a search formula that combines, where appropriate, words, images, sounds and terms to search for every complex document whose information content extends to all levels of sense and significance.
The argument ends after proposing a definition of MIR theory through the recapitulation of the principles set out above, and the definition and distinction of the theoretical aspects of specific TR, VR, VDR and AR.
The second part of the book introduces the technical aspects, and aims to demonstrate the state of the art in MIR as well as the existing possibilities of its practical implementation, evaluating its effectiveness under the demands of different user types and in different areas of research. The demonstration of these points is based on the theoretical definitions developed in part one of the book, of which the second part is meant to constitute the practical application, including the description of the operation of some specific systems, both experimental and implemented in libraries, archives and documentation centres.
In addition to the exemplification of the modalities and procedures of operational systems currently in use, the main mathematical and engineering innovations allowing the realization of systems, machines and software for the analysis, storage, search and retrieval of multimedia documentation are indicated, underlining the malfunctions and inefficiencies found in end-usage. This book attempts to mediate between the required technicalities of different fields of study, and a language acceptable to Library Science.
As a technical-theoretical premise, a general and rapid overview is made of the current state of multimedia document management, as done by the most advanced traditional IR systems. This overview is followed by a more detailed investigation of VR, VDR and AR systems. The description of TR is not examined in depth, although it is an organic part of content-based MIR methods, because its criteria are widely covered in many available studies on IR. Here, then, is a rundown of the various MIR systems made by different research teams from around the world, and currently in use in diverse sectors.
Having dealt with the methodology and technology of MIR, it can be noted that there are no specific and isolated applications of TR, other than those of direct search on a full-text document, which were developed, however, from another point of view on the problem, more traditionally IR-related. The problems with TR’s methods and techniques are widely seen and exhibited as part of research on IR, and it is in this context that there are many explanations of what can be, even in the organic whole of MIR, a functioning system for treating and searching textual documents. This is without prejudice to the consideration that textual research can and should come, as in all MIR, through the integration of content-based methods, such as those based on free full-text search, and semantic methods, such as most traditional terminological searches.
Subsequently discussed is the question of whether cultural sectors can find real advantages in using the technology of MIR and its systems. This discussion notes that literary fields should not be excluded as a matter of principle, and that the greatest advantages may be in the scientific and technical, as well as the general and popular fields, according to each particular case. Among the more advanced application areas is that of biomedical research, which has always attracted a great deal of interest and study, mainly for its high social significance. Also very promising are applications in the field of Earth sciences and geographical information, both for their practical importance and for some good system results. A sector of considerable development, despite some basic problems, is the documentation of the visual arts, where there is investment in digital catalogues, museums and galleries for the growth and improvement of public enjoyment. Other important areas of MIR application are science, engineering, the social sciences, the archives of police forces, and other generic fields, as well as the Web. The validity of operational systems in each sector can be assessed and improved using benchmarking methods, evaluation of end-user results, and continuous attention to feedback and the information produced during use.
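As an illustration of what a benchmarking measure can look like, the sketch below computes precision and recall, two standard measures in IR evaluation, for a single query. The text does not say which measures the benchmarks it mentions rely on, so their use here is an assumption; the function name and the sample identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = list(retrieved), set(relevant)
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list judged against a hand-made set of relevant documents.
precision, recall = precision_recall(["a", "b", "c", "d"], {"a", "c", "e"})
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision corresponds to irrelevant documents among the results, and low recall to relevant documents that the system failed to return.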
MIR’s procedures play a central role in the analysis and content-based indexing of multimedia materials, and also in the phases and structures for search and retrieval. In general, the functioning of the systems can be divided into two major interrelated parts: operations relating to the creation of the database and the indexing of documents, and operations relating to the process of search and retrieval. The paradigms and protocols so far trialled in the design and implementation of systems are all fairly similar, and are based not only on the direct treatment of the actual content of the document, but also on consideration of the role and the importance of traditional descriptive treatment. Fundamental to this is the ability to create a new type of index relating to the data in documents, one that in addition to traditional terminological elements is enriched by content-related elements. In presenting the various operations, the different methods of automatic, semi-automatic and manual document treatment are also compared, declaring their merits, defects and the necessity for mutual integration of the capabilities of machines and humans. Also discussed are the problematic aspects, or inefficiencies, of data-processing algorithms, and the specific cases of noise and loss of information.
The discussion then moves on to the functioning and the technical-organic integration of the content-based operational modes of TR, VR, VDR and AR. It is stressed that, in order to achieve proper and effective use of MIR, the integration between content-based modes must also be organically linked to integration with the semantic modes of treating and searching information and documents. In general, the practical solutions of integration that can be achieved based on the processing of content and document descriptions are clarified, along with the possible operational integration of different procedures. Also discussed is the issue of MIR’s integration with other management systems for sorting and searching, in use for the non-multimedia or non-digital documents that together constitute the heritage of any given library or archive. Each theoretical or paradigmatic perspective only becomes valid if it includes the development of true search interfaces for multimedia retrieval, enabling the end user to set and conduct searches with any combination of the various text, visual and sound modes.
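One simple way to picture a search conducted with any combination of modes is a weighted merge of the scores each mode returns. The sketch below is only an illustration under stated assumptions: the function name `combine_scores`, the mode labels, the score range of 0 to 1 and the weighted-average rule are all hypothetical choices, and the text does not prescribe this or any other fusion method.

```python
def combine_scores(mode_scores, weights):
    """Merge per-mode relevance scores into one ranking.

    mode_scores: {mode name: {doc_id: score in [0, 1]}} for the modes the user
                 actually used in this search (any combination).
    weights:     {mode name: non-negative weight}; hypothetical tuning values.
    Returns a list of (doc_id, combined score), best first.
    """
    total_weight = sum(weights[mode] for mode in mode_scores)
    if total_weight <= 0:
        raise ValueError("at least one search mode with a positive weight is required")

    # Every document returned by at least one mode is a candidate.
    doc_ids = set().union(*(scores.keys() for scores in mode_scores.values()))

    combined = {}
    for doc_id in doc_ids:
        # A document missing from a mode's results contributes 0 for that mode.
        weighted = sum(
            weights[mode] * scores.get(doc_id, 0.0)
            for mode, scores in mode_scores.items()
        )
        combined[doc_id] = weighted / total_weight

    # Sort by descending score, then by doc_id so ties are ordered predictably.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))
```

Because only the modes present in `mode_scores` are counted, the same function serves a text-only search, a visual-only search or a mixed one; the weights are where a system designer would express how much each mode should count.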
In all this, the hypothesis that content-based treatment alone may be the most effective system for searching information is consistently rejected, and the importance that descriptive and semantic metadata still have in treating all types of documents is explained. One cannot expect to eliminate the refined contribution of human interpretation to the search process, founded on extensive prior knowledge and a sophisticated elaboration of content that machines cannot attain. However developed they are, the elaborate algorithms that calculate the possible interpretations of a set of objective content-related information are unable to bridge the so-called semantic gap, which continues to open up between the low-level mechanical consideration of an object’s appearance and the high-level evaluation of the human idea of that object.
A cultural and theoretical context for MIR theory
Abstract: This first chapter is dedicated to presenting the cultural context in which MIR is developing, and the different theories that it may involve. Also addressed are the principles for the indexing and processing of information and digital multimedia documents, provocatively putting forward an epistemological perspective shared between aesthetics, hermeneutics and pragmatics. This perspective attempts to frame MIR theory in relation to the general and stable principles of library sciences and information search.
Key words: information society, multimedia documents, information handling,
epistemology, hermeneutics, aesthetics, pragmatics, scientific principles.
The information and knowledge society
To put into context the various issues discussed in this book, we need to take a bird’s-eye view of our current information and knowledge society. This book aims to discuss a general cultural context, identifying the need for information and the opportunities to use multimedia documentation, as well as to define a specific scientific context in which to develop theories and techniques for managing and searching such documentation.
In discussing the available resources that preserve and disseminate information and knowledge, we should bear in mind that books, like other objects used to record information, are useful tools, technologically adequate to their times. Discussion of the forms of media development can be narrowed down to the definitions ‘digital convergence’, ‘media revolution’, ‘multimedia’, ‘interactivity’ and ‘hypermedia’, which interpret the compositional forms between technologies, including even the most complex compositional forms between technology and information, which affect each other reciprocally.1
It is fair to say that the press has monopolized culture, information and the preservation of memory, without ever truly rendering secondary the often more instructional communication conveyed by images and sound. The media revolution that characterized the twentieth century was felt at all levels of society, to say nothing of the media’s current digital development, enhanced by Internet circulation and possible end user interactivity. What we are left with is an information object that could appear rather complex, yet is familiar and easily definable: a digital, interconnected, interactive, multimedia object. Admittedly, it perhaps lacks a comprehensive definition for all its forms (we could typically say multimedia or even hypermedia), but what is not lacking is its establishment and deployment in day-to-day usage in the information society.
This is nothing new. After all, an information object, a document, is composed of many parts (textual, visual, audio-visual), allows nonlinear viewing, and can be connected to other objects. This has always been possible, even with technologically older documents. The real news is the information object in its digital form: its electronic progress maximises the potential of the multimedia object, rendering it a powerful communication tool, adaptable to present needs, and therefore offering valuable and effective diffusion throughout society. To say nothing, of course, of the lively commercial interest that has always surrounded innovations, which in addition to fulfilling needs certainly create and offer new ones. So the communication tools that suffuse our current society are seeing a convergence towards the digital, going far beyond simple multimedia. The multimedia document is therefore probably the only effective support for information and knowledge today.
Within the framework of the current information society, it must be stressed that the structures and services dealing with the organization of knowledge must make a precise distinction, and maintain a separation, between information and knowledge. Information consists of an ever-increasing amount of data, which must be organized precisely if one then wants to transform that information into knowledge. What must remain intact, therefore, are the functions that only professional information services possess: ‘to organize, filter and mediate’ the ever-increasing amount of information available, whatever its nature or technological support, without being misled, as the common user can be, by its being more or less attractive.2
Knowledge is the result of thorough study, reflection and correction. It is simultaneously individual, collective and contemporary work developed over time. A researcher must be able to distinguish false or erroneous information from that which is valid, and seek knowledge that may provide insights for intuition. Only bibliographic services, librarians and archivists, properly defined technically and theoretically, can foster the process of transforming information into true knowledge, which is the real resource of contemporaneity and progress, not only in academic and intellectual terms, but also in economic and social terms.
In today’s society, then, knowledge must confront technological input more than ever, but this need does not necessarily act negatively or weaken the ability to know. The development of digital technology in every area has led to a version 2.0 of whatever was there before, and in the context of library sciences talk has begun of Library 2.0 as the step that follows the digital library, seeing libraries as collections of digital documents fully immersed in Web 2.0.
There cannot be indifference towards the rich technologies present in Web 2.0 when it comes to the development of libraries, document archives or other structures that process information. It is a constant ‘conversation’, as underlined in the Italian Manifesto per le biblioteche digitali.3 Easy enthusiasm for Web 2.0 aside, it is necessary to define and guarantee some working methods and principles that do not result in a loss of identity for the digital library as a library. Its identity must be strengthened, not disappear in the charming but insidious relationship of fully shared communication technologies. The total sharing of everything, which is the spirit of Web 2.0, can seriously put at risk questions of personal reflection, identity and recognition. Fragmentation and annulment will threaten identity in advance of thinking in common, deciding together, or acting jointly.
These doubts, often psychological and sociological in nature, arise to a lesser degree for digital libraries than for individual Web users. But in any case, libraries must be cautious in using Web 2.0, giving constant consideration to the manner in which new tools are used in the realization of objectives. They must not be overwhelmed or led up inappropriate paths, but must continue to ask themselves whether it is a problem for the library to surrender privacy on certain activities, since it is often its mission to share, show and advise. Should libraries steer away from this course, or commit to it and lead it towards their own ends?
In the daily panorama of communication, information and knowledge,
to radically and stringently question the potential and effectiveness of