
Handbook of research on advanced data mining techniques and applications for business intelligence


Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence

Shrawan Kumar Trivedi

BML Munjal University, India

Shubhamoy Dey

Indian Institute of Management Indore, India

Anil Kumar

BML Munjal University, India

Tapan Kumar Panda

Jindal Global Business School, India

A volume in the Advances in Business Information Systems and Analytics (ABISA) Book Series


Published in the United States of America by

Web site: http://www.igi-global.com

Copyright © 2017 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data

British Cataloguing in Publication Data

A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.

For electronic access to this publication, please contact: eresources@igi-global.com

CIP Data Pending

ISBN: 978-1-5225-2031-3

eISBN: 978-1-5225-2032-0

This book is published in the IGI Global book series Advances in Business Information Systems and Analytics (ABISA) (ISSN: 2327-3275; eISSN: 2327-3283)


The Advances in Business Information Systems and Analytics (ABISA) Book Series (ISSN 2327-3275) is published by IGI Global, 701 E Chocolate Avenue, Hershey, PA 17033-1240, USA, www.igi-global.com. This series is composed of titles available for purchase individually; each title is edited to be contextually exclusive from any other title within the series. For pricing and ordering information please visit http://www.igi-global.com/book-series/advances-business-information-systems-analytics/37155. Postmaster: Send all address changes to above address. Copyright © 2017 IGI Global. All rights, including translation in other languages reserved by the publisher. No part of this series may be reproduced or used in any form or by any means – graphics, electronic, or mechanical, including photocopying, recording, taping, or information and retrieval systems – without written permission from the publisher, except for non commercial, educational use, including classroom teaching purposes. The views expressed in this series are those of the authors, but not necessarily of IGI Global.

IGI Global is currently accepting manuscripts for publication within this series. To submit a proposal for a volume in this series, please contact our Acquisition Editors at Acquisitions@igi-global.com or visit: http://www.igi-global.com/publish/

Advances in Business Information Systems and Analytics (ABISA) Book Series

Madjid Tavana

La Salle University, USA

ISSN: 2327-3275 EISSN: 2327-3283

Mission

The successful development and management of information systems and business analytics is crucial to the success of an organization. New technological developments and methods for data analysis have allowed organizations to not only improve their processes and allow for greater productivity, but have also provided businesses with a venue through which to cut costs, plan for the future, and maintain competitive advantage in the information age.

The Advances in Business Information Systems and Analytics (ABISA) Book Series aims to present diverse and timely research in the development, deployment, and management of business information systems and business analytics for continued organizational development and improved business value.

Coverage

• Decision Support Systems

• Legal information systems

• Business Intelligence

• Data Analytics

• Business Process Management

• Business Information Security

• Management information systems

• Data Management

• Strategic Information Systems

• Statistics


Titles in this Series

For a list of additional titles in this series, please visit: www.igi-global.com

Business Analytics and Cyber Security Management in Organizations

Rajagopal (EGADE Business School, Tecnologico de Monterrey, Mexico City, Mexico & Boston University, USA) and Ramesh Behl (International Management Institute, Bhubaneswar, India)

Business Science Reference • copyright 2017 • 346pp • H/C (ISBN: 9781522509028) • US $215.00 (our price)

Handbook of Research on Intelligent Techniques and Modeling Applications in Marketing Analytics

Anil Kumar (BML Munjal University, India) Manoj Kumar Dash (ABV-Indian Institute of Information Technology and Management, India) Shrawan Kumar Trivedi (BML Munjal University, India) and Tapan Kumar Panda (BML Munjal University, India)

Business Science Reference • copyright 2017 • 428pp • H/C (ISBN: 9781522509974) • US $275.00 (our price)

Applied Big Data Analytics in Operations Management

Manish Kumar (Indian Institute of Information Technology, Allahabad, India)

Business Science Reference • copyright 2017 • 251pp • H/C (ISBN: 9781522508861) • US $160.00 (our price)

Eye-Tracking Technology Applications in Educational Research

Christopher Was (Kent State University, USA) Frank Sansosti (Kent State University, USA) and Bradley Morris (Kent State University, USA)

Information Science Reference • copyright 2017 • 370pp • H/C (ISBN: 9781522510055) • US $205.00 (our price)

Strategic IT Governance and Alignment in Business Settings

Steven De Haes (Antwerp Management School, University of Antwerp, Belgium) and Wim Van Grembergen (Antwerp Management School, University of Antwerp, Belgium)

Business Science Reference • copyright 2017 • 298pp • H/C (ISBN: 9781522508618) • US $195.00 (our price)

Organizational Productivity and Performance Measurements Using Predictive Modeling and Analytics

Madjid Tavana (La Salle University, USA) Kathryn Szabat (La Salle University, USA) and Kartikeya Puranam (La Salle University, USA)

Business Science Reference • copyright 2017 • 400pp • H/C (ISBN: 9781522506546) • US $205.00 (our price)

Data Envelopment Analysis and Effective Performance Assessment

Farhad Hossein Zadeh Lotfi (Islamic Azad University, Iran) Seyed Esmaeil Najafi (Islamic Azad University, Iran) and Hamed Nozari (Islamic Azad University, Iran)

Business Science Reference • copyright 2017 • 365pp • H/C (ISBN: 9781522505969) • US $160.00 (our price)

701 E Chocolate Ave., Hershey, PA 17033

Order online at www.igi-global.com or call 717-533-8845 x100, Mon-Fri 8:00 am - 5:00 pm (EST), or fax 24 hours a day 717-533-8661

To place a standing order for titles released in this series, contact: cust@igi-global.com


Editorial Advisory Board

Ankita Tripathi, Amity University Gurgaon – Haryana, India

List of Reviewers

A. Sheik Abdullah, Thiagarajar College of Engineering, India

A. M. Abirami, Thiagarajar College of Engineering, India

M. Afshar Alam, Jamia Hamdard University, India

Tamizh Arasi, VIT University, India

A. Askarunisa, KLN College of Information Technology, India

Balamurugan Balusamy, VIT University, India

Yi Chai, Chongqing University, China

A. A. Chari, Rayalaseema University, India

T. K. Das, VIT University, India

Hirak Dasgupta, Symbiosis International University, India

Sanjiva Shankar Dubey, SSD Consulting, India

G. R. Gangadharan, University of Hyderabad, India

Belay Gebremeskel, Chongqing University, China

Rashik Gupta, BML Munjal University, India

Zhongshi He, Chongqing University, China

Priya Jha, VIT University, India

Ponnuru Ramalinga Karteek, BML Munjal University, India

Kaushik Kumar, Birla Institute of Technology Mesra, India

Raghvendra Kumar, Lakshmi Narain College of Technology Jabalpur, India

C. Mahalakshmi, Thiagarajar College of Engineering, India

Amir Manzoor, Bahria University, Pakistan

Vinod Kumar Mishra, Bipin Tripathi Kumaon Institute of Technology, India

Priyanka Pandey, Lakshmi Narain College of Technology Jabalpur, India

Prasant Kumar Pattnaik, KIIT University, India

S. Rajaram, Thiagarajar College of Engineering, India

Vadlamani Ravi, University of Hyderabad, India

Supriyo Roy, Birla Institute of Technology Mesra, India


Hanna Sawicka, Poznan University of Technology, Poland

S. Selvakumar, G K M College of Engineering and Technology, India

Nita H. Shah, Gujarat University, India

Pourya Shamsolmoali, CMCC, Italy

Arunesh Sharan, AS Consulting, India

G. Sreedhar, Rastriya Sanskrit Vidhya Pheet University, India

Timmaraju Srimanyu, University of Hyderabad, India

R. Suganya, Thiagarajar College of Engineering, India

K. Suneetha, Jawaharlal Nehru Technological University, India

Himanshu Tiruwa, Bipin Tripathi Kumaon Institute of Technology, India

Khadija Ali Vakeel, Indian Institute of Management Indore, India

Malathi Velu, VIT University, India

Masoumeh Zareapoor, Shanghai Jiao Tong University, China


List of Contributors



Abdullah, A Sheik/Thiagarajar College of Engineering, India 1, 34, 162

Abirami, A M./Thiagarajar College of Engineering, India 1, 162

Alam, M Afshar/Jamia Hamdard University, India 62

Arasi, Tamizh/VIT University, India 259

Askarunisa, A/KLN College of Information Technology, India 162

Balusamy, Balamurugan/VIT University, India 259

Biswas, Animesh/University of Kalyani, India 353

Chari, A Anandaraja/Rayalaseema University, India 298

Das, T K./VIT University, India 142

Dasgupta, Hirak/Symbiosis Institute of Management Studies, India 15

De, Arnab Kumar/Government College of Engineering and Textile Technology, India 353

Dubey, Sanjiva Shankar/BIMTECH Greater Noida, India 209

Gangadharan, G R./Institute for Development and Research in Banking Technology, India 379

Gebremeskel, Gebeyehu Belay/Chongqing University, China 90

Gupta, Rashik/BML Munjal University, India 192

He, Zhongshi/Chongqing University, China 90

Jha, Priya/VIT University, India 259

Kumar, Kaushik/Birla Institute of Technology, India 284

Kumar, Raghvendra/LNCT College, India 52

Mahalakshmi, C/Thiagarajar College of Engineering, India 162

Manzoor, Amir/Bahria University, Pakistan 128, 225

Mishra, Vinod Kumar/Bipin Tripathi Kumaon Institute of Technology, India 175

Pandey, Priyanka/LNCT College, India 52

Pattnaik, Prasant Kumar/KIIT University, India 52

Ponnuru, Karteek Ramalinga/BML Munjal University, India 192

Rajaram, S/Thiagarajar College of Engineering, India 34

Ravi, Vadlamani/Institute for Development and Research in Banking Technology, India 379

Roy, Supriyo/Birla Institute of Technology, India 284

Sawicka, Hanna/Poznan University of Technology, Poland 315

Selvakumar, S/G K M College of Engineering and Technology, India 1, 34, 162

Shah, Nita H./Gujarat University, India 341

Shamsolmoali, Pourya/CMCC, Italy 62

Sharan, Arunesh/AS Consulting, India 209

Sreedhar, G/Rashtriya Sanskrit Vidyapeetha (Deemed University), India 298

Suganya, R/Thiagarajar College of Engineering, India 34


Suneetha, Keerthi/SVEC, India 240

Timmaraju, Srimanyu/Institute for Development and Research in Banking Technology, India 379

Tiruwa, Himanshu/Bipin Tripathi Kumaon Institute of Technology, India 175

Trivedi, Shrawan Kumar/BML Munjal University, India 192

Vakeel, Khadija Ali/Indian Institute of Management Indore, India 250

Velu, Malathi/VIT University, India 259

Yi, Chai/Chongqing University, China 90

Zareapoor, Masoumeh/Shanghai Jiao Tong University, China 62


Table of Contents



Preface xxii

Acknowledgment xxvi

Section 1 Business Intelligence With Data Mining: Process and Applications

Chapter 1

An Introduction to Data Analytics: Its Types and Its Applications 1

A Sheik Abdullah, Thiagarajar College of Engineering, India

S Selvakumar, G K M College of Engineering and Technology, India

A M Abirami, Thiagarajar College of Engineering, India

Chapter 2

Data Mining and Statistics: Tools for Decision Making in the Age of Big Data 15

Hirak Dasgupta, Symbiosis Institute of Management Studies, India

Chapter 3

Data Classification: Its Techniques and Big Data 34

A Sheik Abdullah, Thiagarajar College of Engineering, India

R Suganya, Thiagarajar College of Engineering, India

S Selvakumar, G K M College of Engineering and Technology, India

S Rajaram, Thiagarajar College of Engineering, India

Chapter 4

Secure Data Analysis in Clusters (Iris Database) 52

Raghvendra Kumar, LNCT College, India

Prasant Kumar Pattnaik, KIIT University, India

Priyanka Pandey, LNCT College, India

Chapter 5

Data Mining for Secure Online Payment Transaction 62

Masoumeh Zareapoor, Shanghai Jiao Tong University, China

Pourya Shamsolmoali, CMCC, Italy

M Afshar Alam, Jamia Hamdard University, India


Chapter 6

The Integral of Spatial Data Mining in the Era of Big Data: Algorithms and Applications 90

Gebeyehu Belay Gebremeskel, Chongqing University, China

Chai Yi, Chongqing University, China

Zhongshi He, Chongqing University, China

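Section 2 Social Media Analytics With Sentiment Analysis: Business Applications and Methods

Chapter 7

Social Media as Mirror of Society 128

Amir Manzoor, Bahria University, Pakistan

Chapter 8

Business Intelligence through Opinion Mining 142

T K Das, VIT University, India

Chapter 9

Sentiment Analysis 162

A M Abirami, Thiagarajar College of Engineering, India

A Sheik Abdullah, Thiagarajar College of Engineering, India

A Askarunisa, KLN College of Information Technology, India

S Selvakumar, G K M College of Engineering and Technology, India

C Mahalakshmi, Thiagarajar College of Engineering, India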

Chapter 10

Aspect-Based Sentiment Analysis of Online Product Reviews 175

Vinod Kumar Mishra, Bipin Tripathi Kumaon Institute of Technology, India

Himanshu Tiruwa, Bipin Tripathi Kumaon Institute of Technology, India

Chapter 11

Sentiment Analysis with Social Media Analytics, Methods, Process, and Applications 192

Karteek Ramalinga Ponnuru, BML Munjal University, India

Rashik Gupta, BML Munjal University, India

Shrawan Kumar Trivedi, BML Munjal University, India

Chapter 12

Organizational Issue for BI Success: Critical Success Factors for BI Implementations within the Enterprise 209

Sanjiva Shankar Dubey, BIMTECH Greater Noida, India

Arunesh Sharan, AS Consulting, India

Chapter 13

Ethics of Social Media Research 225

Amir Manzoor, Bahria University, Pakistan


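Section 3 Big Data Analytics: Its Methods and Applications

Chapter 14

Big Data Analytics in Health Care 240

Keerthi Suneetha, SVEC, India

Chapter 15

Mining Big Data for Marketing Intelligence 250

Khadija Ali Vakeel, Indian Institute of Management Indore, India

Chapter 16

Predictive Analysis for Digital Marketing Using Big Data: Big Data for Predictive Analysis 259

Balamurugan Balusamy, VIT University, India

Priya Jha, VIT University, India

Tamizh Arasi, VIT University, India

Malathi Velu, VIT University, India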

Chapter 17

Strategic Best-in-Class Performance for Voice to Customer: Is Big Data in Logistics a Perfect Match? 284

Supriyo Roy, Birla Institute of Technology, India

Kaushik Kumar, Birla Institute of Technology, India

Section 4 Advanced Data Analytics: Decision Models and Business Applications

Chapter 18

First Look on Web Mining Techniques to Improve Business Intelligence of E-Commerce Applications 298

G Sreedhar, Rashtriya Sanskrit Vidyapeetha (Deemed University), India

A Anandaraja Chari, Rayalaseema University, India

Animesh Biswas, University of Kalyani, India

Arnab Kumar De, Government College of Engineering and Textile Technology, India


Detailed Table of Contents



Preface xxii

Acknowledgment xxvi

Section 1 Business Intelligence With Data Mining: Process and Applications

Chapter 1

An Introduction to Data Analytics: Its Types and Its Applications 1

A Sheik Abdullah, Thiagarajar College of Engineering, India

S Selvakumar, G K M College of Engineering and Technology, India

A M Abirami, Thiagarajar College of Engineering, India

Data analytics deals mainly with the science of examining and investigating raw data to derive useful patterns and inferences. It has been deployed in many industries to support decision making at the appropriate levels. It focuses on the assumptions and evaluation of a method with the intention of deriving conclusions at various levels. Various types of data analytical techniques, such as predictive analytics, prescriptive analytics, descriptive analytics, text analytics, and social media analytics, are used by industrial organizations, educational institutions, and government associations. This chapter focuses on illustrating contextual examples of the various types of analytical techniques and their applications.

Chapter 2

Data Mining and Statistics: Tools for Decision Making in the Age of Big Data 15

Hirak Dasgupta, Symbiosis Institute of Management Studies, India

In the age of information, the world abounds with data. In order to obtain an intelligent appreciation of current developments, we need to absorb and interpret substantial amounts of data. The amount of data collected has grown at a phenomenal rate over the past few years. The computer age has given us both the power to rapidly process, summarize and analyse data and the encouragement to produce and store more data. The aim of data mining is to make sense of large amounts of mostly unsupervised data in some domain. Data mining is used to discover patterns and relationships in data, with an emphasis on large observational databases. This chapter compares the two approaches and concludes that statisticians and data miners can profit by studying each other’s methods and by combining those methods judiciously. The chapter also discusses the data cleaning techniques involved in data mining.


Chapter 3

Data Classification: Its Techniques and Big Data 34

A Sheik Abdullah, Thiagarajar College of Engineering, India

R Suganya, Thiagarajar College of Engineering, India

S Selvakumar, G K M College of Engineering and Technology, India

S Rajaram, Thiagarajar College of Engineering, India

Classification is considered to be one of the data analysis techniques that can be used across many applications. A classification model predicts categorical class labels, whereas clustering mainly deals with grouping variables based on similar characteristics. Classification models are tested by comparing the predicted values with the known target values in a set of test data. Data classification has many applications in business modeling, marketing analysis, credit risk analysis, biomedical engineering, and drug retort modeling. Extending data analysis and classification provides insight into big data, including the processing and management of large data sets. This chapter deals with various techniques and methodologies that address the classification problem in the data analysis process and their methodological impact on big data.

Chapter 4

Secure Data Analysis in Clusters (Iris Database) 52

Raghvendra Kumar, LNCT College, India

Prasant Kumar Pattnaik, KIIT University, India

Priyanka Pandey, LNCT College, India

This chapter uses privacy preservation techniques (data modification) to ensure privacy. It considers a scenario in which a number of clients, each owning a clustered database (Iris Database), wish to run a data mining algorithm on the union of their databases without revealing any unnecessary information, while preserving the privacy of privileged information. A number of efficient protocols are required for privacy preservation in data mining. The chapter presents various privacy-preserving protocols that are used for security in clustered databases. The Xln(X) protocol and the secure sum protocol are used in mutual computing and can defend privacy efficiently. The chapter focuses on data modification techniques: the distributed database is modified, and the modified data set is then sent to the client admin for secure data communication with zero percent data leakage and reduced communication and computation complexity.

Chapter 5

Data Mining for Secure Online Payment Transaction 62

Masoumeh Zareapoor, Shanghai Jiao Tong University, China

Pourya Shamsolmoali, CMCC, Italy

M Afshar Alam, Jamia Hamdard University, India

Fraud detection requires a holistic approach in which the objective is to correctly classify transactions as legitimate or fraudulent. Existing methods give priority to detecting all fraudulent transactions, since these result in monetary loss. To achieve this, they most often have to compromise on some genuine transactions. Thus, the major issue that credit card fraud detection systems face today is that a significant percentage of transactions labelled as fraudulent are in fact legitimate. These “false alarms” delay transactions and create inconvenience and dissatisfaction for the customer. The objective of this research is therefore to develop an intelligent, data-mining-based fraud detection system for secure online payment transactions. The proposed model is evaluated on a real credit card data set and is found to have a higher fraud detection rate and a lower false alarm rate than other state-of-the-art classifiers.

Chapter 6

The Integral of Spatial Data Mining in the Era of Big Data: Algorithms and Applications 90

Gebeyehu Belay Gebremeskel, Chongqing University, China

Chai Yi, Chongqing University, China

Zhongshi He, Chongqing University, China

Data Mining (DM) is a rapidly expanding field in many disciplines, and it is a strong motivation for analyzing massive data types, including geospatial, image and other forms of data sets. Such fast-growing data, characterized by high volume, velocity, variety, variability, value and other properties, are collected and generated from various sources and are too complex and too big to capture, store, and analyze with traditional tools. Spatial Data Mining (SDM) is, therefore, the process of searching for and discovering valuable information and knowledge in large volumes of spatial data, drawing basic principles from concepts in databases, machine learning, statistics, pattern recognition and ‘soft’ computing. Using DM techniques enables a more efficient use of the data warehouse. SDM is thus becoming an emerging research field in the geosciences because of the increasing amount of data, which leads to promising new applications. The integral SDM on which this chapter focuses is inference over geospatial and GIS data.

Section 2 Social Media Analytics With Sentiment Analysis: Business Applications and Methods

Chapter 7

Social Media as Mirror of Society 128

Amir Manzoor, Bahria University, Pakistan

Over the last decade, social media use has gained much attention from scholarly researchers. One specific reason for this interest is the use of social media for communication, a trend that is gaining tremendous popularity. Every social media platform has developed its own set of application programming interfaces (APIs). Through these APIs, the data available on a particular social media platform can be accessed. However, the data available is limited, and it is difficult to ascertain what conclusions can be drawn about society on the basis of this data. This chapter explores the ways social researchers and scientists can use social media data to support their research and analysis.

Chapter 8

Business Intelligence through Opinion Mining 142

T K Das, VIT University, India

Business organizations have been adopting different strategies to impress their customers and attract them towards their products and services. On the other hand, the opinions of customers gathered through customer feedback have been a great source of information for companies to evolve business intelligence and to rightly place their products and services to meet ever-changing customer requirements. In this work, we present a new approach to integrating customers’ opinions into the traditional data warehouse model. We have taken Twitter as the data source for this experiment. First, we have built a system which can be used for opinion analysis on a product or a service. The second step is to model the opinion table so obtained as a dimensional table and to integrate it with a central data warehouse schema so that reports can be generated on demand. Furthermore, we have shown how business intelligence can be elicited from online product reviews by using computational intelligence techniques such as rough-set-based data analysis.

Chapter 9

Sentiment Analysis 162

A M Abirami, Thiagarajar College of Engineering, India

A Sheik Abdullah, Thiagarajar College of Engineering, India

A Askarunisa, KLN College of Information Technology, India

S Selvakumar, G K M College of Engineering and Technology, India

C Mahalakshmi, Thiagarajar College of Engineering, India

Processing the billions of daily social conversations across millions of sources requires sophisticated streaming big data processing. Information must be extracted from these data sets, and contextual semantic sentiment modeling is required to capture intelligence from the complexity of online social discussions. Sentiment analysis is one of the techniques for capturing intelligence from social networks based on user-generated content. Research on sentiment classification is growing steadily. Aspect extraction is the core task in aspect-based sentiment analysis. The proposed model uses the Latent Semantic Analysis technique for aspect extraction and evaluates senti-scores of the various products under study.

Chapter 10

Aspect-Based Sentiment Analysis of Online Product Reviews 175

Vinod Kumar Mishra, Bipin Tripathi Kumaon Institute of Technology, India

Himanshu Tiruwa, Bipin Tripathi Kumaon Institute of Technology, India

Sentiment analysis is a part of computational linguistics concerned with extracting sentiment and emotion from text. It is also considered a task of natural language processing and data mining. Sentiment analysis mainly concentrates on identifying whether a given text is subjective or objective and, if it is subjective, whether it is negative, positive or neutral. This chapter provides an overview of aspect-based sentiment analysis, together with current and future research trends in the area. It also provides an aspect-based sentiment analysis of online customer reviews of the Nokia 6600. To perform aspect-based classification, we use a lexical approach on the Eclipse platform that classifies each review as positive, negative or neutral on the basis of the features of the product. SentiWordNet is used as a lexical resource to calculate the overall sentiment score of each sentence, a POS tagger is used for part-of-speech tagging, a frequency-based method is used to extract the aspects/features, and negation handling is used to improve the accuracy of the system.

Chapter 11

Sentiment Analysis with Social Media Analytics, Methods, Process, and Applications 192

Karteek Ramalinga Ponnuru, BML Munjal University, India

Rashik Gupta, BML Munjal University, India

Shrawan Kumar Trivedi, BML Munjal University, India

Firms are turning their eyes towards social media analytics to find out what people are really saying about their firm or their product. With the huge amount of buzz being created online about anything and everything, social media has become ‘the’ platform of the day for understanding what the public as a whole is saying about a particular product, and the process of converting all this talk into valuable information is called Sentiment Analysis. Sentiment Analysis is the process of identifying and categorizing a piece of text as positive or negative so as to understand the sentiment of the users. This chapter takes the reader from basic sentiment analysis tools such as word clouds, commonality clouds, dendrograms and comparison clouds through to advanced algorithms such as K Nearest Neighbour, Naïve Bayes and Support Vector Machine.

Chapter 12

Organizational Issue for BI Success: Critical Success Factors for BI Implementations within the Enterprise 209

Sanjiva Shankar Dubey, BIMTECH Greater Noida, India

Arunesh Sharan, AS Consulting, India

This chapter focuses on the transformative effect that Business Intelligence (BI) has on an organization’s decision making: enhancing its performance, reducing its overall cost of operations and improving its competitive posture. The chapter sets out the key principles and practices for bridging the gap between organizational requirements and the capabilities of any BI tool(s) by proposing a framework of organizational factors such as users’ roles, their analytical needs, access preferences and technical/analytical literacy. An evaluation methodology for selecting the BI tools best aligned to the organization’s infrastructure is also discussed. Softer issues and the organizational change needed for successful implementation of BI are explained further.

Chapter 13

Ethics of Social Media Research 225

Amir Manzoor, Bahria University, Pakistan

Over the last decade, social media platforms have become a very popular channel of communication. This popularity has sparked increasing interest among researchers in investigating social media communication. Many studies have collected publicly available social media communication data to unearth significant patterns. However, one significant concern raised over this practice is the privacy of individuals’ social media communication data. As such, it is important that specific ethical guidelines are in place for future research on social media sites. This chapter explores various ethical issues related to research on social networking sites. The chapter also provides a set of ethical guidelines that future research on social media sites can use to address these issues.

Section 3 Big Data Analytics: Its Methods and Applications

Chapter 14

Big Data Analytics in Health Care 240

Keerthi Suneetha, SVEC, India

With the arrival of new technology and the rising amount of data (Big Data), there is a need to implement effective analytical techniques (Big Data Analytics) in the health sector, providing stakeholders with new insights that have the potential to advance personalized care, improve patient outcomes and avoid unnecessary costs. This chapter covers how to evaluate this large volume of data for unknown and useful facts, associations, patterns and trends that can give rise to new ways of handling diseases and provide high-quality healthcare at lower cost to all citizens. The chapter gives a broad introduction to Big Data Analytics (BDA) in the health domain, the processing steps of BDA, and the challenges and future scope of research in healthcare.

Chapter 15

Mining Big Data for Marketing Intelligence 250

Khadija Ali Vakeel, Indian Institute of Management Indore, India

This chapter elaborates on mining techniques useful in big data analysis. Specifically, it explains how to use association rule mining, self-organizing maps, word clouds, sentiment extraction, network analysis, classification, and clustering for marketing intelligence. These techniques are applied to decisions related to market segmentation, targeting and positioning, trend analysis, sales, stock markets and word of mouth. The chapter is divided into two parts. The first covers data collection and cleaning, where we elaborate on how Twitter data can be extracted and mined for marketing decision making. The second part discusses various techniques that can be used in big data analysis for mining content and interaction networks.

Chapter 16

Predictive Analysis for Digital Marketing Using Big Data: Big Data for Predictive Analysis 259

Balamurugan Balusamy, VIT University, India

Priya Jha, VIT University, India

Tamizh Arasi, VIT University, India

Malathi Velu, VIT University, India

In recent years, big data analytics has produced lightning-fast applications that deal with predictive analysis of huge volumes of data in the domains of finance, health, weather, travel, marketing and more. Business analysts make their decisions using statistical analysis of the available data pulled in from social media, user surveys, blogs and internet resources. Customer sentiment has to be taken into account when designing, launching and pricing a product to be introduced into the market, and the emotions of consumers change and are influenced by several tangible and intangible factors. The possibility of using big data analytics to present data in a quickly viewable format, giving different perspectives on the same data, is appreciated in the fields of finance and health, where decision support systems can be introduced in all aspects of their working. Cognitive computing and artificial intelligence are making big data analytical algorithms think more on their own, leading to big data agents with their own functionalities.

Chapter 17

Strategic Best-in-Class Performance for Voice to Customer: Is Big Data in Logistics a Perfect Match? 284

Supriyo Roy, Birla Institute of Technology, India

Kaushik Kumar, Birla Institute of Technology, India

For any forward-looking perspective, organizational information, which is typically historical, incomplete and most of the time inaccurate, needs to be enriched with external information. However, traditional systems and approaches are slow and inflexible and cannot handle the new volume and complexity of information. Big data, an evolving term, basically refers to a voluminous amount of structured, semi-structured or unstructured information in the form of data with the potential to be mined for ‘best in class information’. Primarily, big data can be characterized by the three V’s: volume, variety and velocity. Recent hype around big data concepts predicts that it will help companies to improve operations and make faster and more intelligent decisions. Considering the complexities of the supply chain, this study attempts to highlight the problems of storing data in any business, especially in the Indian scenario, where the logistics arena is highly unstructured and complicated. The conclusions may be significant to any strategic decision maker or manager working with distribution and logistics.

Section 4 Advanced Data Analytics: Decision Models and Business Applications

Chapter 18

First Look on Web Mining Techniques to Improve Business Intelligence of E-Commerce Applications

G Sreedhar, Rashtriya Sanskrit Vidyapeetha (Deemed University), India

A Anandaraja Chari, Rayalaseema University, India

Web data mining is the application of data mining techniques to extract useful knowledge from web data such as the contents of web pages, hyperlinks between documents and web usage logs. There is also a strong need for techniques that help business decision making in e-commerce. Web data mining can be broadly divided into three categories: Web content mining, Web structure mining and Web usage mining. Web content data is the content made available to users to satisfy their information needs. Web structure data represents the linkage and relationships of web pages to one another. Web usage data involves log data collected by web servers and application servers, which is the main source of data. The growth of the WWW and related technologies has made business functions faster and easier to execute. As a large number of transactions are performed through e-commerce sites and a huge amount of data is stored, valuable knowledge can be obtained by applying Web mining techniques.

Chapter 19

Artificial Intelligence in Stochastic Multiple Criteria Decision Making

Hanna Sawicka, Poznan University of Technology, Poland

This chapter presents the concept of a stochastic multiple criteria decision making (MCDM) method to solve complex ranking decision problems. This approach is composed of three main areas of research, i.e. classical MCDM, probability theory and classification methods. The most important steps of the idea are characterized and specific features of the applied methods are briefly presented. The application of Electre III combined with probability theory, and of Promethee II combined with a Bayes classifier, is described in detail. Two case studies of stochastic multiple criteria decision making are presented. The first one shows the distribution system of electrotechnical products, composed of 24 distribution centers (DC), while the core business of the second one is the production and warehousing of pharmaceutical products. Based on the application of the presented stochastic MCDM method, different ways of improving these complex systems are proposed and the final, i.e. the best, paths of change are recommended.


Chapter 21

On Development of a Fuzzy Stochastic Programming Model with Its Application to Business Management

Animesh Biswas, University of Kalyani, India

Arnab Kumar De, Government College of Engineering and Textile Technology, India

This chapter demonstrates the efficiency of fuzzy goal programming for multiobjective aggregate production planning in a fuzzy stochastic environment. The parameters of the objectives are taken as normally distributed fuzzy random variables and the chance constraints involve joint Cauchy distributed fuzzy random variables. In the model formulation process, the fuzzy chance constrained programming model is converted into its equivalent fuzzy programming form using a probabilistic technique, α-cuts of fuzzy numbers and the expectation of the parameters of the objectives. A defuzzification technique for fuzzy numbers is used to obtain a multiobjective linear programming model. The membership function of each objective is constructed depending on its optimal values. Afterwards, a weighted fuzzy goal programming model is developed to achieve the highest degree of each of the membership goals to the extent possible by minimizing group regrets in a multiobjective decision making context. To explore the potential of the proposed approach, production planning of a health drinks manufacturing company has been considered.


Compilation of References

About the Contributors

Index


Preface

The complete work of this book is divided into four sections. The first section, titled “Business Intelligence with Data Mining: Process and Applications”, includes all the chapters related to business analytics with data mining and its applications. The second section, titled “Social Media Analytics with Sentiment Analysis: Business Applications and Methods”, contains all the chapters related to social media analytics techniques and their applications in business intelligence. The third section, titled “Big Data Analytics: Its Methods and Applications”, covers all the chapters related to big data processes and their applications. The last section, titled “Advanced Data Analytics: Decision Models and Business Applications”, includes the chapters related to advanced decision models for business analytics. A brief description of each section follows:

The first section of this book is “Business Intelligence with Data Mining: Process and Applications”, in which chapters related to data mining methods and their applications are discussed. The first chapter of this section, authored by A. Sheik Abdullah, S. Selvakumar, and A. M. Abirami, explains that data analytics mainly deals with the science of examining and investigating raw data to derive useful patterns and inferences. Data analytics has been deployed in many industries to make decisions at the proper levels. It focuses upon the assumption and evaluation of the method with the intention of deriving a conclusion at various levels. Various types of data analytical techniques, such as predictive analytics, prescriptive analytics, descriptive analytics, text analytics, and social media analytics, are used by industrial organizations, educational institutions and government associations. The chapter mainly focuses on contextual examples that illustrate the various types of analytical techniques and their applications. In the second chapter, Hirak Dasgupta aims to compare the approaches and concludes that statisticians and data miners can profit by studying each other’s methods and by using the combination of methods judiciously. The chapter also discusses the data cleaning techniques involved in data mining. The third chapter of this section, authored by A. Sheik Abdullah, R. Suganya, S. Selvakumar, and S. Rajaram, deals with various techniques and methodologies that correspond to the classification problem in the data analysis process and their methodological impact on big data. The fourth chapter, written by Raghvendra Kumar, Prasant Kumar Pattnaik and Priyanka Pandey, presents various privacy preserving protocols that are used for security in clustered databases. The Xln(X) protocol and the secure sum protocol are used in mutual computing, which can defend privacy efficiently. The chapter focuses on data modification techniques, in which the distributed database is modified and the modified data set is then sent to the client admin for secure data communication with zero percentage of data leakage and reduced communication and computation complexity. The fifth chapter of this section, authored by Masoumeh Zareapoor, Pourya Shamsolmoali and M. Afshar Alam, shows the performance of a new credit card fraud detection technique which first balances the transaction records and then applies the proposed algorithm to detect the fraudulent transactions. At the end, the authors conduct a series of experiments to evaluate the effectiveness of the proposed techniques. Chapter six, authored by Belay Gebremeskel, Yi Chai, and Zhongshi He, incorporates novel ideas and methodologies as an integral part of spatial data mining (SDM); it is highly pertinent and serves as a single reference material for researchers, experts, and other users.

The second section of this book is “Social Media Analytics with Sentiment Analysis: Business Applications and Methods”, in which chapters related to social media analytics methods and related applications are discussed. Chapter 7, authored by Amir Manzoor, explores the ways social researchers and scientists can use social media data to support their research and analysis. Chapter 8, written by T. K. Das, presents a new approach to integrating customers’ opinions into the traditional data warehouse model. He takes Twitter as the data source for this experiment. First, a system that can be used for opinion analysis on a product or a service is built. Second, the opinion table so obtained is modeled as a dimensional table and integrated with a central data warehouse schema so that reports can be generated on demand. Furthermore, he shows how business intelligence can be elicited from online product reviews by using computational intelligence techniques such as rough set based data analysis. Chapter 9, authored by A. M. Abirami, A. Sheik Abdullah, A. Askarunisa, S. Selvakumar, and C. Mahalakshmi, proposes a modeling technique that uses latent semantic analysis (LSA) for aspect extraction and evaluates senti-scores of the various products under study. In Chapter 10, Vinod Kumar Mishra and Himanshu Tiruwa provide an overview of aspect based sentiment analysis with current and future research trends. This chapter also provides an aspect based sentiment analysis of online customer reviews of the Nokia 6600. To perform aspect based classification, they use a lexical approach on the Eclipse platform which classifies each review as positive, negative or neutral on the basis of the features of the product. SentiWordNet is used as a lexical resource to calculate the overall sentiment score of each sentence, a PoS tagger is used for part-of-speech tagging, a frequency based method is used for extraction of the aspects/features, and negation handling is used to improve the accuracy of the system. Chapter 11, written by Ponnuru Ramalinga Karteek, Rashik Gupta, and Shrawan Kumar Trivedi, takes the reader from basic sentiment classifiers, such as word clouds, commonality clouds, dendrograms and comparison clouds, to advanced algorithms such as K Nearest Neighbour, Naïve Bayes and Support Vector Machine. In Chapter 12, Sanjiva Shankar Dubey and Arunesh Sharan enunciate the key principles and practices to bridge the gap between organization requirements and the capabilities of any BI tool(s) by proposing a framework of organizational factors such as users’ roles, their analytical needs, access preferences and technical/analytical literacy. Chapter 13, authored by Amir Manzoor, explores various ethical issues related to research on social networking sites. This chapter also provides a set of ethical guidelines that future research on social media sites can use to address various ethical issues.

The third section of this book is “Big Data Analytics: Its Methods and Applications”, in which chapters related to big data analytics methods and their applications are discussed. In this section, Chapter 14, written by K. Suneetha, covers how to evaluate this big volume of data for unknown and useful facts, associations, patterns and trends, which can give rise to new ways of handling diseases and provide high quality healthcare at lower cost to all citizens. This chapter gives a broad introduction to Big Data Analytics (BDA) in the health domain, the processing steps of BDA, and the challenges and future scope of research in healthcare. Chapter 15, authored by Khadija Ali Vakeel, elaborates on mining techniques useful in big data analysis. Specifically, it elaborates on how to use association rule mining, self-organizing maps, word clouds, sentiment extraction, network analysis, classification, and clustering for marketing intelligence. These are applied to decisions related to market segmentation, targeting and positioning, trend analysis, sales, stock markets and word of mouth. The chapter is divided into two parts. The first, on data collection and cleaning, elaborates on how Twitter data can be extracted and mined for marketing decision making. The second discusses various techniques that can be used in big data analysis for content and interaction networks. In Chapter 16, Balamurugan Balusamy, Priya Jha, Tamizh Arasi, and Malathi Velu discuss how big data analytics has in recent years produced lightning-fast applications that deal with predictive analysis of huge volumes of data in domains such as finance, health, weather, travel, marketing and more. Business analysts take their decisions using statistical analysis of the available data pulled in from social media, user surveys, blogs and internet resources. Customer sentiment has to be taken into account when designing, launching and pricing a product to be introduced into the market, and the emotions of consumers change and are influenced by several tangible and intangible factors. The possibility of using big data analytics to present data in a quickly viewable format, giving different perspectives of the same data, is appreciated in the fields of finance and health, where the advent of decision support systems is possible in all aspects of their working. Cognitive computing and artificial intelligence are making big data analytical algorithms think more on their own, leading to big data agents with their own functionalities. In Chapter 17, Supriyo Roy and Kaushik Kumar explore the usefulness of applying big data concepts in the emerging areas of logistics along different dimensions. The conclusion of this chapter may be significant to any strategic decision maker or manager working in the field of distribution and logistics.

The fourth section of this book is “Advanced Data Analytics: Decision Models and Business Applications”, in which chapters related to advanced data analytics techniques and their applications are discussed. Chapter 18, written by G. Sreedhar and A. A. Chari, considers page load time as an important element for assessing the performance of some well-known online business websites through statistical tools. This research work also considers the optimum design of business websites, leading to the improvement of online business processes. Chapter 19, written by Hanna Sawicka, presents the concept of a stochastic multiple criteria decision making (MCDM) method to solve complex ranking decision problems. This approach is composed of three main areas of research, i.e. classical MCDM, probability theory and classification methods. The most important steps of the idea are characterized and specific features of the applied methods are briefly presented. The application of Electre III combined with probability theory, and of Promethee II combined with a Bayes classifier, is described in detail. Two case studies of stochastic multiple criteria decision making are presented. The first one shows the distribution system of electrotechnical products, composed of 24 distribution centers (DC), while the core business of the second one is the production and warehousing of pharmaceutical products. Based on the application of the presented stochastic MCDM method, different ways of improving these complex systems are proposed and the final, i.e. the best, paths of change are recommended. In Chapter 20, Nita H. Shah analyzes a supply chain comprising two front-runner retailers and one supplier. The retailers offer customers a delay in payment to settle their accounts against purchases, a facility the retailers themselves receive from the supplier. Each retailer’s market demand depends on time, the retail price and the credit period offered to customers, together with those of the other retailer. The supplier gives items with the same wholesale price and credit period to both retailers. The joint and independent decisions are analyzed and validated numerically. Chapter 21, written by Animesh Biswas and Arnab Kumar De, demonstrates the efficiency of the fuzzy goal programming technique for multi-objective aggregate production planning in a fuzzy stochastic environment. The parameters of the objectives are taken as normally distributed fuzzy random variables and the chance constraints involve joint Cauchy distributed fuzzy random variables. In the model formulation process, the fuzzy chance constrained programming model is converted into its equivalent fuzzy programming form using a probabilistic technique, α-cuts of fuzzy numbers and the expectation of the parameters of the objectives. A defuzzification technique for fuzzy numbers is used to obtain a multi-objective linear programming model. The membership function of each objective is constructed depending on its optimal values. Afterwards, a weighted fuzzy goal programming model is developed to achieve the highest degree of each of the membership goals to the extent possible by minimizing group regrets in a multi-objective decision making context. To explore the potential of the proposed approach, production planning of a health drinks manufacturing company has been considered. Chapter 22, written by Timmaraju Srimanyu, Vadlamani Ravi, and G. R. Gangadharan, focuses on the recommendation of cloud services by ranking them with the help of opinion mining of users’ reviews and multi-attribute decision making models (TOPSIS and FMADM were applied separately) in tandem on both quantitative and qualitative data. Surprisingly, both TOPSIS and FMADM yielded the same rankings for the cloud services.


Acknowledgment

The editors would like to acknowledge the help of all the people involved in this project and, more specifically, the authors and reviewers who took part in the review process. Without their support, this book would not have become a reality.

We would like to thank each one of the authors for their contributions. The editors wish to acknowledge the valuable contributions of the reviewers regarding the improvement of the quality, coherence, and content presentation of the chapters. Most of the authors also served as referees; we highly appreciate their double task. We are grateful to all members of the IGI publishing house for their assistance and timely motivation in producing this volume.

We hope the readers will share our excitement about this important scientific contribution to the body of knowledge on advanced data mining techniques and applications for business intelligence.

Shrawan Kumar Trivedi

BML Munjal University, India

Shubhamoy Dey

Indian Institute of Management Indore, India

Anil Kumar

BML Munjal University, India

Tapan Kumar Panda

Jindal Global Business School, India


a conclusion at various levels. Various types of data analytical techniques such as predictive analytics, prescriptive analytics, descriptive analytics, text analytics, and social media analytics are used by industrial organizations, educational institutions and government associations. This chapter mainly focuses on contextual examples that illustrate the various types of analytical techniques and their applications.

INTRODUCTION: DATA ANALYTICS

Data analytics is the science of investigating raw data with the intention of deriving a solution for a specified problem. Nowadays analytics is used by many corporations, industries and institutions for making sound decisions at various levels. It is the mechanism of drawing solutions from the analysis of large datasets with the intention of determining hidden patterns and their relationships. Analytics differs from mining in the mechanism of determining new patterns and in its scope, techniques and purpose.

An Introduction to Data Analytics:

Its Types and Its Applications


ANALYTICS PROCESS MODEL

The term analytics is used interchangeably with machine learning, data science and knowledge discovery. The process model starts with the data source, which is in raw form. The data needed for analysis has to be selected in accordance with the problem to be interpreted. The identified data may contain missing fields and irrelevant data items, which have to be resolved and cleaned. The data then has to be transformed into the format necessary for evaluation, which can be done with data standardization techniques such as min-max normalization, Z-score normalization and normalization by decimal scaling. As an outcome, the final evaluated pattern provides a visualized representation of the data, which can be fed forward for evaluation and interpretation. The workflow of the process model is depicted in Figure 1.
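The three standardization techniques named above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the sample `incomes` values are assumptions made for the example and do not come from the chapter.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:  # all values identical: map everything to new_min
        return [new_min for _ in values]
    return [(v - lo) / span * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Centre values on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]


# Hypothetical attribute values, chosen only for illustration.
incomes = [200, 300, 400, 600, 1000]
print(min_max(incomes))          # [0.0, 0.125, 0.25, 0.5, 1.0]
print(z_score(incomes))
print(decimal_scaling([-986, 917]))
```

Min-max normalization preserves the relative spacing of the values, Z-score normalization is less sensitive to a changing minimum or maximum, and decimal scaling only moves the decimal point.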

ANALYTIC REQUIREMENTS

The analytical model should actually solve the problem for which it is developed. To solve the problem, the problem itself should be properly defined. The model must have predictive capabilities in order to determine patterns and interpretations from the observed data. The model should also be interpretable and justifiable. Even though the model is to be interpretable, it should still meet its statistical performance targets. Efficiency in collecting, processing and analyzing the data is also part of the analytical requirements.

Figure 1 Analytics process model


TYPES OF ANALYTICS

Predictive Analytics

Predictive analytics mainly deals with predicting or observing the target value of a measure. The value of the measure signifies the performance of the analytical model being developed, so the nature of the developed model can be ascertained from the measured value. Predictive analytics is often described as supervised learning because the target variable is known in advance for each tuple (record) (T Hastie, R Tibshirani, & Friedman, 2001). Various algorithms are used in predicting the nature of data or of a real-world problem, such as:

6 Ensemble methods such as boosting and bagging

Let us discuss one of the techniques in predictive analytics, linear regression.

Linear Regression

Simple linear regression involves a response variable and a predictor variable. Straight-line linear regression involves a single predictor variable, whereas multiple linear regression involves more than one predictor variable (Jiawei Han, Micheline Kamber & Jian Pei, 2011). The straight-line regression is represented in Equation 1 as follows:

$$Y = W_0 + W_1 X \qquad (1)$$

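The coefficients $W_0$ and $W_1$ are referred to as the weights of the predictive function. Let $D$ be the dataset containing the values of the predictor variable $X$ and the response variable $Y$, represented in the form $(x_1, y_1), (x_2, y_2), \ldots, (x_{|D|}, y_{|D|})$. The weights can be estimated by the method of least squares:

$$W_1 = \frac{\sum_{i=1}^{|D|} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{|D|} (x_i - \bar{x})^2}, \qquad W_0 = \bar{y} - W_1 \bar{x}$$

where $\bar{x}$ is the mean value of $x_1, x_2, \ldots, x_{|D|}$ and $\bar{y}$ is the mean value of $y_1, y_2, \ldots, y_{|D|}$.

From the determined value of $W_1$, the value of $W_0$ can be obtained. Thereby, for any given value of the predictor variable, the corresponding response value can be determined.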
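As a small worked illustration of estimating the two weights, the following sketch fits a straight line by ordinary least squares using only the standard library. The function names and the sample points are hypothetical; the points were chosen to lie exactly on Y = 1 + 2X so the result is easy to check by hand.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (w0, w1) of the least-squares line y = w0 + w1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all predictor values are identical")
    w1 = sxy / sxx            # slope is computed first ...
    w0 = y_bar - w1 * x_bar   # ... and the intercept follows from it
    return w0, w1


def predict(x, w0, w1):
    """Response value for a given predictor value."""
    return w0 + w1 * x


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
w0, w1 = fit_line(xs, ys)
print(w0, w1)              # 1.0 2.0
print(predict(6, w0, w1))  # 13.0
```

With noisy real data the fitted line will not pass through every point; the same two formulas still give the line that minimizes the sum of squared errors.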

Descriptive Analytics

Descriptive analytics mainly deals with describing patterns, for example patterns of customer behavior. In predictive analytics the label (target) is known in advance, but in descriptive analytics there is no such target measure or target variable (Srikant R & Agarwal R, 1995). This technique is also referred to as unsupervised learning because no known target variable guides the learning (Bart Baesens, 2014). Various techniques deal with descriptive analytics, such as:

1 Association rule mining

2 Sequence rule mining

3 Data clustering

Let us discuss one of the techniques in descriptive analytics, sequence rule mining.

Sequence Rule Mining

Sequence rule mining determines the maximal sequences among the set of all sequences found in the given transactional data. Each such sequence must possess a certain degree of support and confidence. In market basket analysis of transactional data, the number of maximal sequences determined for an item set signifies how frequently that sequence occurs among all the items. Consider the following example of transactional data, which contains the sequence of items purchased, the session time of purchase, and the items, as depicted in Table 1.

Table 1 Sequence rule for a transactional dataset
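The notions of support and confidence for a sequence can be illustrated with a short sketch. The customer sequences below are hypothetical and are not the contents of Table 1; support is taken here as the fraction of customers whose purchase sequence contains the pattern in order (not necessarily contiguously), which is one common definition.

```python
# Hypothetical purchase sequences per customer (order of purchase matters).
CUSTOMER_SEQUENCES = {
    "c1": ["A", "B", "C"],
    "c2": ["A", "C"],
    "c3": ["B", "A", "C"],
    "c4": ["A", "B"],
}


def contains(sequence, pattern):
    """True if pattern occurs in sequence as an ordered subsequence."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the match,
    # so later pattern items must appear after earlier ones.
    return all(item in remaining for item in pattern)


def support(pattern, sequences=CUSTOMER_SEQUENCES):
    """Fraction of customers whose sequence contains the pattern."""
    hits = sum(contains(seq, pattern) for seq in sequences.values())
    return hits / len(sequences)


def confidence(antecedent, consequent, sequences=CUSTOMER_SEQUENCES):
    """Support of antecedent followed by consequent, relative to the antecedent."""
    base = support(antecedent, sequences)
    if base == 0:
        return 0.0
    return support(antecedent + consequent, sequences) / base


print(support(["A", "C"]))       # 0.75
print(support(["A", "B"]))       # 0.5
print(confidence(["A"], ["C"]))  # 0.75
```

A sequence rule such as A → C would be kept only if both values reach the chosen minimum support and confidence thresholds.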


Table 1 can be represented in the form of sequence rule as follows:

Text Analytics

Text analysis is also known as text mining. It involves extracting structured information from a collection of documents. Generally, it is performed using Natural Language Processing (NLP) techniques. It includes sub-processes such as pre-processing, Part-of-Speech (PoS) tagging, feature extraction, classification and so on.

Text pre-processing includes removal of stop words, stemming, etc. Words such as ‘a’, ‘an’ and ‘the’ usually do not convey any meaning on their own, so articles, prepositions and pronouns are removed. Stemming is the process of reducing derived words to their word stem, base or root form. For example, the words ‘construction’ and ‘constructing’ are reduced to ‘construct’. There are many stemming algorithms available. The Porter stemming algorithm (or ‘Porter stemmer’) is a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalization process that is usually done when setting up information retrieval systems.
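The pre-processing steps described above can be sketched as follows. This is a deliberately crude illustration: the stop-word list is a small hand-picked assumption, and `crude_stem` is a naive suffix stripper written for this example, not the Porter algorithm, which applies a much more careful sequence of rules.

```python
import re

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "of", "in", "on", "and", "it", "to"}

# Suffixes tried in order; a toy stand-in for a real stemmer's rules.
SUFFIXES = ("ing", "ion", "ed", "s")


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The construction of a bridge is constructing"))
# ['construct', 'bridge', 'construct']
```

In practice an established stemmer implementation would replace `crude_stem`, since naive suffix stripping over-stems many ordinary words.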

In PoS tagging, the sentences in the data set are tokenized and every word is assigned a part of speech such as noun, verb, adverb, adjective, conjunction, negation and the like. For example, “Place is very good” is tagged as shown in Table 2. Many free PoS taggers are available; Python libraries and the Stanford NLP tools can readily be used for PoS tagging.

Table 2 PoS tagging of sentence


Latent Semantic Indexing (LSI)

Latent semantic indexing (LSI) is an indexing and retrieval method that uses a mathematical technique called singular value decomposition (SVD) to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI captures patterns of word usage across documents: it considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant. This makes document classification behave more like human judgment.

Latent Dirichlet Allocation

Latent Dirichlet allocation is a way of automatically discovering topics from sentences. It is used for topic modeling or feature extraction in text documents and is a widely used technique for document classification.

Social Media Analytics

Writing on social media to express one’s views has become common nowadays. The enormous growth of social media through online reviews, discussions, blogs, micro-blogs, Twitter, etc. on the Web enables individuals and organizations to make decisions using this content. In the same way, there has been an increase in attention on social media as a source of research data in areas such as decision making, recommender systems, etc. The power of social media is described by (Jagadeesh Kumar, 2014), and it can be effectively used to influence public opinion or research behavior. Social media analysis can be defined as the analysis of user generated content written in public discussion forums, blogs and other social networking sites to make or improve business decisions.

Social media has its footprint in almost all types of industries, such as tourism, healthcare, and so on. (Ting, 2014) analyzed blogs and showed the necessity of travel blogs for sharing experiences among people. (Zeng, 2014) demonstrated the impact of social media analytics on the economic contribution of the tourism industry and thereby on the country. His work also showed that user generated content on social media web sites is perceived as recommendations from like-minded friends, mostly by the younger generations of this century. (Jacobsen, 2014) studied tourism in Mallorca (Spain) using destination-specific surveys and showed how visual content and types of content creators make a difference in holiday decision-making.

Social media analysis adds value to health care through patient engagement, by increasing access to information and the ability to receive information in real time. It provides opportunities for patients to share their experiences with others. It enables the health care industry to improve its service quality.

Much social media text research has been undertaken in the marketing and retail sectors to improve customer satisfaction, recommend new products and so on. Social media, as a data source, contains valuable consumer insights and enables business intelligence. Social media text analysis helps businesses take marketing decisions based on the most discussed topics in social media. It also enables them to drill into the data to see what is causing dissatisfaction among users. It provides multi-dimensional insight into a brand and its features, promotions, shoppers, consumers, and influencers. It delivers trend analysis, behavior tracking, and overall understanding.

The sentiment analysis framework identifies the key discussion concepts from user reviews or comments. Feature based or aspect level sentiment analysis classifies sentiment with respect to specific aspects of the entities. For example, the sentence “The camera’s picture quality is good, but its battery life is short” evaluates two aspects, picture quality and battery life, of the camera (entity). The sentiment on the camera’s picture quality is positive, but the sentiment on its battery life is negative. The picture quality and battery life of the camera are the opinion targets. User generated data provides a higher degree of accuracy about what exactly the user feels about a particular place. Feature based analysis indicates, for each feature of a product, whether opinion on it is positive or negative. This information can be used to improve business outcomes and ensure a very high level of user satisfaction.
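The camera example can be reproduced with a toy lexicon-based sketch. The aspect list and the opinion lexicon below are hypothetical and tiny; treating "short" as negative is a domain-specific assumption that holds for battery life but not in general. The sketch splits the sentence into clauses and scores each aspect by the opinion words in its own clause.

```python
import re

# Hypothetical aspect terms and opinion lexicon, for illustration only.
ASPECTS = ("picture quality", "battery life")
OPINION_WORDS = {"good": 1, "great": 1, "short": -1, "poor": -1}


def aspect_sentiments(sentence):
    """Map each aspect found in the sentence to positive/negative/neutral."""
    results = {}
    # Split into clauses at commas and at the contrastive word "but".
    clauses = re.split(r",|\bbut\b", sentence.lower())
    for clause in clauses:
        words = re.findall(r"[a-z']+", clause)
        score = sum(OPINION_WORDS.get(word, 0) for word in words)
        for aspect in ASPECTS:
            if aspect in clause:
                if score > 0:
                    results[aspect] = "positive"
                elif score < 0:
                    results[aspect] = "negative"
                else:
                    results[aspect] = "neutral"
    return results


review = "The camera's picture quality is good, but its battery life is short"
print(aspect_sentiments(review))
# {'picture quality': 'positive', 'battery life': 'negative'}
```

A realistic system would extract aspects automatically, use a full sentiment lexicon and handle negation; this sketch only shows why sentiment is attached to aspects rather than to the whole sentence.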

Survival Analytics

Survival analytics is a branch of statistics which mainly deals with the occurrence of an event with respect to time. It provides a measure of the reliability of the event over time. Examples include predicting behavior in market basket analysis, web catalog visits by customers and so on. The techniques that deal with survival analysis measurements are:

to that of the results in rapid miner tool

In the weather dataset with numeric attributes shown in Table 3, the attributes temperature and humidity are numerical. An efficient classification algorithm must therefore be able to deal with numerical as well as categorical data. The decision tree classification algorithm resolves this problem by making binary splits over the range of the attribute values. Consider the attribute humidity, whose values are sorted in ascending order in Table 4. Discretization of the numeric attribute values partitions the values at breakpoints, i.e. halfway between the data values on either side, ensuring that each split leaves the majority of one class on one side and the remainder on the other. Applying this to the above values, we get 9 candidate breakpoints between them, as in Table 5.


The values obtained by considering the halfway split among the values are 67.5, 72.5, 82.5, 85.5, 88, 90.5, and 95.5. If instances with the same value fall into different class labels, a split at that point cannot be considered, as in Table 6.
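The halfway-breakpoint idea can be sketched in Python. The readings below are hypothetical and are not the values of the chapter's tables. The merging step is a simplification: it keeps a breakpoint only where the majority class of the two neighbouring values differs, and ties within a value are resolved in favour of the label seen first.

```python
from collections import Counter, defaultdict


def candidate_breakpoints(values):
    """Midpoints between adjacent distinct values, in ascending order."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def merged_breakpoints(readings):
    """Keep only breakpoints where the majority class changes.

    `readings` is a list of (value, class_label) pairs. Instances that
    share a value are never separated, because midpoints are taken
    between distinct values only.
    """
    labels_by_value = defaultdict(list)
    for value, label in readings:
        labels_by_value[value].append(label)
    distinct = sorted(labels_by_value)
    majority = {
        v: Counter(labels_by_value[v]).most_common(1)[0][0] for v in distinct
    }
    return [
        (a + b) / 2
        for a, b in zip(distinct, distinct[1:])
        if majority[a] != majority[b]
    ]


# Hypothetical (value, class) readings for a numeric attribute.
readings = [(60, "yes"), (62, "yes"), (62, "yes"),
            (70, "no"), (74, "no"), (80, "yes")]

print(candidate_breakpoints([v for v, _ in readings]))  # [61.0, 66.0, 72.0, 77.0]
print(merged_breakpoints(readings))                     # [66.0, 77.0]
```

Merging adjacent partitions with the same majority class reduces the four candidate breakpoints to two, which mirrors the successive ordering levels described in the text.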

In this partition, if adjacent class values are the same, the partitions that belong to the same class can be merged without any problem, as in Table 7.

If adjacent partitions have the same majority class label, they can be merged without affecting the rule. Therefore, the final discretization is as in Table 8. The split value for the attribute Humidity is:

Humidity: ≤ 82.5 (Yes)

> 82.5 and ≤ 95.5 (No)

> 95.5 (Yes)

Table 3 Dataset description

Table 5 Ordering level 2


The split at the point 95.5 puts most of the labels in one partition and only one yes tuple in the other, so it is not considered a binary split when adopting the halfway binary split among the class labels. Hence the split value 82.5 is taken as the breakpoint between the yes and no tuples.

The attribute temperature is also numeric; adopting the same procedure, we get the result in Table 9.

Choosing breakpoints by the halfway binary split, the values are found to be 64.5, 66.5, 70.5, 72, 77.5, 80.5 and 84. In this partition, if the first and second splits are removed, the majority class label is found to be Yes, so the splits at those points can be removed. Likewise, if the same label values occur on both sides of a split, the split at that point can be removed without introducing errors. The resultant partition is shown in Table 10.

If adjacent partitions in the split have the same majority class label, they can be merged together. Hence the resultant partition is shown in Table 11.

At this point, the split value for the attribute temperature is found to be 77.5, i.e.

≤ 77.5 (Yes)

> 77.5 (No)

Table 6 Ordering level 3

Table 7 Ordering level 4

Table 8 Ordering level 5

Table 9 Ordering level 6

Table 11 Ordering level 8

Table 10 Ordering level 7


The values produced by the splitting criterion on the numerical dataset are similar to those for the categorical attribute values. Hence, for any real-world problem, the decision tree classification algorithm handles categorical and numerical attributes in a similar way when generating decision trees.

An Example: Weather Dataset

A dataset can be imported into RapidMiner in a variety of formats such as csv, excel, xml, access, database, arff, xrff, spss, sparse, Dasylab, URL, etc. Here the dataset is imported in csv format, as shown in Figure 2. The dataset contains four attributes and a class label, play. The class label is binominal, with nine yes and five no tuples.

Open a new project in RapidMiner as depicted in Figure 3, and then add the Read CSV operator as in Figure 4. Each process that we create contains exactly one root operator, which provides a set of parameters that are of global relevance to the process, such as initialization parameters.

Load the CSV file, i.e. the weather dataset from the dataset folder, as in Figure 5. Select the type of each attribute chosen for classification, and choose the label option for the attribute that is to be fixed as the class label, as in Figure 6. Once all these steps are complete, click Finish to close the import configuration wizard.

After the import wizard is complete, click and drag the Decision Tree operator into the process, as in Figure 7. This operator learns decision trees from both numerical and categorical data. Decision tree classification is considered one of the best classification techniques because it is easy to understand. Each node in the decision tree is labeled with an attribute, and each outcome of that attribute is mapped to the next attribute with the maximal gain value. Evaluation of the tree stops when a leaf node is reached.

Figure 2 Dataset in csv format


From the tree in Figure 8 we can observe that outlook is selected as the root node because it has the maximal gain value among all the attributes of the weather dataset. Predictions are then made from the distributions produced by outlook, the attribute with the next maximal gain is chosen for the next level of classification, and the process ends when the maximal depth is reached.
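The choice of outlook as the root can be checked by hand with a short information-gain calculation. The sketch below is an illustration under several assumptions: the rows are those of the commonly used weather dataset, plain information gain is used as the criterion (the RapidMiner operator may be configured with a different one), and the numeric attributes are binarized at the thresholds discussed earlier, 77.5 for temperature and 82.5 assumed for humidity. The function names are hypothetical.

```python
from collections import Counter
from math import log2

# (outlook, temperature, humidity, windy, play)
# ASSUMPTION: rows of the commonly used 14-row weather dataset.
ROWS = [
    ("sunny", 85, 85, False, "no"), ("sunny", 80, 90, True, "no"),
    ("overcast", 83, 86, False, "yes"), ("rainy", 70, 96, False, "yes"),
    ("rainy", 68, 80, False, "yes"), ("rainy", 65, 70, True, "no"),
    ("overcast", 64, 65, True, "yes"), ("sunny", 72, 95, False, "no"),
    ("sunny", 69, 70, False, "yes"), ("rainy", 75, 80, False, "yes"),
    ("sunny", 75, 70, True, "yes"), ("overcast", 72, 90, True, "yes"),
    ("overcast", 81, 75, False, "yes"), ("rainy", 71, 91, True, "no"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, key):
    """Gain from partitioning rows by key(row); the label is the last field."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


gains = {
    "outlook": information_gain(ROWS, lambda r: r[0]),
    "temperature": information_gain(ROWS, lambda r: r[1] <= 77.5),
    "humidity": information_gain(ROWS, lambda r: r[2] <= 82.5),
    "windy": information_gain(ROWS, lambda r: r[3]),
}
root = max(gains, key=gains.get)
print(root, {name: round(value, 3) for name, value in gains.items()})
```

With these assumptions, outlook has the largest gain, which is consistent with its position at the root of the tree.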

Figure 3 Creating a new project

Figure 4 Importing csv operator


FUTURE SCOPE AND DIRECTIONS

This chapter focuses on the methods available for analytics, with an illustrative example. Each type of analytics has applications in various fields that map to real-world scenarios. Future work can extend these methods according to their applicability to use cases and to the type of analytical requirement across various domains. The nature of the environment and the type of data play a significant role in how analytical procedures are carried out, and each type of analytics varies widely in its analysis and in the way results are determined. A proper and suitable analytical technique must therefore be selected for the type of data and the environment that has been chosen.

Figure 5 Loading data into the operator

Figure 6 Selecting the label attribute
