The State of Data Analytics and Visualization Adoption
A Survey of Usage, Access Methods, Projects, and Skills
Matthew D. Sarrel
The State of Data Analytics and Visualization Adoption
by Matthew D. Sarrel
Copyright © 2017 O’Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Editor: Nicole Tache
Production Editor: Kristen Brown
Copyeditor: Octal Publishing, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Ellie Volckhausen
September 2017: First Edition
Revision History for the First Edition
2017-09-18: First Release
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. The State of Data Analytics and Visualization Adoption, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
Table of Contents
The State of Data Analytics and Visualization Adoption
Introduction
Data Analytics and Visualization Usage: The Big Picture
Key Areas of Analytics by Industry
Usage and Access of Analytics by Industry
Working with the Data: Joining, Sourcing, Streaming
Requisite Skills for Analytics by Industry
The Value of Big Data Today
Summary
The State of Data Analytics and Visualization Adoption
Introduction
Regardless of industry or company size, businesses are increasingly relying on data analytics and visualization to build a competitive advantage. Organizations are racing to gather, store, and analyze data from many different sources in many different formats. In the race toward success, businesses are transforming themselves to make data-driven decisions, and the associated technology is evolving as rapidly as (or more rapidly than) the businesses themselves.
The fast-evolving data analytics and visualization technology landscape means that businesses and individuals are scrambling to make the best technology choices. Businesses need to know that they’re choosing the right languages, products, architectures, and data sources. Individuals need to know that they’re learning the right skills to snare the right jobs. Those who choose poorly run the risk of being left behind as they fail to take advantage of the timely insights provided by well-conceived data analytics and visualization programs.
For this reason, in the spring of 2017 Zoomdata commissioned O’Reilly Media to field a survey to assess the state of data analytics and visualization adoption. In total, 875 survey respondents identified their industry, job role, company size, reasons for using analytics, technologies used in analytics programs, the perceived value of analytics programs, and more.
Results indicate the following:
• Big data analytics and visualization programs are most mature in manufacturing, financial services, and technology/software companies.
• Projects are typically built for business users and business analysts who commonly rely on visual dashboards to gain the insights that they require to optimize business processes and better understand customers.
• Relational databases are the most common data source (although analytic databases and Hadoop are the most common source of big data).
• Companies are hungry for Python, SQL, and relational database skills.
• Kafka and Spark are emerging as the streaming data technologies of choice.
• Customer 360/customer insights is the most common use case.
• After veracity (data quality), variety followed by volume are the most valued characteristics of big data across all industries.
Our goal with this report is to highlight the results of this survey so that they might inform your career or organization as you embrace new technologies for data collection, storage, analysis, and visualization.
Data Analytics and Visualization Usage: The Big Picture
The 875 respondents who participated in this survey represent a variety of industries (Figure 1-1). More than 40% reported working in technology/software. This is followed by just over 10% in financial services, almost 8% in healthcare/medical technology, and roughly 5% in manufacturing, government, retail, or education/academia.
Figure 1-1 Industries represented in the survey
As shown in Figure 1-2, respondents primarily indicated that they were engineers/developers (18%), data scientists (17%), data analysts/business analysts (15%), or architects (13%), and they work at companies of various sizes. Notably, managers and CxOs are actively engaged with these topics, accounting for 14% of respondents compared to 8% for IT professionals.
Figure 1-2 Job roles represented in the survey
Surprisingly, small businesses of fewer than 50 employees make up a large share of respondents (26%). It’s refreshing to see small businesses lead the charge toward the new technologies and business processes related to data analytics and visualization (Figure 1-3).
Figure 1-3 Organizational size (by number of employees) represented in the survey
More than 50% of respondents indicated that they use analytics for customer insights/customer 360, followed by business process optimization (43%; Figure 1-4). It’s important to note that these areas directly support line-of-business activities. This supports the idea that businesses are building data analytics and visualization programs in order to make data-driven decisions and create competitive advantage.
Figure 1-4 Key areas using analytics within organizations
Key Areas of Analytics by Industry
Although aggregate survey results are interesting, when you drill down into specific industries, you begin to see some important trends. This also allows you to understand the state of data analytics and visualization use in your industry and provides guidance for developing programs that help build competitive advantage.
Picking up where we left off discussing the aggregate data, let’s take a look at the key areas of analytics use by industry (Figure 1-5). Customer insights/customer 360 is an area of focus for more than 50% of respondents in the technology/software, financial services, and retail industries, and surprisingly for more than 30% of respondents in education/academia. The potential business impact of understanding customers should not be underestimated. Understanding customer needs is likely to lead to happy customers, and happy customers are likely to lead to greater revenue.
Figure 1-5 Areas of analytics by industry
The exception is in healthcare/medical technology, where healthcare data analysis is far and away the most common key area of data analytics and visualization use. This doesn’t come as much of a surprise, though, because this is an industry-specific use case. If you’re not analyzing healthcare data, you’re probably not much of a healthcare/medical technology company. Healthcare data analysis is followed by other important business-related analyses such as customer insights/customer 360 and business process optimization.
Business process analysis is another important use of data analytics and visualization, and occupies a top-three spot in every industry as reported by survey respondents. Business process optimization is the top use of data analytics and visualization in manufacturing and government. Optimizing business processes typically results in decreased operating costs and can also lead to greater customer satisfaction, so this is a strategic way to build competitive advantage across many industries.
Similarly, the retail and manufacturing industries also place an emphasis on supply chain analytics and visualization initiatives. Uncovering supply chain problems in a timely manner gives retail and manufacturing businesses an opportunity to find alternate sources. An optimal supply chain is certainly a competitive advantage for these businesses.
Fraud detection/cyber security intelligence is an important use of data analytics and visualization in financial services and government. Fraud detection is critical to any financial service, given that this industry is rife with attempted fraud. Where there’s money, there’s likely to be attempted fraud. Detecting and eliminating fraud builds trust with customers while decreasing operating costs. Cyber security intelligence is a focus of numerous government agencies, while preventing fraud is critical to elections and efficient ongoing operations.
Looking at the question “At what stage are big data analytics project(s) in your organization” by industry helps us to understand how the rate of adoption varies by industry. In our top six industries (financial services, government, healthcare/medical technology, manufacturing, retail, and technology/software), we see that adoption runs the gamut from “we don’t have big data analytics projects” (18%) to “multiple projects” (22%). Manufacturing leads the “multiple projects” category, with 28%, while government lags in this category, with 7%.
Let’s examine the stage of data analytics projects by specific industry (Figure 1-6). The leading response in manufacturing is “multiple projects” at 28%, followed by “in development” at 22%. We see a similar case in the financial services industry, with about 25% of respondents reporting “multiple projects” and about 21% reporting “in development” projects. Technology/software respondents indicate that 21% are involved in multiple projects and in-development projects. In healthcare/medical technology the picture is a little muddled in that 25% of respondents are engaged in multiple projects, whereas 26% report that they aren’t engaged in any big data analytics projects. Retail is in a similar position, with 23% reporting no projects and 25% reporting multiple projects. In government, “we don’t have big data analytics projects” leads at 33%, followed by “defining requirements” at 27%.
Figure 1-6 Stage of data analytics projects by industry
Usage and Access of Analytics by Industry
Looking at the target user for big data analytics and visualization projects (Figure 1-7), we see that in aggregate our survey respondents are developing for business users. This means that the analytics and visualization software must be easy to use and intuitive. Business users can’t afford to spend all day focused on the mechanics of analytics. For analytics to provide competitive advantage, business users must be able to quickly and easily convert data into insights and take action.
Figure 1-7 Target users of big data analytics project(s)
This holds true across our top six industries (Figure 1-8). Digging deeper, business analysts are the second most common target in government (tied with customers), manufacturing, retail, and technology/software. Data scientists are the second most common target users in financial services and healthcare/medical technology.
Figure 1-8 Target users of big data analytics project(s) by industry
We asked survey participants, “Where would big data analytics be available to users?” and the responses are split roughly evenly between embedded in an application or business process and standalone business intelligence (BI) applications (Figure 1-9). Financial services (57%) and technology/software (54%) show a slight preference for embedded, whereas retail (58%) shows a slight preference for standalone BI applications.
Figure 1-9 Method for accessing big data analytics
We asked survey participants to identify how users would interact with data analytics: dashboards, embedded in applications, or operational reports (Figure 1-10). Across our top six categories, respondents showed a strong preference toward dashboards. The second most common way for users to interact with big data analytics was operational reports. However, the second most common way for financial services users to interact with big data analytics was embedded in applications.
Figure 1-10 User interaction with data analytics
Working with the Data: Joining, Sourcing, Streaming
We asked survey participants how they join data from multiple sources in order to analyze it (Figure 1-11). In our top six categories, data warehouse/datamart was the predominant response. This was especially true in retail (56%). Virtual federation/mashup (blending data on the fly without moving it into a warehouse) is most widely used in healthcare/medical technology (24%), technology/software (21%), and government (21%).
Figure 1-11 Joining data methodology
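To make the virtual federation/mashup idea concrete, here is a minimal sketch in Python. It is an illustration only and not something taken from the survey: the two sources, the field names (customer_id, name, amount), and the blend function are all hypothetical stand-ins for data pulled from separate systems and joined at query time instead of being loaded into a warehouse first.

```python
from collections import defaultdict

# Hypothetical source 1: customer records, e.g. rows fetched from a relational database.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]

# Hypothetical source 2: order events, e.g. documents fetched from a NoSQL store.
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 50.0},
    {"customer_id": 3, "amount": 10.0},  # no matching customer record
]


def blend(customers, orders):
    """Join the two sources in memory on customer_id and total the order amounts."""
    totals = defaultdict(float)
    for order in orders:
        totals[order["customer_id"]] += order["amount"]
    # Inner join: keep only customers that also appear in the orders source.
    return [
        {"name": c["name"], "total": totals[c["customer_id"]]}
        for c in customers
        if c["customer_id"] in totals
    ]


blended = blend(customers, orders)
print(blended)
```

A data warehouse/datamart approach would instead copy both sources into one store ahead of time and run the join there; the sketch above performs the join only when the analysis is requested.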
We asked survey participants to identify their main data sources (Figure 1-12). Not surprisingly, relational database is the leading response in our top six industries, topping out at 39% in healthcare/medical technology. The leading nonrelational and big data stores are ranked as analytic database, Hadoop, NoSQL database, cloud data store, in-memory database, and search database. Financial services (24%) and government (25%) make the heaviest use of analytic databases, whereas retail (11%) and technology/software (10%) make the heaviest use of cloud data stores. Hadoop usage hovers around 15%, except in government where it drops to 9%. In-memory databases are used primarily by manufacturing (10%) and government (9%). Manufacturing (12%) is also the heaviest user of search databases.