The IT industry is booming as technology evolves, and the latest trends in IT are tightly integrated with advanced technologies. One of the emerging technologies Hidden Brains focused on is "Big Data". We realized the role "Big Data" will play in the IT world in the future, and we started to focus on it from the year 2011, a right step forward.
Soon we got an opportunity to work on a project which would require tracking the influence of users on multiple Social Network platforms such as Facebook, Twitter, LinkedIn, Instagram and YouTube, and showcasing Analytics which would help Advertisers to advertise and provide other benefits to the users based on social network interactions, including Likes, Comments and Interests.
As a first step, it was necessary for us to recognize 5 main characteristics to identify whether the system would require a Big Data-based solution or not.
Social Data Analysis was at the core of the project, with two main constituent parts:
1. Gathering the data generated on Social Network websites, and
2. Sophisticated analysis of that data
Volume
The quantity of data generated and stored. The size of the data determines the value and potential insight, and whether it can actually be considered big data or not.
Variety
The type and nature of the data. This helps people who analyze it to effectively use the resulting insight.
Velocity
The speed at which the data is generated and processed to meet the demands and challenges that lie in the path of project development.
Variability
The inconsistency of the data set can hamper the processes that handle and manage it.
Veracity
The quality of captured data can vary greatly, affecting accurate analysis.
Once we had established that we needed a Big Data-based system, we defined the solution approach as:
Get the social activities of users on multiple Social Network platforms. For this, we needed to integrate 100+ APIs.
The API data will be in bulk and unstructured, and that is where a Big Data Architecture is needed to manage it.
The Big Data Architecture helped us to manage the unstructured data.
Process the unstructured data and apply algorithms to get the required analytics.
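The last step of the approach, turning unstructured API records into analytics, can be illustrated with a minimal sketch. It assumes each raw record arrives as a loose key/value map and counts interactions per type. The class name, the `type` and `platform` fields and the `unknown` bucket are hypothetical choices made for illustration; they are not taken from the actual system.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InteractionCountSketch {

    // Counts interactions per type (for example "like" or "comment") across
    // raw API records. Records without a "type" field are grouped under
    // "unknown", since unstructured payloads cannot be assumed complete.
    public static Map<String, Integer> countByType(List<Map<String, String>> records) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> record : records) {
            String type = record.getOrDefault("type", "unknown");
            counts.merge(type, 1, Integer::sum);
        }
        return counts;
    }

    // Hypothetical helper that builds one raw record; a null type simulates
    // a payload where the field is missing.
    public static Map<String, String> record(String platform, String type) {
        Map<String, String> r = new HashMap<>();
        r.put("platform", platform);
        if (type != null) {
            r.put("type", type);
        }
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> raw = Arrays.asList(
            record("facebook", "like"),
            record("twitter", "comment"),
            record("instagram", "like"),
            record("youtube", null));
        System.out.println(countByType(raw));
    }
}
```

In a real deployment the counting would run over data held in the cluster rather than an in-memory list; the sketch only shows the shape of the aggregation.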
Use of Java: Considering the amount of data the system needed to manage and process, it was evident to us that we should not use PHP, as it does not support multi-threading natively, and we did not want to make the system Architecture complex by using pthreads, an Object Oriented API that allows multi-threading in PHP.
In order to extract the best performance from the system, we used Java, which supports multi-threading by default and can manage 100s of threads at a time.
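As a rough illustration of why built-in threading matters here, the sketch below runs one task per data source on a fixed-size thread pool using the standard `java.util.concurrent` API. The `fetch` method is a hypothetical stand-in for a real social network API call, and the class name and pool size are assumptions made for the example, not details of the actual system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFetchSketch {

    // Hypothetical stand-in for one social network API call.
    public static String fetch(String source) {
        return source + ":ok";
    }

    // Runs one fetch task per source on a fixed-size thread pool and
    // returns the results in the same order as the input list.
    public static List<String> fetchAll(List<String> sources, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String s : sources) {
                tasks.add(() -> fetch(s));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has finished and returns
            // the futures in the order the tasks were submitted.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> sources = Arrays.asList("facebook", "twitter", "linkedin");
        System.out.println(fetchAll(sources, 3));
    }
}
```

A bounded pool is used instead of one thread per API so that the number of concurrent calls stays under control as more sources are added.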
The next step was to analyze and define the Big Data Architecture and the Technological components required.
Hadoop: Open Source Framework to store and process Big Data
HBase: Open Source non-relational Database
Cloudera Manager Configuration to manage Clusters
Use of Hadoop Cluster to manage unprocessed data
Processed data managed in MySQL Database
Use of Apache Cassandra as secondary database for scalability and high performance
Use of MongoDB, the NoSQL Database
Created a Master Crawler to crawl data from Social Network APIs
Used Java as programming language
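The Master Crawler mentioned in the list above is not described in detail, so the following is only a sketch of one way such a component could be organized: a coordinator that holds one crawler per platform and collects their raw output. The `PlatformCrawler` interface, the `register` and `crawlAll` methods, and the choice to record a failed platform as an empty result are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MasterCrawlerSketch {

    // Hypothetical per-platform crawler: returns raw records for one platform.
    public interface PlatformCrawler {
        List<String> crawl();
    }

    // Registered crawlers, kept in registration order.
    private final Map<String, PlatformCrawler> crawlers = new LinkedHashMap<>();

    public void register(String platform, PlatformCrawler crawler) {
        crawlers.put(platform, crawler);
    }

    // Runs every registered crawler. A failure in one platform is recorded
    // as an empty result so that it does not stop the remaining crawls.
    public Map<String, List<String>> crawlAll() {
        Map<String, List<String>> raw = new LinkedHashMap<>();
        for (Map.Entry<String, PlatformCrawler> e : crawlers.entrySet()) {
            try {
                raw.put(e.getKey(), e.getValue().crawl());
            } catch (RuntimeException ex) {
                raw.put(e.getKey(), new ArrayList<>());
            }
        }
        return raw;
    }

    public static void main(String[] args) {
        MasterCrawlerSketch master = new MasterCrawlerSketch();
        master.register("twitter", () -> Arrays.asList("tweet-1", "tweet-2"));
        master.register("facebook", () -> {
            throw new IllegalStateException("API unavailable");
        });
        System.out.println(master.crawlAll());
    }
}
```

In the architecture listed above, the raw output would go to the Hadoop cluster as unprocessed data; the sketch keeps everything in memory to stay self-contained.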
This case study is for information purposes only. Hidden Brains Infotech Pvt. Ltd. makes no warranties, expressed or implied, in this summary.
No material from here may be copied, modified, reproduced, republished, uploaded, transmitted, posted or distributed in any form without prior written permission of Hidden Brains. All trademarks mentioned herein are the property of their respective owners.
For more information on how to grow your business, visit our website www.hiddenbrains.com or call +1-323-908-3492
© Copyright - Hidden Brains Infotech Pvt. Ltd.