Computational network science seeks to unify the methods used to analyze networks across diverse fields. This book provides an introduction to the field of network science and lays the groundwork for a computational, algorithm-based approach to network and system analysis.
Computational Network Science
An Algorithmic Approach
Henry Hexmoor
AMSTERDAM • BOSTON • HEIDELBERG LONDON • NEW YORK • OXFORD PARIS • SAN DIEGO • SAN FRANCISCO SINGAPORE • SYDNEY • TOKYO Morgan Kaufmann is an Imprint of Elsevier
Copyright © 2015 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress.
ISBN: 978-0-12-800891-1
For information on all MK publications
visit our website at http://www.mkp.com
The days of the need for gurus and extensive libraries are behind us. The Internet provides ready and rapid access to knowledge for all. This book offers necessary and sufficient descriptions of salient knowledge that have been tested in traditional classrooms. The book weaves foundations together from disparate disciplines including mathematical sociology, economics, game theory, political science, and biological networks.
Network science is a new discipline that explores phenomena common to connected populations across the natural and man-made world. From animals to commodity trades, networks provide relationships among individuals and groups. Analyzing and leveraging connections provide insights and tools for persuasion. Studies in this area have largely focused on opinion attributes. The impetus for this book is a need to examine computational processes for automating tedious analyses and usage of network information for online migration. Once online, network awareness will contribute to improved public safety and superior services for all.
A collection of foundational notions for economic and social networks is available in Jackson (2008). A mathematical treatment of generic networks is present in Easly and Kleinberg (2010). A complementary gap filled by this book is an algorithmic approach. I provide a fast-paced introduction to the state of the art in network science. References are offered to seminal and contemporary developments. The book uses mathematical cogency and contemporary computational insights. It also issues a call to arms for further research on open problems.
The reader will find a broad treatment of network science and a review of key recent phenomena. Senior undergraduates and professional people in computational disciplines will find sufficient methodologies and processes for implementation and experimentation. This book can also be used as teaching material for courses on social media and network analysis, computational social networks, and network theory and applications. Our coverage of social network analysis is limited, and details are available in Golbeck (2013) and Borgatti et al. (2013).
Whereas a teacher is a tour guide to the subject matter, this book is a reference manual. Chapters in each part are related and they progress in maturity. Chapters are semi-independent and a course instructor may choose any order that meets the course objectives. Exercises at the end of each chapter are students’ hands-on projects designed to cover learning activities during a semester. Some code is provided in appendices for prototyping and learning purposes only. We do not provide a how-to guide to mainstream social media or a codebook for application development; these are available elsewhere.
Henry Hexmoor
Carbondale, IL
2014
REFERENCES
Borgatti, S., Everett, M., Johnson, J., 2013. Analyzing Social Networks. SAGE Publications.
Easly, D., Kleinberg, J., 2010. Networks, Crowds, and Markets. Cambridge University Press.
Golbeck, J., 2013. Analyzing the Social Web. Morgan Kaufmann Publications.
Jackson, M., 2008. Social and Economic Networks. Princeton University Press.
CHAPTER 1
Ubiquity of Networks
1.1 INTRODUCTION
Broadly speaking, a network is a collection of individuals (i.e., nodes) where there are implicit or explicit relationships among individuals in a group. The relationships may be strictly physical, as in some sort of physical formation (e.g., pixels of a digital image or cars on the road), or they may be conceptual, such as friendship or some similarity among pairs or within a pair. In an implicit network, individuals are unaware of their relationships, whereas in an explicit network, individuals are familiar with at least their local neighbors. In certain implicit networks called affinity networks, there is a potential for explicit connections from relationships that account for projected connection such as homophily (i.e., similarity) (McPherson et al., 2001). Biological networks capture relationships among biological organisms. For instance, the neurons of the human brain form a large network called a connectome (Seung, 2012). An ant society is an example of a large biological network (Moffett, 2010). There are many examples of small-scale animal networks, including predators and their prey, plant diseases, and bird migration. Human crowds and network organizations (e.g., government or state agencies, honey grids in bee colonies) are other examples of natural networks. Modern anonymous human networks have capacities for crowd-solving problems (Nielsen, 2012), where a group of independently minded individuals possess a collective wisdom that is available to singletons (Reingold, 2000). Social and political networks model human relationships, where social and political relations are paramount. Economic networks are models of parties related by economic relationships such as those among buyers (and consumers), sellers (and producers), and intermediaries (i.e., traders and brokers) (Jackson, 2003). Beyond natural networks, there are myriad synthetic networks. The grid of a photograph is an example of a synthetic network. Nanonetworks are attempts to network nanomachines for emerging nanoscale applications (Jornet and Pierobon, 2011).
A large class of networks is the complex engineered network (CEN), a man-made network whose topology is neither completely regular nor completely random. A CEN supports evolving functionalities. Examples of CENs are the Internet, wireless networks, power grids with smart homes and cars, remote monitoring networks with satellites, global networks of telescopes, and networks of instruments and sensors from battlefields to hospitals. Time requirements in CENs range from seconds in cyber-attacks to years in greenhouse gas emissions. Data and control flow in CENs must be managed over connections that could span thousands of miles.
A few synthetic network categories, including CENs, are created intentionally. Here, we list six types:
1. Social networks through networking sites and services
2. Political networks as in parliamentary cabinets and political committees
3. Computer networks that include computers as nodes and how they communicate over local, wide area, and wireless links (e.g., sensor networks)
4. Telecommunication networks as in switches for nodes and respective routing paths
networks for data mining and marketing. Individuals sharing like votes (or retweets) are part of an affinity network (or a hashtag) in the context of what they liked (or tweeted).
Figure 1.1 depicts a taxonomy of network types. Exchange networks are those in which a quantifiable entity is exchanged among the nodes, whether the entity is tangible (e.g., natural gas) or intangible (e.g., trust). Relational networks are inert and merely reflect juxtaposition of nodes. All CENs are exchange networks.
Once a network emerges, we can explore interactions within the network. Strategic interactions involve reasoning and deciding over selection of strategies. They can be modeled with game theory, which will be our main focus in Chapter 3.
Network theory is a set of algorithms that codifies relationships among network topology and outcomes, which are meaningful to network inhabitants. There is a movement afoot that codifies network phenomena under the term network science. These phenomena and salient algorithms will be discussed throughout this book.
An Online Social Networking Service (OSNS) creates synthetic networks among people. The salient incentive for using an OSNS is to gain social authority (i.e., legitimacy), which is a form of social power and not generally a measure of vanity. Social authority in social networks is with respect to a group and with respect to specific topics. Therefore, social authority is a relative measure and not an absolute quantity. In Section 1.2, we review a few popular OSNSs from a rapidly growing list (Khare, 2012). Since they provide platforms to create, to share, and/or to exchange information and ideas in virtual communities, an OSNS is considered to be a medium for social media. There are quantification schemes over social media, such as Klout, which offers user scores (i.e., a number between 1 and 100) for what Klout calls influence, a measure of a user’s ability to reach others through an OSNS. This measure is valuable for marketing products online.
Fig. 1.1 A network taxonomy.
In Section 1.3, we review a few popular online bibliographic services (OBS) that house published articles. We return to generic models of networks in Section 1.4. This is followed by a review of popular models of synthetic network generation in Section 1.5. A fully implemented NetLogo model (i.e., code and accompanying descriptions for use) of network generation models and analysis is available in the Appendix.
1.2 ONLINE SOCIAL NETWORKING SERVICES
Facebook is an OSNS that connects people, organizations, friends, and others who work or live near one another. Nodes in a Facebook network can be individuals or organizations. Some of these may be entirely synthetic without real-world humans. The main Facebook tool for connections is friendship. Facebook is used largely for personal and recreational functions. As such, it has filled the social gaps created by physical and psychological dispersion among traditional families and friends. It also serves as a medium that creates relationships that would not otherwise exist.
One Facebook feature, known as sharing, allows adjustments to the spread of information (i.e., selecting an audience). Sharing is used to limit who can view posts and photos. It is a three-step process: (1) indicate who you are (i.e., tagging), (2) tell where you live (i.e., adding a location to a post), and (3) manage the privacy right for where you post (i.e., the inline audience selector). Sharing gives users control over their information diffusion, which in turn can yield a measure of social authority. Another Facebook feature, like, provides a directional relationship (i.e., tie, connection, or link) that lends credibility to the item in proportion to the credibility (i.e., authority) of the endorser.
Twitter is an OSNS that facilitates broadcasts of messages (i.e., tweets). The main Twitter tool for connections is the explicit alignment of ideas among people (i.e., following). Twitter can be used by small or large groups for crowdsourcing. For example, in a small network, a family can stay organized about their travel itinerary despite disparate opinions. In a large network, a large social project, such as a protest, can be planned. Twitter can be used to work semi-anonymously with others. Twitter’s hashtag (i.e., #) is a feature for labeling a topic. Anyone may introduce or reuse a hashtag to attract attention. For example, #flight1549 added to a tweet labels the tweet as being about “flight1549.” This hashtag labeling facilitates searches related to specific topics. Individuals who use specific hashtags form an implicit network in the context of their hashtags. This feature has been used for commercial marketing and anonymous coordination of social actions. The range of potential uses for hashtags is enormous, and they have been adopted by other OSNSs such as Facebook. On the one hand, Twitter can be used for social organization of crime or dissent. On the other hand, it can be used to predict and mitigate violations of law. Since Twitter provides democratization of opinion sharing and equal access for dissemination, it is seen as a social equalizer, and as such it might be feared by repressive systems (e.g., government regimes). Twitter’s social authority is composed of three components: (1) the retweet rate of users’ last few hundred tweets, (2) the recentness of those tweets, and (3) a retweet-based model trained on users’ profile data. Tagging someone shows the Twitter id to more people, whereas direct messaging someone just puts spam in their inbox, which is generally undesirable. Websites such as Klout.com gauge the influence you have by monitoring things such as how active you are and how much you have been tagged on Twitter. Twitter’s lists are a way to organize others into groups. When you click on a list, you will retrieve a stream of tweets from all the users included in that group. As a rule of thumb, if you want to develop relationships on Twitter, you should read other tweets, retweet good content, tweet good content, and stay on top of keywords and interests that you follow. The same advice applies if you want to get retweeted.
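The idea that users of a common hashtag form an implicit network can be sketched in Python. This is only an illustration: the (user, text) tweet layout, the sample tweets, and the function name hashtag_affinity are assumptions and do not correspond to any Twitter API.

```python
import re
from collections import defaultdict
from itertools import combinations


def hashtag_affinity(tweets):
    """Map each hashtag to the set of undirected user pairs that share it.

    `tweets` is an iterable of (user, text) pairs (a hypothetical layout).
    """
    users_by_tag = defaultdict(set)
    for user, text in tweets:
        # Collect every hashtag in the tweet, ignoring letter case.
        for tag in re.findall(r"#(\w+)", text):
            users_by_tag[tag.lower()].add(user)
    # Every pair of users behind the same hashtag gets an implicit link.
    return {
        tag: {frozenset(pair) for pair in combinations(sorted(users), 2)}
        for tag, users in users_by_tag.items()
    }


# Made-up sample data for illustration only.
sample = [
    ("ann", "Plane in the Hudson #flight1549"),
    ("bob", "Unbelievable landing #Flight1549"),
    ("cat", "Lunch time #food"),
]
network = hashtag_affinity(sample)
```

In this sketch a hashtag used by a single person yields no links, which matches the view of the hashtag as a context shared between at least two individuals.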
LinkedIn is an OSNS that provides an online forum for professional identity management. The main tool for LinkedIn’s connections is to link people who would like to support one another (i.e., connections). LinkedIn allows people to conduct a weak form of endorsement in regard to specific skills. This creates directional links from endorsers to endorsees. LinkedIn allows a stronger directional endorsement through recommendations. Endorsed individuals’ profiles gain social authority via LinkedIn’s endorsements and recommendations. Of course, the gained authority is proportional to the authority of those endorsing and recommending.
Pinterest is an OSNS that allows users to create and manage theme-based image collections. Repinning in Pinterest is the feature that creates social authority.
Started in 2011, Whisper.sh is a privately owned mobile OSNS that allows anonymous posts including photographs. It allows others to like posts, which creates a network with posts as nodes and likes as directional links. Since users are anonymous, the resulting network is implicit.
1.3 ONLINE BIBLIOGRAPHIC SERVICES
DBLP is a Computer Science Bibliography database website hosted at Universität Trier in Germany. It houses a large collection of published articles and offers capabilities for browsing and searching. The resulting database is a network of “author” nodes connected via coauthorship. Through citations, papers form a separate network in which papers are the nodes and citations are the links.
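As a minimal sketch of the two networks just described, the following Python builds coauthorship links from author lists and directed citation links from reference lists. The record layout, the sample data, and the function names are assumptions made for illustration; they do not reflect DBLP's actual data format.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers):
    """Count joint papers for every pair of authors.

    `papers` maps a paper id to its list of authors (hypothetical layout).
    """
    edges = Counter()
    for authors in papers.values():
        # Each unordered pair of distinct authors on a paper is one coauthorship.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def citation_edges(citations):
    """Return directed (citing paper, cited paper) links.

    `citations` maps a paper id to the ids it cites (hypothetical layout).
    """
    return [(src, dst) for src, cited in citations.items() for dst in cited]


# Made-up sample records for illustration only.
papers = {"p1": ["Ann", "Bob"], "p2": ["Bob", "Ann", "Cy"]}
citations = {"p2": ["p1"]}
author_network = coauthor_edges(papers)
paper_network = citation_edges(citations)
```

The coauthorship network is undirected and weighted by the number of joint papers, whereas the citation network is directed, since a citation points from the citing paper to the cited one.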
Google Scholar is another bibliography database website released in 2004 by google.com. It creates networks of authors and papers similar to DBLP.
Microsoft Academic Research is an OBS (with a corresponding Windows app) that is supported by Microsoft.com and offers a similar service to DBLP.
Research Gate is an independent privately owned online site founded in 2008 for scientists and researchers to share papers, to ask questions, to answer questions, and to find collaborators. On the one hand, it is an OBS, even though it is far smaller than its rivals. On the other hand, it is an OSNS for professionals.
1.4 GENERIC NETWORK MODELS
In this section, we review four of the most popular generic network models. In contrast to descriptive models in this section, Section 1.5 will offer algorithms for artificially generating networks.
1.4.1 Random Networks
G(n, p) is a random graph model with n nodes where the probability of a pair of nodes in it being linked is denoted by p (Erdős and Rényi, 1959). When p is small, the network is sparsely connected. When p grows past roughly 1/n, a giant component emerges, that is, a single connected component that contains a large fraction of the nodes. When p is almost 1.0, the connectivity among nodes is very high and the network is close to fully connected. The spread of node degrees for a random graph model (i.e., degree distribution) is binomial in shape. A closely related model is the random geometric graph G(n, r), where n nodes are placed in a metric space and a pair of nodes is linked when the distance between them is less than or equal to r (Penrose, 2003). In contrast to these mathematical models, real-world networks exhibit a degree distribution that is uneven. In the power-law distribution, the probability that a node has degree k (i.e., the number of connected neighbors) is determined by $P(k) \approx k^{-g}$, where parameter g is typically constrained between 2 and 3, that is, $2 \le g \le 3$. Uneven distribution stems from preferential attachment, where the probability that a new node will attach to a node i is $\mathrm{degree}(i) / \sum_j \mathrm{degree}(j)$. A node degree refers to the node’s number of neighbors. Preferential attachment is commonly found in nature as well as in man-made networks such as an economic network (Gabaix, 2009). Random networks are mathematically the most well-studied and well-understood models.
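The G(n, p) model, its degree distribution, and the preferential attachment probability can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the adjacency-set representation is an assumption rather than anything prescribed by the model.

```python
import random
from collections import Counter


def gnp(n, p, seed=None):
    """Sample a G(n, p) graph as a dict mapping each node to its neighbor set."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            # Each of the n(n-1)/2 pairs is linked independently with probability p.
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def degree_distribution(adj):
    """Return the fraction of nodes having each observed degree k."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / len(adj) for k, c in sorted(counts.items())}


def attachment_probabilities(adj):
    """Return degree(i) / sum_j degree(j) for every node i."""
    total = sum(len(neighbors) for neighbors in adj.values())
    if total == 0:
        return {}  # no edges yet, so the ratio is undefined
    return {v: len(neighbors) / total for v, neighbors in adj.items()}


graph = gnp(200, 0.05, seed=42)
distribution = degree_distribution(graph)
```

Plotting `distribution` for a few values of p is one way to observe the roughly binomial shape of the degree distribution described above.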
1.4.2 Scale-Free Networks
There is a model based on preferential attachment described by Barabási and Albert (1999). In this model, a new node is created at each time step and connected to existing nodes according to the “preferential attachment” principle. At a given time step, the probability p of creating an edge between an existing node u and the new node is $p = (\mathrm{degree}(u) + 1) / (|E| + |V|)$, where V is the set of nodes and E is the set of edges between nodes. The algorithm starts with some parameters such as the number of steps that the algorithm will iterate, the number of nodes that the graph should start with, and the number of edges that should be attached from the new node to preexisting nodes at each time step. The Barabási model of network formation produces a scale-free network, a network where the node degree distribution follows a power law. Scale-free networks exhibit a small number of components, a small diameter, a heavy-tailed degree distribution, and low clustering.
Many types of data studied in the physical and social sciences can be approximated with a Zipf distribution (Li, 1992), which is one of a family of discrete power-law probability distributions. An implication of Zipf’s law is that the most frequent word will occur approximately twice as often as the second most frequent word, which occurs twice as often as the fourth most frequent word, etc.
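The ratios quoted above follow from a one-line calculation, assuming the classic form of Zipf's law in which the frequency f(r) of the word of rank r is inversely proportional to r. The exponent of 1 and the constant C are assumptions of this illustration.

```latex
f(r) = \frac{C}{r}
\quad\Longrightarrow\quad
\frac{f(1)}{f(2)} = \frac{C/1}{C/2} = 2,
\qquad
\frac{f(2)}{f(4)} = \frac{C/2}{C/4} = 2 .
```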
Unlike the growth model of Barabási, Epstein and Wang’s (2002) steady-state model uses a rewiring scheme that results in a power-law distribution. This model evolves an initial graph according to a Markov process, while maintaining constant size and density.
Epstein and Wang’s algorithm has two major steps: (1) initialize a sparse graph and (2) edit edges by a Markov process. To generate the sparse graph G, they randomly add an edge between vertices with probability $2m / [n \times (n - 1)]$, where m is the number of edges added and n is the number of vertices. If the number of edges in G is still less than m, they start adding edges with a probability of 0.5 until the graph G has m edges. The second step is to reiterate the algorithm in Figure 1.2 r times on G, where r is a parametric value.
Fig. 1.2 Epstein and Wang’s (2002) algorithm.
1.4.4 Game Theoretic Models
A game theoretic model of network formation focuses on reasoning over each node’s connection with others. A strategy set of an agent i is a set of strategies to connect to each node in the network, that is, $S_i = \{s_{i1}, s_{i2}, \ldots, s_{in}\}$, where $s_{ij}$ is a strategy to connect node i to node j. An agent incurs a cost in a connection that is a combination of a fixed cost plus a sum of distances between the node and all other nodes in the network. For example, $\mathrm{cost}(s_{ij}) = c + \sum_j d(i, j)$, where c is a fixed cost and d(i, j) is the distance between nodes i and j in number of links. The cost is shared if both parties choose the link. Otherwise, it is incurred by one agent. Synergistic strategy selection will provide utility for agents that are linked.
Each strategy will have a payoff that is utility minus the link cost. The Nash equilibrium (Carmona, 2012) is achieved with a strategy profile (i.e., a set of links) that minimizes cost for all agents, and from which no agent has an incentive to deviate.
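The cost expression above can be evaluated for a concrete network with a breadth-first search. The sketch below is an illustration under stated assumptions: distances are hop counts, the network is given as adjacency sets, unreachable nodes are treated as infinitely costly, and the function names are hypothetical.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: number of links from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connection_cost(adj, i, fixed_cost):
    """Evaluate c + sum_j d(i, j) for node i in the network `adj`."""
    dist = hop_distances(adj, i)
    if len(dist) < len(adj):
        return float("inf")  # assumption: an unreachable node costs infinity
    return fixed_cost + sum(dist.values())


# A four-node path 0-1-2-3, and the same network after nodes 0 and 3 link up.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
cost_before = connection_cost(path, 0, fixed_cost=1.0)
cost_after = connection_cost(ring, 0, fixed_cost=1.0)
```

Comparing `cost_before` with `cost_after` shows how adding a single link shortens node 0's distances to the rest of the network, which is the kind of trade-off an agent weighs when choosing a strategy.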
1.5 NETWORK MODEL GENERATORS
In this section, we review three of the most popular models for generating artificial networks.
1.5.1 Kleinberg’s Small-World Model
A social network is called a small-world network if, roughly speaking, any two people in the network can reach each other through a short sequence of acquaintances (Kleinberg, 2001). Milgram’s basic small-world experiment is the most famous experiment that analyzed the small-world problem (Milgram, 1967). The purpose of the experiment was to determine whether most pairs of people in society were linked by short chains of acquaintances. So, individuals were asked to forward a letter to a “target” through people whom they knew on a first-name basis.
Watts and Strogatz (1998) proposed a small-world network model that incorporated the features of Milgram’s experiment. Kleinberg (2001) proposed a variant of Watts and Strogatz’s basic model that can be described as follows. One starts with a p-dimensional lattice, in which nodes are joined only to their nearest neighbors. One then adds k directed long-range links out of each node v, for a constant k; the end point of each link is chosen uniformly at random (Kleinberg, 2001). Kleinberg studied the model from an algorithmic perspective and showed that, with high probability, there will be short paths connecting all pairs of nodes while the network retains its lattice-like structure. Kleinberg’s model does not yield a heavy-tailed degree distribution.
Kleinberg (2000) showed a simple greedy algorithm that can find paths between any source and destination using only $O(\log^2 n)$ expected edges. Kleinberg’s algorithm that will be used in this study is based on two parameters: the lattice size and the clustering exponent. Each node u has four local connections, one to each of its neighbors, and in addition one long-range connection to some node v, where v is chosen randomly with probability proportional to $d^{-a}$, where d is the lattice distance between u and v and a is the clustering exponent.
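The construction and the greedy search described above can be sketched as follows. This is a minimal sketch under several assumptions: the lattice is a two-dimensional square grid, lattice distance is Manhattan distance, nodes on the boundary simply have fewer than four local neighbors, and all function names are hypothetical.

```python
import random


def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def kleinberg_grid(side, a, seed=None):
    """Build contacts on a side x side grid: local neighbors plus one long-range link."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(side) for y in range(side)]
    contacts = {}
    for u in nodes:
        x, y = u
        local = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < side and 0 <= y + dy < side
        ]
        # Long-range contact v is drawn with probability proportional to d(u, v)^(-a).
        others = [v for v in nodes if v != u]
        weights = [lattice_distance(u, v) ** (-a) for v in others]
        long_range = rng.choices(others, weights=weights, k=1)[0]
        contacts[u] = local + [long_range]
    return contacts


def greedy_route(contacts, source, target):
    """Always forward to the contact that is closest to the target on the lattice."""
    path = [source]
    current = source
    while current != target:
        current = min(contacts[current], key=lambda v: lattice_distance(v, target))
        path.append(current)
    return path


contacts = kleinberg_grid(side=6, a=2.0, seed=1)
route = greedy_route(contacts, (0, 0), (5, 5))
```

The greedy search always terminates in this sketch because some local neighbor is one step closer to the target, so the remaining lattice distance shrinks at every hop; long-range links can only shorten the route further.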
1.5.2 Barabási and Albert’s Scale-Free Network Generator
Barabási and Albert (1999) discussed the features of scale-free networks in detail and compared them with the features of other types of networks, for example, small-world networks. Scale-free networks expand continuously by the addition of new vertices, and new vertices attach preferentially to vertices that are already well connected. Most real networks are scale-free networks, such as the WWW and citation patterns of scientific publications, both of which follow a power-law distribution (Barabási and Albert, 1999).
Albert and Barabási (2002) showed a comparison between their model and other previously proposed models. They state that other network models start with a fixed number of vertices that are then randomly connected or reconnected without modifying the number of vertices. However, the WWW, as an example, grows exponentially in time by the addition of new web pages. Also, other network models assume that new edges are placed randomly, that is, the probability of connecting two vertices is independent of the vertices’ degree. However, most real networks do not behave like that. They exhibit preferential attachment, that is, the probability of connecting two vertices depends on the vertices’ degree (Albert and Barabási, 2002).
According to Albert and Barabási (2002), a new node is created at each time step and connected to existing nodes according to the “preferential attachment” principle. At a given time step, the probability p of creating an edge between an existing node u and the new node is $(\mathrm{degree}(u) + 1) / (|E| + |V|)$. The algorithm starts with some parameters, such as the number of steps that the algorithm should iterate, the number of nodes that the graph should start with, and the number of edges that should attach the new node to the preexisting nodes at each time step.
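A growth loop built on this probability can be sketched as follows. The text does not say how candidate nodes are proposed, so this sketch assumes an accept/reject scheme: an existing node is drawn uniformly at random and accepted with the stated probability, until the required number of distinct targets is found. The function and parameter names are hypothetical.

```python
import random


def barabasi_albert(initial_nodes, edges_per_step, steps, seed=None):
    """Grow a graph by preferential attachment; returns node -> neighbor set."""
    if not 1 <= edges_per_step <= initial_nodes:
        raise ValueError("edges_per_step must be between 1 and initial_nodes")
    rng = random.Random(seed)
    adj = {v: set() for v in range(initial_nodes)}
    num_edges = 0
    for _ in range(steps):
        existing = list(adj)
        num_nodes = len(existing)
        targets = set()
        # Assumed accept/reject scheme: propose a uniform random existing node u
        # and accept it with probability (degree(u) + 1) / (|E| + |V|).
        while len(targets) < edges_per_step:
            u = rng.choice(existing)
            p = (len(adj[u]) + 1) / (num_edges + num_nodes)
            if u not in targets and rng.random() < p:
                targets.add(u)
        new_node = num_nodes
        adj[new_node] = set()
        for u in targets:
            adj[new_node].add(u)
            adj[u].add(new_node)
        num_edges += len(targets)
    return adj


graph = barabasi_albert(initial_nodes=3, edges_per_step=2, steps=50, seed=7)
hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:5]
```

The three parameters mirror the ones listed above: the number of starting nodes, the number of edges attached per step, and the number of steps. Degrees, |E|, and |V| are read at the start of each time step, before the new node is added.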
The hierarchical network model (HNM) is part of the scale-free model
family and shares its main property of yielding proportionally more
hubs among the nodes than by random network generation HNMs are
heavy-tailed, have small diameter, and have high clustering
1.5.3 Epstein and Wang’s Power-Law Network Generator
Epstein and Wang (2002) have proposed a graph model called the steady-state model that results in a power law by evolving a graph according to a Markov process while maintaining constant size and density. The only difference between their model and Barabási and Albert’s model is that their model does not require incremental growth, whereas Barabási and Albert’s model does. Epstein and Wang’s algorithm can be viewed in two steps: (1) initialize a sparse graph and (2) edit the graph by a Markov process. To generate the sparse graph G, the algorithm randomly adds an edge between vertices with the probability $2m / [n \times (n - 1)]$, where m is the number of edges added and n is the number of vertices. If the number of edges in G is still less than m, the algorithm starts adding edges with a probability of 0.5 until the graph G has m edges. Then, we reiterate the algorithmic steps, shown in Figure 1.2, r times on G, where r is a model parameter (Epstein and Wang, 2002).
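The first step, sparse-graph initialization, can be sketched as follows. Several details are assumptions, because the description leaves them open: candidate pairs are visited in random order, the first pass stops once m edges exist, and the top-up pass proposes uniformly random pairs. The function name is hypothetical, and the second step (the procedure of Figure 1.2) is not reproduced here.

```python
import random
from itertools import combinations


def sparse_initial_graph(n, m, seed=None):
    """Return a set of m undirected edges (u, v) with u < v on n vertices."""
    max_edges = n * (n - 1) // 2
    if n < 2 or not 0 <= m <= max_edges:
        raise ValueError("need n >= 2 and 0 <= m <= n(n-1)/2")
    rng = random.Random(seed)
    p = 2 * m / (n * (n - 1))  # per-pair probability; expected edge count is m
    pairs = list(combinations(range(n), 2))
    rng.shuffle(pairs)  # assumption: visit candidate pairs in random order
    edges = set()
    for pair in pairs:
        if len(edges) == m:
            break  # assumption: never exceed the target of m edges
        if rng.random() < p:
            edges.add(pair)
    # Top-up pass: propose random missing pairs, accepting each with probability 0.5.
    while len(edges) < m:
        pair = rng.choice(pairs)
        if pair not in edges and rng.random() < 0.5:
            edges.add(pair)
    return edges


initial_edges = sparse_initial_graph(n=20, m=30, seed=3)
```

Because the per-pair probability is m divided by the number of possible pairs, the first pass yields about m edges on average, and the top-up pass only has to close a small gap.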
1.6 A REAL-WORLD NETWORK
In this section, we sketch essential components of a generic, commonplace exchange network applicable to package delivery and durable products. We are keeping this model simple in order to avoid complexities of supply chain management and economic networks. Let C be a set of consumers of a commodity (e.g., received packages or appliances) and P be a set of producers of the same commodity (e.g., package senders). A node can be both a producer and a consumer at different times T and locations L (i.e., nodes). The set C is strictly larger than P, and it may subsume it entirely. The production rate of a producer is a function of time and location denoted by P(t, l). Similarly, the consumption rate of a consumer is a function of time and location denoted by C(t, l). After production but before consumption, commodities are in transit at the rate P − C, that is, $\mathrm{Transit}(t, l_1, l_2) = P(t, l_1) - C(t, l_2)$. Locations of production and consumption must be distinct, that is, $l_1 \ne l_2$. If these locations are the same, transit is null, that is:
$\forall t \in T,\ l_1, l_2 \in L,\ P(t, l_1) \ge 0 \cap C(t, l_2) \ge 0 \cap l_1 = l_2 \rightarrow \mathrm{Transit}(t, l_1, l_2) = \emptyset$
The Transit(·) function specifies the flow rate among nodes of the network. If we could specify the maximum flow between all pairs of nodes in the network, we could discover network capacity for transit using the standard graph theoretic flow network algorithm, for example, the Ford-Fulkerson algorithm (Kleinberg and Tardos, 2005). Transit/flow rates incur a cost corresponding to the amount of flow that needs to be paid by the pair of a sender and a receiver. It may be beneficial to share the transmission cost with neighbors, who form game theoretic coalitions that will be discussed in Chapter 3.
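The Ford-Fulkerson method mentioned above repeatedly finds a path with spare capacity and pushes flow along it. The following is a minimal sketch that uses breadth-first search to find augmenting paths (the variant usually called Edmonds-Karp); the edge-capacity dictionary, the node labels, and the function name are assumptions made for illustration.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Maximum flow from source to sink; `capacity` maps (u, v) to a capacity."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), cap in capacity.items():
        residual[u][v] += cap
        residual[v][u] += 0  # make sure the reverse residual edge exists
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path remains, so the flow is maximum
        # Find the bottleneck capacity along the path that was found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path and update residuals.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical transit capacities between a sender "s" and a receiver "t".
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
network_capacity = max_flow(capacity, "s", "t")
```

Running `max_flow` for each sender and receiver pair of interest is one way to obtain the pairwise capacities that the paragraph above refers to.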
In many scenarios, there is a need for intermediaries to facilitate transfer of commodities from producers to consumers. For simplicity, we assume intermediaries to be uniform handlers, who are neither producers nor consumers of the commodities they handle. In economic networks, handlers are traders (discussed in Chapter 9). In production line networks, intermediaries are dealers. In mail carrier networks, intermediaries are delivery personnel. In electric networks, intermediaries are switches. In computer networks, intermediaries are routers.
Handling capacity of agent i is a function of time and number of items. Let $\mathrm{handler}_i(t, I)$ return a delay time in i’s ability to handle I items at time t. A delay time of zero is on-time handling. Typically, there are more handlers than items in transit. A property of interest is to find the optimal number of handlers for the volume of items to be handled with no delay.
1.7 CONCLUSIONS
Networks are abundant around us. They are man-made or naturally occurring. They are implicit, hidden, explicit, or articulated. They might be tangible and objectively quantified, or they might be subjective and difficult to quantify. They all tend to change in time, which is the subject of our future chapters on network dynamics.
REFERENCES
Epstein, D., Wang, J., 2002. A steady state model for graph power laws. In: Proceedings of 2nd International Workshop on Web Dynamics. World Scientific Publishing Company.
Erdős, P., Rényi, A., 1959. On random graphs. Publicationes Mathematicae 6, 290–297.
Gabaix, X., 2009. Power laws in economics and finances. Annu. Rev. Econ. 1, 255–294.
Jackson, M., 2003. A survey of models of network formation: stability and efficiency. In: Demange, G., Wooders, M. (Eds.), Group Formation in Economics: Networks, Clubs, and Coalitions. Cambridge University Press.
Jornet, J.M., Pierobon, M., 2011. Nanonetworks: a new frontier in communications. In: Communications of the ACM, Vol. 54, No. 11. ACM, pp. 84–89.
Khare, P., 2012. Social Media Marketing eLearning Kit For Dummies. Wiley.
Kleinberg, J., 2000. The small-world phenomenon: an algorithmic perspective. In: Proceedings of 32nd ACM Symposium on Theory of Computing. ACM, pp. 163–170.
Kleinberg, J., 2001. Small-world phenomena and the dynamics of information. In: Proceedings of the Advances in Neural Information Processing Systems (NIPS), Vol. 14. NIPS.
Kleinberg, J., Tardos, E., 2005. Algorithm Design. Addison-Wesley.
Li, W., 1992. Random texts exhibit Zipf’s-law-like word frequency distribution. IEEE Trans. Inf. Theory 38 (6), 1842–1845.
McPherson, M., Lovin, L.S., Cook, J., 2001. Birds of a feather: homophily in social networks. Annu. Rev. Sociol. 27, 415–444.
Milgram, S., 1967. The small world problem. Psychol. Today 1 (1), 61–67.
Moffett, M., 2010. Adventures Among Ants. University of California Press.
Nielsen, M., 2012. Reinventing Discovery: A New Era of Networked Science. Princeton University Press.
Penrose, M., 2003. Random Geometric Graphs. Oxford University Press.
Reingold, H., 2000. The Virtual Community. MIT Press.
Seung, S., 2012. Connectome: How Brain’s Wiring Makes Us Who We Are. Mariner Books.
Watts, D., Strogatz, S., 1998. Collective dynamic of small-world networks. Nature 393 (6684), 440–442.
1. Using examples, describe how animal swarms are networked.
2. What are the salient characteristics of biological networks (e.g., brain cells and protein chains) that differentiate them from other types of networks?
3. What will be the role of network organizations in the year 2025? Give examples.
4. How can social media be used to track cultural changes in a society?
CHAPTER 2
Network Analysis
There has been a long tradition of measuring the qualities of network locations from both egocentric and global perspectives. This is largely addressed by quantification efforts in mathematical sociology under the theme of social network analysis (SNA) (Wasserman and Faust, 1994; Knoke and Yang, 2007; Golbeck, 2013; Borgatti et al., 2013). Several popular software toolkits perform analysis and visualization of social networks (i.e., sociograms), including UCINET and NodeXL. Tom Snijders’ SIENA is a program for the statistical analysis of network data. Traces (Suthers, 2011) is an NSF-sponsored visualization project that traces out the movements, confluences, and transformations of people and ideas in online social networks.
The aim of this chapter is to review a selective subset of SNA measures that complement the algorithmic descriptions in the remainder of this book. For a glossary of SNA terms, readers are referred to Golbeck (2013).
We will start with egocentric (i.e., node view) measures. A degree-1 network of a node is the node and its immediate neighbor nodes. A degree-1.5 network of a node is the node’s degree-1 network plus the links among its immediate neighbors (Golbeck, 2013). A degree-2 network of a node is the node’s degree-1 network plus all of its immediate neighbors’ connections (Golbeck, 2013). In general, a degree-n network of a node is the degree-1 network of the node plus all the nodes and the corresponding links that are no more than n links away from the starting node.
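The egocentric networks defined above can be sketched with a bounded breadth-first search. This is a minimal illustration, not a reference implementation: the adjacency-dictionary representation (`adj` maps each node to the set of its neighbors) and the function names are assumptions made here, and we read "links no more than n links away" as links with at least one endpoint within n − 1 links of the starting node.

```python
from collections import deque


def distances_within(adj, ego, n):
    """Breadth-first search from `ego`, keeping only nodes at most n links away."""
    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        u = queue.popleft()
        if dist[u] == n:
            continue  # do not expand past radius n
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_n_network(adj, ego, n):
    """Nodes within n links of `ego`, plus links lying on paths of length <= n.

    Assumption: a link belongs to the degree-n network when one of its
    endpoints is at most n - 1 links from the ego.
    """
    dist = distances_within(adj, ego, n)
    links = {frozenset((u, v)) for u in dist if dist[u] <= n - 1 for v in adj[u]}
    return set(dist), links


def degree_1_5_network(adj, ego):
    """Degree-1 network plus every link among the ego's immediate neighbors."""
    nodes, _ = degree_n_network(adj, ego, 1)
    links = {frozenset((u, v)) for u in nodes for v in adj[u] if v in nodes}
    return nodes, links
```

For a small network with links a–b, a–c, b–c, b–d, and d–e, the degree-1 network of a has two links, the degree-1.5 network adds the link b–c, and the degree-2 network also brings in node d and the link b–d.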
A path is a chain (i.e., succession) of nodes connected by links between pairs of nodes. Two nodes are connected if and only if (i.e., iff) there is a path between them. A connected component is a set of nodes with a path between every pair of nodes in the set. A bridge is a link that connects two otherwise isolated connected components. A hub is a node with many connections. Reachability is whether or not two nodes are connected by way of either a direct or an indirect path of any length.
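To make these definitions concrete, the following sketch computes connected components, reachability, and bridges for an undirected network. The adjacency-dictionary representation and the function names are assumptions made for illustration, and the bridge search is a deliberately simple brute-force check (remove a link, recount the components) rather than an efficient algorithm.

```python
def component_of(adj, start):
    """Set of nodes reachable from `start` by a path of any length."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def connected_components(adj):
    """List of node sets, one per connected component."""
    components, seen = [], set()
    for node in adj:
        if node not in seen:
            comp = component_of(adj, node)
            components.append(comp)
            seen |= comp
    return components


def reachable(adj, i, j):
    """True iff there is a direct or indirect path between i and j."""
    return j in component_of(adj, i)


def bridges(adj):
    """Links whose removal increases the number of connected components."""
    base = len(connected_components(adj))
    found = set()
    for u in adj:
        for v in adj[u]:
            link = frozenset((u, v))
            if u == v or link in found:
                continue
            # Copy the network without the candidate link and recount.
            pruned = {x: set(nbrs) for x, nbrs in adj.items()}
            pruned[u].discard(v)
            pruned[v].discard(u)
            if len(connected_components(pruned)) > base:
                found.add(link)
    return found
```

In a network made of a triangle a–b–c with a pendant node d attached to c, plus a separate pair e–f, there are two components, d is reachable from a but e is not, and the bridges are c–d and e–f.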
Geodesic distance, denoted by distance_ij, is the number of links in the shortest possible path from node i to node j. The diameter of a network is the largest geodesic distance in the connected network. Reverse distance, denoted by RD_ij, is (1 + Diameter) − distance_ij, so that nodes that are closer together receive a larger value. Metrics in Equations 2.1
and 2.2 are adapted from Valente and Foreman (1998).
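As a minimal sketch of the distance definitions above, geodesic distances in an unweighted network can be found by breadth-first search, and the diameter and reverse distance follow directly. The function names and the adjacency-dictionary input are assumptions made here. The sketch also assumes a connected network and takes reverse distance as (1 + diameter) − distance, the orientation in which closer pairs score higher.

```python
from collections import deque


def geodesic_distances(adj, source):
    """Number of links on the shortest path from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Largest geodesic distance; assumes the network is connected."""
    return max(max(geodesic_distances(adj, s).values()) for s in adj)


def reverse_distance(adj, i, j):
    """(1 + diameter) - distance_ij; assumed sign convention, see lead-in."""
    return (1 + diameter(adj)) - geodesic_distances(adj, i)[j]
```

On a four-node chain a–b–c–d the diameter is 3, so the end nodes a and d have reverse distance 1, while the adjacent nodes a and b have reverse distance 3.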
Structural centrality measures of a node are a host of measures reflecting the structural properties of the links surrounding a focal node. For example, degree centrality of a node is the number of edges incident on the node. Closeness centrality of a node is the average of the shortest path lengths from the node to all other nodes in the network. It is a rather small number in small-world networks (Watts and Strogatz, 1998). Betweenness centrality of a node is a measure of the node’s importance (and possibly influence, as discussed in Chapter 7) based on how often the node lies on shortest paths between other nodes, and is computed using the algorithm shown in Figure 2.1.
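The three centrality measures just described can be sketched as follows for an unweighted, undirected, connected network. The function names are hypothetical. Closeness follows the definition used in this chapter (average shortest path length, so smaller means more central). Betweenness uses Brandes' accumulation scheme, a standard technique chosen here for brevity; it is not necessarily the algorithm of Figure 2.1.

```python
from collections import deque


def degree_centrality(adj):
    """Number of edges incident on each node."""
    return {v: len(adj[v]) for v in adj}


def average_distance_closeness(adj):
    """Average shortest path length from each node to all other nodes."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        others = [d for v, d in dist.items() if v != s]
        result[s] = sum(others) / len(others)
    return result


def betweenness_centrality(adj):
    """Brandes-style betweenness: for each node, the sum over unordered pairs
    of the fraction of their shortest paths that pass through the node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}
```

In a star with one center and three leaves, the center has degree 3, an average distance of 1, and lies on the shortest path of all three leaf pairs, while each leaf has betweenness 0.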
Eigenvector centrality scores a node by the centrality of its neighbor nodes and has been used as a measure of influence and power, which are discussed later in this book (Bonacich and Lu, 2012). Bonacich developed a beta centrality measure C_BC with a parameter α used for adjusting the importance of a node’s degree versus a parameter β for adjusting the importance of the neighbors’ centrality. This is shown in Equation 2.3:
Eigenvector centrality of a node at time t is computed with Equation 2.4, where C(t) is the vector of node centralities at time t, C(0) is the initial centrality vector, A is the adjacency matrix, and A^t is the result of iterated multiplications of A:
\[ C(t) = A^{t}\, C(0) \tag{2.4} \]
As t approaches ∞, the dominant eigenvalue γ determines the centrality vector, which approaches a multiple of γ^t × V_1, where V_1 is the eigenvector corresponding to the dominant eigenvalue γ (Chiang, 2012).
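The iterated multiplication described above is the power method, and a minimal sketch is given below. Several choices are assumptions made for illustration: the function name, the all-ones starting vector, the fixed iteration count, and rescaling by the largest entry at every step (which keeps the numbers bounded without changing the direction of the vector). The sketch also assumes a connected, non-bipartite network; on bipartite networks such as stars or chains the plain iteration can oscillate instead of converging.

```python
def eigenvector_centrality(A, iterations=100):
    """Power iteration on the adjacency matrix A (a list of rows).

    Returns (centrality vector scaled so its largest entry is 1,
             estimate of the dominant eigenvalue).
    """
    n = len(A)
    c = [1.0] * n          # assumed starting vector
    gamma = 0.0
    for _ in range(iterations):
        # One multiplication by A: each node sums its neighbors' centralities.
        nxt = [sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
        gamma = max(nxt)   # converges to the dominant eigenvalue
        c = [x / gamma for x in nxt]
    return c, gamma
```

For a triangle of nodes 0, 1, and 2 with a pendant node 3 attached to node 2, node 2 receives the highest centrality, nodes 0 and 1 tie, and node 3 scores lowest because its only neighbor is node 2.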
Let us consider a degree-1.5 network of a node and measure the ratio
of the actual number of links in that network over the total number of
possible links that could exist, which yields a measure called the local
Density of a network is the ratio of the actual number of links in that network over the total number of possible links that could exist. Cohesion is the minimum number of edges that have to be removed before the network is disconnected.
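Both measures can be sketched directly from their definitions. The edge-list representation and function names below are assumptions made for illustration. Density assumes an undirected network without self-loops, so n nodes admit n(n − 1)/2 possible links. Cohesion is found by exhaustive search over sets of removed links, which is only practical for very small networks and assumes the network starts out connected.

```python
from itertools import combinations


def density(nodes, links):
    """Actual number of links over the n(n - 1)/2 possible undirected links."""
    n = len(nodes)
    return len(links) / (n * (n - 1) / 2)


def is_connected(nodes, links):
    """True iff every node can be reached from an arbitrary starting node."""
    adj = {v: set() for v in nodes}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def cohesion(nodes, links):
    """Smallest number of links whose removal disconnects the network.

    Brute force: tries every set of 1 link, then 2 links, and so on.
    """
    links = list(links)
    for k in range(1, len(links) + 1):
        for removed in combinations(links, k):
            kept = [link for link in links if link not in removed]
            if not is_connected(nodes, kept):
                return k
    return None  # a single node cannot be disconnected
```

A four-node cycle and a triangle with one pendant node both have four of six possible links, hence the same density, yet the cycle has cohesion 2 while the pendant link gives the second network cohesion 1.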
Let us consider a cluster that is a subset of nodes s, in which each node may compute a ratio r: the number of its neighbors in s over the total number of its neighbors. In the set s, the node with the minimum
Whereas centrality is a micro-level measure, centralization is a macro-level measure, which measures variance in the distribution of centrality in a network. We show the most generic form of centralization in Figure 2.2.
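As one concrete reading of centralization, the sketch below sums the gaps between the most central node and every other node, then divides by the largest value that sum could take. This follows Freeman's (1978) general scheme. The function names are hypothetical, and the degree-based normalizer (n − 1)(n − 2), attained by a star network, is an assumption brought in here; it is not taken from the algorithm of Figure 2.2.

```python
def centralization(scores, theoretical_max):
    """Sum of (highest score - each score), scaled by the largest possible sum."""
    top = max(scores)
    return sum(top - s for s in scores) / theoretical_max


def degree_centralization(adj):
    """Freeman-style degree centralization of an undirected network.

    Assumes at least three nodes; a star network yields 1.0 and any
    network in which all degrees are equal yields 0.0.
    """
    n = len(adj)
    degrees = [len(adj[v]) for v in adj]
    return centralization(degrees, (n - 1) * (n - 2))
```

A star is fully dominated by its center and scores 1.0, whereas a cycle, in which every node has degree 2, scores 0.0.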
Fig 2.2 Centralization algorithm.
Leadership (L) is a measure of network domination, computed using Equation 2.5, where d_max is the degree of the node with the highest degree, d_i is the degree of node i, and n is the number of nodes (Freeman, 1978; Macindoe and Richards, 2011):
\[ L = \sum_{i=1}^{n} \frac{d_{\max} - d_i}{(n-1)(n-2)} \tag{2.5} \]
using Equation 2.6:
Diversity is a measure of the number of edges in a graph that are disjoint. End vertices of such edges are not adjacent (i.e., disjoint dipoles). Diversity is shown in Equation 2.7:
Burt’s structural holes measure gaps among connected components and, as such, are another measure of diversity (Burt, 1995).
2.1 CONCLUSIONS AND FUTURE WORK
Network analysis focuses on quantification (and statistical analysis) of the qualities of nodes’ relative locations as well as of entire network properties. SNA has long been a staple tool for mathematical sociology (Borgatti et al., 2013). An active direction of interest has been intelligence analysis of human networks to understand, predict, and mitigate law enforcement concerns as well as to understand geopolitical landscapes. The recent debate over surveillance and monitoring of electronic communication metadata by the National Security Agency (NSA) is indicative of this fervent interest.
A second direction of interest is marketing and branding on social media. The interest is to understand the human propensity to be influenced by network connections. Marketers use these propensities to craft viral dissemination of consumption patterns and to manipulate economic activities. The documentary filmmaker Morgan Spurlock has publicly explored branding on social media. His mission is to raise public awareness and to inform us about the changing landscape of cultural values in society (e.g., the Super Size Me app).
REFERENCES
Bonacich, P., Lu, P., 2012. Introduction to Mathematical Sociology. Princeton University Press.
Borgatti, S., Everett, M., Johnson, J., 2013. Analyzing Social Networks. SAGE Publications Ltd.
Burt, R., 1995. Structural Holes: The Social Structure of Competition. Harvard University Press.
Chiang, M., 2012. Networked Life: 20 Questions and Answers. Cambridge University Press.
Freeman, L., 1978. Centrality in social networks: conceptual clarification. Soc. Netw. 1, 215–239.
Golbeck, J., 2013. Analyzing the Social Web. Morgan Kaufmann.
Knoke, D., Yang, S., 2007. Social Network Analysis. Sage Publications.
Macindoe, O., Richards, W., 2011. Comparing networks using their fine structure. Int. J. Soc. Comput. Cyber-Phys. Syst. 1 (1), 79–97. Inderscience Publishers.
Suthers, D., 2011. Interaction, mediation, and ties: an analytic hierarchy for socio-technical systems. In: Proceedings of the Hawaii International Conference on the System Sciences (HICSS-44), January 4–7, 2011, Kauai, Hawai‘i.
Valente, T., Foreman, R., 1998. Integration and radiality: measuring the extent of an individual’s connectedness and reachability in a network. Soc. Netw. 20 (1), 89–105.
Wasserman, S., Faust, K., 1994. Social Network Analysis: Methods and Applications. Cambridge University Press.
Watts, D., Strogatz, S., 1998. Collective dynamics of ‘small-world’ networks. Nature 393 (6684), 440–442.