Until now, studies in network science have focused on particular relationships that require varied and sometimes incompatible datasets, which has kept the field from being a truly universal discipline. This new approach would remove the need for tedious human-based analysis of different datasets and help researchers spend more time on the qualitative aspects of network science research. Computational Network Science
CHAPTER 6
Diffusion and Contagion
This chapter explores the phenomena of rampant changes in networks exemplified by (a) disseminations of preferences, (b) percolation, (c) epidemic (i.e., contagion) of disease, and (d) community compositions. The rest of this chapter reviews these four categories in order.
6.1 POPULATION PREFERENCE SPREAD
Schelling’s model of segregation (Schelling, 1971) is the earliest formally studied report of rampant changes in networks. Schelling pointed out that a small preference for one’s neighbors to be of the same ethnicity leads to widespread segregation emerging in the network.
He used coins on a patch of graph paper to demonstrate this theory by placing pennies and nickels in different patterns on the cells. Coins were moved one by one if they were in an unsatisfactory composition. For every colored cell, if more than 33% of the adjacent cells were of a different color, the cell would move to another randomly selected cell. You can try the model out using Chris Cook’s online demonstration program or the NetLogo model. Further details and interesting emergent patterns that arise are available from Hatna and Benenson (2012).
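The coin-moving rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than Schelling's original procedure: the grid size, the share of empty cells, the use of the eight surrounding cells as "adjacent", and the choice to ignore empty neighbors when computing the 33% ratio are all assumptions made here, and the function names are hypothetical.

```python
import random

def make_grid(size, empty_fraction, rng):
    """Random square grid holding 'penny', 'nickel', or None (empty cell)."""
    n_cells = size * size
    n_empty = int(n_cells * empty_fraction)
    n_coins = n_cells - n_empty
    cells = ["penny"] * (n_coins // 2) + ["nickel"] * (n_coins - n_coins // 2)
    cells += [None] * n_empty
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]

def unhappy(grid, r, c, tolerance=0.33):
    """True if more than `tolerance` of the occupied adjacent cells differ."""
    size = len(grid)
    same = different = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < size and 0 <= cc < size):
                continue
            if grid[rr][cc] is None:
                continue  # assumption: empty cells do not count
            if grid[rr][cc] == grid[r][c]:
                same += 1
            else:
                different += 1
    total = same + different
    return total > 0 and different / total > tolerance

def sweep(grid, rng, tolerance=0.33):
    """Visit coins one by one; move each unhappy coin to a random empty cell."""
    size = len(grid)
    coins = [(r, c) for r in range(size) for c in range(size) if grid[r][c] is not None]
    empties = [(r, c) for r in range(size) for c in range(size) if grid[r][c] is None]
    rng.shuffle(coins)
    moved = 0
    for r, c in coins:
        if empties and unhappy(grid, r, c, tolerance):
            k = rng.randrange(len(empties))
            er, ec = empties[k]
            grid[er][ec], grid[r][c] = grid[r][c], None
            empties[k] = (r, c)  # the vacated cell becomes available
            moved += 1
    return moved

# Small demonstration run: repeat sweeps until nobody wants to move.
rng = random.Random(42)
grid = make_grid(20, 0.1, rng)
for _ in range(30):
    if sweep(grid, rng) == 0:
        break
```

Printing the grid before and after the sweeps is one way to watch like coins gather into patches, although the exact pattern depends on the random seed and on the assumptions listed above.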
Let p be the probability for the existence of a tie for each individual with another person in the network. Let us ignore the repeated connections. With a small value of p and a large value of N, pN represents the probability of an individual’s indirect connection with others in the network, that is, the second-hand tie through the individual’s primary tie. The expected (i.e., average) number of individuals that can be reached is shown by Equation 6.1.
occurs. Rapid spread of diseases through connections in populations (e.g., obesity) is an epidemic or a pandemic (i.e., an epidemic across borders) event (Hays, 2005). Topologies of networks may hinder or promote these events. Whereas high-distance networks inhibit epidemics, low-distance networks (e.g., a scale-free network) accelerate them.
If pN = 0.5, the ratio in Equation 6.1 yields the value of 2 and no epidemic results. This value is below the threshold needed for the epidemic onset. With p = 0.0006 and N = 2500, the ratio is larger than 1.0 (i.e., above the epidemic threshold), and the epidemic follows the sigmoidal logistic curve shown in Figure 6.1.
F(t) is the fraction of agents in a society who have adopted a new product or behavior by time t, p is the rate of innovation, and q is the rate of imitation. χ(t) is a function of percentage change in price and other variables.
$$\frac{dF(t)/dt}{1 - F(t)} = \left[\,p + q\,F(t)\,\right]\chi(t) \tag{6.2}$$
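As a hedged illustration of how the adoption fraction F(t) evolves under the rate of innovation p and the rate of imitation q, the sketch below integrates dF/dt = (p + qF)(1 − F)χ(t) with a simple forward Euler step. The parameter values, the step size, and the default χ(t) = 1 (no price effect) are assumptions chosen for illustration only.

```python
def bass_curve(p, q, steps, dt=1.0, chi=lambda t: 1.0):
    """Forward-Euler integration of dF/dt = (p + q*F) * (1 - F) * chi(t)."""
    F = 0.0                      # nobody has adopted at t = 0
    curve = [F]
    for k in range(steps):
        t = k * dt
        dF = (p + q * F) * (1.0 - F) * chi(t) * dt
        F = min(1.0, F + dF)     # the adopted fraction cannot exceed 1
        curve.append(F)
    return curve

# Illustrative parameters (assumed, not taken from the text).
curve = bass_curve(p=0.03, q=0.38, steps=30)
```

With imitation much stronger than innovation, the resulting curve is S-shaped: adoption starts slowly, accelerates as adopters influence others, and levels off as the population saturates.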
6.2 PERCOLATION MODEL
In physics and mathematics, percolation theory describes the behavior of clustered components in random networks (Grimmett, 1999). The common intuition is the movement and filtering of fluids through porous materials, for example, filtration of water through soil and permeable rocks. In a network, let each node be a cell through which a fluid-like substance may transit to other cells. A network (i.e., a grid) then is a sponge-like substance, and percolation is the determination of whether a substance introduced at one cell will reach the other side of the network (or grid). There have been many applications of percolation, such as analysis of forest fires, bank failures, and rumor spread. You may explore an online implementation of the NetLogo percolation model.
Commonly, a cell’s transmission rate is modeled as a probability value p. As shown in Figure 6.2, at a certain value of p, percolation is achieved.
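The question of whether a substance reaches the other side of the grid can be explored with a small Monte Carlo sketch. The version below assumes site percolation on a square grid, where each cell is open with probability p and the substance moves between open cells that share a side; the grid size, trial count, and function names are illustrative choices, not details from the text.

```python
import random
from collections import deque

def percolates(open_cells):
    """True if open cells connect the top row to the bottom row."""
    n = len(open_cells)
    queue = deque((0, c) for c in range(n) if open_cells[0][c])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True
        for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= rr < n and 0 <= cc < n
                    and open_cells[rr][cc] and (rr, cc) not in seen):
                seen.add((rr, cc))
                queue.append((rr, cc))
    return False

def percolation_rate(n, p, trials, rng):
    """Fraction of random n-by-n grids that percolate at open probability p."""
    hits = 0
    for _ in range(trials):
        grid = [[rng.random() < p for _ in range(n)] for _ in range(n)]
        hits += percolates(grid)
    return hits / trials

rng = random.Random(7)
rates = {p: percolation_rate(20, p, 50, rng) for p in (0.3, 0.5, 0.6, 0.7, 0.9)}
```

Tabulating `rates` for increasing p is one way to see the abrupt change from grids that almost never percolate to grids that almost always do.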
6.3 DISEASE EPIDEMIC MODELS
Although the original inspiration is disease epidemics, we treat the topic as generically applicable to the epidemic spread of any phenomenon. The earliest model is the susceptible, infected, removed (SIR) model, documented in Kermack and McKendrick (1927). It considers a fixed population N at time t that is divided into three camps: susceptible, infected, and removed. The susceptible camp, denoted by S(t), consists of individuals who are not yet infected at time t. The infected camp, denoted by I(t), consists of individuals who have already been infected and are capable of transmitting the epidemic to the individuals in the susceptible category. The removed camp, denoted by R(t), consists of individuals with a previous infection who have been removed due to either immunization or termination (i.e., death). The removed individuals are not able to get infected again or to transmit the epidemic to others.
Fig 6.2 Percolation model.
$$N(t) = S(t) + I(t) + R(t) \tag{6.3}$$
All individuals are considered to have an equal probability of contact (i.e., infection), denoted by β. At each time step t, each person may infect β × N others with equal probability. The fraction of contacts between an infected individual and a susceptible one is S(t)/N(t). Therefore, the new infection rate is β × N × [S(t)/N(t)] × I(t) = β × S(t) × I(t). The population leaving the susceptible camp is equal to the number entering the infected camp. Meanwhile, a number of individuals proportional to the number of infected individuals leave this class per unit time and enter the removed class (i.e., the law of mass action). Let γ denote the combined mean recovery and death rates. It is assumed that infection and recovery are much faster than the timescale of births and deaths, so births and deaths are overlooked in the model. Therefore, the rate of infection change is $dI(t)/dt = \beta \times S(t) \times I(t) - \gamma \times I(t)$ and the rate of recovery is $dR(t)/dt = \gamma \times I(t)$.
There have been many variations to the SIR model, including SIS and models that account for births and deaths in mathematical epidemiology (Brauer et al., 2008), with models available from Vynnycky and White (2010). Another good source of discussion of these models is Jackson (2008).
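To make the three-camp dynamics concrete, the following sketch steps the rates β × S(t) × I(t) and γ × I(t) forward in time with a forward Euler scheme. The population size, the values of β and γ, and the step size are assumptions picked only so that an outbreak is visible; the decrease of S(t) is taken to equal the new infections, as stated in the text.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps, dt=0.1):
    """Euler integration of the susceptible/infected/removed camps."""
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # leave S, enter I
        new_removals = gamma * i * dt        # leave I, enter R
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history

# Assumed values: 1000 people, one initial case, beta and gamma chosen so
# that beta * S(0) exceeds gamma and the infection initially grows.
history = simulate_sir(s0=999, i0=1, r0=0, beta=0.0005, gamma=0.1, steps=1000)
peak_infected = max(i for _, i, _ in history)
```

Because every individual who leaves one camp enters another, the sum S(t) + I(t) + R(t) stays at the initial population size throughout the run, which mirrors Equation 6.3.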
Whether epidemically grown or not, a community is a group of closely related and connected entities with similar attributes; it can also be a part of a large network with groups of communities. Entities in one community can interact with other communities. A good community will interact less with the entities of an outside community and more with those of the inside community. Parts of networks form groups called clusters. These clusters can be considered to be communities. Next, we outline processes for community detection.
6.4 COMMUNITY DETECTION
In order to motivate a community, let us consider a project team in a firm, where the team is a community and the team members have frequent interactions. There might be interfaces from one project to another. For
example, finance and control project teams interact with material handling, sales, and distribution in a factory. Although projects are interconnected, they are clearly identifiable communities, as shown in Figure 6.3.
Fig 6.3 IT organization of a factory.
We begin with a set of global strategies and then turn to local communities. We introduce graph partitioning, where we divide the graph into parts with a minimal number of links between them. The number of links running between two clusters is called the cut size. If we place the whole network in one group, that is, it is undivided, the cut size is 0. In partitioning, it might be desirable to specify the least number of groups and the target group size, even though communities cannot always be divided in an optimal way. For example, consider a sports club with 12 players. Six of them play basketball and six of them play football. We would like to divide the entire sports network into communities (i.e., clusters) based on how closely players interact with one another. All players who play the same sport interact with one another closely and they are friends. There are also a few footballers who interact with basketballers. Figure 6.4 illustrates the interactions among footballers and basketballers. The best graph partitioning method is bipartition, which divides the graph into two clusters of equal size and minimal cut size. In our graph, it is not possible to have a partition with a cut size smaller than 3. Therefore, this method is optimal for dividing clusters in our graph.
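The cut size of a bipartition is simple to compute once the links are listed. The sketch below builds a stand-in for the sports club: every pair of players within a sport is linked, and three cross-sport friendships are added. The player labels and the particular three cross links are hypothetical, since the actual links of Figure 6.4 are not reproduced here.

```python
from itertools import combinations

def cut_size(edges, group_a):
    """Number of links with exactly one endpoint inside group_a."""
    group_a = set(group_a)
    return sum(1 for u, v in edges if (u in group_a) != (v in group_a))

footballers = ["F1", "F2", "F3", "F4", "F5", "F6"]
basketballers = ["B1", "B2", "B3", "B4", "B5", "B6"]

# Players of the same sport all interact with one another.
edges = list(combinations(footballers, 2)) + list(combinations(basketballers, 2))
# Hypothetical cross-sport friendships (three of them, matching the cut size in the text).
edges += [("F1", "B1"), ("F2", "B4"), ("F6", "B6")]

by_sport = cut_size(edges, footballers)                    # split along the sport
undivided = cut_size(edges, footballers + basketballers)   # one single group
lopsided = cut_size(edges, ["F1", "F2", "F3", "B1", "B2", "B3"])
```

Comparing `by_sport` with `lopsided` shows why the split along the sport is preferred: any equal-sized split that mixes the two sports has to cut many within-sport links.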
6.4.1 Spectral Clustering
The spectral “graph” clustering technique is used to determine the number of clusters in large networks. Spectral partitioning is based on the Laplacian matrix (L) and on finding its eigenvalues and eigenvectors.
The adjacency matrix is $A = [W_{ij}]$, where $W_{ij}$ is the edge weight between vertices $x_i$ and $x_j$. If there is a link between nodes i and j, $W_{ij} = 1$; otherwise, $W_{ij} = 0$. The Laplacian matrix is L = D − A, where D is the diagonal matrix of node degrees. We illustrate a simple example shown in Figure 6.5.
Fig 6.5 The graph G(9, 15) to be analyzed for spectral partitioning.
Fig 6.4 Two communities of players in sporting club.
For each node, the value of D is computed based on how many edges are linked to that node. For example, for node 1, there are three edges connected from nodes 2, 3, and 4. Therefore, the degree of node 1 is 3.
From the Laplacian matrix shown in Equation 6.4, we compute the Fiedler vector (S) based on eigenvalues and eigenvectors. The Fiedler vector has both positive and negative components, and their sum must be 0.
From the second column of the matrix S shown in Equation 6.5, which holds the Fiedler vector, we identify two communities, where all the positive values form one cluster and the negative values form another cluster (Figure 6.6). In this example, the two communities (i.e., clusters) are {1, 2, 3, 4} and {5, 6, 7, 8, 9}.
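$$L = D - A = \begin{bmatrix}
3 & -1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \\
-1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & -1 & 3 & -1 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & -1 & 4 & -1 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 4 & -1 & -1 & -1 & 0 \\
0 & 0 & 0 & -1 & -1 & 4 & -1 & -1 & 0 \\
0 & 0 & 0 & 0 & -1 & -1 & 4 & -1 & -1 \\
0 & 0 & 0 & 0 & -1 & -1 & -1 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1
\end{bmatrix} \tag{6.4}$$
$$S = \begin{bmatrix}
0.33 & -0.38 \\
0.33 & -0.48 \\
0.33 & -0.38 \\
0.33 & -0.12 \\
0.33 & 0.16 \\
0.33 & 0.16 \\
0.33 & 0.30 \\
0.33 & 0.24 \\
0.33 & 0.51
\end{bmatrix} \tag{6.5}$$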
Fig 6.6 Two communities derived from spectral partitioning technique.
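The split by sign of the Fiedler vector can be reproduced with a short, dependency-free sketch. The edge list below is read off the off-diagonal entries of the Laplacian in Equation 6.4. The eigenvector is approximated by power iteration on a shifted matrix, which is a technique substituted here for convenience and is not necessarily how the chapter's values were obtained; the shift, the starting vector, and the iteration count are assumptions.

```python
import math

# Links implied by the -1 entries of the Laplacian in Equation 6.4.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def laplacian(n, edges):
    """L = D - A for an undirected, unweighted graph with nodes 1..n."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        i, j = u - 1, v - 1
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L

def fiedler_vector(L, iterations=3000):
    """Approximate the eigenvector of the second-smallest eigenvalue of L.

    Power iteration on (shift*I - L), with the constant vector projected
    out at every step so the iteration cannot converge to it.
    """
    n = len(L)
    shift = 2.0 * max(L[i][i] for i in range(n))  # bound on the largest eigenvalue
    x = [i - (n - 1) / 2.0 for i in range(n)]     # zero-mean starting vector
    for _ in range(iterations):
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        mean = sum(y) / n
        y = [value - mean for value in y]
        norm = math.sqrt(sum(value * value for value in y))
        x = [value / norm for value in y]
    return x

def split_by_sign(vector):
    """Group node labels (1-based) by the sign of their vector component."""
    negative = [i + 1 for i, value in enumerate(vector) if value < 0]
    positive = [i + 1 for i, value in enumerate(vector) if value >= 0]
    return negative, positive

vector = fiedler_vector(laplacian(9, EDGES))
clusters = split_by_sign(vector)
```

An eigenvector is only defined up to sign, so which of the two groups comes out "negative" may differ from Equation 6.5; the grouping of nodes is what matters.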
6.4.2 Hierarchical Clustering
Hierarchical clustering is the most popular and widely used method to analyze social network data. In this method, nodes are compared with one another based on their similarity. Larger groups are built by joining groups of nodes based on their similarity. A criterion is introduced to compare nodes based on their relationship. There are two types of hierarchical clustering approaches:
1. Agglomerative approach: This method is also called a bottom-up approach and is shown in Figure 6.7. In this method, each node represents a single cluster at the beginning; nodes then merge based on their similarities until all nodes belong to the same cluster.
2. Divisive approach: This method is also called a top-down approach. Initially, all nodes belong to the same cluster; eventually, each node forms its own cluster. The divisive approach is less widely used due to its complexity compared with the agglomerative approach.
The final result for both approaches is represented as a dendrogram, as shown in Figure 6.8.
Consider the distances between four Illinois towns, Carbondale, Peoria, Springfield, and Bloomington, shown in Figure 6.9. We can observe that Bloomington and Peoria are the two closest cities, and we join them using the hierarchical clustering algorithm. Figure 6.10 shows the four towns on the map. The distances among Peoria, Bloomington, and Springfield are shorter, with two of them nearly identical at 73 and 71 miles. The final dendrogram is shown in Figure 6.11.
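The bottom-up merging in the towns example can be sketched as follows. Only the values 73 and 71 appear in the text; which pair each belongs to and all other distances in the table below are placeholder assumptions, not the figures of Figure 6.9. The sketch also assumes single linkage (the distance between clusters is the smallest town-to-town distance), which the text does not specify.

```python
def agglomerate(names, dist):
    """Single-linkage agglomerative clustering; returns the merges in order."""
    clusters = [frozenset([name]) for name in names]
    merges = []

    def linkage(a, b):
        # Single linkage: closest pair of members across the two clusters.
        return min(dist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        distance, i, j = min(
            (linkage(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters) if i < j
        )
        merges.append((sorted(clusters[i]), sorted(clusters[j]), distance))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

towns = ["Carbondale", "Peoria", "Springfield", "Bloomington"]
# Placeholder distances in miles (assumed for illustration).
miles = {
    frozenset(("Bloomington", "Peoria")): 40,
    frozenset(("Bloomington", "Springfield")): 71,
    frozenset(("Peoria", "Springfield")): 73,
    frozenset(("Carbondale", "Springfield")): 170,
    frozenset(("Bloomington", "Carbondale")): 220,
    frozenset(("Carbondale", "Peoria")): 240,
}
merges = agglomerate(towns, miles)
```

Each entry of `merges` corresponds to one junction of a dendrogram, read from the bottom up: the two clusters joined and the distance at which they were joined.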
There is a problem with graph partitioning. We need to specify the number and the size of the desired clusters. If a network is new and large, we do not have any idea about the number of clusters and how big they must be. Hierarchical clustering also has a shortcoming. If we cut the hierarchical tree at any level, we produce a good partition, but we end up with n − 1 partitions. If the network has 1 million nodes (n), we get 1 million minus 1 partitions. Many partitions are recovered, from which we need to identify the best one. We cannot use 1 million partitions. We must find additional criteria to determine which partition is the best one. In Girvan and Newman (2002), an algorithm is offered to solve the problems with spectral methods. It is based on the divisive method and hierarchical clustering. The divisive method repeatedly identifies and removes edges connecting densely connected regions. To identify the edges to remove, it uses edge betweenness, which is the number of shortest paths passing through an edge. It also removes the links that connect clusters. The algorithm shown in Figure 6.12 is designed to remove edges in decreasing order of betweenness. An example network is shown in Figure 6.13.
Fig 6.7 Algorithm for agglomerative hierarchical clustering.
Fig 6.8 A dendrogram example for hierarchical clustering approach.
Fig 6.9 Distances between four Illinois towns.
Fig 6.10 Hierarchical clustering of example towns shown on a map.
Fig 6.11 The final dendrogram for the towns’ example.
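The idea of removing the edge with the highest betweenness until the network falls apart can be sketched as below. This is a minimal illustration under assumptions, not the listing of Figure 6.12: betweenness is counted with a breadth-first accumulation in the style of Brandes, it is recomputed after every removal, and the six-node test network (two triangles joined by one link) is hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path count through each edge of an unweighted graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        dist, paths = {source: 0}, {source: 1.0}
        preds = {v: [] for v in adj}
        order, queue = [], deque([source])
        while queue:                      # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    paths[w] = 0.0
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        credit = {v: 0.0 for v in order}
        for w in reversed(order):         # push path credit back toward source
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    # Every node pair was counted once from each endpoint.
    return {edge: value / 2.0 for edge, value in score.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(adj[v] - group)
        seen |= group
        result.append(group)
    return result

def girvan_newman_split(adj):
    """Remove highest-betweenness edges until one more component appears."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    start_count = len(components(adj))
    removed = []
    while len(components(adj)) == start_count:
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = tuple(max(scores, key=scores.get))
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append(frozenset((u, v)))
    return components(adj), removed

# Hypothetical network: triangles {1, 2, 3} and {4, 5, 6} joined by link 3-4.
network = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
parts, removed = girvan_newman_split(network)
```

Calling `girvan_newman_split` again on each resulting part continues the top-down division and yields the hierarchy that a dendrogram would display.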
6.4.3 Cascade Model
Information cascade models replicate the effects when individuals abandon their own information in favor of inferences based on other people’s behaviors and opinions.
The cascade models presuppose that individuals have a finite action set. Individuals make rational, sequential decisions purely based on publicly observable information about others. Common examples are neighborhood rivalry and substance abuse (Dodge et al., 2010; Easley and Kleinberg, 2010).
Fig 6.12 Girvan and Newman’s algorithm.
Fig 6.13 An example network illustrating Girvan and Newman’s algorithm.
Diffusion begins with a few nodes that are initially active and then spread their influence to other nodes. A more in-depth discussion of influence is deferred to Chapter 7. Let us consider a weighted graph, where each edge has a weight (i.e., tie strength) and each node carries a threshold value. There are a few nodes that are already active. An inactive node becomes active when the sum of the weights on its incoming edges from active neighbors is at least the threshold value of that node. Linear threshold models (LTMs) were first introduced by Granovetter (Granovetter, 1978). In LTMs, each node v randomly chooses a threshold $\theta_v$ from a uniform distribution in an interval between 0 and 1. Let $\theta_v$ denote the weighted fraction of neighbors of v that must be active in order to activate v. Let a neighbor w be able to influence node v with weight $b_{w,v}$. We assume that $\sum_{w} b_{w,v} \le 1$. Each node is activated only if it satisfies Equation 6.6, where the sum runs over the active neighbors w of v. The process of activations proceeds in successive activation stages until no further activations are possible.
$$\sum_{w} b_{w,v} \ge \theta_v \tag{6.6}$$
To illustrate, let us look at influence among students for signing on to a course, shown in Figure 6.14. A few variations to this model are discussed in Golbeck (2013).
In the network of Figure 6.14, there are nine graduate students labeled from A to I. The node value denotes the threshold value for that node, in this case aversion to taking a specific course. Students interact with one another and influence their peers in making course registration decisions. The edge value represents the weight of influence between a pair of peers. The graph illustrates how students are influenced by their peers in registering for the course. Student A is already registered for the course. The rest of the students are not registered yet. E is influenced by A and D. However, D is not active, so E is influenced by A only. E’s threshold value is 0, which is less than the influence (edge) value of 0.2 from A. So E registers for the course and becomes active in the network. Subsequently, C is influenced by A and E, as they are already active in the network, and the sum of their incoming edge influence values is 0.3 + 0.3 = 0.6, which is greater than the threshold value 0.5; therefore, C becomes active. Then, H is influenced by C and E. The sum of incoming edge values is 0.4 + 0.2 = 0.6, which is equal to the threshold value 0.6; thus, H becomes active and registers for the course. Next, I is influenced by H; as the incoming edge value 0.4 is equal to the threshold value 0.4, I also becomes active. The threshold values of nodes B, D, F, and G are greater than their incoming active edge values, so these students do not register for the course and they remain inactive. Finally, the graph takes the makeup shown in Figure 6.15, where nodes A, C, E, H, and I are active and nodes B, D, F, and G are inactive.
This is an example of the cascade model of network diffusion. It forms a community of students (i.e., A, C, E, H, I) who register for the course, leaving out the parts of the network (i.e., nodes B, D, F, G) that do not register for the course (Figure 6.16). Hence, the cascade model of network diffusion is shown to stop and leave out certain parts of the network, and this could be used as a model for community detection.
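The student walk-through can be replayed with a small linear threshold sketch. The thresholds of C, E, H, and I and the weights on the edges A→E, A→C, E→C, C→H, E→H, and H→I are the values stated in the text. The thresholds given to B, D, F, and G and the weight on D→E are placeholders, because the text does not state them; a small tolerance is also assumed so that floating-point rounding does not break the "equal to the threshold" cases.

```python
def linear_threshold(weights, thresholds, seeds, tol=1e-9):
    """Activate nodes stage by stage until no further activation is possible.

    weights maps a directed edge (w, v) to the influence b[w, v];
    thresholds maps each non-seed node v to its threshold.
    """
    active = set(seeds)
    stages = []
    while True:
        newly = set()
        for v, theta in thresholds.items():
            if v in active:
                continue
            # Sum of weights on incoming edges from already active neighbors.
            influence = sum(b for (w, target), b in weights.items()
                            if target == v and w in active)
            if influence + tol >= theta:
                newly.add(v)
        if not newly:
            return active, stages
        active |= newly
        stages.append(newly)

weights = {
    ("A", "E"): 0.2, ("A", "C"): 0.3, ("E", "C"): 0.3,
    ("C", "H"): 0.4, ("E", "H"): 0.2, ("H", "I"): 0.4,
    ("D", "E"): 0.1,   # placeholder weight: the text gives no value
}
thresholds = {
    "C": 0.5, "E": 0.0, "H": 0.6, "I": 0.4,
    "B": 0.9, "D": 0.9, "F": 0.9, "G": 0.9,   # placeholder thresholds
}
active, stages = linear_threshold(weights, thresholds, seeds={"A"})
```

The `stages` list records the order of activation, which under these assumptions follows the narrative: E first, then C, then H, and finally I.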
Fig 6.14 An example network of students and their threshold values of influence.
Fig 6.15 Student A influenced other students C, E, H, and I in registering the network course.
Fig 6.16 Detected students community who registered for the networking course.
6.4.4 Independent Contagion Model
In a related independent contagion model, a few nodes are initially active and every edge has a probability (i.e., tie strength) of propagation. An active node activates its neighbor with the probability on that particular edge. There are no threshold values for nodes. Nodes are activated in an arbitrary order. Initially, node A is active at time t and is said to be contagious. This node has one chance of influencing its neighbor node B at time stamp t + 1. There could be multiple active neighbors competing to influence node B. Node B becomes active based on the activation attempts sequenced in an arbitrary order. The probability of success of node A’s attempt at activating B is denoted by $P_B(A)$. If node B has a set S of neighbors who already attempted to activate node B but failed, node A’s success probability is denoted by $P_B(A, S)$. When node A completes influencing its neighbors, it remains in an active state but is no longer contagious. This process completes when no node remains contagious. The network in Figure 6.17 is an example of 10 family members A–J. Each edge shows a probability (i.e., weight) of influencing other family members.
Fig 6.17 An example network of family members and their weight in influencing other members.
Family members A and D first bought an iPhone 5S at time t, so they are active and contagious. Node A attempts to influence C to buy the phone with the probability 0.6. Node C is also influenced by the other active node D with the edge weight 0.4. Node C can be influenced by either family member A or D, depending on the sequence of the attempts made on node C at that time. Node C is influenced by node A at time stamp t + 1, and the probability of A’s success in influencing node C is denoted by $P_C(A, S)$, where S is the set of other active family members who may influence node C, so S = {D}. Nodes A and D remain active but become noncontagious, as they attempt to influence their neighbors only once. These steps are repeated and influence cascades from family member C → F and F → I. Eventually, the family members A, D, C, F, and I have the new iPhone 5S and form a community, leaving the other members inactive, as shown in Figure 6.18.
The community of family members is detected based on the independent contagion model. Initially, the family members A and D have phones; they are able to diffuse information and influence other family members C, F, and I to buy the phone. This leaves family members B, E, G, H, and J not part of the diffusion and cascading process, which helps us detect the community of iPhone 5S holders in the family shown in Figure 6.19.
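A run of the independent contagion process can be simulated as sketched below. The probabilities 0.6 on A→C and 0.4 on D→C come from the text; the remaining edges and their probabilities are placeholders, since Figure 6.17 is not reproduced here. The sketch also assumes that a success probability does not depend on the set S of earlier failed attempts, which is the simplest special case of $P_B(A, S)$.

```python
import random

def independent_contagion(prob, seeds, rng):
    """Each newly active node gets one chance to activate each inactive neighbor.

    prob maps a directed edge (u, v) to the probability that u activates v.
    """
    active = set(seeds)
    contagious = sorted(seeds)
    while contagious:
        rng.shuffle(contagious)             # attempts happen in arbitrary order
        next_contagious = []
        for u in contagious:
            for (source, v), p in prob.items():
                if source == u and v not in active and rng.random() < p:
                    active.add(v)
                    next_contagious.append(v)
        contagious = next_contagious        # earlier nodes stay active, not contagious
    return active

family = {
    ("A", "C"): 0.6, ("D", "C"): 0.4,       # values stated in the text
    ("C", "F"): 0.5, ("F", "I"): 0.5,       # placeholder probabilities
    ("A", "B"): 0.1, ("D", "E"): 0.1,       # placeholder probabilities
}
rng = random.Random(5)
buyers = independent_contagion(family, seeds={"A", "D"}, rng=rng)
```

Because activation is random, repeated runs with different seeds of the random generator give different communities; averaging over many runs estimates how likely each family member is to end up with the phone.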
Fig 6.18 Family members A and D influenced C, F, and I in buying iPhone 5.
Thus far, community detection has focused on approaches that are global to the network. There are times when there is a need for detecting only the local neighborhood (i.e., local community) of a starting node. This is covered in our last subsection, presented next.
6.4.5 Node-Centric Community Detection
A clique is a community. Clique search is an NP-hard problem (Tang and Liu, 2010). Let CQ(v, k) be a k-sized queue of cliques starting from node v. N(v) is the set of neighbors of node v. Figure 6.20 is a brute-force clique search algorithm that is computationally intractable for large networks. Many algorithmic improvements and optimizations are possible, such as the use of dynamic programming techniques. The clique percolation method (CPM) is a strategy for discovering overlapping communities (Palla et al., 2005), even though CPM is computationally intractable.
Fig 6.19 Community of iPhone 5S holders in the family.
Fig 6.20 Clique search algorithm. Adapted from Han and Liu (2010).
A key concept is reachability among nodes when there is a path between them. There are many other community concepts, such as lambda sets and k-cliques. A k-clique is a maximal subgraph in which the largest geodesic distance between any two nodes (defined in the original network) is not larger than k. A k-club, which is a subclass of k-cliques, restricts the geodesic distance within the subgraph to be not larger than k. A subgraph $G_s(V_s, E_s)$ is γ-dense (i.e., a quasi-clique) iff $|E_s| / [\,|V_s|(|V_s| - 1)/2\,] \ge \gamma$. When γ = 1, γ-dense is the same as a clique. Two nodes $v_i$ and $v_j$ are structurally equivalent if, for any node $v_k$ such that $v_k \ne v_i$ and $v_k \ne v_j$, $e(v_i, v_k) \in E$ iff $e(v_j, v_k) \in E$.
A commonly used similarity measure such as Jaccard (Equation 6.7) can be used to find a group of similar nodes in a community:
$$\mathrm{Jaccard}(v_i, v_j) = \frac{|N_i \cap N_j|}{|N_i \cup N_j|} = \frac{\sum_k A_{ik} A_{jk}}{|N_i| + |N_j| - \sum_k A_{ik} A_{jk}} \tag{6.7}$$
There are techniques available for crawling around a node to find a set of most similar nodes from an initial node (Clauset et al., 2004), such as the greedy maximization of local modularity shown in Figure 6.21 (Clauset, 2005).
Fig 6.21 General algorithm for the greedy maximization of local modularity. Adapted from Clauset (2005).
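Two of the node-centric measures above, the Jaccard similarity of neighbor sets and the density used to test for a quasi-clique, reduce to a few set operations. The sketch below is illustrative; the small friendship network and the function names are hypothetical.

```python
def jaccard(neighbors, vi, vj):
    """Shared neighbors of vi and vj divided by all their neighbors."""
    ni, nj = neighbors[vi], neighbors[vj]
    union = ni | nj
    return len(ni & nj) / len(union) if union else 0.0

def gamma_density(nodes, edges):
    """Links present among `nodes` divided by the number of possible links."""
    nodes = set(nodes)
    possible = len(nodes) * (len(nodes) - 1) / 2
    present = sum(1 for u, v in edges if u in nodes and v in nodes)
    return present / possible if possible else 0.0

# Hypothetical neighbor sets.
neighbors = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}
similarity = jaccard(neighbors, "a", "b")   # 2 shared out of 3 distinct neighbors

triangle = [(1, 2), (2, 3), (1, 3)]
path = [(1, 2), (2, 3), (3, 4)]
```

A subgraph whose density reaches 1 is a clique, as with `triangle`; lower values, as with `path`, indicate how far a candidate group is from being fully connected.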
6.5 COMMUNITY CORRELATION VERSUS INFLUENCE
Given a network, we can count the fraction of edges connecting nodes with distinctive attribute values. Then, we compare it with the expected probability of such connections if the attribute and the social connections are independent. If the two quantities are significantly different, we conclude that the attribute is correlated with the network. If the fraction of edges linking nodes in a group with different attribute values is significantly less than the expected probability of random connection in that network, there is evidence of correlation. There exist correlations between behaviors and attributes of adjacent nodes in a social network. Explanations for correlation are homophily (McPherson et al., 2001) and influence, which is discussed in Chapter 7.
6.6 CONCLUSION
This chapter explored the phenomena of rampant changes in networks exemplified by (a) disseminations of preferences, (b) percolation, (c) epidemic (i.e., contagion) of disease, and (d) community compositions. Inspirations have converged from material science in percolation models to epidemiology in epidemic models and mathematical sociology’s contribution to community detection. Community detection in networks is still an evolving front. With very large networks, it is difficult to detect desired communities. We reviewed the most popular methods and algorithms. Much research is ongoing and open problems are many, as networks are growing exponentially and becoming more complex.
REFERENCES
Bass, F., 1969. A new product growth model for consumer durables. Manag Sci 15 (5), 215–227.
Brauer, F., et al., 2008. Mathematical Epidemiology. Lecture Notes in Mathematics/Mathematical Biosciences Subseries. Springer.
Clauset, A., 2005. Finding local community structure in networks. Phys Rev E 72 (2).
Clauset, A., Newman, M., Moore, C., 2004. Finding community structure in very large networks. Phys Rev E 70, 066111.
Cross, R., Parker, A., 2004. The Hidden Power of Social Networks: Understanding How Work Really Gets Done in Organizations. Harvard Business Review Press.
Dodge, K., Malone, P., Lansford, J., Miller, S., Pettit, G., Bates, J., 2010. A Dynamic Cascade Model of the Development of Substance-Use Onset. Wiley-Blackwell.
Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets. Cambridge University Press.
Girvan, M., Newman, M., 2002. Community structure in social and biological networks. Proc Natl Acad Sci U S A 99 (12), 7821–7826.
Golbeck, J., 2013. Analyzing the Social Web. Elsevier.
Granovetter, M., 1978. Threshold models of collective behavior. Am J Sociol 83 (6), 1420–1443.
Grimmett, G., 1999. Percolation. Springer.
Hatna, E., Benenson, I., 2012. The Schelling model of ethnic residential dynamics: beyond the integrated–segregated dichotomy of patterns. J Artif Soc Soc Simulation 15 (1), 1–23.
Hays, J., 2005. Epidemics and Pandemics: Their Impacts on Human History. ABC-CLIO.
Jackson, M., 2008. Social and Economic Networks. Princeton University Press.
Kermack, W., McKendrick, A., 1927. A contribution to the mathematical theory of epidemics. Proc R Soc A Math Phys Eng Sci 115 (772), 700.
McPherson, M., Smith-Lovin, L., Cook, J., 2001. Birds of a feather: homophily in social networks. Annu Rev Sociol 27, 415–444.
Palla, G., Derényi, I., Farkas, I., Vicsek, T., 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818.
Schelling, T., 1971. Dynamic models of segregation. J Math Sociol 1, 143–186.
Tang, L., Liu, H., 2010. Community Detection and Mining in Social Media. Morgan & Claypool.
Vynnycky, E., White, R., 2010. An Introduction to Infectious Disease Modelling. Oxford University Press.
3. Use a percolation model to determine the saturation timeframe of an entire network. (That is, as an analogy, consider the following question: “When does a sponge that is fully soaked begin to wet?”)
4. How can a node-centric community detection method be used to find your long lost friend?
CHAPTER 7
Influence Diffusion and Contagion
By analogy with the spread of infectious diseases that we explored in Chapter 6, novel ideas and word-of-mouth spread in networks lead to viral diffusion of social and economic choices. This chapter explores the phenomena of diffusion and contagion over social forces, which are created by the society that separates from the individual and yet affects the individual. Some examples of social forces are the media, the economy, lifestyles, religion, and ideology. Examples of viral diffusion are changing social values, norms, behaviors, product adoptions, religions, and cultural mindsets. When a node’s choices affect the choices of others, we say that the node has influenced others.
All epidemic and community detection models, including Granovetter’s threshold model (Granovetter, 1978), which we presented in Chapter 6, serve as basic models of influence diffusion and contagion. We do not repeat them in this chapter. The caveat for this chapter is to treat networks as directed graphs, since most often influence is not reciprocal and relationships are not symmetrical.
Another caveat and assumption is that once a node is influenced (i.e., becomes active), it cannot lose that influence (i.e., become inactive). This is the basic assumption in progressive models, with examples such as product adoption, where influence is permanent. Nonprogressive models are used in epistemic contexts, such as opinions and political attitudes, where individuals may change their mind. We present a social learning model in Section 7.2 that is a nonprogressive model.
7.1 STOCHASTIC MODEL
Independent cascade models (ICM) assume a uniform influence probability p ∈ [0, 1] between pairs of nodes in the network (Kempe et al., 2003). Let $S_t$ be the set of nodes that are active at time t. If t = 0, $S_t$ is the seed set from which influence spreads. The model moves forward with successive time steps. At each time step t ≥ 1, $S_t$ is initialized to $S_{t-1}$. For every node v that is not active at t − 1, find all nodes u that may influence v (i.e., incoming edges to v) and execute an activation attempt, which is performed by a Bernoulli trial using probability p. If u’s activation attempt succeeds, add v to $S_t$, that is, node u influences (i.e., activates) node v at time t. If there are multiple nodes that may activate v, the result is the same irrespective of which node activates v. Since activation attempts of nodes are independent, the model is an independent cascade model. At some point, no new activations are possible, and all the nodes are either activated or not. The network then stays the same, and the set of activated nodes is called the final set. We call this situation saturation. An interesting property of ICM is that the gain in diffusion from adding a node to the seed set diminishes as the seed set grows, known as the submodularity principle. Another property of ICM is that enlarging the seed set never shrinks the expected final set, known as the monotonicity principle.
A fixed threshold model is one where each node v has an individual threshold $\theta_v \in [0, 1]$. Our example in Chapter 6 showed how course registration among students was modeled as a fixed threshold model.
A voter model (Clifford and Sudbury, 1973) uses an undirected graph, where each node possesses a binary state, either 0 or 1. Iteratively, at each step, a node v at time t selects a random neighbor u and mimics the state that u held at t − 1. To state things more formally, the states of nodes can be captured by a vector $x_t$ at time t. Each node i’s probability that it is in state 1 (as opposed to being in state 0) is denoted by $x_t(i)$, and $x_0(i)$ is the state of node i at t = 0. Equation 7.1 is the update function for changes in node i’s states. M is the stochastic transition matrix, where $M_{ij} = 1/d_i$ for each neighbor j of i and $d_i$ is the degree of node i. The dynamics of the voter model are summarized in Equation 7.2. When t → ∞, the state ceases to change and $x_t$ approaches a steady state. The voter model has been adapted for use in many recent influence diffusion models (Li et al., 2013).
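The role of the transition matrix M can be illustrated with a short sketch that tracks each node's probability of being in state 1. It assumes the update takes the form of averaging the neighbors' previous probabilities, which is what $M_{ij} = 1/d_i$ implies when a node copies a uniformly chosen neighbor; this reading, the three-node network, and the number of steps are assumptions made for illustration.

```python
def voter_step(x, neighbors):
    """One expected-state update: node i averages its neighbors' previous values."""
    return {i: sum(x[j] for j in nbrs) / len(nbrs) for i, nbrs in neighbors.items()}

def run_voter(x0, neighbors, steps):
    """Apply the update repeatedly and return the final probabilities."""
    x = dict(x0)
    for _ in range(steps):
        x = voter_step(x, neighbors)
    return x

# Hypothetical network: three mutually connected nodes, one starting in state 1.
neighbors = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
x0 = {"a": 1.0, "b": 0.0, "c": 0.0}

after_one = voter_step(x0, neighbors)
steady = run_voter(x0, neighbors, steps=200)
```

In this small network all probabilities settle on the same value, the degree-weighted average of the initial states, which is one way to see what a steady state means here.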
In a Markov random field model, each node of an undirected graph is a random variable $X_v$. According to the Markov property, $X_v$ is independent of nodes that are not its neighbors but does depend on its neighbors, $\tilde{N}_v$. Let $C(\tilde{N}_v)$ be a possible configuration of $\tilde{N}_v$. Given the seed set $S_0$, the probability of $X_v$ is computed by Equation 7.3:
$$P(X_v \mid S_0) = \sum_{\tilde{N}_v \in C(\tilde{N}_v)} P(X_v \mid \tilde{N}_v, S_0) \times \prod_{u \in N(v) - S_0} P(X_u \mid S_0) \tag{7.3}$$
A special stochastic optimization problem is the discovery of a seed set that maximizes influence diffusion. This is known to be an NP-complete problem (Kempe et al., 2003). In the real world, we may not assume that the direct influence and its amount are known. Instead, influence is learned, as we outline in Section 7.2. We review the 1974 model of DeGroot (DeGroot, 1974).
7.2 SOCIAL LEARNING
DeGroot assumes a society of n agents, where individuals possess an initial (t = 0) opinion on a singular subject, represented by a vector of probabilities $p(0) = (p_1(0), \ldots, p_n(0))$, where $p_i(t)$ denotes the probabilistic belief of agent i at time t. Let T be a matrix of interactions among agents, where $T_{ij}$ is the amount of weight (i.e., trust) agent i places on j’s opinion at time t in forming i’s opinion at t + 1. We assume that agent i’s influence (i.e., trust) is limited to the sum of its peers. Therefore, the rows of the T matrix add to 1. This property renders the model stochastic. Probabilities are updated using Equations 7.4–7.6 with the Markov chain property:
$$p(t) = T \times p(t-1) \tag{7.4}$$
$$p(t) = T^{t} \times p(0) \tag{7.5}$$
$$\forall i,\; p_i(\infty) = \lim_{t \to \infty} p_i(t) \tag{7.6}$$
A network N and a matrix T reach consensus if $\forall i, j \in N,\; p_i(\infty) = p_j(\infty)$. Such a matrix T is a steady-state Markov chain. A society is said to be wise when the influence of the most influential agent vanishes in the society (Jackson, 2008). Further discussions of convergence and examples are found in Jackson (2008).
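The repeated update p(t) = T × p(t − 1) is easy to try out numerically. The sketch below uses a hypothetical two-agent trust matrix whose rows add to 1; the matrix entries, the initial beliefs, and the number of rounds are assumptions for illustration.

```python
def degroot(T, p0, steps):
    """Apply p(t) = T x p(t-1) for the given number of steps."""
    n = len(p0)
    p = list(p0)
    for _ in range(steps):
        p = [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p

# Hypothetical trust matrix: agent 0 splits trust evenly between itself and
# agent 1, while agent 1 places most of its trust in itself.
T = [[0.5, 0.5],
     [0.25, 0.75]]
p0 = [1.0, 0.0]

after_one_round = degroot(T, p0, 1)
long_run = degroot(T, p0, 100)
```

In this example both beliefs approach the same limit, so the two agents reach consensus, and the limit leans toward agent 1's initial belief because agent 1 receives more of the total trust.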
7.3 SOCIAL MEDIA INFLUENCE
As people connect via Facebook, Twitter, and LinkedIn, they leave a trail of personal data. Many of these interactions are public, and the connections, updates, and media creations involved can be measured. It is estimated that 43% of the marketing data gathered on people comes from social media. The growth in public data has led to active research in marketing and employment analytics. In the remainder of this section, we review contemporary measures of a node’s influence.
7.3.1 Social Media: A Case for Facebook and Twitter
Eigenvector centrality (see Chapter 2 for details) is possibly the best measure of social media influence (Chiang, 2012). Betweenness centrality (see Chapter 2 for details) is another possible measure of social media influence (Chiang, 2012). Recall that node densities give rise to cluster densities (see Chapter 2 for details). This is the basis of the following contagion theorem:
The whole network flips if and only if there is no cluster of density 1 − p or higher in the set of nonflipped nodes (Chiang, 2012). Here p is the flipping threshold, that is, the fraction of a node’s neighbors that must have flipped before the node itself flips.
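The theorem can be explored with a small threshold simulation. The sketch below is only an illustration under stated assumptions: a node flips once strictly more than a fraction p of its neighbors have flipped (a convention chosen here so that a stalled set has density 1 − p or higher), a cluster's density is the smallest fraction of within-cluster neighbors over its members, and every node has at least one neighbor. The graph and the names `cascade` and `cluster_density` are hypothetical.

```python
def cascade(adj, seeds, p):
    """Run the threshold contagion to a fixed point and return the flipped set."""
    flipped = set(seeds)
    while True:
        newly = {
            v for v, nbrs in adj.items()
            if v not in flipped
            and sum(u in flipped for u in nbrs) > p * len(nbrs)
        }
        if not newly:
            return flipped
        flipped |= newly


def cluster_density(adj, nodes):
    """Smallest fraction of a member's neighbors that lie inside the cluster."""
    nodes = set(nodes)
    return min(sum(u in nodes for u in adj[v]) / len(adj[v]) for v in nodes)


# Hypothetical graph: a path 0-1-2 attached to a triangle {3, 4, 5}.
adj = {
    0: [1],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

stalled = cascade(adj, seeds={0}, p=0.4)   # stops before the triangle
remaining = set(adj) - stalled
complete = cascade(adj, seeds={0}, p=0.3)  # reaches every node
print(stalled, remaining, cluster_density(adj, remaining), complete)
```

With p = 0.4 the cascade stops at nodes 0, 1, and 2, and the remaining triangle has density 2/3, which is at least 1 − p = 0.6. With p = 0.3 no such blocking cluster exists and the whole network flips.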
7.3.2 Klout Score
The Klout Score measures personal influence based on the person’s ability to generate interaction. Every time you create content or engage with online social media, you influence others. The Klout Score uses data from social networks in order to measure the following:
1. How many people you influence (i.e., reach value)
2. How much you influence them (i.e., amplification value)
3. How influential they are (i.e., network score)
7.4 CONCLUSION
Influence is of interest to network scientists as well as social scientists. We reviewed stochastic models and learning models. This is an open area of research, and I make a call to arms for scholarly attention. Although discovery of optimal contagion methods is NP-hard, approximation methods are promising.
REFERENCES
DeGroot, M., 1974. Reaching a consensus. J. Am. Stat. Assoc. 69 (345), 118–121.
Domingos, P., Richardson, M., 2001. Mining the network value of customers. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press, New York, NY, pp. 57–66.
Granovetter, M., 1978. Threshold models of collective behavior. Am. J. Sociol. 83 (6), 1420–1443.
Jackson, M., 2008. Social and Economic Networks. Princeton University Press, Princeton, NJ.
Kempe, D., Kleinberg, J., Tardos, É., 2003. Maximizing the spread of influence through a social network. In: Proceedings of the 2003 Knowledge Discovery and Data Mining Conference. ACM Press, New York, NY.
Li, Y., Chen, W., Wang, Y., Zhang, Z., 2013. Influence diffusion dynamics and influence maximization in social networks with friend and foe relationships. In: Proceedings of the 6th ACM International Conference on Web Search and Data Mining. ACM Press, New York, NY, pp. 657–666.
EXERCISES
3. Develop strategies to apply machine learning techniques (e.g., reinforcement learning) for social learning.
CHAPTER 8
Power in Exchange Networks
Each link between a pair of nodes that provides an opportunity for a transaction between the pair produces a value that can be shared between the pair. This is the premise in exchange networks (Easley and Kleinberg, 2010). […] exchange networks (Blau, 1964). An exchange requires at least two individuals and can be over tangible or intangible entities (i.e., goods and services) with imbalances of utility for participants (Turner, 1998). Blau promoted the idea that social choices are social actions that are counterparts of physical, speech, and epistemological actions (Blau, 1964). In social exchange theory, exchanges yield a value (i.e., worth) that is the rewards minus the costs of the exchange. If the value is positive, the exchange is a desirable relationship; otherwise, the exchange and the corresponding relationship are worthless.
In search of desirable exchanges, a few individuals may lack partners (e.g., remain bachelors) while others may have a plethora of potential neighbors from which to choose. Economic networks such as buyers and sellers of goods and services are well represented by exchange networks, in which links model transactions among them. Sellers who have access to exclusive buyers have power over them to the extent of their exclusivity. Sellers who are exclusive providers of scarce commodities have power over their buyers to the extent of their exclusivity and the scarcity of their goods (i.e., monopoly in economic terms). When buyers are scarce, they have higher power over sellers to the extent of their scarcity (i.e., monopsony in economic terms). In more general terms, dependence among agents derived from their relative positions in the network gives rise to their structural power. There is a reciprocal relationship between potential power and dependence among pairs of agents (Emerson, 1962). Dependence, exclusion, and satiation are reported as underlying determinants of structural power (Easley and Kleinberg, 2010). A step toward formalizing power in an exchange network is the exchange outcome, which consists of two components: (a) proper pairing of exchange partners and (b) the value (i.e., utility) gained by each node from an exchange (Easley and Kleinberg, 2010). […] stability, that is, the production of preferred outcomes that augment the values of nodes engaged in exchange. If there are incentives for changes in exchange partner in order to increase value, there is instability. An exchange network is stable if and only if there are no instabilities (Easley and Kleinberg, 2010). […] used to optimize network stabilities (Easley and Kleinberg, 2010).
Whereas structural power empowers agents with capacities to influence others, it is the agent choices, that is, agent strategic actions, that manifest social power. Agents can invite positive exchanges by dispensing or withholding rewards, thereby wielding power strategies in their exchange networks (Russell, 1938; Molm, 1990).
Individuals often join groups to increase their power. This is captured with the concept of coalitions, for example, economic blocs that may be regional or transnational. For the basics of coalition theory, refer to cooperative game theory in Chapter 3. For a few good examples, the reader may consult Bonacich and Liu (2012). There is much that remains to be explored and codified into mathematical models that are of benefit for network science (Simpson et al., 2014).
8.1 CONCLUSION
Power has been of interest to network scientists and social scientists. We referred to structural and strategic notions of power with historical roots (Russell, 1938). There is a strong connection between power and influence, even though there are scant formal models for qualitative and quantitative analyses (Easley and Kleinberg, 2010). An intuitive direction for progress appears to be models that capture inequalities of resource distribution and access to them in the network. Those who are close to resources may regulate resource access for others in the network. Those who are remote from resources might wish to engage in strategic negotiation or have brokers engage in such bartering.
REFERENCES
Blau, P., 1964. Exchange and Power in Social Life. Wiley, New York.
Bonacich, P., Liu, P., 2012. Introduction to Mathematical Sociology. Princeton University Press, Princeton, NJ.
Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.
Emerson, R., 1962. Power-dependence relations. Am. Sociol. Rev. 27, 31–40.
Molm, L., 1990. Structure, action, and outcomes: the dynamics of power in social exchange. Am. Sociol. Rev. 55 (3), 427–447.
Russell, B., 1938. Power: A New Social Analysis. W.W. Norton & Company.
Simpson, J., Farrell, A., Oriña, M., Rothman, A., 2014. Power and social influence in relationships. In: APA Handbook of Personality and Social Psychology, Vol. 3. American Psychological Association, Washington, DC.
Turner, J., 1998. George C. Homans’ behavioristic approach. In: The Structure of Sociological Theory, 6th edition. Wadsworth (Chapter 20).
EXERCISES
1. Consider a given network with an uneven distribution of resources (e.g., access to oil reserves or a lake view). Model structurally based resource access.
2. How can agents with low resource access fare better with brokers? Design specific broker protocols.
3. People develop patterns of exchange to cope with power differentials and to deal with the costs associated with exercising power. Develop a model of reasoning over power in exchange networks.
CHAPTER 9
Economic Networks
Buyers, sellers, and traders are individuals with an interest in goods and services. These individuals often form the nodes of an exchange network called a market, which is a special type of economic network. An economic network is a type of exchange network in which the links are used for the exchange of commodities and services. These exchanges have direct and indirect economic effects for individuals. Matching markets that interface buyers and sellers are presented in Chapters 10 and 11 of Easley and Kleinberg (2010).
The payoff (i.e., utility) for node j is defined as $v_{ij} - p_i$, where $v_{ij}$ is the value derived by agent j from acquiring agent i’s goods and $p_i$ is the price that is paid to i for the goods. For a set of prices, $P = \{p_1, p_2, \ldots\}$, each buyer wants to maximize his or her payoff. If there is a tie among sellers, a random seller is selected. The buyer rationality principle dictates that buyers purchase only if their payoff is positive. A buyer places a value on each item, and the valuation profile of a buyer is the list of the values that the buyer assigns to the items that are on sale. A buyer’s positive seller set is the set of sellers that provide a positive payoff. Market clearing is achieved when buyers and sellers are perfectly matched so that every item can be sold. There are a few illustrative examples in Chapter 10 of Easley and Kleinberg (2010). There are also a couple of positive results. First, for every set of buyer valuations, there is a set of market clearing prices (Easley and Kleinberg, 2010). Second, for every set of market clearing prices, any resultant perfect matching of buyers yields a maximal total valuation for the items bought from sellers. There is a procedure for constructing market clearing prices that is sometimes attributed to Jenő Egerváry. This procedure is famously known as the Gale-Shapley algorithm and is shown in Figure 9.1. A constricted set is a set of nodes whose edges to the other side of the bipartite graph “constrict” the formation of a perfect matching. Further examples and details are available from Easley and Kleinberg (2010) and will not be duplicated here.
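To make the role of constricted sets concrete, here is a minimal Python sketch of one ascending-price procedure for finding market clearing prices. It is not a transcription of Figure 9.1. It assumes integer valuations, equally many buyers and sellers, and a rule that every buyer links to the sellers maximizing $v_{ij} - p_i$; prices of the sellers adjacent to a constricted set of buyers rise by one unit per round. All function names and the valuation table are illustrative assumptions.

```python
def preferred_sellers(valuations, prices):
    """For each buyer, list the sellers that maximize valuation minus price."""
    graph = []
    for vals in valuations:
        payoffs = [v - p for v, p in zip(vals, prices)]
        best = max(payoffs)
        graph.append([s for s, pay in enumerate(payoffs) if pay == best])
    return graph


def max_matching(graph, n_sellers):
    """Augmenting-path bipartite matching; returns seller -> buyer (or None)."""
    seller_match = [None] * n_sellers

    def try_assign(buyer, seen):
        for s in graph[buyer]:
            if s not in seen:
                seen.add(s)
                if seller_match[s] is None or try_assign(seller_match[s], seen):
                    seller_match[s] = buyer
                    return True
        return False

    for buyer in range(len(graph)):
        try_assign(buyer, set())
    return seller_match


def constricted_neighbors(graph, seller_match):
    """Sellers adjacent to a constricted set grown from an unmatched buyer."""
    matched = {b for b in seller_match if b is not None}
    start = next(b for b in range(len(graph)) if b not in matched)
    buyers, sellers, frontier = {start}, set(), [start]
    while frontier:
        b = frontier.pop()
        for s in graph[b]:
            if s not in sellers:
                sellers.add(s)
                nxt = seller_match[s]  # never None when the matching is maximum
                if nxt not in buyers:
                    buyers.add(nxt)
                    frontier.append(nxt)
    return sellers


def market_clearing_prices(valuations):
    n = len(valuations)
    prices = [0] * n
    while True:
        graph = preferred_sellers(valuations, prices)
        seller_match = max_matching(graph, n)
        if all(b is not None for b in seller_match):
            return prices, seller_match
        for s in constricted_neighbors(graph, seller_match):
            prices[s] += 1  # over-demanded sellers raise their price
        low = min(prices)
        prices = [p - low for p in prices]  # keep the lowest price at zero


# Illustrative valuations: row = buyer, column = seller.
valuations = [[12, 4, 2], [8, 7, 6], [7, 5, 2]]
prices, seller_match = market_clearing_prices(valuations)
print(prices, seller_match)
```

On this table the loop ends with prices 3, 1, and 0, at which point every buyer can be matched to one of his or her preferred sellers, so the market clears.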
Fig. 9.1 Gale-Shapley algorithm for determining market clearing prices.
In more complex markets, there are intermediaries called traders that facilitate an interface between buyers and sellers. The role of traders in a market for agricultural goods between local producers and consumers is illustrated in Chapter 11 of Easley and Kleinberg (2010), and we summarize it here. The flow of goods from sellers to buyers is determined by a game in which traders set prices, and then sellers and buyers react to these prices. Each trader t offers a bid price to each seller i to whom he or she is connected, denoted by $b_{ti}$. This bid price is an offer by t to buy i’s copy of the good at a value of $b_{ti}$. Each trader t offers an ask price to each buyer j to whom he or she is connected. This ask price is denoted by $a_{tj}$ and is an offer by t to sell a copy of the good to buyer j at a value of $a_{tj}$. Once traders announce prices, each seller and buyer chooses at most one trader to deal with. Each seller sells his or her copy of the good to the trader he or she selects, or keeps his or her copy of the good if he or she chooses not to sell it. Each buyer purchases a copy of the good from the trader he or she selects, or receives no copy of the good if he or she does not select a trader. This determines a flow of goods from sellers, through traders, to buyers.
There are incentives for a trader not to produce bid and ask prices that cause more buyers than sellers to accept his or her offers. There are also incentives for a trader not to be caught in the reverse difficulty, with more sellers than buyers accepting his or her offers. A trader’s payoff is the profit he or she makes from all of his or her transactions. It is the sum of the ask prices of his or her accepted offers to buyers, minus the sum of the bid prices of his or her accepted offers to sellers. For a seller i, the payoff from selecting trader t is $b_{ti}$, while the payoff from selecting no trader is $v_i$. If the exchange is actualized, the seller receives $b_{ti}$ units of money. Otherwise, he or she keeps his or her copy of the good, which he or she values at $v_i$. For each buyer j, the payoff from selecting trader t is $v_j - a_{tj}$, while the payoff from selecting no trader is zero. With a completed exchange, the buyer receives the good but gives up $a_{tj}$ units of money. The equilibrium is based on a set of strategies such that each player chooses the best response to what all the other players do. In this simple scenario, it turns out that traders can make only zero profit, for reasons based more on the global structure of the network than on direct competition with any one trader.
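The payoff definitions above can be checked on a toy instance. The sketch below is only illustrative: the tie-breaking rule (a seller or buyer trades only when doing so is strictly better than not trading), the integer prices, and the names `seller_choice`, `buyer_choice`, and `trader_profits` are assumptions made here. It also ignores the penalty a trader would face for selling more copies than he or she bought.

```python
def seller_choice(bids, value):
    """bids maps trader -> bid price; returns (chosen trader or None, seller payoff)."""
    if bids:
        trader = max(bids, key=bids.get)
        if bids[trader] > value:
            return trader, bids[trader]
    return None, value  # keeps the good, valued at `value`


def buyer_choice(asks, value):
    """asks maps trader -> ask price; returns (chosen trader or None, buyer payoff)."""
    if asks:
        trader = min(asks, key=asks.get)
        if value - asks[trader] > 0:
            return trader, value - asks[trader]
    return None, 0  # no purchase, zero payoff


def trader_profits(bids_by_seller, asks_by_buyer, seller_values, buyer_values):
    """Accepted ask prices minus accepted bid prices, per trader."""
    profit = {}
    for seller, bids in bids_by_seller.items():
        trader, _ = seller_choice(bids, seller_values[seller])
        if trader is not None:
            profit[trader] = profit.get(trader, 0) - bids[trader]
    for buyer, asks in asks_by_buyer.items():
        trader, _ = buyer_choice(asks, buyer_values[buyer])
        if trader is not None:
            profit[trader] = profit.get(trader, 0) + asks[trader]
    return profit


# Hypothetical market: one seller and one buyer, both connected to traders T1 and T2.
bids_by_seller = {"s1": {"T1": 2, "T2": 3}}
asks_by_buyer = {"b1": {"T1": 8, "T2": 7}}
seller_values = {"s1": 0}
buyer_values = {"b1": 10}

profits = trader_profits(bids_by_seller, asks_by_buyer, seller_values, buyer_values)
print(profits)
```

Here T2 offers both the better bid and the better ask, so it handles the whole transaction and earns the spread of 4, while T1 trades with no one. Competition of this kind between traders is what pushes the spread down.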
In the following section, we briefly outline the effects of local economic exchanges on the broader economic network.
9.1 NETWORK EFFECTS
In an economic network, an individual’s choices have direct and indirect effects on others, and vice versa, in what economists call an externality. For example, a firm’s profit and a consumer’s utility are proportional to the numbers of producers and consumers using the same technology. A famous model of externality is telecommunication subscription (Rohlfs, 1974). Let potential subscribers be indexed by x, where 0 ≤ x ≤ 1. Consumers indexed by low values of x value the subscription highly, whereas consumers indexed by x close to 1 place a low valuation on this service; p denotes the subscription fee, and $q^e$ is the expected total number of subscribers. The expected utility of a potential subscriber indexed by x ∈ [0, 1] is computed with Equation 9.1.
The parameter a > 0 measures the intensity of network effects. Higher values of a indicate that consumers place a higher value on the ability to communicate with the $q^e$ subscribers. In contrast, a = 0 implies that there are no network effects. The parameter b > 0 captures the degree of consumers’ heterogeneity with respect to their benefit from this service.
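As a rough numerical illustration of the subscription utility of Equation 9.1, which is $(1 - bx)\,a\,q^e - p$ for a subscriber and 0 otherwise, the sketch below evaluates the utility and solves for the consumer who is exactly indifferent. The function names and parameter values are hypothetical, and the closed form for the indifferent consumer is simply the utility expression set to zero and solved for x.

```python
def utility(x, a, b, p, q_e, subscribes=True):
    """Utility of consumer x in the subscription model; zero without a subscription."""
    if not subscribes:
        return 0.0
    return (1 - b * x) * a * q_e - p


def marginal_subscriber(a, b, p, q_e):
    """Index x where subscribing yields zero utility, clipped to [0, 1]."""
    if a <= 0 or b <= 0:
        raise ValueError("this sketch assumes a > 0 and b > 0")
    if q_e <= 0:
        return 0.0  # nobody expected to subscribe, so the service is worthless
    x_hat = (1 - p / (a * q_e)) / b
    return min(1.0, max(0.0, x_hat))


# Hypothetical parameter values.
a, b, p, q_e = 2.0, 1.0, 0.5, 0.5

x_hat = marginal_subscriber(a, b, p, q_e)
print(utility(0.0, a, b, p, q_e), x_hat, utility(x_hat, a, b, p, q_e))
```

With these numbers the most eager consumer (x = 0) gains 0.5 from subscribing, and consumers with an index above 0.5 would lose by subscribing. Raising `a` or the expected size `q_e` moves the indifferent consumer to the right, which is the network effect described above.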
$$U(x) = \begin{cases} (1 - bx)\,a\,q^e - p, & \text{if he or she subscribes} \\ 0, & \text{if he or she does not subscribe} \end{cases} \tag{9.1}$$

Externalities exist in limited settings. Consider the two-player game shown in Figure 9.2, where each player chooses between adopting and avoiding a change, which yields four strategy profiles. In this game, a, b, d, and g denote the payoffs of Figure 9.2 rather than the parameters of Equation 9.1. If a > d and b > g, each player earns a higher payoff by making the same choice as the other player; therefore, network externalities exist.
The game has two Nash equilibria, (adopt, adopt) and (avoid, avoid). If a > b, the outcome (adopt, adopt) Pareto dominates the outcome (avoid, avoid); the condition in which players nevertheless remain at (avoid, avoid) is known as excess inertia (Farrell and Saloner, 1985). Conversely, if b > a, the outcome (avoid, avoid) Pareto dominates the outcome (adopt, adopt), and the corresponding condition is called excess momentum. However, neither condition is a subgame perfect equilibrium with sequential decisions. A similar analysis is possible with competition in prices (i.e., Bertrand games) or in quantities (i.e., Cournot games).
[…] create externalities for buyers and sellers (Rysman, 2009). When customers are identified with components, the externality is considered direct. However, in most economic networks, the externality is only indirect. Further information is available from the NYU online site maintained by Dr. Nicholas Economides.
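Returning to the adoption game, the equilibrium claims can be verified by brute force. Because Figure 9.2 is not reproduced here, the payoff layout below is an assumption: both players earn a at (adopt, adopt) and b at (avoid, avoid), a player who avoids while the other adopts earns d, and a player who adopts alone earns g. The numbers and the function name `pure_nash` are hypothetical.

```python
from itertools import product


def pure_nash(payoffs):
    """Return the pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row strategy, column strategy) -> (row payoff, column payoff).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in cols)
        if row_ok and col_ok:  # neither player gains by deviating alone
            equilibria.append((r, c))
    return equilibria


# Hypothetical payoffs satisfying a > d and b > g (and a > b).
a, b, d, g = 4, 3, 1, 0
game = {
    ("adopt", "adopt"): (a, a),
    ("avoid", "avoid"): (b, b),
    ("avoid", "adopt"): (d, g),  # row avoids, column adopts alone
    ("adopt", "avoid"): (g, d),  # row adopts alone, column avoids
}

equilibria = pure_nash(game)
print(equilibria)
```

Under this assumed layout the two coordinated profiles are the only pure equilibria, and since a > b here, (adopt, adopt) gives both players more than (avoid, avoid).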
9.2 CONCLUSION
Whether with our friends, colleagues, or strangers, economic networks are common exchange networks in our daily lives. There are clear applications of economic networks in advertising space, for example, selling Google ad space (Chiang, 2012). There are also potential applications for policy analysis. In order to understand the range and scope of the impacts of a new or changed policy, network effects need to be explored.
Fig. 9.2 An economic decision game payoff bimatrix for adopting a new technology.
REFERENCES
Chiang, M., 2012. Networked Life. Cambridge University Press.
Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.
Farrell, J., Saloner, G., 1985. Standardization, compatibility, and innovation. Rand J. Econ. 16 (1), 70–83.
Jackson, M., 2008. Social and Economic Networks. Princeton University Press, Princeton, NJ.
Rohlfs, J., 1974. A theory of interdependent demand for a communications service. Bell J. Econ. Manag. Sci. 5, 16–37.
Rysman, M., 2009. The economics of two-sided markets. J. Econ. Perspect. 23 (3), 125–143.
EXERCISES
1. Develop an example of a market for ad space where advertisers are buyers and online media, such as eBay and Google, are sellers of ad space.
2. Discuss consequential externalities of social media proliferation on the global economy.
CHAPTER 10
Network Capital
Networks are storehouses for a variety of capital that has value and worth. In economic terms, capital is money (Marx, 1967). Physical capital consists of tools and resources (Economy and Levi, 2014). These might be common tools or raw materials such as oil and gas, which are commonly accessible to a group. Cultural capital consists of language, customs, and lifestyle choices (Beemyn, 2014). Whereas diversity promotes cultural heterogeneity, social conventions promote conformity. Political capital is largely a measure of power to persuade others (Burgmann, 2014). Democracy and human rights serve as example areas, where social forces promote them from the West to the rest of the world. Social capital (SC) is an intangible capacity to manage and change relationships, such as with interpersonal trust, made famous by the works of Pierre Bourdieu, James Coleman, and Robert Putnam.
In the macroscopic perspective, SC for the entire network is considered. In this view, individuals do not incrementally add to the system or withdraw units of SC. Instead, the areas of interest are the system principles such as norms and conventions that provide resources for the overall social welfare. In contrast, the microscopic perspective adopted here explores how individuals can gain access to resources by their positions and connections in the network.
Physical goods and services are often scattered unevenly in world networks. Neighborhoods, communities, and high-status nodes are similarly loosely distributed in networks. Nodes of a network often desire capital that may not be locally available to the node; thereby, they may take action to do one of the following: gain access to capital (e.g., gasoline), exert influence over nodes possessing capital (e.g., conserve energy usage), have power over capital (e.g., limit tuna fishing), and have control over the flow of capital (e.g., nonproliferation of nuclear material).
actions over them