
Computational Network Science: An Algorithmic Approach, Part 2


Until now, studies in network science have been focused on particular relationships that require varied and sometimes incompatible datasets, which has kept it from being a truly universal discipline. This new approach would remove the need for tedious human-based analysis of different datasets and help researchers spend more time on the qualitative aspects of network science research.


CHAPTER 6

Diffusion and Contagion

This chapter explores the phenomena of rampant changes in networks exemplified by (a) dissemination of preferences, (b) percolation, (c) epidemic (i.e., contagion) of disease, and (d) community compositions. The rest of this chapter reviews these four categories in order.

6.1 POPULATION PREFERENCE SPREAD

[...] earliest formally studied report of rampant changes in networks. Schelling pointed out that a small preference for one’s neighbors to be of the same ethnicity leads to a widespread segregation emergent in the network.

He used coins on a patch of graph paper to demonstrate this theory by placing pennies and nickels in different patterns on the cells. Coins were moved one by one if they were in an unsatisfactory composition. For every colored cell, if more than 33% of the adjacent cells were of a different color, the cell would move to another randomly selected cell. You can try the model out using Chris Cook’s online demonstration program or the NetLogo model. Further details and interesting emergent patterns that arise are available from Hatna and Benenson (2012).
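The coin-moving rule described above can be sketched in a few lines of Python. This is a minimal illustration, not Schelling's exact procedure: the grid size, the share of empty cells, the 8-cell neighborhood, and the rule that an unhappy coin jumps to a random empty cell are all assumptions made here, and the names `make_grid`, `unhappy`, and `run_schelling` are hypothetical.

```python
import random

def make_grid(n, empty_fraction, rng):
    """Random n x n grid of pennies ("P"), nickels ("N"), and empty cells (None)."""
    return [[None if rng.random() < empty_fraction else rng.choice("PN")
             for _ in range(n)] for _ in range(n)]

def unhappy(grid, r, c, tolerance=0.33):
    """True if more than `tolerance` of the occupied adjacent cells hold the other coin."""
    n = len(grid)
    same = different = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < n and 0 <= cc < n and grid[rr][cc] is not None:
                if grid[rr][cc] == grid[r][c]:
                    same += 1
                else:
                    different += 1
    total = same + different
    return total > 0 and different / total > tolerance

def run_schelling(grid, tolerance, max_sweeps, rng):
    """Move unhappy coins one by one to random empty cells; returns sweeps used."""
    n = len(grid)
    for sweep in range(max_sweeps):
        movers = [(r, c) for r in range(n) for c in range(n)
                  if grid[r][c] is not None and unhappy(grid, r, c, tolerance)]
        if not movers:
            return sweep
        rng.shuffle(movers)
        for r, c in movers:
            empties = [(i, j) for i in range(n) for j in range(n) if grid[i][j] is None]
            # Earlier moves may already have made this coin content.
            if not empties or not unhappy(grid, r, c, tolerance):
                continue
            i, j = rng.choice(empties)
            grid[i][j], grid[r][c] = grid[r][c], None
    return max_sweeps

if __name__ == "__main__":
    rng = random.Random(0)
    grid = make_grid(20, 0.2, rng)
    print("sweeps used:", run_schelling(grid, 0.33, 100, rng))
```

Printing the grid before and after a run is a simple way to see whether like coins have clustered.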

[...] occurs. Rapid spread of diseases through connections in populations (e.g., obesity) is an epidemic or a pandemic (i.e., an epidemic across borders) event (Hays, 2005). Topologies of networks may hinder or promote these events. Whereas high-distance networks inhibit epidemics, low-distance networks (e.g., a scale-free network) accelerate them.

[...] probability for the existence of a tie for each individual with another person in the network. Let us ignore the repeated connections. With a small value of p and a large value of N, pN represents the probability of an individual’s indirect connection with others in the network, that is, the second-hand tie through the individual’s primary tie. The expected (i.e., average) number of individuals that can be reached is shown by Equation 6.1.

If pN = 0.5, the ratio in Equation 6.1 yields the value of 2 and no epidemic results. This value of pN is below the threshold needed for the epidemic onset. With p = 0.0006 and N = 2500, pN = 1.5 is larger than 1.0 (i.e., above the epidemic threshold), and the epidemic is a sigmoidal logistic curve shown in Figure 6.1.

[...] F(t) is the fraction of agents in a society who have adopted a new product or behavior by time t; p is the rate of innovation and q is the rate of imitation. χ(t) is a function of percentage change in price and other variables.

$$\frac{dF(t)/dt}{1 - F(t)} = \left[\, p + q\,F(t) \,\right] \chi(t) \qquad (6.2)$$
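To see how innovation (p) and imitation (q) combine into an adoption curve, the sketch below integrates the adoption fraction F(t) with a simple Euler step. It assumes the generalized Bass form, in which the adoption rate dF/dt equals [p + qF(t)] [1 − F(t)] χ(t); the function name `bass_curve`, the step size, and the default χ(t) = 1 are illustrative choices, not taken from the text.

```python
def bass_curve(p, q, steps, dt=1.0, chi=lambda t: 1.0):
    """Euler integration of dF/dt = (p + q*F) * (1 - F) * chi(t), starting at F = 0."""
    F = 0.0
    curve = [F]
    for k in range(steps):
        t = k * dt
        F = min(1.0, F + (p + q * F) * (1.0 - F) * chi(t) * dt)
        curve.append(F)
    return curve

if __name__ == "__main__":
    # p and q here are illustrative values, not estimates from the chapter.
    for t, F in enumerate(bass_curve(p=0.03, q=0.38, steps=20)):
        print(t, round(F, 3))
```

With a small p and a larger q the printed values trace an S-shaped curve: slow initial uptake driven by innovators, then rapid growth driven by imitation, then saturation.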


6.2 PERCOLATION MODEL

In physics and mathematics, percolation theory describes the behavior of clustered components in random networks (Grimmett, 1999). The common intuition is the movement and filtering of fluids through porous materials, for example, filtration of water through soil and permeable rocks. In a network, let each node be a cell through which a fluid-like substance may transit to other cells. A network (i.e., a grid) then is a sponge-like substance, and percolation is the determination of whether a substance introduced at one cell will reach the other side of the network (or grid). There have been many applications of percolation, such as the analysis of forest fires, bank failures, and rumor spread. You may explore an online implementation of the NetLogo percolation model.

Commonly, a cell’s transmission rate is modeled as a probability value p. As shown in Figure 6.2, at a certain value of p, percolation is achieved.
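The threshold behavior in p can be explored with a small Monte Carlo experiment. The sketch below is one possible reading of the grid model, under stated assumptions: each cell is independently open with probability p, the substance enters along the top row, and it moves between open cells that share a side. The function names and the trial count are hypothetical.

```python
import random
from collections import deque

def percolates(n, p, rng):
    """True if open cells connect the top row to the bottom row of an n x n grid."""
    open_cell = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    queue = deque((0, c) for c in range(n) if open_cell[0][c])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True
        for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= rr < n and 0 <= cc < n and open_cell[rr][cc] and (rr, cc) not in seen:
                seen.add((rr, cc))
                queue.append((rr, cc))
    return False

def percolation_probability(n, p, trials=200, seed=0):
    """Fraction of random grids in which the substance reaches the other side."""
    rng = random.Random(seed)
    return sum(percolates(n, p, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for p in (0.3, 0.5, 0.6, 0.7, 0.9):
        print(p, percolation_probability(20, p))
```

Sweeping p from low to high values shows the estimated probability rising sharply over a narrow range rather than growing linearly.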

6.3 DISEASE EPIDEMIC MODELS

Although the original inspiration is disease epidemics, we treat the topic as generically applicable to the epidemic spread of any phenomenon. The earliest model is the susceptible, infected, susceptible (SIS) model, documented in [...] a fixed population N at time t that is divided into three camps: susceptible, infected, and removed. The susceptible camp, denoted by S(t), consists of individuals who are not yet infected at time t. The infected camp, denoted by I(t), consists of individuals who have already been infected and are capable of transmitting the epidemic to the individuals in the susceptible category. The recovered camp, denoted by R(t), consists of individuals with previous infection who have been removed due to either immunization or termination (i.e., death). The removed individuals are not able to get infected again or to transmit the epidemic to others.

Fig. 6.2 Percolation model.

$$N(t) = S(t) + I(t) + R(t) \qquad (6.3)$$

All individuals are considered to have an equal probability of contact (i.e., infection), denoted by β. At each time step t, each person may infect β × N others with equal probability. The fraction of contacts between an infected individual and a susceptible one is S(t)/N(t). Therefore, the new infection rate is computed by

$$\beta \times N \times \frac{S(t)}{N(t)} \times I(t) = \beta \times S(t) \times I(t)$$

The population leaving the susceptible camp is equal to the number entering the infected camp. Meanwhile, a number of individuals equal to a fraction of the infected individuals leave this class per unit time and enter the removed class (i.e., the law of mass action). Let γ denote the combined mean recovery and death rates. It is assumed that infection and recovery are much faster than the timescale of births and deaths, so births and deaths are overlooked in the model. Therefore, the rate of infection change is

$$\frac{dI(t)}{dt} = \beta \times S(t) \times I(t) - \gamma \times I(t)$$

and the rate of recovery is

$$\frac{dR(t)}{dt} = \gamma \times I(t)$$

There have been many variations to the SIS model, including SIR and models that account for births and deaths in mathematical epidemiology (Brauer et al., 2008), with models available from Vynnycky and White (2010). Another good source of discussion of these models is Jackson (2008).
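The three camps and their rates of change can be traced numerically. The sketch below applies a plain Euler step to dI/dt = βSI − γI and dR/dt = γI, together with dS/dt = −βSI, which follows from the statement that everyone leaving the susceptible camp enters the infected camp. The function name and every parameter value are illustrative assumptions, not figures from the text.

```python
def simulate_sir(s0, i0, r0, beta, gamma, dt, steps):
    """Euler integration of the susceptible/infected/removed rates of change."""
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # leave S, enter I
        new_removals = gamma * i * dt        # leave I, enter R
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history

if __name__ == "__main__":
    # Hypothetical population of 1000 with 10 initial infections.
    run = simulate_sir(990, 10, 0, beta=0.0005, gamma=0.1, dt=0.1, steps=1000)
    for step in range(0, 1001, 100):
        s, i, r = run[step]
        print(step, round(s, 1), round(i, 1), round(r, 1))
```

Because each step moves individuals between camps without creating or removing any, the total S + I + R stays at N throughout the run.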

Whether epidemically grown or not, a community is a group of closely related and connected entities with similar attributes; it can also be a part of a large network with groups of communities. Entities in one community can interact with other communities. A good community will interact less with the entities of an outside community and more with those of the inside community. Parts of networks form groups called clusters. These clusters can be considered to be communities. Next, we outline processes for community detection.

6.4 COMMUNITY DETECTION

In order to motivate a community, let us consider a project team in a firm where the team is a community and the team members have frequent interactions. There might be interfaces from one project to another. For example, finance and control project teams interact with material handling, sales, and distribution in a factory. Although projects are interconnected, they are clearly identifiable communities as shown in Figure 6.3.

Fig. 6.3 IT organization of a factory.

We begin with a set of global strategies and then turn to local communities. We introduce graph partitioning, where we divide the graph into parts with a minimal number of links between them. The number of links running between two clusters is called the cut size. If we keep the network in one group, that is, undivided, the cut size is 0. In partitioning, it might be desirable to specify the least number of groups and the target group size, even though communities cannot be divided in an optimal way. For example, consider a sports club with 12 players. Six of them play basketball and six of them play football. We would like to divide the entire sports network into communities (i.e., clusters) based on how closely players interact with one another. All players who play similar sports interact with one another closely and they are friends. There are also a few footballers who interact with basketballers. Figure 6.4 illustrates the interactions among footballers and basketballers. The best graph partitioning method is bipartition, which divides the graph into two clusters of equal size and minimal cut size. In this graph, it is not possible to have a bipartition with a cut size smaller than 3. Therefore, this method is optimal for dividing clusters in our graph.
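Cut size and bipartition can be checked by brute force on a graph of this size. The sketch below builds a hypothetical 12-player club (not the actual graph of Figure 6.4): every pair of players within the same sport is linked, and three cross-sport friendships are assumed so that the smallest possible cut is 3. The names `cut_size` and `best_bisection` are illustrative.

```python
from itertools import combinations

def cut_size(edges, group_a):
    """Number of links with exactly one endpoint inside group_a."""
    return sum(1 for u, v in edges if (u in group_a) != (v in group_a))

def best_bisection(nodes, edges):
    """Try every split into two equal halves and keep the one with minimal cut size."""
    nodes = sorted(nodes)
    best = None
    for group in combinations(nodes, len(nodes) // 2):
        size = cut_size(edges, set(group))
        if best is None or size < best[0]:
            best = (size, set(group))
    return best

# Hypothetical club: F1..F6 play football, B1..B6 play basketball.
footballers = [f"F{k}" for k in range(1, 7)]
basketballers = [f"B{k}" for k in range(1, 7)]
edges = list(combinations(footballers, 2)) + list(combinations(basketballers, 2))
edges += [("F1", "B1"), ("F2", "B2"), ("F3", "B3")]  # assumed cross-sport friendships

if __name__ == "__main__":
    size, group = best_bisection(footballers + basketballers, edges)
    print(size, sorted(group))
```

Exhaustive search is only feasible for tiny graphs (924 candidate splits here); it is used to make the definition concrete, not as a practical partitioning method.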

6.4.1 Spectral Clustering

The spectral graph clustering technique is used to determine the number of clusters in large networks. Spectral partitioning is based on the Laplacian matrix (L) and finding its eigenvalues and eigenvectors.

The adjacency matrix is A = [W_ij], where W_ij is the edge weight between vertices x_i and x_j. If there is a link between nodes i and j, W_ij = 1; otherwise, W_ij = 0. The Laplacian matrix is L = D − A, where D is the diagonal matrix of node degrees. We illustrate with the simple example shown in Figure 6.5.

Fig. 6.4 Two communities of players in sporting club.

Fig. 6.5 The graph G(9, 15) to be analyzed for spectral partitioning.

For each node, the value of D is computed based on how many edges are linked to that node. For example, for node 1, there are three edges connected from nodes 2, 3, and 4. Therefore, the degree of node 1 is 3.

$$L = D - A = \begin{pmatrix}
3 & -1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \\
-1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & -1 & 3 & -1 & 0 & 0 & 0 & 0 & 0 \\
-1 & 0 & -1 & 4 & -1 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 4 & -1 & -1 & -1 & 0 \\
0 & 0 & 0 & -1 & -1 & 4 & -1 & -1 & 0 \\
0 & 0 & 0 & 0 & -1 & -1 & 4 & -1 & -1 \\
0 & 0 & 0 & 0 & -1 & -1 & -1 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 1
\end{pmatrix} \qquad (6.4)$$

From the Laplacian matrix shown in Equation 6.4, we compute the Fiedler vector (S) based on eigenvalues and eigenvectors. The Fiedler vector has both positive and negative components and their sum must be 0.

$$S = \begin{pmatrix}
0.33 & -0.38 \\
0.33 & -0.48 \\
0.33 & -0.38 \\
0.33 & -0.12 \\
0.33 & 0.16 \\
0.33 & 0.16 \\
0.33 & 0.30 \\
0.33 & 0.24 \\
0.33 & 0.51
\end{pmatrix} \qquad (6.5)$$

From the matrix S shown in Equation 6.5, we identify two communities, where all the positive values form one cluster and the negative values form another cluster (Figure 6.6). In this example, the two communities (i.e., clusters) are {1, 2, 3, 4} and {5, 6, 7, 8, 9}.

Fig. 6.6 Two communities derived from spectral partitioning technique.
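The sign-based split can be reproduced with the standard library alone. The sketch below reads the edge list off the Laplacian in Equation 6.4 and approximates the Fiedler vector by shifted power iteration with the constant component removed. This is a substitute for a full eigen-decomposition, chosen only to avoid external dependencies; the function names, the shift, and the iteration count are assumptions.

```python
# Edges read off the off-diagonal -1 entries of the Laplacian in Equation 6.4.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]

def laplacian(n, edges):
    """L = D - A for an undirected graph on nodes 1..n."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = u - 1, v - 1
        L[i][j] -= 1
        L[j][i] -= 1
        L[i][i] += 1
        L[j][j] += 1
    return L

def fiedler_vector(L, iterations=5000):
    """Approximate the eigenvector of the smallest nonzero eigenvalue of L."""
    n = len(L)
    shift = 2 * max(L[i][i] for i in range(n))  # upper bound on the largest eigenvalue
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # project out the constant eigenvector
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x

def spectral_bipartition(n, edges):
    """Split nodes by the sign of their Fiedler vector component."""
    vec = fiedler_vector(laplacian(n, edges))
    negative = {k + 1 for k, v in enumerate(vec) if v < 0}
    positive = {k + 1 for k, v in enumerate(vec) if v >= 0}
    return negative, positive

if __name__ == "__main__":
    print(spectral_bipartition(9, EDGES))
```

The overall sign of an eigenvector is arbitrary, so only the grouping of nodes matters, not which group is labeled positive.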


6.4.2 Hierarchical Clustering

Hierarchical clustering is the most popular and widely used method to analyze social network data. In this method, nodes are compared with one another based on their similarity. Larger groups are built by joining groups of nodes based on their similarity. A criterion is introduced to compare nodes based on their relationship. There are two types of hierarchical clustering approaches:

1. Agglomerative approach: This method is also called a bottom-up approach, shown in Figure 6.7. In this method, each node represents a single cluster at the beginning; eventually, nodes start merging based on their similarities until all nodes belong to the same cluster.

2. Divisive approach: This method is also called a top-down approach. Initially, all nodes belong to the same cluster; eventually, each node forms its own cluster. The divisive approach is less widely used due to its complexity compared with the agglomerative approach.

The final result for both approaches is represented as a dendrogram, shown in Figure 6.8.

Consider the distances between four Illinois towns, namely Carbondale, Peoria, Springfield, and Bloomington, shown in Figure 6.9. We can observe that Bloomington and Peoria are the two closest cities, and we join them using the hierarchical clustering algorithm. Figure 6.10 shows the four towns on the map. The distances among Peoria, Bloomington, and Springfield are shorter and nearly identical, at 73 and 71 miles. The final dendrogram is shown in Figure 6.11.
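The bottom-up merging in the towns example can be sketched as follows. The code uses single linkage (the distance between two clusters is the smallest distance between their members), which is one common choice rather than necessarily the one behind Figure 6.7. The mileages are illustrative placeholders, not the values of Figure 6.9; they only preserve the fact that Bloomington and Peoria are the closest pair.

```python
from itertools import combinations

def cluster_distance(a, b, dist):
    """Single linkage: smallest pairwise distance between two clusters."""
    return min(dist[frozenset((x, y))] for x in a for y in b)

def agglomerate(names, dist):
    """Repeatedly merge the two closest clusters; returns the merge history."""
    clusters = [frozenset([name]) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(pair[0], pair[1], dist))
        merges.append((sorted(a), sorted(b), cluster_distance(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Placeholder distances in miles (assumed for illustration only).
TOWNS = ["Carbondale", "Peoria", "Springfield", "Bloomington"]
DIST = {
    frozenset(("Bloomington", "Peoria")): 40,
    frozenset(("Bloomington", "Springfield")): 65,
    frozenset(("Peoria", "Springfield")): 73,
    frozenset(("Carbondale", "Springfield")): 170,
    frozenset(("Carbondale", "Bloomington")): 230,
    frozenset(("Carbondale", "Peoria")): 240,
}

if __name__ == "__main__":
    for left, right, d in agglomerate(TOWNS, DIST):
        print(left, "+", right, "at", d)
```

The merge history is exactly the information a dendrogram draws: each row is one join, and the distance gives the height of that join.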

There is a problem with graph partitioning: we need to specify the number and the size of the desired clusters. If a network is new and large, we do not have any idea about the number of clusters and how big they must be. Hierarchical clustering also has a shortcoming. If we cut the hierarchical tree at any level, we produce a good partition, but we end up with n − 1 partitions. If the network has 1 million nodes (n), we get 1 million minus 1 partitions. Many partitions are recovered, from which we need to identify the best one. We cannot use 1 million partitions; we must find additional criteria to decide which partition is the best one. Girvan and Newman (2002) offer an algorithm to solve the problems with spectral methods. It is based on the divisive method and hierarchical clustering. The divisive method repeatedly identifies and removes edges connecting densely connected regions. It uses edge betweenness, that is, the number of shortest paths passing through an edge, to identify the edges to remove. It thereby removes the links that connect clusters. The algorithm shown in Figure 6.12 is designed to remove edges in decreasing order of betweenness. An example network is shown in Figure 6.13.

Fig. 6.7 Algorithm for agglomerative hierarchical clustering.

Fig. 6.8 A dendrogram example for hierarchical clustering approach.

Fig. 6.9 Distances between four Illinois towns.

Fig. 6.10 Hierarchical clustering of example towns shown on a map.

Fig. 6.11 The final dendrogram for the towns’ example.
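The edge-removal idea can be illustrated with a compact, dependency-free sketch. It computes edge betweenness with a breadth-first accumulation in the style of Brandes, then removes the highest-betweenness edge until the graph falls into more components. It is a simplified reading of the approach, not a transcription of Figure 6.12, and the test graph (two triangles joined by one bridge) is hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Number of shortest paths through each undirected edge (ties split fractionally)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, order = {s: 0}, []
        sigma = {v: 0.0 for v in adj}      # count of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):          # farthest nodes first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return {e: b / 2.0 for e, b in score.items()}  # each pair was counted from both ends

def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts

def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the number of components increases."""
    adj = {v: set(ns) for v, ns in adj.items()}
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical graph: triangles {1, 2, 3} and {4, 5, 6} joined by the bridge 3-4.
GRAPH = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}

if __name__ == "__main__":
    print(girvan_newman_split(GRAPH))
```

Betweenness is recomputed after every removal because deleting one edge reroutes shortest paths and changes the scores of the remaining edges.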

6.4.3 Cascade Model

Information cascade models replicate the effects when individuals abandon their own information in favor of inferences based on other people’s behaviors and opinions.

The cascade models presuppose that individuals have a finite action set. Individuals make rational, sequential decisions purely based on publicly observable information about others. Common examples are neighborhood rivalry and substance abuse (Dodge et al., 2010; Easley and Kleinberg, 2010).

Fig. 6.12 Girvan and Newman’s algorithm.

Fig. 6.13 An example network illustrating Girvan and Newman’s algorithm.

[...] that are initially active and then spread their influence to other nodes. A more in-depth discussion of influence is deferred to Chapter 7. Let us consider a weighted graph, where each edge has a weight (i.e., tie strength) and each node carries a threshold value. There are a few nodes that are already active. An inactive node becomes active when the sum of the weights on its incoming edges from active neighbors reaches the threshold value of that node. Linear threshold models (LTM) were first introduced by Granovetter (Granovetter, 1978). In LTMs, each node v randomly chooses a threshold $\theta_v$ from a uniform distribution in the interval between 0 and 1. Let $\theta_v$ denote the fraction of neighbors of v that must be active in order to activate v. Let a neighbor w be able to influence node v with weight $b_{w,v}$. We assume that $\sum_w b_{w,v} \le 1$. Each node is activated only if it satisfies Equation 6.6. The process of activations proceeds in successive activation stages until no further activations are possible.

$$\sum_{w \,\text{active neighbor of}\, v} b_{w,v} \ge \theta_v \qquad (6.6)$$

To illustrate, let us look at influence among students for signing on to a course, shown in Figure 6.14. A few variations to this model are discussed in Golbeck (2013).

In the network of Figure 6.14, there are nine graduate students labeled from A to I. The node value denotes the threshold value for that node, in this case aversion to taking a specific course. Students interact with one another and influence their peers in making course registration decisions. The edge value represents the weight of influence between a pair of peers. The graph illustrates how students are influenced by their peers in registering for the course. Student A is already registered for the course. The rest of the students are not registered yet. E is influenced by A and D. However, D is not active, so E is influenced by A only. E’s threshold value is 0, which is less than the influence (edge) value of 0.2 from A. So E registers for the course and becomes active in the network. Subsequently, C is influenced by A and E as they are already active in the network, and the sum of their incoming edge influence values is 0.3 + 0.3 = 0.6, which is greater than the threshold value 0.5; therefore, C becomes active. Then, H is influenced by C and E. The sum of incoming edge values is 0.4 + 0.2 = 0.6, which is equal to the threshold value 0.6; thus, H becomes active and registers for the course. Next, I is influenced by H; as the incoming edge value 0.4 is equal to the threshold value 0.4, I also becomes active. The threshold values of nodes B, D, F, and G are greater than their incoming active edge values, so these students do not register for the course and they remain inactive. Finally, the graph takes the makeup shown in Figure 6.15, where nodes B, D, F, and G are inactive.

This is an example of the cascade model of network diffusion. It forms a community of students (i.e., A, C, E, H, I) who register for the course, leaving out certain parts of the network (i.e., nodes B, D, F, G) who do not register for the course (Figure 6.16). Hence, the cascade model of network diffusion is shown to stop and leave out certain parts of the network, and this could be used as a model for community detection.
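The student walk-through can be replayed in code. The sketch below runs staged activation under the threshold rule: a node turns active once the summed weights from its active in-neighbors reach its threshold. The weights and thresholds for A, C, E, H, and I are the ones quoted in the walk-through; those for B, D, F, and G, and the D-to-E weight, are not given in the text and are hypothetical placeholders chosen so that those students stay inactive.

```python
def linear_threshold(weights, thresholds, seeds):
    """Activate nodes in successive stages until no further activation is possible."""
    active = set(seeds)
    while True:
        newly = set()
        for v, theta in thresholds.items():
            if v in active:
                continue
            influence = sum(w for (u, target), w in weights.items()
                            if target == v and u in active)
            # Small tolerance so that sums such as 0.4 + 0.2 compare equal to 0.6.
            if influence > 0 and influence >= theta - 1e-9:
                newly.add(v)
        if not newly:
            return active
        active |= newly

# (influencer, influenced) -> weight
WEIGHTS = {
    ("A", "E"): 0.2, ("A", "C"): 0.3, ("E", "C"): 0.3,   # from the walk-through
    ("C", "H"): 0.4, ("E", "H"): 0.2, ("H", "I"): 0.4,   # from the walk-through
    ("D", "E"): 0.3, ("A", "B"): 0.1, ("E", "D"): 0.2,   # placeholders
    ("C", "F"): 0.2, ("H", "G"): 0.3, ("I", "G"): 0.1,   # placeholders
}
THRESHOLDS = {
    "E": 0.0, "C": 0.5, "H": 0.6, "I": 0.4,              # from the walk-through
    "B": 0.7, "D": 0.8, "F": 0.5, "G": 0.9,              # placeholders
}

if __name__ == "__main__":
    print(sorted(linear_threshold(WEIGHTS, THRESHOLDS, {"A"})))
```

Activating in stages rather than one node at a time mirrors the successive activation stages in the text; for this rule the final active set is the same either way.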

Fig. 6.14 An example network of students and their threshold values of influence.

Fig. 6.15 Student A influenced other students C, E, H, and I in registering the network course.

Fig. 6.16 Detected students community who registered for the networking course.

6.4.4 Independent Contagion Model

In a related independent contagion model, a few nodes are initially active and every edge has a probability (i.e., tie strength) of propagation. An active node activates its neighbor with the probability on that particular edge. There are no threshold values for nodes. Nodes are activated in an arbitrary order. Initially, node A is active at time t and is said to be contagious. This node has one chance of influencing its neighbor node B at time stamp t + 1. There could be multiple active neighbors competing to influence node B. Node B becomes active based on the activation attempts sequenced in an arbitrary order. The probability of success of node A’s attempt at activating B is denoted by $P_B(A)$. If node B has the set S of neighbors who already attempted to activate node B but failed, node A’s success probability is denoted by $P_B(A, S)$. When node A completes influencing its neighbors, it remains in an active state but is no longer contagious. This process completes when all the nodes are free from being contagious. The network in Figure 6.17 is an example of 10 family members A–J. Each edge shows a probability (i.e., weight) of influencing other family members.

Fig. 6.17 An example network of family members and their weight in influencing other members.

Family members A and D first bought an iPhone 5S at time t, so they are active and contagious. Node A attempts to influence C to buy the phone with the probability 0.6. Node C is also influenced by the other active node D with the edge weight 0.4. Node C can be influenced arbitrarily by either family member A or D, and this depends on the sequence of the attempts made on node C at that time. Node C is influenced by node A at time stamp t + 1, and the probability of A’s success in influencing node C is denoted by $P_C(A, S)$, where S is the set of other active family members who may influence node C, so S = {D}. Nodes A and D are then active and noncontagious, as they attempt to influence their neighbors only once. These steps are repeated and influence cascades from family member C → F and F → I. Eventually, the family members A, D, C, F, and I have new iPhone 5S phones and form a community, leaving other members inactive as shown in Figure 6.18.

The community of family members is detected based on the independent contagion model. Initially, the family members A and D have phones; they are able to diffuse information and influence other family members C, F, and I to buy the phone. This leaves family members B, E, G, H, and J not part of the diffusion and cascading process, which helps us detect the community of iPhone 5S holders in the family shown in Figure 6.19.

Fig. 6.18 Family members A and D influenced C, F, and I in buying iPhone 5.
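A randomized run of this process can be sketched as below. The code makes a simplifying assumption: the success probability of an attempt depends only on the edge, so $P_B(A, S)$ is taken to be independent of the earlier failed attempts in S. The edge probabilities A → C (0.6) and D → C (0.4) are from the text; all other edges and values are hypothetical, and the function name is illustrative.

```python
import random
from collections import deque

def independent_contagion(probabilities, seeds, rng):
    """Each newly active node gets exactly one round of activation attempts."""
    active = set(seeds)
    contagious = deque(sorted(seeds))
    while contagious:
        u = contagious.popleft()   # u stays active afterwards but is no longer contagious
        for (src, dst), p in probabilities.items():
            if src == u and dst not in active and rng.random() < p:
                active.add(dst)
                contagious.append(dst)
    return active

# (influencer, influenced) -> probability of a successful attempt
FAMILY = {
    ("A", "C"): 0.6, ("D", "C"): 0.4,                   # from the text
    ("C", "F"): 0.5, ("F", "I"): 0.5, ("A", "B"): 0.1,  # hypothetical values
}

if __name__ == "__main__":
    rng = random.Random(0)
    print(sorted(independent_contagion(FAMILY, {"A", "D"}, rng)))
```

Because each attempt is a random draw, repeated runs with different seeds for the random generator give different final communities; averaging over many runs estimates how often each family member ends up adopting.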


Thus far, community detection has focused on approaches that are global to the network. There are times when there is a need for detecting only the local neighborhood (i.e., local community) of a starting node. This is covered in our last subsection, presented next.

6.4.5 Node-Centric Community Detection

A clique is a community. Clique search is an NP-hard problem (Tang and Liu, 2010). Let CQ(v, k) be a k-sized queue of cliques starting from node v. N(v) is the set of neighbors of node v. Figure 6.20 is a brute-force clique search algorithm that is computationally intractable for large networks. Many algorithmic improvements and optimizations are possible, such as the use of dynamic programming techniques. The clique percolation method (CPM) is a strategy for discovering overlapping communities (Palla et al., 2005), even though CPM is computationally intractable.

Fig. 6.19 Community of iPhone 5S holders in the family.

Fig. 6.20 Clique search algorithm. Adapted from Han and Liu (2010).

A key concept is reachability among nodes when there is a path between them. There are many other community concepts such as lambda sets [...] largest geodesic distance between any two nodes (defined in the original network) is not larger than k. A k-club, which is a subclass of k-cliques, restricts the geodesic distance to be not larger than k. A subgraph $G_s(V_s, E_s)$ is γ-dense (i.e., a quasi-clique) iff

$$\frac{|E_s|}{|V_s|\,(|V_s| - 1)/2} \ge \gamma$$

When γ = 1, γ-dense is the same as a clique. Two nodes $v_i$ and $v_j$ are structurally equivalent if, for any node $v_k$ such that $v_k \ne v_i$ and $v_k \ne v_j$, $e(v_i, v_k) \in E$ iff $e(v_j, v_k) \in E$.

A commonly used similarity measure such as Jaccard (Equation 6.7) can be used to find a group of similar nodes in a community:

$$\text{Jaccard}(v_i, v_j) = \frac{|N_i \cap N_j|}{|N_i \cup N_j|} = \frac{\sum_k A_{ik} A_{jk}}{|N_i| + |N_j| - \sum_k A_{ik} A_{jk}} \qquad (6.7)$$

There are techniques available for crawling around a node to find a set of most similar nodes from an initial node (Clauset et al., 2004; [...] modularity shown in Figure 6.21 (Clauset, 2005).

Fig. 6.21 General algorithm for the greedy maximization of local modularity. Adapted from Clauset (2005).
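Two of the node-centric measures above are easy to compute directly from neighbor sets. The sketch below implements the Jaccard similarity of two nodes (shared neighbors over all neighbors) and the density test for a quasi-clique (edges present over edges possible, compared with γ). The small four-node graph and the function names are hypothetical.

```python
def jaccard(adj, i, j):
    """|N_i intersect N_j| / |N_i union N_j| for the neighbor sets of nodes i and j."""
    union = adj[i] | adj[j]
    return len(adj[i] & adj[j]) / len(union) if union else 0.0

def density(adj, nodes):
    """Edges inside `nodes` divided by the number of possible edges among them."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = sum(1 for u in nodes for v in adj[u] if v in nodes) / 2  # each edge seen twice
    return inside / (n * (n - 1) / 2)

def is_quasi_clique(adj, nodes, gamma):
    """True if the subgraph induced by `nodes` is gamma-dense."""
    return density(adj, nodes) >= gamma

# Hypothetical graph: a triangle 1-2-3 with a pendant node 4 attached to node 3.
ADJ = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

if __name__ == "__main__":
    print(jaccard(ADJ, 1, 2))
    print(density(ADJ, {1, 2, 3}), density(ADJ, {1, 2, 3, 4}))
```

With γ = 1 the density test accepts only subgraphs in which every pair of nodes is linked, which matches the statement that a 1-dense subgraph is a clique.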


6.5 COMMUNITY CORRELATION VERSUS INFLUENCE

Given a network, we can count the fraction of edges connecting nodes with distinctive attribute values. Then, we compare it with the expected probability of such connections if the attribute and the social connections are independent. If the two quantities are significantly different, we conclude the attribute is correlated with the network. If the fraction of edges linking nodes in a group with different attribute values is significantly less than the expected probability of random connection in that network, there is evidence of correlation. There exist correlations between behaviors and attributes of adjacent nodes in a social network. Explanations for correlation are homophily (McPherson et al., 2001) and influence, which is discussed in Chapter 7.
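The comparison described above can be made concrete with two small functions. The sketch below computes the observed fraction of edges whose endpoints differ in an attribute and the fraction expected if edges joined node pairs regardless of the attribute. Taking the expected value as the share of all node pairs with different attribute values is an assumption of this sketch, and it omits any significance test; the example network is hypothetical.

```python
from collections import Counter

def cross_attribute_fraction(edges, attribute):
    """Observed fraction of edges whose two endpoints have different attribute values."""
    return sum(1 for u, v in edges if attribute[u] != attribute[v]) / len(edges)

def expected_cross_fraction(attribute):
    """Fraction of all node pairs with different values (attribute-blind baseline)."""
    n = len(attribute)
    counts = Counter(attribute.values())
    same_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return 1 - same_pairs / (n * (n - 1) // 2)

# Hypothetical network: two nodes of type "a", two of type "b".
ATTRIBUTE = {1: "a", 2: "a", 3: "b", 4: "b"}
EDGE_LIST = [(1, 2), (3, 4), (2, 3)]

if __name__ == "__main__":
    print(cross_attribute_fraction(EDGE_LIST, ATTRIBUTE), expected_cross_fraction(ATTRIBUTE))
```

An observed fraction well below the baseline, as in this toy case, is the pattern the text describes as evidence of correlation; deciding whether a gap is significant would need a statistical test on top of this.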

6.6 CONCLUSION

This chapter explored the phenomena of rampant changes in networks exemplified by (a) dissemination of preferences, (b) percolation, (c) epidemic (i.e., contagion) of disease, and (d) community compositions. Inspirations have converged from material science in percolation models to epidemiology in epidemic models and mathematical sociology’s contribution to community detection. Community detection in networks is still an evolving front. With very large networks, it is difficult to detect desired communities. We reviewed the most popular methods and algorithms. Much research is ongoing and open problems are many, as networks are growing exponentially and becoming more complex.

REFERENCES

Bass, F., 1969. A new product growth model for consumer durables. Manag. Sci. 15 (5), 215–227.

Brauer, F., et al., 2008. Mathematical Epidemiology. Lecture Notes in Mathematics/Mathematical Biosciences Subseries. Springer.

Clauset, A., 2005. Finding local community structure in networks. Phys. Rev. E 72 (2).

Clauset, A., Newman, M., Moore, C., 2004. Finding community structure in very large networks. Phys. Rev. E 70, 066111.

Cross, R., Parker, A., 2004. The Hidden Power of Social Networks: Understanding How Work Really Gets Done in Organizations. Harvard Business Review Press.

Dodge, K., Malone, P., Lansford, J., Miller, S., Pettit, G., Bates, J., 2010. A Dynamic Cascade Model of the Development of Substance-Use Onset. Wiley-Blackwell.

Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets. Cambridge University Press.

Girvan, M., Newman, M., 2002. Community structure in social and biological networks. Proc. Natl. Acad. Sci. U. S. A. 99 (12), 7821–7826.

Golbeck, J., 2013. Analyzing the Social Web. Elsevier.

Granovetter, M., 1978. Threshold models of collective behavior. Am. J. Sociol. 83 (6), 1420–1443.

Grimmett, G., 1999. Percolation. Springer.

Hatna, E., Benenson, I., 2012. The Schelling model of ethnic residential dynamics: beyond the integrated–segregated dichotomy of patterns. J. Artif. Soc. Soc. Simulation 15 (1), 1–23.

Hays, J., 2005. Epidemics and Pandemics: Their Impacts on Human History. ABC-CLIO.

Jackson, M., 2008. Princeton University Press.

Kermack, W., McKendrick, A., 1927. A contribution to the mathematical theory of epidemics. Proc. R. Soc. A Math. Phys. Eng. Sci. 115 (772), 700.

McPherson, M., Smith-Lovin, L., Cook, J., 2001. Birds of a feather: homophily in social networks. Annu. Rev. Sociol. 27, 415–444.

Palla, G., Derényi, I., Farkas, I., Vicsek, T., 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818.

Schelling, T., 1971. Dynamic models of segregation. J. Math. Sociol. 1, 143–186.

Tang, L., Liu, H., 2010. Community Detection and Mining in Social Media. Morgan & Claypool.

Vynnycky, E., White, R., 2010. An Introduction to Infectious Disease Modelling. Oxford University Press.

3. Use the percolation model to determine the saturation timeframe of an entire network. (That is, as an analogy, consider the following question: “When does a sponge that is fully soaked begin to wet?”)

4. How can a node-centric community detection method be used to find your long lost friend?

CHAPTER 7

Influence Diffusion and Contagion

By analogy with the spread of infectious diseases that we explored in Chapter 6, novel ideas and word-of-mouth spread in networks lead to viral diffusion of social and economic choices. This chapter explores the phenomena of diffusion and contagion over social forces, which are created by the society that separates from the individual and yet affects the individual. Some examples of social forces are the media, the economy, lifestyles, religion, and ideology. Examples of viral diffusion are changing social values, norms, behaviors, product adoptions, religions, and cultural mindsets. When a node’s choices affect the choices of others, we say that the node has influenced others.

All epidemic and community detection models, including Granovetter’s threshold model (Granovetter, 1978), which we presented in Chapter 6, serve as basic models of influence diffusion and contagion. We do repeat them in this chapter. The caveat for this chapter is to treat networks as directed graphs, since most often influence is not reciprocal and relationships are not symmetrical.

Another caveat and assumption is that once a node is influenced (i.e., becomes active), it cannot lose that influence (i.e., become inactive). This is the basic assumption in progressive models, with examples such as product adoption, where influence is permanent. Nonprogressive models are used in epistemic contexts, such as opinions and political attitudes, where individuals may change their mind.

[...] learning model in Section 7.2 that is a nonprogressive model.

7.1 STOCHASTIC MODEL

Independent cascade models (ICM) assume a uniform influence probability p ∈ [0, 1] between pairs of nodes in the network (Kempe et al., 2003). Let $S_t$ be the set of nodes that are active at time t. At t = 0, $S_0$ is the seed set from which influence spreads. The model moves forward in successive time steps. At each time step t ≥ 1, we start with $S_t = S_{t-1}$. For every node v that is not active at t − 1, find all nodes u that may influence v (i.e., via incoming edges to v) and execute an activation attempt, which is performed by a Bernoulli trial using probability p. If u’s activation attempt succeeds, add v to $S_t$; that is, node u influences (i.e., activates) node v at time t.

If there are multiple nodes that may activate v, the result is the same irrespective of which node activates v. Since activation attempts of nodes are independent, the model is an independent cascade model. At some point, no new activations are possible: all the nodes are either activated or not. We call this situation saturation. When no more activation is possible, the network stays the same, and the set of activated nodes is called the final set. An interesting property of ICM is submodularity: the marginal gain in expected spread from adding a node to the seed set diminishes as the seed set grows. Another property of ICM is monotonicity: enlarging the seed set never reduces the expected size of the final set.

A fixed threshold model is one where each node v has an individual threshold $\theta_v$ ∈ [0, 1]. Our course registration example in Chapter 6 showed how adoption can be modeled with fixed thresholds.

A voter model (Clifford and Sudbury, 1973) uses an undirected graph, where each node possesses a binary state, either 0 or 1. Iteratively, at each step, a node v at time t selects a random neighbor u and mimics u’s state at t − 1. To state things more formally, the states of nodes can be captured by a vector $x_t$ at time t. The probability that node i is in state 1 (as opposed to being in state 0) is denoted by $x_t(i)$, and $x_0(i)$ is the state of node i at t = 0. Equation 7.1 is the update function for changes in node i’s state. M is the stochastic transition matrix, where $M_{ij} = 1/d_i$ and $d_i$ is the degree of node i. The dynamics of the voter model are summarized in Equation 7.2. When t → ∞, the state ceases to change and $x_t$ approaches a steady state. The voter model is adapted for use in many recent influence diffusion models (Li et al., 2013).
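The drift toward a steady state can be observed numerically. The sketch below assumes the update takes the linear form in which each node's probability of being in state 1 becomes the average of its neighbors' previous probabilities, that is, $x_t = M x_{t-1}$ with $M_{ij} = 1/d_i$ for neighbors j. That form is an assumption consistent with the definition of M above, not a quotation of Equations 7.1 and 7.2. The graph is hypothetical and is assumed to have no isolated nodes.

```python
def transition_matrix(adj):
    """Row-stochastic matrix with M[i][j] = 1/d_i when j is a neighbor of i."""
    nodes = sorted(adj)
    matrix = [[1.0 / len(adj[i]) if j in adj[i] else 0.0 for j in nodes] for i in nodes]
    return nodes, matrix

def voter_expected_states(adj, x0, steps):
    """Iterate x_t = M x_{t-1} and return the probabilities of being in state 1."""
    nodes, M = transition_matrix(adj)
    x = [float(x0[v]) for v in nodes]
    for _ in range(steps):
        x = [sum(M[i][j] * x[j] for j in range(len(nodes))) for i in range(len(nodes))]
    return dict(zip(nodes, x))

# Hypothetical triangle in which only node 1 starts in state 1.
TRIANGLE = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}

if __name__ == "__main__":
    print(voter_expected_states(TRIANGLE, {1: 1, 2: 0, 3: 0}, steps=60))
```

On this small connected, non-bipartite graph the probabilities settle to a common value, which illustrates the steady state mentioned above.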

In a Markov random field model, each node of an undirected

graph is a random variable X v According to the Markov property, X v

x⇀t x⇀t(i)

Trang 23

Influence Diffusion and Contagion 67

is independent of nodes that are not its neighbors, Nv, but does

de-pend on its neighbors of v, N v Let C N( v) be a possible configuration of

A special stochastic optimization problem is one discovery of a seed

set that maximizes influence diffusion This is known to be an NP-

complete problem (Kempe et al., 2003) In the real world, we may not

make the assumption that the direct influence and the amount are

known Instead, influence is learned as we outline in Section  7.2 We

review the 1974 model of DeGroot (DeGroot, 1974)

7.2 SOCIAL LEARNING

DeGroot assumes a society of n agents, where each individual holds an initial (t = 0) opinion on a single subject. The opinions are represented by a vector of probabilities $p(0) = (p_1(0), \ldots, p_n(0))$, where $p_i(t)$ denotes the probabilistic belief of agent i at time t. Let T be a matrix of interactions among agents, where $T_{ij}$ is the weight (i.e., trust) that agent i places on j's opinion at time t in forming i's opinion at t + 1. We assume that the trust agent i extends is divided among its peers; therefore, each row of T sums to 1. This property makes the model stochastic. Probabilities are updated using Equations 7.4–7.6 with the Markov chain property:

$$p(t) = T \times p(t-1) \tag{7.4}$$

$$p(t) = T^{t} \times p(0) \tag{7.5}$$

$$\forall i,\; p_i(\infty) = \lim_{t \to \infty} p_i(t) \tag{7.6}$$

A network N and a matrix T reach consensus if $\forall i, j \in N,\; p_i(\infty) = p_j(\infty)$. Such a matrix T corresponds to a Markov chain that reaches a steady state. A society is said to be wise when the influence of its most influential agent vanishes (Jackson, 2008). Further discussions of convergence and examples are found in Jackson (2008).
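The update of Equation 7.4 and the consensus condition can be illustrated with a short sketch. The trust matrix, the initial beliefs, the tolerance, and the function names below are hypothetical choices made for illustration; they are not taken from DeGroot (1974) or from this chapter.

```python
def degroot_update(T, p):
    """One step of Equation 7.4: p(t) = T x p(t-1)."""
    n = len(p)
    return [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]


def degroot_run(T, p0, max_steps=1000, tol=1e-12):
    """Iterate the update until beliefs stop changing (or max_steps is reached)."""
    for row in T:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of T must sum to 1")
    p = list(p0)
    for _ in range(max_steps):
        nxt = degroot_update(T, p)
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical trust weights for three agents; row i holds the weights agent i
# places on the opinions of agents 0, 1, and 2.
T = [
    [1 / 3, 1 / 3, 1 / 3],
    [1 / 2, 1 / 2, 0.0],
    [0.0, 1 / 4, 3 / 4],
]
p0 = [1.0, 0.0, 0.0]  # only agent 0 initially believes the proposition
beliefs = degroot_run(T, p0)
print([round(b, 4) for b in beliefs])
```

For this particular matrix all three beliefs should converge to the same value, 3/11, which is the weight of agent 0 in the stationary distribution of T; this is an instance of the consensus condition stated above.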

7.3 SOCIAL MEDIA INFLUENCE

As people connect via Facebook, Twitter, and LinkedIn, they leave a trail of personal data. A large share of these interactions is public, with connections, updates, and media creations that can be measured. It is estimated that 43% of the marketing data gathered on people comes from social media. The growth in public data has led to active research in marketing and employment analytics. In the remainder of this section, we review contemporary measures of a node's influence.

7.3.1 Social Media: A Case for Facebook and Twitter

Eigenvector centrality (see Chapter 2 for details) is possibly the best measure of social media influence (Chiang, 2012). Betweenness centrality (see Chapter 2 for details) is another possible measure of social media influence (Chiang, 2012). Recall that node densities give rise to cluster densities (see Chapter 2 for details). This is the basis of the following contagion theorem:

The whole network flips if and only if there is no cluster of density 1 − p or higher in the set of nonflipped nodes, where p is the flipping threshold (Chiang, 2012).
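The contagion theorem can be checked on a toy graph. The sketch below makes two assumptions that are not stated in this text: a node flips once the fraction of its flipped neighbors strictly exceeds p, and the density of a cluster is the smallest fraction of within-cluster neighbors over its members. The graph and the function names are hypothetical.

```python
def cascade(adjacency, seeds, p):
    """Flip nodes repeatedly until no further node qualifies; return the flipped set."""
    flipped = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, neighbors in adjacency.items():
            if v in flipped or not neighbors:
                continue
            fraction = sum(1 for u in neighbors if u in flipped) / len(neighbors)
            if fraction > p:  # assumed rule: strictly more than p of the neighbors flipped
                flipped.add(v)
                changed = True
    return flipped


def cluster_density(adjacency, cluster):
    """Smallest fraction of a member's neighbors that lie inside the cluster."""
    return min(
        sum(1 for u in adjacency[v] if u in cluster) / len(adjacency[v])
        for v in cluster
    )


# Hypothetical graph: two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
seeds = {0, 1}

stalled = cascade(adjacency, seeds, p=0.4)   # stops at the second triangle
complete = cascade(adjacency, seeds, p=0.3)  # reaches every node
print(sorted(stalled), sorted(complete))
print(cluster_density(adjacency, {3, 4, 5}))
```

With p = 0.4 the second triangle has density 2/3, which is at least 1 − p = 0.6, and the cascade stalls; with p = 0.3 the same density falls below 1 − p = 0.7 and the whole network flips.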

7.3.2 Klout Score

The Klout Score measures personal influence based on the person's ability to generate interaction. Every time you create content or engage with online social media, you influence others. The Klout Score uses data from social networks in order to measure the following:

1. How many people you influence (i.e., reach value)

2. How much you influence them (i.e., amplification value)

3. How influential they are (i.e., network score)

7.4 CONCLUSION

Influence is of interest to network scientists as well as social scientists. We reviewed stochastic models and learning models. This is an open area of research, and I make a call to arms for scholarly attention. Although the discovery of optimal contagion methods is NP-hard, approximation methods are promising.

REFERENCES

DeGroot, M., 1974. Reaching a consensus. J. Am. Stat. Assoc. 69 (345), 118–121.

Domingos, P., Richardson, M., 2001. Mining the network value of customers. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press, New York, NY, pp. 57–66.

Granovetter, M., 1978. Threshold models of collective behavior. Am. J. Sociol. 83 (6), 1420–1443.

Jackson, M., 2008. Social and Economic Networks. Princeton University Press, Princeton.

Kempe, D., Kleinberg, J., Tardos, É., 2003. Maximizing the spread of influence in a social network. In: Proceedings of 2003 Knowledge Discovery and Data Mining Conference. ACM Press, New York, NY.

Li, Y., Chen, W., Wang, Y., Zhang, Z., 2013. Influence diffusion dynamics and influence maximization in social networks with friend and foe relationships. In: Proceedings of 6th ACM International Conference on Web Search and Data Mining. ACM Press, New York, NY, pp. 657–666.

EXERCISES

3. Develop strategies to apply machine learning techniques (e.g., reinforcement learning) for social learning.


CHAPTER 8

Power in Exchange Networks

Each link between a pair of nodes provides an opportunity for a transaction that produces a value to be shared between the pair. This is the premise of exchange networks (Easley and Kleinberg, 2010) … exchange networks (Blau, 1964). An exchange requires at least two individuals and can be over tangible or intangible entities (i.e., goods and services) with imbalances of utility for participants (Turner, 1998). Blau promoted the idea that social choices are social actions that are counterparts of physical, speech, and epistemological actions (Blau, 1964). In social exchange theory, an exchange yields a value (i.e., worth) equal to the rewards minus the costs of the exchange. If the value is positive, the exchange is a desirable relationship; otherwise, the exchange and the corresponding relationship are worthless.

In search of desirable exchanges, a few individuals may lack partners (e.g., remain bachelors), while others may have a plethora of potential neighbors from which to choose. Economic networks, such as those of buyers and sellers of goods and services, are well represented by exchange networks, in which links model transactions among them. Sellers who have access to exclusive buyers have power over them to the extent of their exclusivity. Sellers who are exclusive providers of scarce commodities have power over their buyers to the extent of their exclusivity and the scarcity of their goods (i.e., monopoly in economic terms). When buyers are scarce, they have higher power over sellers to the extent of their scarcity (i.e., monopsony in economic terms). In more general terms, dependence among agents, derived from their relative positions in the network, gives rise to their structural power. There is a reciprocal relationship between potential power and dependence among pairs of agents (Emerson, 1962). Dependence, exclusion, and satiation are reported as underlying determinants of structural power (Easley and Kleinberg, 2010). A step toward formalizing power in an exchange network is the exchange outcome, which consists of two components: (a) a proper pairing of exchange partners and (b) the value (i.e., utility) gained by each node from an exchange (Easley and Kleinberg, 2010) … stability, that is, the production of preferred outcomes that augment the values of nodes engaged in exchange. If there are incentives to change exchange partners in order to increase value, there is instability. An exchange network is stable if and only if there are no instabilities (Easley and Kleinberg, 2010) … used to optimize network stabilities (Easley and Kleinberg, 2010).
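A stability check for an exchange outcome can be sketched as follows. The sketch assumes the convention used by Easley and Kleinberg (2010), in which every exchange divides one unit of value, so that an unmatched link (x, y) is an instability when the two endpoints currently earn less than one unit in total. That convention, the three-node path, and the function name are assumptions that this text does not spell out.

```python
def find_instabilities(edges, matching, values, surplus=1.0):
    """Return the links outside the matching whose endpoints would both gain by pairing up."""
    matched = {frozenset(pair) for pair in matching}
    return [
        (x, y)
        for x, y in edges
        if frozenset((x, y)) not in matched and values[x] + values[y] < surplus
    ]


# Hypothetical three-node path A - B - C; B is the only node with two potential partners.
edges = [("A", "B"), ("B", "C")]
matching = [("A", "B")]

even_split = {"A": 0.5, "B": 0.5, "C": 0.0}
b_takes_all = {"A": 0.0, "B": 1.0, "C": 0.0}

print(find_instabilities(edges, matching, even_split))   # B and C could both do better
print(find_instabilities(edges, matching, b_takes_all))  # no instability remains
```

Under these assumptions the even split is unstable because C can offer B more than one half, whereas the outcome in which B keeps the whole unit is stable, reflecting B's structural power on the path.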

Whereas structural power empowers agents with capacities to influence others, it is the agent choices, that is, agent strategic actions, that manifest social power. Agents can invite positive exchanges by dispensing or withholding rewards, thereby wielding power strategies in their exchange networks (Russell, 1938; Molm, 1990).

Individuals often join groups to increase their power. This is captured by the concept of coalitions, for example, economic blocs that may be regional or transnational. For the basics of coalition theory, refer to cooperative game theory in Chapter 3. For a few good examples, the reader may consult Bonacich and Liu (2012). There is much that remains to be explored and codified into mathematical models that are of benefit for network science (Simpson et al., 2014).

8.1 CONCLUSION

Power has been of interest to network scientists and social scientists. We referred to structural and strategic notions of power with historical roots (Russell, 1938). There is a strong connection between power and influence, even though there are scant formal models for qualitative and quantitative analyses (Easley and Kleinberg, 2010). An intuitive direction for progress appears to be models that capture inequalities in the distribution of resources and in access to them in the network. Those who are close to resources may regulate others' access to them. Those who are remote from resources might wish to engage in strategic negotiation or have brokers engage in such bartering.

REFERENCES

Blau, P., 1964. Exchange and Power in Social Life. Wiley, New York.

Bonacich, P., Liu, P., 2012. Introduction to Mathematical Sociology. Princeton University Press, Princeton.

Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.

Emerson, R., 1962. Power-dependence relations. Am. Sociol. Rev. 27, 31–40.

Molm, L., 1990. Structure, action, and outcomes: the dynamics of power in social exchange. Am. Sociol. Rev. 55 (3), 427–447.

Russell, B., 1938. Power: A New Social Analysis. WW Norton & Company.

Simpson, J., Farrell, A., Oriña, M., Rothman, A., 2014. Power and social influence in relationships. In: APA Handbook of Personality and Social Psychology, Vol. 3. American Psychological Association, Washington, DC.

Turner, J., 1998. George C. Homans' behavioristic approach. In: The Structure of Sociological Theory, 6th edition. Wadsworth (Chapter 20).

EXERCISES

1. Consider a given network with an uneven distribution of resources (e.g., access to oil reserves or a lake view). Model structurally based resource access.

2. How can agents with low resource access fare better with brokers? Design specific broker protocols.

3. People develop patterns of exchange to cope with power differentials and to deal with the costs associated with exercising power. Develop a model of reasoning over power in exchange networks.


CHAPTER 9

Economic Networks

Buyers, sellers, and traders are individuals with an interest in goods and services. These individuals often form the nodes of an exchange network called a market, which is a special type of economic network. An economic network is a type of exchange network in which the links are used for the exchange of commodities and services. These exchanges have direct and indirect economic effects on individuals. Matching markets, which interface buyers and sellers, are presented in Chapters 10 and 11 of Easley and Kleinberg (2010).

The payoff (i.e., utility) for node j is defined as $v_{ij} - p_i$, where $v_{ij}$ is the value derived by agent j from acquiring agent i's goods and $p_i$ is the price paid to i for the goods. For a set of prices, $P = \{p_1, p_2, \ldots\}$, each buyer wants to maximize his or her payoff. If there is a tie among sellers, a random seller is selected. The buyer rationality principle dictates that buyers purchase only if their payoff is positive. A buyer places a value on each item, and the valuation profile of a buyer is the list of the values that the buyer assigns to the items on sale. A buyer's positive seller set is the set of sellers that provide a positive payoff. Market clearing is achieved when buyers and sellers are perfectly matched so that every item can be sold. There are a few illustrative examples in Chapter 10 of Easley and Kleinberg (2010). There are also a couple of positive results. First, for every set of buyer valuations, there is a set of market clearing prices (Easley and Kleinberg, 2010). Second, for every set of market clearing prices, any resultant perfect matching of buyers yields a maximal total valuation for the items bought from sellers. There is a procedure for constructing market clearing prices, sometimes attributed to Jenő Egerváry. This procedure is famously known as the Gale-Shapley algorithm, shown in Figure 9.1. A constricted set is a set in which edges from this set to the other side of the bipartite graph “constrict” the formation of a perfect matching. Further examples and details are available from Easley and Kleinberg (2010) and will not be duplicated here.
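The construction of market clearing prices can be sketched as an ascending price procedure in the spirit of Chapter 10 of Easley and Kleinberg (2010). Figure 9.1 is not reproduced in this text, so the steps below are an assumption about its content: start all prices at zero, look for a constricted set of buyers, raise by one unit the prices of the sellers that this set prefers, and repeat. The brute-force subset search, the integer valuations, and the function names are illustrative choices that suit only very small markets with equal numbers of buyers and sellers.

```python
from itertools import combinations


def preferred_sellers(valuations, prices):
    """For each buyer j, the set of sellers i that maximize the payoff v_ij - p_i."""
    preferred = []
    for buyer_values in valuations:
        payoffs = [v - p for v, p in zip(buyer_values, prices)]
        best = max(payoffs)
        preferred.append({i for i, pay in enumerate(payoffs) if pay == best})
    return preferred


def find_constricted_set(preferred):
    """Return (buyers, sellers) where the buyers jointly prefer too few sellers, else None."""
    n = len(preferred)
    for size in range(1, n + 1):
        for buyers in combinations(range(n), size):
            sellers = set().union(*(preferred[b] for b in buyers))
            if len(sellers) < size:
                return buyers, sellers
    return None


def market_clearing_prices(valuations):
    """Raise prices on over-demanded sellers until no constricted set remains."""
    prices = [0] * len(valuations[0])
    while True:
        constricted = find_constricted_set(preferred_sellers(valuations, prices))
        if constricted is None:  # by Hall's theorem a perfect matching now exists
            return prices
        _, sellers = constricted
        for i in sellers:
            prices[i] += 1
        lowest = min(prices)
        prices = [p - lowest for p in prices]  # keep the smallest price at zero


# Hypothetical valuations: valuations[j][i] is the value buyer j places on seller i's item.
valuations = [
    [12, 4, 2],
    [8, 7, 6],
    [7, 5, 2],
]
clearing = market_clearing_prices(valuations)
print(clearing)
print(preferred_sellers(valuations, clearing))
```

At the returned prices every group of buyers prefers at least as many sellers as it has members, so the buyers can be perfectly matched to their preferred sellers.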


Fig. 9.1. Gale-Shapley algorithm for determining market clearing prices.

In more complex markets, there are intermediaries called traders that facilitate an interface between buyers and sellers. The role of traders in a market for agricultural goods between local producers and consumers is illustrated in Chapter 11 of Easley and Kleinberg (2010), and we summarize it here. The flow of goods from sellers to buyers is determined by a game in which traders set prices, and then sellers and buyers react to these prices. Each trader t offers a bid price, denoted by $b_{ti}$, to each seller i to whom he or she is connected. This bid price is an offer by t to buy i's copy of the good at a value of $b_{ti}$. Each trader t also offers an ask price, denoted by $a_{tj}$, to each buyer j to whom he or she is connected. This ask price is an offer by t to sell a copy of the good to buyer j at a value of $a_{tj}$. Once traders announce prices, each seller and buyer chooses at most one trader to deal with. Each seller sells his or her copy of the good to the trader he or she selects, or keeps the copy if he or she chooses not to sell it. Each buyer purchases a copy of the good from the trader he or she selects, or receives no copy if he or she does not select a trader. This determines a flow of goods from sellers, through traders, to buyers.

There are incentives for a trader not to post bid and ask prices that cause more buyers than sellers to accept his or her offers. There are also incentives for a trader not to be caught in the reverse difficulty, with more sellers than buyers accepting his or her offers. A trader's payoff is the profit he or she makes from all of his or her transactions: the sum of the ask prices of his or her accepted offers to buyers, minus the sum of the bid prices of his or her accepted offers to sellers. For a seller i, the payoff from selecting trader t is $b_{ti}$, while the payoff from selecting no trader is $v_i$. If the exchange is actualized, the seller receives $b_{ti}$ units of money. Otherwise, he or she keeps his or her copy of the good, which he or she values at $v_i$. For each buyer j, the payoff from selecting trader t is $v_j - a_{tj}$, while the payoff from selecting no trader is zero. With a completed exchange, the buyer receives the good but gives up $a_{tj}$ units of money.

The equilibrium is based on a set of strategies such that each player chooses the best response to what all the other players do. In this simple scenario, it turns out that traders can make only zero profit, for reasons based more on the global structure of the network than on direct competition with any one trader.
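The payoffs described above can be computed for one round of posted prices with a short sketch. It assumes that sellers and buyers who are indifferent choose to trade, and it ignores any penalty for a trader who ends up with more accepted ask offers than accepted bid offers; both are simplifications, and the data layout, prices, and function names are hypothetical.

```python
def best_offer(offers, pick):
    """Return (trader, price) for the best offer, or None when no trader is connected."""
    if not offers:
        return None
    trader = pick(offers, key=offers.get)
    return trader, offers[trader]


def play_round(seller_values, buyer_values, bids, asks):
    """Let sellers and buyers respond to posted prices and tally the resulting payoffs."""
    traders = {t for offers in list(bids.values()) + list(asks.values()) for t in offers}
    profit = {t: 0 for t in traders}
    seller_payoff, buyer_payoff = {}, {}

    for i, v_i in seller_values.items():
        offer = best_offer(bids.get(i, {}), max)   # highest bid b_ti
        if offer is not None and offer[1] >= v_i:  # assumed tie-break: sell when indifferent
            trader, b_ti = offer
            profit[trader] -= b_ti
            seller_payoff[i] = b_ti
        else:
            seller_payoff[i] = v_i                 # keeps the good, valued at v_i

    for j, v_j in buyer_values.items():
        offer = best_offer(asks.get(j, {}), min)   # lowest ask a_tj
        if offer is not None and v_j - offer[1] >= 0:
            trader, a_tj = offer
            profit[trader] += a_tj
            buyer_payoff[j] = v_j - a_tj
        else:
            buyer_payoff[j] = 0                    # buys nothing

    return seller_payoff, buyer_payoff, profit


# Hypothetical market: one seller, one buyer, and two competing traders.
seller_values = {"s1": 0}
buyer_values = {"b1": 10}
bids = {"s1": {"t1": 6, "t2": 4}}  # bids[i][t] is trader t's bid to seller i
asks = {"b1": {"t1": 7, "t2": 9}}  # asks[j][t] is trader t's ask to buyer j

seller_payoff, buyer_payoff, profit = play_round(seller_values, buyer_values, bids, asks)
print(seller_payoff, buyer_payoff, profit)
```

In this example trader t1 wins both sides of the trade and earns the spread between its ask and bid prices, while t2 earns nothing; competition of this kind is what pushes trader profits down in the scenario summarized in the text.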

In the following section, we briefly outline the effects of local economic exchanges on the broader economic network.

9.1 NETWORK EFFECTS

In an economic network, an individual's choices have direct and indirect effects on others, and vice versa, in what economists call an externality. For example, a producing firm's profit and a consumer's utility are proportional to the numbers of producers and consumers using the same technology. A famous model of externality is telecommunication subscription (Rohlfs, 1974). Let potential subscribers be indexed by x, where 0 ≤ x ≤ 1. Consumers indexed by low values of x value the subscription highly, whereas consumers indexed by x close to 1 place a low valuation on this service; p denotes the subscription fee, and $q^e$ is the expected total number of subscribers. The expected utility of a potential subscriber indexed by x ∈ [0, 1] is computed with Equation 9.1:

$$U(x) = \begin{cases} (1 - bx)\,a\,q^{e} - p, & \text{if he or she subscribes} \\ 0, & \text{if he or she does not subscribe} \end{cases} \tag{9.1}$$

The parameter a > 0 measures the intensity of network effects. Higher values of a indicate that consumers place a higher value on the ability to communicate with the $q^e$ subscribers. In contrast, a = 0 implies that there are no network effects. The parameter b > 0 captures the degree of consumers' heterogeneity with respect to their benefit from this service.
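Equation 9.1 can be evaluated directly. The following sketch computes the utility of a subscriber indexed by x and lists which consumers on a coarse grid would weakly prefer to subscribe; the parameter values, the grid, and the function name are hypothetical.

```python
def subscriber_utility(x, a, b, q_e, p, subscribes=True):
    """Utility of consumer x under Equation 9.1: (1 - b*x) * a * q_e - p, or 0 if not subscribing."""
    if not subscribes:
        return 0.0
    return (1 - b * x) * a * q_e - p


# Hypothetical parameters: network effect a, heterogeneity b, expected subscribers q_e, fee p.
params = {"a": 2.0, "b": 1.0, "q_e": 0.5, "p": 0.25}

grid = [0.0, 0.25, 0.5, 0.75, 1.0]
willing = [x for x in grid if subscriber_utility(x, **params) >= 0]
print(willing)
print(subscriber_utility(0.0, **params), subscriber_utility(1.0, **params))
```

With these values the consumers with low x, who value the service most, are willing to pay the fee, while the consumer at x = 1 is not; raising a or the expected subscriber base q_e widens the set of willing subscribers.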

Externalities exist in limited settings. Consider the two-player game shown in Figure 9.2, where each player chooses between adopting and avoiding a change, which yields four possible outcomes. If a > d and b > g, each player earns a higher payoff if the player adopts; therefore, network externalities exist.

The game has two Nash equilibria, (adopt, adopt) and (avoid, avoid). If a > b, the outcome (adopt, adopt) Pareto dominates the outcome (avoid, avoid). This condition is known as excess inertia (Farrell and Saloner, 1985) … dominates the outcome (adopt, adopt), and the condition is called excess momentum. However, neither condition is a subgame perfect equilibrium with sequential decisions. A similar analysis is possible with competition in prices (i.e., Bertrand games) or in quantities (i.e., Cournot games) … create externalities for buyers and sellers (Rysman, 2009). When customers are identified with components, the externality is considered direct. However, in most economic networks, the externality is only indirect. Further information is available from the NYU online site maintained by Dr. Nicholas Economides.
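The equilibrium analysis of the adoption game can be sketched by enumerating best responses. The payoff bimatrix of Figure 9.2 is not reproduced in this text, so the numbers below are hypothetical coordination payoffs chosen only to produce the two equilibria named above; the function names are also illustrative.

```python
from itertools import product

STRATEGIES = ("adopt", "avoid")

# Hypothetical payoffs: payoffs[(s1, s2)] = (payoff to player 1, payoff to player 2).
payoffs = {
    ("adopt", "adopt"): (3, 3),
    ("adopt", "avoid"): (0, 0),
    ("avoid", "adopt"): (0, 0),
    ("avoid", "avoid"): (2, 2),
}


def pure_nash_equilibria(payoffs, strategies):
    """Return the strategy pairs from which neither player gains by deviating alone."""
    equilibria = []
    for s1, s2 in product(strategies, repeat=2):
        u1, u2 = payoffs[(s1, s2)]
        best_for_1 = all(u1 >= payoffs[(alt, s2)][0] for alt in strategies)
        best_for_2 = all(u2 >= payoffs[(s1, alt)][1] for alt in strategies)
        if best_for_1 and best_for_2:
            equilibria.append((s1, s2))
    return equilibria


def pareto_dominates(first, second):
    """True if no player is worse off under `first` and at least one is strictly better off."""
    return all(x >= y for x, y in zip(first, second)) and any(
        x > y for x, y in zip(first, second)
    )


equilibria = pure_nash_equilibria(payoffs, STRATEGIES)
print(equilibria)
print(pareto_dominates(payoffs[("adopt", "adopt")], payoffs[("avoid", "avoid")]))
```

With these payoffs both (adopt, adopt) and (avoid, avoid) are equilibria, and the former Pareto dominates the latter, so players who remain at (avoid, avoid) forgo a better outcome that is available to them jointly.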

9.2 CONCLUSION

Whether with our friends, colleagues, or strangers, economic networks are common exchange networks in our daily lives. There are clear applications of economic networks in advertising, for example, selling Google ad space (Chiang, 2012). There are also potential applications for policy analysis. In order to understand the range and scope of the impacts of a new or changed policy, network effects need to be explored.

Fig. 9.2. An economic decision game payoff bimatrix for adopting a new technology game.

REFERENCES

Chiang, M., 2012. Networked Life. Princeton University Press, New Jersey.

Easley, D., Kleinberg, J., 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.

Farrell, J., Saloner, G., 1985. Standardization, compatibility, and innovation. Rand J. Econ. 16 (1), 70–83.

Jackson, M., 2008. Social and Economic Networks. Princeton University Press, New Jersey.

Rohlfs, J., 1974. A theory of interdependent demand for a communications service. Bell J. Econ. Manag. Sci. 5, 16–37.

Rysman, M., 2009. The economics of two-sided markets. J. Econ. Perspect. 23 (3), 125–143.

EXERCISES

1. Develop an example of a market for ad space where advertisers are buyers and online media, such as eBay and Google, are sellers of ad space.

2. Discuss consequential externalities of social media proliferation on the global economy.


CHAPTER 10

Network Capital

Networks are storehouses for a variety of capital that has value and worth. In economic terms, capital is money (Marx, 1967). Physical capital consists of tools and resources (Economy and Levi, 2014). These might be common tools or raw materials, such as oil and gas, that are commonly accessible to a group. Cultural capital consists of language, customs, and lifestyle choices (Beemyn, 2014). Whereas diversity promotes cultural heterogeneity, social conventions promote conformity. Political capital is largely a measure of the power to persuade others (Burgmann, 2014). Democracy and human rights serve as example areas, where social forces promote them from the West to the rest of the world. Social capital (SC) is an intangible capacity to manage and change relationships, such as through interpersonal trust, made famous by the works of Pierre Bourdieu, James Coleman, and Robert Putnam … the macroscopic perspective, SC for the entire network is considered. In this view, individuals do not incrementally add to the system or withdraw units of SC. Instead, the areas of interest are system principles, such as norms and conventions, that provide resources for the overall social welfare. In contrast, the microscopic perspective adopted here explores how individuals can gain access to resources through their positions and connections in the network.

Physical goods and services are often scattered unevenly in world networks. Neighborhoods, communities, and high-status nodes are similarly loosely distributed in networks. Nodes of a network often desire capital that may not be locally available to them; they may therefore take action to do one of the following: gain access to capital (e.g., gasoline), exert influence over nodes possessing capital (e.g., conserve energy usage), have power over capital (e.g., limit tuna fishing), or have control over the flow of capital (e.g., nonproliferation of nuclear material) … actions over them

Date posted: 16/05/2017, 16:43
