Abstract - In this turbulent business environment of global recession, traditional organizational structure is reaching its limits. To adapt to these changes, managing informal communication beyond the old framework is indispensable. It is critical for innovation management to recognize communities of practice and informal leaders. In previous studies, through a case study of a global manufacturing company, we demonstrated that our method was effective in identifying informal communities and potential leaders from one month of email log data collected within the organization in September 2008. In this paper, we collect a second set of one-month email log data in June 2009 in order to compare it chronologically with the first set and to analyze changes before and after the major organizational changes triggered by the bankruptcy of Lehman Brothers. Email network analysis helps management systematically view its organization as a whole.
Keywords - email, network analysis, organizational
management, leadership, innovation
I INTRODUCTION
In this turbulent business environment, traditional organizational structure is reaching its limits. To adapt to these changes for their survival and prosperity, business organizations need to manage communication networks beyond the old framework. For innovation management, it is indispensable to identify communities of practice and deploy informal leaders.
Through a case study of a global manufacturing company, our previous study demonstrated that our method was effective in identifying informal communities and potential leaders through network analysis of the first set of one-month email log data, collected in September 2008 within the organization [1].
In the previous case study, interviews confirmed that the identified communities and hierarchical structures reflect the actual organizational structure. Most of the people with high network centralities were recognized as key persons in the firm. We found that both betweenness and pagerank are good indicators for detecting hidden leadership in their communities.
In this paper, we collect a second set of one-month email log data in June 2009 and compare the two sets chronologically to analyze any changes. We use the same methodology as in the previous study: we construct an email network from a set of log data and then identify communities in the email network by performing a topological clustering of the network. We calculate degree centrality, betweenness centrality, closeness centrality, and pagerank centrality. The clustering process is visualized by a dendrogram, which is a hierarchical tree diagram. Then, we interview the managers of the company.
Our data are unique in three ways. (1) The email log of a fairly large organization is collected. (2) Two sets of data are collected for chronological analysis. (3) The collection of the data sets coincides with the drastic organizational change caused by the unprecedented business impact triggered by the bankruptcy of Lehman Brothers in September 2008. Consequently, we have data sets for organizational analysis before and after the impact of the global recession, viewed from the perspective of informal communities through email network analysis.
According to the interviews with the managers of the company, the top management team resolutely carried out organizational changes to survive the global depression, aiming for (1) restructuring of highly paid managers, (2) rejuvenation for organizational vitality, and (3) reintegration of divisions for innovation. We attempt to evaluate and verify these organizational changes with email network analysis. In addition to the informal community analysis, we compare leader characteristics before and after the changes using network centralities and communication patterns.
Informal networks coexist with the formal structure of the organization and serve many purposes, such as resolving the conflicting goals of the institution to which they belong, solving problems in more efficient ways [2], and furthering the interests of their members. Despite their lack of official recognition, informal networks can provide effective ways of learning and, with the proper incentives, actually enhance the productivity of the formal organization [3, 4]. Along with the growth of informal communities, leadership roles in the communities have become distributed [5]. Given the dynamics of community formation and distributed leadership, it is important to extract such hidden patterns of collaboration and leadership for organizational management that could lead to innovation.
H. Tashiro, J. Mori, N. Fujii, and K. Matsushima
1 Graduate School of Engineering, the University of Tokyo, Tokyo, Japan
2 Faculty of Science and Engineering, Waseda University, Tokyo, Japan
{jmori, tashiro}@ipr-ctr.t.u-tokyo.ac.jp nf_tomo_home@ybb.ne.jp matsushima@biz-model.t.u-tokyo.ac.jp
Previous approaches to identifying informal communities gathered data from interviews, surveys, or other fieldwork and constructed links and communities by manual inspection [6] or took an internet-centric approach [7]. These methods are accurate but time-consuming and labor-intensive, prohibitively so in the context of a very large organization. Given the recent development of online communication in organizations, several studies have worked on identifying communities using online information resources [8]. Adamic showed that the communities identified from online mailing lists and the Web resemble the actual social communities of the represented individuals [9].
Among the available means of communication, email has become widely used within organizations. Email has therefore been established as an indicator of collaboration and knowledge exchange [8, 10, 11]. Since email provides plentiful data on personal communication in an electronic form that enables automatic processing, several studies have addressed the use of email to discover shared interests, relationships, and social networks [12, 13]. Because they capture the structure and communication patterns within an organization [14, 15], email networks are useful information resources for finding informal communities.
Several studies have proposed automated methods for using email data to construct a network and then identify informal communities within an organization [16, 17]. However, there is not yet enough understanding and evaluation of how communities identified from email data can be exploited for the management of organization and leadership, which is important for enhancing organizational innovation.
In this paper, we collect and analyze the second set of one-month email log data with our method for identifying informal communities and potential leaders. We use a clustering method that can rapidly detect dense communities within an email network. The result of the clustering process reveals informal communities and hierarchical structures within an organization. To characterize people in the informal communities, we calculate several network centralities for each person using the structure of the email network. Combined with interviews with the managers, these measures enable us to identify leadership roles within the informal communities. Then, we compare the two email communication networks to see whether we can draw any managerial implications of significance to the top management.
II METHODOLOGY
A Email network
We construct an email network from email log data. We extract the sender and receiver information from each email. Each sender or receiver corresponds to a node in the network. If there is at least one email between two persons, an edge is drawn between them. The resulting set of nodes and edges forms the email network. Since we distinguish a sender from a receiver, the email network is expressed as a directed graph. Given the network, we find a maximal complete sub-graph (a clique), which becomes the target of the following network analysis.
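The construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the log format (a list of sender and receiver pairs), the function name `build_email_network`, and the choice to ignore self-addressed mail are all assumptions made for the example.

```python
from collections import defaultdict

def build_email_network(log):
    """Build a directed email network from (sender, receiver) pairs.

    An edge sender -> receiver exists if at least one email was sent,
    so repeated emails between the same pair do not add new edges.
    """
    successors = defaultdict(set)
    nodes = set()
    for sender, receiver in log:
        nodes.update((sender, receiver))
        if sender != receiver:  # assumption: self-addressed mail is ignored
            successors[sender].add(receiver)
    return nodes, {u: set(vs) for u, vs in successors.items()}

# Hypothetical log records: (sender, receiver)
log = [("a", "b"), ("a", "b"), ("b", "a"), ("b", "c"), ("c", "c")]
nodes, edges = build_email_network(log)
```

Storing each node's successors in a set keeps the network unweighted, which matches the rule that a single email is enough to create an edge.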
B Email network analysis
We first identify communities in the email network. To this end, we perform a topological clustering of the network. Such clustering used to be difficult because cluster analysis of non-weighted graphs with a large number of nodes was computationally expensive. Recently proposed algorithms [18, 19] enable fast clustering with a calculation time on the order of O((l+n)n), or O(n^2) on a sparse network, where l is the number of links and n is the number of nodes; hence they can be applied to large-scale networks. The algorithm is based on the idea of modularity. Modularity Q is defined as follows [18, 19, 20]:
\[ Q = \sum_{s=1}^{N_m} \left[ \frac{l_s}{l} - \left( \frac{d_s}{2l} \right)^{2} \right] \]
where N_m is the number of modules, l is the number of links in the network, l_s is the number of links between nodes in module s, and d_s is the sum of the degrees of the nodes in module s. In other words, Q is the fraction of links that fall within modules, minus the expected value of the same quantity if the links fell at random without regard for the modular structure.
A good partition of a network into clusters must comprise many within-cluster links and as few between-cluster links as possible. The objective of a community identification algorithm is to find the partition with the largest modularity. The algorithm to optimize Q over all possible divisions is as follows. Starting with a state in which each node is the only member of one of n clusters, we repeatedly join clusters together in pairs, choosing at each step the join that results in the greatest increase in Q. Since a high value of Q represents a good cluster division, we stopped joining when ΔQ became negative. At the maximal value of Q, Q_max, we obtain a cluster structure of the network with an effective division. The clusters correspond to the informal communities in the email network. A cluster label can be assigned by examining the characteristics of node attributes.
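The greedy joining procedure can be illustrated with a deliberately naive sketch. It recomputes Q from scratch for every candidate join, so it is far slower than the fast algorithms cited in [18, 19] and is meant only to show the logic. The function names, the undirected edge-list input (each link listed once, no self-loops), and the stopping rule "stop when no join increases Q" are assumptions of this example.

```python
def modularity(edges, communities):
    """Q = sum_s [ l_s / l - (d_s / (2 l))^2 ] for an undirected edge list."""
    l = len(edges)
    label = {v: i for i, c in enumerate(communities) for v in c}
    l_s = [0] * len(communities)   # links inside each module
    d_s = [0] * len(communities)   # sum of degrees in each module
    for u, v in edges:
        d_s[label[u]] += 1
        d_s[label[v]] += 1
        if label[u] == label[v]:
            l_s[label[u]] += 1
    return sum(ls / l - (ds / (2 * l)) ** 2 for ls, ds in zip(l_s, d_s))

def greedy_communities(edges):
    """Start from singletons and repeatedly apply the join with the
    largest increase in Q; stop when no join increases Q."""
    nodes = sorted({v for e in edges for v in e})
    communities = [{v} for v in nodes]
    q = modularity(edges, communities)
    while len(communities) > 1:
        best = None
        for i in range(len(communities)):
            for j in range(i + 1, len(communities)):
                merged = [c for k, c in enumerate(communities) if k not in (i, j)]
                merged.append(communities[i] | communities[j])
                q_new = modularity(edges, merged)
                if best is None or q_new > best[0]:
                    best = (q_new, merged)
        if best[0] <= q:   # Delta Q is no longer positive
            break
        q, communities = best
    return communities, q

# Hypothetical toy network: two triangles joined by a single link.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
communities, q = greedy_communities(edges)
```

On this toy network the procedure ends with the two triangles as separate clusters, since joining them would lower Q.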
A node in a cluster is characterized by its network centralities [21]. We calculate the following centralities.
Degree centrality: the number of links of a node.
Betweenness centrality: the number of shortest paths between node pairs that pass through a node.
Closeness centrality: the average shortest-path length from a node to the other nodes.
Pagerank centrality: the stationary distribution of the Markov chain corresponding to the stochastic transition matrix of a network.
Assuming that leadership is influenced by communication and trust in one’s social network [5], leadership roles are characterized with these centralities.
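The four centralities listed above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the graph is an adjacency dictionary in which every node appears as a key, betweenness uses Brandes' algorithm and counts ordered node pairs without normalization, closeness is reported as an average distance (lower means more central), and pagerank uses a damping factor of 0.85, which the paper does not specify.

```python
from collections import deque

def degree(adj):
    return {v: len(adj[v]) for v in adj}

def closeness(adj):
    """Average shortest-path length from each node to the nodes it can reach."""
    result = {}
    for s in adj:
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        reached = len(dist) - 1
        result[s] = sum(dist.values()) / reached if reached else float("inf")
    return result

def betweenness(adj):
    """Brandes' algorithm: unnormalized, counted over ordered node pairs."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist, preds, order = {s: 0}, {v: [] for v in adj}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):        # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

def pagerank(adj, damping=0.85, iters=100):
    """Power iteration toward the stationary distribution."""
    n = len(adj)
    rank = dict.fromkeys(adj, 1.0 / n)
    for _ in range(iters):
        dangling = sum(rank[v] for v in adj if not adj[v])
        new = dict.fromkeys(adj, (1 - damping) / n + damping * dangling / n)
        for v in adj:
            for w in adj[v]:
                new[w] += damping * rank[v] / len(adj[v])
        rank = new
    return rank

# Hypothetical three-person chain: a <-> b <-> c
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = {f.__name__: f(adj) for f in (degree, closeness, betweenness, pagerank)}
```

In the toy chain, node b lies on every shortest path between a and c, so it has the highest betweenness and pagerank, which is the kind of bridging position the interviews associate with hidden leadership.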
C Email network visualization
To visualize the large-scale network, we employ the force-directed GEM layout [22]. GEM optimizes for minimal node distances and constant edge lengths and, in turn, visualizes a network as a circle. This layout helps give an overview of the identified clusters in a network.
Proceedings of the 2010 IEEE ICMIT
The clustering process is visualized by a dendrogram, which is a tree diagram frequently used to illustrate the arrangement of the clusters produced by hierarchical clustering. A dendrogram helps show the hierarchical structure among clusters and therefore helps us understand how the identified communities are related to each other.
III RESULTS
We applied our method to actual email data from one firm. We collected two sets of one-month email logs, in September 2008 (data1) and in June 2009 (data2), in order to compare them chronologically and analyze any changes. Data1 includes the emails of 2,882 employees and data2 includes the emails of 2,459 employees. For reasons of privacy and complexity, we used only emails whose origin and destination were both internal to the firm.
Table I shows the properties of the network from data1. Each node has 51.77 links on average, and the whole network shows a power law in its degree distribution (see Fig 1). It also has “small-world” properties: the clustering coefficient is much larger than that of a random network (0.387 vs. 0.010), and the average path length is close to that of a random network (2.67 vs. 2.74) (see TABLE I).
Table II shows the properties of the network from data2. Each node has 36.151 links on average, and the whole network shows a power law in its degree distribution (see Fig 2). It also has “small-world” properties: the clustering coefficient is much larger than that of a random network (0.377 vs. 0.010), and the average path length is close to that of a random network (2.72 vs. 2.76) (see TABLE II).
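The two quantities used in the small-world comparison can be computed as in the sketch below. It assumes an undirected network stored as a dictionary mapping each node to the set of its neighbors; the function names and the toy graph are illustrative, and the paper does not state how its random-network reference values were generated.

```python
from collections import deque
from itertools import combinations

def average_clustering(adj):
    """Mean over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean BFS distance over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
C = average_clustering(adj)
L = average_path_length(adj)
```

A network is usually called small-world when C is much larger than in a random network of the same size and density while L stays comparable, which is the pattern reported in Tables I and II.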
We applied our algorithm as described above to identify the communities within the network. We obtained seven distinct clusters from data1, as shown in Fig 3. We also identified the hierarchical structure among the communities from data1, as shown in Fig 5. From data2, we obtained four distinct clusters, as shown in Fig 4. Likewise, we identified the hierarchical structure among the communities from data2, as shown in Fig 6.
We manually checked the division to which each employee in a cluster belongs. We found that each cluster nearly corresponded to one division or a combination of several divisions in the firm. We showed the results to people from the firm and conducted interviews. They agreed that both the identified communities and the hierarchical structure reflect the actual organizational structure of the firm. They pointed out that some identified clusters fit informal communities that play important roles in the management of the organization.
We also showed them the people who have high network centralities in each community. They recognized most of the people with high network centralities as key persons in the firm. However, they also found some people whom they had not expected to have high network centralities. In fact, further interviews revealed that such people have potential leadership for the management of the organization. In particular, we found that, among the centralities, both betweenness and pagerank are good indicators for detecting such hidden leadership.
Fig 1 Degree distribution of the email network (2008.09)
Fig 2 Degree distribution of the email network (2009.06)
TABLE I PROPERTIES OF THE EMAIL NETWORK (2008.09)
n        k         C                L
2,882    51.77     0.387 (0.010)    2.67 (2.74)
n: number of nodes, k: average number of links per node, C: clustering coefficient, L: average path length. Values in parentheses are for a random network.
TABLE II PROPERTIES OF THE EMAIL NETWORK (2009.06)
n        k         C                L
2,459    36.151    0.377 (0.010)    2.72 (2.76)
n: number of nodes, k: average number of links per node, C: clustering coefficient, L: average path length. Values in parentheses are for a random network.
Fig 3 Clusters of the email network (2008.09)
Fig 4 Clusters of the email network (2009.06)
Fig 5 Dendrogram of Clusters of the email network (2008.09)
Fig 6 Dendrogram of Clusters of the email network (2009.06)
IV DISCUSSION
A Small World
The email communication network maintains the properties of a scale-free and “small-world” network. The number of nodes (senders and receivers of emails) decreased by 14.7%, from 2,882 to 2,459, compared with the previous period in September 2008. The number of edges (email communication links between nodes) decreased by 40.4%, from 74,601 to 44,448. The average degree (the average number of people with whom a node communicates by email) decreased by 30.2%, from 52 to 36. The clustering coefficient, the tendency to group together, decreased by 2.5%, while the average path length increased by 2.1%.
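The reported changes in nodes, edges, and average degree can be reproduced from the counts given above with a short calculation. The sketch assumes that each link is counted once and contributes to the degree of two nodes, which is an inference from the fact that 2l/n matches the average degrees in Tables I and II; the helper names are illustrative.

```python
def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

def average_degree(num_edges, num_nodes):
    # Assumption: each link adds one to the degree of both of its endpoints.
    return 2.0 * num_edges / num_nodes

nodes_2008, nodes_2009 = 2882, 2459
edges_2008, edges_2009 = 74601, 44448

k_2008 = average_degree(edges_2008, nodes_2008)
k_2009 = average_degree(edges_2009, nodes_2009)

changes = {
    "nodes": round(percent_decrease(nodes_2008, nodes_2009), 1),
    "edges": round(percent_decrease(edges_2008, edges_2009), 1),
    "average degree": round(percent_decrease(k_2008, k_2009), 1),
}
print(round(k_2008, 2), round(k_2009, 3), changes)
```

The clustering-coefficient and path-length percentages are not recomputed here, because the values printed in the tables are rounded and may not reproduce the reported figures exactly.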
The results indicate a drastic reduction in the number of email users and changes in email behavior among employees. The amount of email communication in the organization was reduced, and the scope of communication became more focused than in the previous period.
The interviews with the managers revealed that the company offered a voluntary early retirement program for highly paid seniors and managers to improve the company’s income statement. Consequently, the organization was slimmed down and restructured. The concurrent reduction of both overtime and the number of workers put employees under time pressure to send fewer emails. In the past, the seniors and managers who later retired early had to be included in the communication network. Their early retirement reduced both direct emails and carbon copies sent for bureaucratic reasons.
The top management realized its intention of higher productivity by reducing inputs of management resources, even though sales decreased radically during the global recession. The analysis shows that the organization as a whole met the challenge of time constraints partly through a radical reduction of the time spent on email and on preparing attachments.
The email network with clusters represents one aspect of the organizational reality. The number of clusters decreased from 7 to 4. The previous cluster C was merged with A, forming a cluster of 770 nodes. The previous D remains as the smallest cluster, D, with 19 nodes, and the previous cluster B remains as the present B, with 265 nodes. The previous clusters E and G were merged with cluster F, forming the largest present cluster, F, with 1,405 nodes.
According to the dendrogram analysis, the previous clusters A and B had a stronger tie with each other. However, after the major organizational restructuring, clusters B and F are now closer. Because cluster D supplies parts to cluster A, the two remain closely related.
In the previous section, we observed the productivity increase of the new organization after the major change. On the other hand, we can infer that lower-priority emails decreased and were displaced by necessary work-related emails. The majority of the informal layer of communities was removed from the email communication network. Under these assumptions, the current communication network with clusters represents a largely job-related communication network.
According to the interviews, the top management aimed at integrating headquarters with the business divisions. The largest cluster, F, demonstrates the integration of headquarters functions with one business division and its business branches. Physical locations and peer relationships became less significant than work relationships in the dendrogram. This is evidenced by the merger of clusters E and G with cluster F.
Although clusters A and B are physically close, in business terms cluster B is now closer to cluster F. Meanwhile, the independence of cluster A as a self-sufficient organization became more pronounced. Observing these changes in the clusters is meaningful for evaluating the top management’s intentions behind the organizational change and the reshuffle of managers.
Email network analysis of the communication network with clusters provides the top management with a rather objective picture of the organization before and after the organizational changes as well as the changes in its environment. This is powerful feedback for the top management team when evaluating the status of the organization and the performance of its strategies. As changes in business circumstances become faster and more turbulent, quick feedback and a chronological database can assist organizational leaders in managing effectively.
C Individual Centralities
None of the top 30 employees in the previous pagerank or betweenness lists was ranked within the current top 30. In other words, the people with high scores for communication importance and bridging were replaced by new groups. On the other hand, according to the interviews, pagerank and betweenness were still indicators of potential leaders. The communication structures changed dynamically through the major restructuring.
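A comparison of two ranked lists like the one described above can be sketched as a set intersection of the top-k entries. The scores and employee identifiers below are hypothetical placeholders, not data from the study, and the function names are illustrative.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring people."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def top_k_overlap(before, after, k=30):
    """People who appear in the top k of both periods."""
    return top_k(before, k) & top_k(after, k)

# Hypothetical pagerank scores keyed by anonymized employee id.
before = {"e1": 0.30, "e2": 0.25, "e3": 0.10, "e4": 0.05}
after = {"e3": 0.40, "e4": 0.35, "e5": 0.20, "e1": 0.05}

overlap = top_k_overlap(before, after, k=2)
```

An empty overlap, as in this toy case with k=2, corresponds to the situation reported for the top 30 lists, where none of the earlier top-ranked people remained in the later list.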
The managers told us in the interviews that, a year earlier, the degree centrality was higher for administrative assistants, office clerks, and people who had established their own informal networks over long careers in the organization. However, the analysis of data2 showed that the degrees of new divisional managers were higher. Expatriates who had returned from overseas and young analytic engineers had higher scores than before. Owing to the early retirement of seniors in their 50’s, the new organizational communication network shifted in the healthy direction that the top management intended.
The innovative leaders have created an environment for knowledge interaction through communication among their members, as well as an ecosystem of knowledge creation. As the cores of the communication network clusters, they have managed effective communication through their strong visions of organizational success. Visualizing such leaders and their communication patterns as managers of successful teams has helped the top management design and implement its strategy for innovation management.
D Future Study
In the near future, we plan to narrow our focus to engineering groups and their interactions with the entire organization. Recently, to stimulate innovation, the top management established an internal engineering community program within the engineering organization. An engineering community is a group of engineers working on one technological element across the organization. The members of these cross-divisional communities range from novices to experienced specialists. The top management needs to evaluate the activities of the leaders and their communities. We are going to analyze the email communication networks, with clusters and network centralities, of the engineering communities. We also plan to compare innovative activities before and after the introduction of the engineering community program.
V CONCLUSION
We observed, through network analysis of email log data, the chronological effects of managerial decisions on organizational change. Changes in the characteristics of communities are brought into clear relief by cluster analysis. Leadership roles did not change between the before and after analyses, while leaders reduced their influence as bridges among communities.
Our method helps management systematically view its organization as a whole by using email network analysis. Email network analysis can be used to evaluate the communication and interactions among members. It also helps identify candidate leaders who act as hubs of the information channels in the communication network.
A formal organization can be evaluated in terms of its informal communities before and after major organizational changes. Traditional interviews and questionnaires exist to capture the state of an organization. Email network analysis provides one more significant, objective, and analytical tool in a manager’s tool box.
REFERENCES
[1] J. Mori, H. Tashiro, K. Haraoka, and K. Matsushima, “Identifying Informal Communities and Leaders for Total Quality Management using Network Analysis of Email,” in Proc. of the International Conference on Industrial Engineering and Engineering Management (IEEM), 2009.
[2] B. A. Huberman and T. Hogg, “Communities of Practice: Performance and Evolution,” Computational and Mathematical Organization Theory, Vol. 1, pp. 73-92, 1995.
[3] D. Crane, Invisible Colleges: Diffusion of Knowledge in Scientific Communities. University of Chicago Press, Chicago, 1972.
[4] J. Lave and E. Wenger, Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, 1991.
[5] Chen, J. Li, and H. Wang, “Structure and Dynamics of Distributed Leadership in the Perspective of Social Network Analysis,” in Proc. of the International Conference on Industrial Engineering and Engineering Management (IEEM), 2008.
[6] T. Allen, Managing the Flow of Technology. MIT Press, 1984.
[7] L. Garton, C. Haythornthwaite, and B. Wellman, “Studying online social networks,” Journal of Computer-Mediated Communication, Vol. 3, No. 1, 1997.
[8] B. Wellman, “Computer Networks As Social Networks,” Science, Vol. 293, No. 14, 2001.
[9] L. A. Adamic and E. Adar, “Friends and Neighbors on the Web,” Journal of Social Networks, Vol. 25, No. 3, 2002.
[10] S. Whittaker and C. Sidner, “Email Overload: Exploring Personal Information Management of Email,” in Proc. of CHI ’96, pp. 276-283, 1996.
[11] N. Ducheneaut and V. Bellotti, “A Study of Email Work Processes in Three Organizations,” Journal of CSCW, 2002.
[12] M. F. Schwartz and D. C. M. Wood, “Discovering Shared Interests Among People Using Graph Analysis,” Communications of the ACM, Vol. 36, No. 8, pp. 78-89, 1992.
[13] A. Culotta, R. Bekkerman, and A. McCallum, “Extracting social networks and contact information from email and the Web,” First Conference on Email and Anti-Spam, 2004.
[14] R. S. Burt, “Models of Network Structure,” Annual Review of Sociology, Vol. 6, pp. 79-141, 1980.
[15] W. R. Scott, Organizations: Rational, Natural, and Open Systems. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1992.
[16] J. R. Tyler, D. M. Wilkinson, and B. A. Huberman, “Email as spectroscopy: Automated discovery of community structure within organizations,” The Information Society, Vol. 21, No. 2, pp. 143-153, 2005.
[17] J. Diesner and K. M. Carley, “Exploration of communication networks from the Enron email corpus,” Proceedings of Workshop on Link Analysis, Counterterrorism and Security, SIAM International Conference on Data Mining, pp. 3-14, 2005.
[18] M. E. J. Newman, “Fast algorithm for detecting community structure in networks,” Physical Review E, Vol. 69, art. no. 066133, 2004.
[19] M. E. J. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Physical Review E, Vol. 69, art. no. 026113, 2004.
[20] R. Guimerà, M. Sales-Pardo, and L. A. N. Amaral, “Modularity from fluctuations in random graphs and complex networks,” Physical Review E, Vol. 70, art. no. 025101, 2004.
[21] L. C. Freeman, “Centrality in Social Networks: Conceptual Clarification,” Social Networks, Vol. 1, pp. 215-239, 1978.
[22] A. Frick, A. Ludwig, and H. Mehldau, “A fast adaptive layout algorithm for undirected graphs,” in R. Tamassia and I. G. Tollis, editors, Graph Drawing (Proc. GD ’94), volume 894 of Lecture Notes in Computer Science, pp. 388-403. Springer-Verlag, 1995.