MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
Nguyen Ngoc Tuan
RISK MANAGEMENT IN SOFTWARE PROJECT SCHEDULING
USING BAYESIAN NETWORKS
Major: Software Engineering Code No.: 9480103
PhD DISSERTATION ON SOFTWARE ENGINEERING
SUPERVISORS:
1. Assoc. Prof. Huynh Quyet Thang
2. Dr. Vu Thi Huong Giang
Hanoi — 2021
DECLARATION
I certify that this thesis and the work presented in it are the product of my own work, and that any ideas or quotations from other people's work, published or otherwise, are fully acknowledged in accordance with the standard referencing practices of the discipline.
This thesis has not been submitted for any degree or other purposes.
Acknowledgements
First of all, I would like to express my sincere gratitude to my first supervisor, Associate Professor Huynh Quyet Thang, for his invaluable guidance and support throughout my research. Professor Thang has supported me all the way, all the time. It is his patience that kept me committed to doing this research and reaching the end of my PhD studies. I am also very grateful to my second supervisor, Dr Vu Thi Huong Giang, whose bright hints and expertise have always been helpful to me.
My special thanks go to Ms Vo Thi Huang, Ms Bui Thi Quynh Nga, Mr Tran Trung Hieu, Mr Tran The Anh, Mr Tran Bao Ngoc and Mr Cao Manh Quyến, who were master's and bachelor's students at the School of ICT, Hanoi University of Science and Technology, and helped me with building the tools as well as testing our models.
I am also indebted to Dr Nguyen Thanh Nam (former CEO of FPT and former President of FSOFT), Mr Luu Quoc Tuan (Tinh Van Outsourcing Jsc.), Mr Ngo Quang Vinh (Bvizi), and Mr Nguyen Huy Binh (FIS), who provided helpful real software project data and valuable expert judgments on the data.
Finally, my greatest appreciation goes to my family, especially my wife, Tran Thi Bich Ngoc. Without their love, patience and sacrifice, this achievement would never have been possible.
Summary
Software project management is the art and science of planning and leading software projects. In the software industry, project managers mostly rely on their experience and skills to manage their projects and lack scientific tools to support them.
Risk management is a crucial part of software project management that helps prevent software disasters. In this research, risks are defined as uncertain events or conditions that, if they occur, would have a bad impact on one or more software project outcomes (cost, time, quality). Identifying and dealing with risks or uncertainty in the early phases of the software development life cycle would lessen long-term cost and enhance the chance of project success. The most important part of risk management is risk analysis, which assesses the risks and their impact on the outputs of the software project. To overcome subjective assessment based on the development team's experience, the team needs a quantitative risk analysis method.
Software project scheduling is one part of software project planning. Since, in practice, most software projects are over budget and behind schedule, software project scheduling needs careful consideration. This raises the following questions:
• How to schedule software projects better?
• How to better manage risks in software projects?
• How to quantitatively analyse risks?
Some researchers argue that Bayesian Networks can be used to quantify uncertain factors in (general) project scheduling and to improve project risk assessment and analysis. Our research aims to bring those advantages of Bayesian Networks into software project scheduling by addressing common software project features.
The research answers the above questions with probabilistic approaches and tools to assess the impacts of risk factors on software project scheduling: it proposes a list of common risk factors and a Bayesian Network model of these risk factors, and it proposes advanced scheduling methods that incorporate Bayesian Networks into popular scheduling techniques such as CPM, PERT and agile iteration scheduling. Bayesian Networks help quantify the factors, and hence help manage them better and enhance the predictability of what happens in the project.
This research first conducts a literature review on (general) project planning issues, project scheduling techniques, project scheduling tools, uncertainty and risk characteristics in software projects, risk management processes, and project risk analysis, in order to apply state-of-the-art techniques to software projects (Chapter 1).
After that, Bayesian Networks are applied in building and experimenting with risk factors in software project scheduling. The BRI (Bayes Risk-Impact) algorithm is proposed to assess risk factors' impact on software scheduling (Section 2.1). The first set of 5 risk factors is examined using CKDY, a self-built probabilistic tool, to analyse risks in software project scheduling (Section 2.2).
The research proposes an advanced algorithm for agile iteration scheduling using Bayesian Networks. The advantage of this method is that it provides both a schedule and the probability of finishing an agile iteration on time (Section 3.1). In addition, the author goes further with a more refined list of 19 risk factors in software scheduling and uses them in software scheduling methods. The research also incorporates Bayesian Networks into the CPM and PERT scheduling techniques for traditional software projects, together with the Bayesian Networks of common risk factors (Section 3.2 and Section 3.3). The list of 19 risk factors in agile software development is also examined in agile iteration scheduling (Section 3.4). The experimental results show that our models are reliable and that our approaches have practical implications, i.e., we can take advantage of Bayesian Networks in modelling and quantifying risks/uncertainty in software projects.
How to read this report?
The author highly recommends that you read this report from beginning to end. However, if at any point you want to look at specific pieces of information, the following guide could be helpful:
• To get the motivation, the overview of related work, the objectives, the scope, the hypothesis and the methodology of this research, please go to the Introduction section.
• To get an overview of software project scheduling and risk management in software project scheduling, please go to Sections 1.1, 1.2 and 1.3.
• To get an overview of Bayesian Networks, please go to Section 1.4.
• To get details on the main contributions and key findings of the research, please read Chapter 2 and Chapter 3.
• To get information on common risk factors in software project scheduling, you can have a look at Section 2.3.
• Chapter 2 is about building tools and doing experiments on applying Bayesian Networks to risk management in software project planning (Section 2.1) and on some key risk factors (Section 2.2).
• Chapter 3 is about incorporating Bayesian Networks and common risk factors into software project scheduling techniques such as CPM (Section 3.2), PERT (Section 3.3) and Agile software development scheduling (Section 3.4).
• To get to know the conclusions, the limitations and the further research directions of the study in this PhD thesis, please read the Conclusion section.
How to read this report?
List of symbols and abbreviations
1.2.3 Agile software project scheduling
1.3 Risk management in software project scheduling
1.3.1 Overview of project risk management
1.3.2 Project risk analysis
Chapter 2 Common risk factors and experiments on Bayesian Networks and software project scheduling
2.1.3 Risk impact calculation
2.1.4 Bayesian Risk Impact algorithm
2.1.5 Tool and experiments
2.2 Experiments on common risk factors
2.2.1 Discovering the top ranked risk factors
2.3 Proposed common risk factors in software project scheduling
2.4 Chapter remarks
Chapter 3 Incorporation of Bayesian Networks into software project
3.1.4 Tool and experimental results
3.2 Incorporation of Bayesian Networks into CPM
List of symbols and abbreviations
RAMP: Risk Analysis and Management for Projects
List of tables
Table 1.1 Basic mathematical notations used for CPM calculation
Table 1.2 The differences between waterfall and agile projects
Table 2.1 Hui and Liu's common risk factors [9]
Table 2.2 Risk factors in the phases
Table 2.3 Risk factors, consequences
Table 2.4 Examples of risk factors and probabilities
Table 2.5 Probability of risk factors in the whole project with data set 1
Table 2.6 Probability of risk factors in the whole project with data set 2
Table 2.7 Probability of the experimental risk factors to compare with MSBNx
Table 2.9 List of 19 common risk factors for software project scheduling
Table 2.10 List of 5 risk factors for software project scheduling in Section 2
Table 2.11 List of 19 risk factors in iteration scheduling
Table 3.1 The first data sample
Table 3.2 The probability table for tasks and resources
Table 3.3 Risk factors analysis
Table 3.4 Data sample 1
Table 3.6 Task attributes of the first data sample
List of figures
Figure 1.1 Activities of project management according to PMBOK Guide
Figure 1.2 CPM parameters in an activity
Figure 1.3 An example of BN which represents a simple case
Figure 2.1 A sub BN for the risk factor “Staff experience shortage”
Figure 2.2 A sub BN for the risk factor “Reliance on few key person
Figure 2.3 A sub BN for the risk factor “Schedule pressure”
Figure 2.4 A sub BN for the risk factor “Low productivity”
Figure 2.5 A sub BN for the risk factor “Lack of staff commitment”
Figure 2.6 A sub BN for the risk factor “Lack of client support”
Figure 2.7 A sub BN for the risk factor “Lack of contact person competence”
Figure 2.8 A sub BN for the risk factor “Lack of quantitative historical data”
Figure 2.9 A sub BN for the risk factor “Inaccurate cost estimating”
Figure 2.10 A sub BN for the risk factor “Large and complex external interface”
Figure 2.11 A sub BN for the risk factor “Large and complex project”
Figure 2.12 A sub BN for the risk factor “Unnecessary features”
Figure 2.13 A sub BN for the risk factor “Creeping user requirement”
Figure 2.14 A sub BN for the risk factor “Unreliable subproject delivery”
Figure 2.17 A sub BN for the risk factor “Lack of organization maturity”
Figure 2.18 A sub BN for the risk factor “Immature technology”
Figure 2.19 A sub BN for the risk factor “Inadequate configuration control”
Figure 2.20 A sub BN for the risk factor “Excessive paperwork”
Figure 2.21 A sub BN for the risk factor “Inaccurate metrics”
Figure 2.22 A sub BN for the risk factor “Excessive reliance on a single process
Figure 2.23 A sub BN for the risk factor “Lack of experience with project
A simple example of Bayesian inference
The three nodes of a simple-chain BN
Figure 2.28 The graphical interface of the tool
Figure 2.29 Result of experiment 1
Figure 2.30 Results of the three experiments
Figure 2.31 Experimental results for Software Design phase
Figure 2.32 Sub BN 1
Figure 2.33 Sub BN 2
Figure 2.35 Experiment with j30 with the early start schedule
Figure 2.36 Activity joint in the file j301_1rep
Figure 2.37 Diagram of probabilities of finishing phase by phase
Figure 3.2 Gantt chart for SPT strategy
Figure 3.3 A part of a BN for 19 risk factors
Figure 3.4 Task's parameters and connection to other tasks
Figure 3.5 A screenshot of RBCPM
Figure 3.6 A result for experiment with data sample 1
Figure 3.7 A result for experiment with data sample 2
Figure 3.9 Risk integration network model into PERT scheduling
Figure 3.11 The input screen of the RBPERT tool
Figure 3.12 The input file type of the RBPERT tool
Figure 3.16 A screenshot of tool BAIS
Introduction
Motivation
Projects in general always involve risks, and risks are a regular worry for project managers. In October 2008, the Hanoi Urban Railway Project Line 2A (Cat Linh-Ha Dong) was approved for investment with a total budget of more than 8,700 billion VND (552 million USD). Since then, the project's investment has almost doubled to 868 million USD. It was scheduled to be put into service in 2013, but the project remains incomplete to date [1].
Software projects also have schedule risks and, as a consequence, budget or cost risks. For example, the project on the Vietnamese National Population Database was approved for investment in 2015 [2] and was planned to be finished in two years (2016 and 2017). However, the system could only be put into operation in February 2021. Another similar example is the project on the Vietnamese National Public Service Portal, which was planned to go public in September 2016 [3] but was only opened in December 2019. As a matter of fact, the majority of software projects the author has experienced in Vietnam are behind schedule (some of these projects will be examined in Chapter 2 and Chapter 3).
Even in developed countries, software projects face ongoing problems. For example, Universal Credit, the welfare payment system owned by the Central Government of the United Kingdom, started in 2013. The project schedule has slipped, with the final delivery date now expected to be 2021, although the system is gradually being introduced. In 2013, only one of four planned pilot sites went live on the originally scheduled date, and the pilot was restricted to extremely simple cases [4].
Many software projects have suffered from significant budget overruns together with a series of delays, which cause either temporary issues or permanent failures. For example, the Queensland Health Payroll System was launched in 2013 in what could be considered one of the most spectacularly over-budget projects in Australian history, coming in at over 200 times the original budget. Besides, in spite of promises that the new system would be fully automated, it required a considerable amount of manual operation [5]. Another example, this time of permanent failure, is the e-Borders project, an advanced passenger information programme which aimed to collect and store information on passengers and crew entering and leaving the United Kingdom. Started in 2007, the project had a series of delays and had to be cancelled in 2014 [5].
Some studies pointed out that most software projects (83.8%) are over budget or behind schedule and that 52.7% of software development projects deliver software with fewer features than originally specified [7, 8]. Statistics also show that 31.1% of development projects end up being cancelled or terminated prematurely. Among the completed projects, only 61% satisfy the originally specified features and functions [9]. In the software industry, one of the greatest challenges that development teams constantly face is to keep projects under control in terms of budget and schedule (development time frame). The activities of a software project are influenced by factors internal and external to the project organization that make it uncertain whether the project will achieve its objectives. The effect that this uncertainty has on the project's goals is called risk [10]. In other words, risk is an event or an uncertain condition that, if it occurs, will have a positive or negative effect on at least one of the project objectives [12]. In this thesis, risks are defined as uncertain events or conditions that, if they occur, would have a bad impact on one or more software project outcomes (cost, time, quality).
The above situation raises an important question: how can project risks be managed better in order to eliminate temporary issues as well as prevent failure?
The purpose of project management is to lead the project to success. A successful software project certainly relies on many factors (e.g., following appropriate processes and tasks, managing risks properly, etc.). Since risks are inevitable in projects, risk management has become an important part of project management. Although many researchers, experts and writers have proposed a variety of processes and techniques, project risk management (PRM) is still rapidly evolving, and handling risks in general projects as well as software projects remains a challenge.
An important component of PRM is risk analysis, which is also known as, or considered the same as, risk quantification. Risk analysis attempts to measure risks and their impacts on different project outcomes (i.e., time, cost, quality). Many software projects fail because project managers mostly plan based on their experience and lack scientific methods to support them. To overcome subjective assessment based on the development team's experience, the team needs a quantitative risk analysis method. Although various studies have proposed and examined a range of processes and techniques, and software project risk management is continuously evolving, handling uncertainty in increasingly complex real-world projects remains a challenge.
Aside from that, project scheduling (a part of project planning, an early phase of the software development life cycle) is concerned with the techniques that can be employed to manage the activities that need to be undertaken during the development of a project. There are various techniques for project scheduling, from simple and easily understandable ones such as Task List, Gantt Chart and Schedule Network Analysis, to more complicated ones like the Critical Path Method (CPM), the Program Evaluation and Review Technique (PERT), Monte Carlo Simulation (MCS) or Fuzzy Logic [10, 12, 13, 14].
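To make the deterministic baseline concrete, the following is a minimal Python sketch of the CPM forward and backward passes on a small task network. It is an illustration only, not the implementation used in this thesis: the function name `critical_path`, the task names and the durations (in days) are hypothetical, and the network is assumed to be acyclic.

```python
from collections import defaultdict


def critical_path(durations, predecessors):
    """Return (project_duration, earliest_start, latest_start, critical_tasks).

    durations:    task -> duration (hypothetical values, e.g. in days)
    predecessors: task -> list of tasks that must finish first
    """
    # Topological order by depth-first search (assumes an acyclic network).
    order, seen = [], set()

    def visit(task):
        if task in seen:
            return
        seen.add(task)
        for pred in predecessors.get(task, []):
            visit(pred)
        order.append(task)

    for task in durations:
        visit(task)

    # Forward pass: earliest start = latest earliest-finish among predecessors.
    es = {}
    for task in order:
        es[task] = max(
            (es[p] + durations[p] for p in predecessors.get(task, [])), default=0
        )
    project_duration = max(es[t] + durations[t] for t in order)

    # Backward pass: latest finish = earliest latest-start among successors.
    successors = defaultdict(list)
    for task, preds in predecessors.items():
        for pred in preds:
            successors[pred].append(task)
    ls = {}
    for task in reversed(order):
        latest_finish = min((ls[s] for s in successors[task]), default=project_duration)
        ls[task] = latest_finish - durations[task]

    # Tasks with zero slack (earliest start == latest start) are critical.
    critical = [t for t in order if es[t] == ls[t]]
    return project_duration, es, ls, critical


# Hypothetical four-task example.
durations = {"design": 3, "code": 5, "test_plan": 2, "test": 4}
predecessors = {"code": ["design"], "test_plan": ["design"], "test": ["code", "test_plan"]}
duration, es, ls, critical = critical_path(durations, predecessors)
print(duration, critical)
```

In this example the critical path is design, code, test, with a project duration of 12 days, while the test plan has 3 days of slack. Note that every duration here is a fixed number; CPM in this form says nothing about how likely the 12-day result is.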
Traditional project scheduling under risk/uncertainty has attracted growing research attention in the project management community. In some of the project management literature of the 1990s, “risk analysis” was equivalent to “the analysis of risk on project plan” [15]. This thesis focuses on modelling risks in software project time management (which is, of course, indirectly related to the other project outcomes, cost and quality). In other words, this thesis concentrates on quantitative risk analysis in software project scheduling.
The earliest studies incorporating uncertainty/risk in project scheduling were in the late 1950s by Malcolm et al. [16] and Miller [17]. Since then, a variety of techniques have been introduced, several tools have been developed, and many of them are widely used throughout different industries. However, they often fail to capture uncertainty properly and/or produce inaccurate, inconsistent and unreliable results, especially when applied to software projects, whose attributes differ from those of other traditional projects.
Project uncertainty has several aspects, not all of which can be categorized and treated as risks. Several authors, such as Ward and Chapman [18], argued that project risk management should focus on managing uncertainty and its various sources rather than emphasizing a set of possible events that might have bad impacts on project performance (i.e., it should attend more to uncertain aspects than to a fixed set of defined risks). However, since this thesis is about software projects, risks are considered and treated the same as uncertainty. Most quantitative techniques and methods in the current practice of project risk management are based on the “Probability Impact” concept, which has certain shortcomings in terms of risk analysis in project scheduling. More sophisticated methods and techniques are needed to address and manage important sources of uncertainty/risk.
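As a point of reference, the “Probability Impact” idea can be sketched in a few lines of Python: each risk is scored by its probability multiplied by its impact, and risks are ranked by that score. The function name `rank_by_exposure`, the probabilities and the impacts (in days of delay) below are hypothetical values chosen for illustration, not data from this thesis.

```python
def rank_by_exposure(risks):
    """Rank risks by exposure = probability * impact, highest first.

    risks: name -> (probability, impact in days of delay); values are hypothetical.
    """
    scored = [(name, probability * impact) for name, (probability, impact) in risks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


risks = {
    "staff experience shortage": (0.30, 10),
    "creeping user requirements": (0.50, 8),
    "schedule pressure": (0.20, 5),
}
ranking = rank_by_exposure(risks)
for name, exposure in ranking:
    print(f"{name}: {exposure:.1f} expected days of delay")
```

The sketch scores each risk in isolation, so it cannot express that one risk factor makes another more likely, nor how risks propagate through dependent tasks in a schedule.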
In the software industry, project scheduling also has to deal with the fact that resources such as people, time, technology and money are not always predetermined [19]. There are always risks in software project scheduling as well. In most projects, the activity (from now on considered the same as the “task” in software projects) times are not known for certain. Therefore, they may be treated as random variables.
Furthermore, Bayesian Networks (BNs) have attracted a lot of attention in different fields (construction, R&D, etc.) as a powerful approach for decision support under uncertainty. A BN is a graphical and mathematical model which offers a powerful, general and flexible approach for modelling risk and uncertainty. Its capability of modelling causality and conditional dependency between variables makes it well suited to capturing uncertainty in projects. Yet, BNs are rarely applied in project risk management in general, or in software project management and software project scheduling in particular.
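The following minimal Python sketch shows the kind of reasoning a BN supports, using a three-node network in which two risk factors influence whether a task is delayed. The network structure, the variable names and all probabilities are hypothetical, and inference is done by brute-force enumeration of the joint distribution, which is only practical for very small networks; it is not the algorithm or tool developed in this thesis.

```python
from itertools import product

# Hypothetical prior probabilities of two independent risk factors.
P_SHORTAGE = 0.3  # staff experience shortage
P_CREEP = 0.4     # creeping user requirements

# Hypothetical conditional probability table: P(delay | shortage, creep).
P_DELAY_GIVEN = {
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.1,
}


def joint(shortage, creep, delay):
    """Joint probability, factorised along the network's edges."""
    p = (P_SHORTAGE if shortage else 1 - P_SHORTAGE) * (P_CREEP if creep else 1 - P_CREEP)
    p_delay = P_DELAY_GIVEN[(shortage, creep)]
    return p * (p_delay if delay else 1 - p_delay)


def probability(query, evidence=None):
    """P(query | evidence), both given as dicts of variable name -> bool."""
    evidence = evidence or {}
    names = ("shortage", "creep", "delay")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # this assignment contradicts the evidence
        p = joint(**world)
        denominator += p
        if all(world[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


# Prediction: how likely is a delay before anything is observed?
print(probability({"delay": True}))
# Diagnosis: given that a delay occurred, how likely was a staff shortage?
print(probability({"shortage": True}, evidence={"delay": True}))
```

With these illustrative numbers the prior probability of a delay is 0.398, and observing a delay raises the probability of a staff experience shortage from 0.3 to about 0.54. The same model therefore answers both predictive and diagnostic questions, which a plain probability-times-impact score cannot do.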
The author of this thesis strongly believes that if we can identify and control risks at the early stages of a software development project, we can significantly increase the project's chance of success. Since it is not easy (or even possible) to control all of the problems or factors, this thesis focuses only on time factors, which relate to the software development schedule.
Therefore, this thesis aims at introducing an advanced approach and finding a better model for incorporating and managing uncertainty/risks in software project scheduling. The idea is to use BNs to perform well-known scheduling techniques such as CPM and PERT, as well as to model risk factors in software project scheduling. The proposed approach enriches the benefits of scheduling techniques by incorporating uncertainty/risk factors and adding the strong analytical power of BNs.
Related work
There has been various research on applying BNs to general projects. Khodakarami [19] applied BNs to general project scheduling with two case studies: aircraft design, and health and fitness center design and construction. Ali et al. [20] combined Monte Carlo Simulation and Bayesian Network methods to present a structure for assessing the aggregated impact of risks on the completion time of a construction project. Lee and Shin [21] proposed an application of BNs to the risk management of a shipbuilding project and proposed 26 risks. Sharma and Chanda [22] developed a BN model for predicting R&D project success, which also makes its assessment based on R&D project risk factors. Khodakarami et al. [23] also examined an approach to generate project schedules that incorporates risk, uncertainty, and causality using BNs. Their model empowered the traditional CPM to handle uncertainty, and they also provided explanatory analysis to elicit, represent, and manage different sources of uncertainty in project planning. Fenton and Neil [25] introduced AgenaRisk as a probabilistic tool based on BNs. Chang, Yu, and Cheng [26] proposed a risk-based Critical Path Scheduling Method based on 2 risk categories and 7 risk levels, which was applied to construction projects.
Regarding risk factors in software projects, Hui and Liu [9] selected 24 risk factors that may have potential impacts on the (whole) software project and applied BN properties in the calculation of impact in their project risk model. Kumar and Yadav [24] considered quantitative features and causal relationships among risk factors in software projects. They introduced a probabilistic approach to assess risks in software projects and proposed a list of 27 risk factors (for software projects). However, they analysed risks for whole software projects and did not focus on the scheduling and planning phases, which decide the success of projects. Adjusting Kumar and Yadav's method, this thesis proposes a list of the 5 most crucial risk factors and builds the tool CKDY to examine risks in software scheduling (Section 2.2).
There has been other research on BNs and software risk analysis. Hu et al. [27] studied causality analysis among risk factors and project outcomes for software development projects. For this purpose, they proposed a modelling framework based on BNs to deal with causality constraints in risk analysis. The developed framework can be used for discovering new causal relationships and validating existing relationships among risk factors and project outcomes. Anthony et al. [28] proposed a risk assessment model for decision-making in software management, which consists of the processes and components of risk assessment in three groups: operational risks, technical risks and strategic risks. Rai et al. [29] held that managing projects is managing risks and identified 43 risk indicators in Agile Software Development.
One notable piece of research is Akos Szoke's PhD dissertation in 2014, which proposed an optimized algorithm for agile software project scheduling [30].
As can be seen from the literature review, much research on software risk analysis focuses on finding the relationship between risk factors and software outcomes, but lacks a quantitative approach and causal relationships between risk factors [9, 24, 31, 32]. Other research pays attention to defining the quantitative approach and the causal relationships between risk factors, and assesses risks for the whole software project [33, 34], but does not pay enough attention to modelling risk factors from the scheduling (in the planning) phase, the phase that decides the later failure or success of the project.
J. Yong and Z. Zhigang [35] proposed a PERT Bayesian Network (PERN) model, with the modelling methodology and the conditional probability calculation method for different kinds of procedure arrangement (single-chain, centralized, distributed), and stated that the PERN model ensures the effectiveness of project schedule control and optimization. However, the research did not examine in depth the risk factors or other specific software features that can have impacts on the project schedule.
In addition, there is always a need for proper schedule control in software projects to determine the current status of the schedule, to know if the schedule has changed, and to embrace changes when they occur. In order to do that, influential factors that cause schedule changes need to be carefully considered.
In summary, current research related to this thesis addresses either risk management or assessment for the whole software project, or the scheduling of other kinds of projects (construction, building, R&D, etc.). There is a need for a probabilistic method for risk management in software project scheduling, and for a deeper examination of the risk attributes of software project scheduling.
Research scope
The research is about software projects (or software development projects), which have both common features and specific features in comparison to other types of projects (such as construction projects, R&D projects, etc.). Unfortunately, there has been little good research on applying probabilistic methods to software development projects. Therefore, this research first reviews the literature on general projects to look for approaches applied to them, and after that proposes an approach for software projects.
The scope of this research is risk management in software project scheduling. This is quantitative risk management, which concerns risks affecting the project schedule (or project time frame). In terms of project scheduling techniques, this thesis focuses on the most popular ones, such as CPM and PERT for traditional software development projects, as well as Agile software project scheduling.
Research objectives
The main objectives of this research are:
1) To find a quantitative method to better assess and analyse risks in software project scheduling. In order to achieve this objective, the research has to answer the following questions: what are the risk attributes of software project scheduling? How to manage risks in software project scheduling better? In other words, the research aims at analysing and modelling risks in software project scheduling.
2) To find a probabilistic method to improve well-known software project scheduling techniques, including techniques for both traditional software scheduling and agile software scheduling.
The proposed methods and models would enhance the risk management process through a quantitative assessment of risks' impact on software project scheduling. If this model and method are applied in practice, the author of this thesis expects that they would help predict and monitor the project schedule better, as well as support appropriate decisions.
Scientific and practical significance
The proposed methods and model would enhance the risk management process through a quantitative assessment of risks' impact on software project scheduling.
If this model and method are applied in practice, they would help predict and monitor the project schedule better, as well as support appropriate decisions.
Research hypothesis and methodology
The hypothesis of this thesis is that it is possible to use BNs to quantify uncertainty in software project scheduling and improve software project risk assessment.
Since there is very limited research on this topic, the research methodology comprises literature reviews that start from general project management to obtain the relevant ideas for software project management. Firstly, a literature review investigates the current state of project scheduling under uncertainty, which determines the need, scope and objectives of the new approach. Secondly, a literature review follows on the background, theory and application of BNs. This provides the conceptual and fundamental background for the new approach.
The research also examines the features of software projects, in both the waterfall model and the agile software development model. In order to handle risks in software project scheduling, the common risk factors also need to be examined.
Within the research, tools are built to validate the models and help software project managers in assessing risks and making appropriate decisions.
Expected results
Following the above methodology, the author expects to:
1) Apply Bayesian Networks to develop an algorithm and tool to assess the impacts of risks, and hence propose common risk factors in software project scheduling.
2) Apply Bayesian Networks to develop a probabilistic approach to enhance the common scheduling techniques (for both traditional software development and agile software development) in terms of risk management and predictability.
Structure of the thesis
An overview of the main chapters is as follows:
Chapter 1 briefly reviews software project scheduling and the software project risk management process, and explores the currently popular techniques in project scheduling.
Chapter 2 consists of initial attempts at applying BNs to risk management in software project scheduling, as well as experiments on common risk factors in software project scheduling. 19 common risk factors for both traditional software development projects and agile software projects are proposed.
Chapter 3 incorporates BNs into popular software project scheduling techniques, namely CPM, PERT and agile software scheduling. BNs are also applied in examining the relationships among the risk factors proposed in Chapter 2.
Chapter 4 concludes the thesis and points the way forward for future research.
Contributions and results of the research:
The research developed the algorithm BRI (Bayes Risk-Impact) and the tool CKDY to assess the impacts of risks, and hence proposes common risk factors in software project scheduling. Based on the literature review and experiments, the research has come up with 19 common risk factors in software project scheduling (for both agile and traditional development styles).
The research also proposes advanced scheduling methods in software project development. The methods are based on incorporating Bayesian Networks and common risk factor models into popular software scheduling techniques such as PERT, CPM, and Agile software development, with the examination of the model of 19 common risk factors. Tools have been built to experiment with the proposed scheduling methods and models. Experimental results show that the proposed methods and models are reliable and provide practical value to software development teams in analysing, monitoring and predicting risks and the project's chance of success.
Chapter 1 Overview of software project scheduling
and risk management
1.1 Software project management and software project
scheduling
1.1.1 Software project management
Software project management is an art and science of planning and monitoring
software projects. It refers to the branch of project management dedicated to the
planning, scheduling, resource allocation, implementation, tracking and delivery of software and web projects [36, 37].
There are various types of projects (R&D projects, construction projects, information system projects, software projects etc.) which are associated with different styles of management. Software project management is quite distinct from traditional or other project management. Firstly, software is developed, not manufactured. Therefore, the product (working software) is intangible and uniquely flexible. Secondly, software engineering is not recognized as an engineering discipline with the same status as mechanical, electrical engineering etc. Moreover, software projects have a unique lifecycle process that requires multiple rounds of testing, updating, and customer feedback. That software development process is not standardized. Lastly, most software projects are "one-off" projects. A software development team can only use similar experience, not the same experience or a repeated process.
Therefore, software project management is about the methodology to organize all activities related to the software. We always need project management since software projects always have constraints of budget and time frame.
Nowadays, most IT-related projects are managed in the agile style and software
is developed in groups, in order to keep up with the increasing pace of business and to iterate based on customer and stakeholder feedback. Besides being used in IT-related projects, the Agile style has also been increasingly used in other kinds of project management.
The project manager leads the project team and often plays the central role among the investors (or customers), the suppliers and the senior management of the organization. He or she makes sure the project complies with the constraints as well
as delivering the product (software) on time. Software project managers may have
to do any of the following tasks [37]:
- Planning and scheduling: This means putting together the blueprint for the entire project from ideation to fruition. It will define the scope, allocate necessary resources, propose the timeline, delineate the plan for execution, lay out a communication strategy, and indicate the steps necessary for testing and maintenance.
- Leading: A software project manager will need to assemble and lead the project team, which likely will consist of developers, analysts, testers, graphic designers, and technical writers. This requires excellent communication, people and leadership skills.
- Execution: The project manager will participate in and supervise the successful execution of each stage of the project. This includes monitoring progress, frequent team check-ins and creating status reports.
- Time management: Staying on schedule is crucial to the successful completion of any project, but it is particularly challenging when it comes to managing software projects because changes to the original plan are almost certain to occur as the project evolves. Software project managers must be experts in risk management and contingency planning to ensure forward progress when roadblocks or changes occur.
- Budget: Like traditional project managers, software project managers are tasked with creating a budget for a project, and then sticking to it as closely as possible, moderating spend and re-allocating funds when necessary.
- Maintenance: Software project management typically encourages constant product testing in order to discover and fix bugs early, adjust the end product to the customer's needs, and keep the project on target. The software project manager is responsible for ensuring proper and consistent testing, evaluation and fixes.
According to the Project Management Institute (PMI) in the Project Management Body
of Knowledge (PMBOK) guide [11], project management includes five stages or process groups: Initiating, Planning, Executing, Monitoring and Controlling, and Closing (Figure 1.1).
In modern software project planning, the two essential tasks are project risk management and project scheduling. They play crucial roles in making sure the project is effectively and efficiently organized, including resource (hardware, software, and network) allocation, task and personnel assignment, and monitoring [11, 14]. Software projects are quite different from other projects: since software requirements are continuously changing (during the software development life cycle), software projects are often behind schedule and over budget. Moreover, in reality, many software project managers either ignore risk management or do not carry it out appropriately. This leads to project failure or to customer complaints about the quality, the schedule or the budget overrun of the project. Some other project managers are aware of risk management, but they rely only on their own team's skills or experience, even if they follow the capability maturity models CMM/CMMI (Capability Maturity Model Integration) or PMP (Project Management Professional) [38]. As can be seen in Figure 1.1, risk management affects all the processes in the Process Groups. In addition, project teams could adjust or update the planning process while they are executing, monitoring and controlling their projects.
Figure 1.1 Activities of project management according to the PMBOK Guide
1.1.2 Software project scheduling
Software project scheduling is one of the most demanding tasks for software project managers. It is all about resource allocation during the project life cycle. In
simple words, software project scheduling is splitting the whole project into smaller
tasks and estimating the required time and resources to complete each task. Software development teams normally try to organize tasks concurrently to make optimal use
of the workforce, as well as minimizing task dependencies to avoid delays caused by
one task waiting for another to complete. In reality, software project scheduling is dependent on project managers' intuition and experience.
In a real-life software project, a schedule is represented as a set of activity diagrams (Work Breakdown Structure, Activity Charts) which clarify the dependencies between activities (tasks) and personnel assignment.
1.2 Software project scheduling methods and techniques
1.2.1 Overview
There are many popular techniques for project scheduling, including:
- Graphical representations used to illustrate the project schedule, such as:
  - Work Breakdown Structure: shows the project breakdown into tasks
  - Activity Charts: show task dependencies and the critical path
  - Gantt Charts: bar charts that show the schedule against calendar time
- Critical Path Method (CPM) [14, 19, 23, 39]
- Program Evaluation and Review Technique (PERT) [16, 17, 19, 40]
Project scheduling (especially under uncertainty) is the most widely studied
area of risk quantification in project management. Producing a reasonable and
reliable project schedule is one of the crucial tasks of project managers. Moreover, having a realistic schedule for the project is one of the most cited factors of project
success [41]. Several techniques are proposed for modelling risk and uncertainty in
project scheduling [14, 40, 42].
This section reviews some notable techniques. CPM and PERT are the classical approaches for project scheduling. Simulation-based techniques are a more modern approach that is adopted by many project management software tools and that some argue is the best practice available. Alternative approaches, namely the Critical Chain Method and Fuzzy logic, will be reviewed briefly. Last but not least, scheduling techniques and methods for agile software development will also be discussed.
1.2.2 Traditional scheduling methods and techniques
a) Critical Path Method (CPM)
Critical Path Method (CPM) is one of the most famous techniques in project scheduling. Developed in 1957 by DuPont, CPM has become the standard technique
in project management and most project management tools support CPM
calculation [39]. According to Pollack-Johnson and Liberatore [43], almost 70% of project managers or professionals use CPM. CPM calculation includes the following steps:
1) Specify the individual activities using a work breakdown structure.
2) Determine the sequence of those activities and the dependencies between them.
3) Draw a network diagram (that models the activities and their dependencies).
4) Estimate the completion time (duration) for each activity.
5) Identify the critical path (the longest-duration path through the network, which determines the shortest possible project duration).
6) Update the CPM diagram as the project progresses.
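The basic mathematical notations used for CPM calculation are shown in Table 1.1. In fact, the parameters D, ES, EF, LF, LS are commonly used in scheduling [19]. The connections among these parameters in an activity are described in Figure 1.2.
Table 1.1 Basic notations used for CPM calculation
Notation | Meaning | Calculation
$D_i$ | Duration of activity i | estimated for each activity
$ES_i$ | Earliest start of activity i | $ES_i = \max[EF_j \mid j \text{ is one of the predecessors of } i]$
$EF_i$ | Earliest finish of activity i | $EF_i = ES_i + D_i$
$LF_i$ | Latest finish of activity i | $LF_i = \min[LS_k \mid k \text{ is one of the successors of } i]$
$LS_i$ | Latest start of activity i | $LS_i = LF_i - D_i$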
CPM is therefore a deterministic model which uses a fixed time estimate for activities. Although CPM ("pure deterministic in nature" [25]) was not developed to handle or quantify uncertainty, it does provide very useful information about
relations between activities, activity times and the overall project schedule (so that
project scheduling can be controlled).
Figure 1.2 CPM parameters in an activity
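The CPM calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tool developed in this research: the function name `cpm`, the four-activity network and its durations are hypothetical, and the network is assumed to be an activity-on-node graph given as a predecessor list.

```python
from collections import defaultdict


def cpm(durations, predecessors):
    """Forward and backward pass over an activity-on-node network."""
    successors = defaultdict(list)
    indegree = {act: 0 for act in durations}
    for act, preds in predecessors.items():
        for pred in preds:
            successors[pred].append(act)
            indegree[act] += 1

    # Topological order (Kahn's algorithm) so predecessors are visited first.
    order = []
    ready = [act for act in durations if indegree[act] == 0]
    while ready:
        act = ready.pop()
        order.append(act)
        for succ in successors[act]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)
    if len(order) != len(durations):
        raise ValueError("the activity network contains a cycle")

    # Forward pass: ES is the largest EF among predecessors, EF = ES + D.
    es, ef = {}, {}
    for act in order:
        es[act] = max((ef[p] for p in predecessors.get(act, [])), default=0)
        ef[act] = es[act] + durations[act]
    project_end = max(ef.values())

    # Backward pass: LF is the smallest LS among successors, LS = LF - D.
    ls, lf = {}, {}
    for act in reversed(order):
        lf[act] = min((ls[s] for s in successors[act]), default=project_end)
        ls[act] = lf[act] - durations[act]

    # Critical activities have no slack between earliest and latest start.
    critical = [act for act in order if es[act] == ls[act]]
    return es, ef, ls, lf, critical


# Hypothetical example network (durations in days).
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
es, ef, ls, lf, critical = cpm(durations, predecessors)
print("critical activities:", critical, "project duration:", ef["D"])
```

In this example network, activity B can slip by two days without delaying the project, while any delay on A, C or D delays the finish date.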
b) Program Evaluation and Review Technique (PERT)
PERT was introduced in 1957 by the US Navy as one of the earliest pieces of research incorporating risk in project management [17, 19]. A special feature of PERT is its ability to handle uncertainty in activity duration. This means that if there is a variation in the time estimate of an activity, it may affect the whole project. The PERT methodology was developed to help complete the project successfully when the time estimate is not definitive.
In order to do that, instead of the single estimate used in CPM, PERT assigns a beta probability distribution to each project activity. Three time estimates (optimistic, most likely, and pessimistic) are obtained and used to estimate the expected time and the standard deviation for an activity i.
Optimistic time estimate is the estimate determined considering all favorable conditions, i.e. in the best-case scenario or when everything goes right. In other
words, this is the shortest time in which the activity may be completed.
Most likely time estimate is the time duration where there is a high probability
of completing the activity within the given time duration. In other words, it is the estimate in case of normal problems or opportunities.
Pessimistic time estimate is the estimate determined when we consider all
unfavourable conditions, i.e. in the worst-case scenario or when everything goes completely wrong. In other words, this is the longest time the activity might require
to complete.
Expected time: $\mu_i = \dfrac{\text{Optimistic} + 4 \times \text{Most likely} + \text{Pessimistic}}{6}$
Standard deviation: $\sigma_i = \dfrac{\text{Pessimistic} - \text{Optimistic}}{6}$
The critical path is the sequence of project activities that determines the earliest
time by which the project can be completed, and its total duration determines the
completion date of the project. PERT assumes that only one path is the critical path and that the path does not change. Therefore, managers using PERT are advised to
focus on these critical activities to ensure the project completion date remains
unchanged. The expected value of a critical path is the sum of the expected values
of its activities, and the variance of the critical path is the sum of the variances of
all activities in the path. Based on this calculation, the probability that the project
will be completed by a certain date can be calculated. PERT is therefore somewhat
similar to CPM. The main difference is that each activity in a PERT network has a variance associated with its completion time. In other words, CPM is deterministic,
while PERT is to some extent probabilistic.
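As a hedged illustration of the PERT calculation, the sketch below computes the expected time and standard deviation of each activity and then the probability of finishing a critical path by a deadline. The use of a normal approximation for the path duration is an assumption commonly made in PERT practice, not something stated above; the function names and the three-activity path are hypothetical.

```python
from math import erf, sqrt


def pert_estimate(optimistic, most_likely, pessimistic):
    """Expected time and standard deviation of one activity."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    std = (pessimistic - optimistic) / 6
    return mean, std


def completion_probability(critical_path, deadline):
    """P(path duration <= deadline), assuming a normally distributed path duration."""
    estimates = [pert_estimate(*activity) for activity in critical_path]
    path_mean = sum(mean for mean, _ in estimates)       # sum of expected values
    path_var = sum(std ** 2 for _, std in estimates)     # sum of variances
    z = (deadline - path_mean) / sqrt(path_var)
    return 0.5 * (1 + erf(z / sqrt(2)))                  # standard normal CDF


# Hypothetical critical path: (optimistic, most likely, pessimistic) in days.
path = [(2, 4, 6), (3, 5, 13), (1, 2, 3)]
print(completion_probability(path, deadline=12))
print(completion_probability(path, deadline=14))
```

The expected duration of this hypothetical path is 12 days, so a 12-day deadline has an even chance of being met, and the probability increases as the deadline moves further out.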
c) Simulation-based techniques
Monte Carlo Simulation (MCS) was first proposed for project scheduling in the
early 1960s [44]. However, it was not until the 1980s, when sufficient computer
power became available, that simulation became the dominant technique for
handling risk and uncertainty in projects [45, 46]. In its simplest approach, MCS
uses the project activity diagram.
The duration of each activity is estimated by its shortest, most likely and longest
duration, together with the shape of the distribution (such as Normal, Beta etc.). Then the
critical path calculation is performed repeatedly, each time using random values
drawn from the activities' distribution functions.
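A minimal sketch of this simplest form of MCS is given below. It is an illustration only: the network, the three-point estimates and the deadline are hypothetical, and a triangular distribution is used in place of the Normal or Beta shapes mentioned above simply because it is available in the Python standard library.

```python
import random

# Hypothetical network: activity -> (predecessors, (shortest, most likely, longest)).
# Activities are listed in topological order.
NETWORK = {
    "A": ([], (2, 3, 5)),
    "B": (["A"], (1, 2, 4)),
    "C": (["A"], (3, 4, 8)),
    "D": (["B", "C"], (1, 1, 2)),
}


def project_duration(sampled):
    """Forward pass: earliest finish of the last activity for one set of durations."""
    finish = {}
    for act, (preds, _) in NETWORK.items():
        start = max((finish[p] for p in preds), default=0.0)
        finish[act] = start + sampled[act]
    return max(finish.values())


def simulate(runs=10_000, seed=1):
    """Repeat the critical path calculation with randomly sampled durations."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        # Each activity is sampled independently of the others.
        sampled = {
            act: rng.triangular(low, high, mode)
            for act, (_, (low, mode, high)) in NETWORK.items()
        }
        results.append(project_duration(sampled))
    return results


results = simulate()
deadline = 10
on_time = sum(d <= deadline for d in results) / len(results)
print(f"estimated P(duration <= {deadline}) = {on_time:.3f}")
```

Note that the sketch samples every activity independently, which is exactly the simplification made by the basic form of the technique.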
More advanced tools like PertMaster (Oracle Primavera Risk Analysis [47]) use a simulation-based approach not only for handling uncertainty in duration and cost, but also for providing a whole risk analysis process. They can link the project schedule to the risk register and apply simulation-based techniques to carry out probability-impact analyses.
A survey by the Project Management Institute [48] showed that nearly 20% of project management software packages support Monte Carlo Simulation. Another survey by Pollack-Johnson and Liberatore in 2003 [49] found that 17% of project managers used probabilistic analysis and/or simulation within project management
software.
However, simulation has its own drawbacks. One serious methodological flaw
in traditional MCS of project networks is the assumption of statistical independence
for individual activities which share risk factors in common with other activities
[43]. Most available simulation packages assume that the marginal distributions of
uncertainty for individual activities in the project completely define the multivariate distribution for the project schedule. It is intuitively obvious that this assumption is highly suspect for many projects which involve multiple activities of a similar type and/or have different activity types that are influenced by common risk factors. Van Dorp and Duffey in 1999 [50] demonstrated that failure to model such types of risk dependence during MCS can result in the underestimation of total uncertainty
in the project schedule. The most effective way to deal with statistical dependence is to
use a causal structure to explain it. MCS is not capable of modelling causal
structures.
Another weakness of MCS, explained by Williams [51], is the inability of
simulation to capture the actions taken by managers to recover any slippage in activity/project duration. MCS simply runs through a network assigning values to
random variables on each iteration. It ignores the fact that in reality, if an activity
was running late, management would take actions to affect the activity duration.
Uncertainty in an activity is usually the result of a chain of causes (sources) and can
be affected by a chain of actions (controls).
Furthermore, MCS is only as good as the information that is fed into it. If the
duration distributions of the project activities are incorrect or inadequate, the
simulation results are erroneous and invalid. In reality, the durations of most activities are estimated subjectively. In order to capture all aspects of uncertainty in activity (project) duration, various known and unknown sources of risk have to be addressed.
Therefore, MCS will not be applied as a scheduling technique in the scope of
this thesis.
no generally accepted computational approach available. Therefore, the fuzzy project-scheduling approaches have been kept in the academic sphere. A summary
of most of the published research works in fuzzy project scheduling can be found in the work of Bonnal et al. in 2004 [54].
1.2.3 Agile software project scheduling
From the late 1990s, several methodologies like RUP, XP, FDD, Scrum etc.
began to get increasing public attention and have become mainstream software development methods, especially in Vietnam where most software vendors are small and medium enterprises. These methods are representative of agile software development.
Agile software development methods (agile denoting "the quality of being agile; readiness for motion, nimbleness, activity, dexterity in motion" [55]) attempt
to offer an answer to the eager business community asking for lighter-weight along with faster and nimbler software development processes. This is especially the case with the rapidly growing and volatile Internet software industry as well as for the
emerging mobile application environment.
Agile development is a way of organizing the development process,
emphasizing direct and frequent communication (preferably face-to-face), frequent
deliveries of working software increments, short iterations, active customer
engagement throughout the whole development life-cycle, and change
responsiveness rather than change avoidance [56]. Thus, agile software development recognizes that software development is inherently a type of product
development and therefore a learning process. It is iterative, explorative and
designed to facilitate learning as quickly and efficiently as possible. Two of the most significant characteristics of agile approaches are: 1) they can handle unstable requirements throughout the development cycle; and 2) they deliver products in
shorter time-frames and under budget constraints when compared with traditional
development methods.
An agile approach can be seen as a contrast to (traditional) waterfall-like processes [57, 58, 59], which pay attention to thorough and detailed planning and design upfront and consecutive plan conformance. The waterfall model is the oldest and the most mature software development model [58]. In practice, the waterfall development model can be followed in a linear way, and an iteration in an agile method can also be treated as a miniature waterfall lifecycle.
Agile approaches have been widely employed in domains with a low cost of failure
or a linear incremental cost of failure [60]. Examples within this domain include web-based applications, mobile applications [55], Internet commerce, social networking, games development, and even some areas in government, finance and banking software development.
Table 1.2 summarizes some of the differences between waterfall and agile projects.
Table 1.2 The differences between waterfall and agile projects
Aspect | Waterfall | Agile
Product/scope | An often bloated product that is still missing features (i.e. rejected change requests or de-scoped to meet deadlines) | The best possible product according to the customer's own prioritization, incorporating learning from actual use (evolves with the increments)
Schedule/time | Deadlines are usually missed, and it is unlikely for a project to deliver early | Very high probability of meeting fixed date commitments; can often deliver early with the highest value
Quality | Defects must be tested extensively and expensively | Quality is built in, and is the key to productivity (writing tests before writing code)
Return/value creation | Revenue earning and value creation are delayed until the lowest priority features are implemented and delivered | Value is generated early, as soon as the minimum highest prioritized features are delivered; greater return on investment
Relationship to the customer | Contractual | Collaborative
Since agile software development is organized iteratively and incrementally, agile software scheduling is actually iteration scheduling. Iteration scheduling aims at determining a feasible and precise plan for the development that schedules the implementation of selected features within an iteration (i.e. assigning tasks to developers). Technical tasks (or Sprint backlog items in Scrum) are the main concepts of iteration scheduling. These tasks are the fundamental working units accomplished by one developer, and usually require some realization effort, in working hours, that is estimated by the team. The aim of iteration scheduling
is to break down selected requirements into technical tasks and to assign them to developers [61]. In that process, the development team also needs to take care of task dependencies (sequencing) and time constraints. The problem of optimized Agile iteration scheduling will be discussed in detail in Section 3.1.
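To make the idea of iteration scheduling concrete, the following sketch assigns technical tasks to developers with a simple greedy rule: each task goes to the developer who can start it earliest once its dependencies are finished. This is a hypothetical illustration only and is not the optimized method of Section 3.1; the task names, effort estimates and the function `schedule_iteration` are assumptions made for the example.

```python
def schedule_iteration(tasks, developers):
    """Greedy assignment of tasks to developers.

    tasks: name -> (estimated hours, list of dependencies), listed so that
    every dependency appears before the task that needs it.
    Returns name -> (developer, start hour, finish hour).
    """
    free_at = {dev: 0 for dev in developers}
    plan = {}
    for name, (hours, deps) in tasks.items():
        # A task cannot start before all of its dependencies have finished.
        ready = max((plan[dep][2] for dep in deps), default=0)
        # Pick the developer who can start this task earliest.
        dev = min(free_at, key=lambda d: max(free_at[d], ready))
        start = max(free_at[dev], ready)
        plan[name] = (dev, start, start + hours)
        free_at[dev] = start + hours
    return plan


# Hypothetical sprint backlog items with effort estimates in working hours.
tasks = {
    "design API": (4, []),
    "build UI": (6, []),
    "implement API": (8, ["design API"]),
    "integrate": (3, ["build UI", "implement API"]),
}
plan = schedule_iteration(tasks, ["dev1", "dev2"])
for name, (dev, start, finish) in plan.items():
    print(f"{name}: {dev}, hours {start}-{finish}")
```

A greedy rule of this kind respects sequencing but does not search for the best plan, which is why an optimized formulation is of interest.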
1.3 Risk management in software project scheduling
1.3.1 Overview of project risk management
Risk management has become an important part of project management and has attracted a wide range of research during the last two decades [5]. Since 1990, various Risk Management Processes (RMP) have been proposed. Probably the most
popular Project Risk Management Processes (PRMP) are Chapter 11 of the PMBOK
(Project Management Body of Knowledge) guide [11], the PRAM (Project Risk Analysis and Management) guide [62] and the RAMP (Risk Analysis and Management for Projects) guide [63]. Most organisations adopt one of these guides
or use them to develop their own process. This thesis does not intend to explore the
detailed differences between the guides since, apart from fundamental
differences in assumptions and methodologies [64], they all aim to capture risk and
uncertainty in the following three stages: Risk Identification, Risk Analysis and Risk Response.
The usual output of the risk identification stage is a document called the Risk Register. Many authors have discussed risk registers in their works [65]. Williams [66] stated two main roles for a risk register:
- A repository of a corpus of knowledge
- To initiate the analysis and plans that flow from it
Chapman and Ward [18] consider a risk register as documentation of the sources
of the risks, their responses and also risk classification. Ward [67] described the
purpose of a risk register as "to help the project team review project risk on a regular
basis throughout the project". Patterson and Neailey [68] presented a risk register
database system to aid managing project risk. Risk registers can be a good management tool during the course of a project. However, it is not possible to
identify all risks and capture all aspects of them. There are always unknown (i.e.
undiscovered, unattended or immeasurable) risks that often are more important than the identified risks in the risk register.
The Risk Analysis stage attempts to measure the risk and its impacts on different project outputs (i.e. cost, time, and performance). This stage is also known as
quantitative risk management. The likelihood that each identified risk will occur and also its possible impact on the project are estimated. The combination of the
risks, their probabilities and their impacts creates "probability-impact" (PI) matrices. Such a
matrix can be used to assign ranks to risks and then prioritise them. Most of the
available quantitative tools and techniques (simulation-based tools) use the
PI values to quantify uncertainty in projects. However, the use of PI matrices has some
important shortcomings [15].
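As an illustration of how a PI matrix is typically used to rank risks, the short sketch below scores each risk by the product of its probability and its impact and sorts the risks by that score. The risk names, probabilities and the 1 to 5 impact scale are hypothetical values chosen for the example, and the probability-times-impact score is one common convention rather than the only one.

```python
# Hypothetical risk register entries: (risk, probability of occurrence, impact on a 1-5 scale).
risks = [
    ("Requirements change", 0.6, 4),
    ("Staff turnover", 0.3, 5),
    ("Tool failure", 0.1, 2),
    ("Underestimated effort", 0.5, 3),
]


def pi_score(probability, impact):
    """Probability-impact score used to rank a risk."""
    return probability * impact


# Highest exposure first.
ranked = sorted(risks, key=lambda r: pi_score(r[1], r[2]), reverse=True)
for rank, (name, probability, impact) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: score {pi_score(probability, impact):.2f}")
```

Two risks with very different profiles (a likely, moderate risk and an unlikely, severe one) can receive the same score here, which hints at why such a ranking should be read with care.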
The Risk Response stage attempts to formulate management responses to the risk. Also known as "Risk Mitigation", it uses the results of the analysis stage in order to improve the chance of achieving the project objectives. Risk Response
is a decision-making process. A number of alternative strategies are available when planning risk responses, which can be described under one of the following strategies [69]:
- Avoid - seeking to eliminate uncertainty by reducing either the probability or the impact to zero
- Transfer - seeking to transfer ownership and/or liability to a third party (e.g. insurance)
- Mitigate - seeking to reduce the size of the risk exposure in order to make it more acceptable to the project or organization
- Accept - recognizing residual risks and responding either actively by allocating appropriate contingency, or passively by doing nothing except monitoring the status of the risk
There are several other publications with different perceptions of project risk
management processes. For example, Al-Bahar and Crandall [70], the UK Ministry
of Defence [71], del Caño and de la Cruz [72], Wideman [73], the British Standards
Institute (BSI) [74], NASA (Rosenberg et al. 1999) [75], the U.S. Department of
Defence [76], and the US Department of Transportation [77] suggest the use of processes with different stages or phases. Whichever risk management process is
adopted for managing risk/uncertainty, risk analysis always plays an important role
in the process.
1.3.2 Project risk analysis
The term risk analysis in the scope of this research is the same as quantitative
risk analysis and is related to risk measurement, as we focus on quantitative issues of
project risks. Project risk analysis is one stage of project risk management. In some literature, risk analysis is even synonymous with risk management.
In fact, risk analysis usually starts out with a qualitative analysis, and its
results support the decision-making process in the Risk Response stage. It is a continuous process that can be started at almost any stage in the duration of a
project. However, it is best to use risk analysis in the beginning stages of
projects (i.e. phases like feasibility study and planning) and continually update
it during the implementation phase. This can be done iteratively at intervals, which
also matches agile software development.
Risk analysis is the most "formal" aspect of the project risk management process [69], often involving sophisticated techniques and usually requiring computer software (or tools). Such techniques may be applied with various levels of effort depending on the available resources for the analysis and also on the details.
Risk analysis can bring certain benefits to a software project, including:
- Helping to make decisions and making more effective and efficient risk management possible
- Helping to make more feasible (realistic) plans, in terms of both duration and cost
- Helping to form statistical data on historical risks, which in turn benefits the planning and implementation of future projects
1.3.3 Unknown risks
One important category of uncertainty in projects is "Unknown Risks". These
are important sources of uncertainty because their impact on a project may
outweigh all other sources of risks.
Although unknown risks are thoroughly acknowledged (perhaps with different names) by several authors, none of the existing approaches for project scheduling is able to model and quantify this type of risk. The conventional "probability impact"
approach at best is only capable of modelling "known risk". Most of the current
quantitative techniques for risk analysis are event-oriented and more concerned about the "risk of something happening". They assume that a list of events (conditions)
that may take place is known, the impact of each risk on activity duration is also
known and even the nature of the response to each risk is roughly known [19].
However, unknown risks are unpredictable and immeasurable (their impacts are unknown or hard to quantify). Those risks require much effort to clarify. An example of unknown risks is Internally Generated Risk (IGR) [78]. As the name already reveals, IGRs originate from within the project team or organization: from rules, policies, regulations, structures, actions, behaviours or culture of the organization. IGRs have the following features:
- Common, since organizational issues such as policies, processes, culture etc. are widespread in most projects of the organization
- Important, since they often have an impact on more than one activity
- Not well-managed in projects, as they are unpredictable (and hardly put in documents or risk registers) and hard to quantify
1.3.4 Risk aspects in software project scheduling
In different project management processes there are different aspects of uncertainty/risk [23]. This thesis focuses on quantitative risk management which
concerns risks affecting the project schedule (or project time frame), including
risks affecting project scheduling (a phase or a process in project planning). As can
be deduced from the previous sections, these risks cannot be completely separated
from risks of other processes or phases.
In project scheduling, the most obvious risk is in duration estimation for a
particular activity. Difficulty in this estimation can arise from a lack of knowledge
of what is involved as well as from the uncertain consequences of potential threats
or opportunities. Some sources of uncertainty include:
- Level of available and required resources (including inexperienced developers or developers lacking training)
- Incomplete (or often changing) requirements
- Trade-off between resources and time
- Possible occurrence of uncertain events (especially those that cause bad impacts)
- Lack of previous experience and use of subjective instead of objective data
- Incomplete or imprecise data, or lack of data
- Uncertainty about the basis of subjective estimation (i.e. bias in estimation)
1.4 Bayesian Networks
1.4.1 Probabilistic approach using Bayesian Networks
A Bayesian Network (BN, also known as Bayesian Belief Network, Causal
Probabilistic Network, Probabilistic Cause-Effect Model, and Probabilistic
Influence Diagram) is a special type of graph that is associated with a set of
probability tables. A BN models causal relationships of a system or dataset and
provides a graphical representation of this causal structure through the use of directed acyclic graphs (DAGs) with nodes and edges. The DAG representation
provides a framework for inference and prediction. The nodes represent random
variables with probability distributions, while edges represent weighted causal
relationships between the nodes. Each node has a probability of having a certain value (from a finite set of mutually exclusive states). A directed edge exists from a parent
to a child. Each child node A has a conditional probability table P(A|B1, ..., Bn) based on its parental values B1, ..., Bn. If the node has no parents, then the table
reduces to the unconditional probabilities P(A) (i.e. prior probability).
BN is based on Bayes' Theorem, with the well-known formula relating conditional and joint probabilities:
$$P(R \mid S) = \frac{P(S \mid R)\,P(R)}{P(S)}$$
The above Bayes rule is interpreted in terms of updating the belief (the posterior probability of each possible state of a variable, that is, the state probabilities after considering all the available evidence) about a hypothesis R in the light of new evidence S. So, the posterior belief P(R|S) is calculated by multiplying the prior belief P(R) by the likelihood P(S|R) that S will occur if R is true, and normalizing by P(S) (see more about updating probability in Section 1.1.2).
We can re-arrange the formula for conditional probability to get the following formula, in the form of the product rule:
$$P(A, B) = P(A \mid B)\,P(B)$$
We can extend the above product rule to three variables:
$$P(A, B, C) = P(A \mid B, C)\,P(B, C) = P(A \mid B, C)\,P(B \mid C)\,P(C) \qquad (1.4)$$
And the generalized formula for n variables follows:
$$P(A_1, A_2, \dots, A_n) = P(A_1 \mid A_2, \dots, A_n)\,P(A_2 \mid A_3, \dots, A_n) \cdots P(A_{n-1} \mid A_n)\,P(A_n) \qquad (1.5)$$
Formulas 1.4 and 1.5 are often referred to as the "Chain Rule", which says that in a
BN the full joint probability distribution is the product of all conditional probabilities specified in the BN. These formulas are important for BNs since they provide a means of calculating the full joint probability distribution [9]. Many of the variables $A_i$ will be conditionally independent, which means that the formula can be simplified: each variable only needs to be conditioned on its parents in the graph.
BN allows an injection of probability distributions associated with individual
nodes. The initial probability distributions can be simply based on "expert opinions", surveys or other mathematical methods, i.e., the BN approach combines
expert opinions and mathematical calculations.
A BN consists of two parts: 1) the qualitative part represents the relationships among variables by a directed acyclic graph, and 2) the quantitative part specifies the probability distributions associated with every node of the model. Figure 1.3 shows a BN representing a simple case about the relationship between sub-contract, (team) staff quality and the possibility of delay in a task [23].
In the BN in Figure 1.3, the qualitative part consists of three nodes (representing uncertain variables) and two edges. Each node has a set of states. For example, the node Staff Quality has two states: "Good" and "Poor". The other part of the directed graph, the edges, represents influential relationships between variables. For instance, an observed event on Sub-contract and/or Staff Quality may lead to Delay
in Task.
For the quantitative part, there is a probability table associated with each node, providing the probabilities of each state of the variable. For nodes without parents
(i.e., prior nodes), the associated tables are not conditioned on the other variables and
are called prior probabilities or prior distributions, which represent prior belief. For example, for the node Staff Quality, P("Good") = 0.7 and P("Poor") = 0.3. For a
node with parents, the probability table has conditional probabilities for each
combination of the parents' states (for example, see the table for the node Delay in
Task in Figure 1.3).
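The two parts of such a BN can be sketched in code as follows. The sketch mirrors the three-node structure of Figure 1.3 and computes the joint distribution as the product of the node tables, the marginal probability of delay, and the updated belief about Staff Quality once a delay is observed. The Staff Quality prior and the 95 percent on-time figure for the sub-contract are the values quoted in this section; the state names of Sub-contract and Delay in Task and the conditional probability table for Delay in Task are hypothetical, since the table of Figure 1.3 is not reproduced here.

```python
from itertools import product

# Prior tables for the two parent nodes.
P_SUB = {"On time": 0.95, "Late": 0.05}   # state names are assumed
P_STAFF = {"Good": 0.7, "Poor": 0.3}      # prior quoted in the text

# Hypothetical conditional table: P(Delay in Task = "Yes" | Sub-contract, Staff Quality).
P_DELAY_YES = {
    ("On time", "Good"): 0.05,
    ("On time", "Poor"): 0.30,
    ("Late", "Good"): 0.60,
    ("Late", "Poor"): 0.90,
}


def joint(sub, staff, delay):
    """Joint probability as the product of each node's table given its parents."""
    p_yes = P_DELAY_YES[(sub, staff)]
    p_delay = p_yes if delay == "Yes" else 1 - p_yes
    return P_SUB[sub] * P_STAFF[staff] * p_delay


def marginal_delay():
    """P(Delay in Task = "Yes"), summing the joint over both parents."""
    return sum(joint(sub, staff, "Yes") for sub, staff in product(P_SUB, P_STAFF))


def posterior_staff_given_delay():
    """Bayes update: P(Staff Quality | Delay in Task = "Yes")."""
    p_delay = marginal_delay()
    return {
        staff: sum(joint(sub, staff, "Yes") for sub in P_SUB) / p_delay
        for staff in P_STAFF
    }


print("P(delay) =", round(marginal_delay(), 5))
print("P(staff | delay) =", posterior_staff_given_delay())
```

With these illustrative numbers, observing a delay raises the belief that staff quality is poor well above its prior of 0.3, which is the kind of backward reasoning from effect to cause that a BN supports.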
Bayesian inference is based on a conceptually simple collection of ideas. We
are uncertain about the value of a parameter. We can quantify our uncertainties
as subjective probabilities for the parameter (prior probability), and also conditional
probabilities for observations we might make given the true value of the parameter (likelihood function). When data arrives, Bayes' theorem tells us how to move from our prior probabilities to the new conditional probabilities for the parameter
(posterior distribution) [79]. For example, in Figure 1.3, a project manager is
analyzing the cause of delay in a particular task in a project. A part of the task is
done by a sub-contractor. Based on previous experience and the good reputation of
the sub-contractor, the project manager believes that the chance of delivering the
sub-contract on time is 95 percent. There is an 80 percent chance of delay in the
task if the sub-contractor fails to deliver on time. Even if the sub-contractor delivers
on time, there is still a 10 percent chance that the task is over schedule (as a result of