MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
Nguyen Ngoc Tuan
RISK MANAGEMENT IN SOFTWARE PROJECT SCHEDULING
USING BAYESIAN NETWORKS
PhD DISSERTATION ON SOFTWARE ENGINEERING
Hanoi – 2021
Major: Software Engineering
Code No.: 9480103
SUPERVISORS:
1. Assoc. Prof. Huynh Quyet Thang
2. Dr. Vu Thi Huong Giang
Acknowledgements
First of all, I would like to express my sincere gratitude to my first supervisor, Associate Professor Huynh Quyet Thang, for his invaluable guidance and support throughout my research. Professor Thang has supported me all the way, all the time. It is his patience that has kept me committed to this research and to reaching the end of my PhD studies. I am also very grateful to my second supervisor, Dr. Vu Thi Huong Giang, whose bright hints and expertise have always been helpful to me.
My special thanks go to Ms. Vo Thi Huong, Ms. Bui Thi Quynh Nga, Mr. Tran Trung Hieu, Mr. Tran The Anh, Mr. Tran Bao Ngoc and Mr. Cao Manh Quyen, who were master's and bachelor's students at the School of ICT, Hanoi University of Science and Technology, and helped me with building the tools as well as testing our models.
I am also indebted to Dr. Nguyen Thanh Nam (former CEO of FPT and former President of FSOFT), Mr. Luu Quoc Tuan (Tinh Van Outsourcing Jsc.), Mr. Ngo Quang Vinh (Evizi) and Mr. Nguyen Huy Binh (FIS), who provided helpful real software project data and valuable expert judgments on the data.
Finally, my greatest appreciation goes to my family, especially to my wife, Tran Thi Bich Ngoc. Without their love, patience and sacrifice, this achievement would never have been possible.
Summary
Software project management is the art and science of planning and leading software projects. In the software industry, project managers mostly rely on their experience and skills to manage their projects and lack scientific tools to support them.
Risk management is a crucial part of software project management that helps prevent software disasters. In this research, risks are defined as uncertain events or conditions that, if they occur, would have a negative impact on one or more software project outcomes (cost, time, quality). Identifying and dealing with risks or uncertainty in the early phases of the software development life cycle lessens long-term cost and enhances the chance of project success. The most important part of risk management is risk analysis, which assesses the risks and their impact on the outputs of the software project. To overcome subjective assessment based on the development team’s experience, the team needs a quantitative risk analysis method. Software project scheduling is one part of software project planning. Since in practice most software projects are over budget and behind schedule, software project scheduling needs careful consideration. We come up with the following questions:
How to schedule software projects better?
How to better manage risks in software projects?
How to quantitatively analyse risks?
Some researchers say that Bayesian Networks can be used to quantify uncertain factors in (general) project scheduling and improve project risk assessment and analysis. Our research aims to bring those advantages of Bayesian Networks into software project scheduling by addressing common software project features. The research answers the above questions by providing probabilistic approaches and tools to assess the impacts of risk factors on software project scheduling; proposing a list of common risk factors and a Bayesian Network model of these risk factors; and proposing advanced scheduling methods that incorporate Bayesian Networks into popular scheduling techniques such as CPM, PERT and agile iteration scheduling. Bayesian Networks help quantify the factors, and hence help manage them better and enhance the predictability of events in the project.
This research first reviews the literature on (general) project planning issues, project scheduling techniques, project scheduling tools, uncertainty and risk characteristics in software projects, risk management processes and project risk analysis in order to apply state-of-the-art techniques to software projects (Chapter 1). After that, Bayesian Networks are applied in building and experimenting with risk factors in software project scheduling. The BRI (Bayes Risk-Impact) algorithm is proposed to assess the impact of risk factors on software scheduling (Section 2.1). A first set of 5 risk factors is examined using CKDY, a purpose-built probabilistic tool, to analyse risks in software project scheduling (Section 2.2).
The research proposes an advanced algorithm for agile iteration scheduling using Bayesian Networks. The advantage of this method is that it provides both a schedule and the probability of finishing the agile iteration on time (Section 3.1). In addition, the author goes further with a more refined list of 19 risk factors in software scheduling and uses them in software scheduling methods. The research also incorporates Bayesian Networks into the CPM and PERT scheduling techniques for traditional software projects, together with the Bayesian Networks of common risk factors (Section 3.2 and Section 3.3). The list of 19 risk factors in agile software development is also examined in agile iteration scheduling (Section 3.4). The experimental results show that our models are reliable and our approaches have practical implications, i.e. we can take advantage of Bayesian Networks in modelling and quantifying risks/uncertainty in software projects.
How to read this report?
The author highly recommends that you read this report from beginning to end. However, if at any point you want to look up specific pieces of information, the following guide may be helpful:
To get the motivation, the overview of related work, the objectives, the scope, the hypothesis and methodology of this research, please go to the Introduction section
To get an overview of software project scheduling and risk management in software project scheduling, please go to Sections 1.1, 1.2 and 1.3
To get an overview of Bayesian Networks, please go to Section 1.4
To get details on main contributions and key findings of the research, please read Chapter 2 and Chapter 3
To get information on common risk factors in software project scheduling, you can have a look at Section 2.3
Chapter 2 is about building tools and doing experiments on applying Bayesian Networks to risk management in software project planning (Section 2.1) and on some key risk factors (Section 2.2)
Chapter 3 is about incorporating Bayesian Networks and common risk factors into software project scheduling techniques such as CPM (Section 3.2), PERT (Section 3.3) and Agile software development scheduling (Section 3.4)
To get to know the conclusions, the limitations and the further research directions of this PhD thesis, please read the Conclusion section
Content
Acknowledgements
Summary
How to read this report?
List of symbols and abbreviations
List of tables
List of figures
Introduction
Motivation
Related work
Research scope
Research objectives
Scientific and realistic meaning
Research hypothesis and methodology
Expected results
Structure of the thesis
Chapter 1 Overview of software project scheduling and risk management
1.1 Software project management and software project scheduling
1.1.1 Software project management
1.1.2 Software project scheduling
1.2 Software project scheduling methods and techniques
1.2.1 Overview
1.2.2 Traditional scheduling methods and techniques
1.2.3 Agile software project scheduling
1.3 Risk management in software project scheduling
1.3.1 Overview of project risk management
1.3.2 Project risk analysis
1.3.3 Unknown risks
1.3.4 Risk aspects in software project scheduling
1.4 Bayesian Networks
1.4.1 Probabilistic approach using Bayesian Networks
1.4.2 Bayesian Inference
1.4.3 Bayesian Networks and project risk management
1.5 Chapter remarks
Chapter 2 Common risk factors and experiments on Bayesian Networks and software project scheduling
2.1 Application of Bayesian Networks into schedule risk management in software project
2.1.1 Common risk factors in software project management
2.1.2 Bayesian Networks of risk factors
2.1.3 Risk impact calculation
2.1.4 Bayesian Risk Impact algorithm
2.1.5 Tool and experiments
2.1.6 Conclusion and contribution
2.2 Experiments on common risk factors
2.2.1 Discovering the top ranked risk factors
2.2.2 Tool CKDY
2.2.3 Experiments and analysis
2.2.4 Conclusion and contribution
2.3 Proposed common risk factors in software project scheduling
2.3.1 The 19 common risk factors in traditional software project
2.3.2 The 19 common risk factors in agile software project
2.4 Chapter remarks
Chapter 3 Incorporation of Bayesian Networks into software project scheduling techniques
3.1 Applying Bayesian Networks into specific software project development
3.1.1 Introduction
3.1.2 Optimized Agile iteration scheduling
3.1.3 Optimization model for Agile software iteration
3.1.4 Tool and experimental results
3.1.5 Conclusion and contribution
3.2 Incorporation of Bayesian Networks into CPM
3.2.1 The RBCPM Model
3.2.2 The RBCPM Method
3.2.3 Tool and experimental results
3.2.4 Conclusion and contribution
3.3 Incorporation of Bayesian Networks into PERT
3.3.1 Proposed model
3.3.2 Tool development and data collection
3.3.3 Experimental results and analysis
3.3.4 Conclusion and contribution
3.4 Incorporation of Bayesian Networks into Agile software development scheduling
3.4.1 Optimization model for Agile software iteration
3.4.2 Tool and experimental results
3.4.3 Conclusion and contribution
3.5 Chapter remarks
Conclusion
What has been done
Main contributions
Limitations
Further research
List of scientific publications
References
Index
List of symbols and abbreviations
No.  Abbreviation  Meaning
3    BAIS          Bayesian Agile Iteration Scheduling
7    CMMi          Capability Maturity Model Integration
16   PERT          Program Evaluation and Review Technique
18   PMBOK         Project Management Body of Knowledge
21   PRAM          Project Risk Analysis and Management
24   PSPLIB        Project Scheduling Problem Library
25   RAMP          Risk Analysis and Management for Projects
26   RBCPM         Risk Bayesian Critical Path Method
List of tables
Table 1.1 Basic mathematical notations used for CPM calculation
Table 1.2 The differences between waterfall and agile projects
Table 2.1 Hui and Liu’s common risk factors [9]
Table 2.2 Risk factors in the phases
Table 2.3 Risk factors, consequences and impact
Table 2.4 Examples of risk factors and probabilities
Table 2.5 Probability of risk factors in the whole project with data set 1
Table 2.6 Probability of risk factors in the whole project with data set 2
Table 2.7 Probability of the experimental risk factors to compare with MSBNx
Table 2.8 CKDY compared with MSBNx
Table 2.9 List of 19 common risk factors for software project scheduling
Table 2.10 List of 5 risk factors for software project scheduling in Section 2.2
Table 2.11 List of 19 risk factors in iteration scheduling
Table 3.1 The first data sample
Table 3.2 The probability table for tasks and resources
Table 3.3 Risk factors analysis
Table 3.4 Data sample 1
Table 3.5 Data sample 2
Table 3.6 Task attributes of the first data sample
Table 3.7 Task attributes of the second data sample
Table 3.8 Task attributes of the third data sample
Table 3.9 The result for the first data sample
List of figures
Figure 1.1 Activities of project management according to PMBOK Guide
Figure 1.2 CPM parameters in an activity
Figure 1.3 An example of BN which represents a simple case
Figure 2.1 A sub BN for the risk factor “Staff experience shortage”
Figure 2.2 A sub BN for the risk factor “Reliance on few key person”
Figure 2.3 A sub BN for the risk factor “Schedule pressure”
Figure 2.4 A sub BN for the risk factor “Low productivity”
Figure 2.5 A sub BN for the risk factor “Lack of staff commitment”
Figure 2.6 A sub BN for the risk factor “Lack of client support”
Figure 2.7 A sub BN for the risk factor “Lack of contact person competence”
Figure 2.8 A sub BN for the risk factor “Lack of quantitative historical data”
Figure 2.9 A sub BN for the risk factor “Inaccurate cost estimating”
Figure 2.10 A sub BN for the risk factor “Large and complex external interface”
Figure 2.11 A sub BN for the risk factor “Large and complex project”
Figure 2.12 A sub BN for the risk factor “Unnecessary features”
Figure 2.13 A sub BN for the risk factor “Creeping user requirement”
Figure 2.14 A sub BN for the risk factor “Unreliable subproject delivery”
Figure 2.15 A sub BN for the risk factor “Incapable project management”
Figure 2.16 A sub BN for the risk factor “Lack of senior management commitment”
Figure 2.17 A sub BN for the risk factor “Lack of organization maturity”
Figure 2.18 A sub BN for risk factor “Immature technology”
Figure 2.19 A sub BN for the risk factor “Inadequate configuration control”
Figure 2.20 A sub BN for the risk factor “Excessive paperwork”
Figure 2.21 A sub BN for the risk factor “Inaccurate metrics”
Figure 2.22 A sub BN for risk factor “Excessive reliance on a single process improvement”
Figure 2.23 A sub BN for the risk factor “Lack of experience with project environment”
Figure 2.24 A sub BN for the risk factor “Lack of experience with project software”
Figure 2.25 The overall BN for software risk factors
Figure 2.26 A simple example of Bayesian inference
Figure 2.27 The three nodes of a simple-chain BN
Figure 2.28 The graphical interface of the tool
Figure 2.29 Result of experiment 1
Figure 2.30 Results of the three experiments
Figure 2.31 Experimental results for Software Design phase
Figure 2.32 Sub BN 1
Figure 2.33 Sub BN 2
Figure 2.34 The overall BN model
Figure 2.35 Experiment with j30 with the early start schedule
Figure 2.36 Activity joint in the file j301_1.rcp
Figure 2.37 Diagram of probabilities of finishing phase by phase
Figure 3.1 Home GUI of tool BAIS
Figure 3.2 Gantt chart for SPT strategy
Figure 3.3 A part of a BN for 19 risk factors
Figure 3.4 Task’s parameters and connection to other tasks
Figure 3.5 A screenshot of RBCPM
Figure 3.6 A result for experiment with data sample 1
Figure 3.7 A result for experiment with data sample 2
Figure 3.8 Bayesian Network for each activity
Figure 3.9 Risk integration network model into PERT scheduling
Figure 3.10 Process in improved RBPERT Model
Figure 3.11 The input screen of the RBPERT tool
Figure 3.12 The input file type of the RBPERT tool
Figure 3.13 A result for the network provided by the RBPERT tool for the first data sample
Figure 3.14 A result for RBPERT network provided by the tool for the first data sample
Figure 3.15 A result for experiment with the third data sample (distribution of Total Duration of activity J)
Figure 3.16 A screenshot of tool BAIS
Figure 3.17 The result of the second experiment
Introduction
Motivation
Software projects also have schedule risks and, as a consequence, budget or cost risks. For example, the project on the Vietnamese National Population Database was approved for investment in 2015 [2] and was planned to be finished in two years (2016 and 2017). However, the system was only put into operation in February 2021. Another similar example is the project on the Vietnamese National Public Service Portal, which was planned to go public in September 2016 [3] but only opened in December 2019. As a matter of fact, the majority of software projects the author has experienced in Vietnam have been behind schedule (some of these projects will be examined in Chapter 2 and Chapter 3).
Even in developed countries, software projects face ongoing problems. For example, the Universal Credit project, the welfare payment system owned by the Central Government of the United Kingdom, started in 2013. The project schedule has slipped, with the final delivery date now expected to be 2021, although the system is gradually being introduced. In 2013, only one of four planned pilot sites went live on the originally scheduled date, and the pilot was restricted to extremely simple cases [4].
Many software projects have suffered from significant budget overruns together with a series of delays, which cause either temporary issues or permanent failures. For example, the Queensland Health Payroll System was launched in 2013 in what could be considered one of the most spectacularly over-budget projects in Australian history, coming in at over 200 times the original budget. Besides, in spite of promises that the new system would be fully automated, it required a considerable amount of manual operation [5]. Another example, this time of permanent failure, is the e-Borders project, an advanced passenger information programme which aimed to collect and store information on passengers and crew entering and leaving the United Kingdom. Started in 2007, the project had a series of delays and had to be cancelled in 2014 [6].
Some studies have pointed out that most software projects (83.8%) are over budget or behind schedule and that 52.7% of software development projects deliver software with fewer features than originally specified [7, 8]. Statistics also show that 31.1% of development projects end up being cancelled or terminated prematurely. Among the completed projects, only 61% satisfy the originally specified features and functions [9]. In the software industry, one of the greatest challenges that development teams constantly face is keeping projects under control in terms of budget and schedule (development time frame). The activities of a software project are influenced by factors internal and external to the project organization that make it uncertain whether the project will achieve its objectives. The effect that this uncertainty has on the project’s goals is called risk [10]. In other words, risk is an event or an uncertain condition that, if it occurs, will have a positive or negative effect on at least one of the project objectives [11]. In this thesis, risks are defined as uncertain events or conditions that, if they occur, would have a negative impact on one or more software project outcomes (cost, time, quality).
The above situation raises an important question: how can project risks be managed better in order to eliminate temporary issues and prevent failure?
The purpose of project management is to lead the project to success. A successful software project certainly relies on many factors (e.g. following appropriate processes and tasks, managing risks properly). Since risks are inevitable in projects, risk management has become an important part of project management. Although many researchers, experts and writers have proposed a variety of processes and techniques, project risk management (PRM) is still rapidly evolving, and handling risks in general projects as well as software projects remains a challenge.
Concerning PRM, an important component is risk analysis, which is also known as, or considered the same as, risk quantification. Risk analysis attempts to measure risks and their impacts on different project outcomes (i.e., time, cost, quality). Many software projects fail because project managers mostly plan based on their experience and lack scientific methods to support them. To overcome subjective assessment based on the development team’s experience, the team needs a quantitative risk analysis method. Although various studies have proposed and examined a range of processes and techniques and software project risk management is continuously evolving, handling uncertainty in increasingly complex real-world projects remains a challenge.
Aside from that, project scheduling (a part of project planning, an early phase of the software development life cycle) is concerned with the techniques that can be employed to manage the activities that need to be undertaken during the development of a project. There are various techniques for project scheduling, from simple and easily understandable ones such as Task List, Gantt Chart and Schedule Network Analysis, to more complicated ones like the Critical Path Method (CPM), the Program Evaluation and Review Technique (PERT), Monte Carlo Simulation (MCS) or Fuzzy Logic [10, 12, 13, 14].
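As a rough illustration of what a technique such as CPM computes, the following minimal Python sketch runs the standard forward and backward passes over a tiny activity network. The activity names, durations and precedence links are hypothetical and are not taken from the thesis data; the sketch shows plain deterministic CPM only, not the risk-aware methods proposed later.

```python
# Hypothetical activities: name -> (duration in days, list of predecessors).
# The dict must list every activity after its predecessors.
ACTIVITIES = {
    "A": (3, []),
    "B": (4, ["A"]),
    "C": (2, ["A"]),
    "D": (5, ["B", "C"]),
}


def critical_path(activities):
    """Return (project duration, list of critical activities)."""
    # Forward pass: earliest start (ES) and earliest finish (EF).
    es, ef = {}, {}
    for name, (duration, preds) in activities.items():
        es[name] = max((ef[p] for p in preds), default=0)
        ef[name] = es[name] + duration
    project_end = max(ef.values())

    # Backward pass: latest finish (LF) and latest start (LS).
    lf, ls = {}, {}
    for name in reversed(list(activities)):
        succs = [s for s, (_, preds) in activities.items() if name in preds]
        lf[name] = min((ls[s] for s in succs), default=project_end)
        ls[name] = lf[name] - activities[name][0]

    # Activities with zero total float (LS - ES) lie on the critical path.
    critical = [n for n in activities if ls[n] - es[n] == 0]
    return project_end, critical


if __name__ == "__main__":
    print(critical_path(ACTIVITIES))
```

Because every duration here is a single fixed number, the result says nothing about how likely the schedule is to hold, which is the gap that probabilistic techniques aim to fill.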
Traditional project scheduling under risk/uncertainty has attracted increasing research and attention in the project management community. In some of the project management literature of the 1990s, “risk analysis” was equivalent to “the analysis of risk on project plan” [15]. This thesis focuses on modelling risks in software project time management (which is, of course, indirectly related to the other project outcomes, cost and quality). In other words, this thesis concentrates on quantitative risk analysis in software project scheduling.
The earliest studies incorporating uncertainty/risk in project scheduling were those of Malcolm et al. [16] and Miller [17] in the late 1950s. Since then, a variety of techniques have been introduced, several tools have been developed, and many of them are widely used throughout different industries. However, they often fail to capture uncertainty properly and/or produce inaccurate, inconsistent and unreliable results, especially when applied to software projects, whose attributes differ from those of other traditional projects.
Project uncertainty has several aspects, not all of which can be categorized and treated as risks. Several authors, such as Ward and Chapman [18], argued that project risk management should focus on managing uncertainty and its various sources rather than emphasizing a set of possible events that might have negative impacts on project performance (i.e., it should pay more attention to uncertain aspects than to a fixed set of defined risks). However, since this thesis is about software projects, risks are considered and treated the same as uncertainty. Most quantitative techniques and methods in the current practice of project risk management are based on the “Probability Impact” concept, which has certain shortcomings in terms of risk analysis in project scheduling. More sophisticated methods and techniques are needed to address and manage important sources of uncertainty/risk.
In the software industry, project scheduling also has to deal with the fact that resources such as people, time, technology and money are not always pre-determined [19]. There are always risks in software project scheduling as well. In most projects, the activity times (from now on, an “activity” is considered the same as a “task” in software projects) are not known for certain. Therefore, they may be modelled as random variables.
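To make the idea of random activity times concrete, here is a minimal Monte Carlo sketch in Python. It assumes, purely for illustration, that each task duration follows a triangular distribution built from a three-point estimate; the task names, estimates, precedence structure and 30-day deadline are all hypothetical and do not come from the thesis.

```python
import random

# Hypothetical three-point estimates in days:
# (optimistic, most likely, pessimistic).
TASKS = {
    "design": (3.0, 5.0, 10.0),
    "backend": (8.0, 10.0, 20.0),
    "frontend": (6.0, 9.0, 15.0),
    "testing": (4.0, 5.0, 9.0),
}


def sample_project_duration(rng):
    """Draw one possible project duration.

    Assumed precedence: design, then backend and frontend in parallel,
    then testing.
    """
    d = {
        name: rng.triangular(low, high, mode)
        for name, (low, mode, high) in TASKS.items()
    }
    return d["design"] + max(d["backend"], d["frontend"]) + d["testing"]


def simulate(n_runs=10_000, deadline=30.0, seed=0):
    """Estimate the mean duration and the probability of meeting the deadline."""
    rng = random.Random(seed)
    durations = [sample_project_duration(rng) for _ in range(n_runs)]
    mean = sum(durations) / n_runs
    p_on_time = sum(d <= deadline for d in durations) / n_runs
    return mean, p_on_time


if __name__ == "__main__":
    mean, p = simulate()
    print(f"mean duration: {mean:.1f} days, P(on time): {p:.2f}")
```

The output is a distribution summary rather than a single date, so a manager can ask how likely a deadline is to be met instead of only what the planned finish date is.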
Furthermore, Bayesian Networks (BNs) have attracted a lot of attention in different fields (construction, R&D etc.) as a powerful approach for decision support under uncertainty. A BN is a graphical and mathematical model which offers a powerful, general and flexible approach for modelling risk and uncertainty. Its capability of modelling causality and conditional dependency between variables makes it well suited for capturing uncertainty in projects. Yet BNs are rarely applied in project risk management in general, or in software project management and software project scheduling in particular.
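The following Python sketch illustrates, under stated assumptions, how a BN encodes conditional dependency and supports both prediction and diagnosis. It uses a three-node chain (staff experience shortage, then low productivity, then schedule delay) with made-up probabilities and exact inference by enumeration; the structure and numbers are hypothetical and are not the models or algorithms developed in this thesis.

```python
from itertools import product

# All probabilities are hypothetical and chosen only for illustration.
P_SHORTAGE = 0.3                       # P(S = true)
P_LOW_PROD = {True: 0.7, False: 0.2}   # P(L = true | S)
P_DELAY = {True: 0.8, False: 0.1}      # P(D = true | L)


def bernoulli(p_true, value):
    """Probability of a boolean value given P(value is true)."""
    return p_true if value else 1.0 - p_true


def joint(s, l, d):
    """Joint probability by the chain rule: P(S) * P(L | S) * P(D | L)."""
    return (
        bernoulli(P_SHORTAGE, s)
        * bernoulli(P_LOW_PROD[s], l)
        * bernoulli(P_DELAY[l], d)
    )


def probability(query, evidence=None):
    """P(query | evidence), summing the joint over all assignments."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for s, l, d in product([True, False], repeat=3):
        world = {"S": s, "L": l, "D": d}
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(s, l, d)
        denominator += p
        if all(world[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    # Prediction: how likely is a delay overall?
    print(probability({"D": True}))
    # Diagnosis: given a delay, how likely was a staff shortage?
    print(probability({"S": True}, {"D": True}))
```

Enumeration is only practical for very small networks; it is used here because it keeps the link between the conditional probability tables and the inferred values easy to follow.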
The author of this thesis strongly believes that if we can identify and control risks at the early stages of a software development project, we can significantly increase the project’s chance of success. Since it is not easy (or even possible) to control all of the problems or factors, this thesis focuses only on time factors, which relate to the software development schedule.
Therefore, this thesis aims to introduce an advanced approach and to find a better model for incorporating and managing uncertainty/risks in software project scheduling. The idea is to use BNs to perform well-known scheduling techniques such as CPM and PERT, as well as to model risk factors in software project scheduling. The proposed approach enriches the benefits of scheduling techniques by incorporating uncertainty/risk factors and adding the strong analytical power of BNs.
Related work
There have been various studies on applying BNs to general projects. Khodakarami [19] applied BNs to general project scheduling with two case studies, one on aircraft design and one on the design and construction of a health and fitness center. Ali et al. [20] combined Monte Carlo Simulation and Bayesian Network methods to present a structure for assessing the aggregated impact of risks on the completion time of a construction project. Lee and Shin [21] proposed an application of BNs to risk management in a shipbuilding project and proposed 26 risks. Sharma and Chanda [22] developed a BN model for predicting R&D project success, which bases its assessment on R&D project risk factors. Khodakarami et al. [23] also examined an approach to generate project schedules that incorporates risk, uncertainty, and causality using BNs. Their model empowered the traditional CPM to handle uncertainty, and they also provided explanatory analysis to elicit, represent, and manage different sources of uncertainty in project planning. Fenton and Neil [25] introduced AgenaRisk as a probabilistic tool based on BNs; Chang, Yu, and Cheng [26] proposed a risk-based Critical Path Scheduling Method based on 2 risk categories and 7 risk levels, which was applied to construction projects. Regarding risk factors in software projects, Hui and Liu [9] selected 24 risk factors that may have potential impacts on the (whole) software project and applied BN properties in the calculation of impact in their project risk model. Kumar and Yadav [24] considered quantitative features and causal relationships among risk factors in software projects. They introduced a probabilistic approach to assess risks in software projects and proposed a list of 27 risk factors (in software projects). However, they analysed risks for whole software projects and did not focus on the scheduling and planning phases, which decide the success of projects. Adjusting Kumar and Yadav’s method, this thesis proposes a list of the 5 most crucial risk factors and builds the tool CKDY to examine risks in software scheduling (Section 2.2).
There has been some other research on BNs and software risk analysis. Hu et al. [27] studied causality analysis among risk factors and project outcomes for software development projects. For this purpose, they proposed a modelling framework based on BNs to deal with causality constraints in risk analysis. The developed framework can be used for discovering new causal relationships and validating existing relationships among risk factors and project outcomes. Anthony et al. [28] proposed a risk assessment model for decision-making in software management which consists of processes and components of risk assessment in three groups: operational risks, technical risks and strategic risks. Rai et al. [29] believed that managing projects is managing risks and identified 43 risk indicators in Agile Software Development.
One notable piece of research is Akos Szoke’s 2014 PhD dissertation, which proposed an optimized algorithm for agile software project scheduling [30].
As can be seen from the literature review, much research on software risk analysis focuses on finding the relationship between risk factors and software outcomes, but lacks a quantitative approach and causal relationships between risk factors [9, 24, 31, 32]. Some other studies pay attention to defining a quantitative approach and the causal relationships between risk factors and assess risks for the whole software project [33, 34], but do not pay enough attention to modelling risk factors from the scheduling (planning) phase, the phase that decides the later failure or success of the project.
J. Yong and Z. Zhigang [35] proposed a PERT Bayesian Network (PERTBN) model, together with a modelling methodology and a conditional probability calculation method for different kinds of procedure arrangement (single-chain, centralized, distributed), and stated that the PERTBN model ensures the effectiveness of project schedule control and optimization. However, the research did not examine in depth the risk factors or other specific software features that can have an impact on the project schedule.
In addition, there is always a need for proper schedule control in software projects to determine the current status of the schedule, to know whether the schedule has changed, and to embrace changes when they occur. In order to do that, the influential factors that cause schedule changes need to be carefully considered.
In summary, current research related to this thesis addresses either risk management or assessment for the whole software project, or scheduling for other kinds of project (construction, building, R&D etc.). There is a need for a probabilistic method for risk management in software project scheduling, and for a deeper examination of the risk attributes of software project scheduling.
Research scope
The research is about software projects (or software development projects), which have both common features and specific features in comparison to other types of projects (such as construction projects, R&D projects etc.). Unfortunately, there have been only a few good studies on applying probabilistic methods to software development projects. Therefore, this research first reviews the literature on general projects to look for the approaches applied to them, and then proposes an approach for software projects.
The scope of this research is risk management in software project scheduling. This is quantitative risk management, which is concerned with risks affecting the project schedule (or project time frame). In terms of project scheduling techniques, this thesis focuses on the most popular techniques, such as CPM and PERT for traditional software development projects, as well as Agile software project scheduling.
Research objectives
The main objectives of this research are:
1) To find a quantitative method to better assess and analyse risks in software project scheduling. In order to achieve this objective, the research has to answer the following questions: what are the risk attributes of software project scheduling? How can risks in software project scheduling be managed better?
The proposed methods and models would enhance the risk management process through a quantitative assessment of the impact of risks on software project scheduling. If this model and method are applied in practice, the author of this thesis expects that they would help predict and monitor the project schedule better, as well as support appropriate decisions.
Scientific and realistic meaning
The proposed methods and models would enhance the risk management process through a quantitative assessment of the impact of risks on software project scheduling.
If this model and method are applied in practice, they would help predict and monitor the project schedule better, as well as support appropriate decisions.
Research hypothesis and methodology
The hypothesis of this thesis is that it is possible to use BNs to quantify uncertainty in software project scheduling and to improve software project risk assessment.
Since research on this topic is very limited, the research methodology starts from the literature on general project management to obtain ideas relevant to software project management. Firstly, a literature review investigates the current state of project scheduling under uncertainty, which determines the need, scope and objectives of the new approach. Secondly, a literature review follows on the background, theory and application of BNs. This provides the conceptual and fundamental background for the new approach.
The research also examines the features of software projects in both the waterfall model and the agile software development model. In order to handle risks in software project scheduling, the common risk factors also need to be examined.
Within the research, tools are built to validate the models and to help software project managers assess risks and make appropriate decisions.
Expected results
Following the above methodology, the author expects to:
1) Apply Bayesian Networks to develop an algorithm and a tool to assess the impacts of risks, and hence propose common risk factors in software project scheduling.
2) Apply Bayesian Networks to develop a probabilistic approach that enhances the common scheduling techniques (for both traditional and agile software development) in terms of risk management and predictability.
Structure of the thesis
An overview of the main chapters is as follows:
Chapter 1 briefly reviews software project scheduling and the software project risk management process, and explores the currently popular techniques in project scheduling.
Chapter 2 presents initial attempts at applying BNs to risk management in software project scheduling, as well as experiments on common risk factors in software project scheduling. Nineteen common risk factors for both traditional software development projects and agile software projects are proposed.
Chapter 3 incorporates BNs into popular software project scheduling techniques, namely CPM, PERT and agile software scheduling. BNs are also applied in examining the relationships among the risk factors proposed in Chapter 2.
Chapter 4 concludes the thesis and points the way forward for future research.
The main contributions and results of the research: The research has developed the algorithm BRI (Bayes Risk-Impact) and the tool CKDY to assess the impacts of risks, and hence proposes common risk factors in software project scheduling. Based on the literature review and experiments, the research has come up with 19 common risk factors in software project scheduling (for both the agile and the traditional development styles).
The research also proposes advanced scheduling methods for software project development. The methods are based on incorporating Bayesian Networks and the common risk factor model into popular software scheduling techniques such as PERT, CPM and agile software development, with an examination of the model of 19 common risk factors. Tools have been built to experiment with the proposed scheduling methods and models. Experimental results show that the proposed methods and models are reliable and provide practical value to software development teams in analyzing, monitoring and predicting risks and the chance of success of the project.
1.1.1 Software project management
Software project management is the art and science of planning and monitoring software projects. It refers to the branch of project management dedicated to the planning, scheduling, resource allocation, implementation, tracking and delivery of software and web projects [36, 37].
There are various types of projects (R&D projects, construction projects, information system projects, software projects, etc.), which are associated with different styles of management. Software project management is quite distinct from traditional or other project management. Firstly, software is developed, not manufactured. Therefore, the product (working software) is intangible and uniquely flexible. Secondly, software engineering is not recognized as an engineering discipline with the same status as mechanical engineering, electrical engineering, etc. Moreover, software projects have a unique lifecycle process that requires multiple rounds of testing, updating and customer feedback, and this software development process is not standardized. Lastly, most software projects are “one-off” projects. A software development team can only draw on similar experience, not on identical experience or a repeated process.
Therefore, software project management is about the methodology for organizing all activities related to the software. Project management is always needed, since software projects always have budget and time constraints.
Nowadays, most IT-related projects are managed in the agile style and software is developed in groups, in order to keep up with the increasing pace of business and to iterate based on customer and stakeholder feedback. Besides IT-related projects, the agile style has also been increasingly used in other kinds of project management.
The project manager leads the project team and often plays the central role among the investors (or customers), the suppliers and the senior management of the organization. He or she makes sure that the project complies with the constraints and that the product (software) is delivered on time. Software project managers may have to do any of the following tasks [37]:
- Planning and scheduling: This means putting together the blueprint for the entire project from ideation to fruition. It will define the scope, allocate necessary resources, propose the timeline, delineate the plan for execution, lay out a communication strategy, and indicate the steps necessary for testing and maintenance.
- Leading: A software project manager will need to assemble and lead the project team, which will likely consist of developers, analysts, testers, graphic designers and technical writers. This requires excellent communication, people and leadership skills.
- Execution: The project manager will participate in and supervise the successful execution of each stage of the project. This includes monitoring progress, frequent team check-ins and creating status reports.
- Time management: Staying on schedule is crucial to the successful completion of any project, but it is particularly challenging when managing software projects, because changes to the original plan are almost certain to occur as the project evolves. Software project managers must be experts in risk management and contingency planning to ensure forward progress when roadblocks or changes occur.
- Budget: Like traditional project managers, software project managers are tasked with creating a budget for a project and then sticking to it as closely as possible, moderating spending and re-allocating funds when necessary.
- Maintenance: Software project management typically encourages constant product testing in order to discover and fix bugs early, adjust the end product to the customer’s needs, and keep the project on target. The software project manager is responsible for ensuring that proper and consistent testing, evaluation and fixes are carried out.
Therefore, managers have diverse roles. Since software project management is normally concerned with activities involved in ensuring that software is delivered on time, on schedule and in accordance with the requirements of the organizations developing and procuring the software, managers' most significant activities are planning, estimating and scheduling.
According to the Project Management Institute (PMI) in the Project Management Body of Knowledge (PMBOK) guide [11], project management includes five stages or process groups: Initiating, Planning, Executing, Monitoring and Controlling, and Closing (Figure 1.1).
In modern software project planning, the two essential tasks are project risk management and project scheduling. They play crucial roles in making sure that the project is effectively and efficiently organized, including resource (hardware, software and network) allocation, task and personnel assignment, and monitoring [11, 14]. Software projects are quite different from other projects: since software requirements change continuously during the software development life cycle, software projects are often behind schedule and over budget. Moreover, in reality, many software project managers either ignore risk management or do not carry it out appropriately. This leads to project failure or to customer complaints about the quality, the schedule or the budget overrun of the project. Other project managers are aware of risk management but rely only on the skills or experience of their own team, even if they follow the capability maturity models CMM/CMMI (Capability Maturity Model Integration) or PMP (Project Management Professional) [38]. As can be seen in Figure 1.1, risk management affects all the processes in the Process Groups. In addition, project teams can adjust or update the planning process while they are executing, monitoring and controlling their projects.
Figure 1.1 Activities of project management according to PMBOK Guide
1.1.2 Software project scheduling
Software project scheduling is one of the most demanding tasks for software project managers. It is all about resource allocation during the project life cycle. In simple words, software project scheduling means splitting the whole project into smaller tasks and estimating the time and resources required to complete each task. Software development teams normally try to organize tasks concurrently to make optimal use of the workforce, and to minimize task dependencies to avoid delays caused by
1.2 Software project scheduling methods and techniques
1.2.1 Overview
There are many popular techniques for project scheduling, including:
- Graphical representations used to illustrate the project schedule, such as:
o Work Breakdown Structure: shows the project breakdown into tasks
o Activity Charts: show task dependencies and the critical path
o Gantt Charts: bar charts that show the schedule against calendar time
- Critical Path Method – CPM [14, 19, 23, 39]
- Program Evaluation and Review Technique – PERT [16, 17, 19, 40]
Project scheduling (especially under uncertainty) is the most widely studied area of risk quantification in project management. Producing a reasonable and reliable project schedule is one of the crucial tasks of project managers. Moreover, having a realistic schedule for the project is one of the most cited factors of project success [41]. Several techniques have been proposed for modelling risk and uncertainty in project scheduling [14, 40, 42].
This section reviews some notable techniques. CPM and PERT are the classical approaches to project scheduling. Simulation-based techniques are a more modern approach that has been adopted by many project management software tools and that some argue is the best practice available. Alternative approaches, namely the Critical Chain Method and fuzzy logic, will be reviewed briefly. Last but not least, scheduling techniques and methods for agile software development will also be discussed.
1.2.2 Traditional scheduling methods and techniques
a) Critical Path Method (CPM)
Critical Path Method (CPM) is one of the most famous techniques in project scheduling. Developed in 1957 by DuPont, CPM has become the standard technique in project management, and most project management tools support CPM calculation [39]. According to Pollack-Johnson and Liberatore [43], almost 70% of project managers or professionals use CPM. CPM calculation includes the following steps:
- Specify the individual activities using a work breakdown structure
- Determine the sequence of those activities and the dependencies between them
- Draw a network diagram (that models the activities and their dependencies)
- Estimate the completion time (duration) for each activity
- Identify the critical path (the longest-duration path through the network, which determines the shortest possible project duration)
- Update the CPM diagram as the project progresses
The basic mathematical notations used for CPM calculation are shown in Table 1.1. In fact, the parameters D, ES, EF, LF and LS are commonly used in scheduling techniques.
Table 1.1 Basic mathematical notations used for CPM calculation
No. | Notation | Meaning | Formula
6 | LS | Latest start of aj | LSj = LFj – Dj
7 | TF | Total float of aj: the time by which the activity’s duration can be increased without increasing the overall project completion time | TFj = LSj – ESj = LFj – EFj
A critical activity is one with no float time (TF = 0) and should receive special attention, since a delay in a critical activity will delay the whole project. Informally, the critical path is determined by performing forward and backward passes through the project network. The forward pass computes the earliest start (ES) and the earliest finish (EF) time for each activity. The backward pass computes the latest start (LS) and the latest finish (LF) time for each activity. The total float of each activity is the difference between its latest and earliest finish times.
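The forward and backward passes can be sketched in Python as follows. This is only a minimal illustration, not the tool built in this thesis: the function name `cpm`, the four example activities and their durations are hypothetical, and the activities are assumed to be listed so that every predecessor appears before its successors.

```python
def cpm(activities):
    """Compute ES, EF, LS, LF, TF and the critical activities.

    activities: dict mapping name -> (duration, [predecessor names]).
    Assumption: the dict is ordered so that predecessors come first.
    """
    es, ef = {}, {}
    # Forward pass: earliest start is the latest earliest-finish of the predecessors.
    for name, (duration, preds) in activities.items():
        es[name] = max((ef[p] for p in preds), default=0)
        ef[name] = es[name] + duration
    project_end = max(ef.values())

    # Build successor lists for the backward pass.
    succs = {name: [] for name in activities}
    for name, (_, preds) in activities.items():
        for p in preds:
            succs[p].append(name)

    ls, lf = {}, {}
    # Backward pass: latest finish is the earliest latest-start of the successors.
    for name in reversed(list(activities)):
        duration = activities[name][0]
        lf[name] = min((ls[s] for s in succs[name]), default=project_end)
        ls[name] = lf[name] - duration

    # Total float: TF = LS - ES (equivalently LF - EF).
    tf = {name: ls[name] - es[name] for name in activities}
    critical = [name for name in activities if tf[name] == 0]
    return es, ef, ls, lf, tf, critical


# Hypothetical example network: A -> B -> D and A -> C -> D.
example = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (1, ["B", "C"]),
}
es, ef, ls, lf, tf, critical = cpm(example)
print(critical)  # activities with zero total float
```

In this example the path A, C, D has the longest total duration, so its activities have zero float, while B can slip by two time units without delaying the project.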
Figure 1.2 CPM parameters in an activity
b) Program Evaluation and Review Technique (PERT)
PERT was introduced in 1957 by the US Navy as one of the earliest pieces of research incorporating risk in project management [17, 19]. A special feature of PERT is its ability to handle uncertainty in activity duration. This matters because a variation in the time estimate of an activity may affect the whole project. The PERT methodology was developed to help complete the project successfully when the time estimates are not definitive.
In order to do that, instead of the single estimate used in CPM, PERT assigns a beta probability distribution to each project activity. Three time estimates (optimistic, most likely and pessimistic) are obtained and used to estimate the expected time and the standard deviation for an activity i.
Optimistic time estimate is the estimate determined considering all favorable conditions, i.e. in the best-case scenario or when everything goes right. In other words, this is the shortest time in which the activity may be completed.
Most likely time estimate is the duration within which there is a high probability of completing the activity. In other words, it is the estimate in the case of normal problems or opportunities.
Pessimistic time estimate is the estimate determined considering all unfavourable conditions, i.e. in the worst-case scenario or when everything goes completely wrong. In other words, this is the longest time the activity might require to complete.
Expected time: μi = (Optimistic + 4 × Most likely + Pessimistic)/6
Standard deviation: σi = (Pessimistic – Optimistic)/6
The critical path is the sequence of project activities that determines the earliest time by which the project can be completed, and its total duration determines the completion date of the project. PERT assumes that only one path is the critical path and that this path does not change. Therefore, managers using PERT are advised to focus on the critical activities to ensure that the project completion date remains unchanged. The expected value of the critical path is the sum of the expected values of its activities, and the variance of the critical path is the sum of the variances of all activities in the path. Based on these values, the probability that the project will be completed by a certain date can be calculated. PERT is therefore somewhat similar to CPM. The main difference is that each activity in a PERT network has a variance associated with its completion time. In other words, CPM is deterministic, while PERT is to some extent probabilistic.
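As a hedged illustration of the PERT calculation, the sketch below computes the expected time and standard deviation of each activity from its three estimates, sums the means and variances along a critical path, and estimates the probability of finishing by a deadline. The function names and the three example activities are hypothetical, and treating the path duration as normally distributed is an assumption commonly made with PERT rather than something stated in the text above.

```python
import math


def pert_activity(optimistic, most_likely, pessimistic):
    """Return (expected time, standard deviation) of one activity."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    std = (pessimistic - optimistic) / 6
    return mean, std


def completion_probability(critical_path, deadline):
    """Probability that the critical path finishes by the deadline.

    critical_path: list of (optimistic, most_likely, pessimistic) tuples.
    Assumption: the path duration is approximately normal, with mean and
    variance equal to the sums over the activities on the path.
    """
    stats = [pert_activity(*estimates) for estimates in critical_path]
    path_mean = sum(mean for mean, _ in stats)
    path_variance = sum(std ** 2 for _, std in stats)
    z = (deadline - path_mean) / math.sqrt(path_variance)
    # Standard normal cumulative distribution function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical critical path with three activities (durations in days).
path = [(2, 4, 6), (3, 5, 13), (1, 2, 3)]
print(completion_probability(path, deadline=12))
print(completion_probability(path, deadline=15))
```

The expected path duration in this example is 12 days, so the chance of finishing within 12 days is one half, and a later deadline gives a higher probability.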
c) Simulation-based techniques
Monte Carlo Simulation (MCS) was first proposed for project scheduling in the early 1960s [44]. However, it was not until the 1980s, when sufficient computer power became available, that simulation became the dominant technique for handling risk and uncertainty in projects [45, 46]. In its simplest form, MCS uses the project activity diagram.
The duration of each activity is estimated by its shortest, most likely and longest duration, together with the shape of the distribution (such as Normal, Beta, etc.). The critical path calculation is then performed repeatedly, each time using random values drawn from the activities' distribution functions.
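A minimal sketch of this simplest form of MCS is given below. It is not the approach adopted in this thesis; the function name `simulate_project`, the example network and the choice of a triangular distribution (instead of Normal or Beta) are assumptions made only to keep the example short. Activities are assumed to be listed with predecessors first.

```python
import random


def simulate_project(activities, runs=2000, seed=42):
    """Return a list of simulated project durations.

    activities: dict name -> ((shortest, most_likely, longest), [predecessors]).
    Assumption: durations follow a triangular distribution.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        finish = {}
        for name, ((low, mode, high), preds) in activities.items():
            duration = rng.triangular(low, high, mode)
            start = max((finish[p] for p in preds), default=0.0)
            finish[name] = start + duration
        # The project ends when its last activity finishes.
        totals.append(max(finish.values()))
    return totals


# Hypothetical network: A -> B -> D and A -> C -> D (durations in days).
network = {
    "A": ((2, 3, 5), []),
    "B": ((1, 2, 4), ["A"]),
    "C": ((3, 4, 8), ["A"]),
    "D": ((1, 1.5, 2), ["B", "C"]),
}
totals = simulate_project(network)
deadline = 10
on_time = sum(t <= deadline for t in totals) / len(totals)
print(f"Estimated P(duration <= {deadline}) = {on_time:.2f}")
```

Each run samples every activity independently, which is exactly the independence assumption whose limitations are discussed later in this section.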
More advanced tools like PertMaster (Oracle Primavera Risk Analysis [47]) use a simulation-based approach not only for handling uncertainty in duration and cost, but also for providing a whole risk analysis process. They can link the project schedule to the risk register and apply simulation-based techniques to carry out probability-impact analyses.
A survey by the Project Management Institute [48] showed that nearly 20% of project management software packages support Monte Carlo Simulation. Another survey, by Pollack-Johnson and Liberatore in 2003 [49], found that 17% of project managers used probabilistic analysis and/or simulation within project management software.
However, simulation has its own drawbacks. One serious methodological flaw in traditional MCS of project networks is the assumption of statistical independence for individual activities that share risk factors with other activities [43]. Most available simulation packages assume that the marginal distributions of uncertainty for individual activities in the project completely define the multivariate distribution for the project schedule. This assumption is highly suspect for many projects that involve multiple activities of a similar type and/or different activity types that are influenced by common risk factors. Van Dorp and Duffey [50] demonstrated in 1999 that failure to model such risk dependence during MCS can result in the underestimation of total uncertainty in the project schedule. The most effective way to deal with statistical dependence is to use a causal structure to explain it, and MCS is not capable of modelling causal structures.
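The underestimation effect can be illustrated with a small numerical experiment. The sketch below is not taken from van Dorp and Duffey; the function name, the ten activities, the delay size and the risk probability are all hypothetical. It compares the spread of the total delay when every activity is delayed independently with the spread when one common risk factor delays all activities at once. The expected total delay is the same in both cases.

```python
import random
import statistics


def total_delay_std(n_activities, delay, p_risk, shared, runs=5000, seed=1):
    """Standard deviation of the total delay over many simulated projects."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        if shared:
            # One common risk factor: if it occurs, every activity is delayed.
            occurred = rng.random() < p_risk
            total = n_activities * delay if occurred else 0.0
        else:
            # Independence assumption: each activity is delayed on its own.
            total = sum(delay for _ in range(n_activities) if rng.random() < p_risk)
        totals.append(total)
    return statistics.pstdev(totals)


std_independent = total_delay_std(10, 2.0, 0.3, shared=False)
std_shared = total_delay_std(10, 2.0, 0.3, shared=True)
print(std_independent, std_shared)
```

With these hypothetical numbers the spread under the shared risk factor is roughly three times larger, so a simulation that wrongly assumes independence would report a schedule that looks far more predictable than it is.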
Another weakness of MCS, explained by Williams [51], is the inability of simulation to capture the actions taken by managers to recover any slippage in activity or project duration. MCS simply runs through a network, assigning values to random variables on each iteration. It ignores the fact that, in reality, if an activity were running late, management would take actions to affect the activity duration. Uncertainty in an activity is usually the result of a chain of causes (sources) and can be affected by a chain of actions (controls).
Furthermore, MCS is only as good as the information that is fed into it. If the duration distributions of the project activities are incorrect or inadequate, the simulation results are erroneous and invalid. In reality, the durations of most activities are estimated subjectively. In order to capture all aspects of uncertainty in activity (project) duration, various known and unknown sources of risk have to be addressed. Therefore, MCS will not be applied as a scheduling technique in the scope of this thesis.
d) Fuzzy logic
An alternative approach that has interested several researchers in the past two decades [52, 53] is fuzzy project scheduling. The fuzzy set scheduling literature recommends the use of imprecision rather than uncertainty, fuzzy numbers rather than stochastic variables, and membership functions rather than probability distributions. The output of fuzzy scheduling is normally a fuzzy schedule, which indicates fuzzy starting and ending times for the activities. These may be as difficult to generate as probability distributions of activity duration, and there is no generally accepted computational approach available. Therefore, fuzzy project-scheduling approaches have remained in the academic sphere. A summary of most of the published research in fuzzy project scheduling can be found in the work of Bonnal et al. in 2004 [54].
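To make the idea of fuzzy starting and ending times concrete, the sketch below represents durations as triangular fuzzy numbers and adds them along a chain of two activities. Triangular fuzzy numbers are only one possible choice and are an assumption here, since no single computational approach is generally accepted; the function names and the example durations are hypothetical.

```python
def fuzzy_add(x, y):
    """Add two triangular fuzzy numbers given as (lowest, most plausible, highest)."""
    return tuple(a + b for a, b in zip(x, y))


def membership(value, fuzzy):
    """Degree (0..1) to which a crisp value belongs to a triangular fuzzy number."""
    low, peak, high = fuzzy
    if value == peak:
        return 1.0
    if value <= low or value >= high:
        return 0.0
    if value < peak:
        return (value - low) / (peak - low)
    return (high - value) / (high - peak)


# Hypothetical chain: design followed by coding (durations in days).
start = (0, 0, 0)
design = (2, 3, 5)
coding = (4, 6, 9)
design_end = fuzzy_add(start, design)      # fuzzy ending time of design
coding_end = fuzzy_add(design_end, coding)  # fuzzy ending time of coding
print(coding_end, membership(7.5, coding_end))
```

The fuzzy ending time of the chain is itself a triangular fuzzy number, and the membership function states how plausible a particular finishing day is, without interpreting that degree as a probability.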
1.2.3 Agile software project scheduling
From the late 1990s, several methodologies such as RUP, XP, FDD and Scrum began to attract increasing public attention and have become mainstream software development methods, especially in Vietnam, where most software vendors are small and medium enterprises. These methods are representative of agile software development.
Agile – denoting “the quality of being agile; readiness for motion; nimbleness, activity, dexterity in motion” [55] – software development methods attempt to offer an answer to the eager business community asking for lighter-weight along with faster and nimbler software development processes. This is especially the case with the rapidly growing and volatile Internet software industry as well as the emerging mobile application environment.
Agile development is a way of organizing the development process that emphasizes direct and frequent communication (preferably face-to-face), frequent deliveries of working software increments, short iterations, active customer engagement throughout the whole development life cycle, and change responsiveness rather than change avoidance [56]. Thus, agile software development recognizes that software development is inherently a type of product development and therefore a learning process. It is iterative, explorative and designed to facilitate learning as quickly and efficiently as possible. Two of the most significant characteristics of agile approaches are: 1) they can handle unstable requirements throughout the development cycle; and 2) they deliver products in shorter time frames and under budget constraints when compared with traditional development methods.
An agile approach can be seen as a contrast to (traditional) waterfall-like processes [57, 58, 59], which pay attention to thorough and detailed planning and design upfront and to subsequent conformance to the plan. The waterfall model is the oldest and the most mature software development model [58]. In practice, the waterfall development model can be followed in a linear way, and an iteration in an agile method can also be treated as a miniature waterfall lifecycle.
Agile approaches have been widely employed in domains with a low cost of failure or a linear incremental cost of failure [60]. Examples within these domains include web-based applications, mobile applications [55], Internet commerce, social networking, games development, and even some areas of government, finance and banking software development.
Table 1.2 summarizes some of the differences between waterfall and agile projects.
Table 1.2 The differences between waterfall and agile projects
Aspect | Waterfall | Agile
Product/scope | An often bloated product that is still missing features (i.e., rejected change requests or de-scoped to meet deadlines) | The best possible product according to customers' own prioritization, incorporating learning from actual use (revolves with the increments)
Schedule/ | |
 | … extensively and expensively | Quality is built in, and is the key to productivity (writing tests before writing code)
 | | Value is generated early, as soon as the minimum highest prioritized features are delivered. Greater return on investment
Relationship to the customer | |
Since agile software development is organized iteratively and incrementally, agile software scheduling is actually iteration scheduling. Iteration scheduling aims at determining a feasible and precise development plan that schedules the implementation of the selected features within an iteration (i.e. assigning tasks to developers). Technical tasks (or Sprint backlog items in Scrum) are the main concepts of iteration scheduling. These tasks are the fundamental working units accomplished by one developer, and each requires a realization effort of some working hours that is estimated by the team. The aim of iteration scheduling is to break down the selected requirements into technical tasks and to assign them to developers [61]. In that process, the development team also needs to take care of task dependencies (sequencing) and time constraints. The problem of optimized agile iteration scheduling will be discussed in detail in Section 3.1.
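As a rough illustration of what assigning technical tasks to developers under dependency and time constraints involves, the sketch below uses a simple greedy rule: each task goes to the developer who can start it earliest. This is only a naive baseline under stated assumptions, not the optimized method of Section 3.1; the function name, task names, hour estimates and developer names are hypothetical, and tasks are assumed to be listed with their prerequisites first.

```python
def schedule_iteration(tasks, developers, iteration_hours):
    """Greedy iteration plan: task -> (developer, start hour, end hour).

    tasks: dict name -> (estimated_hours, [prerequisite task names]).
    Tasks that do not fit into the iteration are left unplanned.
    """
    free_at = {dev: 0 for dev in developers}  # hour at which each developer is free
    finish = {}
    plan = {}
    for name, (hours, deps) in tasks.items():
        if any(d not in finish for d in deps):
            continue  # a prerequisite was not planned, so this task cannot be either
        ready = max((finish[d] for d in deps), default=0)
        # Choose the developer who can start this task earliest.
        dev = min(developers, key=lambda d: max(free_at[d], ready))
        start = max(free_at[dev], ready)
        if start + hours > iteration_hours:
            continue  # does not fit within the iteration time box
        plan[name] = (dev, start, start + hours)
        finish[name] = start + hours
        free_at[dev] = start + hours
    return plan


# Hypothetical sprint backlog (estimates in working hours).
backlog = {
    "design API": (8, []),
    "implement API": (16, ["design API"]),
    "write UI": (12, []),
    "integration test": (8, ["implement API", "write UI"]),
}
plan = schedule_iteration(backlog, ["dev1", "dev2"], iteration_hours=40)
for task, (dev, start, end) in plan.items():
    print(f"{task}: {dev}, hours {start}-{end}")
```

A greedy rule like this respects sequencing and the time box but gives no guarantee of an optimal plan, which is why optimized iteration scheduling is treated as a separate problem.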
1.3 Risk management in software project scheduling
1.3.1 Overview of project risk management
Risk management has become an important part of project management and has attracted a wide range of research during the last two decades [15]. Since 1990, various Risk Management Processes (RMP) have been proposed. Probably the most popular Project Risk Management Processes (PRMP) are Chapter 11 of the PMBOK (Project Management Body of Knowledge) guide [11], the PRAM (Project Risk Analysis and Management) guide [62] and the RAMP (Risk Analysis and Management for Projects) guide [63]. Most organisations adopt one of these guides or use them to develop their own process. This thesis does not intend to explore the detailed differences between the guides since, apart from fundamental differences in assumptions and methodologies [64], they all aim to capture risk and uncertainty in the following three stages:
- Risk Identification
- Risk Analysis
- Risk Response
The usual output of the risk identification stage is a document called the Risk Register. Many authors have discussed risk registers in their works [65]. Williams [66] stated two main roles for a risk register:
- A repository of a corpus of knowledge
- To initiate the analysis and plans that flow from it
Chapman and Ward [18] consider a risk register as documentation of the sources of the risks, their responses and also risk classification. Ward [67] described the purpose of a risk register as “to help the project team review project risk on a regular basis throughout the project”. Patterson and Neailey [68] presented a risk register database system to aid in managing project risk. Risk registers can be a good management tool during the course of a project. However, it is not possible to identify all risks and capture all aspects of them. There are always unknown (i.e. undiscovered, unattended or immeasurable) risks that are often more important than the identified risks in the risk register.
The Risk Analysis stage attempts to measure the risk and its impacts on different project outputs (i.e. cost, time and performance). This stage is also known as quantitative risk management. The likelihood that each identified risk will occur and its possible impact on the project are estimated. The combination of the risks, their probabilities and their impacts creates a ‘probability-impact’ (PI) matrix. This matrix can be used to assign ranks to risks and then prioritise them. Most of the available quantitative tools and techniques (simulation-based tools) use the PI values to quantify uncertainty in projects. However, the use of PI matrices has some important shortcomings [15].
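A very small sketch of ranking risks from probability and impact values is shown below. Multiplying probability by an impact score is only one common way of turning a PI matrix into a ranking and is an assumption here; the function name, the risk names, the probabilities and the 1 to 5 impact scale are hypothetical.

```python
def rank_risks(risks):
    """Rank risks by probability x impact, highest score first.

    risks: dict name -> (probability of occurrence, impact score).
    """
    scored = [(name, probability * impact)
              for name, (probability, impact) in risks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical risk register entries: (probability, impact on a 1-5 scale).
register = {
    "Requirements change": (0.6, 4),
    "Key developer leaves": (0.2, 5),
    "Test environment unavailable": (0.3, 2),
}
ranking = rank_risks(register)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A single score of this kind hides whether a risk is frequent and mild or rare and severe, and it treats the risks as unrelated to each other.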
The Risk Response stage attempts to formulate management responses to the risk. Also known as “Risk Mitigation”, it uses the results of the analysis stage in order to improve the chance of achieving the project objectives. “Risk Response” is a decision-making process. A number of alternative strategies are available when planning risk responses, each of which can be described under one of the following strategies [69]:
- Avoid: seeking to eliminate uncertainty by reducing either the probability or the impact to zero.
- Transfer: seeking to transfer ownership and/or liability to a third party (e.g. insurance).
- Mitigate: seeking to reduce the size of the risk exposure in order to make it more acceptable to the project or organization.
- Accept: recognizing residual risks and responding either actively, by allocating appropriate contingency, or passively, by doing nothing except monitoring the status of the risk.
1.3.2 Project risk analysis
The term risk analysis in the scope of this research means quantitative risk analysis and is related to risk measurement, as we focus on the quantitative issues of project risks. Project risk analysis is one stage of project risk management. In some literature, risk analysis is even synonymous with risk management.
In fact, risk analysis usually starts with a qualitative analysis, and its results support the decision-making process in the Risk Response stage. It is a continuous process that can be started at almost any stage of a project. However, it is best to use risk analysis in the beginning stages of a project (i.e. phases such as feasibility study and planning) and to update it continually during the implementation phase. This can be done iteratively at intervals, which also matches agile software development.
Risk analysis is the most “formal” aspect of the project risk management process [69], often involving sophisticated techniques and usually requiring computer software (or tools). Such techniques may be applied with various levels of effort, depending on the resources available for the analysis and on the level of detail required. Risk analysis can bring certain benefits to a software project, including:
- Helping to make decisions and enabling more effective and efficient risk management.
- Helping to make more feasible (realistic) plans, in terms of both duration and costs.
- Helping to build statistical data on historical risks. This in turn benefits the planning and implementation of future projects.
1.3.3 Unknown risks
One important category of uncertainty in projects is “Unknown Risks”. These are important sources of uncertainty because their impact on a project may outweigh all other sources of risk.
Although unknown risks are widely acknowledged (perhaps under different names) by several authors, none of the existing approaches for project scheduling is able to model and quantify this type of risk. The conventional “probability impact” approach is at best capable of modelling only “known risks”. Most of the current quantitative techniques for risk analysis are event-oriented and more concerned with the ‘risk of something happening’. They assume that a list of events (conditions) that may take place is known, that the impact of each risk on activity duration is also known, and even that the nature of the response to each risk is roughly known [19]. However, unknown risks are unpredictable and immeasurable (their impacts are unknown or hard to quantify), and they require much effort to clarify. An example of unknown risks is Internally Generated Risk (IGR) [78]. As the name reveals, IGRs originate from within the project team or organization: from rules, policies, regulations, structures, actions, behaviours or the culture of the organization. IGRs have the following features:
- Common, since organizational issues such as policies, processes, culture, etc. are widespread in most projects of the organization.
- Important, since they often have an impact on more than one activity.
- Not well managed in projects, as they are unpredictable (and hardly ever recorded in documents or risk registers) and hard to quantify.
1.3.4 Risk aspects in software project scheduling
Different project management processes involve different aspects of uncertainty/risk [23]. This thesis focuses on quantitative risk management, which is concerned with risks affecting the project schedule (or project time frame), including risks affecting project scheduling (a phase or a process in project planning). As can be deduced from the previous sections, these risks cannot be completely separated from the risks of other processes or phases.
In project scheduling, the most obvious risk lies in the duration estimate for a particular activity. Difficulty in this estimation can arise from a lack of knowledge of what is involved as well as from the uncertain consequences of potential threats or opportunities. Some sources of uncertainty are:
Level of available and required resources (including inexperienced or lack of training developers)
Incomplete (or often changing) requirements
Tradeoff between resources and time
Possible occurrence of uncertain events (especially those cause badly impact,
Lack of previous experience and use of subjective instead of objective data
Incomplete or imprecise data, or lack of data
Uncertainty about the basis of subjective estimation (i.e., bias in estimation)
1.4 Bayesian Networks
1.4.1 Probabilistic approach using Bayesian Networks
A Bayesian Network (BN, also known as a Bayesian Belief Network, Causal Probabilistic Network, Probabilistic Cause-Effect Model, or Probabilistic Influence Diagram) is a special type of graph associated with a set of probability tables. A BN models the causal relationships of a system or dataset and provides a graphical representation of this causal structure through the use of directed acyclic graphs (DAGs) with nodes and edges. The DAG representation provides a framework for inference and prediction. The nodes represent random variables with probability distributions, while the edges represent weighted causal relationships between the nodes. Each node takes one of a finite set of mutually exclusive states, each with a certain probability. A directed edge runs from a parent to a child. Each child node A has a conditional probability table P(A|B1,…,Bn) based on the values of its parents B1,…,Bn. If the node has no parents, the table reduces to the unconditional probabilities P(A) (i.e., the prior probability).
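The two structural requirements above (an acyclic graph, and one conditional probability table per node) can be sketched in a few lines of Python. This is only an illustrative sketch: the node names A, B1, B2 follow the notation of the paragraph, while the numbers of states and the helper names `topological_order` and `cpt_shape` are assumptions made for the example.

```python
from graphlib import TopologicalSorter, CycleError
from math import prod

# Hypothetical network: each node maps to the list of its parents.
parents = {
    "B1": [],
    "B2": [],
    "A": ["B1", "B2"],
}
# Hypothetical number of mutually exclusive states per node.
n_states = {"B1": 2, "B2": 3, "A": 2}


def topological_order(parent_map):
    """Return the nodes with parents before children; fail if the graph has a cycle."""
    try:
        return list(TopologicalSorter(parent_map).static_order())
    except CycleError as exc:
        raise ValueError("the graph is not a DAG") from exc


def cpt_shape(node):
    """(rows, columns): one row per combination of parent states,
    one column per state of the node itself."""
    rows = prod(n_states[p] for p in parents[node])  # empty product is 1
    return rows, n_states[node]


order = topological_order(parents)
shape_a = cpt_shape("A")    # 2 * 3 parent combinations, 2 states
shape_b1 = cpt_shape("B1")  # no parents: a single row, i.e. the prior P(B1)
```

The table of a node without parents has a single row, which matches the remark that it reduces to the prior probability. The number of rows grows multiplicatively with the number of parent states, which is one practical reason to keep the number of parents per node small.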
A BN is based on Bayes’ Theorem, which is expressed by the well-known formula:
P(R|S) = P(S|R)*P(R) / P(S)
The above Bayes rule is interpreted in terms of updating the belief about a hypothesis R in the light of new evidence S (the updated belief is the posterior probability of each possible state of a variable, that is, the state probabilities after considering all the available evidence). So, the posterior belief P(R|S) is calculated by multiplying the prior belief P(R) by the likelihood P(S|R) that S will occur if R is true, and normalizing by the probability of the evidence P(S) (see more about updating probability in Section 1.4.2).
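As a small numerical illustration of this belief update, the sketch below computes a posterior for a binary hypothesis. The interpretation of R and S and all the probabilities are hypothetical, chosen only to make the arithmetic easy to follow; the function name `posterior` is likewise an assumption of this example.

```python
def posterior(prior_r, p_s_given_r, p_s_given_not_r):
    """P(R|S) from Bayes' theorem for a binary hypothesis R and observed evidence S."""
    # Probability of the evidence: P(S) = P(S|R)P(R) + P(S|not R)P(not R)
    p_s = p_s_given_r * prior_r + p_s_given_not_r * (1.0 - prior_r)
    return p_s_given_r * prior_r / p_s


# Hypothetical numbers: R = "the task will be late", S = "a key developer leaves".
p_late_given_leave = posterior(prior_r=0.2, p_s_given_r=0.6, p_s_given_not_r=0.1)
```

With these assumed values, P(S) = 0.6*0.2 + 0.1*0.8 = 0.2, so the belief in R rises from the prior 0.2 to a posterior of 0.12 / 0.2 = 0.6 once S is observed.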
We can rearrange the formula for conditional probability to obtain the product rule:
P(A,B) = P(A|B)*P(B)
We can extend the above product rule to three variables:
P(A,B,C) = P(A|B,C)*P(B,C) = P(A|B,C)*P(B|C)*P(C) (1.4)
The generalized formula for n variables follows:
P(A1,A2,…,An) = P(A1|A2,…,An)*P(A2|A3,…,An)*…*P(An-1|An)*P(An) (1.5)
Formulas 1.4 and 1.5 are often referred to as the “Chain Rule”, which says that in a BN the full joint probability distribution is the product of all conditional probabilities specified in the BN. These formulas are important for BNs since they provide a means of calculating the full joint probability distribution [9]. Many of the variables Ai will be conditionally independent, which means that the formula can be simplified.
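The chain rule for three variables holds for any joint distribution, which can be checked numerically. The following sketch builds an arbitrary (hypothetical) joint distribution over three binary variables, derives the conditional probabilities from it, and compares their product with the joint; the helper names `marginal` and `chain_rule` are assumptions of this example.

```python
from itertools import product

# Hypothetical joint distribution over three binary variables (A, B, C):
# arbitrary positive weights, normalized so that they sum to 1.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {abc: w / total for abc, w in zip(product([0, 1], repeat=3), weights)}


def marginal(fixed):
    """Sum the joint over all assignments consistent with `fixed` ({position: value})."""
    return sum(p for abc, p in joint.items()
               if all(abc[i] == v for i, v in fixed.items()))


def chain_rule(a, b, c):
    """P(A|B,C) * P(B|C) * P(C), with each factor derived from the joint."""
    p_c = marginal({2: c})
    p_bc = marginal({1: b, 2: c})
    p_b_given_c = p_bc / p_c
    p_a_given_bc = joint[(a, b, c)] / p_bc
    return p_a_given_bc * p_b_given_c * p_c


# Largest deviation between the chain-rule product and the joint itself.
max_error = max(abs(chain_rule(*abc) - joint[abc]) for abc in joint)
```

The deviation is zero up to floating-point rounding. In a BN the benefit comes from the simplification: when a variable is conditionally independent of its non-parents, each factor only needs to condition on the parents of that variable, so far fewer numbers have to be specified than in the full joint table.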
A BN allows probability distributions to be assigned to individual nodes. The initial probability distributions can be based simply on “expert opinions”, surveys or other mathematical methods; i.e., the BN approach combines expert opinion with mathematical calculation.
A BN consists of two parts: 1) the qualitative part, which represents the relationships among variables by a directed acyclic graph, and 2) the quantitative part, which specifies the probability distributions associated with every node of the model. Figure 1.3 shows a BN representing a simple case of the relationship between sub-contract, (team) staff quality and the possibility of delay in a task [23].
In the BN in Figure 1.3, the qualitative part consists of three nodes (representing uncertain variables) and two edges. Each node has a set of states. For example, the node Staff Quality has two states: “Good” and “Poor”. The other part of the directed graph, the edges, represents influential relationships between variables. For instance, an observed event on Sub-contract and/or Staff Quality may lead to Delay in Task.
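To make the structure of this three-node network concrete, the sketch below encodes it in plain Python and performs inference by enumeration. It assumes that the two edges run from Sub-contract and from Staff Quality to Delay in Task, and that Sub-contract and Delay in Task each have the states “Yes” and “No”. All probabilities are hypothetical and chosen for illustration only; they are not taken from Figure 1.3.

```python
from itertools import product

# All numbers below are hypothetical, chosen for illustration only.
p_subcontract = {"Yes": 0.3, "No": 0.7}   # prior of Sub-contract
p_staff = {"Good": 0.8, "Poor": 0.2}      # prior of Staff Quality
# P(Delay in Task = "Yes" | Sub-contract, Staff Quality)
p_delay_yes = {
    ("Yes", "Good"): 0.4,
    ("Yes", "Poor"): 0.9,
    ("No", "Good"): 0.1,
    ("No", "Poor"): 0.5,
}


def joint(sub, staff, delay):
    """Joint probability as the product of the three node tables."""
    p_d = p_delay_yes[(sub, staff)]
    return p_subcontract[sub] * p_staff[staff] * (p_d if delay == "Yes" else 1.0 - p_d)


# Prediction: marginal probability of a delay, summing over both parents.
p_delay = sum(joint(s, q, "Yes") for s, q in product(p_subcontract, p_staff))

# Diagnosis: belief in poor staff quality after a delay has been observed.
p_poor_given_delay = sum(joint(s, "Poor", "Yes") for s in p_subcontract) / p_delay
```

With these assumed tables, the probability of a delay is 0.276, and observing a delay raises the belief in poor staff quality from the prior 0.2 to about 0.449. Enumeration is adequate for a network this small; larger networks need the dedicated propagation algorithms provided by BN tools.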