MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
Nguyen Ngoc Tuan
RISK MANAGEMENT IN SOFTWARE PROJECT SCHEDULING
USING BAYESIAN NETWORKS
PhD DISSERTATION ON SOFTWARE ENGINEERING
Hanoi – 2021
Major: Software Engineering
Code No.: 9480103
SUPERVISORS:
1. Assoc. Prof. Dr. Huynh Quyet Thang
2. Dr. Vu Thi Huong Giang
Acknowledgements
First of all, I would like to express my sincere gratitude to my first supervisor, Assoc. Prof. Dr. Huynh Quyet Thang, for his invaluable guidance and support throughout my research. Professor Thang has supported me all the way, all the time. It is his patience that kept me committed to this research through to the end of my PhD studies. I am also very grateful to my second supervisor, Dr. Vu Thi Huong Giang, whose bright hints and expertise have always been helpful to me.
My special thanks go to Ms. Vo Thi Huong, Ms. Bui Thi Quynh Nga, Mr. Tran Trung Hieu, Mr. Tran The Anh, Mr. Tran Bao Ngoc and Mr. Cao Manh Quyen, who were master's and bachelor's students at the School of ICT, Hanoi University of Science and Technology, and who helped me build the tools and test our models.
I am also indebted to Dr. Nguyen Thanh Nam (former CEO of FPT and former President of FSOFT), Mr. Luu Quoc Tuan (Tinh Van Outsourcing Jsc.), Mr. Ngo Quang Vinh (Evizi) and Mr. Nguyen Huy Binh (FIS), who provided helpful real software project data and valuable expert judgments on the data.
Finally, my greatest appreciation goes to my family, especially to my wife Tran Thi Bich Ngoc and my son Nguyen Minh Huy. Without their love, patience and sacrifice, this achievement would never have been possible.
Summary
Software project management is the art and science of planning and leading software projects. In the software industry, project managers mostly rely on their experience and skills to manage projects and lack scientific tools to support them.
Risk management is a crucial part of software project management that helps prevent software disasters. In this research, risks are defined as uncertain events or conditions that, if they occur, have a negative impact on one or more software project outcomes (cost, time, quality). Identifying and dealing with risks or uncertainty in the early phases of the software development life cycle lessens long-term cost and increases the chance of project success. The most important part of risk management is risk analysis, which assesses the risks and their impact on the outputs of the software project. To overcome subjective assessment based on the development team’s experience, the team needs a quantitative risk analysis method. Software project scheduling is one part of software project planning. Since, in practice, most software projects are over budget and behind schedule, software project scheduling needs careful consideration. This leads to the following questions:
How to schedule software projects better?
How to better manage risks in software projects?
How to quantitatively analyse risks?
Some researchers have argued that Bayesian Networks can be used to quantify uncertain factors in (general) project scheduling and to improve project risk assessment and analysis. Our research aims to bring those advantages of Bayesian Networks to software project scheduling by addressing common software project features. The research answers the above questions by providing probabilistic approaches and tools to assess the impacts of risk factors on software project scheduling; proposing a list of common risk factors and a Bayesian Network model of these risk factors; and proposing advanced scheduling methods that incorporate Bayesian Networks into popular scheduling techniques such as CPM, PERT and agile iteration scheduling. Bayesian Networks help quantify the factors and hence help manage them better, as well as enhancing the predictability of what happens in the project.
This research first reviews the literature on (general) project planning issues, project scheduling techniques, project scheduling tools, uncertainty and risk characteristics in software projects, risk management processes and project risk analysis in order to apply state-of-the-art techniques to software projects (Chapter 1). After that, Bayesian Networks are applied to build and experiment with risk factors in software project scheduling. The BRI (Bayes Risk-Impact) algorithm is proposed to assess the impact of risk factors on software scheduling (Section 2.1). A first set of 5 risk factors is examined using CKDY, a probabilistic tool built by the author, to analyse risks in software project scheduling (Section 2.2).
The research proposes an advanced algorithm for agile iteration scheduling using Bayesian Networks. The advantage of this method is that it provides both a schedule and the probability of finishing an agile iteration on time (Section 3.1). In addition, the author goes further with a more refined list of 19 risk factors in software scheduling and uses them in software scheduling methods. The research also incorporates Bayesian Networks into the CPM and PERT scheduling techniques for traditional software projects, together with the Bayesian Networks of common risk factors (Section 3.2 and Section 3.3). The list of 19 risk factors in agile software development is also examined in agile iteration scheduling (Section 3.4). The experimental results show that our models are reliable and that our approaches have practical implications, i.e. we can take advantage of Bayesian Networks in modelling and quantifying risks and uncertainty in software projects.
How to read this report?
The author highly recommends reading this report from beginning to end. However, if at any point you want to look at specific pieces of information, the following guide may be helpful:
For the motivation, the overview of related work, the objectives, the scope, the hypothesis and the methodology of this research, please go to the Introduction section.
For an overview of software project scheduling and risk management in software project scheduling, please go to Sections 1.1, 1.2 and 1.3.
For an overview of Bayesian Networks, please go to Section 1.4.
For details on the main contributions and key findings of the research, please read Chapter 2 and Chapter 3.
For information on common risk factors in software project scheduling, please see Section 2.3.
Chapter 2 is about building tools and running experiments on applying Bayesian Networks to risk management in software project planning (Section 2.1) and on some key risk factors (Section 2.2).
Chapter 3 is about incorporating Bayesian Networks and common risk factors into software project scheduling techniques such as CPM (Section 3.2), PERT (Section 3.3) and agile software development scheduling (Section 3.4).
For the conclusions, the limitations and the further research directions of this PhD thesis, please read the Conclusion section.
Contents
Acknowledgements
Summary
How to read this report?
List of symbols and abbreviations
List of tables
List of figures
Introduction
Motivation
Related work
Research scope
Research objectives
Scientific and realistic meaning
Research hypothesis and methodology
Expected results
Structure of the thesis
Chapter 1 Overview of software project scheduling and risk management
1.1 Software project management and software project scheduling
1.1.1 Software project management
1.1.2 Software project scheduling
1.2 Software project scheduling methods and techniques
1.2.1 Overview
1.2.2 Traditional scheduling methods and techniques
1.2.3 Agile software project scheduling
1.3 Risk management in software project scheduling
1.3.1 Overview of project risk management
1.3.2 Project risk analysis
1.3.3 Unknown risks
1.3.4 Risk aspects in software project scheduling
1.4 Bayesian Networks
1.4.1 Bayesian approach vs classical approach
1.4.2 Probabilistic approach using Bayesian Networks
1.4.3 Bayesian Inference
1.4.4 Bayesian Networks and project risk management
1.5 Chapter remarks
Chapter 2 Common risk factors and experiments on Bayesian Networks and software project scheduling
2.1 Application of Bayesian Networks into schedule risk management in software project
2.1.1 Common risk factors in software project management
2.1.2 Bayesian Networks of risk factors
2.1.3 Risk impact calculation
2.1.4 Bayesian Risk Impact algorithm
2.1.5 Tool and experiments
2.1.6 Conclusion and contribution
2.2 Experiments on common risk factors
2.2.1 Discovering the top ranked risk factors
2.2.2 Tool CKDY
2.2.3 Experiments and analysis
2.2.4 Conclusion and contribution
2.3 Proposed common risk factors in software project scheduling
2.3.1 The 19 common risk factors in traditional software project
2.3.2 The 19 common risk factors in agile software project
2.3.3 Conclusion and contribution
2.4 Chapter remarks
Chapter 3 Incorporation of Bayesian Networks into software project scheduling techniques
3.1 Applying Bayesian Networks into specific software project development
3.1.1 Introduction
3.1.2 Optimized Agile iteration scheduling
3.1.3 Optimization model for Agile software iteration
3.1.4 Tool and experimental results
3.1.5 Conclusion and contribution
3.2 Incorporation of Bayesian Networks into CPM
3.2.1 The RBCPM Model
3.2.2 The RBCPM Method
3.2.3 Tool and experimental results
3.2.4 Conclusion and contribution
3.3 Incorporation of Bayesian Networks into PERT
3.3.1 Proposed model
3.3.2 Tool development and data collection
3.3.3 Experimental results and analysis
3.3.4 Conclusion and contribution
3.4 Incorporation of Bayesian Networks into Agile software development scheduling
3.4.1 Incorporation of risk model
3.4.2 Tool and experimental results
3.4.3 Conclusion and contribution
3.5 Chapter remarks
Conclusion
What has been done
Main contributions
Limitations
Further research
List of scientific publications
References
Index
Appendix Sub Bayesian Networks of the 24 risk factors
List of symbols and abbreviations
3 BAIS Bayesian Agile Iteration Scheduling
7 CMMi Capability Maturity Model Integration
17 PERT Program Evaluation and Review Technique
19 PMBOK Project Management Body of Knowledge
22 PRAM Project Risk Analysis and Management
25 PSPLIB Project Scheduling Problem Library
26 RAMP Risk Analysis and Management for Projects
27 RBCPM Risk Bayesian Critical Path Method
List of tables
Table 1.1 Basic mathematical notations used for CPM calculation
Table 1.2 The differences between waterfall and agile projects
Table 1.3 The differences between Bayesian and Frequentist approaches
Table 2.1 Hui and Liu’s common risk factors [9]
Table 2.2 Risk factors in the phases
Table 2.3 Risk factors, consequences and impact
Table 2.4 Examples of risk factors and probabilities
Table 2.5 Probability of risk factors in the whole project with data set 1
Table 2.6 Probability of risk factors in the whole project with data set 2
Table 2.7 Probability of the experimental risk factors to compare with MSBNx
Table 2.8 CKDY compared with MSBNx
Table 2.9 List of 19 common risk factors for software project scheduling
Table 2.10 List of 5 risk factors for software project scheduling in Section 2.2
Table 2.11 List of 19 risk factors in iteration scheduling
Table 3.1 The first data sample
Table 3.2 The probability table for tasks and resources
Table 3.3 Risk factors analysis
Table 3.4 Data sample 1
Table 3.5 Data sample 2
Table 3.6 Task attributes of the first data sample
Table 3.7 Task attributes of the second data sample
Table 3.8 Task attributes of the third data sample
Table 3.9 The result for the first data sample
List of figures
Figure 1.1 Activities of project management according to PMBOK Guide
Figure 1.2 CPM parameters in an activity
Figure 1.3 An example of BN which represents a simple case
Figure 2.1 A sub BN for the risk factor “Staff experience shortage”
Figure 2.2 A sub BN for the risk factor “Low productivity”
Figure 2.3 A sub BN for the risk factor “Lack of client support”
Figure 2.4 A sub BN for the risk factor “Inaccurate cost estimating”
Figure 2.5 A sub BN for the risk factor “Incapable project management”
Figure 2.6 A sub BN for the risk factor “Lack of senior management commitment”
Figure 2.7 A sub BN for the risk factor “Inadequate configuration control”
Figure 2.8 A sub BN for the risk factor “Inaccurate metrics”
Figure 2.9 A sub BN for risk factor “Excessive reliance on a single process improvement”
Figure 2.10 The overall BN for software risk factors
Figure 2.11 A simple example of Bayesian inference
Figure 2.12 The three nodes of a simple-chain BN
Figure 2.13 The graphical interface of the tool
Figure 2.14 Result of experiment 1
Figure 2.15 Results of the three experiments
Figure 2.16 Experimental results for Software Design phase
Figure 2.17 Sub BN 1
Figure 2.18 Sub BN 2
Figure 2.19 The overall BN model
Figure 2.20 Experiment with j30 with the early start schedule
Figure 2.21 Activity joint in the file j301_1.rcp
Figure 2.22 Diagram of probabilities of finishing phase by phase
Figure 3.1 Home GUI of tool BAIS
Figure 3.2 Gantt chart for SPT strategy
Figure 3.3 A part of a BN for 19 risk factors
Figure 3.4 Task’s parameters and connection to other tasks
Figure 3.5 A screenshot of RBCPM
Figure 3.6 A result for experiment with data sample 1
Figure 3.7 A result for experiment with data sample 2
Figure 3.8 Bayesian Network for each activity
Figure 3.9 Risk integration network model into PERT scheduling
Figure 3.10 Process in improved RBPERT Model
Figure 3.11 The input screen of the RBPERT tool
Figure 3.12 The input file type of the RBPERT tool
Figure 3.13 A result for the network provided by the RBPERT tool for the first data sample
Figure 3.14 A result for RBPERT network provided by the tool for the first data sample
Figure 3.15 A result for experiment with the third data sample (distribution of Total Duration of activity J)
Figure 3.16 A screenshot of tool BAIS
Figure 3.17 The result of the second experiment
Software projects also have schedule risks and, as a consequence, budget or cost risks. For example, the project on the Vietnamese National Population Database² was approved for investment in 2015 and was planned to be finished in two years (2016 and 2017). However, the system could only be put into operation in February 2021. A similar example is the Vietnamese National Public Service Portal³, which was planned to go public in September 2016 but only opened in December 2019. As a matter of fact, the majority of the software projects the author has experienced in Vietnam have been behind schedule (some of these projects are examined in Chapter 2 and Chapter 3).
Even in developed countries, software projects face ongoing problems. For example, Universal Credit, the welfare payment system owned by the Central Government of the United Kingdom, started in 2013. The project schedule has slipped, with the final delivery date now expected to be 2021, although the system is gradually being introduced. In 2013, only one of four planned pilot sites went live on the originally scheduled date, and the pilot was restricted to extremely simple cases⁴.
Many software projects have suffered from significant budget overruns together with a series of delays, which cause either temporary issues or permanent failures. For example, the Queensland Health Payroll System was launched in 2013 in what could be considered one of the most spectacularly over-budget projects in Australian history, coming in at over 200 times the original budget. Moreover, in spite of promises that the new system would be fully automated, it required a considerable amount of manual operation [1]. Another example, this time of permanent failure, is e-Borders, an advanced passenger information programme that aimed to collect and store information on passengers and crew entering and leaving the United Kingdom. Started in 2007, the project suffered a series of delays and was cancelled in 2014 [2].
1 VnExpress (2019), “Ministry of Transport admits the mistakes on the Cat Linh-Ha Dong urban railway project”, available online (in Vietnamese) at: https://vnexpress.net/bo-giao-thong-van-tai-thua-nhan-sai-sot-trong-du-an-cat-linh-ha-dong-3988254.html
2 Vietnamese Prime Minister (2015), “Decision regarding the approval of investment policy for the project on the National population database”, Government of Vietnam, 2083/QĐ-TTg (26 November 2015)
3 Vietnamese Prime Minister (2015), “Resolutions on e-Government”, Government of Vietnam, 36a/NQ-CP (14 October 2015)
4 Wikipedia.org, “List of failed and over-budget custom software projects”, retrieved 20 September 2019, available online at: https://en.wikipedia.org/wiki/List_of_failed_and_overbudget_custom_software_projects
Some studies have pointed out that most software projects (83.8%) are over budget or behind schedule, and that 52.7% of software development projects deliver software with fewer features than originally specified [3, 4]. Statistics also show that 31.1% of development projects end up being cancelled or terminated prematurely. Among the completed projects, only 61% satisfy the originally specified features and functions [5]. In the software industry, one of the greatest challenges that development teams constantly face is keeping projects under control in terms of budget and schedule (development time frame). The activities of a software project are influenced by factors internal and external to the project organization, which make it uncertain whether the project will achieve its objectives. The effect that this uncertainty has on the project’s goals is called risk [6]. In other words, a risk is an uncertain event or condition that, if it occurs, will have a positive or negative effect on at least one of the project objectives [7]. In this thesis, risks are defined as uncertain events or conditions that, if they occur, have a negative impact on one or more software project outcomes (cost, time, quality).
The above situation raises an important question: how can project risks be managed better in order to eliminate temporary issues and prevent failure?
The purpose of project management is to lead the project to success. A successful software project relies on many factors (e.g. following appropriate processes and tasks, managing risks properly). Since risks are inevitable in projects, risk management has become an important part of project management. Although many researchers, experts and writers have proposed a variety of processes and techniques, project risk management (PRM) is still rapidly evolving, and handling risks in general projects as well as in software projects remains a challenge.
An important component of PRM is risk analysis, which is also known as, or considered the same as, risk quantification. Risk analysis attempts to measure risks and their impacts on different project outcomes (i.e., time, cost, quality). Many software projects fail because project managers mostly plan based on their experience and lack scientific methods to support them. To overcome subjective assessment based on the development team’s experience, the team needs a quantitative risk analysis method. Although various studies have proposed and examined a range of processes and techniques, and software project risk management is continuously evolving, handling uncertainty in increasingly complex real-world projects remains a challenge.
Aside from that, project scheduling (a part of project planning, an early phase of the software development life cycle) is concerned with the techniques that can be employed to manage the activities that need to be undertaken during the development of a project. There are various techniques for project scheduling, from simple and easily understandable ones such as Task List, Gantt Chart and Schedule Network Analysis to more complicated ones such as the Critical Path Method (CPM), the Program Evaluation and Review Technique (PERT), Monte Carlo Simulation (MCS) and Fuzzy Logic [6, 8, 9, 10].
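As an illustration of the CPM mentioned above, its forward and backward passes can be sketched in a few lines of Python. This is a minimal sketch of the classical technique only, not the RBCPM tool developed in this thesis: the function name `cpm`, the dictionary-based input format and the four-task example network are assumptions made purely for illustration.

```python
def cpm(durations, predecessors):
    """Classical CPM on an activity-on-node network (illustrative sketch).

    durations:    dict mapping task name -> duration
    predecessors: dict mapping task name -> list of tasks that must finish first
    Returns (project_duration, critical_tasks, slack_per_task).
    """
    successors = {t: [] for t in durations}
    for task, preds in predecessors.items():
        for p in preds:
            successors[p].append(task)

    # Topological order (Kahn's algorithm); the list grows while we iterate.
    remaining = {t: len(predecessors.get(t, [])) for t in durations}
    order = [t for t in durations if remaining[t] == 0]
    for t in order:
        for s in successors[t]:
            remaining[s] -= 1
            if remaining[s] == 0:
                order.append(s)
    if len(order) != len(durations):
        raise ValueError("the activity network contains a cycle")

    # Forward pass: earliest start (ES) and earliest finish (EF).
    es, ef = {}, {}
    for t in order:
        es[t] = max((ef[p] for p in predecessors.get(t, [])), default=0)
        ef[t] = es[t] + durations[t]
    project_duration = max(ef.values())

    # Backward pass: latest finish (LF) and latest start (LS).
    lf, ls = {}, {}
    for t in reversed(order):
        lf[t] = min((ls[s] for s in successors[t]), default=project_duration)
        ls[t] = lf[t] - durations[t]

    # Tasks with zero slack form the critical path.
    slack = {t: ls[t] - es[t] for t in durations}
    critical = [t for t in order if slack[t] == 0]
    return project_duration, critical, slack


# Hypothetical example network: A precedes B and C, which both precede D.
EXAMPLE_DURATIONS = {"A": 3, "B": 2, "C": 4, "D": 1}
EXAMPLE_PREDECESSORS = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}

if __name__ == "__main__":
    # Project duration 8, critical path A -> C -> D, task B has 2 units of slack.
    print(cpm(EXAMPLE_DURATIONS, EXAMPLE_PREDECESSORS))
```

Because every duration here is a single fixed number, the sketch also shows what plain CPM cannot express: any uncertainty about how long a task will actually take.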
Traditional project scheduling under risk and uncertainty has attracted increasing research attention in the project management community. In some of the project management literature of the 1990s, “risk analysis” was equivalent to “the analysis of risk on project plan” [11]. This thesis focuses on modelling risks in software project time management (which is, of course, indirectly related to the other project outcomes, cost and quality). In other words, this thesis concentrates on quantitative risk analysis in software project scheduling.
The earliest studies incorporating uncertainty and risk in project scheduling were carried out in the late 1950s by Malcolm et al. [12] and Miller [13]. Since then, a variety of techniques have been introduced and several tools have been developed, many of which are widely used across different industries. However, they often fail to capture uncertainty properly and/or produce inaccurate, inconsistent and unreliable results, especially when applied to software projects, whose attributes differ from those of other traditional projects.
Project uncertainty has several aspects, not all of which can be categorized and treated as risks. Several authors, such as Ward and Chapman [14], have argued that project risk management should focus on managing uncertainty and its various sources rather than emphasizing a set of possible events that might have a negative impact on project performance (i.e., it should pay more attention to uncertain aspects than to a fixed set of defined risks). However, since this thesis is about software projects, risks are considered and treated the same as uncertainty. Most of the quantitative techniques and methods in the current practice of project risk management are based on the “Probability Impact” concept, which has certain shortcomings in terms of risk analysis in project scheduling. More sophisticated
in software projects) times are not known for certain. Therefore, they may be treated as random variables.
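The idea of treating activity times as random variables can be illustrated with a small Monte Carlo simulation. The sketch below is only an illustration under stated assumptions: the triangular distribution, the function name `simulate_on_time_probability`, the task tuple layout and the example figures are all hypothetical and are not taken from the models or data of this thesis.

```python
import random


def simulate_on_time_probability(tasks, deadline, runs=10_000, seed=0):
    """Estimate P(project finishes by `deadline`) by Monte Carlo simulation.

    tasks: list of (name, predecessors, (optimistic, most_likely, pessimistic)),
           listed so that every task appears after all of its predecessors.
    Each duration is drawn from a triangular distribution (an assumption).
    """
    rng = random.Random(seed)  # fixed seed so that the estimate is reproducible
    on_time = 0
    for _ in range(runs):
        finish = {}
        for name, preds, (low, mode, high) in tasks:
            # A task starts when its latest predecessor has finished.
            start = max((finish[p] for p in preds), default=0.0)
            # random.triangular takes its arguments as (low, high, mode).
            finish[name] = start + rng.triangular(low, high, mode)
        if max(finish.values()) <= deadline:
            on_time += 1
    return on_time / runs


# Hypothetical network: A precedes B and C, which both precede D.
EXAMPLE_TASKS = [
    ("A", [], (2, 3, 5)),
    ("B", ["A"], (1, 2, 4)),
    ("C", ["A"], (3, 4, 6)),
    ("D", ["B", "C"], (1, 2, 3)),
]

if __name__ == "__main__":
    for deadline in (8, 10, 12):
        p = simulate_on_time_probability(EXAMPLE_TASKS, deadline)
        print(f"P(finish within {deadline} time units) is about {p:.2f}")
```

Unlike a single deterministic schedule, the output is a probability of meeting each deadline. The sampled durations are independent of one another here, so causal links between risk factors and durations are not captured.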
Furthermore, Bayesian Networks (BNs) have attracted a lot of attention in different fields (construction, R&D etc.) as a powerful approach to decision support under uncertainty. A BN is a graphical and mathematical model that offers a powerful, general and flexible approach to modelling risk and uncertainty. Its capability of modelling causality as well as conditional dependency between variables makes it well suited to capturing uncertainty in projects. Yet BNs are rarely applied in project risk management in general, or in software project management and software project scheduling in particular.
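To make the notion of a BN concrete, the following sketch performs exact inference by enumeration on a tiny three-node chain. It is a minimal illustration, not one of the networks built in this thesis: the node names, the conditional probability values and the helper names `joint` and `probability` are all hypothetical.

```python
from itertools import product

# Hypothetical chain: Shortage -> LowProductivity -> Delay.
# All probability values below are made up for illustration.
P_SHORTAGE = 0.3                        # P(Shortage = True)
P_LOW_PROD = {True: 0.7, False: 0.2}    # P(LowProductivity = True | Shortage)
P_DELAY = {True: 0.8, False: 0.1}       # P(Delay = True | LowProductivity)

NAMES = ("shortage", "low_prod", "delay")


def joint(shortage, low_prod, delay):
    """Joint probability of one full assignment, using the chain factorization."""
    p = P_SHORTAGE if shortage else 1 - P_SHORTAGE
    p *= P_LOW_PROD[shortage] if low_prod else 1 - P_LOW_PROD[shortage]
    p *= P_DELAY[low_prod] if delay else 1 - P_DELAY[low_prod]
    return p


def probability(query, evidence=None):
    """P(query | evidence), computed by enumerating all eight possible worlds."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # this world contradicts the evidence
        p = joint(**world)
        denominator += p
        if all(world[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    # Prediction: prior probability of a schedule delay (0.345 with these numbers).
    print(probability({"delay": True}))
    # Diagnosis: belief in a staff shortage once a delay has been observed.
    print(probability({"shortage": True}, {"delay": True}))
```

The same network answers both predictive and diagnostic questions. Enumeration is exponential in the number of nodes, so it only serves to show the idea on very small networks.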
The author of this thesis strongly believes that if we can identify and control risks at the early stages of a software development project, we can significantly increase the project’s chance of success. Since it is difficult (or impossible) to control all of the problems or factors, this thesis focuses only on the time factors related to the software development schedule.
Therefore, this thesis aims to introduce an advanced approach and to find a better model for incorporating and managing uncertainty and risks in software project scheduling. The idea is to use BNs both to perform well-known scheduling techniques such as CPM and PERT and to model risk factors in software project scheduling. The proposed approach enriches the benefits of scheduling techniques by incorporating uncertainty and risk factors and adding the strong analytical power of BNs.
Related work
Several studies have applied BNs to general projects. Khodakarami [15] applied BNs to general project scheduling with two case studies: aircraft design, and the design and construction of a health and fitness center. Erhan et al. [16] proposed a project control framework that integrates project uncertainty and the associated risk factors into project control. Their framework is based on earned value management (EVM), an effective and widely used quantitative project control technique. The framework uses hybrid BNs to enhance EVM with the ability to compute the uncertainty associated with its parameters and risk factors, making it practical for construction projects. Ali et al. [17] combined Monte Carlo Simulation and Bayesian Network methods to present a structure for assessing the aggregated impact of risks on the completion time of a construction project. Lee and Shin [18] proposed an application of BNs to the risk management of shipbuilding projects and proposed 26 risks. Sharma and Chanda [19] developed a BN model for predicting R&D project success, which also assesses projects based on R&D project risk factors. Khodakarami et al. [20] also examined an approach to generating project schedules that incorporates risk, uncertainty and causality using BNs. Their model empowered the traditional CPM to handle uncertainty, and they also provided explanatory analysis to elicit, represent and manage different sources of uncertainty in project planning. Fenton and Neil [21] introduced AgenaRisk, a probabilistic tool based on BNs. Chang, Yu and Cheng [22] proposed a risk-based Critical Path Scheduling Method, based on 2 risk categories and 7 risk levels, which was applied to construction projects.
Regarding risk factors in software projects, Hui and Liu [5] selected 24 risk factors that may have potential impacts on (the whole) software project and applied BN properties to the calculation of impact in their project risk model. Kumar and Yadav [23] considered quantitative features and causal relationships among risk factors in software projects. They introduced a probabilistic approach to assessing risks in software projects and proposed a list of 27 risk factors (in software projects). However, they analysed risks for the whole software project and did not focus on the scheduling and planning phases, which can decide the success of a project. Adjusting Kumar and Yadav’s method, this thesis proposes a list of the 5 most crucial risk factors and builds the tool CKDY to examine risks in software scheduling (Section 2.2).
There has been some other research on BNs and software risk analysis. Hu et al. [24] studied causality analysis among risk factors and project outcomes for software development projects. For this purpose, they proposed a modelling framework based on BNs to deal with causality constraints in risk analysis. The developed framework can be used for discovering new causal relationships and validating existing relationships among risk factors and project outcomes. Anthony et al. [25] proposed a risk assessment model for decision-making in software management, which consists of risk assessment processes and components in three groups: operational risks, technical risks and strategic risks. Rai et al. [26] held that managing projects is managing risks and identified 43 risk indicators in Agile Software Development.
to develop a model that takes into account the relationships of dependence and interdependence that exist between the sources of risks and uncertainties in software projects. As a result, their work contributes to the practice of risk and uncertainty management in software projects.
J. Yong and Z. Zhigang [33] proposed a PERT Bayesian Network (PERTBN) model, together with a modelling methodology and a conditional probability calculation method for different kinds of procedure arrangement (single-chain, centralized, distributed), and stated that the PERTBN model ensures the effectiveness of project schedule control and optimization. However, the research did not examine in depth the risk factors or other software-specific features that can affect the project schedule.
In addition, there is always a need for proper schedule control in software projects to determine the current status of the schedule, to know whether the schedule has changed, and to embrace changes when they occur. To do that, the influential factors that cause schedule changes need to be carefully considered.
In summary, current research related to this thesis concerns either risk management or assessment for the whole software project, or scheduling for other kinds of projects (construction, building, R&D etc.). There is a need for a probabilistic method for risk management in software project scheduling and for a deeper examination of the risk attributes of software project scheduling.
Research scope
The research is about software projects (or software development projects), which have both common features and specific features in comparison with other types of projects (such as construction projects, R&D projects etc.). Unfortunately, there has been little good research on applying probabilistic methods to software development projects. Therefore, this research first conducts a literature review
Research objectives
The main objectives of this research are:
1) To find a quantitative method to better assess and analyse risks in software project scheduling. To achieve this objective, the research has to answer the following questions: what are the risk attributes of software project scheduling? How can risks in software project scheduling be managed better?
In other words, the research aims to analyse and model risks in software project scheduling.
2) To find a probabilistic method to improve well-known software project scheduling techniques, including techniques for both traditional and agile software scheduling.
The proposed methods and models would enhance the risk management process through a quantitative assessment of the impact of risks on software project scheduling. If this model and method are applied in practice, the author expects that they would help to predict and monitor the project schedule better and to make appropriate decisions.
Scientific and realistic meaning
The proposed methods and model would enhance the risk management process through a quantitative assessment of the impact of risks on software project scheduling.
If this model and method are applied in practice, they would help to predict and monitor the project schedule better and to make appropriate decisions.
Research hypothesis and methodology
The hypothesis of this thesis is that it is possible to use BNs to quantify uncertainty in software project scheduling and to improve software project risk assessment.
Since there is very limited research on this topic, the research methodology comprises literature reviews that start from general project management in order to obtain ideas relevant to software project management. First, a literature review investigates the current state of project scheduling under uncertainty, which determines the need, scope and objectives of the new approach. Second, a literature review covers the background, theory and application of BNs. This provides the conceptual and fundamental background for the new approach.
The research also examines the features of software projects, both in waterfall model and agile software development model In order to handle risks in software project scheduling, the common risk factors are also needed to be examined
Within the research, tools are built to validate the models and help software project managers in assessing risks and making appropriate decisions
Expected results
Following the above methodology, the author expects to:
1) Apply Bayesian Networks to develop an algorithm and a tool to assess the impacts of risks and hence propose common risk factors in software project scheduling.
2) Apply Bayesian Networks to develop a probabilistic approach that enhances the common scheduling techniques (for both traditional and agile software development) in terms of risk management and predictability.
Structure of the thesis
An overview of the main chapters is as follows:
Chapter 1 briefly reviews software project scheduling and the software project risk management process, and explores the currently popular techniques in project scheduling.
Chapter 2 consists of initial attempts at applying BNs to risk management in software project scheduling, as well as experiments on common risk factors in software project scheduling. A list of 19 common risk factors for both traditional and agile software development projects is proposed.
Chapter 3 incorporates BNs into popular software project scheduling techniques, namely CPM, PERT and agile software scheduling. BNs are also applied in examining the relationships among the risk factors proposed in Chapter 2.
The last section, Conclusion, concludes the thesis and points the way forward for future research.
The main contributions and results of the research: The research has developed the algorithm BRI (Bayes Risk-Impact) and the tool CKDY to assess the impacts of risks and hence propose common risk factors in software project scheduling. Based on literature review and experiments, the research has identified 19 common risk factors in software project scheduling (for both agile and traditional development styles).
The research also proposes advanced scheduling methods for software project development. The methods are based on incorporating Bayesian Networks and the common risk factor models into popular software scheduling techniques such as PERT, CPM and agile software development, together with an examination of the model of 19 common risk factors. Tools have been built to experiment with the proposed scheduling methods and models. Experimental results show that the proposed methods and models are reliable and provide practical value to software development teams in analyzing, monitoring and predicting risks and the chance of project success.
1.1.1 Software project management
Software project management is the art and science of planning and monitoring software projects. It refers to the branch of project management dedicated to the planning, scheduling, resource allocation, implementation, tracking and delivery of software and web projects [34].
There are various types of projects (R&D projects, construction projects, information system projects, software projects, etc.), which are associated with different styles of management. Software project management is quite distinct from traditional or other project management. Firstly, software is developed, not manufactured. Therefore, the product (working software) is intangible and uniquely flexible. Secondly, software engineering is not recognized as an engineering discipline with the same status as mechanical or electrical engineering. Moreover, software projects have a unique lifecycle process that requires multiple rounds of testing, updating and customer feedback, and the software development process is not standardized. Lastly, most software projects are “one-off” projects: a software development team can only draw on similar experience, not on the same experience or a repeated process.
Therefore, software project management is about the methodology for organizing all activities related to the software. Project management is always needed since software projects always have budget and time constraints.
Nowadays, most IT-related projects are managed in the agile style and software is developed in groups, in order to keep up with the increasing pace of business and to iterate based on customer and stakeholder feedback. Besides IT-related projects, the agile style has also been increasingly used in other kinds of project management.
The project manager leads the project team and often plays the central role among the investors (or customers), the suppliers and the senior management of the organization. He or she makes sure that the project complies with its constraints and that the product (software) is delivered on time. Software project managers may have to do any of the following tasks [34]:
- Planning and scheduling: This means putting together the blueprint for the entire project from ideation to fruition. It will define the scope, allocate necessary resources, propose the timeline, delineate the plan for execution, lay out a communication strategy, and indicate the steps necessary for testing and maintenance.
- Leading: A software project manager will need to assemble and lead the project team, which will likely consist of developers, analysts, testers, graphic designers and technical writers. This requires excellent communication, people and leadership skills.
- Execution: The project manager will participate in and supervise the successful execution of each stage of the project. This includes monitoring progress, holding frequent team check-ins and creating status reports.
- Time management: Staying on schedule is crucial to the successful completion of any project, but it is particularly challenging when managing software projects because changes to the original plan are almost certain to occur as the project evolves. Software project managers must be experts in risk management and contingency planning to ensure forward progress when roadblocks or changes occur.
- Budget: Like traditional project managers, software project managers are tasked with creating a budget for a project and then sticking to it as closely as possible, moderating spending and re-allocating funds when necessary.
- Maintenance: Software project management typically encourages constant product testing in order to discover and fix bugs early, adjust the end product to the customer’s needs, and keep the project on target. The software project manager is responsible for ensuring that proper and consistent testing, evaluation and fixes are carried out.
Therefore, managers have diverse roles. Since software project management is normally concerned with activities that ensure that software is delivered on time, on schedule and in accordance with the requirements of the organizations developing and procuring the software, managers’ most significant activities are planning, estimating and scheduling.
According to the Project Management Institute (PMI) in the Project Management Body of Knowledge (PMBOK) guide [7], project management includes five stages or process groups: Initiating, Planning, Executing, Monitoring and Controlling, and Closing (Figure 1.1).
In modern software project planning, the two essential tasks are project risk management and project scheduling. They play crucial roles in making sure the project is effectively and efficiently organized, including the allocation of resources (hardware, software and network), task and personnel assignment, and monitoring [7, 10]. Software projects are quite different from other projects: software requirements change continuously during the software development life cycle, and software projects are often behind schedule and over budget. Moreover, in reality, many software project managers either ignore risk management or do not carry it out appropriately. This leads to project failure or to customer complaints about the quality, the schedule or the budget overrun of the project. Other project managers are aware of risk management but rely only on the skills or experience of their own team, even if they follow CMM/CMMI (Capability Maturity Model Integration) or PMP (Project Management Professional). As can be seen in Figure 1.1, risk management affects all the processes in the Process Groups. In addition, project teams can adjust or update the planning process while they are executing, monitoring and controlling their projects.
Figure 1.1 Activities of project management according to PMBOK Guide
1.1.2 Software project scheduling
Software project scheduling is one of the most demanding tasks for software project managers. It is all about resource allocation during the project life cycle. In simple terms, software project scheduling splits the whole project into smaller tasks and estimates the time and resources required to complete each task. Software development teams normally try to organize tasks concurrently to make optimal use of the workforce and to minimize task dependencies to avoid delays caused by
1.2 Software project scheduling methods and techniques
1.2.1 Overview
There are many popular techniques for project scheduling, including:
- Graphical representations used to illustrate the project schedule, such as:
+ Work Breakdown Structure: shows the project breakdown into tasks
+ Activity Charts: show task dependencies and the critical path
+ Gantt Charts: bar charts that show the schedule against calendar time
- Critical Path Method – CPM [10, 15, 20]
- Program Evaluation and Review Technique – PERT [12, 13, 15, 35]
Project scheduling (especially under uncertainty) is the most widely studied area of risk quantification in project management. Producing a reasonable and reliable project schedule is one of the crucial tasks of project managers. Moreover, having a realistic schedule for the project is one of the most cited factors of project success [36]. Several techniques have been proposed for modelling risk and uncertainty in project scheduling [10, 35, 37].
This section reviews some notable techniques. CPM and PERT are the classical approaches to project scheduling. Simulation-based techniques are a more modern approach that has been adopted by many project management software tools, and some argue that they are the best practice available. Alternative approaches, namely the Critical Chain Method and fuzzy logic, will be reviewed briefly. Last but not least, scheduling techniques and methods for agile software development will also be discussed.
1.2.2 Traditional scheduling methods and techniques
a) Critical Path Method (CPM)
Critical Path Method (CPM) is one of the most famous techniques in project scheduling. Developed in 1957 by DuPont, CPM has become the standard technique in project management, and most project management tools support CPM calculation [15]. According to Pollack-Johnson and Liberatore [38], almost 70% of project managers or professionals use CPM. CPM calculation includes the following steps:
- Specify the individual activities using a work breakdown structure
- Determine the sequence of those activities and dependency between them
- Draw a network diagram (that models the activities and their dependency)
- Estimate the completion time (duration) for each activity
- Identify the critical path (the longest-duration path through the network)
- Update the CPM diagram as the project progresses
The basic mathematical notations used for CPM calculation are shown in Table 1.1. The parameters D, ES, EF, LF and LS are commonly used in scheduling techniques.
Table 1.1 Basic mathematical notations used for CPM calculation
No. | Notation | Meaning | Formula
6 | LS | Latest start of aj | LSj = LFj – Dj
7 | TF | Total float of aj: the time by which the activity’s duration can be increased without increasing the overall project completion time | TFj = LSj – ESj = LFj – EFj
A critical activity is one with no float time (TF = 0). It should receive special attention, since a delay in a critical activity delays the whole project. Informally, the critical path is determined by performing forward and backward passes through the project network. The forward pass computes the earliest start (ES) and the earliest finish (EF) time for each activity. The backward pass computes the latest start (LS) and the latest finish (LF) time for each activity. The total float of each activity is the difference between its latest and earliest finish [15]. The connections among these parameters in an activity are described in Figure 1.2.
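The forward and backward passes described above can be sketched in Python as follows. This is a minimal illustration rather than the tool built in this thesis: the function name `cpm`, the four-activity network and its durations are hypothetical, and only finish-to-start dependencies are assumed.

```python
from collections import defaultdict

def cpm(durations, predecessors):
    """Compute ES, EF, LS, LF, TF and the critical activities.

    durations: {activity: duration}
    predecessors: {activity: [activities that must finish first]}
    """
    # Topological order via depth-first search over predecessors.
    order, visited = [], set()

    def visit(activity):
        if activity in visited:
            return
        for pred in predecessors.get(activity, []):
            visit(pred)
        visited.add(activity)
        order.append(activity)

    for activity in durations:
        visit(activity)

    # Forward pass: earliest start and earliest finish.
    es, ef = {}, {}
    for a in order:
        es[a] = max((ef[p] for p in predecessors.get(a, [])), default=0)
        ef[a] = es[a] + durations[a]
    project_end = max(ef.values())

    successors = defaultdict(list)
    for a, preds in predecessors.items():
        for p in preds:
            successors[p].append(a)

    # Backward pass: latest finish and latest start.
    ls, lf = {}, {}
    for a in reversed(order):
        lf[a] = min((ls[s] for s in successors[a]), default=project_end)
        ls[a] = lf[a] - durations[a]

    # Total float and critical activities (TF = 0).
    tf = {a: ls[a] - es[a] for a in order}
    critical = [a for a in order if tf[a] == 0]
    return es, ef, ls, lf, tf, critical

# Hypothetical network: A precedes B and C, which both precede D.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
es, ef, ls, lf, tf, critical = cpm(durations, predecessors)
print(critical, ef["D"], tf["B"])
```

For this sample network the critical activities are A, C and D, the project duration is 8, and activity B has a total float of 2.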
Therefore, CPM is a deterministic model which uses a fixed time estimate for each activity. Although CPM (“pure deterministic in nature” [20]) was not developed to handle or quantify uncertainty, it does provide very useful information about the relations between activities, activity times and the overall project schedule (so that project scheduling can be controlled).
Figure 1.2 CPM parameters in an activity
b) Program Evaluation and Review Technique (PERT)
PERT was introduced in 1957 by the US Navy as one of the earliest attempts to incorporate risk into project management [13, 15]. A special feature of PERT is its ability to handle uncertainty in activity duration: a variation in the time estimate of one activity may affect the whole project. The PERT methodology was developed to help complete the project successfully when time estimates are not definitive.
To do so, instead of the single estimate used in CPM, PERT assigns a beta probability distribution to each project activity. Three time estimates (optimistic, most likely and pessimistic) are obtained and used to estimate the expected time and the standard deviation for an activity i.
The optimistic time estimate is the estimate determined by considering all favorable conditions, i.e. the best-case scenario in which everything goes right. In other words, this is the shortest time in which the activity may be completed.
The most likely time estimate is the duration within which there is a high probability of completing the activity. In other words, it is the estimate under normal problems and opportunities.
The pessimistic time estimate is the estimate determined by considering all unfavourable conditions, i.e. the worst-case scenario in which everything goes completely wrong. In other words, this is the longest time the activity might require to complete.
- Expected time: μi = (Optimistic + 4 × Most likely + Pessimistic)/6
- Standard deviation: σi = (Pessimistic – Optimistic)/6
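As a small worked example of the two formulas above, the following Python sketch computes the expected time and standard deviation of a single activity. The function name `pert_estimate` and the sample estimates (2, 4 and 12 days) are hypothetical and only serve as illustration.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Return (expected time, standard deviation) of one activity."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Hypothetical activity: 2 days at best, 4 days most likely, 12 days at worst.
expected, std_dev = pert_estimate(2, 4, 12)
print(expected, round(std_dev, 2))
```

With these estimates the expected time is 5 days and the standard deviation is about 1.67 days; the long pessimistic tail pulls the expected time above the most likely value.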
The critical path is the sequence of project activities that determines the earliest time by which the project can be completed, and its total duration determines the completion date of the project. PERT assumes that only one path is the critical path and that this path does not change. Therefore, managers using PERT are advised to focus on the critical activities to ensure that the project completion date remains unchanged. The expected value of a critical path is the sum of the expected values of its activities, and the variance of the critical path is the sum of the variances of all activities on the path. Based on this calculation, the probability that the project will be completed by a certain date can be calculated. PERT is therefore somewhat similar to CPM. The main difference is that each activity in a PERT network has a variance associated with its completion time. In other words, CPM is deterministic, while PERT is to some extent probabilistic.
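The completion probability mentioned above can be sketched as follows. The sketch assumes, as is conventional in PERT but not stated explicitly in the text, that the duration of the critical path is approximately normally distributed; the function name `completion_probability` and the three sample activities are hypothetical.

```python
from math import erf, sqrt

def completion_probability(critical_activities, deadline):
    """Estimate P(project duration <= deadline) along one critical path.

    critical_activities: list of (optimistic, most_likely, pessimistic).
    Assumption: the path duration is approximately normal.
    """
    mean = sum((o + 4 * m + p) / 6 for o, m, p in critical_activities)
    variance = sum(((p - o) / 6) ** 2 for o, m, p in critical_activities)
    z = (deadline - mean) / sqrt(variance)
    probability = 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF
    return mean, variance, probability

# Hypothetical critical path with three activities (durations in days).
path = [(2, 4, 12), (3, 5, 7), (1, 2, 3)]
mean, variance, p12 = completion_probability(path, deadline=12)
_, _, p14 = completion_probability(path, deadline=14)
print(mean, round(variance, 2), round(p12, 2), round(p14, 2))
```

For this path the expected duration is 12 days, so the probability of finishing within 12 days is 0.5, while a 14-day deadline raises it to roughly 0.86.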
c) Simulation-based techniques
Monte Carlo Simulation (MCS) was first proposed for project scheduling in the early 1960s [39]. However, it was not until the 1980s, when sufficient computer power became available, that simulation became the dominant technique for handling risk and uncertainty in projects [40, 41]. In its simplest form, MCS uses the project activity diagram.
The duration of each activity is described by its shortest, most likely and longest duration and by the shape of the distribution (such as Normal, Beta, etc.). The critical path calculation is then performed many times, each time using random values drawn from the activities’ distribution functions.
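A minimal sketch of this procedure is given below. It is an illustration only: a triangular distribution is assumed for every activity (the text also mentions Normal and Beta shapes), and the function name `simulate_project`, the network and the three-point estimates are hypothetical.

```python
import random

def simulate_project(activities, predecessors, runs=5000, seed=1):
    """Monte Carlo simulation of the project duration.

    activities: {name: (shortest, most_likely, longest)}, listed in
                topological order (an assumption of this sketch).
    predecessors: {name: [names that must finish first]}
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        earliest_finish = {}
        for name, (low, mode, high) in activities.items():
            duration = rng.triangular(low, high, mode)
            start = max(
                (earliest_finish[p] for p in predecessors.get(name, [])),
                default=0.0,
            )
            earliest_finish[name] = start + duration
        # Project duration of this run = longest path through the network.
        totals.append(max(earliest_finish.values()))
    return totals

# Hypothetical network: A precedes B and C, which both precede D.
activities = {
    "A": (2, 3, 5),
    "B": (1, 2, 4),
    "C": (3, 4, 8),
    "D": (1, 1.5, 2),
}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
totals = simulate_project(activities, predecessors)
on_time = sum(t <= 10 for t in totals) / len(totals)
print(f"P(duration <= 10) is about {on_time:.2f}")
```

Each run recomputes the forward pass with freshly sampled durations, so the share of runs that finish by a given date estimates the probability of meeting that date.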
More advanced tools like PertMaster (Oracle Primavera Risk Analysis [42]) use a simulation-based approach not only for handling uncertainty in duration and cost, but also for providing a whole risk analysis process. They can link the project
However, simulation has its own drawbacks. One serious methodological flaw in traditional MCS of project networks is the assumption of statistical independence for individual activities which share risk factors with other activities [38]. Most available simulation packages assume that the marginal distributions of uncertainty for the individual activities completely define the multivariate distribution of the project schedule. This assumption is highly suspect for the many projects which involve multiple activities of a similar type and/or different activity types that are influenced by common risk factors. Van Dorp and Duffey [45] demonstrated in 1999 that failure to model such risk dependence during MCS can result in underestimation of the total uncertainty in the project schedule. The most effective way to deal with statistical dependence is to use a causal structure to explain it, and MCS is not capable of modelling causal structures.
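The effect of ignoring a common risk factor can be illustrated with a small simulation. This is a hypothetical toy model, not the model of van Dorp and Duffey: ten activities each take 5 days plus a 3-day delay that occurs with probability 0.3, and the delay is driven either by one shared risk factor or by independent draws. The marginal distribution of each activity is identical in both cases.

```python
import random
import statistics

def total_duration_std(shared, runs=5000, seed=7, n_activities=10):
    """Standard deviation of the total duration of sequential activities.

    shared=True: one common risk factor delays all activities together.
    shared=False: each activity is delayed independently.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        common_hit = rng.random() < 0.3
        total = 0.0
        for _ in range(n_activities):
            hit = common_hit if shared else (rng.random() < 0.3)
            total += 5.0 + (3.0 if hit else 0.0)
        totals.append(total)
    return statistics.pstdev(totals)

independent_std = total_duration_std(shared=False)
shared_std = total_duration_std(shared=True)
print(round(independent_std, 1), round(shared_std, 1))
```

In theory the standard deviation of the total is about 4.3 days under independence and about 13.7 days with the shared factor, so a simulation that assumes independence would understate schedule uncertainty by roughly a factor of three in this toy case.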
Another weakness of MCS, explained by Williams [46], is the inability of simulation to capture the actions taken by managers to recover any slippage in activity or project duration. MCS simply runs through a network, assigning values to random variables on each iteration. It ignores the fact that, in reality, if an activity were running late, management would take actions to affect the activity duration. Uncertainty in an activity is usually the result of a chain of causes (sources) and can be affected by a chain of actions (controls).
Furthermore, MCS is only as good as the information that is fed into it. If the duration distributions of the project activities are incorrect or inadequate, the simulation results are erroneous and invalid. In reality, the durations of most activities are estimated subjectively. In order to capture all aspects of uncertainty in activity (project) duration, various known and unknown sources of risk have to be addressed. Therefore, MCS will not be applied as a scheduling technique in the scope of this thesis.
d) Fuzzy logic
An alternative approach that has interested several researchers in the past two decades [47, 48] is fuzzy project scheduling. The fuzzy set scheduling literature recommends the use of imprecision rather than uncertainty, fuzzy numbers rather than stochastic variables, and membership functions rather than probability distributions. The output of fuzzy scheduling is normally a fuzzy schedule, which indicates fuzzy starting and ending times for the activities. This may be as difficult to generate as probability distributions of activity duration, and there is no generally accepted computational approach available. Therefore, fuzzy project-scheduling approaches have remained in the academic sphere. A summary of most of the published research on fuzzy project scheduling can be found in the work of Bonnal et al. in 2004 [49].
1.2.3 Agile software project scheduling
From the late 1990s, several methodologies such as RUP, XP, FDD and Scrum began to get increasing public attention and have become mainstream software development methods, especially in Vietnam, where most software vendors are small and medium enterprises. These methods are representative of agile software development.
Agile (denoting “the quality of being agile; readiness for motion; nimbleness, activity, dexterity in motion” [50]) software development methods attempt to offer an answer to the eager business community asking for lighter-weight, faster and nimbler software development processes. This is especially the case for the rapidly growing and volatile Internet software industry as well as for the emerging mobile application environment.
Agile development is a way of organizing the development process that emphasizes direct and frequent communication (preferably face-to-face), frequent deliveries of working software increments, short iterations, active customer engagement throughout the whole development life cycle, and change responsiveness rather than change avoidance [51]. Thus, agile software development recognizes that software development is inherently a type of product development and therefore a learning process. It is iterative, explorative and designed to facilitate learning as quickly and efficiently as possible. Two of the most significant characteristics of agile approaches are: 1) they can handle unstable requirements throughout the development cycle; and 2) they deliver products in shorter time frames and under budget constraints when compared with traditional development methods.
An agile approach can be seen as a contrast to (traditional) waterfall-like processes [52, 53, 54], which pay attention to thorough and detailed planning and design upfront and to subsequent conformance to the plan. The waterfall model is the oldest and the most mature software development model [53]. In practice, the waterfall development model can be followed in a linear way, and an iteration in an agile method can also be treated as a miniature waterfall life cycle.
Agile approaches have been widely employed in domains with a low cost of failure or a linear incremental cost of failure [55]. Examples within this domain include web-based applications, mobile applications [50], Internet commerce, social networking, games development, and even some areas of government, finance and banking software development.
Table 1.2 summarizes some of the differences between waterfall and agile projects.
Table 1.2 The differences between waterfall and agile projects
Product/scope
- Waterfall: An often bloated product that is still missing features (i.e., rejected change requests or de-scoped to meet deadlines)
- Agile: The best possible product according to customers’ own prioritization, incorporating learning from actual use (evolves with the increments)
Schedule/
extensively and expensively
- Agile: Quality is built in, and is the key to productivity (writing tests before writing code)
- Agile: Value is generated early, as soon as the minimum highest prioritized features are delivered. Greater return on investment
Relationship to the customer
Since agile software development is organized iteratively and incrementally, agile software scheduling is actually iteration scheduling. Iteration scheduling aims at determining a feasible and precise development plan that schedules the implementation of the selected features within an iteration (i.e. assigning tasks to developers). Technical tasks (or Sprint backlog items in Scrum) are the main concepts of iteration scheduling. These tasks are the fundamental working units accomplished by one developer, and each usually requires a realization effort of some working hours, as estimated by the team. The aim of iteration scheduling is to break down the selected requirements into technical tasks and to assign them to developers [56]. In that process, the development team also needs to take care of task dependencies (sequencing) and time constraints. The problem of optimized agile iteration scheduling will be discussed in detail in Section 3.1.
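To make the idea of assigning technical tasks to developers concrete, the sketch below uses a simple greedy heuristic (largest task first, to the least loaded developer). This is a swapped-in illustration and not the optimized method of Section 3.1: task dependencies are ignored, and the function name `assign_tasks`, the task names, the effort estimates and the 40-hour capacity are hypothetical.

```python
import heapq

def assign_tasks(task_hours, developers, capacity_hours):
    """Greedily assign tasks to the currently least loaded developer.

    task_hours: {task: estimated effort in working hours}
    Raises ValueError if a task cannot fit into the iteration.
    """
    heap = [(0.0, dev) for dev in developers]  # (current load, developer)
    heapq.heapify(heap)
    plan = {dev: [] for dev in developers}
    # Largest tasks first, so small tasks can fill the remaining gaps.
    for task, hours in sorted(task_hours.items(), key=lambda kv: -kv[1]):
        load, dev = heapq.heappop(heap)
        if load + hours > capacity_hours:
            raise ValueError(f"task {task!r} does not fit in the iteration")
        plan[dev].append(task)
        heapq.heappush(heap, (load + hours, dev))
    return plan

# Hypothetical sprint backlog (hours) for a two-developer team.
task_hours = {"login": 8, "api": 6, "ui": 5, "tests": 4, "docs": 3}
plan = assign_tasks(task_hours, ["dev1", "dev2"], capacity_hours=40)
print(plan)
```

Here dev1 receives the tasks login and tests (12 hours) and dev2 receives api, ui and docs (14 hours). A realistic iteration scheduler would also have to respect task sequencing and individual skills.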
1.3 Risk management in software project scheduling
1.3.1 Overview of project risk management
Risk management has become an important part of project management and has attracted a wide range of research during the last two decades [11]. Since 1990, various Risk Management Processes (RMP) have been proposed. Probably the most popular Project Risk Management Processes (PRMP) are Chapter 11 of the PMBOK (Project Management Body of Knowledge) guide [7], the PRAM (Project Risk Analysis and Management) guide [57] and the RAMP (Risk Analysis and Management for Projects) guide [58]. Most organisations adopt one of these guides or use them to develop their own process. This thesis does not intend to explore the detailed differences between the guides since, apart from fundamental differences in assumptions and methodologies [59], they all aim to capture risk and uncertainty in the following three stages: risk identification, risk analysis and risk response.
The usual output of the risk identification stage is a document called the Risk Register. Many authors have discussed risk registers in their works [60]. Williams [61] stated two main roles for a risk register:
- A repository of a corpus of knowledge
- To initiate the analysis and plans that flow from it
Chapman and Ward [14] consider a risk register to be documentation of the sources of the risks, their responses and the risk classification. Ward [62] described the purpose of a risk register as “to help the project team review project risk on a regular basis throughout the project”. Patterson and Neailey [63] presented a risk register database system to aid in managing project risk. Risk registers can be a good management tool during the course of a project. However, it is not possible to identify all risks and capture all aspects of them. There are always unknown (i.e. undiscovered, unattended or immeasurable) risks that are often more important than the risks identified in the risk register.
The Risk Analysis stage attempts to measure each risk and its impact on different project outputs (i.e. cost, time and performance). This stage is also known as quantitative risk management. The likelihood that each identified risk will occur and its possible impact on the project are estimated. Combining the probabilities of the risks with their impacts creates “probability-impact” (PI) matrices. Such a matrix can be used to assign ranks to risks and then prioritise them. Most of the available quantitative tools and techniques (simulation-based tools) use PI values to quantify uncertainty in projects. However, the use of PI matrices has some important shortcomings [11].
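The ranking step can be sketched as follows. The scoring rule (probability multiplied by impact on a 1 to 5 scale) is one common convention and is an assumption here, as are the function name `rank_risks`, the risk names and their values.

```python
def rank_risks(risks):
    """Rank risks by probability-impact score, highest first.

    risks: {name: (probability in [0, 1], impact on a 1-5 scale)}
    """
    scored = [(name, prob * impact) for name, (prob, impact) in risks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical entries taken from a risk register.
risks = {
    "Staff turnover": (0.2, 5),
    "Requirement changes": (0.6, 4),
    "Tool failure": (0.1, 2),
}
ranking = rank_risks(risks)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

With these values, requirement changes rank first, followed by staff turnover and tool failure. Such a single score hides the causes and interdependencies of risks, which is one of the shortcomings noted above.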
The Risk Response stage attempts to formulate management responses to the risks. Also known as “Risk Mitigation”, it uses the results of the analysis stage in order to improve the chance of achieving the project objectives. “Risk Response” is a decision-making process. A number of alternative strategies are available when planning risk responses, each of which falls under one of the following strategies [64]:
- Avoid: seeking to eliminate uncertainty by reducing either the probability or the impact to zero
- Transfer: seeking to transfer ownership and/or liability to a third party (e.g. insurance)
- Mitigate: seeking to reduce the size of the risk exposure in order to make it more acceptable to the project or organization
- Accept: recognizing residual risks and responding either actively, by allocating appropriate contingency, or passively, by doing nothing except monitoring the status of the risk
in the process
1.3.2 Project risk analysis
The term risk analysis in the scope of this research is the same as quantitative risk analysis and is related to risk measurement, as we focus on the quantitative issues of project risks. Project risk analysis is one stage of project risk management. In some literature, risk analysis is even synonymous with risk management.
In fact, risk analysis usually starts with a qualitative analysis, and its results support the decision-making process in the Risk Response stage. It is a continuous process that can be started at almost any stage of a project. However, it is best to use risk analysis in the early stages of a project (i.e. phases such as feasibility study and planning) and to update it continually during the implementation phase. This can be done iteratively at intervals, which also matches agile software development.
Risk analysis is the most “formal” aspect of the project risk management process [64], often involving sophisticated techniques and usually requiring computer software (or tools). Such techniques may be applied with various levels of effort, depending on the resources available for the analysis and on the level of detail. Risk analysis can bring certain benefits to a software project, including:
- Helping to make decisions and enabling more effective and efficient risk management
- Helping to make more feasible (realistic) plans, in terms of both duration and costs
- Helping to build up statistical data on historical risks, which in turn benefits the planning and implementation of future projects
1.3.3 Unknown risks
One important category of uncertainty in projects is “Unknown Risks”. These are important sources of uncertainty because their impact on a project may outweigh all other sources of risk.
Although unknown risks are thoroughly acknowledged (perhaps under different names) by several authors, none of the existing approaches to project scheduling is able to model and quantify this type of risk. The conventional “probability-impact” approach is at best only capable of modelling “known risks”. Most of the current quantitative techniques for risk analysis are event-oriented and more concerned with the “risk of something happening”. They assume that a list of events (conditions) that may take place is known, that the impact of each risk on activity duration is also known, and even that the nature of the response to each risk is roughly known [15]. However, unknown risks are unpredictable and immeasurable (their impacts are unknown or hard to quantify). Those risks require much effort to clarify. An example of unknown risks is Internally Generated Risk (IGR) [73]. As the name reveals, IGRs originate from within the project team or organization: from the rules, policies, regulations, structures, actions, behaviours or culture of the organization. IGRs have the following features:
- Common, since organizational issues such as policies, processes, culture etc are widespread in most projects of the organization
- Important, since they often have impact on more than one activity
- Not well-managed in projects, as they are unpredictable (and hardly put in documents or risk registers) and hard to quantify
1.3.4 Risk aspects in software project scheduling
In different project management processes there are different aspects of uncertainty or risk [20]. This thesis focuses on quantitative risk management, which is concerned with risks affecting the project schedule (or project time frame), including risks affecting project scheduling (a phase or a process in project planning). As can be deduced from the previous sections, these risks cannot be completely separated from the risks of other processes or phases.
In project scheduling, the most obvious risk lies in the duration estimate for a particular activity. Difficulty in this estimation can arise from a lack of knowledge of what is involved as well as from the uncertain consequences of potential threats or opportunities. Some sources of uncertainty are:
- Level of available and required resources (including inexperienced or insufficiently trained developers)
- Incomplete (or often changing) requirements
- Tradeoff between resources and time
- Possible occurrence of uncertain events (especially those cause badly impact,
- Lack of previous experience and use of subjective instead of objective data
- Incomplete or imprecise data, or lack of data
- Uncertainty about the basis of subjective estimation (i.e bias in estimation)
1.4 Bayesian Networks
1.4.1 Bayesian approach vs classical approach
The fields of statistics and data analysis are concerned with inferring the probability of an uncertain event. The differences between the classical (also called Frequentist) approach and the Bayesian approach are summarised in Table 1.3.
Table 1.3 The differences between Bayesian and Frequentist approaches
Aspect | Bayesian | Frequentist
Judgement | Depends on the person’s (subjective) opinions or beliefs | A fact, independent of the analyst’s opinions or beliefs