In the traditional approach to construction, the design phase must be finalized before any building work commences. Typically, the design and construction processes are handled by separate entities that engage directly with the property owner.
The Design and Build (D&B) approach assigns the contractor full responsibility for both the design and construction phases of a project, consolidating accountability and streamlining the process. This method is increasingly favored by clients and contractors as a modern alternative to the traditional Design-Bid-Build (DBB) model, facilitating a more integrated project delivery system (Hegazy, 2002).
- Turnkey: This approach is similar to the design-build approach, but the Builder is responsible for design, construction and project financing (Hegazy,
Advantages & Disadvantages of Project Delivery Methods:
Table 1.1- Summary of advantages & disadvantages of Project Delivery Methods
- The designer, engineer, and constructor are all familiar with this method
- Easy to use in all markets, including public and private
- The project takes longer
- The designer cannot benefit from construction experience
- Disputes and claims arise if there are any changes
- No conflict among the parties
- The design can benefit from construction experience
- Time can be reduced because design and construction overlap
- Cost may not be known until the end of design
- High risk to contractor and more cost to owner
Role of Cost Estimating in construction project:
Cost estimation is essential for the success of construction organizations, as it facilitates feasibility analysis, budgeting, and securing funding from owners, and provides a baseline for evaluating contractors.
Cost estimating aims to provide a foundation for managing project expenses by generating cost estimates that align with the approved budget, while also supplying critical information to support the decision-making process in project development.
Effective cost estimating for bids must adapt to various stages of a project. Estimators, typically representing the owner or designer, are often required to generate estimates based on conceptual information, lacking precise dimensions, details, specifications, and a defined schedule of the owner's needs.
In a highly competitive industry with declining market shares and shrinking profit margins, the cost of delivering services or products is crucial for decision-making. During the tendering phase, accurate cost estimates for capital expenditures significantly influence planning, bidding, and overall cost management, impacting critical commitments like resource allocation. These estimates are essential for project managers to assess feasibility and maintain effective cost control, ultimately affecting the client's decision to proceed with the project. Clients expect contractors to complete projects within budget, which is vital for client satisfaction. Therefore, precise cost estimates not only enhance a contractor's reputation but also strengthen client relationships.
Vietnam has faced significant challenges in construction projects, including frequent delays, cost overruns, and failures, which have disheartened investors and stakeholders. The project implementation method predominantly used in the country is the traditional "design-bid-build" approach.
The traditional procurement method relies on a Bill of Quantity (BoQ) and design drawings prepared by skilled professionals, ensuring regulatory compliance and design certainty. However, this approach often falls short in delivering cost savings, speed, and effective integration of design and construction, leading to a shift in client perspectives (Young et al., 2021). Additionally, the BoQ utilized during the tender phase is often unable to accurately forecast the final project cost due to incomplete information in the drawings and specifications, resulting in a limited understanding of the client's needs.
Adopting a more efficient method than traditional practices can help address the challenges facing Vietnam. The design-build approach, recognized as an effective project implementation strategy, has gained popularity for its numerous benefits to all stakeholders. This modern technique is widely utilized across the globe, enhancing project outcomes and collaboration.
- In the UK, D&B was already in use in the 1960s, and by the end of the 1990s it had captured 23% of the market for new construction projects (Ling et al., 2004).
- In the USA, D&B started in the early 1900s, and by the mid-1990s more than one-third of construction projects in the USA used the D&B approach (Ling et al., 2004).
Since its introduction in 1992, Design and Build (D&B) has become increasingly popular in Singapore's construction sector, with its share of public sector projects rising to 16% and private sector projects reaching 34.5% by 2000 (Ling et al., 2004).
Hence, adopting the design-build method for project implementation is likely to result in high satisfaction among project participants in Vietnam, thanks to the numerous benefits it offers.
In 2017, Vietnam experienced a significant rise in the adoption of the design-build approach among leading construction companies, which used it as a strategic bidding tactic. This method simplifies the construction process for investors by removing the complexities of traditional plans, such as managing multiple contractors and dividing work into smaller packages. By implementing design-build, projects can potentially save 20-30% in progress costs, enhancing cash flow for all stakeholders. In practice, some projects have reported savings of 10-15% of total investment capital through the effective use of this approach.
While the design-build method presents challenges, particularly in large-scale and complex projects, it offers construction companies in Vietnam a chance to demonstrate their expertise. This approach also promotes the integration of modern technologies, enhancing the country's appeal to foreign investors.
Accurate cost-estimating models are essential for effective decision-making in the early stages of building design, where project information is often limited. In Vietnam, the traditional floor area method is commonly used for cost estimation, which involves multiplying the total gross floor area by a historical cost per square meter. However, this method's accuracy is generally low, with estimates varying by 15% to 25%. The scarcity of data and cost volatility further complicate accurate estimations at this stage. Therefore, specialized methods like Case-Based Reasoning, which leverage AI techniques, offer significant advantages over conventional approaches in improving cost estimation accuracy.
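The floor area method described above amounts to a single multiplication, which the following minimal sketch illustrates. The function names and all input figures are hypothetical and chosen only for illustration; the error band simply applies the 15% to 25% variation quoted in the text as a symmetric range, which is an assumption.

```python
def floor_area_estimate(gross_floor_area_m2, cost_per_m2):
    """Early-stage estimate: gross floor area times a historical unit cost."""
    return gross_floor_area_m2 * cost_per_m2


def error_band(estimate, relative_error):
    """Return (low, high) bounds for an assumed symmetric relative error."""
    return estimate * (1 - relative_error), estimate * (1 + relative_error)


# Hypothetical inputs: 12,000 m2 at 900 currency units per m2.
estimate = floor_area_estimate(12_000, 900)
low, high = error_band(estimate, 0.25)
print(f"estimate={estimate:,.0f}, range at 25% error: {low:,.0f} to {high:,.0f}")
```

The width of the resulting range shows why a more accurate early-stage model is attractive.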
Figure 1.3- Transition from early cost estimation to final cost (Petroutsatou et al.,
Motivated by the limitations discussed in section 1.3, the author presents an AI-based method designed to reduce error rates compared to conventional approaches while remaining cost-effective. This solution integrates with widely used tools such as Excel and other existing computational resources.
This research aims to develop an accurate AI-driven cost estimation method that leverages existing data from previous projects to enhance cost performance predictions for new projects using the Design and Build (D&B) procurement method. By implementing this approach, the cost assessment process can be expedited, resulting in more precise cost estimates. To achieve this objective, tailored models are created to forecast the performance of each influencing factor.
Construction project cost estimation techniques are thoroughly explored in the literature, highlighting significant differences between conventional methods. These variations depend on factors such as the project's objectives, level of planning and design, size, complexity, conditions, timeline, and location.
Cost estimation techniques used in construction projects are applicable across various fields and can be categorized into several types: parametric methods, historical bid analysis, quantity-based unit cost assessments, range-based estimates, and probabilistic risk evaluations (Geberemariam, 2018).
The identified traditional methods are categorized into four types: parametric, detailed, comparative, and probabilistic estimates. This article will analyze and outline the advantages and disadvantages of each of these strategies.
In the initial stages of a project, a mathematical model representing cost estimating relationships (CER) is utilized to create parametric estimates that link the physical and functional elements of the project (Geberemariam, 2018). These elements may include functional and performance requirements or specific physical characteristics. A CER is expressed through cost-to-cost or cost-to-non-cost variables. In the cost-to-cost case, the cost of an independent variable, such as labor hours for one component, can be employed to predict the cost of labor hours for another component.
In the cost-to-non-cost case, labor costs can be estimated by analyzing the quantity of output items. These relationships can range from straightforward one-to-one correlations to more complex algorithms, forming the basis of an estimating model that encompasses a variety of intricate relationships.
The cost estimating relationships (CERs) framework, represented by equations 1 and 2, illustrates both linear and nonlinear cost estimation methods. These relationships are essential for quantifying how an independent variable influences contract price through quantitative analysis (Geberemariam).
Equation 2 expresses the CER in its nonlinear form, with the following terms:
- P_i = parameter of the independent variable of interest
- n_i = exponent used to transform P_i
The exponential factors in Equation 2 transform and normalize the independent variables and other metrics for the temporal effects of cost, such as inflation and sharp rises in material costs.
In parametric estimation, it is essential to identify the project's cost drivers, as these variables significantly influence future costs based on historical data. Access to this data allows for the identification of cost drivers and the relevant Cost Estimating Relationships (CER). By leveraging the unique characteristics of each project, parametric CER can effectively forecast costs for upcoming initiatives.
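As a rough illustration of how a CER turns cost drivers into an estimate, the sketch below evaluates an additive relationship in which each driver has a coefficient and an exponent. The source's equations 1 and 2 are not reproduced here, so the additive form, the function name `cer_estimate`, and every number are assumptions made for illustration only; an exponent of 1 gives a linear relationship and any other exponent a nonlinear one.

```python
def cer_estimate(intercept, terms):
    """Evaluate an assumed additive CER.

    terms is an iterable of (coefficient, parameter_value, exponent) tuples,
    one per cost driver. This form is an illustrative assumption, not the
    source's Equation 1 or 2.
    """
    return intercept + sum(b * p ** n for b, p, n in terms)


# Hypothetical linear CER: one driver, exponent 1.
linear = cer_estimate(100.0, [(2.0, 10.0, 1.0)])

# Hypothetical nonlinear CER: the same driver transformed by an exponent of 2.
nonlinear = cer_estimate(100.0, [(2.0, 10.0, 2.0)])

print(linear, nonlinear)
```

In practice the coefficients and exponents would be fitted to historical project data rather than chosen by hand.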
- Cost estimates can usually be produced quickly and easily
- Actual observations help to remove the reliance on opinion
- CERs should be reviewed regularly to ensure that they remain consistent with the present link between project qualities and costs
- CERs should be described accurately and completely, because using a CER incorrectly could result in substantial estimate errors (Geberemariam, 2018)
The bottom-up or analytical estimation methods, also known as thorough estimation approaches, create detailed project cost estimates by establishing a Work Breakdown Structure (WBS) for each activity, which is calculated based on elements, time, and scope involved in the project.
A quantity surveyor or other technical person with extensive experience in a certain activity often calculates the costs per activity and connects them to the WBS elements.
The general mathematical formula is shown in equation 3 below, but each project requires a different approach.
- Finding out exactly what the estimate includes and whether anything was missed is one of the biggest benefits of the thorough estimation approach (Geberemariam,
- Reveals information on the project's primary cost drivers. Additionally, the distinct project activities are frequently repeated and can be reused in subsequent projects
- Executing a thorough estimate might take a lot of time, which makes it expensive
- Every new project requires a fresh estimate. Estimates of some recurring tasks may be obtained from earlier projects; however, they have to be adapted to the conditions of the new estimate
- To provide a trustworthy estimate, the project specifications must be well known. If the project's specifications keep changing, the estimate must continuously account for these changes
- During the summation of the many WBS elements, small inaccuracies can become huge errors
- The estimate takes a lot of time to establish, especially in large, complex projects with many components in the work breakdown structure
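The bottom-up logic discussed above can be sketched as a sum over WBS elements, each priced as quantity times unit rate. This is a minimal sketch under that assumption; the source's equation 3 is not reproduced here, and the class name `WbsItem`, its fields, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WbsItem:
    """One element of a work breakdown structure (hypothetical fields)."""
    code: str
    description: str
    quantity: float
    unit_rate: float

    @property
    def cost(self):
        # Cost of a single element: measured quantity times its unit rate.
        return self.quantity * self.unit_rate


def bottom_up_total(items):
    """Total estimate as the sum of all element costs."""
    return sum(item.cost for item in items)


# Hypothetical WBS with two elements.
wbs = [
    WbsItem("1.1", "Foundations", quantity=100, unit_rate=50.0),
    WbsItem("1.2", "Structure", quantity=20, unit_rate=200.0),
]
print(bottom_up_total(wbs))
```

Because the total is a plain sum, an error in any single element carries straight into the result, which is the accumulation risk noted in the list above.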
The comparative estimating approach is an effective method for quickly assessing new projects that resemble recently completed ones. This technique relies on key cost factors and insights gained from previous similar projects. Adjustments are made based on unique characteristics such as the project's size, complexity, performance requirements, duration, location, and available technology.
A comparative estimate is essential for assessing a project's viability and determining if it should proceed within defined parameters (Burke, 2013). Additionally, the analogous method is useful for estimating a generic system when limited specifications are provided.
This technique is normally applied through the unit method, cost indexes, the cost-capacity equation (or power law and sizing model), and factored estimates (Geberemariam, 2018). Equation 4 is the generic mathematical form used for the cost estimate:
The Cost Index (CI) measures the ratio of current costs to previous costs, providing a dimensionless figure that reflects changes in expenses over time while accounting for inflation (Geberemariam, 2018).
𝑻𝑪 = Total cost estimation at present
𝑰𝟎 = Index value at base time
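A cost index adjustment of this kind can be sketched as scaling a historical cost by the ratio of the current index to the base index. The source's Equation 4 is not reproduced here, so this ratio form is an assumption based on the definition of the Cost Index given above; the function and parameter names and the sample values are hypothetical.

```python
def adjust_by_cost_index(base_cost, index_base, index_now):
    """Scale a past cost to the present using a dimensionless cost index.

    base_cost  : cost recorded at the base time
    index_base : index value at base time (I0 in the text)
    index_now  : index value at present (assumed symbol, not in the source)
    """
    if index_base <= 0:
        raise ValueError("index_base must be positive")
    return base_cost * (index_now / index_base)


# Hypothetical example: a cost of 1,000 recorded when the index was 100,
# brought forward to a time when the index is 110.
print(adjust_by_cost_index(1000.0, 100.0, 110.0))
```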
- An estimate can be completed quite quickly
- The accuracy remains the same when the earlier data used for reference changes only slightly
- Everyone involved can easily understand the resulting estimate
- It is quite difficult to identify a project that is similar enough to the new project to compare
The method depends on extrapolation and expert judgment for factor adjustments, which can affect the accuracy of the estimates. This is particularly true when normalization is required, as it may lead to subjective evaluations of the data.
The probabilistic estimation approach focuses on identifying and quantifying the risks and uncertainties linked to a project, aiming to measure cost variability through the application of probability distributions for various parameters in the cost estimation process (Zwaving, 2014).
This method quantifies the effects of changes to planned or requested resources, enhancing communication regarding the likelihood of meeting cost and schedule baselines. It effectively addresses concerns about the probability of exceeding specific costs, the potential extent of cost overruns, and the various uncertainties that influence these costs (Geberemariam, 2018).
Additionally, the design and requirements are still mostly undefined at this early stage. Therefore, using probabilistic estimation as opposed to deterministic estimation or other conventional methods makes sense (Elkjaer, 2000).
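One common way to put probability distributions on cost parameters is a Monte Carlo simulation, sketched below with triangular distributions from Python's standard library. The source does not say which distributions or tools it has in mind, so the choice of triangular distributions, the function names, and the three-point figures are all assumptions for illustration.

```python
import random


def simulate_total_cost(items, n_runs=10_000, seed=42):
    """Sample total cost n_runs times and return the sorted totals.

    items is a list of (low, most_likely, high) three-point estimates,
    one per cost element (hypothetical inputs).
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.triangular(low, high, mode) for low, mode, high in items)
        for _ in range(n_runs)
    ]
    totals.sort()
    return totals


def percentile(sorted_values, p):
    """Value below which roughly a fraction p of the samples fall."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


def probability_of_exceeding(sorted_values, budget):
    """Share of simulated totals that are above the given budget."""
    return sum(v > budget for v in sorted_values) / len(sorted_values)


# Hypothetical three-point estimates for two cost elements.
totals = simulate_total_cost([(90, 100, 130), (40, 50, 80)])
print(percentile(totals, 0.5), percentile(totals, 0.8))
print(probability_of_exceeding(totals, 170))
```

The percentile and exceedance functions correspond to the two questions raised above: how large an overrun might be, and how likely a given cost is to be exceeded.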
In computer science, machine learning is a technique that creates a "model" out of "data" (P. Kim, 2017) and predicts future data based on current data (Paluszek & Thomas, 2017). Rather than being explicitly programmed, it functions by learning from training data.
Figure 3.1 below shows the machine learning procedure:
Figure 3.1- Machine learning procedure (Source: Phil Kim)
The machine learning process begins with the provision of a training dataset, which, upon completion of the learning phase, results in the creation of a model. This model can subsequently analyze new, unfamiliar data and generate outputs based on the patterns identified during training.
Machine learning techniques can be classified into three main categories based on their training methods: unsupervised learning, supervised learning, and reinforcement learning. Each of these techniques serves distinct purposes and applications in the field of artificial intelligence.
Figure 3.2- Different types of machine learning
Step 1- Import libraries and divide the data
Step 6: Evaluation of the result
Random Forest (RF) is a supervised learning algorithm that constructs multiple decision trees for classification, regression, and other tasks. It outperforms traditional classification algorithms by effectively identifying the most significant attributes while highlighting those that have little to no impact on the decision-making process.
The RF algorithm usually works in four steps, as shown in Figure 3.4:
- Select random samples from the given data set
- Set up a decision tree for each sample and get prediction results from each decision tree
- Vote for each prediction outcome
- Choose the most predicted outcome as the final prediction
Figure 3.4- Model of Random Forest (Source: Nguyen Duy Sim, 2018)
In this research, Python 3.10.9 and Spyder IDE 5.4.1 were used to develop the Random Forest model.
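The four steps listed above can be illustrated with a deliberately small, standard-library-only sketch: bootstrap samples, one tree per sample, a vote, and the majority outcome. It is not the study's model, whose code and libraries are not shown in the source. Each "tree" here is a single split (a stump), and the random feature selection of a full Random Forest is omitted; the data and function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a one-split tree: choose the feature and threshold with fewest errors."""
    best = None
    for feature in range(len(rows[0][0])):
        for threshold in sorted({x[feature] for x, _ in rows}):
            left = [lab for x, lab in rows if x[feature] <= threshold]
            right = [lab for x, lab in rows if x[feature] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    _, feature, threshold, left_label, right_label = best
    return lambda x: left_label if x[feature] <= threshold else right_label


def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # step 1: random sample with replacement
        forest.append(fit_stump(sample))           # step 2: one tree per sample
    return forest


def predict(forest, x):
    votes = Counter(tree(x) for tree in forest)    # step 3: collect one vote per tree
    return votes.most_common(1)[0][0]              # step 4: most-voted outcome wins


# Hypothetical data: one feature (floor area in 1,000 m2) and a cost band label.
data = [((1.0,), "low"), ((2.0,), "low"), ((3.0,), "low"),
        ((8.0,), "high"), ((9.0,), "high"), ((10.0,), "high")]
forest = fit_forest(data)
print(predict(forest, (0.5,)), predict(forest, (9.5,)))
```

For a regression task such as cost estimation, the vote in the last two steps would typically be replaced by averaging the trees' numeric predictions.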
The study evaluates the model's ability to anticipate outcomes using the Nash-Sutcliffe Efficiency (NSE) index, which is calculated as in equation 6:
- $y_t$ is the value selected for evaluation
- $\bar{y}$ is the mean of $y_t$ in the sample
The more accurate the model's prediction performance when NSE value is closer to
The accuracy of a predictive model is assessed by randomly selecting a set of result values and comparing them to the corresponding test values; greater proximity between the predicted and test values indicates higher prediction accuracy.
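The NSE index mentioned above can be computed in a few lines. The sketch below uses the standard definition of the Nash-Sutcliffe Efficiency, one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean; the source's equation 6 is not reproduced here, so treat the correspondence as an assumption. The function name and sample values are hypothetical.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - sum((y_t - yhat_t)^2) / sum((y_t - mean(y))^2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_observed) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed values have zero variance")
    return 1 - residual / spread


# Hypothetical observed and predicted costs.
observed = [1.0, 2.0, 3.0]
print(nash_sutcliffe_efficiency(observed, [1.0, 2.0, 3.0]))  # perfect prediction
print(nash_sutcliffe_efficiency(observed, [2.0, 2.0, 2.0]))  # predicting the mean
```

A perfect prediction gives an NSE of 1, while a model that always predicts the observed mean gives 0.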
Artificial Neural Networks (ANN) mimic biological neural networks, consisting of a block structure made up of simple computational nodes. The connections between these nodes play a crucial role in defining the network's functionality.
The basic features of neural networks include:
- A set of processing nodes (artificial neurons)
- An activation state or output for each processing node
- Connections between nodes, each characterized by a weight Wjk that indicates the influence the signal from node j exerts on node k. This weight can either amplify or diminish the strength of the signal at the receiving node, thereby affecting the overall network dynamics
- A propagation rule that determines how the output of each node is calculated from its input
- An activation function, or transfer function, that determines the new activation level based on the current activation level
- A unit of adjustment (deviation, bias, or offset)
- A method of collecting information (learning rule)
- An environment in which the system can operate
Neurons are organized into layers, each capable of performing distinct transformations on their inputs. Signals flow from the input layer to the output layer, facilitating the processing of information throughout the network.
Figure 3.5- Structure of neural network
3.4.2- Components of an artificial neural network
A neuron, or node, performs a fundamental function by receiving input from the preceding unit or an external source, which it then uses to compute an output signal. This output signal is subsequently transmitted to other units in the network.
$z_j = g(a_j)$, with $a_j = \sum_{i} w_{ji} x_i + \theta_j$ and weights $w_{j0}, w_{j1}, \ldots, w_{jn}$
Figure 3.6- Processing Unit (Source: duytan.edu)
- The circle and arrow illustrate the node and signal flow
- 𝑤ji: the weights of the corresponding signals
In a neural network there are three types of units:
✓ Input node: receives signals from the outside
✓ Output node: sends data to the outside
✓ Hidden node: its input and output signals are located within the network
Each node j can have multiple inputs, such as x0, x1, x2, and so on, but it produces only one output, denoted as zj. The inputs to a node can originate from external data sources, the outputs of other nodes, or even the node's own output.
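The computation performed by a single processing unit, a weighted sum of the inputs plus an offset passed through a transfer function, can be sketched as follows. The function name `neuron_output` and the sample weights are hypothetical; the activation is passed in as a parameter so that any transfer function can be used.

```python
import math


def neuron_output(inputs, weights, bias, activation):
    """Compute z_j = g(a_j) with a_j = sum_i(w_ji * x_i) + theta_j."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net_input)


# Hypothetical node with two inputs.
x = [1.0, 2.0]
w = [0.5, -0.25]
print(neuron_output(x, w, bias=0.0, activation=lambda a: a))  # identity transfer
print(neuron_output(x, w, bias=1.0, activation=math.tanh))    # nonlinear transfer
```

Many inputs reduce to one output here, which matches the single output zj described above.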
- Linear function (identity function): Equation 7 is used for the calculation when the inputs are treated as a node. Sometimes the net input is multiplied by a constant to produce a linear function.
Figure 3.7- Identity function (Source: duytan.edu)
The binary step function is utilized in single-layer networks, producing outputs restricted to two values. In this context, the function is defined as g(x) = 1 if x ≥ θ and g(x) = 0 otherwise, so the output is "1" when the input meets or exceeds a specified threshold.
The sigmoid function is particularly advantageous for neural networks trained with the back-propagation algorithm because its derivative is simple to compute, which helps streamline computations during the training process. This function is commonly used in applications where the expected output ranges between 0 and 1.
Figure 3.8- Sigmoid function (Source: duytan.edu)
- Hyperbolic tangent function: This function has the same properties as the sigmoid function, but it works well for applications whose required output lies in the interval [-1, 1]. g(x) = 𝟏 −𝒆
Figure 3.9- Hyperbolic Tan Function (Source: Erik)
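The four transfer functions discussed in this section can be written compactly, as in the minimal sketch below. It uses the usual textbook forms and the standard library's `math.tanh`; the source's own equations are only partly legible, so treating these forms as the intended ones is an assumption. The derivative of the sigmoid is included because its simplicity is the property highlighted for back-propagation.

```python
import math


def identity(x):
    """Linear (identity) transfer: output equals the net input."""
    return x


def binary_step(x, theta=0.0):
    """Output 1 when the input meets or exceeds the threshold theta, else 0."""
    return 1 if x >= theta else 0


def sigmoid(x):
    """Logistic sigmoid, with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative expressed through the function value itself: g(x) * (1 - g(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def hyperbolic_tangent(x):
    """Hyperbolic tangent, with outputs between -1 and 1."""
    return math.tanh(x)


for value in (-2.0, 0.0, 2.0):
    print(value, identity(value), binary_step(value), sigmoid(value),
          hyperbolic_tangent(value))
```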
Transfer functions for hidden nodes are essential for incorporating nonlinearity into neural networks, because a composition of identity (linear) functions is itself just a linear function. This nonlinear characteristic enables multilayer networks to effectively represent complex nonlinear mappings.
Back-propagation learning requires a differentiable function, ideally one that is constrained within a specific interval. Consequently, the sigmoid function is often the preferred option for this purpose.
To choose transfer functions for output nodes effectively, it is essential to align them with the distribution of the desired target values. For output values within the range [0,1], the sigmoid function is particularly beneficial, as it ensures that continuous target values remain within this interval. In cases where the range of the target values is not known in advance, the identity function is commonly utilized. Additionally, when the desired values are positive but the upper limit is unspecified, employing an exponential output activation function is recommended.
In this study, a summary table outlines the advantages and disadvantages of commonly used functions in the ANN model, aiding in the determination of the most suitable function for application. The author presents their selection based on this analysis.
Table 3.1- Summary of Strengths & weaknesses of Activation function of ANN method
1.Simplicity: The "Linear function" is a very simple function, easy to understand, and computationally efficient
This function is continuous and has a derivative at all points, making it easy to calculate gradients and convenient for the optimization process using gradient descent algorithms
3.Useful in some special cases:
The "Linear function" can be useful in certain special cases, such as when the model requires a wide and unbounded output range
1.Limited representation and learning capability: Due to its lack of nonlinearity, the "Linear function" limits the ability to represent complex models and learn nonlinear information
The cost estimation for the Design and Build (D&B) project has traditionally depended on calculating the construction cost per unit area, which varies by project type. Typically, these estimates are based on concept drawings; however, during the project's early stages, the information available for precise estimation is often insufficient. To overcome this limitation, a dedicated model will be employed to enhance the cost estimation process during the conceptual phases of the project.
The cost-estimating models in the case study aim to minimize the gap between predicted and actual values, as demonstrated in the optimization flow chart shown in Figure 4.1.
Figure 4.1- Optimization flow chart: data collection; survey process to obtain consensus from experts; creating the cost estimate models with separate training and testing data (ANN/SPSS, RF/Python, CBR/Solver-Excel); evaluating the models and analyzing the results
To develop an AI-powered cost model, it is essential to gather extensive data, as cost models rely heavily on historical data from past cases. This study collects information on 36 buildings from six projects in Vietnam, spanning the years 2015 to 2021, provided by a company specializing in quantity surveying. Given the unique nature of construction projects, obtaining a substantial amount of cost data from similar conditions (such as time, place, and project specifics) can be quite challenging. Consequently, the cost information collected, which originates from various times and locations, requires adjustment to a common reference point. To achieve this, the data was normalized using construction cost coefficients, guided by the construction cost index published by Statista in 2022. The current status of this construction cost index is shown in Table 4.1.
Table 4.1- Construction Index for Construction
4.3- Identify the factors and profile of casebase:
Accurate project cost estimation requires a thorough analysis of various elements such as foundation works, structural components, façade construction, mechanical and electrical systems, as well as interior finishing. The importance of each component can differ based on the building's design and intended use, highlighting the need for comprehensive multidisciplinary coordination in the estimation process.
An online survey was conducted with 48 Quantity Surveyors to identify factors influencing cost estimation, yielding 46 valid responses and a high response rate of 95.8%. One invalid response was excluded from the analysis. Utilizing a five-point Likert scale, the survey assessed the influence of various factors, with ratings ranging from 1 (no influence) to 5 (extreme influence). The factors included in the survey were selected based on expert consensus to ensure relevance, and the results, displayed in Table 4.2, highlight factors rated between moderate influence (level 3) and extreme influence (level 5).
In Appendix A.1, you can find the comprehensive details of the Questionnaire
The detailed Survey results can be found in Appendix A.2
Evaluating the reliability of a questionnaire is essential, with scale testing being a key component in this assessment. It aids in identifying the reliability of observed variables related to the main factor and their inter-correlations. This research employs Cronbach's Alpha coefficient to measure the reliability of the scale.
The mathematical formula of the Cronbach’s Alpha coefficient (by Jim Frost): $\alpha = \dfrac{N \bar{c}}{\bar{v} + (N - 1)\,\bar{c}}$
Where: N is the number of factors, 𝑐̅ is mean covariance between factors and 𝑣̅ is mean factors variance
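A short sketch of how the coefficient could be computed from raw survey scores follows. It uses the standard form α = N·c̄ / (v̄ + (N − 1)·c̄) built from the quantities defined above; the study itself reports using SPSS, so this code is only an illustration, and the function names and the small response set are hypothetical.

```python
from itertools import combinations
from statistics import mean, variance


def sample_covariance(xs, ys):
    """Sample covariance of two equally long score lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def cronbach_alpha(items):
    """items: one list of scores per factor, one score per respondent."""
    n = len(items)
    mean_variance = mean(variance(scores) for scores in items)            # v-bar
    mean_covariance = mean(sample_covariance(a, b)
                           for a, b in combinations(items, 2))            # c-bar
    return n * mean_covariance / (mean_variance + (n - 1) * mean_covariance)


# Hypothetical Likert scores: three factors rated by five respondents.
responses = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Factors that move together across respondents raise the mean covariance relative to the mean variance, which pushes the coefficient toward 1.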
Table 4.2- Meaning of Cronbach’s Alpha coefficient values
Cronbach’s Alpha: Reliability Level
α ≥ 0.9: Excellent
According to the survey results (Appendix A.2) and Eq. 16 provided earlier, the Cronbach's Alpha coefficient calculated in SPSS is shown in Figure 4.2 below:
The analysis presented in Figure 4.2 reveals a Cronbach's Alpha coefficient of 0.952 for nine factors, demonstrating strong internal consistency among the items. Although removing the X3 factor could increase the coefficient to 0.973 due to its low correlation with the overall sum, the author chose to retain it. This decision is justified as the coefficient of 0.952 exceeds the acceptable reliability threshold of 0.7, indicating a high level of reliability for the factors.
Table 4.3 below provides the details of the factors used in this chapter
Table 4.3- Factors affect the cost
No Description CBR ANN RF
X1 Gross Floor area Numerical Scale Numerical
X2 The height of the building
X3 The height between stories of the building
X4 Location of Building Textual Nominal Transformed to Numerical
X5 The floor type of the building
X6 Number of units per floor
X7 Warranty time Textual Scale Transformed to Numerical
X8 Specification of material Textual Ordinal
X9 Construction Duration Numerical Scale Numerical
Gross Floor Area (GFA) is typically measured in square meters and represents the total construction area of a project. A larger GFA indicates that the contractor will face increased workload and require more time to complete the project. Consequently, the Gross Floor Area significantly impacts cost estimation for construction projects.
The height of a building, typically measured in meters, represents the total elevation of the project. A taller structure impacts construction methods and technical requirements, which in turn influences cost estimation.
The height between stories in a building, typically measured in meters, significantly impacts construction methods and techniques. If the distance between stories exceeds standard measurements, alternative systems such as semi-unitized or spider systems may be necessary, leading to changes in cost estimates. Therefore, understanding the height between stories is crucial for accurate budgeting and construction planning.
The location of a building significantly impacts cost estimation, as urban settings can affect construction timelines, material delivery, and labor expenses.
In addition, whether the building is near to or far from the river matters because of the impact of wind load and flooding. This research focuses on projects located in Districts 1, 2, 7 and 9 of Ho Chi Minh City, Vietnam.
The type of flooring in a building significantly impacts cost estimation, as different flooring structures require various techniques and methods from the tenderer.
This study considers two types of floor structure: reinforced concrete (RC) and precast concrete (PC).
- Number of units per floor: a greater number of units on one floor lengthens the time needed to carry out the work, which in turn affects the cost.
- Warranty time: the longer the warranty period, the higher the cost. The project warranty is usually expressed in years.
Materials typically account for 60-70% of the total construction cost, and this is also true for façade systems, making material selection crucial as it significantly impacts expenses. The market offers a wide variety of material specifications, highlighting the importance of choosing the right options to manage costs effectively.
Effective construction duration management can significantly lower project costs and enhance profitability. When contractors offer a detailed schedule and diligently monitor progress against the overall timeline, they can optimize efficiency. Typically, project duration is measured in weeks or months.
A short profile of the 36 cases used in this chapter is shown in Table 4.4 below:
Table 4.4- Profile of cases for model validation (*)
(*) The full profile of the cases is attached in Appendix B.1
To validate prediction results, various accuracy measures are employed, including Percentage Error (PE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). PE indicates the error's magnitude and direction as a percentage of actual values, while MPE averages these errors to provide an overview of the model's accuracy. MAPE assesses accuracy by averaging absolute percentage errors, treating overestimations and underestimations equally. MSE and RMSE focus on the average squared errors, with RMSE quantifying the average magnitude of prediction errors and penalizing larger discrepancies. However, authors may prefer using mean and standard deviation for validation due to limited data, their simplicity and interpretability, and their robustness against outliers. These measures offer an aggregated summary of prediction errors, facilitating easy comparison between models and aligning with common practices in statistical analyses, making findings more accessible to readers.
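The accuracy measures named above can be computed together, as in the minimal sketch below. The sign convention for PE (predicted minus actual, divided by actual) is an assumption, since the source does not state one, and the function names and sample values are hypothetical.

```python
import math


def percentage_errors(actual, predicted):
    """PE for each case, as a percentage of the actual value (sign kept)."""
    return [(p - a) / a * 100 for a, p in zip(actual, predicted)]


def error_metrics(actual, predicted):
    """Return MPE, MAPE, MSE and RMSE for paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    pe = percentage_errors(actual, predicted)
    mse = sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return {
        "MPE": sum(pe) / len(pe),                    # signed errors can cancel out
        "MAPE": sum(abs(e) for e in pe) / len(pe),   # over- and underestimates count equally
        "MSE": mse,
        "RMSE": math.sqrt(mse),                      # same unit as the cost itself
    }


# Hypothetical actual and predicted costs for two cases.
print(error_metrics([100.0, 200.0], [110.0, 180.0]))
```

In this example the two percentage errors are equal and opposite, so MPE is zero while MAPE is not, which shows why the two measures are reported side by side.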
The Random Forest model is built with Python 3.10.9 and Spyder IDE 5.4.1 in the following steps:
Step 1- Import libraries and divide the data
Step 5- Import prediction result to excel