
Determining the contributing factors to traffic accident in Ho Chi Minh city using binary logit model




Vietnamese title (translated): Applying the binary logit regression model to identify the factors affecting traffic accidents in Ho Chi Minh City

Tran Quang Vuong

University of Transport and Communications Campus in Ho Chi Minh City

Abstract: This research investigates traffic accident patterns, severity levels and the factors contributing to accidents. The results may help in proposing effective measures to improve traffic safety at signalized intersections. A case study was conducted in Ho Chi Minh City (HCMC), Vietnam, using historical traffic accident data collected over five years (2011-2015). Binary logit models were used to identify the factors contributing to serious traffic accidents. The results show that intersection type, land use and road type are contributing factors to accident severity. Based on the findings, strategies and measures for safety improvement are formulated and discussed.

Keywords: Road traffic accident, signalized intersection, logit model, traffic safety measures, factor analysis

Summary (translated from the Vietnamese abstract): This study analyzes accident characteristics, severity levels and the factors influencing accidents. The results provide a useful basis for proposing effective measures to improve traffic safety at signalized intersections. The study was carried out for the case of Ho Chi Minh City, Vietnam, based on five years of traffic accident statistics (2011-2015). A binary logit regression model was used to identify the factors influencing serious traffic accidents. The analysis shows that intersection type, intersection location and road type are factors influencing accident severity. Based on these results, policies and measures to improve traffic safety are proposed.

Keywords (translated): Road traffic accidents, signalized intersections, logit model, traffic safety measures, factor analysis

1 Introduction

Nearly 25 percent of all fatal crashes occur at intersections, and about 30 percent of those occur at intersections controlled by signals. In 2015, HCMC recorded 3,694 traffic accidents, 693 fatalities and 3,301 injuries. Compared with 2014, these figures fell by 14.51%, 4.15% and 18.07%, respectively. Accidents at signalized intersections, however, increased and accounted for 41% of all accidents at intersections (9.7% of all traffic accidents in HCMC). Until now, there has been a lack of empirical research on traffic safety at signalized intersections under mixed traffic conditions, since most previous research on this topic has focused on vehicle-dominated traffic.

To address road traffic accident problems, it is necessary to understand the factors that contribute to traffic accidents. The objectives of this research are to investigate traffic accident patterns, severity levels and the factors contributing to traffic accidents. The study does not explore all contributing factors, however, because the data obtained from accident reports are substantially limited. Logistic regression was used to estimate the effect of the significant contributing factors on accident severity.

This paper is divided into five parts: the introduction, a literature review, descriptive analysis, modelling and, finally, the discussion.

2 Literature review

Various models have been developed to determine contributing factors to accident severity in both developed and developing cities, such as Poisson and negative binomial models (Hoong et al., 2001; Lin et al., 2003; Yinhai et al., 2004; Huang et al., 2008); ordered probit models (Abdel-Aty et al., 2003, 2005; Yu Jin et al., 2010); logistic regression models (Hilakivi et al., 1989; James and Kim, 1996; Mercier et al., 1997; Al-Ghamdi, 2002; Kelvin, 2004); and multiple logistic regression (Shankar and Mannering, 1996; Carson and Mannering, 2001; Yan et al., 2005). Binary logistic models have also been developed by many researchers. Researchers often prefer logistic modelling because the logistic function must lie between 0 and 1, which is not usually the case with other possible functions (Kleinbaum and Klein, 2002).

In summary, numerous studies have developed logistic regression models to determine the effect of contributing factors on accident severity. Nevertheless, only a limited number have explored crash injury severity at signalized intersections (Abdel-Aty, 2003; Abdel-Aty and Keller, 2005; Yan et al., 2005; Huang et al., 2008; Yu Jin et al., 2010). According to the literature, the major contributing factors to accident severity are time of day, intersection type, nature of lane, street lighting, presence of a red-light camera, pedestrian involvement, vehicle type, driver age and accident type. Moreover, no study has used logistic models to investigate contributing factors to accident severity at signalized intersections in Vietnam in general, or in HCMC in particular. Given the literature and the historical traffic accident data available in Vietnam, a binary logistic model is well suited to this case.

3 Descriptive analysis of traffic accidents at signalized intersections in HCMC

3.1 Overview of HCMC

According to the master plan, HCMC is divided into three zones (Fig. 1). The city centre (zone 1) includes 13 urban districts: 1, 3, 4, 5, 6, 8, 10, 11, Go Vap, Tan Binh, Tan Phu, Binh Thanh and Phu Nhuan. The newly developed areas (zone 2) include 6 districts: 2, 7, 9, 12, Binh Tan and Thu Duc. The rural areas (zone 3) include 5 districts: Hoc Mon, Nha Be, Can Gio, Cu Chi and Binh Chanh.

Figure 1 Zone classification in HCMC

3.2 Data collection

This research is based on the historical accident database for five years (2011-2015), obtained from the Rail-Road Traffic Police Bureau in HCMC. Traffic accident information is recorded in accordance with form No. 02/TNDB, which has nearly 60 categories of information. In practice, however, only 17 categories were actually recorded; these were used in the analysis to determine significant contributing factors to accident severity.

3.3 Analysis of the patterns

There were 375 traffic accidents at signalized intersections in HCMC during the five years (2011-2015). The accidents were distributed unevenly across the three zones: 212 (56.5%) occurred in zone 1, 126 (33.6%) in zone 2 and 37 (9.9%) in zone 3. However, the ratio of the number of accidents to the total number of signalized intersections is highest in zone 2 (0.79), followed by zone 1 (0.44), and lowest in zone 3 (0.36).
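The zone shares quoted above can be re-derived from the accident counts. The following Python sketch does so; the variable names are illustrative, and only the three counts are taken from the text.

```python
# Accident counts at signalized intersections by zone (2011-2015), from the text.
accidents = {"zone 1": 212, "zone 2": 126, "zone 3": 37}

total = sum(accidents.values())

# Share of all signalized-intersection accidents falling in each zone, in percent.
shares = {zone: round(100 * n / total, 1) for zone, n in accidents.items()}

print(total)
print(shares)
```

The per-zone ratios (0.79, 0.44, 0.36) cannot be recomputed in the same way, because the number of signalized intersections in each zone is not given.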

3.3.1 Distribution by time

Traffic accidents tend to increase slightly on holidays, Tet holidays, at weekends and at the end of each month. The time of occurrence differs among the three zones: in zone 1 most accidents happened during the night off-peak hours from 8 PM to 4 AM, while in zones 2 and 3 they tend to increase slightly during the morning, noon and evening peak hours (6 AM - 8 AM; 12 PM - 2 PM; 6 PM - 8 PM).

3.3.2 Distribution by road users involved in accidents

The age group most often involved in traffic accidents is 19-24 years old (24%), followed by the 25-30 group (19%). This age group accounted for 32%, 46% and 22% in zone 1, zone 2 and zone 3, respectively. Road users of this age are not yet fully mature and are easily affected by alcohol. Male road users are the main group involved in traffic accidents in all three zones, accounting for 77%, 88% and 78% in zone 1, zone 2 and zone 3, respectively. Accidents involving motorcycles only, or a motorcycle and a truck, are the most common configurations, accounting for 38% in zone 1 and 47% in zones 2 and 3, respectively.

Red-light running, failure to yield priority, driving in the wrong lane, illegal turning and illegal overtaking are significant causes of traffic accidents at signalized intersections. In particular, red-light running and failure to yield priority are the most significant causes in zone 1 and zone 2, accounting for 26% and 29%, respectively. Red-light running and driving in the wrong lane are the main causes in zone 3, accounting for 35%.

4 Modelling of accidents at signalized intersections

4.1 Theoretical background of logistic regression

In this research, accident severity is a dichotomous variable. A non-fatal accident is defined as an accident with no fatality within 24 hours of its occurrence; any other accident is fatal. Each accident in the time-series road accident data was categorized as either non-fatal or fatal. The logistic model used is

$$\pi(x) = \frac{e^{g(x)}}{1 + e^{g(x)}} \qquad (6)$$

where $\pi(x)$ is the probability of a fatal accident, and thus

$$P(\text{non-fatal accident}) = 1 - P(\text{fatal accident}) = 1 - \pi(x) = \frac{1}{1 + e^{g(x)}} \qquad (7)$$

where $g(x)$ stands for the function of the independent variables:

$$g(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n \qquad (8)$$

Logistic regression determines the coefficients that make the observed outcome (non-fatal or fatal accident) most likely, using the maximum-likelihood technique. Classification in this model uses a probability cut-off of 0.3: an accident is classified as fatal if its predicted probability is greater than or equal to 0.3, and as non-fatal otherwise.
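As a minimal sketch of the cut-off rule described above, the following Python code evaluates the logistic function and applies the 0.3 threshold. It assumes that the logistic function gives the probability of a fatal accident, as the fitted model in Sections 4.3 and 4.4 uses it. The function names and the sample values of g(x) are hypothetical and serve only as an illustration.

```python
import math

def logistic(g):
    """Logistic function: maps any real g to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-g))  # algebraically equal to e^g / (1 + e^g)

def classify(g, cutoff=0.3):
    """Label an accident 'fatal' when the predicted probability reaches the cut-off."""
    return "fatal" if logistic(g) >= cutoff else "non-fatal"

# Hypothetical values of the linear predictor g(x), for illustration only.
for g in (-4.0, -1.0, 0.0, 2.0):
    print(g, round(logistic(g), 3), classify(g))
```

A cut-off below 0.5 labels more cases as fatal than the default rule would, which is a common choice when the event of interest is rare.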

4.2 List of variables

Since the research goal was to determine the factors that might affect accident severity (i.e. whether an accident was fatal or non-fatal), 37 variables were derived from the time-series data and accident patterns and coded as 0 or 1 for model development. Because the variables are discrete, correlation analysis (Kendall's tau-b test) was also used to reduce the number of variables, based on the level of correlation and the P-value.
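To illustrate the screening statistic named above, the sketch below computes Kendall's tau-b for two 0/1-coded variables in plain Python. The paper used SPSS for this step; the data here are hypothetical and the function name is an assumption for illustration.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences, with tie correction."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied on both variables: not counted
        elif dx == 0:
            ties_x += 1         # tied on x only
        elif dy == 0:
            ties_y += 1         # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")

# Hypothetical 0/1 codes for eight accidents (not data from the paper).
zone2 = [1, 1, 1, 0, 0, 0, 0, 1]
fatal = [1, 1, 0, 0, 0, 0, 1, 0]
print(round(kendall_tau_b(zone2, fatal), 3))
```

For two binary variables, tau-b coincides with the phi coefficient of the 2x2 table, which is why it suits 0/1-coded accident attributes.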

Table 1 Correlation coefficient matrix (Kendall's tau-b, N = 375)

     1        2        3        4        5        6        7        8        9        10       11       12
1    1.000
2    -.318**  1.000
3    1.000**  -.318**  1.000
4    -.318**  1.000**  -.318**  1.000
5    .637**   .264**   .637**   .264**   1.000
6    .450**   .207**   .450**   .207**   .442**   1.000
7    .540**   .388**   .540**   .388**   .662**   .421**   1.000
8    -.039    .123*    -.039    .123*    .063     -.031    -.042    1.000
9    .591**   .357**   .591**   .357**   .643**   .460**   .670**   .061     1.000
10   .117*    .182**   .117*    .182**   .233**   .262**   .188**   -.013    .179**   1.000
11   .345**   .294**   .345**   .294**   .484**   .263**   .679**   -.029    .391**   .181**   1.000
12   .155**   .213**   .155**   .213**   .261**   .262**   .282**   .166**   .252**   .203**   .159**   1.000

Entries are correlation coefficients (r).

Variables: Time; Day of accident; Month of accident; Location; Urban road; Province road; Commune road; Helmet; Don't accept priority; Zone 2; Width of pavement (<3 m); Severity of accident

** Correlation is significant at the 0.01 level (2-tailed).

* Correlation is significant at the 0.05 level (2-tailed).

4.3 Development of logistic model

Logistic regression was run with the Enter method in SPSS version 21. The omnibus tests of the model coefficients for traffic accident severity were analyzed to assess whether the data fit the model, as shown in Table 2.

Table 2 Omnibus Tests of Model Coefficients

Step 1

The specified model is significant (Sig. < 0.05), indicating that the independent variables improve on the predictive power of the null model.

Table 3 contains two pseudo R² measures, Cox-Snell and Nagelkerke. Cox and Snell's R² attempts to imitate the multiple R² based on the likelihood, but its maximum can be (and usually is) less than 1.0, which makes it difficult to interpret. Here it indicates that 11.8% of the variation is explained by the logistic model.

The Nagelkerke modification, which does range from 0 to 1, is a more reliable measure of the relationship. Nagelkerke's R² will normally be higher than the Cox and Snell measure. In this case it is 0.263, indicating a relationship of 26.3% between the predictors and the prediction. In addition, the Hosmer-Lemeshow (H-L) test in Table 3 supports the fit of the developed logistic regression model (Sig. > 0.05).
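The two pseudo R² measures discussed above follow from the log-likelihoods of the null and fitted models. The sketch below uses the standard definitions; the log-likelihood values are hypothetical (the paper's -2 log-likelihood is not reproduced here) and were chosen only so that the output lands near the reported 0.118 and 0.263.

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell R^2 = 1 - exp(2 * (LL_null - LL_model) / n)."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell R^2 rescaled by its maximum, 1 - exp(2 * LL_null / n)."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical log-likelihoods, NOT taken from the paper; only n = 375 is.
n = 375
ll_null, ll_model = -111.6, -88.1

cs = cox_snell_r2(ll_null, ll_model, n)
nk = nagelkerke_r2(ll_null, ll_model, n)
print(round(cs, 3), round(nk, 3))
```

Because the rescaling divides by a quantity below 1, the Nagelkerke value is never smaller than the Cox and Snell value, which matches the ordering described in the text.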

Table 3 Goodness of fit (pseudo R² and H-L test)

Pseudo R² test

Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square

a. Estimation terminated at iteration number 7 because parameter estimates changed by less than .001.

Hosmer and Lemeshow test

The H-L statistic has a significance of 0.22, which means that it is not statistically significant and therefore the model fits reasonably well. Rather than relying only on a goodness-of-fit statistic, it is often useful to look at the proportion of cases classified correctly. In a perfect model, the overall percentage correct would be 100%. In this study, 88.3% of cases overall were classified correctly. Nevertheless, the predictions are skewed towards non-fatal accidents (95% correct), while only 18.2% of fatal accidents are predicted correctly. From the Wald tests in Table 4, the variables loc, Uroad, Proad, Croad and Zone 2 show some significant effect (loc, Uroad, Proad and Croad are approximately significant).
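The classification percentages reported above can be related to a 2x2 classification table. The counts in the sketch below are an assumption: they are back-calculated from the reported percentages and N = 375, since the classification table itself is not shown, and other counts could produce similar rounded figures.

```python
# Classification counts back-calculated from the reported percentages and N = 375.
# They are an assumption for illustration, not figures quoted from the paper.
fatal_correct, fatal_missed = 6, 27          # observed fatal accidents
nonfatal_correct, nonfatal_missed = 325, 17  # observed non-fatal accidents

n_fatal = fatal_correct + fatal_missed
n_nonfatal = nonfatal_correct + nonfatal_missed
n_total = n_fatal + n_nonfatal

# Percentage correct for each observed class and overall.
pct_fatal = 100 * fatal_correct / n_fatal
pct_nonfatal = 100 * nonfatal_correct / n_nonfatal
pct_overall = 100 * (fatal_correct + nonfatal_correct) / n_total

print(n_total, round(pct_overall, 1), round(pct_nonfatal, 1), round(pct_fatal, 1))
```

Under these assumed counts fatal accidents are under a tenth of the sample, so a high overall percentage can coexist with a low percentage correct for the fatal class.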

Table 4 The result of Wald test

Step 1(a)   B        S.E.    Wald     df   Sig.   Exp(B)
loc         .770     .426    3.258    1    .049   2.159
Uroad       1.008    .522    3.727    1    .049   2.740
Proad       .929     .415    5.020    1    .025   2.533
Croad       1.188    .563    4.451    1    .035   3.280
Zone 2      .792     .541    2.143    1    .043   2.207
Constant    -4.422   .558    62.782   1    .000   .012

a. Variable(s) entered on step 1: loc, Uroad, Proad, Croad, Zone2.
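The Wald and Exp(B) columns of Table 4 are functions of B and S.E. alone. The sketch below recomputes them using the usual definitions, Wald = (B / S.E.)² and Exp(B) = e^B; the dictionary and function names are illustrative. Small differences from the table are expected because the reported B and S.E. are rounded to three decimals.

```python
import math

# (B, S.E.) pairs as reported in Table 4.
table4 = {
    "loc": (0.770, 0.426),
    "Uroad": (1.008, 0.522),
    "Proad": (0.929, 0.415),
    "Croad": (1.188, 0.563),
    "Zone2": (0.792, 0.541),
    "Constant": (-4.422, 0.558),
}

def wald(b, se):
    """Wald chi-square statistic with 1 degree of freedom: (B / S.E.)^2."""
    return (b / se) ** 2

for name, (b, se) in table4.items():
    print(f"{name:8s} Wald={wald(b, se):7.3f} Exp(B)={math.exp(b):6.3f}")
```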

According to the previous analysis, the logit model with the significant variables is as follows:

$$g(x) = -4.422 + 0.770\,\text{loc} + 1.008\,\text{Uroad} + 0.929\,\text{Proad} + 1.188\,\text{Croad} + 0.792\,\text{Zone2} \qquad (9)$$

Hence the logistic regression model developed in this study is

$$\pi(x) = \frac{e^{g(x)}}{1 + e^{g(x)}}$$

where $g(x)$ is given by Eq. (9).
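As a sketch of how the fitted model can be evaluated, the code below plugs 0/1 indicator values into Eq. (9) and converts the result to a probability. The coefficients are those reported in the paper; the function name and the example scenario (a junction on a commune road in zone 2) are assumptions chosen for illustration.

```python
import math

# Coefficients of Eq. (9).
COEF = {"const": -4.422, "loc": 0.770, "Uroad": 1.008,
        "Proad": 0.929, "Croad": 1.188, "Zone2": 0.792}

def fatal_probability(**dummies):
    """pi(x) for 0/1 indicator values; indicators not given default to 0."""
    g = COEF["const"] + sum(COEF[k] * v for k, v in dummies.items())
    return math.exp(g) / (1.0 + math.exp(g))

baseline = fatal_probability()                         # all indicators equal to 0
scenario = fatal_probability(loc=1, Croad=1, Zone2=1)  # hypothetical combination
print(round(baseline, 3), round(scenario, 3))
```

Even for this combination the predicted probability stays below the 0.3 cut-off, which is consistent with the low share of fatal accidents that the model classifies correctly.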

4.4 Model interpretation

Interpreting a model means drawing practical inferences from the estimated coefficients. The estimated coefficient of an independent variable represents the rate of change of the dependent variable per unit change in that independent variable. The interpretation of the model developed in this study is presented in detail below.

4.4.1 Impact of location on accident severity

Note that 'loc' has two levels:

loc = 1 (fatal accident occurring at a junction or other location)

loc = 0 (fatal accident occurring at an intersection)

With this coding, loc enters the logit model with a coefficient of 0.77. To interpret this parameter, the logit difference is computed as follows:

$$\text{Logit}(\text{fatal accident} \mid \text{junction and others}) = \beta_0 + \beta_1 \cdot 1 = \beta_0 + \beta_1$$

$$\text{Logit}(\text{fatal accident} \mid \text{intersection}) = \beta_0 + \beta_1 \cdot 0 = \beta_0$$

$$\text{Logit difference} = (\beta_0 + \beta_1) - \beta_0 = \beta_1$$

Hence the odds ratio is $e^{\beta_1} = e^{0.77} = 2.16$.

This value shows that the odds of a fatal accident at a junction or other location are 2.16 times those at an intersection. In the same way, the effect of the zone 2 factor on accident severity can be explained: the odds of a fatal accident in zone 2 are 2.2 times ($e^{0.792}$) those in zone 1 and zone 3.

4.4.2 Impact of Uroad on accident severity

$\beta_2$ (1.008) measures the differential effect on the logit of two cases: whether or not the fatal accident occurs on an urban road. To interpret this parameter, the logit difference is computed first:

$$\text{Logit}(\text{fatal} \mid \text{Uroad}) = \beta_0 + \beta_2 \cdot 1 = \beta_0 + \beta_2$$

For any other type of road:

$$\text{Logit}(\text{fatal} \mid \text{not Uroad}) = \beta_0 + \beta_2 \cdot 0 = \beta_0$$

$$\text{Logit difference} = (\beta_0 + \beta_2) - \beta_0 = \beta_2$$

Hence the odds ratio is $e^{\beta_2} = e^{1.008} = 2.74$, in agreement with the Exp(B) column of Table 4.

Thus, the odds that an accident will be fatal if it occurs on an urban road are 2.74 times the odds for the other types of road.

The same method gives the odds ratios for Proad and Croad, which are 2.53 and 3.28, respectively.

5 Conclusions

A logit model was developed in this study to determine significant contributing factors to accident severity in HCMC. The response variable is binary (i.e. it has two categories, fatal or non-fatal), and the model includes three factors: type of road, location and land use. The model fits the data reasonably well, with 88.3% of cases classified correctly overall, although its predictions are skewed towards non-fatal accidents (only 18.2% of fatal accidents are classified correctly).

The findings suggest that the authorities in HCMC should focus their strategies on improving safety at junctions in zone 2 that involve commune roads. They also suggest that the authorities should develop separate safety policies for each zone, instead of a single policy for the whole of HCMC as before. This may make safety policies more cost-effective.

The odds presented in this paper can be used to help prioritize solutions to reduce serious accidents. For example, the odds of being involved in a fatal accident at junctions and other locations on commune roads in zone 2, where there are few police officers to control traffic, a lack of traffic signs and drivers with low safety awareness, are relatively higher than those for other cases.

It should be noted that some potentially significant variables, such as road surface, traffic signal pattern, light condition, collision type and license status, are not available or are difficult to obtain in HCMC, so they are not included in this research. Nevertheless, the findings of this study can serve as methodological guidance for future studies when these variables become available.

References

[1] Yau, K.K.W. (2004). Risk factors affecting the severity of single vehicle traffic accidents in Hong Kong. Accident Analysis & Prevention.

[2] Abdel-Aty et al. (2005). Exploring the overall and specific crash severity levels at signalized intersections. Accident Analysis & Prevention.

[3] Yan, X. et al. (2005). Characteristics of rear-end accidents at signalized intersections using multiple logistic regression model. Accident Analysis & Prevention.

[4] Huang, H. et al. (2008). Severity of driver injury and vehicle damage in traffic crashes at intersections: A Bayesian hierarchical analysis. Accident Analysis & Prevention.

[5] Jin, Y., X. Wang, and X. Chen (2010). Right-angle crash injury severity analysis using ordered probability models. Intelligent Computation Technology and Automation (ICICTA), IEEE.

Received: 26/9/2016; Sent for review: 30/9/2016; Revision completed: 21/10/2016; Accepted for publication: 28/10/2016

Date posted: 12/01/2020, 02:21
