Dipo T. Akomolafe1, Akinbola Olutayo2
1Dept. of Mathematical Sciences, Ondo State University of Science and Technology, Okitipupa, Nigeria
2Dept. of Computer Science, Joseph Ayo Babalola University, Ikeji Arakeji, Osun State
Correspondence to: Dipo T. Akomolafe, Dept. of Mathematical Sciences, Ondo State University of Science and Technology, Okitipupa, Nigeria.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
Abstract
Road accidents are a special case of trauma that constitutes a major cause of disability, untimely death and the loss of loved ones and family breadwinners. Predicting the likelihood of road accidents on highways, with particular emphasis on the Lagos-Ibadan expressway in Nigeria, is therefore important for accident prevention. Various attempts have been made to identify the causes of accidents on highways using different techniques and systems, and to reduce accidents on the roads, but the accident rate keeps increasing. In this study, the techniques used to analyse the causes of accidents along this route, and the effects of those accidents, were examined. A technique that uses a data mining tool to predict the likely occurrence of accidents on highways, their likely causes and accident-prone locations is proposed, using the Lagos-Ibadan highway as a case study. WEKA software was used to analyse accident data gathered along this road. The results showed that the causes of accidents, the specific times and conditions that could trigger accidents, and accident-prone areas could be effectively identified.
Keywords:
Data Mining, Decision Tree, Accident, WEKA, Data Modelling, Id3 Algorithm, Id3 Tree, Functional Tree Algorithm
Cite this paper:
Dipo T. Akomolafe, Akinbola Olutayo, "Using Data Mining Technique to Predict Cause of Accident and Accident Prone Locations on Highways", American Journal of Database Theory and Application, Vol. 1 No. 3, 2012, pp. 26-38. doi: 10.5923/j.database.20120103.01.
1. Introduction
Road accidents are a special case of trauma that constitutes a major cause of disability and untimely death. It has been estimated that over 300,000 persons die and 10 to 15 million persons are injured every year in road accidents throughout the world. Statistics have also shown that mortality in road accidents is very high among young adults, who constitute the major part of the work force. Accidents kill faster than AIDS and give no preparatory time to their victims. To combat this problem, various road safety strategies have been proposed and used. These methods mainly involve conscious planning, design and operation of roads. One important feature of these methods is the identification and treatment of accident-prone locations, commonly called black spots. However, black spots are not the only cause of accidents on the highway. Various organizations, such as the Police Highway Patrol, the Vehicle Inspection Officers (VIO) and the Federal Road Safety Commission (FRSC), are charged with the responsibility of maintaining safety and thereby reducing road accidents. However, the lack of good forecasting techniques has been a major hindrance to these organizations in achieving their objectives. It is against this background that a Decision Tree is proposed to model data from a road accident database in order to determine the causes of accidents and accident-prone locations, using historical data collected from the Ibadan-Lagos expressway as a reference point.
2. Objective
The primary objective of this research is to use a data mining technique, the decision tree, to predict the causes of accidents and accident-prone locations on highways, using data collected on the Lagos-Ibadan expressway.
3. Methods
3.1. Data Mining
Data Mining is an interactive process of discovering valid, novel, useful and understandable patterns or models in large databases (Han, Mannila and Smyth, 2001). According to Han, Mannila and Smyth (2001), data mining is a process that uses a variety of data analysis tools to discover patterns and relationships in data that may be used to make valid predictions. Data mining draws on advances in Artificial Intelligence (AI) and on statistical techniques. A decision tree, one such technique, is therefore used in this research.
3.2. Decision Trees
Decision Trees have emerged as a powerful technique for modelling general input/output relationships. They are tree-shaped structures that represent a series of rules leading to sets of decisions. They generate rules for the classification of a dataset, together with a logical model represented as a binary (two-way split) tree that shows how the value of a target variable can be predicted from the values of a set of predictor variables. Decision trees that are applied to a regression analysis problem are called regression trees. The decision tree thus represents a logical model of the regularities of the phenomenon under study.
3.3. Accidents along Lagos - Ibadan Express Way
The Lagos-Ibadan expressway is one of the busiest roads in Africa. Lagos was the capital of Nigeria until the seat of government moved to the Federal Capital Territory, Abuja, and it is still the headquarters of many national institutions, while Ibadan is said to be the largest city in black Africa. Traffic along this route is very heavy because it is the gateway for traffic from the northern, eastern and most of the western states. Fig. 3.1 shows the frequency of accidents between 1 and 40 km from Ibadan to Lagos between January 2002 and December 2003. The statistics show that a means of predicting the likely location of an accident from some input values is essential in order to advise on dangerous locations.
Figure 3.1. Graph of Frequency of Accidents against Month
Several studies on road accident analysis and forecasting have been carried out by different researchers using Decision Trees and Artificial Neural Networks. Martin, Grandal and Pilkey (2000) analysed the relationship between road infrastructure and safety using a cross-sectional time-series database collected for all 50 U.S. states over 14 years. The results suggested that fatalities are reduced as highway facilities are upgraded. Gelfand (1991) studied the effect of new pavement on traffic safety in Sweden. The results of that study show that traffic accidents increased by 12% in the year after resurfacing on all types of roads. Akomolafe (2004) employed an Artificial Neural Network, a multilayer perceptron, to predict the likelihood of an accident happening at a particular location within the first 40 kilometres of the Lagos-Ibadan expressway, and discovered that location 2 recorded the highest number of road accidents and that tyre burst was the major cause of accidents along the route. Ossenbruggen (2005) used a logistic regression model to identify statistically significant factors that predict the probabilities of crashes and injury crashes, with the aim of using these models to perform a risk assessment of a given region. The study illustrated that village sites are less hazardous than residential and shopping sites. Abdalla et al (1987) studied the relationship between casualty frequencies and the distance of the accidents from the zones of residence. As might have been anticipated, casualty frequencies were higher nearer to the zones of residence, possibly due to higher exposure. Akomolafe et al (2009) used geospatial technology to identify various positions along major roads in Nigeria. The study revealed that casualty rates amongst residents from areas classified as relatively deprived were significantly higher than those from relatively affluent areas.
Table 3.1. Record of Accidents along Lagos-Ibadan between 2002 and 2003
| S/NO | Month | No of Accidents |
| 1 | Jan 2002 | 6 |
| 2 | Feb 2002 | 11 |
| 3 | March 2002 | 10 |
| 4 | April 2002 | 18 |
| 5 | May 2002 | 14 |
| 6 | June 2002 | 4 |
| 7 | July 2002 | 6 |
| 8 | August 2002 | 1 |
| 9 | September 2002 | 9 |
| 10 | October 2002 | 6 |
| 11 | Nov. 2002 | 4 |
| 12 | December 2002 | 5 |
| 13 | Jan 2003 | 5 |
| 14 | Feb 2003 | 5 |
| 15 | March 2003 | 4 |
| 16 | April 2003 | 7 |
| 17 | May 2003 | 2 |
| 18 | June 2003 | 1 |
| 19 | July 2003 | 4 |
| 20 | August 2003 | 5 |
| 21 | September 2003 | 8 |
| 22 | October 2003 | 5 |
| 23 | Nov. 2003 | 5 |
| 24 | December 2003 | 6 |
3.4. Process of Data Mining
The process of data mining consists of three steps which are:
3.4.1. Data Preparation
This includes data collection, data cleaning and data transformation.
3.4.2. Data Modeling
This research considers accident records for the first 40 km from Ibadan to Lagos. The data were organized into a relational database. The unknown causes in Table 3.2 may include factors such as law enforcement agent problems, the attitude of other road users, inadequate road traffic signs, traffic congestion and general vehicle condition. The sample data cover a period of 24 months, from January 2002 to December 2003, as indicated in Fig. 3.1. The output variable is the location, which is divided into three distinct regions tagged A, B and C, giving three output classes: 1 km to 10 km is Region A (Location 1), above 10 km to 20 km is Region B (Location 2), and above 20 km is Region C (Location 3). The data were collected by Akomolafe (2004) and are presented in Table 3.3.
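The mapping from a recorded kilometre position to the output class can be sketched as follows. This is a minimal illustration in Python, not the authors' preprocessing code: the function name `location_class` and the accepted range (greater than 0 km, up to 40 km) are assumptions, while the region boundaries follow the description above.

```python
def location_class(km: float) -> str:
    """Map a distance from Ibadan (in km) to the output class label.

    Boundaries follow the text: up to 10 km is Region A (Location 1),
    above 10 km to 20 km is Region B (Location 2), above 20 km is
    Region C (Location 3). The 0 < km <= 40 range check is an assumption
    based on the 40 km study stretch.
    """
    if not 0 < km <= 40:
        raise ValueError(f"distance {km} km is outside the 40 km study stretch")
    if km <= 10:
        return "LOCATION1"  # Region A
    if km <= 20:
        return "LOCATION2"  # Region B
    return "LOCATION3"      # Region C


# Example positions taken from the LOCATION column of Table 3.3.
for km in (5, 12.5, 20, 26.5):
    print(km, location_class(km))
```

Fractional positions such as 12.5 km are handled by the same comparisons, so no rounding step is needed.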
3.4.3. Deployment
In this stage, new data sets are applied to the model selected in the previous stage to generate predictions or estimates of the expected outcome.
Table 3.2. Variables given both continuous and categorical values
Table 3.3. Sample Data collected from FRSC (Akomolafe O.P 2004)
| SNO | DATE | TYPE | TIME | SEASON | CAUSE | LOCATION | REG. NO |
| 1 | 6.1.2002 | 2 | 2 | 1 | 2 | 31 | XG 506 LND |
| 2 | 7.1.2002 | 2 | 1 | 1 | 1 | 14 | XC 720 ACD |
| 3 | 11.1.2002 | 1 | 1 | 1 | 1 | 14 | AM 713 LND |
| 4 | 12.1.2002 | 2 | 1 | 1 | 2 | 27 | XE 905 JJJ |
| 5 | 19.1.2002 | 1 | 2 | 1 | 3 | 27 | AA 559 LAF |
| 6 | 30.01.02 | 3 | 3 | 1 | 2 | 12 | AA 156 NWD |
| 7 | 03.02.02 | 2 | 2 | 1 | 2 | 35 | XF 635 JJJ |
| 8 | 05.02.02 | 2 | 1 | 1 | 2 | 10 | XE 141 AKD |
| 9 | 05.02.02 | 2 | 3 | 1 | 2 | 14 | XE 124 AKD |
| 10 | 06.02.02 | 2 | 3 | 1 | 2 | 31 | XE 124 AKD |
| 11 | 11.02.02 | 1 | 1 | 1 | 3 | 5 | AG 276 LAR |
| 12 | 14.02.02 | 1 | 1 | 1 | 2 | 14 | |
| 13 | 18.02.02 | 1 | 2 | 1 | 2 | 18 | |
| 14 | 21.02.02 | 2 | 1 | 1 | 2 | 19 | XD 249 SMK |
| 15 | 21.02.02 | 3 | 2 | 1 | 2 | 19 | XC 361 KTU |
| 16 | 24.02.02 | 2 | 1 | 1 | 2 | 18 | XE 716 SMK |
| 17 | 27.02.02 | 2 | 3 | 1 | 2 | 35 | XC 307 SGM |
| 18 | 03.03.02 | 2 | 1 | 1 | 2 | 16 | XE 807 NSR |
| 19 | 05.03.02 | 1 | 2 | 1 | 2 | 10 | XC 348 AKP |
| 20 | 07.03.02 | 2 | 1 | 1 | 2 | 2 | OY 2270 JB |
| 21 | 07.03.02 | 1 | 1 | 1 | 2 | 13 | AP 820 LSD |
| 22 | 07.03.02 | 3 | 2 | 1 | 2 | 18 | XE 322 APP |
| 23 | 19.03.02 | 2 | 2 | 1 | 2 | 19 | XC 993 AGL |
| 24 | 19.03.02 | | 3 | 1 | 2 | 2 | LA 1804 RF |
| 25 | 30.03.02 | 1 | 4 | 1 | 2 | 14 | AM 343 FST |
| 26 | 31.03.02 | 1 | 2 | 1 | 2 | 14 | KC 461 ABA |
| 27 | 31.03.02 | 1 | 2 | 1 | 2 | 14 | BS 142 KJA |
| 28 | 01.04.02 | 2 | 1 | 2 | 2 | 22 | AA 807 EGB |
| 29 | 01.04.02 | 1 | 1 | 2 | 2 | 22 | BX 527 GGE |
| 30 | 01.04.02 | 2 | 2 | 2 | 2 | 18 | AG 787 GNN |
| 31 | 02.04.02 | 1 | 1 | 2 | 1 | 7 | AU 725 MAP |
| 32 | 02.04.02 | 2 | 2 | 2 | 2 | 27 | XG 358 APP |
| 33 | 04.04.02 | 1 | 1 | 2 | 2 | 15 | CY 65 EKY |
| 34 | 04.04.02 | 1 | 2 | 2 | 2 | 17 | AJ 21 AGG |
| 35 | 05.04.02 | 1 | 2 | 2 | 1 | 6 | AW 45 FST |
| 36 | 06.04.02 | 2 | 1 | 2 | 2 | 30 | XB 855 AKD |
| 37 | 07.04.02 | 1 | 2 | 2 | 1 | 13 | AL 567 YAB |
| 38 | 09.04.02 | 2 | 2 | 2 | 2 | 12.5 | XA 787 WWP |
| 39 | 13.04.02 | 2 | 1 | 2 | 1 | 1 | XB 791 GNN |
| 40 | 13.04.02 | 2 | 1 | 2 | 1 | 11 | XA 127 AFN |
| 41 | 13.04.02 | 1,2 | 1 | 2 | 1 | 11 | AH 202 AKN |
| 42 | 22.04.02 | 1 | 2 | 2 | 1 | 15 | RA 01 KRD |
| 43 | 22.04.02 | 1,3 | 2 | 2 | 1 | 11 | BB 731 KJA |
| 44 | 27.04.02 | 2 | 2 | 2 | 2 | 27 | AU 739 JJJ |
| 45 | 28.04.02 | 1 | 2 | 2 | 1 | 14 | AE 316 FST |
| 46 | 03.04.02 | 1 | 3 | 2 | 2,1 | 12 | AZ 824 AAA |
| 47 | 5.8.2002 | 1 | 1 | 2 | 2 | 20 | AA 654 GBY |
| 48 | 5.8.2002 | 1 | 1 | 2 | 2 | 30 | XF 65 JJJ |
| 49 | 5.10.2002 | 2&1 | 1 | 2 | 1 | 35 | DM 207 AAA |
| | | | | 2 | | | BL 86 AAA |
| 50 | 5.10.2002 | 1 | 1 | 2 | 1&2 | 35 | BR 608 LSR |
| 51 | 5.11.2002 | 3 | 1 | 2 | 2 | 26 | XB 606 APP |
| 52 | 5.13.2002 | 2 | 1 | 2 | 1 | 2 | XA 616 YLW |
| 53 | 5.13.2002 | 1 | 1 | 2 | 1 | 26.5 | BM 566 GGE |
| 54 | 5.14.2002 | 2 | 3 | 2 | 2 | 15 | XC 348 AKD |
| 55 | 5.15.2002 | 1 | 2 | 2 | 2 | 19 | OY 2077 JB |
| 56 | 5.15.2002 | 1 | 2 | 2 | 2 | 14 | AJ 101 NND |
| 57 | 5.20.2002 | 1 | | 2 | 2 | 26 | AU 682 ABC |
| 58 | 5.21.2002 | 2 | | 2 | 2 | 24 | XG 719 FST |
| 59 | 5.25.2002 | 1 | 1 | 2 | 2 | 12 | AV 70 LSR |
| 60 | 6.2.2002 | | 3 | 2 | 1 | 12 | AZ 191 MUS |
| 61 | 6.3.2002 | 2 | 2 | 2 | 2 | 16 | AQ 742 YYY |
| 62 | 6.15.2002 | 2 | 1 | 2 | 2 | 12 | XA 682 YRE |
| 63 | 6.16.2002 | 1 | 1 | 2 | 2 | 21 | AL 885 AKN |
| 64 | 6.16.2002 | 2 | 1 | 2 | 2 | 21 | XE 751 SMK |
| 65 | 7.15.2002 | 2 | 1 | 2 | 3 | 12 | XH 649 GGE |
| 66 | 7.20.2002 | 2 | 2 | 2 | 2 | 10 | XB 286 KNR |
| 67 | 8.8.2002 | 3 | 2 | 2 | 2 | 12 | XE 232 SGM |
| 68 | 9.19.2002 | 1 | 3 | 2 | 2 | 22 | XA 940 KNH |
| 69 | 9.20.2002 | 2 | 1 | 2 | 2 | 4 | AX 94 JJJ |
| 70 | 9.20.2002 | 3 | 2 | 2 | 2 | 7 | XC 768 BDJ |
| 71 | 9.21.2002 | 1 | 1 | 2 | 1 | 29 | BL 254 SMK |
| 72 | 9.21.2002 | 2 | 1 | 2 | 1 | 16 | AP 647 AKR |
| 73 | 9.21.2002 | 2 | | 2 | 2 | 18 | XC 253 GGE |
| 74 | 9.22.2002 | 2 | 1 | 2 | 2 | 10 | LA 979 BG |
| 75 | 9.22.2002 | 2 | 3 | 2 | 2 | 16 | XU 510 GGE |
| 76 | 9.27.2002 | | 2 | 2 | 2 | 12 | |
| 77 | 10.1.2002 | 1 | 2 | 2 | 1 | 6 | AA 05 MHA |
| 78 | 10.14.2002 | 2 | 1 | 2 | 2 | 13 | XE 869 MUS |
| 79 | 10.16.2002 | 2 | 2 | 2 | 2 | 15 | XB 888 AKR |
| 80 | 10.29.2002 | | 2 | 2 | 2 | 7 | |
| 81 | 10.29.2002 | 2 | 2 | 2 | 2 | 17 | XD 168 BDJ |
| 82 | 10.29.2002 | 3 | 1 | 2 | 2 | 6 | AA 342 LES |
| 83 | 11.4.2002 | 2 | 1 | 1 | 1 | 5 | BX 877 KJA |
| 84 | 11.10.2002 | 2 | 1 | 1 | 2 | 12 | XC 637 RKJ |
| 85 | 11.10.2002 | 2 | 2 | 1 | 2 | 11 | XC 937 SGM |
| 86 | 11.12.2002 | 1 | | 1 | | 12 | AA 466 KNR |
| 87 | 2.12.2004 | 2 | 1 | 1 | 2 | 14 | XG 182 JJJ |
| 88 | 12.7.2002 | 3 | 2 | 1 | 2 | 1 | XA 425 CRC |
| 89 | 12.10.2002 | 2 | 3 | 1 | 3 | 13 | XD 695 EKY |
| 90 | 12.11.2002 | 2 | 2 | 1 | 2 | 16 | XA 350 EDY |
| 91 | 12.12.2002 | | 1 | 1 | 2 | 14 | XG 955 KSF |
| 92 | 23.01.2002 | 1 | 3 | 1 | 1 | 16 | XA 411 EJG |
| 93 | 18.01.03 | 1 | 3 | 1 | 1 | 18 | AE 015 GBN |
| 94 | 27.01.03 | 2 | 2 | 1 | 2 | 8 | XD 125 LSR |
| 95 | 29.01.03 | 3 | 4 | 1 | 2 | 12 | XC 616 KTU |
| 96 | 29.01.03 | 2 | | 1 | 2 | 14 | XF 797 AKD |
| 97 | 02.02.03 | 2 | 1 | 1 | 2 | 18 | CW 293 AAA |
| 98 | 12.02.03 | 1 | 2 | 1 | 1 | 18 | AV 3 GGE |
| 99 | 12.02.03 | 2 | 2 | 1 | 2 | 18 | XB 6 WWD |
| 100 | 12.02.03 | 1 | 3 | 1 | 1 | 12 | HB 40 KJA |
| 101 | 17.02.03 | 2 | 3 | 1 | 2 | 11 | XB 446 MNY |
| 102 | 05.03.03 | 1 | 2 | 1 | 2 | 6 | AE 753 KRE |
| 103 | 19.03.03 | 2 | 1 | 1 | 2 | 12 | XH 382 ABC |
| 104 | 28.03.03 | 3 | | 1 | 1 | 12 | AG 145 NRK |
| 105 | 31.03.03 | 2 | 3 | 1 | 2 | 13 | AA 499 GBY |
| 106 | 05.04.03 | 2 | 2 | 2 | 3 | 11.5 | XD 432 KSF |
| 107 | 06.04.031 | 1 | 1 | 2 | 3 | 12 | CE 188 JJJ |
| 108 | 06.04.03 | 2 | 1 | 2 | 2 | 12 | FA 01 JJ |
| 109 | 14.04.03 | 1 | 1 | 2 | | 28 | FV 43 AAA |
| 110 | 24.04.03 | 1 | 2 | 2 | 2 | 7 | OY 01 SE |
| 111 | 24.04.03 | 3 | 2 | 2 | 2 | 9 | XB 328 MAG |
| 112 | 30.04.03 | 3 | 3 | 2 | 1 | 16 | XD 644 NRK |
| 113 | 10.05.03 | | 1 | 2 | | 40 | AA 399 KTU |
| 114 | 16.05.03 | 1 | 3 | 2 | 2 | 20 | XH 327 ADC |
| 115 | 02.06.03 | 1 | 1 | 2 | 1 | 8 | XB 144 YRE |
| 116 | 20.07.03 | 2 | 1 | 2 | 2 | 27 | 5K 324 LND |
| 117 | 26.07.03 | 1 | 2 | 2 | 2 | 9 | DG 329 LSR |
| 118 | 28.07.03 | 2 | 2 | 2 | 2 | 13 | XJ 179 LND |
| 119 | 28.07.03 | 2 | 2 | 2 | 1 | 18 | XF 114 EPE |
| 120 | 02.08.03 | 1 | 1 | 2 | 2 | 13 | CB 434 MUS |
| 121 | 02.08.03 | 1 | 1 | 2 | 1 | 8 | XG 954 FST |
| 122 | 09.08.03 | 1 | 1 | 2 | 1 | 19 | AG 802 SGB |
| 123 | 16.08.03 | 2 | 2 | 2 | 2 | 2 | XF 450 SMK |
| 124 | 31.08.03 | 1 | 1 | 2 | 1 | 14 | OY 1281 TD |
| 125 | 01.09.03 | 3 | 2 | 2 | 1 | 8 | XA 362 KJA |
| 126 | 08.09.03 | 1 | | 2 | | 18 | XH 723 JJJ |
| 127 | 14.09.03 | | | 2 | | 19 | |
| 128 | 16.09.03 | 1 | 2 | 2 | 2 | 6 | AA 112 YRE |
| 129 | 21.09.03 | 2 | 1 | 2 | 2 | 31 | XB 766 AGG |
| 130 | 24.09.03 | 2 | 2 | 2 | 1 | 18 | XC 115 EDE |
| 131 | 28.09.03 | 2 | 1 | 2 | 2 | 14 | XN 739 AAA |
| 132 | 28.09.03 | 2 | 3 | 2 | 2 | 13 | XD 642 NRK |
| 133 | 06.10.03 | 1 | 2 | 2 | 2 | 11 | DG 548 LND |
| 134 | 14.10.03 | 2 | 2 | 2 | 2 | 12 | XA 730 FUF |
| 135 | 18.10.03 | 2 | 3 | 2 | 2 | 28 | XA 286 GBH |
| 136 | 19.10.03 | | 1 | 2 | 2 | 22 | AA 188 AAA |
| 137 | 20.10.03 | 2 | 2 | 2 | 2 | 27 | LG 016 KNE |
| 138 | 01.11.03 | 3 | 1 | 1 | 1 | 9 | XA 847 KEH |
| 139 | 02.11.03 | 2 | 2 | 1 | 2 | 18 | XC 575 GGE |
| 140 | 25.11.03 | 1 | 1 | 1 | 3 | 24 | BO 984 APP |
| 141 | 27.11.03 | 1 | 1 | 1 | 2 | 18 | AJ 06 SGB |
| 142 | 27.11.03 | 2 | 2 | 1 | 2 | 13 | XB 369 EKY |
| 143 | 06.12.03 | 2 | 1 | 1 | 2 | 13 | AP 938 KJA |
| 144 | 09.12.03 | 3 | 3 | 1 | 1 | 13 | BM 130 MAP |
| 145 | 13.12.03 | 2 | 1 | 1 | 1 | 7 | XA 610 ARP |
| 146 | 22.12.03 | 1 | 1 | 1 | 1 | 11 | BL 500 GGE |
| 147 | 24.12.03 | | 1 | 1 | 3 | 12 | JB 356 KJA |
| 148 | 24.12.03 | 2 | 2 | 1 | 2 | 13 | XG 562 AKD |
4. Results
4.1. Analysis
The main step required to obtain the results of this research was the analysis of the data using WEKA. WEKA is a collection of machine learning algorithms and data processing tools. It contains tools for data pre-processing, classification, regression, clustering, association rules and visualization. Many learning algorithms are implemented in WEKA, including Bayesian classifiers, trees, rules, functions, lazy classifiers and miscellaneous classifiers. The algorithms can be applied directly to a data set. WEKA is data mining software developed in Java; it has a GUI chooser from which any one of the four major WEKA applications can be selected. For the purpose of this study, the Explorer application was used.
The Explorer window of WEKA has six tabs. The first tab is Preprocess, which enables the formatted data to be loaded into the WEKA environment. Once the data have been loaded, the Preprocess panel shows a variety of information, as shown in Figure 4.3.
Figure 4.1. WEKA GUI chooser
Figure 4.2. WEKA Explorer
4.1. Weka Classifiers
Several classifiers are available in WEKA, but the Functional Tree and Id3 decision tree algorithms were used in this study. A Prism rule-based learner was also generated using WEKA. Attribute importance analysis was carried out to rank the attributes by significance using information gain. Finally, the correlation-based feature subset selection (CFS) and consistency subset selection (COE) filter algorithms were used to rank and select the most useful attributes. The F-measure and the AUC, which are well-known measures in probability tree learning, were used as evaluation metrics for the models generated by the WEKA classifiers.
Several setups of the decision tree algorithms were tried, and the best result obtained for the data set is reported. Each class was trained with the entropy of fit measure, the prior class probabilities parameter was set to equal, the stopping option for pruning was misclassification error, the minimum n per node was set to 5, the fraction of objects was 0.05, the maximum number of nodes was 100, surrogates was 5, and 10-fold cross-validation was used to generate comprehensive results.
The best decision tree result was obtained with Id3, with 115 correctly classified instances and 33 incorrectly classified instances, which represent 77.70% and 22.30% respectively. The mean absolute error was 0.1835 and the root mean squared error was 0.3029. The tree and rules generated with the Id3 algorithm are given below.
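Ranking attributes by information gain, the criterion that Id3 also uses to choose each split, can be illustrated with a short sketch. The Python below is an assumption-laden illustration and not WEKA's implementation: the function names and the four-row toy data set are hypothetical, and only the attribute and class names are borrowed from the study.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


# Hypothetical toy records, NOT taken from Table 3.3.
toy_rows = [
    {"TIME": "MORNING", "SEASON": "DRY", "LOCATION": "LOCATION3"},
    {"TIME": "MORNING", "SEASON": "WET", "LOCATION": "LOCATION3"},
    {"TIME": "NIGHT", "SEASON": "DRY", "LOCATION": "LOCATION2"},
    {"TIME": "NIGHT", "SEASON": "WET", "LOCATION": "LOCATION2"},
]

# Rank the predictor attributes by information gain, highest first.
ranking = sorted(("TIME", "SEASON"),
                 key=lambda a: information_gain(toy_rows, a, "LOCATION"),
                 reverse=True)
print(ranking)
```

In this toy set TIME separates the two locations perfectly while SEASON carries no information, so TIME is ranked first; on real data the gains fall somewhere between these two extremes.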
4.2. Id3 Tree
TYREBURST = TRUE| SEASON = WET| | TYPE = HAEVY VEHICLE| | | TIME = EVENING: LOCATION2| | | TIME = AFTERNOON: LOCATION2| | | TIME = MORNING: LOCATION2| | | TIME = NIGHT: null| | TYPE = SMALL CAR: LOCATION2| | TYPE = MOTOCYCLE: null| SEASON = DRY| | TIME = EVENING| | | TYPE = HAEVY VEHICLE: LOCATION2| | | TYPE = SMALL CAR: LOCATION3| | | TYPE = MOTOCYCLE: null| | TIME = AFTERNOON| | | TYPE = HAEVY VEHICLE: LOCATION2| | | TYPE = SMALL CAR: LOCATION2| | | TYPE = MOTOCYCLE: null| | TIME = MORNING| | | TYPE = HAEVY VEHICLE: LOCATION3| | | TYPE = SMALL CAR: LOCATION3| | | TYPE = MOTOCYCLE: null| | TIME = NIGHT: nullTYREBURST = FALSE| TIME = EVENING| | OVERSPEEDING = FALSE: LOCATION2| | OVERSPEEDING = TRUE| | | TYPE = HAEVY VEHICLE: LOCATION2| | | TYPE = SMALL CAR: LOCATION2| | | TYPE = MOTOCYCLE: null| TIME = AFTERNOON| | LOSS-OF-CONTROL = FALSE| | | OVERSPEEDING = FALSE| | | | BRAKE-FAILURE = FALSE| | | | | TYPE = HAEVY VEHICLE| | | | | | WRONG-OVERTAKING = FALSE| | | | | | | BROKEN-SHAFT = FALSE: LOCATION1| | | | | | | BROKEN-SHAFT = TRUE: LOCATION3| | | | | | WRONG-OVERTAKING = TRUE: LOCATION2| | | | | TYPE = SMALL CAR| | | | | | SEASON = WET: LOCATION3| | | | | | SEASON = DRY| | | | | | | CARELESSDRIVING = FALSE: LOCATION3| | | | | | | CARELESSDRIVING = TRUE: LOCATION2| | | | | TYPE = MOTOCYCLE: LOCATION3| | | | BRAKE-FAILURE = TRUE| | | | | TYPE = HAEVY VEHICLE: LOCATION1| | | | | TYPE = SMALL CAR: LOCATION1| | | | | TYPE = MOTOCYCLE: LOCATION2| | | OVERSPEEDING = TRUE| | | | TYPE = HAEVY VEHICLE: LOCATION2| | | | TYPE = SMALL CAR| | | | | SEASON = WET: LOCATION2| | | | | SEASON = DRY: LOCATION2| | | | TYPE = MOTOCYCLE: null| | LOSS-OF-CONTROL = TRUE| | | TYPE = HAEVY VEHICLE: LOCATION2| | | TYPE = SMALL CAR| | | | SEASON = WET: LOCATION2| | | | SEASON = DRY: LOCATION1| | | TYPE = MOTOCYCLE: LOCATION1| TIME = MORNING| | SEASON = WET| | | OVERSPEEDING = FALSE| | | | TYPE = HAEVY VEHICLE| | | | | WRONG-OVERTAKING = FALSE| | | | | | CARELESSDRIVING = FALSE: LOCATION1| | 
| | | | CARELESSDRIVING = TRUE: LOCATION2| | | | | WRONG-OVERTAKING = TRUE: LOCATION1| | | | TYPE = SMALL CAR| | | | | CARELESSDRIVING = FALSE| | | | | | LOSS-OF-CONTROL = FALSE: LOCATION3| | | | | | LOSS-OF-CONTROL = TRUE: LOCATION2| | | | | CARELESSDRIVING = TRUE: LOCATION1| | | | TYPE = MOTOCYCLE: LOCATION2| | | OVERSPEEDING = TRUE: LOCATION2| | SEASON = DRY| | | BROKEN-SHAFT = FALSE| | | | TYPE = HAEVY VEHICLE| | | | | CARELESSDRIVING = FALSE| | | | | | LOSS-OF-CONTROL = FALSE| | | | | | | BROKEN-SPRING = FALSE| | | | | | | | OVERSPEEDING = FALSE: LOCATION2| | | | | | | | OVERSPEEDING = TRUE: LOCATION2| | | | | | | BROKEN-SPRING = TRUE: LOCATION2| | | | | | LOSS-OF-CONTROL = TRUE: LOCATION2| | | | | CARELESSDRIVING = TRUE: LOCATION3| | | | TYPE = SMALL CAR| | | | | CARELESSDRIVING = FALSE| | | | | | OVERSPEEDING = FALSE| | | | | | | UNKNOWN-CAUSES = FALSE| | | | | | | | ROBBERY-ATTACK = FALSE| | | | | | | | | WRONG-OVERTAKING = FALSE| | | | | | | | | | LOSS-OF-CONTROL = FALSE| | | | | | | | | | | TREE-OBSTRUCTION = FALSE| | | | | | | | | | | | BRAKE-FAILURE = FALSE: LOCATION3| | | | | | | | | | | | BRAKE-FAILURE = TRUE: LOCATION2| | | | | | | | | | | TREE-OBSTRUCTION = TRUE: LOCATION2| | | | | | | | | | LOSS-OF-CONTROL = TRUE: LOCATION2| | | | | | | | | WRONG-OVERTAKING = TRUE: LOCATION2| | | | | | | | ROBBERY-ATTACK = TRUE: LOCATION3| | | | | | | UNKNOWN-CAUSES = TRUE: LOCATION3| | | | | | OVERSPEEDING = TRUE: LOCATION3| | | | | CARELESSDRIVING = TRUE: LOCATION1| | | | TYPE = MOTOCYCLE: null| | | BROKEN-SHAFT = TRUE: LOCATION3| TIME = NIGHT: LOCATION2Prism rules----------Rule 1 If BROKEN-SHAFT = TRUE then LOCATION3Rule 2 If ROBBERY-ATTACK = TRUE and TYPE = SMALL CAR then LOCATION3Rule 3 If TREE-OBSTRUCTION = TRUE and TIME = EVENING then LOCATION3Rule 4 If TYREBURST = TRUE and TIME = MORNING and TYPE = SMALL CAR and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = 
FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 5 If TYPE = MOTOCYCLE and CARELESSDRIVING = TRUE then LOCATION3Rule 6 If ROAD-PROBLEM = TRUE and TYPE = SMALL CAR and TIME = AFTERNOON and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 7 If TYREBURST = TRUE and SEASON = DRY and TIME = MORNING and TYPE = HAEVY VEHICLE and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 8 If UNKNOWN-CAUSES = TRUE and TYPE = SMALL CAR and TIME = MORNING and SEASON = DRY then LOCATION3Rule 9 If TYREBURST = TRUE and TYPE = HAEVY VEHICLE and TIME = AFTERNOON and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 10 If TIME = MORNING and OVERSPEEDING = TRUE and TYPE = SMALL CAR and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = 
FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 11 If TYREBURST = TRUE and TIME = EVENING and TYPE = SMALL CAR then LOCATION3Rule 12 If TYREBURST = TRUE and TYPE = HAEVY VEHICLE and TIME = AFTERNOON and SEASON = WET and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 13 If TIME = MORNING and LOSS-OF-CONTROL = TRUE and TYPE = HAEVY VEHICLE and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and TYREBURST = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 14 If UNKNOWN-CAUSES = TRUE and TYPE = SMALL CAR and TIME = MORNING and SEASON = WET and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 15 If TYREBURST = TRUE and TYPE = HAEVY VEHICLE and SEASON = WET and TIME = EVENING and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 16 If TIME = MORNING and TYREBURST = TRUE and TYPE = HAEVY VEHICLE and SEASON = WET and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = 
FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION3Rule 17 If CARELESSDRIVING = TRUE and TYPE = HAEVY VEHICLE and SEASON = DRY then LOCATION3Rule 18 If TIME = MORNING and TYPE = SMALL CAR and SEASON = DRY and CARELESSDRIVING = FALSE and WRONG-OVERTAKING = FALSE and LOSS-OF-CONTROL = FALSE and TREE-OBSTRUCTION = FALSE and BRAKE-FAILURE = FALSE then LOCATION3Rule 19 If TIME = NIGHT then LOCATION2Rule 20 If WRONG-OVERTAKING = TRUE and TYPE = SMALL CAR then LOCATION2Rule 21 If TIME = EVENING and CARELESSDRIVING = TRUE then LOCATION2Rule 22 If TIME = EVENING and UNKNOWN-CAUSES = TRUE then LOCATION2Rule 23 If TIME = EVENING and LOSS-OF-CONTROL = TRUE then LOCATION2Rule 24 If TIME = EVENING and ROBBERY-ATTACK = TRUE then LOCATION2Rule 25 If TIME = EVENING and TYPE = HAEVY VEHICLE and SEASON = DRY then LOCATION2Rule 26 If SEASON = WET and TYPE = MOTOCYCLE then LOCATION2Rule 27 If SEASON = WET and OVERSPEEDING = TRUE and TIME = MORNING then LOCATION2Rule 28 If TYREBURST = TRUE and SEASON = WET and TYPE = SMALL CAR then LOCATION2Rule 29 If TYREBURST = TRUE and SEASON = WET and TIME = MORNING and TYPE = HAEVY VEHICLE and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2Rule 30 If TYPE = HAEVY VEHICLE and ROBBERY-ATTACK = TRUE then LOCATION2Rule 31 If TYPE = HAEVY VEHICLE and OVERSPEEDING = TRUE and TIME = AFTERNOON then LOCATION2Rule 32 If TYREBURST = TRUE and SEASON = WET and TIME = EVENING and TYPE = HAEVY VEHICLE and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = 
FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 33 If TYREBURST = TRUE and SEASON = WET and TYPE = HAEVY VEHICLE and TIME = AFTERNOON and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 34 If TYPE = HAEVY VEHICLE and TIME = EVENING then LOCATION2
Rule 35 If TYPE = HAEVY VEHICLE and OVERSPEEDING = TRUE and TIME = MORNING and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 36 If TYREBURST = TRUE and TIME = AFTERNOON and TYPE = SMALL CAR and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 37 If BRAKE-FAILURE = TRUE and TYPE = MOTOCYCLE then LOCATION2
Rule 38 If WRONG-OVERTAKING = TRUE and TIME = AFTERNOON then LOCATION2
Rule 39 If TREE-OBSTRUCTION = TRUE and TIME = MORNING then LOCATION2
Rule 40 If BROKEN-SPRING = TRUE and TYPE = HAEVY VEHICLE and TIME = MORNING and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 41 If TYPE = HAEVY VEHICLE and TYREBURST = TRUE and TIME = AFTERNOON and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 42 If LOSS-OF-CONTROL = TRUE and TIME = MORNING and TYPE = SMALL CAR then LOCATION2
Rule 43 If UNKNOWN-CAUSES = TRUE and TYPE = HAEVY VEHICLE and SEASON = DRY then LOCATION2
Rule 44 If OVERSPEEDING = TRUE and TIME = AFTERNOON and SEASON = WET then LOCATION2
Rule 45 If TYPE = HAEVY VEHICLE and LOSS-OF-CONTROL = TRUE and TIME = MORNING and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and TYREBURST = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 46 If SEASON = WET and LOSS-OF-CONTROL = TRUE and TIME = AFTERNOON and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and TYREBURST = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE and TYPE = HAEVY VEHICLE then LOCATION2
Rule 47 If CARELESSDRIVING = TRUE and TIME = AFTERNOON and TYPE = SMALL CAR then LOCATION2
Rule 48 If OVERSPEEDING = TRUE and TIME = AFTERNOON and TYPE = SMALL CAR and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 49 If SEASON = WET and TIME = EVENING and TYPE = SMALL CAR and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and LOSS-OF-CONTROL = FALSE and TYREBURST = FALSE and OVERSPEEDING = TRUE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
Rule 50 If TYPE = HEAVY VEHICLE and LOSS-OF-CONTROL = TRUE and TIME = AFTERNOON and SEASON = DRY and WRONG-OVERTAKING = FALSE and CARELESSDRIVING = FALSE and TYREBURST = FALSE and OVERSPEEDING = FALSE and TREE-OBSTRUCTION = FALSE and PUSHED-BY-A-CAR = FALSE and BROKEN-SHAFT = FALSE and BROKEN-SPRING = FALSE and BRAKE-FAILURE = FALSE and ROAD-PROBLEM = FALSE and UNKNOWN-CAUSES = FALSE and ROBBERY-ATTACK = FALSE then LOCATION2
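As an illustration of how rules of this form could be applied to a new accident record, the following is a minimal sketch and not the authors' implementation. It encodes three of the shorter rules (34, 37 and 38) as attribute/value conditions; the names `RULES` and `predict_location` are hypothetical, and the attribute values are spelled exactly as they appear in the rule text.

```python
# Hypothetical encoding of three of the shorter rules listed above.
# Each entry is (conditions, predicted location); attribute names and values
# are copied from the rule text, including its spelling of the values.
RULES = [
    ({"TYPE": "HAEVY VEHICLE", "TIME": "EVENING"}, "LOCATION2"),       # Rule 34
    ({"BRAKE-FAILURE": "TRUE", "TYPE": "MOTOCYCLE"}, "LOCATION2"),     # Rule 37
    ({"WRONG-OVERTAKING": "TRUE", "TIME": "AFTERNOON"}, "LOCATION2"),  # Rule 38
]


def predict_location(record, rules=RULES):
    """Return the location of the first rule whose conditions all hold, else None."""
    for conditions, location in rules:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return location
    return None


# An illustrative (made-up) record: a motorcycle with brake failure.
record = {"TYPE": "MOTOCYCLE", "BRAKE-FAILURE": "TRUE", "TIME": "MORNING", "SEASON": "DRY"}
print(predict_location(record))
```

Because rules read off a decision tree correspond to distinct root-to-leaf paths, at most one rule should match a complete record, so the order in which the rules are tried should not change the result.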
5. Discussion
A total of 50 rules were generated from this tree. Rules 1-18 indicate the occurrence of accidents in Location 3, while rules 19-50 indicate the occurrence of accidents in Location 2. This indicates that Location 2 has the highest number of road accident occurrences, mostly involving heavy vehicles in the afternoon and during the dry season. Rule 41 is the best one that can be used for prediction. It says that tyre burst is the cause of road accidents involving heavy vehicles in Location 2 in the afternoon and during the dry season.
Decision Tree Performance Analysis on Id3
Table 5.1. Detailed Accuracy by Class
| Class | TP rate | FP rate | Precision | Recall | F-measure | ROC area |
| Location (3) | 0.688 | 0.069 | 0.733 | 0.688 | 0.71 | 0.942 |
| Location (2) | 0.897 | 0.361 | 0.78 | 0.897 | 0.834 | 0.888 |
| Location (1) | 0.517 | 0.025 | 0.833 | 0.517 | 0.638 | 0.95 |
| Weighted Avg. | 0.777 | 0.232 | 0.78 | 0.777 | 0.769 | 0.912 |
Table 5.2. Confusion Matrix (rows: actual category; columns: predicted category)
| Actual category | Location (3) | Location (2) | Location (1) |
| Location (3) | 22 | 10 | 0 |
| Location (2) | 6 | 78 | 3 |
| Location (1) | 2 | 12 | 15 |
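The per-class figures in Table 5.1 can be recomputed from the confusion matrix in Table 5.2. The sketch below assumes the usual definitions (TP rate = recall, FP rate = false positives over all instances of the other classes, F-measure = harmonic mean of precision and recall); the function name `per_class_metrics` is illustrative and not part of WEKA.

```python
# Confusion matrix from Table 5.2 (rows: actual, columns: predicted),
# in the order Location (3), Location (2), Location (1).
LABELS = ["Location (3)", "Location (2)", "Location (1)"]
ID3_MATRIX = [
    [22, 10, 0],
    [6, 78, 3],
    [2, 12, 15],
]


def per_class_metrics(matrix):
    """Return (TP rate, FP rate, precision, F-measure) for each class.

    Assumes every class has at least one actual and one predicted instance,
    which holds for the matrix above.
    """
    total = sum(sum(row) for row in matrix)
    metrics = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        actual = sum(matrix[k])                    # instances whose true class is k
        predicted = sum(row[k] for row in matrix)  # instances predicted as class k
        fp = predicted - tp
        recall = tp / actual                       # TP rate
        fp_rate = fp / (total - actual)
        precision = tp / predicted
        f_measure = 2 * precision * recall / (precision + recall)
        metrics.append((recall, fp_rate, precision, f_measure))
    return metrics


for label, (recall, fp_rate, precision, f_measure) in zip(LABELS, per_class_metrics(ID3_MATRIX)):
    print(f"{label}: TP rate={recall:.3f} FP rate={fp_rate:.3f} "
          f"precision={precision:.3f} F-measure={f_measure:.3f}")
```

Run on the Id3 matrix, these definitions give values that agree with the per-class rows of Table 5.1 to three decimal places.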
Decision Tree Performance Analysis on Functional Tree (FT)
Table 5.3. Detailed Accuracy by Class
| Class | TP rate | FP rate | Precision | Recall | F-measure | ROC area |
| Location (3) | 0.625 | 0.086 | 0.667 | 0.625 | 0.645 | 0.869 |
| Location (2) | 0.77 | 0.361 | 0.753 | 0.77 | 0.761 | 0.736 |
| Location (1) | 0.586 | 0.101 | 0.586 | 0.586 | 0.586 | 0.832 |
| Weighted Avg. | 0.703 | 0.25 | 0.702 | 0.703 | 0.702 | 0.783 |
Table 5.4. Confusion Matrix (rows: actual category; columns: predicted category)
| Actual category | Location (3) | Location (2) | Location (1) |
| Location (3) | 20 | 12 | 0 |
| Location (2) | 8 | 67 | 12 |
| Location (1) | 2 | 10 | 17 |
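As a worked check, the overall accuracy of each classifier can be read off the diagonal of its confusion matrix. Both matrices contain 148 instances, so, taking accuracy as correctly classified instances divided by all instances:

```latex
\[
\text{Accuracy}_{\text{Id3}} = \frac{22 + 78 + 15}{148} = \frac{115}{148} \approx 0.777,
\qquad
\text{Accuracy}_{\text{FT}} = \frac{20 + 67 + 17}{148} = \frac{104}{148} \approx 0.703
\]
```

These values match the weighted-average TP rates reported in Tables 5.1 and 5.3.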
6. Conclusions
Using WEKA software to analyze accident data collected on the Lagos-Ibadan road, it was found that a decision tree can predict the cause(s) of accidents and accident prone locations along the road, and along other roads if relevant data are gathered and analyzed as in this case.
In the decision tree performance analysis, the dataset was tested with two algorithms: Id3 and FT (functional tree). For the Id3 algorithm, there were 115 correctly classified instances and 33 incorrectly classified instances, which represent 77.70% and 22.30% respectively. The mean absolute error was 0.1835 and the root mean squared error was 0.3029. For the functional tree algorithm (FT), the tree size was 5, with 104 correctly classified instances representing 70.27% and 44 incorrectly classified instances representing 29.73%.
From the detailed accuracy by class and the confusion matrices, Id3 attained an accuracy rate of 0.777 and FT attained an accuracy rate of 0.703.