American Journal of Intelligent Systems
p-ISSN: 2165-8978 e-ISSN: 2165-8994
2012; 2(5): 111-117
doi: 10.5923/j.ajis.20120205.05
Christiane Ferreira Lemos Lima 1, Francisco M. de Assis 2, Cleonilson Protásio de Souza 3
1Department of Education, Federal Institute of Maranhão, São Luís, MA - Brazil
2Postgraduate Program in Electrical Engineering, Federal University of Campina Grande, Campina Grande, PB - Brazil
3Department of Electrical Engineering, Federal University of Paraíba, João Pessoa, PB - Brazil
Correspondence to: Christiane Ferreira Lemos Lima, Department of Education, Federal Institute of Maranhão, São Luís, MA - Brazil.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
Intrusion Detection Systems for computer networks perform their detection task by monitoring a set of attributes extracted from network traffic. Since some attributes may be irrelevant, redundant, or even noisy, their use can decrease detection efficiency while needlessly enlarging the attribute set that must be processed. Selecting an optimal subset of attributes is therefore a difficult task, considering that the full attribute set spans a wide variety of data formats (for example, different symbol sets such as binary, alphanumeric, or real-valued, as well as different types and lengths). This work presents an empirical investigation of attribute selection techniques based on the Shannon, Rényi, and Tsallis entropies, aimed at obtaining optimal attribute subsets that improve the capability of classifying network traffic as either normal or suspicious. Simulation experiments were carried out, and the results show that when the Rényi or Tsallis entropy is applied, the number of attributes and the processing time are reduced while the classification efficiency is increased.
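For reference, the three entropy measures named in the abstract have the following standard definitions for a discrete probability distribution p_1, ..., p_n (the base-2 logarithm and the symbols α and q for the entropic indices follow common convention rather than anything stated here):

H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i \quad \text{(Shannon)}

H_{\alpha}(p) = \frac{1}{1-\alpha} \log_2 \sum_{i=1}^{n} p_i^{\alpha}, \quad \alpha > 0,\ \alpha \neq 1 \quad \text{(Rényi)}

S_q(p) = \frac{1}{q-1} \left( 1 - \sum_{i=1}^{n} p_i^{q} \right), \quad q > 0,\ q \neq 1 \quad \text{(Tsallis)}

Both the Rényi and the Tsallis entropies recover the Shannon entropy in the limit α → 1 (respectively q → 1), so Shannon-based attribute selection can be viewed as a special case of the other two.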
Keywords: Intrusion Detection System, Attribute Selection, Rényi Entropy, Tsallis Entropy
Figure 1. Decision tree induction for attribute selection
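To make the attribute selection procedure illustrated in Figure 1 concrete, the minimal Python sketch below ranks discrete attributes by the information gain they provide about the class label, with the entropy measure (Shannon, Rényi, or Tsallis) supplied as a parameter. This is an illustration only, not the authors' implementation: the function names, the index values α = q = 2, and the toy traffic records are assumptions made for the example.

import numpy as np

# Entropy measures over a discrete probability vector p (illustrative sketch,
# not the paper's code); zero-probability entries are dropped before use.
def shannon(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def renyi(p, alpha=2.0):
    p = p[p > 0]
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def tsallis(p, q=2.0):
    p = p[p > 0]
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def class_distribution(labels):
    # Relative frequency of each class label.
    _, counts = np.unique(labels, return_counts=True)
    return counts / counts.sum()

def information_gain(attribute, labels, entropy=shannon):
    # Entropy of the class labels minus the weighted entropy that remains
    # after partitioning the samples by the attribute's values.
    base = entropy(class_distribution(labels))
    remainder = 0.0
    for value in np.unique(attribute):
        subset = labels[attribute == value]
        remainder += (len(subset) / len(labels)) * entropy(class_distribution(subset))
    return base - remainder

# Toy usage: rank two hypothetical attributes for separating normal and attack records.
labels = np.array(["normal", "normal", "attack", "attack", "normal", "attack"])
attributes = {
    "protocol_type": np.array(["tcp", "tcp", "udp", "udp", "tcp", "udp"]),
    "flag":          np.array(["SF", "S0", "SF", "S0", "SF", "S0"]),
}
for name, column in attributes.items():
    gain = information_gain(column, labels, entropy=lambda p: tsallis(p, q=2.0))
    print(name, round(gain, 3))

In a decision-tree learner such as C4.5, the attribute with the largest gain (or gain ratio) would be chosen as the split at each node, so the induced tree itself identifies the most informative attributes.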
Figure 2. Attribute selection schemes