Science and Technology
p-ISSN: 2163-2669 e-ISSN: 2163-2677
2023; 13(1): 6-11
doi:10.5923/j.scit.20231301.02
Received: Feb. 15, 2023; Accepted: Feb. 26, 2023; Published: Feb. 28, 2023

Enoch Anbu Arasu Ponnuswamy
USA
Correspondence to: Enoch Anbu Arasu Ponnuswamy, USA.
Copyright © 2023 The Author(s). Published by Scientific & Academic Publishing.
This work is licensed under the Creative Commons Attribution International License (CC BY): http://creativecommons.org/licenses/by/4.0/
Adversarial attacks on artificial intelligence (AI) systems have become an increasingly concerning issue in recent years. These attacks involve intentionally crafted input data to deceive or manipulate the output of the targeted AI model. Adversarial attacks have been demonstrated across a wide range of AI applications, including image and speech recognition, natural language processing, and autonomous vehicles. Such attacks can have significant impacts, from compromising the security of sensitive systems to causing physical harm in critical applications. In this article, we explore the concept of adversarial attacks on AI systems, provide examples of real-world attacks, and discuss the impacts of such attacks. We also examine some of the common defense strategies against adversarial attacks, as well as emerging methods for improving the robustness of AI models.
Keywords: Securing AI systems from adversarial threats
Cite this paper: Enoch Anbu Arasu Ponnuswamy, Securing AI Systems from Adversarial Threats, Science and Technology, Vol. 13 No. 1, 2023, pp. 6-11. doi: 10.5923/j.scit.20231301.02.
Each method has its strengths and weaknesses, and the choice of technique depends on the specific use case and the desired outcome of the attack. Regardless of the method used, however, the goal of adversarial AI attacks is to trick the AI model into making mistakes, which can have serious consequences in fields such as finance, healthcare, and national security.

1. Fast Gradient Sign Method (FGSM)

The Fast Gradient Sign Method (FGSM) is a popular and effective technique for adversarial attacks on deep learning models. It is a one-step, gradient-based attack introduced in the paper "Explaining and Harnessing Adversarial Examples" by Ian Goodfellow et al. FGSM is a white-box attack, meaning the attacker has complete knowledge of the model architecture and parameters.

The goal of FGSM is to add a small perturbation to the input data so that the model's prediction changes to an incorrect label. The method achieves this by computing the gradient of the loss function with respect to the input data and using this gradient to create an adversarial example. The sign of the gradient determines the direction in which the input should be perturbed to maximize the loss, and the magnitude of the perturbation is controlled by a hyperparameter called the step size.

The FGSM attack can be formulated as follows: given an input x, its corresponding label y, a deep learning model f(x), and a loss function L(y, f(x)), the adversarial example x' is calculated as

x' = x + epsilon * sign( grad_x L(y, f(x)) )

where epsilon is the step size and sign( grad_x L(y, f(x)) ) is the sign of the gradient of the loss with respect to the input x.

FGSM is highly effective at fooling deep learning models. It is fast and easy to implement, making it a popular choice for both researchers and attackers. However, it only considers a single perturbation step and does not account for the non-linearity of deep learning models. To overcome these limitations, researchers have proposed variations such as the Basic Iterative Method (BIM) and the Projected Gradient Descent (PGD) method. FGSM nonetheless remains a powerful tool in adversarial machine learning: it provides a simple and effective way to create adversarial examples and has motivated further research into more sophisticated and robust attacks.
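To make the update rule above concrete, the following is a minimal sketch of an FGSM step in PyTorch. The classifier, the cross-entropy loss, and the epsilon value are illustrative assumptions; any differentiable model and task-appropriate loss could be substituted.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """One-step FGSM sketch: x' = x + epsilon * sign(grad_x L(y, f(x)))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # L(y, f(x))
    loss.backward()                       # populates x.grad with grad_x L
    x_adv = x + epsilon * x.grad.sign()   # perturb in the sign direction
    return x_adv.clamp(0, 1).detach()     # keep the input in a valid range
```

A larger epsilon makes misclassification more likely but also makes the perturbation easier for a human to notice.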
2. Basic Iterative Method (BIM)

The Basic Iterative Method (BIM) is a widely used technique in adversarial AI attacks. The goal of BIM is to modify the inputs to a machine learning model in such a way that the model's predictions are perturbed. This is achieved by making small modifications to the inputs over multiple iterations until the desired level of misclassification is reached.

BIM is an effective technique for fooling deep neural networks, which are used in a variety of applications, including image classification, speech recognition, and natural language processing. Using BIM, an attacker can cause a model to misclassify inputs with high confidence, even when the modifications are small and imperceptible to humans.

The basic idea behind BIM is to repeat the FGSM update over several small steps, moving the input a little further in the direction that increases the model's loss at each iteration. The number of iterations and the step size can be adjusted to control the magnitude of the perturbation and the rate of convergence. BIM is a powerful tool for adversarial AI attacks and can cause deep neural networks to make incorrect predictions with high confidence; it is important to be aware of its security implications and to take measures to protect against this type of attack.

3. Projected Gradient Descent (PGD)

Projected Gradient Descent (PGD) is a widely used technique in adversarial machine learning. It is a gradient-based attack that perturbs the inputs of a machine learning model to cause misclassification. PGD works by iteratively modifying the inputs in the direction of the gradient of the loss function so as to maximize the loss, projecting the perturbed input back into an allowed region around the original input after each step.

PGD is a powerful attack that can cause deep neural networks to make incorrect predictions with high confidence. It is also more effective than single-step attacks such as the Fast Gradient Sign Method (FGSM), because the iterative procedure gives finer control over the magnitude of the perturbations, making it easier to find effective perturbations for a specific target model and target class.

The PGD algorithm is relatively simple to implement and can be applied to a wide range of machine learning models, including image classifiers, speech recognition systems, and natural language processing models. The success of PGD attacks depends largely on the perturbation size and the number of iterations, which can be tuned to balance the strength of the attack against the imperceptibility of the perturbations. Because PGD is so effective, it is important to be aware of its security implications and to take countermeasures, such as regularly evaluating a model's robustness and using adversarial training to harden the model against PGD attacks.
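The sketch below shows the iterative loop shared by BIM and PGD; PGD adds the random start and the projection back into the epsilon-ball around the original input. The epsilon, step size, and iteration count are illustrative assumptions rather than recommended values.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=0.03, alpha=0.007, steps=10):
    """Iterative gradient-sign attack sketch with projection (PGD)."""
    # Random start inside the epsilon-ball (omit this line for plain BIM).
    x_adv = (x.clone().detach()
             + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()                    # small FGSM-style step
        x_adv = torch.max(torch.min(x_adv, x + epsilon), x - epsilon)   # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)                                       # keep a valid input range
    return x_adv.detach()
```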
4. Generative Adversarial Networks (GANs)

Generative Adversarial Networks (GANs) are a type of deep learning model that has become increasingly popular in recent years. They are designed to generate new, synthetic data samples that resemble a given training dataset. A GAN consists of two components: a generator and a discriminator. The generator produces synthetic samples, while the discriminator is trained to distinguish the synthetic samples from the real samples in the training dataset. The two components are trained together in a competition, with the generator trying to produce samples that are indistinguishable from the real ones and the discriminator trying to identify the synthetic samples correctly.

Training continues until the generator produces samples of sufficient quality to fool the discriminator. At that point, the generator can be used to create new samples that resemble the training data. These synthetic samples can be used for a wide range of purposes, including data augmentation, imputing missing data, and generating new images, videos, audio, or text. GANs have been used for applications such as image synthesis, style transfer, and super-resolution, as well as for generating realistic-looking images and videos and creating synthetic training data for other machine learning models.

GANs are not without their challenges, however. One of the biggest is training stability, as the generator and discriminator can easily get stuck in a local optimum. Another is mode collapse, where the generator produces only a limited set of outputs rather than a diverse range of samples. Despite these challenges, GANs are a powerful and flexible approach to generating synthetic data, they remain an important area of research, and they have the potential to influence many areas of computer science and engineering.

5. Evolutionary Algorithms

Evolutionary algorithms are optimization algorithms inspired by natural selection and evolution. They are designed to solve complex optimization problems by mimicking the evolutionary process. An evolutionary algorithm begins with a population of candidate solutions, also known as individuals or chromosomes. These solutions are evaluated according to their fitness, that is, how well they meet the objectives of the optimization problem. The fittest individuals are then selected and recombined to form new individuals, with the aim of producing even fitter solutions.

This process of selection, recombination, and mutation continues over multiple generations, with the fittest individuals being selected and recombined in each generation. The goal is to find the global optimum, or the best possible solution, by continuously refining the population. There are several types of evolutionary algorithms, including genetic algorithms, differential evolution, and particle swarm optimization; they differ in how they select and recombine individuals and in how they apply mutations. Evolutionary algorithms have been applied to a wide range of optimization problems, including multi-objective, constrained, and combinatorial optimization. They are particularly useful when the objective function is non-linear, complex, or unknown, since they make no assumptions about its structure; in the adversarial setting this makes them attractive for black-box attacks, where the attacker can only query the model's outputs.

Despite their success, evolutionary algorithms have limitations. They are often slow and computationally expensive, especially for large-scale problems, and they can be sensitive to the choice of parameters, such as population size and mutation rate, which affects the quality of the solutions found. They are nonetheless a powerful optimization method that has been applied successfully to many complex problems, and they are best suited to cases where other optimization methods are not feasible, as illustrated in the sketch below.
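As an illustration of how an evolutionary search could be turned against a model, the following sketch evolves a small perturbation using only the model's predicted probabilities (a black-box setting). The predict_proba callable, fitness definition, population size, and mutation scale are illustrative assumptions, not values from this paper.

```python
import numpy as np

def evolve_perturbation(predict_proba, x, true_label,
                        pop_size=20, generations=50,
                        epsilon=0.05, mutation_scale=0.01, seed=0):
    """Toy black-box evolutionary attack sketch.

    predict_proba -- callable returning class probabilities for one input (placeholder)
    x             -- original input as a NumPy array
    true_label    -- index of the correct class; fitness rewards lowering its probability
    """
    rng = np.random.default_rng(seed)
    # Initial population: random perturbations inside the epsilon ball.
    population = rng.uniform(-epsilon, epsilon, size=(pop_size,) + x.shape)

    def fitness(delta):
        # A fitter perturbation drives the true-class probability lower.
        return -predict_proba(np.clip(x + delta, 0, 1))[true_label]

    for _ in range(generations):
        scores = np.array([fitness(d) for d in population])
        parents = population[np.argsort(scores)[-pop_size // 2:]]          # keep the fittest half
        children = parents + rng.normal(0, mutation_scale, size=parents.shape)
        children = np.clip(children, -epsilon, epsilon)                     # stay inside the eps ball
        population = np.concatenate([parents, children])

    best = population[np.argmax([fitness(d) for d in population])]
    return np.clip(x + best, 0, 1)
```

Because only model outputs are queried, no gradient access is needed; the cost is the large number of model evaluations per generation.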
6. Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)

Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) is an optimization algorithm for large-scale optimization problems. It is an iterative method that seeks the optimal solution by minimizing an objective function. L-BFGS is a quasi-Newton method, meaning that it uses an approximation of the Hessian matrix to determine the search direction in each iteration. Unlike other quasi-Newton methods, L-BFGS stores only a limited amount of information about this approximation, hence the name "limited memory". This allows it to handle large-scale problems that would not otherwise fit in memory, making it an efficient and effective optimization method for many real-world applications. In the adversarial setting, box-constrained L-BFGS was one of the earliest methods used to craft adversarial examples, by searching for the smallest perturbation that causes the model to misclassify the input.

The L-BFGS algorithm uses gradient information to iteratively improve its approximation of the Hessian matrix. The approximation is updated using gradients from the current and previous iterations, the search direction is computed from this approximation, and the solution is updated accordingly. The process continues until convergence or until a stopping criterion is met.

One of the main advantages of L-BFGS is its efficiency and low memory requirement: the limited-memory approach allows it to handle large-scale problems, making it a popular choice for many real-world applications. It also copes reasonably well with noisy gradient information and can still converge to a high-quality solution. Despite these strengths, L-BFGS has limitations. It may not converge to the global optimum, its performance can be sensitive to choices such as the initial guess and the stopping criterion, and it is not always the best method for problems with sparse or non-smooth gradients. Overall, L-BFGS is a powerful algorithm that has proven effective and efficient for large-scale optimization, and it is best suited to problems where other optimization methods are not feasible.

7. DeepFool

DeepFool is a highly effective attack that produces adversarial examples with high misclassification rates using only small perturbations. This untargeted adversarial sample generation technique minimizes the Euclidean distance between the original and perturbed samples.

8. Jacobian-based Saliency Map Attack (JSMA)

JSMA, by contrast, is a feature-selection-based method in which the attacker minimizes the number of features that must be modified to cause misclassification. In this attack, a flat perturbation is added iteratively to the features, in decreasing order of their saliency values.

These are just a few examples of adversarial AI techniques, and it is important for businesses to understand the methods used and how to defend against them. Protecting AI systems from adversarial threats requires a combination of techniques, such as adversarial training, model ensembles, and robust optimization; a brief sketch of adversarial training is given below.
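To close with the defensive side, the sketch below shows one way adversarial training might be wired into a standard PyTorch training loop: each batch is augmented with FGSM-perturbed copies crafted against the current model. The 50/50 loss mixing and the epsilon value are illustrative assumptions rather than values prescribed in this paper.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One training step on a mixture of clean and FGSM-perturbed examples."""
    # Craft adversarial examples against the current model parameters.
    x_pert = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_pert), y).backward()
    x_adv = (x_pert + epsilon * x_pert.grad.sign()).clamp(0, 1).detach()

    # Train on clean and adversarial inputs together.
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Repeating this step over the training set encourages the model to classify both clean inputs and nearby perturbed inputs correctly, which is the basic idea behind adversarial training as a defense.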