## Introduction

The Cheetah Optimizer has been implemented in MATLAB and is available on the MATLAB Central File Exchange. It offers a range of functions and has been meticulously designed to tackle complex optimization problems across multiple domains.

Recently, while working on an ML project, I wanted to improve the performance of an SVM model. (For the basic concepts of SVM, see my introductory article Understanding SVM, published on DEV Community.) I was looking for a recent metaheuristic algorithm that could help tune the model, and I came across the Cheetah Optimizer on MATLAB Central. Since my project is in Python and no Python version was available, I implemented it in Python myself.

You can refer to my Medium article for the basic concepts of metaheuristic algorithms.

In this article, I’ll demonstrate a Cheetah Optimizer implementation in Python. You can use any dataset of your choice; for simplicity, I have used a Diabetes dataset.

## Background

In optimization and machine learning, various optimization algorithms and techniques are used to find the best solution to a given problem. Common optimization algorithms include gradient descent, genetic algorithms, particle swarm optimization, and many others. These algorithms are applied to tasks such as training machine learning models, finding the optimal configuration of parameters, and solving complex optimization problems in various domains.

You can read more about the Cheetah Optimizer in the References section at the end of this article.

## Using the Code

Below is a step-by-step implementation of the Cheetah Optimizer in Python.

### 1. Import Required Libraries

    import numpy as np
    import pandas as pd
    from sklearn.svm import SVC  # for Support Vector Classifier
    from sklearn.metrics import accuracy_score

Many of you may already be familiar with the libraries imported above. For those who are not, here is a brief explanation.

`import pandas as pd`: This line imports the Pandas library and gives it the alias `pd`. Pandas is a powerful data manipulation and analysis library in Python, often used for handling structured data in data science and machine learning tasks.

`from sklearn.svm import SVC`: Here, we are importing the SVC (Support Vector Classifier) class from the Scikit-Learn library. Scikit-Learn is a popular Python library for machine learning and provides various tools for tasks like classification, regression, clustering, etc. SVC is a class that allows you to create and train Support Vector Machine (SVM) models for classification.

`import numpy as np`: This line imports the NumPy library and gives it the alias `np`. NumPy is another fundamental library for numerical and mathematical operations in Python. It provides support for handling arrays and matrices, which are often used in machine learning for data manipulation and mathematical operations.

`from sklearn.metrics import accuracy_score`: This line imports the `accuracy_score` function from Scikit-Learn's metrics module. `accuracy_score` is used to calculate the accuracy of a classification model's predictions by comparing them to the true labels. It's a common metric for evaluating the performance of classification models.
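As a quick illustration of what `accuracy_score` computes, here is a minimal sketch on made-up labels. The lists `y_true_demo` and `y_pred_demo` are hypothetical and have nothing to do with the Diabetes dataset.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true labels and predictions: 4 of the 5 predictions match
y_true_demo = [0, 1, 1, 0, 1]
y_pred_demo = [0, 1, 0, 0, 1]

demo_accuracy = accuracy_score(y_true_demo, y_pred_demo)
print(demo_accuracy)  # 0.8
```

The result is simply the fraction of positions where the two lists agree.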

### 2. Read Dataset

    data = pd.read_csv("diabetesdataset.csv")  # change the dataset path to your local path
    X = data.iloc[:, 2:-1].values  # Features
    y = data.iloc[:, 1].values     # Labels

`X = data.iloc[:, 2:-1].values`: This line extracts the features from the `DataFrame` `data` and stores them in a NumPy array called `X`. It uses `.iloc` to select specific columns from the `DataFrame`. In this case, it selects all rows (denoted by `:`) and the columns from the 3rd column (index 2) up to, but not including, the last column (the stop index `-1` is exclusive). These selected columns are the features for the machine learning model.

`y = data.iloc[:, 1].values`: This line extracts the labels from the `DataFrame` `data` and stores them in a NumPy array called `y`. It selects all rows (denoted by `:`) from the 2nd column (index 1) of the `DataFrame`. These selected values are the labels, or target variable, for the machine learning model.
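To make the slicing concrete, here is a small sketch on a made-up `DataFrame`. The column names and values are hypothetical and are not taken from the Diabetes dataset; only the two `.iloc` expressions match the code above.

```python
import pandas as pd

# Hypothetical toy frame: an ID column, a label column, three feature columns,
# and a trailing column that the slice 2:-1 leaves out.
toy = pd.DataFrame({
    "id":    [1, 2, 3],
    "label": [0, 1, 0],
    "f1":    [5.1, 6.2, 4.8],
    "f2":    [0.4, 0.9, 0.3],
    "f3":    [70, 82, 65],
    "last":  ["a", "b", "c"],
})

X_toy = toy.iloc[:, 2:-1].values   # columns f1, f2, f3
y_toy = toy.iloc[:, 1].values      # column "label"

print(X_toy.shape)     # (3, 3)
print(y_toy.tolist())  # [0, 1, 0]
```

If your own file has a different column layout, the two index expressions need to be adjusted accordingly.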

### 3. Set Target Variable

    y = data['CLASS']

In machine learning tasks, we typically have a `DataFrame` with multiple columns, where one column represents the target variable (the thing we want to predict) and the others represent features (the attributes used to make predictions). Here, `y` is set to the labels in the `CLASS` column, replacing the value assigned in step 2. It is the target variable used when training and evaluating the machine learning model (in our case, an SVC).

### 4. Split Data

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

`from sklearn.model_selection import train_test_split`: This line imports the `train_test_split` function from Scikit-Learn's `model_selection` module. This function is commonly used to split a dataset into training and testing sets to assess the performance of machine learning models.

`X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)`: This line splits the dataset using two inputs: `X`, an array containing the feature data, and `y`, an array holding the target variable.

The `train_test_split` function takes these arrays as input and returns four subsets:

- `X_train`: the subset of features used for training the machine learning model.
- `X_test`: the subset of features used for testing the machine learning model.
- `y_train`: the subset of labels corresponding to `X_train`, used for training the model.
- `y_test`: the subset of labels corresponding to `X_test`, used for evaluating the model's performance.

`test_size=0.2`: This specifies that 20% of the data will be reserved for testing, and the remaining 80% will be used for training.

`random_state=42`: This sets a random seed for reproducibility. Setting `random_state` ensures that the same random split is generated each time you run the code. You can use any integer value for `random_state`.
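The effect of `test_size` and `random_state` can be checked on a tiny example. This is only a sketch: `X_demo` and `y_demo` are hypothetical arrays, not the Diabetes data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0, 1] * 5)

# First split
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
# Second split with the same seed
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print(X_tr.shape, X_te.shape)       # (8, 2) (2, 2)
print(np.array_equal(X_te, X_te2))  # True
```

With 10 samples, 20% gives 2 test rows and 8 training rows, and repeating the call with the same seed returns the same rows.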

### 5. Define Objective Function

    def objective_function(params):
        # Ensure params contains two values
        if len(params) != 2:
            raise ValueError("params must contain two values: C and gamma")
        C, gamma = params
        # Create and fit an SVM model with the parameters
        model = SVC(C=C, gamma=gamma)
        model.fit(X_train, y_train)
        # Predict the test labels and compute the accuracy score
        y_pred = model.predict(X_test)
        score = accuracy_score(y_test, y_pred)
        # Return the negative score as we want to minimize it
        return -score

This objective function is used for hyperparameter tuning. In our case, we want to optimize the hyperparameters `C` and `gamma` of an SVM model. The goal is to find the values of `C` and `gamma` that give the highest accuracy on the test data or, equivalently, the lowest negative accuracy. Because the function returns the negative accuracy, a lower return value means a better model.
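Because the objective above scores each candidate on the test set, the accuracy later reported on that same test set may be optimistic. One common alternative, which is not what this article's code does, is to score candidates by cross-validation on the training data only. The sketch below assumes synthetic data from `make_classification` in place of the Diabetes training split, and `cv_objective` is a hypothetical name.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the training split (assumption: not the Diabetes data)
X_train_demo, y_train_demo = make_classification(
    n_samples=120, n_features=5, random_state=0
)

def cv_objective(params, X=X_train_demo, y=y_train_demo, folds=3):
    """Return the negative mean cross-validated accuracy for (C, gamma)."""
    if len(params) != 2:
        raise ValueError("params must contain two values: C and gamma")
    C, gamma = params
    model = SVC(C=C, gamma=gamma)
    fold_scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
    # Negative mean accuracy, so that lower is better (same convention as above)
    return -fold_scores.mean()

cv_value = cv_objective([1.0, 0.1])
print(-1.0 <= cv_value <= 0.0)  # True
```

Such a function could be passed to the optimizer in place of `objective_function`, at the cost of fitting several models per candidate.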

### 6. Initialize Cheetah

    # Define the bounds of the parameters as a numpy array
    bounds = np.array([[0.01, 100], [0.0001, 10]])
    n_cheetahs = 30
    n_iterations = 50
    # Initialize the cheetah positions randomly within the bounds
    cheetahs = np.random.uniform(bounds[:, 0], bounds[:, 1], size=(n_cheetahs, len(bounds)))
    # Initialize the best position and score
    best_position = None
    best_score = -np.inf

`cheetahs` is a NumPy array of shape (`n_cheetahs`, `len(bounds)`). It holds the positions of the cheetahs, initialized randomly within the specified parameter bounds.

`np.random.uniform(bounds[:, 0], bounds[:, 1], size=(n_cheetahs, len(bounds)))` generates random values for each cheetah's position within the specified bounds. Each row in the array corresponds to a cheetah, and each column corresponds to a parameter (`C` in the first column, `gamma` in the second).

`best_position` is initially set to `None`, indicating that the best position found so far is unknown.

`best_score` is initially set to negative infinity (`-np.inf`) as a placeholder. It is overwritten during the first iteration of the optimization loop, once the first set of scores has been computed.
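The initialization relies on NumPy broadcasting the per-parameter lower and upper limits across all rows. A minimal sketch of that behaviour, using the same bounds but a hypothetical population of five and a seed that is only there to make the demo repeatable:

```python
import numpy as np

demo_bounds = np.array([[0.01, 100], [0.0001, 10]])  # [min, max] for C and gamma
np.random.seed(0)  # seed only so the demo is repeatable

sample = np.random.uniform(
    demo_bounds[:, 0], demo_bounds[:, 1], size=(5, len(demo_bounds))
)

print(sample.shape)  # (5, 2)
# Column 0 stays within the C range, column 1 within the gamma range
in_range = np.all((sample >= demo_bounds[:, 0]) & (sample <= demo_bounds[:, 1]))
print(in_range)      # True
```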

### 7. Set Constants

    # The probability of choosing searching or sitting-and-waiting strategy
    alpha = 0.5
    # The probability of choosing attacking strategy
    beta = 0.5
    # Half-width of the random perturbation applied when leaving the prey and going back home
    delta = 0.5
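How these constants steer the choice of strategy in the optimization loop can be checked with a small simulation. The sketch below is illustrative only: `pick_strategy` is a hypothetical helper that mirrors the branch order of the loop in step 8 (`r < alpha`, then `r < alpha + beta`, otherwise going back home), and the draws come from a seeded generator so the run is repeatable.

```python
import numpy as np

def pick_strategy(r, s, alpha=0.5, beta=0.5):
    """Map two random draws in [0, 1) to a strategy name."""
    if r < alpha:
        return "search" if s < 0.5 else "sit_and_wait"
    elif r < alpha + beta:
        return "attack"
    return "go_home"

rng = np.random.default_rng(0)
draws = rng.random((10_000, 2))

counts = {}
for r, s in draws:
    name = pick_strategy(r, s)
    counts[name] = counts.get(name, 0) + 1
print(counts)

# With alpha + beta = 1.0 the "go_home" branch cannot fire, because r < 1 always holds
print("go_home" in counts)  # False

# A smaller beta (a hypothetical setting) leaves room for it
print(pick_strategy(0.95, 0.0, alpha=0.5, beta=0.3))  # go_home
```

With the values used in this article, roughly half of the moves are attacks and the rest are split between searching and sitting-and-waiting.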

### 8. Execute Optimization Function

    # Run the optimization loop
    for i in range(n_iterations):
        # Evaluate the objective function for each cheetah
        scores = np.array([objective_function(cheetah) for cheetah in cheetahs])

This loop repeatedly evaluates the objective function for each cheetah for a specified number of iterations (`n_iterations`). The scores obtained at each iteration guide the algorithm in updating the positions of the cheetahs, so that better solutions can be found as the optimization progresses.

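        # Update the best position and score if needed.
        # objective_function returns the negative accuracy, so the lowest score is the best.
        # best_position is None only before the first evaluation.
        if best_position is None or np.min(scores) < best_score:
            best_score = np.min(scores)
            best_position = cheetahs[np.argmin(scores)].copy()

This block runs inside the loop. Because the objective function returns the negative accuracy, the best cheetah in an iteration is the one with the lowest score. The code checks whether the minimum score obtained in the current iteration (`np.min(scores)`) is lower than the previous best score (`best_score`), or whether no best position has been recorded yet. If so, it updates `best_score` to the new minimum and `best_position` to the position of the cheetah that achieved it.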

        # Print the current iteration and best score
        print(f'Iteration {i+1}: Best score = {-best_score}')
        # Create a new array to store the updated positions
        new_cheetahs = np.zeros_like(cheetahs)
        # Loop through each cheetah
        for j in range(n_cheetahs):
            # Generate a random number between 0 and 1
            r = np.random.rand()
            # If r < alpha, choose searching or sitting-and-waiting strategy
            if r < alpha:
                # Generate another random number between 0 and 1
                s = np.random.rand()
                # If s < 0.5, choose searching strategy
                if s < 0.5:
                    # Choose a random function from a pool of functions
                    f = np.random.choice([np.sin, np.cos, np.tan])
                    # Update the position by following a leader using the function.
                    # The leader is the cheetah with the lowest (best) score.
                    leader = cheetahs[np.argmin(scores)]
                    new_cheetahs[j] = leader + f((i+1) / n_iterations) * (leader - cheetahs[j])
                # Else, choose sitting-and-waiting strategy
                else:
                    # Update the position by adding a random perturbation
                    new_cheetahs[j] = cheetahs[j] + np.random.uniform(-0.01, 0.01, size=len(bounds))
            # Else if r < alpha + beta, choose attacking strategy
            elif r < alpha + beta:
                # Update the position by moving towards the best position with a random factor
                new_cheetahs[j] = cheetahs[j] + np.random.rand() * (best_position - cheetahs[j])
            # Else, choose leaving-the-prey-and-going-back-home strategy
            else:
                # Move to the position of the first cheetah
                # with some random perturbation
                new_cheetahs[j] = cheetahs[0] + np.random.uniform(-delta, delta, size=len(bounds))

Select a strategy based on random probability (`alpha` and `beta`): for each cheetah, the code generates a random number `r` between 0 and 1. Depending on the value of `r`, it selects one of three strategies:

**Searching or Sitting-and-Waiting Strategy**: If `r < alpha`, it generates a second random number `s` between 0 and 1. If `s < 0.5`, it chooses the searching strategy and updates the position relative to a leader (the cheetah with the best score in the current iteration), scaled by a randomly chosen function (`sin`, `cos`, or `tan`) of the iteration progress. Otherwise, it chooses the sitting-and-waiting strategy and adds a small random perturbation to the current position.

**Attacking Strategy**: If `alpha <= r < alpha + beta`, it chooses the attacking strategy and moves toward the best position found so far (`best_position`) by a random factor.

**Leaving-the-Prey-and-Going-Back-Home Strategy**: If `r >= alpha + beta`, it chooses the strategy of leaving the prey and going back home. It moves the cheetah to the position of the first cheetah (`cheetahs[0]`) plus a random offset between `-delta` and `delta`. Note that with `alpha = 0.5` and `beta = 0.5`, `alpha + beta` equals 1, so this branch is never reached; it only becomes active if `alpha + beta` is set below 1.

            # Clip the position to respect the bounds
            new_cheetahs[j] = np.clip(new_cheetahs[j], bounds[:, 0], bounds[:, 1])

After updating a position, the code clips it so that it remains within the specified parameter bounds (`bounds[:, 0]` for the lower limits and `bounds[:, 1]` for the upper limits).
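As a small illustration of per-parameter clipping, the sketch below passes a hypothetical out-of-range candidate through `np.clip` with the same bounds. The variable names are made up for the demo.

```python
import numpy as np

clip_bounds = np.array([[0.01, 100], [0.0001, 10]])
candidate = np.array([150.0, -0.5])  # C too large, gamma negative

clipped = np.clip(candidate, clip_bounds[:, 0], clip_bounds[:, 1])
print(clipped.tolist())  # [100.0, 0.0001]
```

Each coordinate is pulled back to the nearest limit of its own range, which keeps `C` and `gamma` valid inputs for `SVC`.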

        # Replace the old positions with the new ones
        cheetahs = new_cheetahs

Finally, at the end of each iteration, the code replaces the old positions of the cheetahs with the newly calculated positions stored in the `new_cheetahs` array.

### 9. Finding Best Position and Score

    # Print the final best position and score
    print(f'Final best position: C = {best_position[0]}, gamma = {best_position[1]}')
    print(f'Final best score: {-best_score}')
    cValue = best_position[0]
    gammaValue = best_position[1]
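To try the loop end to end without a dataset, the steps above can be wrapped into one function and pointed at a simple test function. This is a sketch of the simplified update rules described in this article, not a reference implementation of the published Cheetah Optimizer. The names `cheetah_search` and `sphere` are hypothetical, the function minimizes its objective, and it uses a seeded `np.random.default_rng` generator instead of the global random state.

```python
import numpy as np

def cheetah_search(objective, bounds, n_cheetahs=30, n_iterations=50,
                   alpha=0.5, beta=0.5, delta=0.5, seed=0):
    """Minimize `objective` with the update rules described in this article."""
    rng = np.random.default_rng(seed)
    low, high = bounds[:, 0], bounds[:, 1]
    cheetahs = rng.uniform(low, high, size=(n_cheetahs, len(bounds)))
    best_position, best_score = None, np.inf

    for i in range(n_iterations):
        scores = np.array([objective(c) for c in cheetahs])
        # Lower is better: keep the lowest score seen so far
        if scores.min() < best_score:
            best_score = scores.min()
            best_position = cheetahs[scores.argmin()].copy()

        leader = cheetahs[scores.argmin()]
        new_cheetahs = np.zeros_like(cheetahs)
        for j in range(n_cheetahs):
            r = rng.random()
            if r < alpha:
                if rng.random() < 0.5:   # searching
                    f = [np.sin, np.cos, np.tan][rng.integers(3)]
                    step = f((i + 1) / n_iterations)
                    new_cheetahs[j] = leader + step * (leader - cheetahs[j])
                else:                    # sitting and waiting
                    new_cheetahs[j] = cheetahs[j] + rng.uniform(-0.01, 0.01, size=len(bounds))
            elif r < alpha + beta:       # attacking
                new_cheetahs[j] = cheetahs[j] + rng.random() * (best_position - cheetahs[j])
            else:                        # going back home
                new_cheetahs[j] = cheetahs[0] + rng.uniform(-delta, delta, size=len(bounds))
            new_cheetahs[j] = np.clip(new_cheetahs[j], low, high)
        cheetahs = new_cheetahs

    return best_position, best_score

def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

toy_bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])
pos, score = cheetah_search(sphere, toy_bounds)
print(pos, score)
```

To use it with the SVM objective from step 5, `objective_function` and the `bounds` array from step 6 could be passed in place of `sphere` and `toy_bounds`.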

### 10. Training SVM Model

    # Import the SVC class from sklearn.svm
    from sklearn.svm import SVC
    # Create an SVM model with the optimal C and gamma values
    model = SVC(C=cValue, gamma=gammaValue)

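This line creates an instance of the SVC class and configures it with specific hyperparameters:

`C`: The regularization parameter, set to `cValue`. It controls the trade-off between a wide margin and a low training classification error: a small `C` tolerates more misclassified training points in exchange for a wider margin, while a large `C` penalizes misclassification more heavily and yields a narrower margin.

`gamma`: The kernel coefficient, set to `gammaValue`. It determines how far the influence of a single training example reaches, and therefore the shape of the decision boundary. A small gamma gives a smoother, simpler boundary, while a large gamma lets the boundary bend closely around individual training points, which can lead to overfitting.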
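The influence of `gamma` can be explored with a short experiment. The sketch below assumes synthetic two-class data from `make_moons` rather than the Diabetes dataset, keeps `C` fixed at 1.0, and records the training accuracy and the number of support vectors for three hypothetical gamma values. Exact numbers depend on the data, so none are quoted here.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Synthetic two-class data (assumption: stands in for real features)
X_moons, y_moons = make_moons(n_samples=200, noise=0.3, random_state=0)

gamma_results = {}
for gamma in (0.01, 1.0, 100.0):
    clf = SVC(C=1.0, gamma=gamma).fit(X_moons, y_moons)
    train_acc = clf.score(X_moons, y_moons)      # accuracy on the training data
    n_sv = int(clf.n_support_.sum())             # total number of support vectors
    gamma_results[gamma] = (train_acc, n_sv)
    print(f"gamma={gamma}: training accuracy={train_acc:.2f}, support vectors={n_sv}")
```

A large gamma typically fits the training set more tightly, which is why the optimizer's score should be measured on data the model has not been trained on.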

    # Fit the model on the training data
    model.fit(X_train, y_train)

This line trains (fits) the SVM model on the training data. The training data consists of feature vectors stored in `X_train` and their corresponding labels stored in `y_train`. The `fit` method adjusts the model's parameters to learn a decision boundary that best separates the classes in the training data.

### 11. Check for Model Accuracy

    # Import the accuracy_score function from sklearn.metrics
    from sklearn.metrics import accuracy_score
    # Predict the labels of the test data
    y_pred = model.predict(X_test)
    # Calculate the accuracy of the model
    accuracy = accuracy_score(y_test, y_pred)
    # Print the accuracy
    print(f"The accuracy of the model is {accuracy:.2f}")
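Accuracy is a single number, so it can help to look at where the errors fall. As an optional extra, and only as a sketch on hypothetical labels, Scikit-Learn's `confusion_matrix` breaks the predictions down by true class (rows) and predicted class (columns).

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: one sample of class 1 is predicted as class 0
true_labels = [0, 1, 1, 0, 1]
pred_labels = [0, 1, 0, 0, 1]

cm = confusion_matrix(true_labels, pred_labels)
print(cm.tolist())  # [[2, 0], [1, 2]]
```

In the article's setting, the same call could be made with `y_test` and `y_pred`.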

## Conclusion

Thus, we have implemented the Cheetah Optimizer algorithm in Python. Using it, we searched for good (ideally near-optimal) values of the `C` and `gamma` parameters, which an SVM model needs in order to make accurate predictions.

## References

You can learn more about Cheetah Optimizer through the below articles:

- The Cheetah Optimizer (CO) - https://optim-app.com/projects/co
- Cheetah Optimizer - Seyedali Mirjalili (2023). Cheetah Optimizer (https://www.mathworks.com/matlabcentral/fileexchange/130404-cheetah-optimizer), MATLAB Central File Exchange. Retrieved September 30, 2023.

You can download the source code here.

## History

- 30th September, 2023: Initial version