Dataset and ML model: In this example, we train a binary classifier for cyber-attack classification on a popular network traffic dataset, using an XGBoost model.
Problem: The cyber-attack classification model performs well initially, but over time the attackers catch up and change the nature of their attacks, which causes the model's predictions to go wrong.
Solution: Use the UpTrain framework to identify the drift in model predictions (aka concept drift).
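The snippets below assume the following imports and that the dataset has already been loaded into a feature matrix X and a label array y; this is a minimal setup sketch (the data-loading step itself is omitted here):

```python
# Minimal setup sketch: imports used by the snippets below.
# Assumes X (features) and y (binary attack labels) have already been loaded as numpy arrays.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier
import uptrain
```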
Divide the data into training and test sets
We use the first 10% of the data to train the model and the remaining 90% to evaluate it in production.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.1, test_size=0.9, shuffle=False)
Step 1: Train our XGBoost Classifier
# Train the XGBoost classifier with training data
classifier = XGBClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_train)
print("Training accuracy: " + str(100*accuracy_score(y_train, y_pred)))
The above code prints the following output:
Training accuracy: 100.0
Woah! 😲🔥 The training accuracy is 100%. Let's see how long the model lasts in production.
Identifying Concept Drift
In this example, we implement two methods to identify concept drift:
1. The popular concept drift detection algorithm for binary tasks called the Drift Detection Method (DDM), which is implemented as a part of the UpTrain package.
2. A custom drift metric defined by the user below. Specifically, the user wants to monitor the difference between the model's accuracy on its first 200 predictions and on its most recent 200 predictions. This way, they can quickly identify a sudden degradation in model performance.
Step 2: Define a custom monitor on the initial and most recent performance of the model
"""
Defining a custom drift metric where
the user just want to check if accuracy
drops beyond a threshold.
"""
def custom_initialize_func(self):
    self.initial_acc = None
    self.acc_arr = []
    self.count = 0
    self.thres = 0.02
    self.window_size = 200
    self.is_drift_detected = False

def custom_check_func(self, inputs, outputs, gts=None, extra_args={}):
    batch_size = len(extra_args["id"])
    self.count += batch_size
    self.acc_arr.extend(list(np.equal(gts, outputs)))

    # Calculate initial performance of the model on the first 200 points
    if (self.count >= self.window_size) and (self.initial_acc is None):
        self.initial_acc = sum(self.acc_arr[0:self.window_size])/self.window_size

    # Calculate the most recent accuracy and log it to the dashboard
    if self.initial_acc is not None:
        for i in range(self.count - batch_size, self.count, self.window_size):
            # Calculate the most recent accuracy
            recent_acc = sum(self.acc_arr[i:i+self.window_size])/self.window_size

            # Log to the UpTrain dashboard
            self.log_handler.add_scalars('custom_metrics', {
                'initial_acc': self.initial_acc,
                'recent_acc': recent_acc,
            }, i, self.dashboard_name)

            # Print a message when recent model performance goes down
            if (self.initial_acc - recent_acc > self.thres) and (not self.is_drift_detected):
                print("Concept drift detected with custom metric at time: ", i)
                self.is_drift_detected = True
Step 3: Define the list of checks to perform on the model
Here, we have two checks: the concept drift check with the DDM algorithm and the customized check defined above.
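A minimal sketch of this checks list is shown below. The key names follow UpTrain's pre-1.0 config-style API (uptrain.Monitor.CONCEPT_DRIFT, uptrain.DataDriftAlgo.DDM, uptrain.Monitor.CUSTOM_MONITOR) and may differ across versions, so treat this as an assumption rather than a definitive configuration:

```python
# Sketch of the checks list; key names assume UpTrain's pre-1.0 config API
checks = [
    {
        # Concept drift check using the DDM algorithm shipped with UpTrain
        "type": uptrain.Monitor.CONCEPT_DRIFT,
        "algorithm": uptrain.DataDriftAlgo.DDM,
    },
    {
        # Custom monitor comparing initial vs. most recent accuracy (Step 2)
        "type": uptrain.Monitor.CUSTOM_MONITOR,
        "initialize_func": custom_initialize_func,
        "check_func": custom_check_func,
        "need_gt": True,
    },
]
```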
Step 4: Define config and initialize the UpTrain framework
cfg = {
    # Checks to identify concept drift
    "checks": checks,
    # Folder that stores data logged by UpTrain
    "retraining_folder": 'uptrain_smart_data',
    # Enable streamlit logging
    # Note: Requires streamlit to be installed
    "st_logging": True,
}
# Initialize the UpTrain framework
framework = uptrain.Framework(cfg)
Step 5: Deploy the model in production and wait for alerts!
batch_size = 10000
all_ids = []
for i in range(int(len(X_test)/batch_size)):
    # Do model prediction
    inputs = {'data': {"feats": X_test[i*batch_size:(i+1)*batch_size]}}
    preds = classifier.predict(inputs['data']["feats"])

    # Log model inputs and outputs to monitor concept drift
    ids = framework.log(inputs=inputs, outputs=preds)

    # Attach ground truth to corresponding predictions
    # in the UpTrain framework and identify concept drift
    ground_truth = y_test[i*batch_size:(i+1)*batch_size]
    framework.log(identifiers=ids, gts=ground_truth)
The console prints a message whenever drift is detected:
Drift detected with DDM at time: 111298
Concept drift detected with custom metric at time: 111000
As can be noted from the above, both of our drift monitors detect a drift around the timestamp of 111k.
Verification of drifts with the UpTrain dashboard
The UpTrain framework automatically logs important metrics such as accuracy so the user can observe the performance of their models. The dashboard is currently integrated with Streamlit and is launched automatically if st_logging is enabled in the config.
Accuracy versus num_predictions
The following is a screenshot of average accuracy versus time from the dashboard. We can observe a sharp drop in accuracy around the timestamp of 111k, which is also where our drift monitors flagged concept drift.
Custom Monitor
Finally, users can also plot the custom metrics we defined earlier, which in this case are the initial accuracy of the model and the most recent accuracy.
Observe how the most recent accuracy of the model is far lower than the initial accuracy, implying that the attackers have learned to fool the model.