Quickstart Tutorial
Monitoring your first ML model with UpTrain
In this quickstart, we consider the binary classification task of determining a person's orientation while exercising. That is, given the locations of 17 key-points of the body, such as the nose, shoulders, wrists, hips and ankles, the model predicts whether the person is in a horizontal (see image 1 below) or a vertical (see image 2 below) position.
Input: A 34-dimensional vector containing the x and y positions of the 17 key-points.
Output: Orientation (horizontal or vertical).
In this example, we will see how we can use the UpTrain package to identify data drift and out-of-distribution cases on real-world data.
Data Type Structure
Let's look at the training data features and visualise some of the training samples. Here, `id` is the training sample id, `gt` is the corresponding ground truth, and the rest of the features are the locations of the key-points of the human body.
| | id | gt | key-point coordinates (truncated) |
|---|---|---|---|
| 0 | 18100306191 | 0 | 333.109946, 76.161688, 338.565022, 71.526141, 328.198963, 72.194832, 345.656490, ..., 313.925120, 186.053571, 335.013894, 253.606162, 309.011228, 249.226721, 333.654654, 311.513965, 311.760718, 294.100708 |
| 1 | 12100004003 | 1 | 373.043835, 207.934236, 378.278397, 205.678759, 373.341256, 206.135385, 380.165081, ..., 326.157557, 227.332505, 351.341468, 228.657224, 328.581103, 226.218648, 340.983916, 240.702033, 327.240044, 241.339998 |
| 2 | 17100400995 | 0 | 289.116021, 218.502992, 294.331203, 212.576996, 284.066039, 212.259060, 301.216267, ..., 276.756049, 255.008508, 345.230291, 273.285718, 237.498075, 272.014232, 349.413545, 315.731031, 237.181977, 317.784665 |
| 3 | 18100102279 | 0 | 320.897998, 71.873468, 325.167727, 67.468033, 317.188621, 67.689969, 329.814255, ..., 297.857163, 154.614716, 329.598707, 197.955873, 299.710591, 203.663761, 327.663990, 231.684086, 298.525798, 251.789836 |
| 4 | 12100500969 | 1 | 486.122761, 218.363896, 495.503172, 225.783665, 493.671955, 223.923930, 495.000279, ..., 337.222705, 236.111076, 258.045097, 167.009500, 264.609213, 167.154231, 210.604750, 260.979533, 224.932744, 251.462142 |
Visualizing some training samples for classifying human orientation
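For reference, here is a minimal sketch of how such samples could be plotted, assuming the data is loaded into a pandas DataFrame with an `id` column, a `gt` column and 34 interleaved x/y coordinate columns; the file name and column layout are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative only: assumes the training data is available as a CSV with an 'id'
# column, a 'gt' column and 34 key-point coordinate columns (x/y interleaved).
df = pd.read_csv("training_data.csv")  # hypothetical file name


def plot_sample(row: pd.Series) -> None:
    """Scatter-plot the 17 (x, y) key-points of one training sample."""
    coords = row.drop(["id", "gt"]).to_numpy(dtype=float).reshape(17, 2)
    plt.scatter(coords[:, 0], coords[:, 1])
    plt.gca().invert_yaxis()  # image coordinates: y grows downwards
    plt.title(f"id={row['id']}, gt={row['gt']}")
    plt.show()


plot_sample(df.iloc[0])
```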
The example walks through the following steps for monitoring and retraining your model:
We have defined a simple neural network comprising a fully-connected layer with ReLU activation, followed by a fully-connected layer that maps the latent features to the model outputs; a sketch is shown below. We compute the binary cross-entropy loss and use the Adam optimiser to train the model.
Note: We use PyTorch in this example, but in other examples, such as edge-case detection, we have also run UpTrain with scikit-learn and TensorFlow.
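A minimal sketch of such a network in PyTorch; the hidden size, learning rate and class name are illustrative choices, not taken from the original example.

```python
import torch
import torch.nn as nn


class OrientationClassifier(nn.Module):
    """Simple binary classifier for the 34-dimensional key-point vector."""

    def __init__(self, input_dim: int = 34, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),  # fully-connected layer
            nn.ReLU(),                         # ReLU activation
            nn.Linear(hidden_dim, 1),          # latent features -> output logit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = OrientationClassifier()
criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy on the raw logit
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```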
With the first version of this model, we observe an accuracy of 90.9% on the golden testing dataset. We will now see how we can use the UpTrain package to identify data distribution shifts, collect edge cases and retrain the model to improve its accuracy.
Next, we define a simple data drift check to identify any distribution shift between the real-world test set and the reference dataset (the training dataset in this case). To achieve this, we set 'kps' (key-points) as the input variable; the framework clusters the training dataset and checks whether the real-world test set follows a similar distribution.
Here, `type` refers to the anomaly type, which is data drift in this case. The `reference_dataset` is the training dataset, while `is_embedding` indicates whether the data on which drift is measured is in vector/embedding form. Finally, `measurable_args` define the input features (or any function of them) on which the drift is to be measured.
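The check described above could be sketched as the following dictionary. The keys mirror the description above, but the string values stand in for UpTrain's enum constants and the file name is a placeholder, so treat these identifiers as assumptions rather than the canonical API.

```python
# Sketch of the data drift check described above. Exact key and enum names may
# differ across UpTrain versions.
data_drift_check = {
    "type": "data_drift",                       # the anomaly type to monitor
    "reference_dataset": "training_data.json",  # hypothetical path to the training dataset
    "is_embedding": True,                       # key-points form a 34-d vector/embedding
    "measurable_args": {
        "type": "input_feature",
        "feature_name": "kps",                  # measure drift on the key-points input
    },
}
```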
We now attach the model training and evaluation pipelines so that the UpTrain framework can automatically retrain the model if it detects significant data drift.
We are now ready to define the UpTrain config as follows.
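Here is a hedged sketch of what that config could look like, reusing the `data_drift_check` dictionary from above. The `train_model` and `evaluate_model` hooks, the file names, the folder name and the `retrain_after` threshold are illustrative placeholders, and `uptrain.Framework` is assumed to be the entry point.

```python
import uptrain  # assumed import; the public API may differ across versions


def train_model(dataset_path):
    """Placeholder: retrain the classifier on the collected data points."""
    ...


def evaluate_model(model_path, test_dataset_path):
    """Placeholder: evaluate the retrained model on the golden testing dataset."""
    ...


cfg = {
    "checks": [data_drift_check],               # the drift check sketched earlier
    "training_args": {
        "fold_name": "retraining_dataset",      # hypothetical folder for collected drifted points
        "training_func": train_model,           # user-defined retraining hook (placeholder)
    },
    "evaluation_args": {
        "inference_func": evaluate_model,       # user-defined evaluation hook (placeholder)
        "golden_testing_dataset": "golden_test.json",  # hypothetical golden test set
    },
    "retrain_after": 250,                       # e.g. retrain once 250 drifted points are collected
}

framework = uptrain.Framework(cfg)              # assumed constructor name
```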
Ship the model to production worry-free: the UpTrain tool will identify any data drift, collect interesting data points and automatically retrain the model on them. To mimic deployment behavior, we run the model on a 'real-world test set' and log the model inputs with the UpTrain framework. The following is the pseudo-code.
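A minimal sketch of that loop, assuming the `framework.log` call and the dictionary keys shown here (both may differ across UpTrain versions), with `real_world_test_set` as an illustrative iterable of batches.

```python
import torch

# Mimic deployment: run the model on the real-world test set and log each batch
# with UpTrain so it can flag drift and collect edge cases for retraining.
# `real_world_test_set`, the keys and the log() signature are assumptions.
for batch in real_world_test_set:
    inputs = {"id": batch["id"], "kps": batch["kps"]}

    # Model inference on the key-point vectors
    preds = model(torch.tensor(batch["kps"], dtype=torch.float32))

    # Log inputs and outputs so UpTrain can check each point against the reference data
    framework.log(inputs=inputs, outputs=preds.detach().numpy())
```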
After UpTrain launched an automated retraining of the model on the points that caused the data drift, we observe that the error rate decreased by 20x.
This is what the sample logs look like:
Hurray! Our model, after retraining, performs significantly better.
Let's try to understand how UpTrain helped to improve our classification model.
Training data clusters
When the UpTrain framework is initialised, it clusters the reference dataset (i.e., the training dataset in our case). We plot the centroids and support (i.e., the number of data-points belonging to that cluster) of all 20 clusters in our training dataset.
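To make the clustering idea concrete, here is a standalone illustration using scikit-learn's KMeans on placeholder data. It shows the concept of centroids and supports only and is not UpTrain's internal implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder for the real training key-point vectors (1000 samples, 34 dims).
reference_kps = np.random.rand(1000, 34)

# Fit 20 clusters on the reference dataset, as the framework does conceptually.
kmeans = KMeans(n_clusters=20, n_init=10, random_state=0).fit(reference_kps)
centroids = kmeans.cluster_centers_                   # one 34-d centroid per cluster
supports = np.bincount(kmeans.labels_, minlength=20)  # number of points per cluster

for idx, support in enumerate(supports):
    print(f"cluster {idx}: support = {support}")
```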
Edge-case clusters
As we can see, the UpTrain framework identifies out-of-distribution data-points and collects the edge cases that are only sparsely present in the training dataset.
From the above plot, generated while monitoring the model in production, we see that data drift often occurs when the person is in a horizontal position. Specifically, cases where the person is in a push-up position are very sparse in our training dataset, causing the model predictions to go wrong for them. In the edge-case detection example, we will see how we can use this insight to define a "Pushup" signal, collect all push-up related data points and specifically retrain on them.
Apart from data drift, UpTrain has many other features, such as:
- Checking for edge cases and collecting them for automated retraining
- Verifying data integrity
- Monitoring model performance and prediction accuracy with standard statistical tools
- Writing your own custom monitors specific to your use case
To dig deeper, we recommend you check out the other examples in the "deepdive_examples" folder.