Edge-case Detection
Detect outliers in your production data
Edge case detection is a crucial aspect of monitoring machine learning models, as it allows users to identify and handle data points that fall outside the typical range of values. These edge cases can be challenging for a machine learning model to predict accurately, and UpTrain provides a robust framework for identifying and handling these cases to improve the model.
We recommend checking out the accompanying example to see edge case detection in action. In our earlier look at the data drift for this problem, we realized that pushup positions in human orientation were not part of the training data, even though the model sees them in production. Thus, in this config, we define a custom edge case that specifically catches pushup-position data points so that we can annotate them and retrain our model on them later.
Here, we have also defined a second edge-case signal, which checks whether the model confidence is below 0.9. Low-confidence data points might be misclassified, so their real ground truths can also be collected and used to retrain the model.
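The two signals described above can be sketched independently of UpTrain's own config format. The snippet below is a minimal illustration, not UpTrain's API: the field names `torso_angle_deg` and `confidence`, the 60-degree pushup heuristic, and the helper `any_of` are all hypothetical, and combining the signals with a logical OR is an assumption.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]
Signal = Callable[[Sample], bool]


def is_pushup(sample: Sample) -> bool:
    # Hypothetical heuristic: treat a torso far from vertical as a pushup pose.
    return sample["torso_angle_deg"] > 60.0


def low_confidence(sample: Sample) -> bool:
    # Flag predictions whose model confidence is below 0.9.
    return sample["confidence"] < 0.9


def any_of(*signals: Signal) -> Signal:
    # Combine signals so a sample is an edge case if any signal fires (assumed OR).
    def combined(sample: Sample) -> bool:
        return any(signal(sample) for signal in signals)

    return combined


edge_case_signal = any_of(is_pushup, low_confidence)

samples: List[Sample] = [
    {"torso_angle_deg": 10.0, "confidence": 0.97},  # ordinary sample
    {"torso_angle_deg": 80.0, "confidence": 0.95},  # pushup-like pose
    {"torso_angle_deg": 15.0, "confidence": 0.62},  # low-confidence prediction
]

# Indices of samples that should be collected for annotation and retraining.
flagged = [i for i, sample in enumerate(samples) if edge_case_signal(sample)]
```

Keeping each signal as a small predicate makes it easy to add or remove conditions without touching the rest of the monitoring logic.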
Applying an edge case monitor to the above dataset yielded edge case clusters, with their centroids and support, as shown in the image below. As we can see, the clusters close to the pushup position are the most frequent, which implies that our edge case detection technique is working as expected.
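To make "centroids and their support" concrete, the sketch below clusters flagged points with a plain k-means loop and reports how many points fall in each cluster. It is only an illustration under stated assumptions: the source does not say which clustering method UpTrain uses, the 2-D points are synthetic stand-ins for pose features, and the function names are hypothetical.

```python
import numpy as np


def farthest_first_init(points: np.ndarray, k: int) -> np.ndarray:
    # Deterministic start: take the first point, then repeatedly add the point
    # that is farthest from all centroids chosen so far.
    centroids = [points[0]]
    for _ in range(1, k):
        chosen = np.array(centroids)
        dists = np.linalg.norm(points[:, None, :] - chosen[None, :, :], axis=2)
        centroids.append(points[dists.min(axis=1).argmax()])
    return np.array(centroids, dtype=float)


def assign(points: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    # Label each point with the index of its nearest centroid.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def cluster_edge_cases(points: np.ndarray, k: int, n_iter: int = 20):
    centroids = farthest_first_init(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Support is the number of edge cases assigned to each final centroid.
    support = np.bincount(assign(points, centroids), minlength=k)
    return centroids, support


# Synthetic edge cases: a large pushup-like group and a smaller unrelated group.
rng = np.random.default_rng(0)
pushup_like = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(30, 2))
other = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(10, 2))
points = np.vstack([pushup_like, other])

centroids, support = cluster_edge_cases(points, k=2)
```

A cluster with high support near the region of interest, here the pushup-like group, suggests the signals are catching the intended kind of data point.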
Once an edge case has been identified, UpTrain allows users to flag and handle these cases separately. For example, a user may choose to exclude edge cases from the production dataset or apply a different prediction strategy to them (e.g., manual predictions). UpTrain provides flexibility in handling edge cases, allowing users to choose the approach that works best for their specific use case. Finally, users can use the automatically generated edge-case dataset to retrain and improve their models for better performance in the future.
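One way to picture this handling step is a small router that sends edge cases to a fallback strategy and sets them aside for retraining. This is a sketch under stated assumptions rather than UpTrain's implementation: `route_predictions`, the `"needs_manual_review"` placeholder, and the sample fields are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, object]


def route_predictions(
    samples: List[Sample],
    model_predict: Callable[[Sample], str],
    fallback_predict: Callable[[Sample], str],
    is_edge_case: Callable[[Sample], bool],
) -> Tuple[List[str], List[Sample]]:
    predictions: List[str] = []
    retraining_set: List[Sample] = []
    for sample in samples:
        if is_edge_case(sample):
            # Edge cases get the alternative strategy and are kept for retraining.
            predictions.append(fallback_predict(sample))
            retraining_set.append(sample)
        else:
            predictions.append(model_predict(sample))
    return predictions, retraining_set


samples: List[Sample] = [
    {"id": 1, "label": "standing", "confidence": 0.98},
    {"id": 2, "label": "lying", "confidence": 0.55},
    {"id": 3, "label": "sitting", "confidence": 0.93},
]

predictions, retraining_set = route_predictions(
    samples,
    model_predict=lambda s: str(s["label"]),
    fallback_predict=lambda s: "needs_manual_review",
    is_edge_case=lambda s: float(s["confidence"]) < 0.9,
)
```

Here the second sample is routed to manual review and stored in `retraining_set`, while the other two keep the model's prediction.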
Overall, UpTrain's edge case detection capabilities help improve machine learning models' robustness and reliability by identifying and handling challenging data points that may otherwise lead to poor performance.
UpTrain employs a combination of user-defined signals and statistical techniques, such as outlier detection, to identify edge cases. Users can define their own signals based on their domain expertise, which may include features such as the time of day or user behavior patterns. Additionally, we plan to include several built-in statistical techniques for outlier detection.
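As a rough illustration of the two kinds of detection mentioned here, the sketch below pairs a domain-specific signal based on the time of day with a simple z-score outlier check. Both are assumptions for illustration: the night-time window and function names are hypothetical, and the z-score rule is a generic statistical technique, not necessarily one UpTrain provides.

```python
from datetime import datetime

import numpy as np


def night_time_signal(timestamp: datetime) -> bool:
    # Hypothetical domain rule: requests between 22:00 and 06:00 are edge cases.
    return timestamp.hour >= 22 or timestamp.hour < 6


def zscore_outliers(values, threshold: float = 3.0) -> np.ndarray:
    # Flag values lying more than `threshold` standard deviations from the mean.
    arr = np.asarray(values, dtype=float)
    std = arr.std()
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return np.zeros(len(arr), dtype=bool)
    return np.abs(arr - arr.mean()) / std > threshold


# Twenty typical feature values followed by one extreme value.
feature_values = [10.0] * 20 + [100.0]
outlier_mask = zscore_outliers(feature_values)

is_night = night_time_signal(datetime(2023, 1, 1, 23, 30))
is_midday_night = night_time_signal(datetime(2023, 1, 1, 12, 0))
```

User-defined signals encode knowledge about when the model is likely to struggle, while a statistical check can surface unusual inputs that nobody anticipated.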