UpTrain Config
The UpTrain config is a crucial component of the UpTrain Framework and contains all the necessary information for monitoring, training, and evaluating the performance of your machine learning models.
To illustrate the definition of the config, we use the human orientation classification example from the UpTrain repository.
The config is defined by the user and can include various settings such as:
Checks: This config section specifies the checks that UpTrain should perform on the input data, including checks for data drift, edge cases, and data integrity. Users can also specify custom signals specific to their use case to monitor. Further, users can add monitors specific to their training data, or visualize high-dimensional data through dimensionality reduction to observe inherent clusters. Such monitors help with unstructured data, such as high-dimensional embeddings (common in domains like NLP and recommender systems) and image data. Below is how we can define checks for data drift and edge cases in our config (to learn how to define different checks, we recommend checking out the checks section of the docs).
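The sketch below is modeled loosely on the human orientation classification example; the `kps` feature name, the dataset path, and the `pushup_signal` function are illustrative placeholders, and enum names such as `uptrain.Monitor.DATA_DRIFT` are assumed from UpTrain's v0 API:

```python
import uptrain

# Placeholder: path to the dataset the production model was trained on
orig_training_file = "data/training_data.json"

def pushup_signal(inputs, outputs, gts=None, extra_args={}):
    # Placeholder edge-case logic; the repository example inspects body keypoints
    return [False] * len(outputs)

checks = [
    {
        # Monitor drift of an input feature against the reference dataset
        "type": uptrain.Monitor.DATA_DRIFT,
        "reference_dataset": orig_training_file,
        "measurable_args": {
            "type": uptrain.MeasurableType.INPUT_FEATURE,
            "feature_name": "kps",  # illustrative feature name (body keypoints)
        },
    },
    {
        # Collect inputs that the custom signal flags as edge cases
        "type": uptrain.Monitor.EDGE_CASE,
        "signal_formulae": uptrain.Signal("Pushup", pushup_signal),
    },
]
```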
Training pipeline: In this config section, attach your training arguments, such as annotation parameters, training functions, conditions to retrain the model, data warehouse location, etc., to enable automated model retraining.
This is what training_args looks like in the human orientation classification example:
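A hedged sketch of such a definition, modeled on the same example; the annotation file, dataset path, and `train_model` function are placeholders, and `uptrain.AnnotationMethod.MASTER_FILE` is assumed from the v0 API:

```python
def train_model(training_file, model_save_path):
    # Placeholder: retrain the model on `training_file` and save it
    # to `model_save_path`
    ...

training_args = {
    # How newly collected data points get annotated before retraining;
    # the master-file method reads labels from a pre-annotated file
    "annotation_method": {
        "method": uptrain.AnnotationMethod.MASTER_FILE,
        "args": {"master_file": "data/master_annotations.json"},  # placeholder
    },
    "training_func": train_model,
    "orig_training_file": "data/training_data.json",  # placeholder
}
```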
Evaluation pipeline: In this config section, attach your evaluation arguments, such as the inference function, the golden testing dataset, and which measures and data slices to report, to generate a comprehensive report comparing the production and retrained models. This report gives deep insight into how model performance changed due to retraining and can help you decide whether to deploy the new model or continue with the existing one.
Following is an example of the evaluation_args definition, borrowed from the human orientation classification example:
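A sketch of the evaluation arguments; the `get_accuracy` function and dataset path are illustrative placeholders:

```python
def get_accuracy(testing_file, model_save_path):
    # Placeholder: run inference with the saved model on the golden
    # testing dataset and return its accuracy
    ...

evaluation_args = {
    "inference_func": get_accuracy,
    "golden_testing_dataset": "data/golden_testing_data.json",  # placeholder
}
```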
Logging configuration: This section allows users to configure the logging settings for the UpTrain Framework. UpTrain supports visualizations through a Streamlit dashboard; users can enable Streamlit logging via the st_logging variable, which lets them monitor their models through the dashboard. The UpTrain community is working on integrating other popular dashboards into the framework.
The config also allows users to customize the UpTrain dashboard. Users can specify the dashboard layout, the metrics to be displayed, the URL and port on which the dashboard app runs, the time range for displaying the data, etc. Following is an example definition of the logging args:
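A minimal sketch, assuming only the `st_logging` flag mentioned above; the commented-out keys hint at the dashboard settings described in this section but are hypothetical, and exact key names may differ by version:

```python
logging_args = {
    "st_logging": True,  # enable monitoring via the Streamlit dashboard
    # Hypothetical keys for the dashboard settings mentioned above;
    # check your UpTrain version for the exact names:
    # "dashboard_port": 50001,
    # "log_folder": "uptrain_logs",
}
```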
Retraining parameters: The parameter retrain_after sets how many new data points must be collected before the model is retrained; it appears in the full config below.
With all the individual pipelines defined, we are now ready to define the config dictionary for the UpTrain framework:
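A sketch assembling the pieces defined above into one dictionary; the `retrain_after` threshold and the `retraining_folder` name are illustrative:

```python
cfg = {
    # What to monitor (the only required section)
    "checks": checks,

    # Automated retraining and evaluation pipelines
    "training_args": training_args,
    "evaluation_args": evaluation_args,

    # Dashboard / logging settings
    "logging_args": logging_args,

    # Retrain once 250 new data points have been collected (illustrative)
    "retrain_after": 250,

    # Local folder where UpTrain stores the collected data (placeholder name)
    "retraining_folder": "uptrain_smart_data",
}
```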
Next, let's see how we can utilize the UpTrain config to initialize the UpTrain framework and seamlessly observe and improve our ML models.
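A hedged sketch of initialization and logging; `production_batches` and `model` stand in for your serving infrastructure, and `framework.log` is assumed from the v0 API:

```python
framework = uptrain.Framework(cfg)

# Hypothetical serving loop: log inputs and predictions so that the
# configured checks run on live data
for batch in production_batches:
    preds = model.predict(batch["kps"])
    framework.log(inputs={"kps": batch["kps"]}, outputs=preds)
```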
An important point to note is that all the above arguments (except checks) are optional in the config. So, as long as we know what we want to monitor, we can quickly get started with UpTrain using a config that requires very little information.
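For illustration, a bare-bones config with a single data drift check might look like this; the dataset path and feature name are placeholders:

```python
minimal_cfg = {
    "checks": [
        {
            "type": uptrain.Monitor.DATA_DRIFT,
            "reference_dataset": "data/training_data.json",  # placeholder
            "measurable_args": {
                "type": uptrain.MeasurableType.INPUT_FEATURE,
                "feature_name": "kps",  # placeholder
            },
        }
    ]
}
framework = uptrain.Framework(minimal_cfg)
```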