# UpTrain Config

The UpTrain config is a crucial component of the UpTrain Framework and contains all the necessary information for monitoring, training, and evaluating the performance of your machine learning models.

To illustrate the definition of the config, we use the [human orientation classification example](https://github.com/uptrain-ai/uptrain/blob/main/examples/human_orientation_classification/run.ipynb) from the UpTrain repository.

The config is defined by the user and can include various settings such as:

1. Checks: This config section specifies the [monitors and checks](/docs/uptrain-monitors.md) that UpTrain should perform on the input data, including checks for data drift, edge cases, and data integrity. Users can also define custom signals specific to their use case (a hypothetical sketch of one appears after this list). \
   Further, users can add specific [statistics](/docs/uptrain-statistics.md) on their training data to monitor, or [visualize](/docs/uptrain-visuals/umap-visualization.md) high-dimensional data through dimensionality reduction to observe inherent clusters. Such monitors are helpful for unstructured data such as high-dimensional embeddings (common in domains such as NLP and recommender systems) and image data.\
   This is how we can define a check for [data drift](/docs/uptrain-monitors/data-drift.md) and [concept drift](/docs/uptrain-monitors/concept-drift.md) in our config (to learn how we can define different checks, we recommend checking out the [UpTrain Monitors](/docs/uptrain-monitors.md) section):

   ```python
   checks = [
       {
           'type': uptrain.Anomaly.DATA_DRIFT,
           'reference_dataset': orig_training_file,
           'measurable_args': {
               'type': uptrain.MeasurableType.INPUT_FEATURE,
               'feature_name': 'feat_0' 
           },
       },
       {
           'type': uptrain.Anomaly.CONCEPT_DRIFT,
           'algorithm': uptrain.DataDriftAlgo.DDM
       }
   ]
   ```
2. Training pipeline: In this config section, attach your training arguments, such as annotation parameters, training functions, conditions to retrain the model, data warehouse location, etc., to enable automated model retraining. \
   This is what `training_args` looks like in the human orientation classification example (stubs for the referenced `annotation_args` and `train_model_torch` are sketched after this list):

   <pre class="language-python" data-overflow="wrap"><code class="lang-python"># Define the training pipeline to annotate collected edge cases and retrain the model automatically
   training_args = {
       "annotation_method": {
           "method": uptrain.AnnotationMethod.MASTER_FILE, 
           "args": annotation_args
           }, 
       "training_func": train_model_torch, 
       "orig_training_file": orig_training_file,
   }
   </code></pre>
3. Evaluation pipeline: In this config section, attach your evaluation arguments, such as the inference function, the golden testing dataset, and the measures and data slices to report, to generate a comprehensive report comparing the production and the retrained models. This report gives deep insights into how the model's performance changed due to retraining and can help you decide whether to deploy the new model or continue with the existing one.\
   Following is the `evaluation_args` definition, borrowed from the human orientation classification example (a stub for `get_accuracy_torch` is likewise sketched after this list):

   <pre class="language-python" data-overflow="wrap"><code class="lang-python"># Define the evaluation pipeline to compare the retrained and the original model
   evaluation_args = {
       "inference_func": get_accuracy_torch,
       "golden_testing_dataset": golden_testing_file,
   }
   </code></pre>
4. Logging configuration: This section allows users to configure the logging settings for the UpTrain Framework. UpTrain supports visualizations through a Streamlit dashboard; users can enable Streamlit logging via the variable `st_logging`, which lets them monitor their models on that dashboard. The UpTrain community is working on integrating other popular dashboards, such as [Grafana](https://github.com/uptrain-ai/uptrain/issues/78), into the framework. \
   The config also allows users to customize the UpTrain dashboard: the dashboard layout, the metrics to be displayed, the URL and port on which the dashboard app runs, the time range for displaying the data, etc. Following is an example definition of the logging args:

   ```python
   logging_args = {
       'st_logging': True,
       'log_folder': 'uptrain_logs',
       'dashboard_port': 50001,
   }
   ```
5. Retraining parameters: The parameter `retrain_after` sets how many data points must accumulate in the retraining dataset before the model is retrained.
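
The snippets above reference a few user-defined pieces without showing them: the custom signals mentioned under checks, plus `annotation_args`, `train_model_torch`, and `get_accuracy_torch`. Below is a minimal, hypothetical sketch of what these could look like. The function signatures, the `master_file` path, and the `uptrain.Signal` / `uptrain.Anomaly.EDGE_CASE` usage are assumptions modeled on the repository examples, not definitive implementations; refer to the linked notebook for the authoritative definitions.

```python
import uptrain

# Hypothetical stubs for the user-supplied pieces referenced above.
# Exact signatures are assumptions; see the example notebook for the
# authoritative definitions.

def train_model_torch(training_file, model_save_name):
    """Train a PyTorch model on the (re)training dataset and save it."""
    ...  # load data from training_file, train, and persist the model


def get_accuracy_torch(testing_file, model_save_name):
    """Run the saved model on the golden testing dataset and return accuracy."""
    ...  # load the model, run inference, and compare against labels


# Arguments consumed by the MASTER_FILE annotation method (hypothetical path)
annotation_args = {"master_file": "data/master_annotations.csv"}


# A custom signal: the callable receives the logged inputs/outputs and
# returns one boolean per data point, marking it as an edge case
def low_confidence(inputs, outputs, gts=None, extra_args={}):
    return [max(probs) < 0.9 for probs in outputs]


edge_case_check = {
    "type": uptrain.Anomaly.EDGE_CASE,
    "signal_formulae": uptrain.Signal("Low confidence", low_confidence),
}
```

Edge cases flagged this way are collected into the retraining folder (`uptrain_smart_data` in the config below), and `retrain_after` counts against that dataset.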

With all the individual pipelines defined, we are now ready to define the dictionary `config` for the UpTrain framework:

```python
config = {
    "checks": checks, 
    "training_args": training_args,
    "evaluation_args": evaluation_args,

    # Retrain when 200 datapoints are collected in the retraining dataset
    "retrain_after": 200,
    
    # A local folder to store the retraining dataset (such as the edge cases)
    "retraining_folder": "uptrain_smart_data",
    
    "logging_args": logging_args
}
```

An important point to note is that all the above arguments except `checks` are optional in the config. So, as long as we know what we want to monitor, we can quickly get started with UpTrain. For instance, the config definition from the [fraud detection example](https://github.com/uptrain-ai/uptrain/blob/main/examples/fraud_detection/run.ipynb) requires very little information to get started.

```python
config = {
    # Check to identify concept drift using the DDM algorithm
    "checks": [{
        'type': uptrain.Anomaly.CONCEPT_DRIFT,
        'algorithm': uptrain.DataDriftAlgo.DDM
    }],
}
```

Next, let's see how we can utilize the UpTrain `config` to initialize the UpTrain `framework` and seamlessly observe and improve our ML models.
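As a quick preview, here is a minimal sketch of that initialization, assuming the `uptrain.Framework` constructor and its `log` method behave as in the repository examples (`model_inputs`, `model_preds`, and `ground_truths` are placeholders for your own production batches):

```python
import uptrain

# Initialize the framework with the config assembled above
framework = uptrain.Framework(config)

# In production, log each batch of model inputs and predictions;
# the checks defined in the config run on every logged batch
ids = framework.log(inputs=model_inputs, outputs=model_preds)

# Once labels arrive, attach them via the returned identifiers so that
# checks like concept drift (DDM) can compare predictions to ground truth
framework.log(identifiers=ids, gts=ground_truths)
```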

