Text Summarization
Monitoring a text summarization model with UpTrain
Install Required packages
Step 1: Setup - Defining model and datasets
Define model and tokenizer for the summarization task
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer_t5 = AutoTokenizer.from_pretrained("t5-small")
model_t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
prefix = "summarize: "

Load the Billsum dataset from Hugging Face, which was used to train our model
from datasets import load_dataset

billsum_dataset = load_dataset("billsum", split="ca_test").filter(lambda x: x['text'] is not None)
billsum = billsum_dataset.train_test_split(test_size=0.2)
billsum

Download the wikihow dataset
Create a test dataset by combining billsum and wikihow datasets
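As a rough illustration of this step, the sketch below merges records from the two sources and tags each one with where it came from, so that later drift analysis can tell them apart. The helper `build_test_set`, the `dataset_label` field, and the assumption that every record exposes a `"text"` key are hypothetical choices made for this example, not part of UpTrain or of either dataset's documented schema.

```python
import random


def build_test_set(billsum_rows, wikihow_rows, seed=0):
    """Merge two iterables of {"text": ...} records, tagging each with its source."""
    combined = []
    for row in billsum_rows:
        combined.append({"text": row["text"], "dataset_label": "billsum"})
    for row in wikihow_rows:
        combined.append({"text": row["text"], "dataset_label": "wikihow"})
    # Shuffle with a fixed seed so the mix is reproducible between runs.
    random.Random(seed).shuffle(combined)
    return combined


# Tiny stand-in records; in the tutorial these would come from the loaded datasets.
billsum_rows = [{"text": "An act to amend Section 1 of the Education Code."}]
wikihow_rows = [{"text": "Fill the pot with water and bring it to a boil."}]
test_set = build_test_set(billsum_rows, wikihow_rows)
```

Keeping the source label on every record makes it easy to colour points by dataset when the embeddings are visualized later.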
Let's try out our model on one of the samples
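A minimal sketch of what trying the model on a sample might look like is shown below. The helper name `summarize` and the `max_new_tokens` and `max_length` values are assumptions made for illustration; the tokenizer call, `generate`, and `decode` are the standard Hugging Face Transformers methods.

```python
def summarize(text, tokenizer, model, prefix="summarize: ", max_new_tokens=64):
    """Generate a summary for one document with a seq2seq model.

    `tokenizer` and `model` are expected to behave like the Hugging Face
    objects defined earlier (tokenizer_t5 and model_t5).
    """
    # T5 expects the task prefix in front of the input text.
    inputs = tokenizer(
        prefix + text, return_tensors="pt", truncation=True, max_length=512
    )
    # Generate token ids for the summary, then decode them back to text.
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (requires the model and dataset loaded earlier):
# print(summarize(billsum["test"][0]["text"], tokenizer_t5, model_t5))
```

Passing the tokenizer and model in as arguments keeps the helper reusable if a different checkpoint is swapped in later.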
Using embeddings for model monitoring
Step 2: Visualizing embeddings using UpTrain
The UpTrain package includes two dimensionality reduction techniques: UMAP and t-SNE


Step 3: Quantifying Data Drift via embeddings
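To make the idea of quantifying drift concrete, here is one simple way to compare two sets of embeddings: take the mean vector of each set and measure the cosine distance between them. This is only an illustrative sketch, not necessarily the metric UpTrain computes; the function names and the toy two-dimensional vectors are assumptions made for this example.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, larger means more different."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def embedding_drift(reference, production):
    """Distance between the mean reference and mean production embeddings."""
    return cosine_distance(mean_vector(reference), mean_vector(production))


# Toy embeddings: `similar` stays close to the reference, `shifted` does not.
reference = [[1.0, 0.0], [0.9, 0.1]]
similar = [[1.0, 0.1], [0.8, 0.0]]
shifted = [[0.0, 1.0], [0.1, 0.9]]

print(embedding_drift(reference, similar))
print(embedding_drift(reference, shifted))
```

A mean-based score like this is cheap but coarse: it can miss drift that changes the spread or the cluster structure of the embeddings while leaving their average in place.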


Step 4: Identifying edge cases