Tutorials

Whether you're a beginner or an experienced user, these tutorials walk you through our features and functionalities, making them easy to understand and implement.

Feedback Dataset

Note

The dataset class covered in this section is the FeedbackDataset. This fully configurable dataset will replace the DatasetForTextClassification, DatasetForTokenClassification, and DatasetForText2Text in Argilla 2.0. Not sure which dataset to use? Check out our section on choosing a dataset.

Beginner

If you are new to Argilla and want to walk through low-key notebooks that reproduce specific features, these tutorials are perfect for you. You can follow them step by step, or work through individual examples using an Argilla-compatible dataset that we have already prepared and that can be downloaded from the Hugging Face Hub.

Configuring Users and Workspaces

Learn how to configure Users and Workspaces.

Creating a FeedbackDataset

Learn how to configure a FeedbackDataset and add FeedbackRecords to it.

Assign records to your team

Learn how to easily assign records to your team.

Adding Metadata to a FeedbackDataset

Learn how to add metadata properties to a FeedbackDataset.

Adding Vectors to a FeedbackDataset

Learn how to add vectors and vector settings to a FeedbackDataset.

Adding Responses and Suggestions to a FeedbackDataset

Learn how to add suggestions and responses to a FeedbackDataset.

Filter and Query your FeedbackDataset

Learn how to filter and query your FeedbackDataset.

Train Your Model with ArgillaTrainer

Learn how to train your model with ArgillaTrainer.

Use Metrics to Evaluate Your Model

Learn how to use the metrics module to evaluate your model.

Advanced

Here you can find more advanced applied examples to help you get started with curating datasets and collecting feedback to fine-tune LLMs and other language models.

Ⓜ️ Fine-tuning LLMs as chat assistants: Supervised Fine-tuning on Mistral 7B

Learn how to fine-tune Mistral 7B into a chat assistant using supervised fine-tuning with the ArgillaTrainer and TRL.

🪄 Fine-tuning and evaluating GPT-3.5 with human feedback for RAG

Learn how to fine-tune and evaluate gpt-3.5-turbo models with human feedback for RAG applications with LlamaIndex.

🎛️ Fine-tune a SetFit model using the ArgillaTrainer

Learn how to use the ArgillaTrainer to fine-tune a SetFit model on your Feedback Dataset.

🏆 Train a reward model for RLHF

Learn how to collect comparison or human preference data and train a reward model with the trl library.

โ“ Train a QnA model with transformers and Argilla

Learn how to fine-tune a QnA model with transformers and annotated data using the ArgillaTrainer.

🌠 Fine-tune RAG pipelines by training retrieval and reranking models

Learn how to boost RAG performance through optimized retrieval and reranking models for better AI accuracy.

✨ Add zero-shot text classification suggestions using SetFit

Learn how to add suggestions to your Feedback Dataset using SetFit.

🧸 Using LLMs for text classification and summarization with spacy-llm

Learn how to add suggestions for text classification and summarization to your Feedback Dataset using spacy-llm.

🎡 Create synthetic data and annotations with LLMs

Learn how to create synthetic data and annotations with OpenAI, LangChain, Transformers and Outlines.

🖼️ Curate an instruction dataset for supervised fine-tuning

Learn how to set up a project to curate a public dataset that can be used to fine-tune an instruction-following model.

📑 Making the Most of Markdown: video, audio and image

Learn how to apply multimodality (video, audio and images) to your FeedbackDataset using the Argilla TextFields.

👀 Monitoring Ethics and Bias in LLMs: Giskard and DPO

Learn how to monitor bias and ethical issues in LLMs by detecting them with Giskard and mitigating them through fine-tuning with DPO.

🎮 Monitoring a Real-world Example of Data and Model Drift

Learn how to monitor data and model drift in a real-world scenario using different tools.

💭 Enhanced Sentiment Analysis: A Span-Based Polarity Approach with SetFit

Learn how to train an aspect-based sentiment analysis (ABSA) model and evaluate it with Argilla.

Other datasets

Note

The records classes covered in this section correspond to three datasets: DatasetForTextClassification, DatasetForTokenClassification, and DatasetForText2Text. These will be deprecated in Argilla 2.0 and replaced by the fully configurable FeedbackDataset class. Not sure which dataset to use? Check out our section on choosing a dataset.

Looking for more tutorials? Check out our notebooks folder!

🤯 Few-shot classification with SetFit

Learn how to use the setfit library to perform few-shot classification.

👂 Few-shot text classification with active learning using small-text and SetFit

Learn how to use the setfit and small-text libraries to perform few-shot text classification with active learning.

💨 Label data with semantic search and Sentence Transformers

Learn how to use the sentence-transformers library to label data with semantic search.

🧹 Find and clean label errors with cleanlab

Learn how to use the cleanlab library to find and clean label errors.

Train a NER model with weak supervision rules using skweak

Learn how to use the skweak library to perform weak supervision for NER.

👮 Weak supervision for text classification with semantic search

Learn how to use the sentence-transformers and snorkel libraries to perform weak supervision for text classification with semantic search.

🔗 Using LLMs for Few-Shot Token Classification Suggestions with spacy-llm

Learn how to use the spacy-llm library to do few-shot token classification.