Tutorials

Feedback Dataset

Warning

The dataset class covered in this section is the FeedbackDataset. This fully configurable dataset will replace the DatasetForTextClassification, DatasetForTokenClassification, and DatasetForText2Text in Argilla 2.0. Not sure which dataset to use? Check out our section on choosing a dataset.

Here you can find end-to-end examples to help you get started with curating datasets and collecting feedback to fine-tune LLMs and other language models.

🪄 Fine-tuning and evaluating GPT-3.5 with human feedback for RAG

Learn how to fine-tune and evaluate gpt-3.5-turbo models with human feedback for RAG applications with LlamaIndex.

🖼️ Curate an instruction dataset for supervised fine-tuning

Learn how to set up a project to curate a public dataset that can be used to fine-tune an instruction-following model.

🏆 Train a Reward Model for RLHF

Learn how to collect comparison or human preference data and train a reward model with the trl library.

✨ Add zero-shot suggestions using SetFit

Learn how to add suggestions to your Feedback Dataset using SetFit.

🎡 Create and annotate synthetic data with LLMs

Learn how to create synthetic data and annotations with OpenAI, LangChain, Transformers and Outlines.

🎛️ Fine-tune a SetFit model using the ArgillaTrainer

Learn how to use the ArgillaTrainer to fine-tune a SetFit model on your Feedback Dataset.

Other datasets

Warning

The records classes covered in this section correspond to three datasets: DatasetForTextClassification, DatasetForTokenClassification, and DatasetForText2Text. These will be deprecated in Argilla 2.0 and replaced by the fully configurable FeedbackDataset class. Not sure which dataset to use? Check out our section on choosing a dataset.

Looking for more tutorials? Check out our notebooks folder!

🤯 Few-shot classification

Learn how to use the setfit library to perform few-shot classification.

👂 Few-shot text classification with active learning

Learn how to use the setfit and small-text libraries to perform few-shot text classification with active learning.

💨 Label data with semantic search

Learn how to use the sentence-transformers library to label data with semantic search.

🧹 Find and clean label errors

Learn how to use the cleanlab library to find and clean label errors.

🐭 Weak supervision for NER

Learn how to use the snorkel library to perform weak supervision for NER.

👮 Weak supervision for text classification with semantic search

Learn how to use the sentence-transformers and snorkel libraries to perform weak supervision for text classification with semantic search.