

This tutorial demonstrates a sample usage of FeedbackDataset, which offers an implementation different from the older TextClassificationDataset, Text2TextDataset and TokenClassificationDataset. For more info about the older datasets, you can have a look here.

Feedback Dataset Workflow#

Argilla Feedback is a tool designed to obtain and manage both the feedback data from annotators and the suggestions from LLMs.

Install Libraries#

Install the latest version of Argilla in Colab, along with the other libraries and models used in this notebook.

[ ]:
!pip install argilla datasets setfit evaluate seqeval

Set Up Argilla#

You can quickly deploy Argilla Server on HF Spaces.

Alternatively, if you want to run Argilla locally on your own computer, the easiest way to get Argilla UI up and running is to deploy on Docker:

docker run -d --name quickstart -p 6900:6900 argilla/argilla-quickstart:latest

More info on installation can be found here.

Connect to Argilla#

It is possible to connect to our Argilla instance by simply importing the Argilla library, which internally connects to the Argilla Server using the ARGILLA_API_URL and ARGILLA_API_KEY environment variables.

[ ]:
import os
# Set your variables here
os.environ["ARGILLA_API_URL"] = "your_argilla_URL"
os.environ["ARGILLA_API_KEY"] = "owner.apikey"
[ ]:
import argilla as rg


"owner.apikey" is the default value for ARGILLA_API_KEY variable.

admin is the name of the default workspace. A workspace is a "space" inside your Argilla instance where authorized users can collaborate.

If you want to initialize a connection manually, you can use rg.init(). For more info about custom configurations like headers and workspace separation, check our config page.
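
A minimal sketch of manual initialization might look like this; the URL and API key below are placeholders for your own instance's values, not defaults from this tutorial:

```python
import argilla as rg

# Manual initialization; api_url and api_key are illustrative placeholders.
# This connects to the server, so it requires a running Argilla instance.
rg.init(
    api_url="https://your-argilla-instance.com",
    api_key="owner.apikey",
)
```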

If you want to customize the access credentials, take a look at our user management section.

Create Dataset#

FeedbackDataset is the container for the Argilla Feedback structure. Argilla Feedback offers different components for FeedbackDatasets that you can employ for various aspects of your workflow. To start, we need to define fields, questions and records, and we can optionally add responses and suggestions to our task later.


fields will store the question and answer structure to be used for each sample.

[ ]:
import argilla as rg
from argilla.feedback import ArgillaTrainer, TrainingTask
[ ]:
fields = [
    rg.TextField(name="question", required=True),
    rg.TextField(name="answer", required=True, use_markdown=True)


For the dataset, you need to define at least one question type. As of today, the different question types that Argilla offers are RatingQuestion, TextQuestion, LabelQuestion, MultiLabelQuestion and RankingQuestion. Let us create a LabelQuestion for the current example.

[ ]:
label_question = [
    rg.LabelQuestion(
        name="label_question",
        title="Is the answer accurate?",  # illustrative title shown in the UI
        labels=["yes", "no"],
    )
]

While name is the internal identifier of the question, title will be the question/instruction seen in the Argilla UI. We also define a list of labels.

Annotation guideline#

As guidelines are helpful for annotators, we can enrich our task with them as well. Clear guidelines will help annotators understand the task better and make more accurate annotations. There are two ways to provide guidelines: as an argument to the FeedbackDataset, or as an argument (description) to the question instances above. Depending on the specific task, you may want to use either one of them, so it is good practice to try both.

We can now create our FeedbackDataset instance with the fields and question type defined above. Do not forget to define fields and questions as lists, while guidelines expects a string.

[ ]:
dataset = rg.FeedbackDataset(
    fields=fields,
    questions=label_question,
    guidelines="Annotations should be made according to the policy.",
)

Upload data#


A record refers to each of the data items that will be annotated by the annotator team. The records are the pieces of information shown to the user in the UI in order to complete the annotation task. In the current single-label dataset example, a record consists only of a text to be labeled, while it would be a prompt and output pair in the case of instruction datasets.

For Argilla Feedback, we can define a FeedbackRecord with the mandatory argument fields. Records also offer other optional arguments to further augment each record.

[ ]:
# A sample FeedbackRecord
record = rg.FeedbackRecord(
    fields={
        "question": "Why can camels survive long without water?",
        "answer": "Camels use the fat in their humps to keep them filled with energy and hydration for long periods of time.",
    }
)

Argilla Feedback can deal with multiple responses per record, one from each annotator. We can define a responses list for each record; each response is a dictionary whose "values" key maps question names to the submitted answers.

[ ]:
record.responses = [
    {
        "values": {
            "label_question": {
                "value": "yes",
            }
        }
    }
]


Argilla Feedback offers a way to use suggestions from LLMs and other models as a starting point for annotators. This way, annotators can save time and effort by correcting the predictions instead of annotating from scratch.

[ ]:
        "question_name": "label_question",
        "value": "yes"

Now, it is quite simple to add records, in the form of a list, to the FeedbackDataset we previously created.

[ ]:
dataset.add_records([record])
Now that we have our dataset with already annotated responses and suggestions as model predictions, we can push the dataset to the Argilla space.


From Argilla 1.14.0, calling push_to_argilla will not just push the FeedbackDataset into Argilla, but will also return the remote FeedbackDataset instance, which implies that the additions, updates, and deletions of records will be pushed to Argilla as soon as they are made. This is a change from previous versions of Argilla, where you had to call push_to_argilla again to push the changes to Argilla.

[ ]:
remote_dataset = dataset.push_to_argilla(name="emotion_dataset", workspace="admin")
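
As a sketch of that behavior (assuming the remote_dataset returned above and a running server), adding a record to the remote dataset is pushed to Argilla immediately:

```python
# Additions to the remote dataset sync immediately (Argilla >= 1.14.0).
# The record content below is illustrative.
new_record = rg.FeedbackRecord(
    fields={
        "question": "Why is the sky blue?",
        "answer": "Shorter blue wavelengths are scattered more strongly by the atmosphere.",
    }
)
remote_dataset.add_records([new_record])
```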

Train a model#

As with other datasets, Feedback datasets also allow you to create a training pipeline and run inference with the resulting model. After you gather responses with Argilla Feedback, you can easily fine-tune an LLM. In this example, we will complete a text classification task.

For fine-tuning, we will use the SetFit library and the ArgillaTrainer, which is a powerful wrapper around many of our favorite NLP libraries. It provides a very intuitive abstraction to facilitate simple training workflows, using sensible pre-set default configurations without having to worry about any data transformations from Argilla.

Let us first create the dataset to train on. For this example, we will use the argilla/emotion dataset, which was created using Argilla. Each text item has responses with 6 different sentiments: sadness, joy, love, anger, fear and surprise.

[ ]:
# Besides Argilla, it can also be loaded with load_dataset from the datasets library
dataset_hf = rg.FeedbackDataset.from_huggingface("argilla/emotion")

We can then start creating the training pipeline by first defining a TrainingTask, which is used to define how the data should be processed and formatted according to the associated task and framework. Each task has its own classmethod, and the data formatting can always be customized via formatting_func. You can visit this page for more info. Simpler tasks like text classification can be defined using default definitions, as we do in this example.

[ ]:
task = TrainingTask.for_text_classification(
    text=dataset_hf.field_by_name("text"),
    label=dataset_hf.question_by_name("label"),
)

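As a hedged sketch of the formatting_func alternative mentioned above: a function that receives a sample and returns a (text, label) pair, which could then be passed via TrainingTask.for_text_classification(formatting_func=formatting_func). The sample layout assumed here (a "text" field plus a list of responses under the question name "label") mirrors the default emotion dataset structure:

```python
# Hedged sketch of a custom formatting function; the sample layout is an
# assumption based on the default emotion dataset structure.
def formatting_func(sample):
    text = sample["text"]
    responses = sample.get("label") or []
    if responses:
        # Use the first submitted response as the training label.
        return (text, responses[0]["value"])
    return None  # skip records without annotations
```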
We can then define our ArgillaTrainer for any of the supported frameworks and customize the training config using ArgillaTrainer.update_config.

Let us define ArgillaTrainer with any of the supported frameworks.

[ ]:
trainer = ArgillaTrainer(
    dataset=dataset_hf,
    task=task,
    framework="setfit",
)

You can update the model config via update_config.

[ ]:
trainer.update_config(num_iterations=1)  # e.g. fewer SetFit iterations for a quick run

We can now train the model with train

[ ]:
trainer.train(output_dir="setfit_model")  # output path is illustrative

and make inferences with predict.

[ ]:
trainer.predict("This is just perfect!")

We have trained a model with a FeedbackDataset in this tutorial. For more info about the concepts in Argilla Feedback and LLMs, have a look here. To see more hands-on tutorials about FeedbackDataset, please look here.