Data Model#

Argilla is built around a few simple concepts. This section clarifies what those concepts are. Let’s take a look at Argilla’s data model:

A sketch of the Argilla data model


A dataset is a collection of records of a common type. You can programmatically build datasets with the Argilla client and log them to the web app. In the web app you can dive into your dataset to explore and annotate your records. You can also load your datasets back to the client and export it into various formats, or prepare it for training a model.


A record is a data item composed of text inputs and, optionally, predictions and annotations.

Think of predictions as the classification that your system made over that input (for example: ‘Virginia Woolf’), and think of annotations as the ground truth that you manually assign to that input (because you know that, in this case, it would be ‘William Shakespeare’). Records are defined by the type of task they are related to. Let’s see three different examples:


Text classification record#

Text classification deals with predicting in which categories a text fits. As if you’re shown an image you could quickly tell if there’s a dog or a cat in it, we build NLP models to distinguish between a Jane Austen’s novel or a Charlotte Bronte’s poem. It’s all about feeding models with labelled examples and seeing how they start predicting over the very same labels.

Let’s see examples of a spam classifier.

record = rg.TextClassificationRecord(
    text="Access this link to get free discounts!",

    prediction = [('SPAM', 0.8), ('HAM', 0.2)],
    prediction_agent = "link or reference to agent",

    annotation = "SPAM",
    annotation_agent= "link or reference to annotator",

    # Extra information about this record
        "split": "train"

Multi-label text classification record#

Another similar task to Text Classification, but yet a bit different, is Multi-label Text Classification. Just one key difference: more than one label may be predicted. While in a regular Text Classification task we may decide that the tweet “I can’t wait to travel to Egypts and visit the pyramids” fits into the hastag #Travel, which is accurate, in Multi-label Text Classification we can classify it as more than one hastag, like #Travel #History #Africa #Sightseeing #Desert.

record = rg.TextClassificationRecord(
    text="I can't wait to travel to Egypts and visit the pyramids",

    multi_label = True,

    prediction = [('travel', 0.8), ('history', 0.6), ('economy', 0.3), ('sports', 0.2)],
    prediction_agent = "link or reference to agent",

    annotation = ['travel', 'history'],
    annotation_agent= "link or reference to annotator",

Token classification record#

Token classification kind-of-tasks are NLP tasks aimed to divide the input text into words, or syllables, and assign certain values to them. Think about giving each word in a sentence its grammatical category, or highlight which parts of a medical report belong to a certain speciality. There are some popular ones like NER or POS-tagging.

record = rg.TokenClassificationRecord(
    text = "Michael is a professor at Harvard",
    tokens = ["Micheal", "is", "a", "professor", "at", "Harvard"],

    # Predictions are a list of tuples with all your token labels and its starting and ending positions
    prediction = [('NAME', 0, 7), ('LOC', 26, 33)],
    prediction_agent = "link or reference to agent",

    # Annotations are a list of tuples with all your token labels and its starting and ending positions
    annotation = [('NAME', 0, 7), ('ORG', 26, 33)],
    annotation_agent = "link or reference to annotator",

    metadata={  # Information about this record
        "split": "train"

TextGeneration record#

TextGeneration tasks, like text generation, are tasks where the model receives and outputs a sequence of tokens. Examples of such tasks are machine translation, text summarization, paraphrase generation, etc.

record = rg.TextGenerationRecord(
    text = "Michael is a professor at Harvard",

    # The prediction is a list of texts or tuples if you want to add a score to a prediction
    prediction = ["Michael es profesor en Harvard", "Michael es un profesor de Harvard"],
    prediction_agent = "link or reference to agent",

    # The annotation a strings representing the expected output text for the given input text
    annotation = "Michael es profesor en Harvard"


An annotation is a piece information assigned to a record, a label, token-level tags, or a set of labels, and typically by a human agent.


A prediction is a piece information assigned to a record, a label or a set of labels and typically by a machine process.


Metadata will hold extra information that you want your record to have: if it belongs to the training or the test dataset, a quick fact about something regarding that specific record… Feel free to use it as you need!


A task defines the objective and shape of the predictions and annotations inside a record. You can see our supported tasks at tasks.


For now, only a set of the predefined labels (labels schema) is configurable. Still, other settings like annotators, and metadata schema, are planned to be supported as part of dataset settings.