
rg.Record

The Record object is used to represent a single record in Argilla. It contains fields, suggestions, responses, metadata, and vectors.

Usage Examples

Creating a Record

To create records, you can use the Record class and pass it to the Dataset.records.log method. The Record class requires a fields parameter, which is a dictionary of field names and values. The field names must match the field names in the dataset's Settings object to be accepted.

dataset.records.log(
    records=[
        rg.Record(
            fields={"text": "Hello World, how are you?"},
        ),
    ]
) # (1)
  1. The Argilla dataset contains a field named text matching the key here.
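
Because field names must match the dataset's settings, it can help to check the keys before logging. The following is a minimal sketch in plain Python; `unknown_field_names` and the `dataset_field_names` list are hypothetical helpers for illustration and are not part of Argilla.

```python
from typing import Dict, Iterable, List


def unknown_field_names(record_fields: Dict[str, object], dataset_field_names: Iterable[str]) -> List[str]:
    """Return the keys in record_fields that the dataset does not define."""
    allowed = set(dataset_field_names)
    return sorted(name for name in record_fields if name not in allowed)


# Hypothetical dataset settings with a single field called "text".
dataset_field_names = ["text"]

good = {"text": "Hello World, how are you?"}
bad = {"txt": "Hello World, how are you?"}  # misspelled key

print(unknown_field_names(good, dataset_field_names))  # []
print(unknown_field_names(bad, dataset_field_names))   # ['txt']
```

An empty result means every key in the record matches a field declared for the dataset.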

To create records with image fields, pass the image to the record object as a remote URL, a local path to an image file, or a PIL object. The field must be defined as an rg.ImageField in the dataset's Settings object to be accepted. Images are stored in the Argilla database and returned as rescaled PIL objects.

dataset.records.log(
    records=[
        rg.Record(
            fields={"image": "https://example.com/image.jpg"}, # (1)
        ),
    ]
)
  1. The image can be referenced as a remote URL, a local file path, or a PIL object.

Note

The image will be stored in the Argilla database and can impact the dataset's storage usage. Images should be smaller than 5 MB and datasets should contain fewer than 10,000 images.
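
The size guideline above can be checked locally before logging. This sketch uses only the standard library; `image_is_small_enough` is a hypothetical helper, and reading "5 MB" as 5 * 1024 * 1024 bytes is an assumption rather than a documented threshold.

```python
import os
import tempfile

# Assumption: the 5 MB guideline is interpreted as 5 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def image_is_small_enough(path: str, limit: int = MAX_IMAGE_BYTES) -> bool:
    """Return True when the file at path is smaller than limit bytes."""
    return os.path.getsize(path) < limit


# Demonstration with a throwaway 1 KiB file standing in for an image.
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as handle:
    handle.write(b"\0" * 1024)
    sample_path = handle.name

small_ok = image_is_small_enough(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(small_ok)
```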

Accessing Record Attributes

The Record object has suggestions, responses, metadata, and vectors attributes that can be accessed directly whilst iterating over records in a dataset.

for record in dataset.records(
    with_suggestions=True,
    with_responses=True,
    with_metadata=True,
    with_vectors=True
    ):
    print(record.suggestions)
    print(record.responses)
    print(record.metadata)
    print(record.vectors)

Record properties can also be updated whilst iterating over records in a dataset.

for record in dataset.records(with_metadata=True):
    record.metadata["department"] = "toys"

For the changes to take effect, call the update method on the Dataset object or pass the updated records to Dataset.records.log. All core record attributes can be updated in this way. See their respective documentation for more information: Suggestions, Responses, Metadata, Vectors.
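
The update-then-log pattern can be sketched as a small function. `dataset.records(...)` and `dataset.records.log(records=...)` are the calls described above; `tag_department` and the `_Stub*` classes are hypothetical stand-ins so the sketch runs without an Argilla server, and it assumes `record.metadata` behaves like a mutable mapping.

```python
from typing import Any, Dict, List


def tag_department(dataset: Any, department: str) -> int:
    """Set a "department" metadata value on every record, then log the changes."""
    updated = []
    for record in dataset.records(with_metadata=True):
        # Assumption: record.metadata supports item assignment like a dict.
        record.metadata["department"] = department
        updated.append(record)
    dataset.records.log(records=updated)
    return len(updated)


# In-memory stand-ins (not Argilla classes) used only to exercise the function.
class _StubRecord:
    def __init__(self) -> None:
        self.metadata: Dict[str, Any] = {}


class _StubRecords:
    def __init__(self, records: List[_StubRecord]) -> None:
        self._records = records
        self.logged: List[_StubRecord] = []

    def __call__(self, **kwargs: Any) -> List[_StubRecord]:
        return list(self._records)

    def log(self, records: List[_StubRecord]) -> None:
        self.logged = list(records)


class _StubDataset:
    def __init__(self, size: int) -> None:
        self.records = _StubRecords([_StubRecord() for _ in range(size)])


stub = _StubDataset(size=2)
count = tag_department(stub, "toys")
print(count, [record.metadata for record in stub.records.logged])
```

With a real dataset, only `tag_department` would be needed; the stubs exist purely for the demonstration.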


Record

Bases: Resource

The class for interacting with Argilla Records. A Record is a single sample in a dataset. Records receive feedback in the form of responses and suggestions. Records contain fields, metadata, and vectors.

Attributes:

id (Union[str, UUID]): The id of the record.
fields (RecordFields): The fields of the record.
metadata (RecordMetadata): The metadata of the record.
vectors (RecordVectors): The vectors of the record.
responses (RecordResponses): The responses of the record.
suggestions (RecordSuggestions): The suggestions of the record.
dataset (Dataset): The dataset to which the record belongs.
_server_id (UUID): An id for the record generated by the Argilla server.

Source code in src/argilla/records/_resource.py
class Record(Resource):
    """The class for interacting with Argilla Records. A `Record` is a single sample
    in a dataset. Records receives feedback in the form of responses and suggestions.
    Records contain fields, metadata, and vectors.

    Attributes:
        id (Union[str, UUID]): The id of the record.
        fields (RecordFields): The fields of the record.
        metadata (RecordMetadata): The metadata of the record.
        vectors (RecordVectors): The vectors of the record.
        responses (RecordResponses): The responses of the record.
        suggestions (RecordSuggestions): The suggestions of the record.
        dataset (Dataset): The dataset to which the record belongs.
        _server_id (UUID): An id for the record generated by the Argilla server.
    """

    _model: RecordModel

    def __init__(
        self,
        id: Optional[Union[UUID, str]] = None,
        fields: Optional[Dict[str, FieldValue]] = None,
        metadata: Optional[Dict[str, Any]] = None,
        vectors: Optional[Dict[str, VectorValue]] = None,
        responses: Optional[List[Response]] = None,
        suggestions: Optional[List[Suggestion]] = None,
        _server_id: Optional[UUID] = None,
        _dataset: Optional["Dataset"] = None,
    ):
        """Initializes a Record with fields, metadata, vectors, responses, suggestions, external_id, and id.
        Records are typically defined as flat dictionary objects with fields, metadata, vectors, responses, and suggestions
        and passed to Dataset.DatasetRecords.add() as a list of dictionaries.

        Args:
            id: An id for the record. If not provided, a UUID will be generated.
            fields: A dictionary of fields for the record.
            metadata: A dictionary of metadata for the record.
            vectors: A dictionary of vectors for the record.
            responses: A list of Response objects for the record.
            suggestions: A list of Suggestion objects for the record.
            _server_id: An id for the record. (Read-only and set by the server)
            _dataset: The dataset object to which the record belongs.
        """

        if fields is None and metadata is None and vectors is None and responses is None and suggestions is None:
            raise ValueError("At least one of fields, metadata, vectors, responses, or suggestions must be provided.")
        if fields is None and id is None:
            raise ValueError("If fields are not provided, an id must be provided.")
        if fields == {} and id is None:
            raise ValueError("If fields are an empty dictionary, an id must be provided.")

        self._dataset = _dataset
        self._model = RecordModel(external_id=id, id=_server_id)
        self.__fields = RecordFields(fields=fields, record=self)
        self.__vectors = RecordVectors(vectors=vectors)
        self.__metadata = RecordMetadata(metadata=metadata)
        self.__responses = RecordResponses(responses=responses, record=self)
        self.__suggestions = RecordSuggestions(suggestions=suggestions, record=self)

    def __repr__(self) -> str:
        return (
            f"Record(id={self.id},status={self.status},fields={self.fields},metadata={self.metadata},"
            f"suggestions={self.suggestions},responses={self.responses})"
        )

    ############################
    # Properties
    ############################

    @property
    def id(self) -> str:
        return self._model.external_id

    @id.setter
    def id(self, value: str) -> None:
        self._model.external_id = value

    @property
    def dataset(self) -> "Dataset":
        return self._dataset

    @dataset.setter
    def dataset(self, value: "Dataset") -> None:
        self._dataset = value

    @property
    def fields(self) -> "RecordFields":
        return self.__fields

    @property
    def responses(self) -> "RecordResponses":
        return self.__responses

    @property
    def suggestions(self) -> "RecordSuggestions":
        return self.__suggestions

    @property
    def metadata(self) -> "RecordMetadata":
        return self.__metadata

    @property
    def vectors(self) -> "RecordVectors":
        return self.__vectors

    @property
    def status(self) -> str:
        return self._model.status

    @property
    def _server_id(self) -> Optional[UUID]:
        return self._model.id

    ############################
    # Public methods
    ############################

    def get(self) -> "Record":
        """Retrieves the record from the server."""
        model = self._client.api.records.get(self._server_id)
        instance = self.from_model(model, dataset=self.dataset)
        self.__dict__ = instance.__dict__

        return self

    def api_model(self) -> RecordModel:
        return RecordModel(
            id=self._model.id,
            external_id=self._model.external_id,
            fields=self.fields.to_dict(),
            metadata=self.metadata.api_models(),
            vectors=self.vectors.api_models(),
            responses=self.responses.api_models(),
            suggestions=self.suggestions.api_models(),
            status=self.status,
        )

    def serialize(self) -> Dict[str, Any]:
        """Serializes the Record to a dictionary for interaction with the API"""
        serialized_model = self._model.model_dump()
        serialized_suggestions = [suggestion.serialize() for suggestion in self.__suggestions]
        serialized_responses = [response.serialize() for response in self.__responses]
        serialized_model["responses"] = serialized_responses
        serialized_model["suggestions"] = serialized_suggestions

        return serialized_model

    def to_dict(self) -> Dict[str, Dict]:
        """Converts a Record object to a dictionary for export.
        Returns:
            A dictionary representing the record where the keys are "fields",
            "metadata", "suggestions", and "responses". Each field and question is
            represented as a key-value pair in the dictionary of the respective key. i.e.
            `{"fields": {"prompt": "...", "response": "..."}, "responses": {"rating": "..."},
        """
        id = str(self.id) if self.id else None
        server_id = str(self._model.id) if self._model.id else None
        status = self.status
        fields = self.fields.to_dict()
        metadata = self.metadata.to_dict()
        suggestions = self.suggestions.to_dict()
        responses = self.responses.to_dict()
        vectors = self.vectors.to_dict()

        # TODO: Review model attributes when to_dict and serialize methods are unified
        return {
            "id": id,
            "fields": fields,
            "metadata": metadata,
            "suggestions": suggestions,
            "responses": responses,
            "vectors": vectors,
            "status": status,
            "_server_id": server_id,
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Dict], dataset: Optional["Dataset"] = None) -> "Record":
        """Converts a dictionary to a Record object.
        Args:
            data: A dictionary representing the record.
            dataset: The dataset object to which the record belongs.
        Returns:
            A Record object.
        """
        fields = data.get("fields", {})
        metadata = data.get("metadata", {})
        suggestions = data.get("suggestions", {})
        responses = data.get("responses", {})
        vectors = data.get("vectors", {})
        record_id = data.get("id", None)
        _server_id = data.get("_server_id", None)

        suggestions = [Suggestion(question_name=question_name, **value) for question_name, value in suggestions.items()]
        responses = [
            Response(question_name=question_name, **value)
            for question_name, _responses in responses.items()
            for value in _responses
        ]

        return cls(
            id=record_id,
            fields=fields,
            suggestions=suggestions,
            responses=responses,
            vectors=vectors,
            metadata=metadata,
            _dataset=dataset,
            _server_id=_server_id,
        )

    @classmethod
    def from_model(cls, model: RecordModel, dataset: "Dataset") -> "Record":
        """Converts a RecordModel object to a Record object.
        Args:
            model: A RecordModel object.
            dataset: The dataset object to which the record belongs.
        Returns:
            A Record object.
        """
        instance = cls(
            id=model.external_id,
            fields=model.fields,
            metadata={meta.name: meta.value for meta in model.metadata},
            vectors={vector.name: vector.vector_values for vector in model.vectors},
            _dataset=dataset,
            responses=[],
            suggestions=[],
        )

        # set private attributes
        instance._dataset = dataset
        instance._model = model

        # Responses and suggestions are computed separately based on the record model
        instance.responses.from_models(model.responses)
        instance.suggestions.from_models(model.suggestions)

        return instance

    @property
    def _client(self) -> Optional["Argilla"]:
        if self._dataset:
            return self.dataset._client

    @property
    def _api(self) -> Optional["RecordsAPI"]:
        if self._client:
            return self._client.api.records

__init__(id=None, fields=None, metadata=None, vectors=None, responses=None, suggestions=None, _server_id=None, _dataset=None)

Initializes a Record with an id, fields, metadata, vectors, responses, and suggestions. Records are typically defined as flat dictionaries with fields, metadata, vectors, responses, and suggestions, and passed to Dataset.records.log as a list of dictionaries.

Parameters:

id (Optional[Union[UUID, str]], default None): An id for the record. If not provided, a UUID will be generated.
fields (Optional[Dict[str, FieldValue]], default None): A dictionary of fields for the record.
metadata (Optional[Dict[str, Any]], default None): A dictionary of metadata for the record.
vectors (Optional[Dict[str, VectorValue]], default None): A dictionary of vectors for the record.
responses (Optional[List[Response]], default None): A list of Response objects for the record.
suggestions (Optional[List[Suggestion]], default None): A list of Suggestion objects for the record.
_server_id (Optional[UUID], default None): An id for the record. (Read-only and set by the server)
_dataset (Optional[Dataset], default None): The dataset object to which the record belongs.

Source code in src/argilla/records/_resource.py
def __init__(
    self,
    id: Optional[Union[UUID, str]] = None,
    fields: Optional[Dict[str, FieldValue]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    vectors: Optional[Dict[str, VectorValue]] = None,
    responses: Optional[List[Response]] = None,
    suggestions: Optional[List[Suggestion]] = None,
    _server_id: Optional[UUID] = None,
    _dataset: Optional["Dataset"] = None,
):
    """Initializes a Record with fields, metadata, vectors, responses, suggestions, external_id, and id.
    Records are typically defined as flat dictionary objects with fields, metadata, vectors, responses, and suggestions
    and passed to Dataset.DatasetRecords.add() as a list of dictionaries.

    Args:
        id: An id for the record. If not provided, a UUID will be generated.
        fields: A dictionary of fields for the record.
        metadata: A dictionary of metadata for the record.
        vectors: A dictionary of vectors for the record.
        responses: A list of Response objects for the record.
        suggestions: A list of Suggestion objects for the record.
        _server_id: An id for the record. (Read-only and set by the server)
        _dataset: The dataset object to which the record belongs.
    """

    if fields is None and metadata is None and vectors is None and responses is None and suggestions is None:
        raise ValueError("At least one of fields, metadata, vectors, responses, or suggestions must be provided.")
    if fields is None and id is None:
        raise ValueError("If fields are not provided, an id must be provided.")
    if fields == {} and id is None:
        raise ValueError("If fields are an empty dictionary, an id must be provided.")

    self._dataset = _dataset
    self._model = RecordModel(external_id=id, id=_server_id)
    self.__fields = RecordFields(fields=fields, record=self)
    self.__vectors = RecordVectors(vectors=vectors)
    self.__metadata = RecordMetadata(metadata=metadata)
    self.__responses = RecordResponses(responses=responses, record=self)
    self.__suggestions = RecordSuggestions(suggestions=suggestions, record=self)
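
The three validation checks at the top of the constructor decide which argument combinations are accepted. The sketch below copies those conditions into a standalone function so they can be tried without Argilla installed; `check_record_args` and `is_accepted` are hypothetical names and not part of the library.

```python
from typing import Any, Dict, List, Optional


def check_record_args(
    id: Optional[str] = None,
    fields: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    vectors: Optional[Dict[str, Any]] = None,
    responses: Optional[List[Any]] = None,
    suggestions: Optional[List[Any]] = None,
) -> None:
    """Raise ValueError under the same three conditions as the constructor above."""
    if all(arg is None for arg in (fields, metadata, vectors, responses, suggestions)):
        raise ValueError("At least one of fields, metadata, vectors, responses, or suggestions must be provided.")
    if fields is None and id is None:
        raise ValueError("If fields are not provided, an id must be provided.")
    if fields == {} and id is None:
        raise ValueError("If fields are an empty dictionary, an id must be provided.")


def is_accepted(**kwargs: Any) -> bool:
    """Return True when the argument combination passes all three checks."""
    try:
        check_record_args(**kwargs)
    except ValueError:
        return False
    return True


print(is_accepted(fields={"text": "hi"}))                        # True
print(is_accepted())                                             # False
print(is_accepted(metadata={"department": "toys"}))              # False
print(is_accepted(id="rec-1", metadata={"department": "toys"}))  # True
print(is_accepted(fields={}))                                    # False
```

In short, a record without fields is only accepted when it carries an id, which is what allows partial updates to existing records.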

get()

Retrieves the record from the server.

Source code in src/argilla/records/_resource.py
def get(self) -> "Record":
    """Retrieves the record from the server."""
    model = self._client.api.records.get(self._server_id)
    instance = self.from_model(model, dataset=self.dataset)
    self.__dict__ = instance.__dict__

    return self
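
get() refreshes the record in place by swapping its `__dict__` for that of a freshly built instance, so existing references to the record see the new state. The sketch below shows the same technique on a hypothetical `Snapshot` class that has nothing to do with Argilla's internals.

```python
class Snapshot:
    """Hypothetical stand-in for a resource that can refresh itself."""

    def __init__(self, status: str) -> None:
        self.status = status

    def refresh_from(self, fresh: "Snapshot") -> "Snapshot":
        # Adopt the fresh instance's attributes while keeping this object's identity.
        self.__dict__ = fresh.__dict__
        return self


local = Snapshot(status="pending")
alias = local  # second reference to the same object
returned = local.refresh_from(Snapshot(status="completed"))

print(returned is local, alias.status)  # True completed
```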

serialize()

Serializes the Record to a dictionary for interaction with the API

Source code in src/argilla/records/_resource.py
def serialize(self) -> Dict[str, Any]:
    """Serializes the Record to a dictionary for interaction with the API"""
    serialized_model = self._model.model_dump()
    serialized_suggestions = [suggestion.serialize() for suggestion in self.__suggestions]
    serialized_responses = [response.serialize() for response in self.__responses]
    serialized_model["responses"] = serialized_responses
    serialized_model["suggestions"] = serialized_suggestions

    return serialized_model

to_dict()

Converts a Record object to a dictionary for export.

Returns:

A dictionary representing the record with the keys "id", "fields", "metadata", "suggestions", "responses", "vectors", "status", and "_server_id". Each field and question is represented as a key-value pair in the dictionary of the respective key, e.g. {"fields": {"prompt": "...", "response": "..."}, "responses": {"rating": "..."}, ...}.

Source code in src/argilla/records/_resource.py
def to_dict(self) -> Dict[str, Dict]:
    """Converts a Record object to a dictionary for export.
    Returns:
        A dictionary representing the record where the keys are "fields",
        "metadata", "suggestions", and "responses". Each field and question is
        represented as a key-value pair in the dictionary of the respective key. i.e.
        `{"fields": {"prompt": "...", "response": "..."}, "responses": {"rating": "..."},
    """
    id = str(self.id) if self.id else None
    server_id = str(self._model.id) if self._model.id else None
    status = self.status
    fields = self.fields.to_dict()
    metadata = self.metadata.to_dict()
    suggestions = self.suggestions.to_dict()
    responses = self.responses.to_dict()
    vectors = self.vectors.to_dict()

    # TODO: Review model attributes when to_dict and serialize methods are unified
    return {
        "id": id,
        "fields": fields,
        "metadata": metadata,
        "suggestions": suggestions,
        "responses": responses,
        "vectors": vectors,
        "status": status,
        "_server_id": server_id,
    }
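
As an illustration of the exported shape, the sketch below builds a hand-written dictionary with the eight top-level keys returned above. All values are hypothetical (a dataset with one "text" field and one "rating" question is assumed), and the inner keys of the suggestion and response entries are illustrative only.

```python
# Top-level keys taken from the return statement of to_dict().
EXPORT_KEYS = ("id", "fields", "metadata", "suggestions", "responses", "vectors", "status", "_server_id")

# Hypothetical export; every value here is made up for illustration.
exported = {
    "id": "rec-1",
    "fields": {"text": "Hello World, how are you?"},
    "metadata": {"department": "toys"},
    "suggestions": {"rating": {"value": 4}},   # one entry per question name
    "responses": {"rating": [{"value": 5}]},   # a list of entries per question name
    "vectors": {},
    "status": "pending",
    "_server_id": None,
}

missing = [key for key in EXPORT_KEYS if key not in exported]
print(missing)  # []
```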

from_dict(data, dataset=None) classmethod

Converts a dictionary to a Record object.

Parameters:

data: A dictionary representing the record.
dataset: The dataset object to which the record belongs.

Returns:

A Record object.

Source code in src/argilla/records/_resource.py
@classmethod
def from_dict(cls, data: Dict[str, Dict], dataset: Optional["Dataset"] = None) -> "Record":
    """Converts a dictionary to a Record object.
    Args:
        data: A dictionary representing the record.
        dataset: The dataset object to which the record belongs.
    Returns:
        A Record object.
    """
    fields = data.get("fields", {})
    metadata = data.get("metadata", {})
    suggestions = data.get("suggestions", {})
    responses = data.get("responses", {})
    vectors = data.get("vectors", {})
    record_id = data.get("id", None)
    _server_id = data.get("_server_id", None)

    suggestions = [Suggestion(question_name=question_name, **value) for question_name, value in suggestions.items()]
    responses = [
        Response(question_name=question_name, **value)
        for question_name, _responses in responses.items()
        for value in _responses
    ]

    return cls(
        id=record_id,
        fields=fields,
        suggestions=suggestions,
        responses=responses,
        vectors=vectors,
        metadata=metadata,
        _dataset=dataset,
        _server_id=_server_id,
    )
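
The two comprehensions in from_dict turn nested dictionaries keyed by question name into flat lists. The sketch below reproduces that flattening with stand-in dataclasses; `SuggestionStub`, `ResponseStub`, and `flatten` are hypothetical, and the `value` and `user_id` keys are assumptions about what each entry carries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class SuggestionStub:  # stand-in, not argilla's Suggestion
    question_name: str
    value: Any


@dataclass
class ResponseStub:  # stand-in, not argilla's Response
    question_name: str
    value: Any
    user_id: str


def flatten(data: Dict[str, Dict]) -> Tuple[List[SuggestionStub], List[ResponseStub]]:
    """Mirror the suggestion and response comprehensions used in from_dict."""
    suggestions = [
        SuggestionStub(question_name=name, **value)
        for name, value in data.get("suggestions", {}).items()
    ]
    responses = [
        ResponseStub(question_name=name, **value)
        for name, values in data.get("responses", {}).items()
        for value in values  # each question maps to a list of responses
    ]
    return suggestions, responses


data = {
    "suggestions": {"rating": {"value": 4}},
    "responses": {"rating": [{"value": 5, "user_id": "u1"}, {"value": 3, "user_id": "u2"}]},
}
suggestions, responses = flatten(data)
print(len(suggestions), len(responses))  # 1 2
```

Each question has one suggestion entry but may have several responses, which is why only the responses comprehension has a second loop.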

from_model(model, dataset) classmethod

Converts a RecordModel object to a Record object.

Parameters:

model: A RecordModel object.
dataset: The dataset object to which the record belongs.

Returns:

A Record object.

Source code in src/argilla/records/_resource.py
@classmethod
def from_model(cls, model: RecordModel, dataset: "Dataset") -> "Record":
    """Converts a RecordModel object to a Record object.
    Args:
        model: A RecordModel object.
        dataset: The dataset object to which the record belongs.
    Returns:
        A Record object.
    """
    instance = cls(
        id=model.external_id,
        fields=model.fields,
        metadata={meta.name: meta.value for meta in model.metadata},
        vectors={vector.name: vector.vector_values for vector in model.vectors},
        _dataset=dataset,
        responses=[],
        suggestions=[],
    )

    # set private attributes
    instance._dataset = dataset
    instance._model = model

    # Responses and suggestions are computed separately based on the record model
    instance.responses.from_models(model.responses)
    instance.suggestions.from_models(model.suggestions)

    return instance
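
from_model converts the model's metadata and vector lists into dictionaries keyed by name before building the Record. The sketch below shows those two dict comprehensions on stand-in named tuples; `MetadataItem` and `VectorItem` are hypothetical and only expose the attributes read in the source above.

```python
from typing import Any, List, NamedTuple


class MetadataItem(NamedTuple):  # stand-in for one entry of model.metadata
    name: str
    value: Any


class VectorItem(NamedTuple):  # stand-in for one entry of model.vectors
    name: str
    vector_values: List[float]


model_metadata = [MetadataItem("department", "toys"), MetadataItem("price", 12)]
model_vectors = [VectorItem("embedding", [0.1, 0.2, 0.3])]

# Same comprehensions as in from_model, applied to the stand-in entries.
metadata = {meta.name: meta.value for meta in model_metadata}
vectors = {vector.name: vector.vector_values for vector in model_vectors}

print(metadata)  # {'department': 'toys', 'price': 12}
print(vectors)   # {'embedding': [0.1, 0.2, 0.3]}
```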