Quickstart¶
Argilla is a free, open-source, self-hosted tool. This means you need to deploy its UI to start using it. There is two main ways to deploy Argilla:
Deploy on the Hugging Face Hub
The recommended choice to get started. You can get up and running in under 5 minutes and don't need to maintain a server or run any commands.
If you're just getting started with Argilla, click the deploy button below:
You can use the default values following these steps:- Leave the default Space owner (your personal account)
- Leave
USERNAME
andPASSWORD
secrets empty since you'll sign in with your HF user as the Argilla Spaceowner
. - Click create Space to launch Argilla 🚀.
- Once you see the Argilla UI, go to the Sign in into the Argilla UI section. If you see the
Building
message for longer than 2-3 min refresh the page.
Persistent storage SMALL
Not setting persistent storage to SMALL
means that you will loose your data when the Space restarts. Spaces get restarted due to maintainance, inactivity, and every time you change your Spaces settings. If you want to use the Space just for testing you can use FREE
temporarily.
If you want to deploy Argilla within a Hugging Face organization, setup a more stable Space, or understand the settings, check out the HF Spaces settings guide.
Deploy with Docker
If you want to run Argilla locally on your machine or a server, or tune the server configuration, choose this option. To use this option, check this guide.
Sign in into the Argilla UI¶
If everything went well, you should see the Argilla sign in page that looks like this:
Building errors
If you get a build error, sometimes restarting the Space from the Settings page works, otherwise check the HF Spaces settings guide.
In the sign in page:
- Click on Sign in with Hugging Face
- Authorize the application and you will be logged in into Argilla as an
owner
.
Unauthorized error
Sometimes, after authorizing you'll see an unauthorized error, and get redirected to the sign in page. Typically, clicking the Sign in button solves the issue.
Congrats! Your Argilla server is ready to start your first project using the Python SDK. You now have full rights to create datasets. Follow the instructions in the home page, or keep reading this guide if you want a more detailed explanation.
Install the Python SDK¶
To manage workspaces and datasets in Argilla, you need to use the Argilla Python SDK. You can install it with pip as follows:
Create your first dataset¶
For getting started with Argilla and its SDK, we recommend to use Jupyter Notebook or Google Colab.
To start interacting with your Argilla server, you need to create a instantiate a client with an API key and API URL:
-
The
<api_key>
is in theMy Settings
page of your Argilla Space. -
The
<api_url>
is the URL shown in your browser if it ends with*.hf.space
.
You can't find your API URL
If you're using Spaces, sometimes the Argilla UI is embedded into the Hub UI so the URL of the browser won't match the API URL. In these scenarios, there are two options:
1. Click on the three points menu at the top of the Space, select "Embed this Space", and open the direct URL.
2. Use this pattern: https://[your-owner-name]-[your_space_name].hf.space
.
To create a dataset with a simple text classification task, first, you need to define the dataset settings.
settings = rg.Settings(
guidelines="Classify the reviews as positive or negative.",
fields=[
rg.TextField(
name="review",
title="Text from the review",
use_markdown=False,
),
],
questions=[
rg.LabelQuestion(
name="my_label",
title="In which category does this article fit?",
labels=["positive", "negative"],
)
],
)
Now you can create the dataset with these settings. Publish the dataset to make it available in the UI and add the records.
About workspaces
Workspaces in Argilla group datasets and user access rights. The workspace
parameter is optional in this case. If you don't specify it, the dataset will be created in the default workspace argilla
.
By default, this workspace will be visible to users joining with the Sign in with Hugging Face button. You can create other workspaces and decide to grant access to users either with the SDK or the changing the OAuth configuration.
dataset = rg.Dataset(
name=f"my_first_dataset",
settings=settings,
client=client,
#workspace="argilla"
)
dataset.create()
Now you can add records to your dataset. We will use the IMDB dataset from the Hugging Face Datasets library as an example. The mapping
parameter indicates which keys/columns in the source dataset correspond to the Argilla dataset fields.
from datasets import load_dataset
data = load_dataset("imdb", split="train[:100]").to_list()
dataset.records.log(records=data, mapping={"text": "review"})
🎉 You have successfully created your first dataset with Argilla. You can now access it in the Argilla UI and start annotating the records.
Next steps¶
-
To learn how to create your datasets, workspace, and manage users, check the how-to guides.
-
To learn Argilla with hands-on examples, check the Tutorials section.
-
To further configure your Argilla Space, check the Hugging Face Spaces settings guide.