What is Argilla?

Argilla is an open-source data curation platform for LLMs. Using Argilla, everyone can build robust language models through faster data curation using both human and machine feedback. We provide support for each step in the MLOps cycle, from data labeling to model monitoring.

📄 About The Docs

| Section | Goal |
| --- | --- |
| 🚀 Quickstart | Install Argilla and run end-to-end toy examples |
| 🎼 Cheatsheet | Brief code snippets for our main functionalities |
| 🔧 Installation | Everything about deployment: Docker, Kubernetes, cloud, and more |
| ⚙️ Configuration | User management and deployment tweaking |
| 💥 LLMs | Generative AI, ChatGPT and friends |
| 🦮 Guides | Conceptual overview of our main functionalities |
| 🧗‍♀️ Tutorials | Specific applied end-to-end examples |
| 🏷️ References | Itemized information and API docs |
| 🏘️ Community | Everything for developers and about contributing |
| 🗺️ Roadmap | Our future plans |

📏 Principles

  • Open: Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can use and combine your preferred libraries without implementing any specific interface.

  • End-to-end: Most annotation tools treat data collection as a one-off activity at the beginning of each project. In real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model goes into production, you want to monitor and analyze its predictions and collect more data to improve your model over time. Argilla is designed to close this gap, enabling you to iterate as much as you need.

  • User and Developer Experience: The key to sustainable NLP solutions is to make it easier for everyone to contribute to projects. Domain experts should feel comfortable interpreting and annotating data. Data scientists should feel free to experiment and iterate. Engineers should feel in control of data pipelines. Argilla optimizes the experience for these core users to make your teams more productive.

  • Beyond hand-labeling: Classical hand-labeling workflows are costly and inefficient, but having humans in the loop is essential. Easily combine hand-labeling with active learning, bulk-labeling, zero-shot models, and weak supervision in novel data annotation workflows.
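
To make the "beyond hand-labeling" idea concrete, here is a minimal sketch of weak supervision in plain Python. It does not use Argilla's API. The rule functions, the label names (`billing`, `bug`), and the helpers `weak_label` and `split_for_annotation` are all hypothetical and exist only for illustration. A few keyword rules vote on each text; texts where the rules agree get a label automatically, and the rest are set aside for a human annotator.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

# A rule returns a label, or None to abstain. All rules below are
# hypothetical examples, not part of any library.
Rule = Callable[[str], Optional[str]]


def rule_refund(text: str) -> Optional[str]:
    return "billing" if "refund" in text.lower() else None


def rule_invoice(text: str) -> Optional[str]:
    return "billing" if "invoice" in text.lower() else None


def rule_crash(text: str) -> Optional[str]:
    return "bug" if "crash" in text.lower() else None


RULES: List[Rule] = [rule_refund, rule_invoice, rule_crash]


def weak_label(text: str, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Majority vote over non-abstaining rules.

    Returns None when every rule abstains or when the top labels tie.
    """
    votes: Counter = Counter()
    for rule in rules:
        label = rule(text)
        if label is not None:
            votes[label] += 1
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave the decision to a human
    return ranked[0][0]


def split_for_annotation(
    texts: Sequence[str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Separate texts the rules can label from texts needing hand-labeling."""
    auto: List[Tuple[str, str]] = []
    manual: List[str] = []
    for text in texts:
        label = weak_label(text)
        if label is None:
            manual.append(text)
        else:
            auto.append((text, label))
    return auto, manual
```

Calling `split_for_annotation` on a batch of texts returns the automatically labeled pairs and the leftover texts. In a real workflow the leftovers would go to annotators, and their labels could in turn be used to refine the rules.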

🫱🏾‍🫲🏼 Contribute

We love contributors and have launched a collaboration with JustDiggit to hand out our very own bunds and help the re-greening of sub-Saharan Africa. To make contributing easier, we have created developer and contributor docs. Additionally, you can always schedule a meeting with our Developer Advocacy team so they can get you up to speed.

🥇 Contributors

๐Ÿ˜๏ธ Community#

🙋‍♀️ Join the Argilla community on Slack and get direct support from the community.

⭐ Star the Argilla GitHub repo to stay updated about new releases and tutorials.

🎁 We've just printed stickers! Would you like some? Order stickers for free.

🗺️ Roadmap

We continuously update our plans and roadmap, and we love to discuss them with our community. Feel encouraged to participate.