Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased¶
2.4.0¶
Added¶
- Added `Argilla.deploy_on_spaces` to deploy the Argilla server on Hugging Face Spaces. (#5547)
Changed¶
- Changed `Dataset.from_hub` method to open configure URL when `settings="ui"`. (#5622)
- Terms metadata properties accept other values than `str`. (#5594)
- Added support for `with_vectors` while fetching records along with a search query. (#5638)
Removed¶
- Removed name sanitizing for dataset settings names. This may cause issues with old server versions, especially when working with `from_hub`. (#5574)
2.3.0¶
Added¶
- Added support for `CustomField`. (#5422)
- Added `inserted_at` and `updated_at` to `Resource` model as properties. (#5540)
- Added `limit` argument when fetching records. (#5525)
- Added similarity search support. (#5546)
- Added filter support for `id`, `_server_id`, `inserted_at` and `updated_at` record attributes. (#5545)
- Added support to read argilla credentials from colab secrets. (#5541)
Changed¶
- Changed the repr method for `SettingsProperties` to display the details of all the properties in `Setting` object. (#5380)
- Changed error messages when creating datasets with insufficient permissions. (#5540)
Fixed¶
- Fixed serialization of `ChatField` when collecting records from the hub and exporting to `datasets`. (#5554)
2.2.2¶
Fixed¶
- Fixed `from_hub` with unsupported column names. (#5524)
- Fixed `from_hub` with missing dataset `subset` configuration value. (#5524)
Changed¶
- Changed `from_hub` to only generate fields not questions for strings in dataset. (#5524)
2.2.1¶
Fixed¶
- Fixed `from_hub` errors when column names contain uppercase letters. (#5523)
- Fixed `from_hub` errors when class feature values contain unlabelled values. (#5523)
- Fixed `from_hub` errors when loading cached datasets. (#5523)
2.2.0¶
- Added new `ChatField` supporting chat messages. (#5376)
- Added template settings to `rg.Settings` for classification, rating, and ranking questions. (#5426)
- Added `rg.Settings` definition based on `datasets.Features` within `rg.Dataset.from_hub`. (#5426)
- Added persistent record mapping to `rg.Settings` to be used in `rg.Dataset.records.log`. (#5466)
- Added multiple error handling methods to the `rg.Dataset.records.log` method to warn, ignore, or raise errors. (#5466)
- Changed dataset import and export of `rg.LabelQuestion` to use `datasets.ClassLabel` not `datasets.Value`. (#5474)
2.1.0¶
Added¶
- Added new `ImageField` supporting URLs and Data URLs. (#5279)
- Added dark mode (#5412)
- Added settings parameter to `rg.Dataset.from_hub` to define the dataset settings before ingesting a dataset from the hub. (#5418)
2.0.1¶
Fixed¶
- Fixed error when creating optional fields. (#5362)
- Fixed error creating integer and float metadata with `visible_for_annotators`. (#5364)
- Fixed error when logging records with `suggestions` or `responses` for non-existent questions. (#5396 by @maxserras)
- Fixed error from conflicts in testing suite when running tests in parallel. (#5349)
- Fixed error in response model when creating a response with a `None` value. (#5343)
Changed¶
- Changed `from_hub` method to raise an error when a dataset with the same name exists. (#5258)
- Changed `log` method when ingesting records with no known keys to raise a descriptive error. (#5356)
- Changed `code snippets` to add new datasets (#5395)
Added¶
- Added Google Analytics to the documentation site. (#5366)
- Added frontend skeletons to progress metrics to optimise load time and improve user experience. (#5391)
- Added documentation in methods in API references for the Python SDK. (#5400)
Fixed¶
- Fixed a bug where submitting the latest record sometimes navigated to a non-existent page. (#5419)
2.0.0¶
Added¶
- Added core class refactors. For an overview, see this blog post
- Added `TaskDistribution` to define distribution of records to users.
- Added new documentation site and structure and migrated legacy documentation.
Changed¶
- Changed `FeedbackDataset` to `Dataset`.
- Changed `rg.init` into `rg.Argilla` class to interact with Argilla server.
Deprecated¶
- Deprecated task specific dataset classes like `TextClassification` and `TokenClassification`. To migrate legacy datasets to `rg.Dataset` class, see the how-to-guide.
- Deprecated use case extensions like `listeners` and `ArgillaTrainer`.
2.0.0rc1¶
[!NOTE] This release for 2.0.0rc1 does not contain any changelog entries because it is the first release candidate for the 2.0.0 version. The following versions will contain the changelog entries again. For a general overview of the changes in the 2.0.0 version, please refer to our blog or our new documentation.
1.29.0¶
Added¶
- Added support for rating questions to include `0` as a valid value. (#4860)
- Added support for Python 3.12. (#4837)
- Added search by field in the `FeedbackDataset` UI search. (#4746)
- Added record metadata info in the `FeedbackDataset` UI. (#4851)
- Added highlight on search results in the `FeedbackDataset` UI. (#4747)
Fixed¶
- Fix wildcard import for the whole argilla module. (#4874)
- Fix issue when record does not have vectors related. (#4856)
- Fix issue on character level. (#4836)
1.28.0¶
Added¶
- Added suggestion multi score attribute. (#4730)
- Added order by suggestion first. (#4731)
- Added multi selection entity dropdown for span annotation overlap. (#4735)
- Added pre selection highlight for span annotation. (#4726)
- Added banner when persistent storage is not enabled. (#4744)
- Added support on Python SDK for new multi-label questions `labels_order` attribute. (#4757)
Changed¶
- Changed how the Hugging Face space and user are shown at sign-in. (#4748)
Fixed¶
- Fixed Korean character reversed. (#4753)
Fixed¶
- Fixed requirements for version of wrapt library conflicting with Python 3.11 (#4693)
1.27.0¶
Added¶
- Added support for overlapping spans in the `FeedbackDataset`. (#4668)
- Added `allow_overlapping` parameter for span questions. (#4697)
- Added overall progress bar on `Datasets` table. (#4696)
- Added German language translation. (#4688)
Changed¶
- New UI design for suggestions. (#4682)
Fixed¶
- Improve performance for more than 250 labels. (#4702)
1.26.1¶
Added¶
- Added support for automatic detection of RTL languages. (#4686)
1.26.0¶
Added¶
- If you expand the labels of a single or multi label Question, the state is maintained during the entire annotation process. (#4630)
- Added support for span questions in the Python SDK. (#4617)
- Added support for span values in suggestions and responses. (#4623)
- Added `span` questions for `FeedbackDataset`. (#4622)
- Added `ARGILLA_CACHE_DIR` environment variable to configure the client cache directory. (#4509)
Fixed¶
- Fixed contextualized workspaces. (#4665)
- Fixed prepare for training when passing `RankingValueSchema` instances to suggestions. (#4628)
- Fixed parsing ranking values in suggestions from HF datasets. (#4629)
- Fixed reading description from API response payload. (#4632)
- Fixed pulling (n*chunk_size)+1 records when using `ds.pull` or iterating over the dataset. (#4662)
- Fixed client's resolution of enum values when calling the Search and Metrics API, to support Python >=3.11 enum handling. (#4672)
1.25.0¶
[!NOTE] For changes in the argilla-server module, visit the argilla-server release notes
Added¶
- Reorder labels in `dataset settings page` for single/multi label questions (#4598)
- Added pandas v2 support using the Python SDK. (#4600)
Removed¶
- Removed `missing` response for status filter. Use `pending` instead. (#4533)
Fixed¶
- Fixed FloatMetadataProperty: value is not a valid float (#4570)
- Fixed redirect to `user-settings` instead of 404 `user_settings` (#4609)
1.24.0¶
[!NOTE] This release does not contain any new features, but it includes a major change in the `argilla-server` dependency. The package is using the `argilla-server` dependency defined here. (#4537)
Changed¶
1.23.1¶
Fixed¶
- Fixed Responsive view for Feedback Datasets. (#4579)
1.23.0¶
Added¶
- Added bulk annotation by filter criteria. (#4516)
- Automatically fetch new datasets on focus tab. (#4514)
- API v1 responses returning `Record` schema now always include `dataset_id` as attribute. (#4482)
- API v1 responses returning `Response` schema now always include `record_id` as attribute. (#4482)
- API v1 responses returning `Question` schema now always include `dataset_id` attribute. (#4487)
- API v1 responses returning `Field` schema now always include `dataset_id` attribute. (#4488)
- API v1 responses returning `MetadataProperty` schema now always include `dataset_id` attribute. (#4489)
- API v1 responses returning `VectorSettings` schema now always include `dataset_id` attribute. (#4490)
- Added `pdf_to_html` function to `.html_utils` module that converts PDFs to dataURL to be able to render them in the Argilla UI. (#4481)
- Added `ARGILLA_AUTH_SECRET_KEY` environment variable. (#4539)
- Added `ARGILLA_AUTH_ALGORITHM` environment variable. (#4539)
- Added `ARGILLA_AUTH_TOKEN_EXPIRATION` environment variable. (#4539)
- Added `ARGILLA_AUTH_OAUTH_CFG` environment variable. (#4546)
- Added OAuth2 support for HuggingFace Hub. (#4546)
Deprecated¶
- Deprecated `ARGILLA_LOCAL_AUTH_*` environment variables. Will be removed in the release v1.25.0. (#4539)
Changed¶
- Changed regex pattern for `username` attribute in `UserCreate`. Now uppercase letters are allowed. (#4544)
Removed¶
- Remove sending `Authorization` header from Python SDK requests. (#4535)
Fixed¶
- Fixed keyboard shortcut for label questions. (#4530)
1.22.0¶
Added¶
- Added Bulk annotation support. (#4333)
- Restore filters from feedback dataset settings. (#4461)
- Warning on feedback dataset settings when leaving page with unsaved changes. (#4461)
- Added pydantic v2 support using the python SDK. (#4459)
- Added `vector_settings` to the `__repr__` method of the `FeedbackDataset` and `RemoteFeedbackDataset`. (#4454)
- Added integration for `sentence-transformers` using `SentenceTransformersExtractor` to configure `vector_settings` in `FeedbackDataset` and `FeedbackRecord`. (#4454)
Changed¶
- Module `argilla.cli.server` definitions have been moved to `argilla.server.cli` module. (#4472)
- [breaking] Changed `vector_settings_by_name` for generic `property_by_name` usage, which will return `None` instead of raising an error. (#4454)
- The constant definition `ES_INDEX_REGEX_PATTERN` in module `argilla._constants` is now private. (#4472)
- `nan` values in metadata properties will raise a 422 error when creating/updating records. (#4300)
- `None` values are now allowed in metadata properties. (#4300)
- Refactor and add `width`, `height`, `autoplay` and `loop` attributes as optional args in `to_html` functions. (#4481)
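The two metadata rules in this list (`nan` rejected, `None` accepted) can be pictured with a small local pre-check. This is only a sketch: `check_metadata_value` is a hypothetical helper, not part of the Argilla SDK, and raising `ValueError` is an assumption standing in for the 422 response the server is described as returning.

```python
import math
from typing import Any


def check_metadata_value(value: Any) -> Any:
    """Hypothetical pre-check mirroring the rules stated in this changelog."""
    # nan is rejected; the changelog says the server answers with a 422 error.
    if isinstance(value, float) and math.isnan(value):
        raise ValueError("nan is not a valid metadata value")
    # None and every other value pass through unchanged.
    return value
```

Running such a check before creating or updating records would surface the problem locally instead of as an HTTP error.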
Fixed¶
- Paginating to a new record, automatically scrolls down to selected form area. (#4333)
Deprecated¶
- The `missing` response status for filtering records is deprecated and will be removed in the release v1.24.0. Use `pending` instead. (#4433)
Removed¶
- The deprecated `python -m argilla database` command has been removed. (#4472)
1.21.0¶
Added¶
- Added new draft queue for annotation view (#4334)
- Added annotation metrics module for the `FeedbackDataset` (`argilla.client.feedback.metrics`). (#4175)
- Added strategy to handle and translate errors from the server for `401` HTTP status code (#4362)
- Added integration for `textdescriptives` using `TextDescriptivesExtractor` to configure `metadata_properties` in `FeedbackDataset` and `FeedbackRecord`. (#4400). Contributed by @m-newhauser
- Added `POST /api/v1/me/responses/bulk` endpoint to create responses in bulk for current user. (#4380)
- Added list support for term metadata properties. (Closes #4359)
- Added new CLI task to reindex datasets and records into the search engine. (#4404)
- Added `httpx_extra_kwargs` argument to `rg.init` and `Argilla` to allow passing extra arguments to `httpx.Client` used by `Argilla`. (#4440)
- Added `ResponseStatusFilter` enum in `__init__` imports of Argilla (#4118). Contributed by @Piyush-Kumar-Ghosh.
Changed¶
- More productive and simpler shortcut system (#4215)
- Move `ArgillaSingleton`, `init` and `active_client` to a new module `singleton`. (#4347)
- Updated `argilla.load` functions to also work with `FeedbackDataset`s. (#4347)
- [breaking] Updated `argilla.delete` functions to also work with `FeedbackDataset`s. It now raises an error if the dataset does not exist. (#4347)
- Updated `argilla.list_datasets` functions to also work with `FeedbackDataset`s. (#4347)
Fixed¶
- Fixed error in `TextClassificationSettings.from_dict` method in which the `label_schema` created was a list of `dict` instead of a list of `str`. (#4347)
- Fixed total records on pagination component (#4424)
Removed¶
- Removed `draft` auto save for annotation view (#4334)
1.20.0¶
Added¶
- Added `GET /api/v1/datasets/:dataset_id/records/search/suggestions/options` endpoint to return suggestion available options for searching. (#4260)
- Added `metadata_properties` to the `__repr__` method of the `FeedbackDataset` and `RemoteFeedbackDataset`. (#4192)
- Added `get_model_kwargs`, `get_trainer_kwargs`, `get_trainer_model`, `get_trainer_tokenizer` and `get_trainer` methods to the `ArgillaTrainer` to improve interoperability across frameworks. (#4214)
- Added additional formatting checks to the `ArgillaTrainer` to allow for better interoperability of `defaults` and `formatting_func` usage. (#4214)
- Added a warning to the `update_config` method of `ArgillaTrainer` to emphasize if the `kwargs` were updated correctly. (#4214)
- Added `argilla.client.feedback.utils` module with `html_utils` (this mainly includes `video/audio/image_to_html` that convert media to dataURL to be able to render them in the Argilla UI and `create_token_highlights` to highlight tokens in a custom way. Both work on TextQuestion and TextField with use_markdown=True) and `assignments` (this mainly includes `assign_records` to assign records according to a number of annotators and records, an overlap and the shuffle option; and `assign_workspace` to assign and create if needed a workspace according to the record assignment). (#4121)
Fixed¶
- Fixed error in `ArgillaTrainer`, with numerical labels, using `RatingQuestion` instead of `RankingQuestion` (#4171)
- Fixed error in `ArgillaTrainer`, now we can train for `extractive_question_answering` using a validation sample (#4204)
- Fixed error in `ArgillaTrainer`, when training for `sentence-similarity` it didn't work with a list of values per record (#4211)
- Fixed error in the unification strategy for `RankingQuestion` (#4295)
- Fixed `TextClassificationSettings.labels_schema` order was not being preserved. Closes #3828 (#4332)
- Fixed error when requesting non-existing API endpoints. Closes #4073 (#4325)
- Fixed error when passing `draft` responses to create records endpoint. (#4354)
Changed¶
- [breaking] Suggestions `agent` field now only accepts some specific characters and a limited length. (#4265)
- [breaking] Suggestions `score` field now only accepts float values in the range `0` to `1`. (#4266)
- Updated `POST /api/v1/dataset/:dataset_id/records/search` endpoint to support optional `query` attribute. (#4327)
- Updated `POST /api/v1/dataset/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. (#4327)
- Updated `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to support optional `query` attribute. (#4270)
- Updated `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to support `filter` and `sort` attributes. (#4270)
- Changed the logging style while pulling and pushing `FeedbackDataset` to Argilla from `tqdm` style to `rich`. (#4267). Contributed by @zucchini-nlp.
- Updated `push_to_argilla` to print `repr` of the pushed `RemoteFeedbackDataset` after push and changed `show_progress` to True by default. (#4223)
- Changed `models` and `tokenizer` for the `ArgillaTrainer` to explicitly allow for changing them when needed. (#4214)
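The breaking `score` rule in this list can be illustrated with a standalone check. This is a minimal sketch of the rule as worded here, not Argilla's actual validator: `validate_suggestion_score` is a hypothetical name, and treating both bounds as inclusive is an assumption, since the entry only says "in the range 0 to 1".

```python
def validate_suggestion_score(score: float) -> float:
    """Hypothetical helper: accept only float scores between 0 and 1.

    Assumption: both bounds are inclusive.
    """
    if not isinstance(score, float):
        raise TypeError(f"score must be a float, got {type(score).__name__}")
    # nan fails this comparison too, so it is rejected as out of range.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    return score
```

Code that previously sent integer or percentage-style scores (for example `87`) would need to rescale them to floats such as `0.87` before creating suggestions.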
1.19.0¶
Added¶
- Added `POST /api/v1/datasets/:dataset_id/records/search` endpoint to search for records without user context, including responses by all users. (#4143)
- Added `POST /api/v1/datasets/:dataset_id/vectors-settings` endpoint for creating vector settings for a dataset. (#3776)
- Added `GET /api/v1/datasets/:dataset_id/vectors-settings` endpoint for listing the vectors settings for a dataset. (#3776)
- Added `DELETE /api/v1/vectors-settings/:vector_settings_id` endpoint for deleting a vector settings. (#3776)
- Added `PATCH /api/v1/vectors-settings/:vector_settings_id` endpoint for updating a vector settings. (#4092)
- Added `GET /api/v1/records/:record_id` endpoint to get a specific record. (#4039)
- Added support to include vectors for `GET /api/v1/datasets/:dataset_id/records` endpoint response using `include` query param. (#4063)
- Added support to include vectors for `GET /api/v1/me/datasets/:dataset_id/records` endpoint response using `include` query param. (#4063)
- Added support to include vectors for `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint response using `include` query param. (#4063)
- Added `show_progress` argument to `from_huggingface()` method to make the progress bar for parsing records process optional. (#4132)
- Added a progress bar for parsing records process to `from_huggingface()` method with `trange` in `tqdm`. (#4132)
- Added to sort by `inserted_at` or `updated_at` for datasets with no metadata. (#4147)
- Added `max_records` argument to `pull()` method for `RemoteFeedbackDataset`. (#4074)
- Added functionality to push your models to the Hugging Face hub with `ArgillaTrainer.push_to_huggingface` (#3976). Contributed by @Racso-3141.
- Added `filter_by` argument to `ArgillaTrainer` to filter by `response_status` (#4120).
- Added `sort_by` argument to `ArgillaTrainer` to sort by `metadata` (#4120).
- Added `max_records` argument to `ArgillaTrainer` to limit record used for training (#4120).
- Added `add_vector_settings` method to local and remote `FeedbackDataset`. (#4055)
- Added `update_vectors_settings` method to local and remote `FeedbackDataset`. (#4122)
- Added `delete_vectors_settings` method to local and remote `FeedbackDataset`. (#4130)
- Added `vector_settings_by_name` method to local and remote `FeedbackDataset`. (#4055)
- Added `find_similar_records` method to local and remote `FeedbackDataset`. (#4023)
- Added `ARGILLA_SEARCH_ENGINE` environment variable to configure the search engine to use. (#4019)
Changed¶
- [breaking] Remove support for Elasticsearch < 8.5 and OpenSearch < 2.4. (#4173)
- [breaking] Users working with OpenSearch engines must use version >=2.4 and set `ARGILLA_SEARCH_ENGINE=opensearch`. (#4019 and #4111)
- [breaking] Changed `FeedbackDataset.*_by_name()` methods to return `None` when no match is found (#4101).
- [breaking] `limit` query parameter for `GET /api/v1/datasets/:dataset_id/records` endpoint now only accepts values greater than or equal to `1` and less than or equal to `1000`. (#4143)
- [breaking] `limit` query parameter for `GET /api/v1/me/datasets/:dataset_id/records` endpoint now only accepts values greater than or equal to `1` and less than or equal to `1000`. (#4143)
- Update `GET /api/v1/datasets/:dataset_id/records` endpoint to fetch record using the search engine. (#4142)
- Update `GET /api/v1/me/datasets/:dataset_id/records` endpoint to fetch record using the search engine. (#4142)
- Update `POST /api/v1/datasets/:dataset_id/records` endpoint to allow creating records with `vectors` (#4022)
- Update `PATCH /api/v1/datasets/:dataset_id` endpoint to allow updating `allow_extra_metadata` attribute. (#4112)
- Update `PATCH /api/v1/datasets/:dataset_id/records` endpoint to allow updating records with `vectors`. (#4062)
- Update `PATCH /api/v1/records/:record_id` endpoint to allow updating a record with `vectors`. (#4062)
- Update `POST /api/v1/me/datasets/:dataset_id/records/search` endpoint to allow searching records with vectors. (#4019)
- Update `BaseElasticAndOpenSearchEngine.index_records` method to also index record vectors. (#4062)
- Update `FeedbackDataset.__init__` to allow passing a list of vector settings. (#4055)
- Update `FeedbackDataset.push_to_argilla` to also push vector settings. (#4055)
- Update `FeedbackDatasetRecord` to support the creation of records with vectors. (#4043)
- Using cosine similarity to compute similarity between vectors. (#4124)
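For readers unfamiliar with the metric named in the last entry, cosine similarity is the dot product of two vectors divided by the product of their norms. The sketch below is a plain-Python illustration of that formula only; it makes no claim about how the search engine behind Argilla computes or scales its scores, and `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Illustrative cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Because the result depends only on direction, vectors that differ only by a positive scale factor are treated as identical.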
Fixed¶
- Fixed svg images out of screen with too large images (#4047)
- Fixed creating records with responses from multiple users. Closes #3746 and #3808 (#4142)
- Fixed deleting or updating responses as an owner for annotators. (Commit 403a66d)
- Fixed passing user_id when getting records by id. (Commit 98c7927)
- Fixed non-basic tags serialized when pushing a dataset to the Hugging Face Hub. Closes #4089 (#4200)
1.18.0¶
Added¶
- New `GET /api/v1/datasets/:dataset_id/metadata-properties` endpoint for listing dataset metadata properties. (#3813)
- New `POST /api/v1/datasets/:dataset_id/metadata-properties` endpoint for creating dataset metadata properties. (#3813)
- New `PATCH /api/v1/metadata-properties/:metadata_property_id` endpoint allowing the update of a specific metadata property. (#3952)
- New `DELETE /api/v1/metadata-properties/:metadata_property_id` endpoint for deletion of a specific metadata property. (#3911)
- New `GET /api/v1/metadata-properties/:metadata_property_id/metrics` endpoint to compute metrics for a specific metadata property. (#3856)
- New `PATCH /api/v1/records/:record_id` endpoint to update a record. (#3920)
- New `PATCH /api/v1/dataset/:dataset_id/records` endpoint to bulk update the records of a dataset. (#3934)
- Missing validations to `PATCH /api/v1/questions/:question_id`. Now `title` and `description` are using the same validations used to create questions. (#3967)
- Added `TermsMetadataProperty`, `IntegerMetadataProperty` and `FloatMetadataProperty` classes allowing to define metadata properties for a `FeedbackDataset`. (#3818)
- Added `metadata_filters` to `filter_by` method in `RemoteFeedbackDataset` to filter based on metadata i.e. `TermsMetadataFilter`, `IntegerMetadataFilter`, and `FloatMetadataFilter`. (#3834)
- Added a validation layer for both `metadata_properties` and `metadata_filters` in their schemas and as part of the `add_records` and `filter_by` methods, respectively. (#3860)
- Added `sort_by` query parameter to listing records endpoints that allows to sort the records by `inserted_at`, `updated_at` or metadata property. (#3843)
- Added `add_metadata_property` method to both `FeedbackDataset` and `RemoteFeedbackDataset` (i.e. `FeedbackDataset` in Argilla). (#3900)
- Added fields `inserted_at` and `updated_at` in `RemoteResponseSchema`. (#3822)
- Added support for `sort_by` for `RemoteFeedbackDataset` i.e. a `FeedbackDataset` uploaded to Argilla. (#3925)
- Added `metadata_properties` support for both `push_to_huggingface` and `from_huggingface`. (#3947)
- Add support for update records (`metadata`) from Python SDK. (#3946)
- Added `delete_metadata_properties` method to delete metadata properties. (#3932)
- Added `update_metadata_properties` method to update `metadata_properties`. (#3961)
- Added automatic model card generation through `ArgillaTrainer.save` (#3857)
- Added `FeedbackDatasetTaskTemplateMixin` for pre-defined task templates. (#3969)
- A maximum limit of 50 on the number of options a ranking question can accept. (#3975)
- New `last_activity_at` field to `FeedbackDataset` exposing when the last activity for the associated dataset occurs. (#3992)
Changed¶
- `GET /api/v1/datasets/{dataset_id}/records`, `GET /api/v1/me/datasets/{dataset_id}/records` and `POST /api/v1/me/datasets/{dataset_id}/records/search` endpoints to return the `total` number of records. (#3848, #3903)
- Implemented `__len__` method for filtered datasets to return the number of records matching the provided filters. (#3916)
- Increase the default max result window for Elasticsearch created for Feedback datasets. (#3929)
- Force elastic index refresh after records creation. (#3929)
- Validate metadata fields for filtering and sorting in the Python SDK. (#3993)
- Using metadata property name instead of id for indexing data in search engine index. (#3994)
Fixed¶
- Fixed response schemas to allow `values` to be `None` i.e. when a record is discarded the `response.values` are set to `None`. (#3926)
1.17.0¶
Added¶
- Added fields `inserted_at` and `updated_at` in `RemoteResponseSchema` (#3822).
- Added automatic model card generation through `ArgillaTrainer.save` (#3857).
- Added task templates to the `FeedbackDataset` (#3973).
Changed¶
- Updated `Dockerfile` to use multi stage build (#3221 and #3793).
- Updated active learning for text classification notebooks to use the most recent small-text version (#3831).
- Changed argilla dataset name in the active learning for text classification notebooks to be consistent with the default names in the huggingface spaces (#3831).
- FeedbackDataset API methods have been aligned to be accessible through the several implementations (#3937).
- The `unify_responses` support for remote datasets (#3937).
Fixed¶
- Fix field not shown in the order defined in the dataset settings. Closes #3959 (#3984)
- Updated active learning for text classification notebooks to pass ids of type int to `TextClassificationRecord` (#3831).
- Fixed record fields validation that was preventing from logging records with optional fields (i.e. `required=True`) when the field value was `None` (#3846).
- Always set `pretrained_model_name_or_path` attribute as string in `ArgillaTrainer` (#3914).
- The `inserted_at` and `updated_at` attributes are created using the `utcnow` factory to avoid unexpected race conditions on timestamp creation (#3945)
- Fixed `configure_dataset_settings` when providing the workspace via the arg `workspace` (#3887).
- Fixed saving of models trained with `ArgillaTrainer` with a `peft_config` parameter (#3795).
- Fixed backwards compatibility on `from_huggingface` when loading a `FeedbackDataset` from the Hugging Face Hub that was previously dumped using another version of Argilla, starting at 1.8.0, when it was first introduced (#3829).
- Fixed wrong `__repr__` problem for `TrainingTask`. (#3969)
- Fixed wrong key return error `prepare_for_training_with_*` for `TrainingTask`. (#3969)
Deprecated¶
- Function `rg.configure_dataset` is deprecated in favour of `rg.configure_dataset_settings`. The former will be removed in version 1.19.0
1.16.0¶
Added¶
- Added `ArgillaTrainer` integration with sentence-transformers, allowing fine tuning for sentence similarity (#3739)
- Added `ArgillaTrainer` integration with `TrainingTask.for_question_answering` (#3740)
- Added `Auto save record` to save automatically the current record that you are working on (#3541)
- Added `ArgillaTrainer` integration with OpenAI, allowing fine tuning for chat completion (#3615)
- Added `workspaces list` command to list Argilla workspaces (#3594).
- Added `datasets list` command to list Argilla datasets (#3658).
- Added `users create` command to create users (#3667).
- Added `whoami` command to get current user (#3673).
- Added `users delete` command to delete users (#3671).
- Added `users list` command to list users (#3688).
- Added `workspaces delete-user` command to remove a user from a workspace (#3699).
- Added `workspaces create` command to create an Argilla workspace (#3676).
- Added `datasets push-to-hub` command to push a `FeedbackDataset` from Argilla into the HuggingFace Hub (#3685).
- Added `info` command to get info about the used Argilla client and server (#3707).
- Added `datasets delete` command to delete a `FeedbackDataset` from Argilla (#3703).
- Added `created_at` and `updated_at` properties to `RemoteFeedbackDataset` and `FilteredRemoteFeedbackDataset` (#3709).
- Added handling `PermissionError` when executing a command with a logged in user with not enough permissions (#3717).
- Added `workspaces add-user` command to add a user to workspace (#3712).
- Added `workspace_id` param to `GET /api/v1/me/datasets` endpoint (#3727).
- Added `workspace_id` arg to `list_datasets` in the Python SDK (#3727).
- Added `argilla` script that allows to execute Argilla CLI using the `argilla` command (#3730).
- Added support for passing already initialized `model` and `tokenizer` instances to the `ArgillaTrainer` (#3751)
- Added `server_info` function to check the Argilla server information (also accessible via `rg.server_info`) (#3772).
Changed¶
- Move `database` commands under `server` group of commands (#3710)
- `server` commands only included in the CLI app when `server` extra requirements are installed (#3710).
- Updated `PUT /api/v1/responses/{response_id}` to replace `values` stored with received `values` in request (#3711).
- Display a `UserWarning` when the `user_id` in `Workspace.add_user` and `Workspace.delete_user` is the ID of a user with the owner role as they don't require explicit permissions (#3716).
- Rename `tasks` sub-package to `cli` (#3723).
- Changed `argilla database` command in the CLI to now be accessed via `argilla server database`, to be deprecated in the upcoming release (#3754).
- Changed `visible_options` (of label and multi label selection questions) validation in the backend to check that the provided value is greater than or equal to 3 and less than or equal to the number of provided options (#3773).
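The `visible_options` bounds from the last entry can be written out as a short check. This is a sketch of the stated rule only, not the backend's code: `validate_visible_options` is a hypothetical name, and letting `None` pass through as "not set" is an assumption.

```python
from typing import Optional, Sequence


def validate_visible_options(
    visible_options: Optional[int], options: Sequence[str]
) -> Optional[int]:
    """Hypothetical check: 3 <= visible_options <= number of options."""
    # Assumption: None means the setting was not provided and is left alone.
    if visible_options is None:
        return None
    if not 3 <= visible_options <= len(options):
        raise ValueError(
            f"visible_options must be between 3 and {len(options)}, "
            f"got {visible_options}"
        )
    return visible_options
```

One consequence of the rule as stated is that a question with fewer than three options cannot carry an explicit `visible_options` value at all.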
Fixed¶
- Fixed `remove user modification in text component on clear answers` (#3775)
- Fixed `Highlight raw text field in dataset feedback task` (#3731)
- Fixed `Field title too long` (#3734)
- Fixed error messages when deleting a `DatasetForTextClassification` (#3652)
- Fixed `Pending queue` pagination problems during data annotation (#3677)
- Fixed `visible_labels` default value to be 20 just when `visible_labels` not provided and `len(labels) > 20`, otherwise it will either be the provided `visible_labels` value or `None`, for `LabelQuestion` and `MultiLabelQuestion` (#3702).
- Fixed `DatasetCard` generation when `RemoteFeedbackDataset` contains suggestions (#3718).
- Add missing `draft` status in `ResponseSchema` as now there can be responses with `draft` status when annotating via the UI (#3749).
- Searches when queried words are distributed along the record fields (#3759).
- Fixed Python 3.11 compatibility issue with `/api/datasets` endpoints due to the `TaskType` enum replacement in the endpoint URL (#3769).
- Fixed `RankingValueSchema` and `FeedbackRankingValueModel` schemas to allow `rank=None` when `status=draft` (#3781).
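The `visible_labels` defaulting rule from this list is easier to follow as code. The sketch below restates the entry's wording; `resolve_visible_labels` is a hypothetical helper and does not reproduce the actual `LabelQuestion` or `MultiLabelQuestion` implementation.

```python
from typing import Optional, Sequence


def resolve_visible_labels(
    labels: Sequence[str], visible_labels: Optional[int] = None
) -> Optional[int]:
    """Hypothetical restatement of the defaulting rule in this changelog."""
    # An explicitly provided value is always kept.
    if visible_labels is not None:
        return visible_labels
    # The default of 20 applies only when there are more than 20 labels.
    if len(labels) > 20:
        return 20
    # Otherwise nothing is set.
    return None
```

With 25 labels and no argument the result is 20; with 5 labels and no argument it stays `None`.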
1.15.1¶
Fixed¶
- Fixed `Text component` text content sanitization behavior just for markdown to prevent the text from disappearing (#3738)
- Fixed `Text component` now you need to press Escape to exit the text area (#3733)
- Fixed `SearchEngine` was creating the same number of primary shards and replica shards for each `FeedbackDataset` (#3736).
1.15.0¶
Added¶
- Added `Enable to update guidelines and dataset settings for Feedback Datasets directly in the UI` (#3489)
- Added `ArgillaTrainer` integration with TRL, allowing for easy supervised finetuning, reward modeling, direct preference optimization and proximal policy optimization (#3467)
- Added `formatting_func` to `ArgillaTrainer` for `FeedbackDataset` datasets to add a custom formatting for the data (#3599).
- Added `login` function in `argilla.client.login` to login into an Argilla server and store the credentials locally (#3582).
- Added `login` command to login into an Argilla server (#3600).
- Added `logout` command to logout from an Argilla server (#3605).
- Added `DELETE /api/v1/suggestions/{suggestion_id}` endpoint to delete a suggestion given its ID (#3617).
- Added `DELETE /api/v1/records/{record_id}/suggestions` endpoint to delete several suggestions linked to the same record given their IDs (#3617).
- Added `response_status` param to `GET /api/v1/datasets/{dataset_id}/records` to be able to filter by `response_status` as previously included for `GET /api/v1/me/datasets/{dataset_id}/records` (#3613).
- Added `list` classmethod to `ArgillaMixin` to be used as `FeedbackDataset.list()`, also including the `workspace` to list from as arg (#3619).
- Added `filter_by` method in `RemoteFeedbackDataset` to filter based on `response_status` (#3610).
- Added `list_workspaces` function (to be used as `rg.list_workspaces`, but `Workspace.list` is preferred) to list all the workspaces from a user in Argilla (#3641).
- Added `list_datasets` function (to be used as `rg.list_datasets`) to list the `TextClassification`, `TokenClassification`, and `Text2Text` datasets in Argilla (#3638).
- Added `RemoteSuggestionSchema` to manage suggestions in Argilla, including the `delete` method to delete suggestions from Argilla via `DELETE /api/v1/suggestions/{suggestion_id}` (#3651).
- Added `delete_suggestions` to `RemoteFeedbackRecord` to remove suggestions from Argilla via `DELETE /api/v1/records/{record_id}/suggestions` (#3651).
Changed¶
- Changed `Optional label for * mark for required question` (#3608)
- Updated `RemoteFeedbackDataset.delete_records` to use batch delete records endpoint (#3580).
- Included `allowed_for_roles` for some `RemoteFeedbackDataset`, `RemoteFeedbackRecords`, and `RemoteFeedbackRecord` methods that are only allowed for users with roles `owner` and `admin` (#3601).
- Renamed `ArgillaToFromMixin` to `ArgillaMixin` (#3619).
- Move `users` CLI app under `database` CLI app (#3593).
- Move server `Enum` classes to `argilla.server.enums` module (#3620).
Fixed¶
- Fixed `Filter by workspace in breadcrumbs` (#3577)
- Fixed `Filter by workspace in datasets table` (#3604)
- Fixed `Query search highlight` for Text2Text and TextClassification (#3621)
- Fixed `RatingQuestion.values` validation to raise a `ValidationError` when values are out of range, i.e. [1, 10] (#3626).
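
The range check described in the last entry can be sketched as follows. This is a minimal, standalone illustration and not the library's code: the `validate_rating_values` helper is hypothetical, and a plain `ValueError` stands in for the `ValidationError` mentioned above.

```python
from typing import List

RATING_MIN, RATING_MAX = 1, 10  # range quoted in the changelog entry


def validate_rating_values(values: List[int]) -> List[int]:
    """Return `values` unchanged if every item is an integer in [1, 10]."""
    out_of_range = [
        v
        for v in values
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(v, bool) or not isinstance(v, int) or not RATING_MIN <= v <= RATING_MAX
    ]
    if out_of_range:
        # ValueError is an assumption standing in for the real ValidationError.
        raise ValueError(
            f"rating values must be integers in [{RATING_MIN}, {RATING_MAX}], got {out_of_range}"
        )
    return values
```

Calling `validate_rating_values([1, 5, 10])` returns the list unchanged, while `validate_rating_values([0, 11])` raises.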
Removed¶
- Removed `multi_task_text_token_classification` from `TaskType` as not used (#3640).
- Removed `argilla_id` in favor of `id` from `RemoteFeedbackDataset` (#3663).
- Removed `fetch_records` from `RemoteFeedbackDataset` as now the records are lazily fetched from Argilla (#3663).
- Removed `push_to_argilla` from `RemoteFeedbackDataset`, as it just works when calling it through a `FeedbackDataset` locally, as now the updates of the remote datasets are automatically pushed to Argilla (#3663).
- Removed `set_suggestions` in favor of `update(suggestions=...)` for both `FeedbackRecord` and `RemoteFeedbackRecord`, as all the updates of any "updateable" attribute of a record will go through `update` instead (#3663).
- Removed unused `owner` attribute for client Dataset data model (#3665)
1.14.1¶
Fixed¶
- Fixed PostgreSQL database not being updated after `begin_nested` because of missing `commit` (#3567).
- Fixed an issue where `settings` could not be provided when updating a `rating` or `ranking` question (#3552).
1.14.0¶
Added¶
- Added `PATCH /api/v1/fields/{field_id}` endpoint to update the field title and markdown settings (#3421).
- Added `PATCH /api/v1/datasets/{dataset_id}` endpoint to update dataset name and guidelines (#3402).
- Added `PATCH /api/v1/questions/{question_id}` endpoint to update question title, description and some settings (depending on the type of question) (#3477).
- Added `DELETE /api/v1/records/{record_id}` endpoint to remove a record given its ID (#3337).
- Added `pull` method in `RemoteFeedbackDataset` (a `FeedbackDataset` pushed to Argilla) to pull all the records from it and return it as a local copy as a `FeedbackDataset` (#3465).
- Added `delete` method in `RemoteFeedbackDataset` (a `FeedbackDataset` pushed to Argilla) (#3512).
- Added `delete_records` method in `RemoteFeedbackDataset`, and `delete` method in `RemoteFeedbackRecord` to delete records from Argilla (#3526).
Changed¶
- Improved efficiency of weak labeling when dataset contains vectors (#3444).
- Added `ArgillaDatasetMixin` to detach the Argilla-related functionality from the `FeedbackDataset` (#3427)
- Moved `FeedbackDataset`-related `pydantic.BaseModel` schemas to `argilla.client.feedback.schemas` instead, to be better structured and more scalable and maintainable (#3427)
- Update CLI to use database async connection (#3450).
- Limit rating questions values to the positive range [1, 10] (#3451).
- Updated `POST /api/users` endpoint to be able to provide a list of workspace names to which the user should be linked (#3462).
- Updated Python client `User.create` method to be able to provide a list of workspace names to which the user should be linked (#3462).
- Updated `GET /api/v1/me/datasets/{dataset_id}/records` endpoint to allow getting records matching one of the response statuses provided via query param (#3359).
- Updated `POST /api/v1/me/datasets/{dataset_id}/records` endpoint to allow searching records matching one of the response statuses provided via query param (#3359).
- Updated `SearchEngine.search` method to allow searching records matching one of the response statuses provided (#3359).
- After calling `FeedbackDataset.push_to_argilla`, the methods `FeedbackDataset.add_records` and `FeedbackRecord.set_suggestions` will automatically call Argilla with no need of calling `push_to_argilla` explicitly (#3465).
- Now calling `FeedbackDataset.push_to_huggingface` dumps the `responses` as a `List[Dict[str, Any]]` instead of `Sequence` to make it more readable via 🤗 `datasets` (#3539).
Fixed¶
- Fixed issue with `bool` values and `default` from Jinja2 while generating the HuggingFace `DatasetCard` from `argilla_template.md` (#3499).
- Fixed `DatasetConfig.from_yaml` which was failing when calling `FeedbackDataset.from_huggingface` as the UUIDs cannot be deserialized automatically by `PyYAML`, so UUIDs are neither dumped nor loaded anymore (#3502).
- Fixed an issue that didn't allow the Argilla server to work behind a proxy (#3543).
- `TextClassificationSettings` and `TokenClassificationSettings` labels are properly parsed to strings both in the Python client and in the backend endpoint (#3495).
- Fixed `PUT /api/v1/datasets/{dataset_id}/publish` to check whether at least one field and question has `required=True` (#3511).
- Fixed `FeedbackDataset.from_huggingface` as `suggestions` were being lost when there were no `responses` (#3539).
- Fixed `QuestionSchema` and `FieldSchema` not validating `name` attribute (#3550).
Deprecated¶
- After calling `FeedbackDataset.push_to_argilla`, calling `push_to_argilla` again won't do anything since the dataset is already pushed to Argilla (#3465).
- After calling `FeedbackDataset.push_to_argilla`, calling `fetch_records` won't do anything since the records are lazily fetched from Argilla (#3465).
- After calling `FeedbackDataset.push_to_argilla`, the Argilla ID is no longer stored in the attribute/property `argilla_id` but in `id` instead (#3465).
1.13.3¶
Fixed¶
- Fixed `ModuleNotFoundError` caused because the `argilla.utils.telemetry` module used in the `ArgillaTrainer` was importing an optional dependency not installed by default (#3471).
- Fixed `ImportError` caused because the `argilla.client.feedback.config` module was importing `pyyaml` optional dependency not installed by default (#3471).
1.13.2¶
Fixed¶
- The `suggestion_type_enum` ENUM data type created in PostgreSQL didn't have any value (#3445).
1.13.1¶
Fixed¶
- Fix database migration for PostgreSQL (See #3438)
1.13.0¶
Added¶
- Added `GET /api/v1/users/{user_id}/workspaces` endpoint to list the workspaces to which a user belongs (#3308 and #3343).
- Added `HuggingFaceDatasetMixin` for internal usage, to detach the `FeedbackDataset` integrations from the class itself, and use Mixins instead (#3326).
- Added `GET /api/v1/records/{record_id}/suggestions` API endpoint to get the list of suggestions for the responses associated to a record (#3304).
- Added `POST /api/v1/records/{record_id}/suggestions` API endpoint to create a suggestion for a response associated to a record (#3304).
- Added support for `RankingQuestionStrategy`, `RankingQuestionUnification` and the `.for_text_classification` method for the `TrainingTaskMapping` (#3364)
- Added `PUT /api/v1/records/{record_id}/suggestions` API endpoint to create or update a suggestion for a response associated to a record (#3304 & #3391).
- Added `suggestions` attribute to `FeedbackRecord`, and allow adding and retrieving suggestions from the Python client (#3370)
- Added `allowed_for_roles` Python decorator to check whether the current user has the required role to access the decorated function/method for `User` and `Workspace` (#3383)
- Added API and Python Client support for workspace deletion (Closes #3260)
- Added `GET /api/v1/me/workspaces` endpoint to list the workspaces of the current active user (#3390)
Changed¶
- Updated output payload for `GET /api/v1/datasets/{dataset_id}/records`, `GET /api/v1/me/datasets/{dataset_id}/records`, `POST /api/v1/me/datasets/{dataset_id}/records/search` endpoints to include the suggestions of the records based on the value of the `include` query parameter (#3304).
- Updated `POST /api/v1/datasets/{dataset_id}/records` input payload to add suggestions (#3304).
- The `POST /api/datasets/:dataset-id/:task/bulk` endpoints don't create the dataset if it does not exist (Closes #3244)
- Added Telemetry support for `ArgillaTrainer` (closes #3325)
- `User.workspaces` is no longer an attribute but a property, and is calling `list_user_workspaces` to list all the workspace names for a given user ID (#3334)
- Renamed `FeedbackDatasetConfig` to `DatasetConfig` and export/import from YAML as default instead of JSON (just used internally on `push_to_huggingface` and `from_huggingface` methods of `FeedbackDataset`) (#3326).
- The protected metadata fields now support values other than text; existing datasets must be reindexed. See docs for more detail (Closes #3332).
- Updated `Dockerfile` parent image from `python:3.9.16-slim` to `python:3.10.12-slim` (#3425).
- Updated `quickstart.Dockerfile` parent image from `elasticsearch:8.5.3` to `argilla/argilla-server:${ARGILLA_VERSION}` (#3425).
Removed¶
- Removed support for non-prefixed environment variables. All valid env vars start with `ARGILLA_` (See #3392).
Fixed¶
- Fixed `GET /api/v1/me/datasets/{dataset_id}/records` endpoint always returning the responses for the records even if `responses` was not provided via the `include` query parameter (#3304).
- Values for protected metadata fields are not truncated (Closes #3331).
- Big number ids are properly rendered in UI (Closes #3265)
- Fixed `ArgillaDatasetCard` to include the values/labels for all the existing questions (#3366)
Deprecated¶
- Integer support for record id in text classification, token classification and text2text datasets.
1.12.1¶
Fixed¶
- Using `rg.init` with default `argilla` user skips setting the default workspace if not available. (Closes #3340)
- Resolved wrong import structure for `ArgillaTrainer` and `TrainingTaskMapping` (Closes #3345)
- Pin pydantic dependency to version < 2 (Closes #3348)
1.12.0¶
Added¶
- Added `RankingQuestionSettings` class allowing to create ranking questions in the API using `POST /api/v1/datasets/{dataset_id}/questions` endpoint (#3232)
- Added `RankingQuestion` in the Python client to create ranking questions (#3275).
- Added `Ranking` component in feedback task question form (#3177 & #3246).
- Added `FeedbackDataset.prepare_for_training` method for generating a framework-specific dataset with the responses provided for `RatingQuestion`, `LabelQuestion` and `MultiLabelQuestion` (#3151).
- Added `ArgillaSpaCyTransformersTrainer` class for supporting the training with `spacy-transformers` (#3256).
Docs¶
- Added instructions for how to run the Argilla frontend in the developer docs (#3314).
Changed¶
- All docker related files have been moved into the `docker` folder (#3053).
- `release.Dockerfile` has been renamed to `Dockerfile` (#3133).
- Updated `rg.load` function to raise a `ValueError` with an explanatory message for the cases in which the user tries to use the function to load a `FeedbackDataset` (#3289).
- Updated `ArgillaSpaCyTrainer` to allow re-using `tok2vec` (#3256).
Fixed¶
- Check available workspaces on Argilla on `rg.set_workspace` (Closes #3262)
1.11.0¶
Fixed¶
- Replaced `np.float` alias by `float` to avoid `AttributeError` when using `find_label_errors` function with `numpy>=1.24.0` (#3214).
- Fixed `format_as("datasets")` when no responses or optional responses in `FeedbackRecord`, to set their value to what 🤗 Datasets expects instead of just `None` (#3224).
- Fixed `push_to_huggingface()` when `generate_card=True` (default behaviour), as we were passing a sample record to the `ArgillaDatasetCard` class, and `UUID`s introduced in 1.10.0 (#3192), are not JSON-serializable (#3231).
- Fixed `from_argilla` and `push_to_argilla` to ensure consistency on both field and question re-construction, and to ensure `UUID`s are properly serialized as `str`, respectively (#3234).
- Refactored usage of `import argilla as rg` to clarify package navigation (#3279).
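
The `UUID` serialization problem mentioned in the entries above can be reproduced with the standard library alone. The snippet below is an illustrative sketch, not the project's fix: the `record` payload is hypothetical, and `default=str` is just one common way to serialize `UUID`s as `str`.

```python
import json
import uuid

# Hypothetical record payload carrying a UUID, as introduced in 1.10.0.
record = {"id": uuid.uuid4(), "text": "example"}

try:
    json.dumps(record)
except TypeError as exc:
    # The standard encoder has no rule for uuid.UUID objects.
    print(f"plain json.dumps fails: {exc}")

# default=str is called for any object the encoder cannot handle,
# so the UUID is written out as its string form.
encoded = json.dumps(record, default=str)
decoded = json.loads(encoded)
```

After the round trip, `decoded["id"]` is a plain string equal to `str(record["id"])`.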
Docs¶
- Fixed URLs in Weak Supervision with Sentence Transformers tutorial #3243.
- Fixed library buttons' formatting on Tutorials page (#3255).
- Modified styling of error code outputs in notebooks (#3270).
- Added ElasticSearch and OpenSearch versions (#3280).
- Removed template notebook from table of contents (#3271).
- Fixed tutorials with `pip install argilla` to not use older versions of the package (#3282).
Added¶
- Added `metadata` attribute to the `Record` of the `FeedbackDataset` (#3194)
- New `users update` command to update the role for an existing user (#3188)
- New `Workspace` class to allow users to manage their Argilla workspaces and the users assigned to those workspaces via the Python client (#3180)
- Added `User` class to let users manage their Argilla users via the Python client (#3169).
- Added an option to display `tqdm` progress bar to `FeedbackDataset.push_to_argilla` when looping over the records to upload (#3233).
Changed¶
- The role system now supports three different roles: `owner`, `admin` and `annotator` (#3104)
- `admin` role is scoped to workspace-level operations (#3115)
- The `owner` user is created among the default pool of users in the quickstart, and the default user in the server now has the `owner` role (#3248), reverting (#3188).
Deprecated¶
- As of Python 3.7 end-of-life (EOL) on 2023-06-27, Argilla will no longer support Python 3.7 (#3188). More information at https://peps.python.org/pep-0537/
1.10.0¶
Added¶
- Added search component for feedback datasets (#3138)
- Added markdown support for feedback dataset guidelines (#3153)
- Added Train button for feedback datasets (#3170)
Changed¶
- Updated `SearchEngine` and `POST /api/v1/me/datasets/{dataset_id}/records/search` to return the `total` number of records matching the search query (#3166)
Fixed¶
- Replaced Enum for string value in URLs for client API calls (Closes #3149)
- Resolve breaking issue with `ArgillaSpanMarkerTrainer` for Named Entity Recognition with `span_marker` v1.1.x onwards.
- Move `ArgillaDatasetCard` import under `@requires_version` decorator, so that the `ImportError` on `huggingface_hub` is handled properly (#3174)
- Allow flow `FeedbackDataset.from_argilla` -> `FeedbackDataset.push_to_argilla` under different dataset names and/or workspaces (#3192)
Docs¶
1.9.0¶
Added¶
- Added boolean `use_markdown` property to `TextFieldSettings` model.
- Added boolean `use_markdown` property to `TextQuestionSettings` model.
- Added new status `draft` for the `Response` model.
- Added `LabelSelectionQuestionSettings` class allowing to create label selection (single-choice) questions in the API (#3005)
- Added `MultiLabelSelectionQuestionSettings` class allowing to create multi-label selection (multi-choice) questions in the API (#3010).
- Added `POST /api/v1/me/datasets/{dataset_id}/records/search` endpoint (#3068).
- Added new components in feedback task Question form: MultiLabel (#3064) and SingleLabel (#3016).
- Added docstrings to the `pydantic.BaseModel`s defined at `argilla/client/feedback/schemas.py` (#3137)
- Added the information about executing tests in the developer documentation ([#3143]).
Changed¶
- Updated `GET /api/v1/me/datasets/:dataset_id/metrics` output payload to include the count of responses with `draft` status.
- Added `LabelSelectionQuestionSettings` class allowing to create label selection (single-choice) questions in the API.
- Added `MultiLabelSelectionQuestionSettings` class allowing to create multi-label selection (multi-choice) questions in the API.
- Database setup for unit tests. Now the unit tests use a different database than the one used by the local Argilla server (Closes #2987).
- Updated `alembic` setup to be able to autogenerate revision/migration scripts using SQLAlchemy metadata from Argilla server models (#3044)
- Improved `DatasetCard` generation on `FeedbackDataset.push_to_huggingface` when `generate_card=True`, following the official HuggingFace Hub template, but suited to `FeedbackDataset`s from Argilla (#3110)
Fixed¶
- Disallow `fields` and `questions` in `FeedbackDataset` with the same name (#3126).
- Fixed broken links in the documentation and updated the development branch name from `development` to `develop` ([#3145]).
1.8.0¶
Added¶
- `/api/v1/datasets` new endpoint to list and create datasets (#2615).
- `/api/v1/datasets/{dataset_id}` new endpoint to get and delete datasets (#2615).
- `/api/v1/datasets/{dataset_id}/publish` new endpoint to publish a dataset (#2615).
- `/api/v1/datasets/{dataset_id}/questions` new endpoint to list and create dataset questions (#2615)
- `/api/v1/datasets/{dataset_id}/fields` new endpoint to list and create dataset fields (#2615)
- `/api/v1/datasets/{dataset_id}/questions/{question_id}` new endpoint to delete a dataset question (#2615)
- `/api/v1/datasets/{dataset_id}/fields/{field_id}` new endpoint to delete a dataset field (#2615)
- `/api/v1/workspaces/{workspace_id}` new endpoint to get workspaces by id (#2615)
- `/api/v1/responses/{response_id}` new endpoint to update and delete a response (#2615)
- `/api/v1/datasets/{dataset_id}/records` new endpoint to create and list dataset records (#2615)
- `/api/v1/me/datasets` new endpoint to list user visible datasets (#2615)
- `/api/v1/me/dataset/{dataset_id}/records` new endpoint to list dataset records with user responses (#2615)
- `/api/v1/me/datasets/{dataset_id}/metrics` new endpoint to get the dataset user metrics (#2615)
- `/api/v1/me/records/{record_id}/responses` new endpoint to create record user responses (#2615)
- showing new feedback task datasets in datasets list ([#2719])
- new page for feedback task ([#2680])
- show feedback task metrics ([#2822])
- user can delete dataset in dataset settings page ([#2792])
- Support for `FeedbackDataset` in Python client (parent PR #2615, and nested PRs: [#2949], [#2827], [#2943], [#2945], [#2962], and [#3003])
- Integration with the HuggingFace Hub ([#2949])
- Added `ArgillaPeftTrainer` for text and token classification #2854
- Added `predict_proba()` method to `ArgillaSetFitTrainer`
- Added `ArgillaAutoTrainTrainer` for Text Classification #2664
- New `database revisions` command showing database revisions info
Fixes¶
- Avoid rendering html for invalid html strings in Text2text ([#2911](https://github.com/argilla-io/argilla/issues/2911))
Changed¶
- The `database migrate` command accepts a `--revision` param to provide specific revision id
- `tokens_length` metrics function returns empty data (#3045)
- `token_length` metrics function returns empty data (#3045)
- `mention_length` metrics function returns empty data (#3045)
- `entity_density` metrics function returns empty data (#3045)
Deprecated¶
- Using Argilla with Python 3.7 runtime is deprecated and support will be removed from version 1.11.0 (#2902)
- `tokens_length` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
- `token_length` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
- `mention_length` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
- `entity_density` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
Removed¶
- Removed mention `density`, `tokens_length` and `chars_length` metrics from token classification metrics storage (#3045)
- Removed token `char_start`, `char_end`, `tag`, and `score` metrics from token classification metrics storage (#3045)
- Removed tags-related metrics from token classification metrics storage (#3045)
1.7.0¶
Added¶
- Added `max_retries` and `num_threads` parameters to `rg.log` to run data logging requests concurrently with backoff retry policy. See #2458 and #2533
- `rg.load` accepts `include_vectors` and `include_metrics` when loading data. Closes #2398
- Added `settings` param to `prepare_for_training` (#2689)
- Added `prepare_for_training` for `openai` (#2658)
- Added `ArgillaOpenAITrainer` (#2659)
- Added `ArgillaSpanMarkerTrainer` for Named Entity Recognition (#2693)
- Added `ArgillaTrainer` CLI support. Closes (#2809)
Fixes¶
- fix image alignment on token classification
Changed¶
- Argilla quickstart image dependencies are externalized into `quickstart.requirements.txt`. See #2666
- bulk endpoints will upsert data when record `id` is present. Closes #2535
- moved from `click` to `typer` CLI support. Closes (#2815)
- Argilla server docker image is built with PostgreSQL support. Closes #2686
- The `rg.log` computes all batches and raises an error for all failed batches.
- The default batch size for `rg.log` is now 100.
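
The batching behaviour described in the last two entries (send every batch, retry with backoff, then report all failures together) can be sketched with the standard library. This is an illustration under stated assumptions, not the `rg.log` implementation: `log_in_batches`, `send_with_retry`, the `send` callable and `base_delay` are hypothetical names, while `batch_size`, `num_threads` and `max_retries` mirror the parameters named in this release.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def send_with_retry(send, batch, max_retries, base_delay):
    """Call send(batch), retrying with exponential backoff on any exception."""
    for attempt in range(max_retries + 1):
        try:
            return send(batch)
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: let the caller record the failure
            time.sleep(base_delay * (2 ** attempt))


def log_in_batches(records, send, batch_size=100, num_threads=2,
                   max_retries=3, base_delay=0.1):
    """Send every batch, then raise one error naming all batches that failed."""
    batches = [records[i:i + batch_size] for i in range(0, len(records), batch_size)]
    failed = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(send_with_retry, send, batch, max_retries, base_delay)
                   for batch in batches]
        for index, future in enumerate(futures):
            # exception() waits for the batch to finish and returns its error, if any.
            if future.exception() is not None:
                failed.append(index)
    if failed:
        raise RuntimeError(f"{len(failed)} of {len(batches)} batches failed: {failed}")
    return len(batches)
```

Collecting the failed batch indexes before raising means one bad batch does not stop the remaining batches from being sent.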
Fixed¶
- `argilla.training` bugfixes and unification (#2665)
- Resolved several small bugs in the `ArgillaTrainer`.
Deprecated¶
- The `rg.log_async` function is deprecated and will be removed in next minor release.
1.6.0¶
Added¶
- `ARGILLA_HOME_PATH` new environment variable (#2564).
- `ARGILLA_DATABASE_URL` new environment variable (#2564).
- Basic support for user roles with `admin` and `annotator` (#2564).
- `id`, `first_name`, `last_name`, `role`, `inserted_at` and `updated_at` new user fields (#2564).
- `/api/users` new endpoint to list and create users (#2564).
- `/api/users/{user_id}` new endpoint to delete users (#2564).
- `/api/workspaces` new endpoint to list and create workspaces (#2564).
- `/api/workspaces/{workspace_id}/users` new endpoint to list workspace users (#2564).
- `/api/workspaces/{workspace_id}/users/{user_id}` new endpoint to create and delete workspace users (#2564).
- `argilla.tasks.users.migrate` new task to migrate users from old YAML file to database (#2564).
- `argilla.tasks.users.create` new task to create a user (#2564).
- `argilla.tasks.users.create_default` new task to create a user with default credentials (#2564).
- `argilla.tasks.database.migrate` new task to execute database migrations (#2564).
- `release.Dockerfile` and `quickstart.Dockerfile` now create a default `argilladata` volume to persist data (#2564).
- Add user settings page. Closes #2496
- Added `Argilla.training` module with support for `spacy`, `setfit`, and `transformers`. Closes #2504
Fixes¶
- Now the `prepare_for_training` method is working when `multi_label=True`. Closes #2606
Changed¶
- `ARGILLA_USERS_DB_FILE` environment variable is now only used to migrate users from YAML file to database (#2564).
- `full_name` user field is now deprecated and `first_name` and `last_name` should be used instead (#2564).
- `password` user field now requires a minimum of `8` and a maximum of `100` characters in size (#2564).
- `quickstart.Dockerfile` image default users changed from `team` and `argilla` to `admin` and `annotator`, including new passwords and API keys (#2564).
- Datasets to be managed only by users with `admin` role (#2564).
- The list of rules is now accessible while metrics are computed. Closes #2117
- Style updates for weak labeling and adding feedback toast when deleting rules. See #2626 and #2648
Removed¶
- `email` user field (#2564).
- `disabled` user field (#2564).
- Support for private workspaces (#2564).
- `ARGILLA_LOCAL_AUTH_DEFAULT_APIKEY` and `ARGILLA_LOCAL_AUTH_DEFAULT_PASSWORD` environment variables. Use `python -m argilla.tasks.users.create_default` instead (#2564).
- The old headers for `API Key` and `workspace` from python client
- The default value for old `API Key` constant. Closes #2251
1.5.1 - 2023-03-30¶
Fixes¶
- Copying datasets between workspaces with proper owner/workspace info. Closes #2562
- Copy dataset with empty workspace to the default user workspace 905d4de
- Using elasticsearch config to request backend version. Closes #2311
- Remove sorting by score in labels. Closes #2622
Changed¶
- Update field name in metadata for image url. See #2609
- Improvements in tutorial doc cards. Closes #2216
1.5.0 - 2023-03-21¶
Added¶
- Add the fields to retrieve when loading the data from argilla. `rg.load` takes too long because of the vector field, even when users don't need it. Closes #2398
- Add ability to show image in records (for TokenClassification and TextClassification) if a URL is passed in metadata with the key `_image_url`
- Non-searchable fields support in metadata. #2570
- Add record ID references to the prepare for training methods. Closes #2483
- Add tutorial on Image Classification. #2420
- Add Train button, visible for "admin" role, with code snippets from a selection of libraries. Closes [#2591](https://github.com/argilla-io/argilla/pull/2591)
Changed¶
- Labels are now centralized in a specific vuex ORM called GlobalLabel Model, see https://github.com/argilla-io/argilla/issues/2210. This model is the same for TokenClassification and TextClassification (so both tasks have labels with color_id and shortcuts parameters in the vuex ORM)
- The shortcuts improvement for labels #2339 have been moved to the vuex ORM in dataset settings feature #2444
- Update "Define a labeling schema" section in docs.
- The record inputs are sorted alphabetically in UI by default. #2581
- The record inputs are fully visible when pagination size is one and the height of collapsed area size is bigger for laptop screen. #2587
Fixes¶
- Allow URL to be clickable in Jupyter notebook again. Closes #2527
Removed¶
- Removing some data scan deprecated endpoints used by old clients. This change will break compatibility with client `<v1.3.0`
- Stop using old scan deprecated endpoints in python client. This logic will break client compatibility with server version `<1.3.0`
- Remove the previous way to add labels through the dataset page. Now labels can be added only through dataset settings page.