# Examples

Here you can find end-to-end examples to help you get started with curating datasets and collecting feedback to fine-tune LLMs.

## Curate an instruction dataset for supervised fine-tuning

Learn how to set up a project to curate a public dataset that can be used to fine-tune an instruction-following model.
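As a taste of what curation involves, here is a minimal sketch of filtering low-quality records from an instruction dataset. The field names (`instruction`, `response`) and the length threshold are illustrative assumptions, not a fixed schema from the tutorial.

```python
# Minimal sketch of an instruction record and a simple quality filter.
# Field names ("instruction", "response") are illustrative, not a fixed schema.

def is_usable(record: dict) -> bool:
    """Keep records with a non-empty instruction and a reasonably long response."""
    instruction = record.get("instruction", "").strip()
    response = record.get("response", "").strip()
    return bool(instruction) and len(response) >= 20  # threshold is an assumption

dataset = [
    {"instruction": "Explain what a reward model is.",
     "response": "A reward model scores candidate responses so they can be ranked."},
    {"instruction": "", "response": "Orphaned answer with no instruction."},
    {"instruction": "Say hi.", "response": "Hi."},
]

curated = [record for record in dataset if is_usable(record)]
```

In practice, rule-based filters like this are usually combined with human review in an annotation tool before the dataset is used for fine-tuning.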

## Train a Reward Model for RLHF

Learn how to collect human preference (comparison) data and train a reward model with the trl library.
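Reward model training on comparison data typically optimizes a pairwise (Bradley–Terry style) loss: the model should score the chosen response above the rejected one. The sketch below shows that loss in plain Python; it is a simplified illustration, not the trl training code itself.

```python
import math

def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).
    The loss shrinks as the chosen response is scored above the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the model ranks the chosen response higher, the loss is small...
good = pairwise_reward_loss(2.0, -1.0)
# ...and when it ranks the pair the wrong way round, the loss is large.
bad = pairwise_reward_loss(-1.0, 2.0)
```

During training, `r_chosen` and `r_rejected` come from a scalar head on the language model, and this loss is averaged over the batch of comparison pairs.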