
How To Create Datasets for Finetuning From Multiple Sources! Improving Finetunes With Embeddings. 1 year ago






Today we walk through setting up datasets for fine-tuning large language models (LLMs). We start with the considerations to settle before building a dataset, then work through pipeline setup questions such as whether you need embeddings. We discuss how to structure raw text data for fine-tuning, using real coding and medical appeals scenarios as examples. We also explore how embeddings can supply additional context to a model, a crucial step in building more general and robust models.

The video then explains how to turn books into structured datasets using LLMs, with 'Twenty Thousand Leagues Under the Sea' converted into a question-and-answer format as the example. We also look at fine-tuning LLMs to write in a specific programming language, with a practical application that produces Cypher queries for graph databases. Lastly, we demonstrate how to improve the performance of a medical application by supplying embedded information through Superbooga.

Whether you're interested in coding, medical applications, book conversion, or fine-tuning LLMs in general, this video covers the full workflow. Join us on our live stream for a deep dive into broadening the context in local models, plus results from our book training and comedy sets.

0:00 Intro
0:44 Considerations For Finetuning Datasets
2:45 Reviewing Embeddings
5:35 Finetuning With Embeddings
8:31 Creating Datasets From Raw/Books
12:08 Coding Finetuning Example
14:02 Medicare/Medicaid Appeals Example
17:01 Outro

Training datasets: https://github.com/tomasonjo/blog-dat...
Massive Text Embeddings: https://huggingface.co/blog/mteb
GitHub Repo: https://github.com/Aemon-Algiz/Datese...

#machinelearning #ArtificialIntelligence #LargeLanguageModels #FineTuning #DataPreprocessing #Embeddings
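To make the description above more concrete, here is a minimal Python sketch of the general idea: split raw text into chunks, retrieve the chunk most similar to a question, and store it as context in a training record. This is not the code from the video or its repo. The function names, the instruction/input/output record layout, and the placeholder passages are all assumptions for illustration, and `toy_embed` is a bag-of-words stand-in for a real embedding model.

```python
import json
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks):
    """Return the chunk whose toy embedding is closest to the question."""
    query = toy_embed(question)
    return max(chunks, key=lambda chunk: cosine(query, toy_embed(chunk)))


def build_record(question, answer, chunks):
    """Assemble one training record (field names are an assumed layout)."""
    return {
        "instruction": question,
        "input": retrieve(question, chunks),  # retrieved context
        "output": answer,
    }


if __name__ == "__main__":
    # Placeholder passages written for this sketch, not taken from any source.
    passages = [
        "the submarine dives beneath the waves",
        "the appeal letter cites the denial reason",
    ]
    record = build_record(
        "What does the appeal letter cite?",
        "It cites the denial reason.",
        passages,
    )
    print(json.dumps(record))  # one JSON object per line gives a JSONL dataset
```

In a real pipeline the question and answer pairs would come from an LLM or a human annotator, and `toy_embed` would be replaced by an actual embedding model; the overall shape of the record-building step would stay much the same.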
