How ChatGPT is Trained (posted 1 year ago)





How ChatGPT is Trained

This short tutorial explains the training objectives used to develop ChatGPT, the new chatbot language model from OpenAI.

Timestamps:
0:00 - Non-intro
0:24 - Training overview
1:33 - Generative pretraining (the raw language model)
4:18 - The alignment problem
6:26 - Supervised fine-tuning
7:19 - Limitations of supervision: distributional shift
8:50 - Reward learning based on preferences
10:39 - Reinforcement learning from human feedback
13:02 - Room for improvement

ChatGPT: https://openai.com/blog/chatgpt

Relevant papers for learning more:
InstructGPT: Ouyang et al., 2022 - https://arxiv.org/abs/2203.02155
GPT-3: Brown et al., 2020 - https://arxiv.org/abs/2005.14165
PaLM: Chowdhery et al., 2022 - https://arxiv.org/abs/2204.02311
Efficient reductions for imitation learning: Ross & Bagnell, 2010 - https://proceedings.mlr.press/v9/ross...
Deep reinforcement learning from human preferences: Christiano et al., 2017 - https://arxiv.org/abs/1706.03741
Learning to summarize from human feedback: Stiennon et al., 2020 - https://arxiv.org/abs/2009.01325
Scaling laws for reward model overoptimization: Gao et al., 2022 - https://arxiv.org/abs/2210.10760
Proximal policy optimization algorithms: Schulman et al., 2017 - https://arxiv.org/abs/1707.06347

Special thanks to Elmira Amirloo for feedback on this video.

Links:
YouTube: / ariseffai
Twitter: / ari_seff
Homepage: https://www.ariseff.com

If you'd like to help support the channel (completely optional), you can donate a cup of coffee via the following:
Venmo: https://venmo.com/ariseff
PayPal: https://www.paypal.me/ariseff
