
SOLVED: Perfect Reasoning for every AI AGENT (ReasonAgain) 9 days ago






PERFECT REASONING FOR EVERY AI AGENT, EXPLAINED with CODE. Symbolic reasoning for all LLM-based agents to boost reasoning performance.

ReasonAgain aims to improve the evaluation of large language models' (LLMs) mathematical reasoning by using symbolic programs instead of relying solely on final-answer accuracy. The method transforms existing mathematical questions into Python-based symbolic programs that encapsulate the underlying reasoning logic. By perturbing the parameters of these programs, it generates new input-output pairs and tests whether an LLM applies the correct reasoning consistently across the variations. This gives a dynamic evaluation that exposes LLM fragility, revealing inconsistencies that static, final-answer-based metrics may overlook.

In the experiments, LLM performance declined significantly on perturbed versions of questions compared with the static datasets. This outcome highlights a current limitation of LLMs: they often depend on superficial heuristics rather than a deep understanding of the reasoning process. By systematically surfacing these weaknesses, ReasonAgain points toward ways of improving the robustness of LLM reasoning.

All rights with the authors:
ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning
https://arxiv.org/pdf/2410.19056
GitHub repo: https://github.com/CogComp/reasoning-...

00:00 LLM fail in logic reasoning
01:48 Symbolic Code representation
03:35 Symbolic perturbations
04:50 20 percent LLM accuracy
07:12 My Logic Test Symbolic encoded
11:25 Prolog Code for logic test
11:43 LISP, Haskell, CLIPS, SCALA code
13:18 NEW Reasoning power for AI Systems
17:00 AI Agent Reasoning enhanced
18:42 ReasonAgain paper Microsoft AMD
21:18 Limitations
22:47 No Prompt Engineering required

#airesearch #reasoning #aiagents #microsoft #amd
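
The evaluation idea described above (question, symbolic program, perturbed parameters, consistency check) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the paper or its repository: the question template, the function names (symbolic_program, perturb, build_variants, consistency) and the two stand-in "models" are all hypothetical, and a real run would call an actual LLM instead.

```python
import random
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical question template; the placeholders are the perturbable parameters.
TEMPLATE = "A shop sells pencils at {price} dollars each. How much do {count} pencils cost?"


def symbolic_program(price: int, count: int) -> int:
    """The reasoning behind the question, written as executable code."""
    return price * count


def perturb(seed: int, n: int) -> List[Dict[str, int]]:
    """Sample n new parameter settings (ranges are an arbitrary assumption)."""
    rng = random.Random(seed)
    return [{"price": rng.randint(5, 20), "count": rng.randint(5, 50)} for _ in range(n)]


def build_variants(seed: int = 0, n: int = 5) -> List[Tuple[str, int]]:
    """Pair each perturbed question with the gold answer from the program."""
    return [(TEMPLATE.format(**p), symbolic_program(**p)) for p in perturb(seed, n)]


def consistency(model: Callable[[str], int], variants: List[Tuple[str, int]]) -> float:
    """Fraction of variants the model answers correctly."""
    correct = sum(1 for question, gold in variants if model(question) == gold)
    return correct / len(variants)


def memorizing_model(question: str) -> int:
    # Stand-in for a model that memorized the original question (3 * 4 = 12).
    return 12


def parsing_model(question: str) -> int:
    # Stand-in for a model that actually applies the reasoning to the inputs.
    price, count = map(int, re.findall(r"\d+", question))
    return price * count


original = [(TEMPLATE.format(price=3, count=4), symbolic_program(3, 4))]
variants = build_variants(seed=0, n=5)

print("memorizing, original:", consistency(memorizing_model, original))
print("memorizing, perturbed:", consistency(memorizing_model, variants))
print("parsing, perturbed:", consistency(parsing_model, variants))
```

The memorizing stand-in is correct on the original question but fails every perturbed variant, while the parsing stand-in stays correct. That gap between static accuracy and accuracy under perturbation is the kind of fragility the description says final-answer metrics can miss.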
