
NLP

(3)

[Paper Review] Training Verifiers to Solve Math Word Problems (GSM8K)
paper: Cobbe, Karl, et al. "Training verifiers to solve math word problems." arXiv preprint arXiv:2110.14168 (2021).
link: https://arxiv.org/abs/2110.14168
State-of-the-art language models can match human performance on many tasks, but they still struggle to robustly perform multi-step mathematical reasoning. To diagnose the failures of current models..

[Paper Review] Mistral 7B
paper: Jiang, Albert Q., et al. "Mistral 7B." arXiv preprint arXiv:2310.06825 (2023).
link: https://arxiv.org/abs/2310.06825
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our m..

[Paper Review] Reflexion: Language Agents with Verbal Reinforcement Learning
paper: Shinn, Noah, et al. "Reflexion: Language agents with verbal reinforcement learning." Advances in Neural Information Processing Systems 36 (2024).
link: https://proceedings.neurips.cc/paper_files/paper/2023/hash/1b44b878bb782e6954cd888628510e90-Abstract-Conference.html
