Multimodal (2)

[Paper Review] Android in the Zoo: Chain-of-Action-Thought for GUI Agents (AITZ)
paper: Zhang, Jiwen, et al. "Android in the zoo: Chain-of-action-thought for gui agents." arXiv preprint arXiv:2403.02713 (2024).
link: https://arxiv.org/abs/2403.02713
Large language model (LLM) leads to a surge of autonomous GUI agents for smartphone, which completes a task triggered by natural language through predicting a sequence of ac..

[Paper Review] Multimodal Chain-of-Thought Reasoning in Language Models (MM-CoT)
paper: Zhang, Zhuosheng, et al. "Multimodal chain-of-thought reasoning in language models." arXiv preprint arXiv:2302.00923 (2023).
link: https://arxiv.org/abs/2302.00923
Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains..