
[2410.02703] Selective Attention Improves Transformer - arXiv.org
Oct 3, 2024 · We introduce Selective Attention, a simple parameter-free change to the standard attention mechanism which reduces attention to unneeded elements. Selective attention …
Selective Attention Improves Transformer - OpenReview
Jan 22, 2025 · Abstract: Unneeded elements in the attention’s context degrade performance. We introduce Selective Attention, a simple parameter-free change to the standard attention …
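The two snippets above describe a parameter-free change that reduces attention to unneeded context elements. As a rough single-head illustration of that idea (not the paper's exact formulation; the selection-score source `S` and the masking details are assumptions here), one can accumulate, for each context token, how strongly earlier tokens have flagged it as unneeded, and subtract that running penalty from the attention logits before the softmax:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def selective_attention(Q, K, V, S):
    """Causal attention with an accumulated penalty subtracted from the logits.

    Q, K, V: (T, d) arrays for a single head.
    S: (T, T) selection scores (in the paper these come from an existing head;
       here they are just an input). S[k, j] > 0 means token k marks context
       token j as unneeded for all later tokens.
    """
    T, d = Q.shape
    logits = Q @ K.T / np.sqrt(d)
    # Running penalty: token i is penalized against context token j by the
    # sum of positive selection scores from tokens k < i.
    F = np.cumsum(np.maximum(S, 0.0), axis=0)
    penalty = np.vstack([np.zeros((1, T)), F[:-1]])
    logits = logits - penalty
    # Standard causal mask on top.
    logits[np.triu(np.ones((T, T), dtype=bool), k=1)] = -np.inf
    return softmax(logits) @ V
```

With `S = 0` this reduces exactly to standard causal attention, which makes the "parameter-free change to the standard attention mechanism" framing easy to see: only the subtracted mask differs.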
Accepted Main Conference Papers - ACL 2025
Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review Yidong Gan, Maciej Rybinski, Ben Hachey, Jonathan K. …
Artificial Intelligence 2025 - arXiv.org
Subjects: Artificial Intelligence (cs.AI); Materials Science (cond-mat.mtrl-sci); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Neural and Evolutionary …
ATTENTION2D: Communication Efficient Distributed Self-Attention …
Mar 20, 2025 · In this paper, we introduce ATTENTION2D, a novel approach that exploits parallelism along two dimensions - query and key/value - of the self-attention operation. This method enables …
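The ATTENTION2D snippet describes parallelizing self-attention along both the query and the key/value dimensions. As a single-process sketch of why that decomposition is valid (this is an illustration of blockwise attention with an online-softmax combine, not the paper's distributed algorithm), each tile of a 2D grid of (query block, key/value block) pairs can compute a partial result, and the partials along the key/value axis combine exactly into full attention:

```python
import numpy as np

def tiled_attention(Q, K, V, q_tiles=2, kv_tiles=2):
    """Blockwise softmax attention over a 2D (query x key/value) tile grid.

    Each query block is processed independently (the "query" parallel axis);
    within it, key/value blocks are folded in one at a time using the
    online-softmax combine (the "key/value" axis), so no full (T, T) score
    matrix is ever materialized.
    """
    d = Q.shape[-1]
    outs = []
    for Qb in np.array_split(Q, q_tiles):
        m = np.full(Qb.shape[0], -np.inf)           # running row max
        l = np.zeros(Qb.shape[0])                   # running normalizer
        acc = np.zeros((Qb.shape[0], V.shape[1]))   # running weighted sum
        for Kb, Vb in zip(np.array_split(K, kv_tiles),
                          np.array_split(V, kv_tiles)):
            s = Qb @ Kb.T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=1))
            scale = np.exp(m - m_new)               # rescale old partials
            p = np.exp(s - m_new[:, None])
            l = l * scale + p.sum(axis=1)
            acc = acc * scale[:, None] + p @ Vb
            m = m_new
        outs.append(acc / l[:, None])
    return np.vstack(outs)
```

In a distributed setting each tile would live on a different worker; the point of the sketch is that the per-tile partials (`m`, `l`, `acc`) are all that needs to be exchanged to reconstruct the exact softmax-attention output.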
Computer Science - arXiv.org
Apr 13, 2026 · Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Multiagent Systems (cs.MA)
Artificial Intelligence May 2025 - arXiv.org
Subjects: Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA); Networking and Internet Architecture (cs.NI) Comments: 11 pages, 1 …
NeurIPS 2025 Papers
DERD-Net: Learning Depth from Event-based Ray Densities · Reduction-based Pseudo-label Generation for Instance-dependent Partial Label Learning · More Than Just Functional: LLM-as-a-Critique for …
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs - arXiv.org
Abstract: In this paper, we introduce a novel learning paradigm for adaptive Large Language Model (LLM) agents that eliminates the need for fine-tuning the underlying LLMs.
arXiv.org e-Print archive
arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, …