  1. [2410.02703] Selective Attention Improves Transformer - arXiv.org

    Oct 3, 2024 · We introduce Selective Attention, a simple parameter-free change to the standard attention mechanism which reduces attention to unneeded elements. Selective attention …

  2. Selective Attention Improves Transformer - OpenReview

    Jan 22, 2025 · Abstract: Unneeded elements in the attention’s context degrade performance. We introduce Selective Attention, a simple parameter-free change to the standard attention …
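
Results 1 and 2 point to the same paper (arXiv:2410.02703). For intuition only, here is a minimal single-head NumPy sketch of the general idea of down-weighting no-longer-needed tokens. Using the head's own logits as selection scores, keeping only positive (ReLU) votes, and subtracting their causal running sum from later queries' logits are all simplifying assumptions of this sketch, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def selective_attention(q, k, v):
    """Single-head causal attention with a sketched selection mask.

    Assumption: the head's own pre-softmax logits double as selection
    scores; positive scores act as causal "mask this token" votes that
    are subtracted from all later queries' logits. The paper derives its
    scores differently; this only illustrates the mechanism.
    """
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)                 # (n, n) attention logits
    causal = np.tril(np.ones((n, n), dtype=bool))
    logits = np.where(causal, logits, -np.inf)

    votes = np.where(causal, np.maximum(logits, 0.0), 0.0)
    np.fill_diagonal(votes, 0.0)                  # a token never masks itself

    # f[i, j] = total votes against key j cast by tokens before query i.
    f = np.cumsum(votes, axis=0)
    f = np.vstack([np.zeros((1, n)), f[:-1]])     # votes only affect later queries

    return softmax(np.where(causal, logits - f, -np.inf)) @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 8))
out = selective_attention(x, x, x)                # (6, 8); "consumed" keys fade
```

The appeal the abstract highlights is that nothing here adds parameters: the mask is built entirely from quantities the attention layer already computes.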

  3. Accepted Main Conference Papers - ACL 2025

    Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review · Yidong Gan, Maciej Rybinski, Ben Hachey, Jonathan K. …

  4. Artificial Intelligence 2025 - arXiv.org

    Subjects: Artificial Intelligence (cs.AI); Materials Science (cond-mat.mtrl-sci); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Neural and Evolutionary …

  5. ATTENTION2D: Communication Efficient Distributed Self-Attention

    Mar 20, 2025 · In this paper, we introduce ATTENTION2D, a novel approach that exploits parallelism along two dimensions - query and key/value - of the self-attention operation. This method enables …
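
The snippet above only names the two parallel dimensions. As a hedged, single-process illustration of what a query × key/value grid computes, the following NumPy sketch splits Q into row blocks and K/V into column blocks, then merges partial tiles with a numerically stable online-softmax combine. The grid shape pq × pkv, the serial loops standing in for devices, and the absence of any communication schedule are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def attention_2d_tiled(q, k, v, pq=2, pkv=2):
    """Single-process sketch of attention tiled over a pq x pkv grid.

    Each (query-block, key/value-block) tile yields a partial result plus
    softmax statistics (running max m, denominator l); tiles are merged
    along the key/value axis with a stable log-sum-exp rescaling. In a
    real distributed setting each tile would live on its own device.
    """
    n, d = q.shape
    out = []
    for qb in np.array_split(q, pq, axis=0):            # parallel axis 1: queries
        m = np.full(qb.shape[0], -np.inf)               # running row-wise max
        l = np.zeros(qb.shape[0])                       # running softmax denominator
        acc = np.zeros((qb.shape[0], d))                # running weighted value sum
        for kb, vb in zip(np.array_split(k, pkv, axis=0),
                          np.array_split(v, pkv, axis=0)):  # parallel axis 2: keys/values
            s = qb @ kb.T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=1))
            p = np.exp(s - m_new[:, None])
            scale = np.exp(m - m_new)                   # rescale earlier partial sums
            l = l * scale + p.sum(axis=1)
            acc = acc * scale[:, None] + p @ vb
            m = m_new
        out.append(acc / l[:, None])
    return np.vstack(out)

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
s = q @ k.T / np.sqrt(4)
ref = np.exp(s - s.max(1, keepdims=True))
ref = (ref / ref.sum(1, keepdims=True)) @ v
assert np.allclose(attention_2d_tiled(q, k, v), ref)    # matches untiled attention
```

Distributing the outer loop across pq devices and the inner loop across pkv devices, with a reduction of (m, l, acc) along the key/value axis, is the shape of the two-dimensional parallelism the title describes.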

  6. Computer Science - arXiv.org

    Apr 13, 2026 · Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Multiagent Systems (cs.MA)

  7. Artificial Intelligence May 2025 - arXiv.org

    Subjects: Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA); Networking and Internet Architecture (cs.NI) Comments: 11 pages, 1 …

  8. NeurIPS 2025 Papers

    DERD-Net: Learning Depth from Event-based Ray Densities · Reduction-based Pseudo-label Generation for Instance-dependent Partial Label Learning · More Than Just Functional: LLM-as-a-Critique for …

  9. AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs - arXiv.org

    Abstract: In this paper, we introduce a novel learning paradigm for adaptive Large Language Model (LLM) agents that eliminates the need for fine-tuning the underlying LLMs.

  10. arXiv.org e-Print archive

    arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, …