  1. 4 EXPERIMENTS provide an investigation of the effectiveness of MAMBA. We first show that MAMBA is a high-performing meta-RL algorithm: compared to baselines it obtains high returns …

  2. MAMBA: an Effective World Model Approach for Meta …

    MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning Part of International Conference on Representation Learning 2024 (ICLR 2024) Conference

  3. MambaQuant: Quantizing the Mamba Family with Variance …

    Mamba is an efficient sequence model that rivals Transformers and demonstrates significant potential as a foundational architecture for various tasks. Quantization is commonly used in …

  4. Autoregressive Pretraining with Mamba in Vision

    The vision community has started to build with the recently developed state space model, Mamba, as the new backbone for a range of tasks. This paper shows that Mamba's visual capability …

  5. …inspired its adoption in vision applications. Vision Mamba (Vim) (Zhu et al., 2024) utilizes Vim blocks composed of pure Mamba layers: each Vim block leverages both forward and …

  6. LongMamba: Enhancing Mamba's Long-Context Capabilities via …

    To address this significant shortfall and achieve both efficient and accurate long-context understanding, we propose LongMamba, a training-free technique that significantly enhances …

  7. Jamba: Hybrid Transformer-Mamba Language Models

    We study various architectural decisions, such as how to combine Transformer and Mamba layers, and how to mix experts, and show that some of them are crucial in large scale modeling.

  8. At its core, each Mamba block utilizes the selective SSM (S6) layer (Gu & Dao, 2023), which is specifically designed to handle sequential data by preserving structured state dynamics across …

  9. We propose a novel Mamba-based Koopman operator (MamKO) modeling method, which leverages matrices generated from the Mamba structure to model complex nonlinear systems. …

  10. Drama: Mamba-Enabled Model-Based Reinforcement Learning Is …

    Transformers, on the other hand, suffer from the quadratic memory and computational complexity of self-attention mechanisms, scaling as $O(n^2)$, where $n$ is the sequence length. To …
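
The quadratic-versus-linear scaling claim in the last result can be illustrated with a minimal NumPy sketch (not taken from any of the listed papers; the toy `attention_scores` and `linear_recurrence` functions are illustrative assumptions): self-attention materializes an n×n score matrix, while an SSM-style recurrence of the kind Mamba builds on walks the sequence once with a fixed-size state.

```python
import numpy as np

def attention_scores(x):
    # Self-attention compares every token with every other token,
    # so the score matrix has shape (n, n): O(n^2) memory and compute.
    n, d = x.shape
    q, k = x, x  # illustrative assumption: identity projections for Q and K
    return (q @ k.T) / np.sqrt(d)

def linear_recurrence(x, a=0.9):
    # A toy SSM-style scan: a fixed-size hidden state updated once per
    # token, so time grows as O(n) and state stays O(1) per channel.
    h = np.zeros(x.shape[1])
    outputs = []
    for t in range(x.shape[0]):
        h = a * h + x[t]
        outputs.append(h.copy())
    return np.stack(outputs)

x = np.random.randn(1024, 16)      # n = 1024 tokens, d = 16 features
print(attention_scores(x).shape)   # (1024, 1024) -> grows quadratically in n
print(linear_recurrence(x).shape)  # (1024, 16)   -> grows linearly in n
```

Doubling the sequence length quadruples the size of the attention score matrix but only doubles the work of the recurrence, which is the contrast the Drama abstract points to.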