GatedLinearRNN

GatedLinearRNN Equation
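The equation image is not reproduced here. As a reference point, in the notation of the GateLoop paper the data-controlled linear recurrence at the core of the model takes, up to parametrization details (the paper additionally uses a complex-valued state-transition gate with data-controlled magnitude and phase), the form

$$h_t = a_t \odot h_{t-1} + k_t^{\top} v_t, \qquad y_t = q_t \, h_t,$$

where the query $q_t$, key $k_t$, value $v_t$ and state-transition gate $a_t$ are all derived from the input at step $t$.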

GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling
Tobias Katsch*
Paper: https://arxiv.org/abs/2311.01927

About

Gated linear RNNs (Mamba, GateLoop, HGRN) form a novel class of sequence models which generalize linear recurrent models such as S4, S5, LRU and RetNet by employing data-controlled state transitions. While retaining a low-cost, linear-complexity recurrent inference mode, they can be trained extremely efficiently in parallel, with logarithmic parallel depth, using the highly optimized JAX associative scan implementation. This repository implements a practical gated linear recurrent model with default choices for the input, hidden and gate activations, and provides both a drop-in replacement for causal multi-head attention and a gated linear RNN language model architecture. Furthermore, gated linear RNNs can be used to train true recurrent models (GRU, LSTM) extremely fast: first train using associative scans, then switch to a true recurrent mode (by enabling recurrent weights) for finetuning.
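To make the parallel-training claim concrete, here is a minimal, self-contained sketch of how a first-order gated recurrence h_t = a_t * h_{t-1} + b_t can be evaluated with jax.lax.associative_scan. This illustrates the technique; it is not the repository's actual code, and the function name gated_scan is illustrative:

```python
# Minimal sketch, NOT the repository's code: parallel evaluation of the
# gated recurrence h_t = a_t * h_{t-1} + b_t via an associative scan.
import jax
import jax.numpy as jnp

def gated_scan(a, b):
    """Return all states h_1..h_T of h_t = a_t * h_{t-1} + b_t (with h_0 = 0)."""
    def combine(left, right):
        a_l, b_l = left
        a_r, b_r = right
        # Composing two recurrence steps stays in the same (a, b) form:
        # h = a_r * (a_l * h + b_l) + b_r = (a_l * a_r) * h + (a_r * b_l + b_r)
        return a_l * a_r, a_r * b_l + b_r
    _, h = jax.lax.associative_scan(combine, (a, b))
    return h

# Example: length-8, dimension-4 sequence with data-controlled gates in (0, 1).
a = jax.nn.sigmoid(jax.random.normal(jax.random.PRNGKey(0), (8, 4)))
b = jax.random.normal(jax.random.PRNGKey(1), (8, 4))
h = gated_scan(a, b)  # shape (8, 4)
```

Because the combine operator is associative, XLA can evaluate the scan as a balanced tree rather than a sequential loop, which is what yields the logarithmic parallel depth mentioned above.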

Installation

Other requirements: a working JAX installation (the parallel training path relies on JAX's associative scan).

Usage

We provide 2 main modules: a gated linear RNN layer, intended as a drop-in replacement for causal multi-head attention, and a gated linear RNN language model architecture built on top of it. An illustrative re-implementation of the layer is sketched below.
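The following is an illustrative, self-contained sketch of what such a layer computes, not the repository's actual module or API; the parameter names, the single-head setup, and the real-valued sigmoid gate are all simplifying assumptions:

```python
# Illustrative re-implementation (NOT the repository's code) of a single-head
# gated linear RNN layer used as a causal attention replacement.
import jax
import jax.numpy as jnp

def gated_linear_rnn_layer(params, x):
    """x: (T, d_model) -> (T, d_model), causal by construction."""
    q = x @ params["wq"]                      # (T, d_k) queries
    k = x @ params["wk"]                      # (T, d_k) keys
    v = x @ params["wv"]                      # (T, d_v) values
    a = jax.nn.sigmoid(x @ params["wa"])      # (T, d_k) data-controlled gates

    # Per-step state update S_t = a_t * S_{t-1} + k_t^T v_t on (d_k, d_v) states.
    kv = k[:, :, None] * v[:, None, :]        # (T, d_k, d_v) outer products
    a_full = jnp.broadcast_to(a[:, :, None], kv.shape)

    def combine(left, right):
        a_l, s_l = left
        a_r, s_r = right
        return a_l * a_r, a_r * s_l + s_r

    _, S = jax.lax.associative_scan(combine, (a_full, kv))  # (T, d_k, d_v)
    y = jnp.einsum("tk,tkv->tv", q, S)        # read out each state with its query
    return y @ params["wo"]

# Tiny smoke test with random parameters.
d_model, d_k, d_v, T = 16, 8, 8, 32
keys = jax.random.split(jax.random.PRNGKey(0), 6)
params = {
    "wq": jax.random.normal(keys[0], (d_model, d_k)) * 0.1,
    "wk": jax.random.normal(keys[1], (d_model, d_k)) * 0.1,
    "wv": jax.random.normal(keys[2], (d_model, d_v)) * 0.1,
    "wa": jax.random.normal(keys[3], (d_model, d_k)) * 0.1,
    "wo": jax.random.normal(keys[4], (d_v, d_model)) * 0.1,
}
x = jax.random.normal(keys[5], (T, d_model))
y = gated_linear_rnn_layer(params, x)  # (32, 16)
```

Because each position's output depends only on state accumulated from earlier positions, the layer is causal by construction, which is what makes it a valid replacement for masked multi-head attention.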

Synthetic speech generation examples

https://tobiaskatsch.github.io/GatedLinearRNN/

Citation

If you use this codebase, please cite:

@misc{katsch2024GateLoop,
      title={GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling}, 
      author={Tobias Katsch},
      year={2024},
      eprint={2311.01927},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}