Preprint, working paper. Year: 2024

Quantized Approximately Orthogonal Recurrent Neural Networks

Abstract

In recent years, Orthogonal Recurrent Neural Networks (ORNNs) have gained popularity due to their ability to manage tasks involving long-term dependencies, such as the copy task, and their linear complexity. However, existing ORNNs use full-precision weights and activations, which prevents their deployment on compact devices. In this paper, we explore the quantization of the weight matrices in ORNNs, leading to Quantized approximately Orthogonal RNNs (QORNNs). The construction of such networks remained an open problem, acknowledged for its inherent instability. We propose and investigate two strategies to learn QORNNs by combining quantization-aware training (QAT) and orthogonal projections. We also study post-training quantization of the activations for pure integer computation of the recurrent loop. The most efficient models achieve results similar to state-of-the-art full-precision ORNNs, LSTMs and FastRNN on a variety of standard benchmarks, even with 4-bit quantization.
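The two strategies themselves are detailed in the paper; as a rough illustration only, the sketch below (plain PyTorch, with hypothetical names such as quantize_ste, project_orthogonal and QORNNCell that are not taken from the paper) shows one common way to combine a straight-through-estimator quantizer with a projection of the recurrent weight matrix back onto the orthogonal group after each optimizer step.

```python
# Illustrative sketch only (not the authors' code): quantization-aware
# training of a simple recurrent cell whose recurrent weight matrix is
# kept approximately orthogonal. Assumes plain PyTorch; all names are
# hypothetical.
import torch
import torch.nn as nn

def quantize_ste(w, num_bits=4):
    """Uniform symmetric quantizer with a straight-through estimator."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.detach().abs().max().clamp(min=1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    # Forward pass sees the quantized weights, backward pass the identity.
    return w + (w_q - w).detach()

def project_orthogonal(w):
    """Project a square matrix onto the orthogonal group (polar factor via SVD)."""
    u, _, vh = torch.linalg.svd(w)
    return u @ vh

class QORNNCell(nn.Module):
    """Vanilla RNN cell whose weights are quantized during the forward pass."""
    def __init__(self, input_size, hidden_size, num_bits=4):
        super().__init__()
        self.w_in = nn.Parameter(0.1 * torch.randn(hidden_size, input_size))
        self.w_rec = nn.Parameter(project_orthogonal(torch.randn(hidden_size, hidden_size)))
        self.bias = nn.Parameter(torch.zeros(hidden_size))
        self.num_bits = num_bits

    def forward(self, x, h):
        w_in_q = quantize_ste(self.w_in, self.num_bits)
        w_rec_q = quantize_ste(self.w_rec, self.num_bits)
        return torch.tanh(x @ w_in_q.t() + h @ w_rec_q.t() + self.bias)

def training_step(cell, optimizer, x_seq, y, loss_fn):
    """One QAT step; x_seq has shape (time, batch, input_size)."""
    h = x_seq.new_zeros(x_seq.size(1), cell.bias.numel())
    for x_t in x_seq:                       # unroll the recurrence over time
        h = cell(x_t, h)
    loss = loss_fn(h, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                   # re-project onto orthogonal matrices
        cell.w_rec.copy_(project_orthogonal(cell.w_rec))
    return loss.item()
```

The projection here uses the SVD-based polar factor, i.e. the closest orthogonal matrix in Frobenius norm; the actual projection and quantization schemes studied in the paper may differ.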

Dates and versions

hal-04434011 , version 1 (02-02-2024)
hal-04434011 , version 2 (07-06-2024)

Identifiers

hal-04434011

Cite

Armand Foucault, Franck Mamalet, François Malgouyres. Quantized Approximately Orthogonal Recurrent Neural Networks. 2024. ⟨hal-04434011v2⟩