Conference Papers, Year: 2024

A Study on Hierarchical Text Classification as a Seq2seq Task

Fatos Torba
Christophe Gravier
Charlotte Laclau
Abderrhammen Kammoun
Julien Subercaze


With the progress of generative neural models, Hierarchical Text Classification (HTC) can be cast as a generative task: given an input text, the model generates the sequence of predicted class labels, drawn from a label tree of arbitrary width and depth. Treating HTC as a generative task introduces multiple modeling choices. These range from the order in which the class tree is visited, which defines the order in which tokens are generated, through whether to constrain decoding to labels that respect the previous level's predictions, to the choice of the pre-trained language model itself. HTC models therefore differ from one another not only in architecture but also in the modeling choices that were made. Prior contributions lack transparent modeling choices and open implementations, which makes it hard to assess whether model performance stems from architectural or modeling decisions. For these reasons, this paper analyzes the impact of different modeling choices, along with common model errors and successes for this task. The analysis is based on an open framework released with this paper, which can facilitate the development of future contributions in the field by providing datasets, metrics, an error analysis toolkit, and the capability to readily test various modeling choices for a given model.
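Two of the modeling choices named in the abstract, the tree-visiting order and level-constrained decoding, can be illustrated with a minimal Python sketch. This is not taken from the paper's framework: the toy label tree, the function names (`linearize`, `allowed_next_labels`, `constrained_pick`), and the score dictionary standing in for model outputs are all hypothetical and serve only to show the idea.

```python
from collections import deque

# Hypothetical toy label tree (parent -> children); not one of the paper's datasets.
LABEL_TREE = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["football"],
    "physics": [],
    "biology": [],
    "football": [],
}


def linearize(gold_labels, order="bfs"):
    """Return the gold labels in the order a BFS or DFS traversal visits them.

    This ordering is what would define the target sequence of a seq2seq model.
    """
    gold = set(gold_labels)
    frontier = deque(["root"])
    sequence = []
    while frontier:
        # BFS takes the oldest node (queue); DFS takes the newest (stack).
        node = frontier.popleft() if order == "bfs" else frontier.pop()
        if node != "root" and node in gold:
            sequence.append(node)
        children = LABEL_TREE[node]
        # For DFS, push children reversed so the leftmost child is visited first.
        frontier.extend(children if order == "bfs" else reversed(children))
    return sequence


def allowed_next_labels(previous_level_predictions):
    """Labels permitted at the next level: children of the labels predicted one level up."""
    if not previous_level_predictions:
        return list(LABEL_TREE["root"])
    allowed = []
    for label in previous_level_predictions:
        allowed.extend(LABEL_TREE[label])
    return allowed


def constrained_pick(scores, previous_level_predictions):
    """Greedy step: best-scoring label among those consistent with the previous level."""
    allowed = allowed_next_labels(previous_level_predictions)
    if not allowed:
        return None  # the previous predictions were all leaves
    return max(allowed, key=lambda label: scores.get(label, float("-inf")))


gold = ["physics", "science", "football", "sports"]
print(linearize(gold, "bfs"))  # ['science', 'sports', 'physics', 'football']
print(linearize(gold, "dfs"))  # ['science', 'physics', 'sports', 'football']

# Assumed scores: unconstrained decoding would pick "football" here.
scores = {"football": 0.9, "physics": 0.6, "biology": 0.3}
print(constrained_pick(scores, ["science"]))  # physics
```

In this sketch the same gold label set yields two different target sequences depending on the traversal, and the constraint overrides a higher-scoring label that is inconsistent with the previous level. In an actual generative model the constraint would typically be applied to token logits during decoding rather than to whole-label scores.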
Main file
paper_331.pdf (413.69 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04423348, version 1 (29-01-2024)


  • HAL Id: hal-04423348, version 1


Fatos Torba, Christophe Gravier, Charlotte Laclau, Abderrhammen Kammoun, Julien Subercaze. A Study on Hierarchical Text Classification as a Seq2seq Task. 46th European Conference on Information Retrieval (ECIR 2024), Mar 2024, Glasgow, United Kingdom. ⟨hal-04423348⟩