Statistical comparison in empirical computer science with minimal computation usage

Timothée Mathieu; Philippe Preux

doi:10.1145/3641525.3663618

Conference paper, Year: 2024


Timothée Mathieu

Role: Author
PersonId : 1130096
IdHAL : timothee-mathieu

Scool

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Philippe Preux

Role: Author
PersonId : 5488
IdHAL : preux-philippe
ORCID : 0000-0002-2067-2838
IdRef : 059896353

Abstract

The replicability of computational experiments remains a fundamental question. For example, the machine learning community has recently become aware of the poor replicability of many experimental studies that aim to compare the performance of various algorithms. Because of computational costs, it is often necessary to use methods that require as few computations as possible to obtain a replicable conclusion. The conclusion of the comparison should also be replicable, which calls for appropriate statistical tests. AdaStop is a recently introduced statistical test based on multiple group sequential tests. AdaStop adapts the number of executions of each experiment to stop as early as possible while ensuring that enough information is available to distinguish algorithms that perform better than the others in a statistically significant way. AdaStop was initially demonstrated on reinforcement learning tasks. In this short paper, we consider three case studies to investigate the use of AdaStop beyond its original field of application, and demonstrate that it is a test that may be used in a wide range of application domains.
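The idea of adapting the number of executions can be illustrated with a minimal sketch. This is not AdaStop itself: it compares only two algorithms, uses a plain permutation test at each interim look, and splits the significance level across looks with a Bonferroni correction, which is a conservative stand-in for a proper group sequential boundary. The names `permutation_pvalue`, `sequential_compare`, `run_a`, `run_b`, `batch_size`, and `max_looks` are hypothetical and chosen for illustration only.

```python
import numpy as np


def permutation_pvalue(x, y, rng, n_perm=2000):
    """Monte Carlo two-sided permutation p-value for a difference in means."""
    observed = abs(np.mean(x) - np.mean(y))
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = abs(np.mean(perm[:len(x)]) - np.mean(perm[len(x):]))
        # Small tolerance so that ties are counted despite rounding.
        if diff >= observed - 1e-12:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)


def sequential_compare(run_a, run_b, batch_size=5, max_looks=5,
                       alpha=0.05, seed=0):
    """Run both algorithms in batches and stop as soon as a look rejects.

    Assumption: each of the max_looks interim tests is run at level
    alpha / max_looks (Bonferroni), not at an AdaStop boundary.
    Returns (decision, runs per algorithm, last p-value).
    """
    rng = np.random.default_rng(seed)
    scores_a, scores_b = [], []
    p = 1.0
    for _ in range(max_looks):
        # Collect one more batch of executions for each algorithm.
        scores_a.extend(run_a(rng) for _ in range(batch_size))
        scores_b.extend(run_b(rng) for _ in range(batch_size))
        p = permutation_pvalue(np.array(scores_a), np.array(scores_b), rng)
        if p <= alpha / max_looks:
            return "different", len(scores_a), p
    return "no difference detected", len(scores_a), p


if __name__ == "__main__":
    # Hypothetical algorithms: each call returns one noisy score.
    decision, n_runs, p_value = sequential_compare(
        lambda rng: rng.normal(0.0, 1.0),
        lambda rng: rng.normal(1.5, 1.0),
    )
    print(decision, n_runs, p_value)
```

When the two score distributions are far apart, the loop typically stops after one or two batches; when they are close, it spends the full budget of `batch_size * max_looks` executions per algorithm before giving up.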

Keywords

CCS Concepts: • Mathematics of computing → Nonparametric statistics • General and reference → Empirical studies • Computing
Statistical Reproducibility, Statistical Tests, Benchmarking

Domains

Other [stat.ML]

Main file

adastop_acm.pdf (492.03 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-04718314

Submitted on: Thursday, October 3, 2024, 09:59:37

Last modified on: Friday, October 4, 2024, 03:20:29

Dates and versions

hal-04718314 , version 1 (03-10-2024)

License

Attribution

Identifiers

HAL Id : hal-04718314 , version 1
DOI : 10.1145/3641525.3663618

Cite

Timothée Mathieu, Philippe Preux. Statistical comparison in empirical computer science with minimal computation usage. ACM REP '24: ACM Conference on Reproducibility and Replicability, Jun 2024, Rennes, France. pp.20-24, ⟨10.1145/3641525.3663618⟩. ⟨hal-04718314⟩


Collections

CNRS INRIA CRISTAL INRIA2 UNIV-LILLE CRISTAL-SCOOL

14 views

6 downloads
