THE 20 QUESTIONS GAME TO DISTINGUISH LARGE LANGUAGE MODELS

Gurvan Richardeau; Erwan Le Merrer; Camilla Penzo; Gilles Trédan

Preprint, Working Paper. Year: 2024



Gurvan Richardeau

Role: Author
PersonId : 1415897

Pôle d'expertise de la régulation numérique

Erwan Le Merrer

Role: Author
PersonId : 1186574

the World Is Distributed Exploring the tension between scale and coordination

Camilla Penzo

Role: Author
PersonId : 1411726

Pôle d'expertise de la régulation numérique

Gilles Trédan

Role: Author
PersonId : 14277
IdHAL : gilles-tredan
IdRef : 119990385

Équipe Tolérance aux fautes et Sûreté de Fonctionnement informatique

Abstract

Drawing a parallel with the 20 questions game, we present a method to determine whether two large language models (LLMs), accessed only as black boxes, are the same or not. The goal is to use a small set of (benign) binary questions, typically fewer than 20. We formalize the problem and first establish a baseline that randomly selects questions from known benchmark datasets and achieves an accuracy of nearly 100% within 20 questions. After showing optimal bounds for this problem, we introduce two effective questioning heuristics that discriminate among 22 LLMs using half as many questions for the same task. These methods offer significant advantages in terms of stealth and are thus of interest to auditors or copyright owners facing suspicions of model leaks.
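The random-question baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the decision rule, so the sketch assumes deterministic yes/no answers and declares two models different as soon as one answer differs. The names `same_model`, `toy_model_x`, `toy_model_y`, and `QUESTIONS` are hypothetical.

```python
import random
from typing import Callable, Sequence

# Assumption: a black-box model is any callable mapping a question to True/False.
Model = Callable[[str], bool]


def same_model(model_a: Model, model_b: Model, questions: Sequence[str],
               budget: int = 20, seed: int = 0) -> bool:
    """Return True if both models agree on every randomly sampled question."""
    rng = random.Random(seed)
    asked = rng.sample(list(questions), k=min(budget, len(questions)))
    # Assumption: answers are deterministic, so a single disagreement is decisive.
    return all(model_a(q) == model_b(q) for q in asked)


# Hypothetical stand-ins for two deployed models and a question pool.
QUESTIONS = [f"question-{i}" for i in range(12)]


def toy_model_x(question: str) -> bool:
    # Answers "yes" to even-numbered questions.
    return int(question.split("-")[1]) % 2 == 0


def toy_model_y(question: str) -> bool:
    # Answers "yes" to questions whose number is a multiple of 3.
    return int(question.split("-")[1]) % 3 == 0
```

Calling `same_model(toy_model_x, toy_model_x, QUESTIONS)` compares a model with itself, while `same_model(toy_model_x, toy_model_y, QUESTIONS)` compares two models that disagree on part of the pool. With real LLMs, answers may vary between calls, so a practical test would likely need a tolerance threshold rather than exact agreement.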

Keywords

LLMs; black-box; distinguishability

Domains

Computer Science [cs]

Main file

main-arxiv.pdf (314.01 KB)

Origin: Files produced by the author(s)

Contributor: Erwan Le Merrer

https://hal.science/hal-04699271

Submitted on: Monday, 16 September 2024, 16:54:18

Last modified on: Thursday, 19 September 2024, 09:00:08

Dates and versions

hal-04699271 , version 1 (16-09-2024)

License

Attribution

Identifiers

HAL Id : hal-04699271 , version 1

Cite

Gurvan Richardeau, Erwan Le Merrer, Camilla Penzo, Gilles Trédan. THE 20 QUESTIONS GAME TO DISTINGUISH LARGE LANGUAGE MODELS. 2024. ⟨hal-04699271⟩


Collections

UNIV-TLSE2 UNIV-RENNES1 CNRS INRIA INSA-RENNES INSA-TOULOUSE LAAS IRISA LAAS-TSF UT1-CAPITOLE CENTRALESUPELEC LAAS-INFORMATIQUE-CRITIQUE INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE LAAS-RISC UR1-MATH-NUM CYBERSCHOOL TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP LAAS-TRUST

