Autotelic LLM-based exploration for goal-conditioned RL
Unité d'Informatique et d'Ingénierie des Systèmes (U2IS)
Conference paper, Year: 2024


Abstract

Designing autotelic agents capable of autonomously generating and pursuing their own goals is a promising direction for open-ended learning and skill acquisition in reinforcement learning. This challenge is especially difficult in open worlds that require inventing new, previously unobserved goals. In this work, we propose an architecture in which a single generalist autotelic agent is trained on an automatic curriculum of goals. We leverage large language models (LLMs) to generate goals as code for reward functions, guided by learnability and difficulty estimates. The goal-conditioned RL agent is then trained on goals sampled according to learning progress. We compare our method to an adaptation of OMNI-EPIC to goal-conditioned RL. Our preliminary experiments show that our method generates a higher proportion of learnable goals, suggesting better adaptation to the goal-conditioned learner.
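The learning-progress-based goal sampling described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: the windowed success-rate estimator, the `eps` exploration floor, and all function names are assumptions for the sake of the example.

```python
import random

def learning_progress(history, window=10):
    """Estimate learning progress for one goal as the absolute change in
    mean success rate between two consecutive recent windows.

    `history` is a list of 0/1 success outcomes (hypothetical bookkeeping;
    the paper's actual estimator is not specified here).
    """
    if len(history) < 2 * window:
        return 0.0
    recent = sum(history[-window:]) / window
    earlier = sum(history[-2 * window:-window]) / window
    return abs(recent - earlier)

def sample_goal(histories, eps=0.1):
    """Sample a goal with probability proportional to its learning-progress
    estimate, with a small epsilon floor so that every goal (including
    newly generated ones with no history) keeps nonzero probability."""
    goals = list(histories)
    weights = [learning_progress(histories[g]) + eps for g in goals]
    return random.choices(goals, weights=weights, k=1)[0]
```

Under this scheme, goals the agent is actively getting better (or worse) at are sampled most often, while mastered or currently unlearnable goals are sampled rarely but never excluded.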
Main file
52_Autotelic_LLM_based_explora.pdf (967.85 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04861896 , version 1 (02-01-2025)

Identifiers

  • HAL Id: hal-04861896, version 1

Cite

Guillaume Pourcel, Thomas Carta, Grgur Kovač, Pierre-Yves Oudeyer. Autotelic LLM-based exploration for goal-conditioned RL. Intrinsically Motivated Open-ended Learning Workshop at NeurIPS 2024, Dec 2024, Vancouver, Canada. ⟨hal-04861896⟩
