Simple item record

dc.date.accessioned 2022-04-29T13:14:50Z
dc.date.available 2022-04-29T13:14:50Z
dc.date.issued 2002
dc.identifier.uri http://sedici.unlp.edu.ar/handle/10915/135299
dc.description.abstract We have studied the Reinforcement Function Design Process in two steps. In the first, we considered the translation of a natural language description into an instance of our proposed Reinforcement Function General Expression. In the second, we went deeper into the tuning of the parameters in this expression, which allowed us to obtain optimal definitions of the reinforcement function (relative to exploration). Since the General Expression is based on constraints, we have identified them according to the type of state-variable estimator on which they act, in particular: position and velocity. Using a particular but representative Reinforcement Function (RF) expression, we study the relation between the sum of each reinforcement type and the RF parameters during the exploration phase of learning. For linear relations, we propose an analytic method to obtain the RF parameter values (no experimentation required). For non-linear but monotonic relations, we propose the Update Parameter Algorithm (UPA) and show that UPA can efficiently adjust the proportion of negative and positive reinforcements received during the exploratory phase of learning. Additionally, we study the feasibility and consequences of adapting the RF during the learning process so as to improve the learning convergence of the system. Dynamic-UPA allows the whole learning process to maintain a desired ratio of positive and negative rewards. Thus, we introduce an approach to the exploration-exploitation dilemma, a necessary step for efficient Reinforcement Learning. We show, with several experiments involving robots (mobile and arm), the performance of the proposed design methods. Finally, we emphasize the main conclusions and present some future directions of research. en
dc.language en es
dc.subject Reinforcement function es
dc.subject Reinforcement learning es
dc.subject robot learning es
dc.subject autonomous robot es
dc.subject behavior-based approach es
dc.title Contribution to the study and the design of reinforcement functions es
dc.type Articulo es
sedici.identifier.uri https://publicaciones.sadio.org.ar/index.php/EJS/article/view/111 es
sedici.identifier.issn 1514-6774 es
sedici.creator.person Santos, Juan Miguel es
sedici.subject.materias Ciencias Informáticas es
sedici.description.fulltext true es
mods.originInfo.place Sociedad Argentina de Informática e Investigación Operativa es
sedici.subtype Articulo es
sedici.rights.license Creative Commons Attribution 4.0 International (CC BY 4.0)
sedici.rights.uri http://creativecommons.org/licenses/by/4.0/
sedici.description.peerReview peer-review es
sedici.relation.journalTitle Electronic Journal of SADIO es
sedici.relation.journalVolumeAndIssue vol. 4 es
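The abstract describes UPA as a procedure that adjusts reinforcement-function parameters so that the proportion of positive and negative reinforcements observed during exploration matches a desired ratio. A minimal sketch of that idea, assuming a single scalar RF threshold parameter and random exploration (the function name, the uniform "state estimate" signal, and the learning rate are all illustrative assumptions, not the paper's actual formulation):

```python
import random

def update_parameter_algorithm(target_pos_ratio, steps=10000, lr=0.01, seed=0):
    """Hypothetical UPA-style sketch: tune a reward threshold so the
    observed fraction of positive reinforcements during random
    exploration approaches a desired target ratio."""
    rng = random.Random(seed)
    threshold = 0.0  # the RF parameter being tuned (assumed scalar)
    pos = 0
    for t in range(1, steps + 1):
        # Stand-in for a state-variable estimate seen during exploration.
        signal = rng.uniform(-1.0, 1.0)
        reward = 1 if signal > threshold else -1
        if reward > 0:
            pos += 1
        observed = pos / t
        # Nudge the threshold so the running positive ratio tracks the target:
        # too many positives raises the bar, too few lowers it.
        threshold += lr * (observed - target_pos_ratio)
    return threshold, pos / steps
```

Under these assumptions the threshold settles where the positive-reward probability equals the target, which is the monotonic-relation case the abstract says UPA handles; the Dynamic-UPA variant described there would keep applying such updates throughout learning rather than only in the exploratory phase.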

