Constraints and rewards in behavior and optimal decision making

Ramirez-Ruiz, Jorge

dc.contributor

Universitat Pompeu Fabra. Departament de Tecnologies de la Informació i les Comunicacions

dc.contributor.author

Ramirez-Ruiz, Jorge

dc.date.accessioned

2023-07-12T08:29:28Z

dc.date.available

2023-07-12T08:29:28Z

dc.date.issued

2023-06-22

dc.identifier.uri

http://hdl.handle.net/10803/688642

dc.description.abstract

The idea of optimal decision-making presupposes certain features of the agent and their environment. This thesis examines two common assumptions in disciplines that study natural and artificial behavior: perfect rationality and reward maximization. Defining or inferring a reward function to maximize can be problematic, especially when one considers the constraints faced by the agents. First, we explore the breadth–depth dilemma, a tradeoff between superficial and deep sampling of options that arises when resources are finite. In the models, two major regimes of optimal sample allocation arise as a function of sampling capacity, offering alternative ways to understand “suboptimal” behavior. Additionally, we propose a novel intrinsic motivation approach based on occupying as many paths in the environment as possible, using rewards as a means rather than as the goal. Agents can thus attach meaning to reward and develop diverse yet goal-directed behaviors. This approach presents novel opportunities to understand fluid, naturalistic behavior.

dc.description.abstract

The notion of optimal decisions presupposes certain characteristics of the agent and its environment. This thesis examines two common assumptions across disciplines that study natural and artificial behavior: perfect rationality and reward maximization. Defining or inferring a reward function to maximize can be problematic, especially when one considers the constraints the agent faces. First, this thesis explores the breadth–depth dilemma, a tradeoff that contrasts superficial with deep sampling of the available options when resources are limited. In our models, two main regimes emerge for the optimal allocation of resources as a function of sampling capacity, which offers alternative ways to understand some “suboptimal” behaviors. Additionally, an intrinsic motivation perspective is proposed, based on maximal occupancy of trajectories in the environment, using rewards as a means and not as an end. Agents can thus assign meaning to rewards and develop behaviors that are variable yet goal-directed. This approach offers new opportunities to understand natural, fluid behaviors.

dc.format.extent

154 p.

dc.language.iso

eng

dc.publisher

Universitat Pompeu Fabra

dc.rights.license

Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nc-nd/4.0/

dc.rights.uri

http://creativecommons.org/licenses/by-nc-nd/4.0/

dc.source

TDX (Tesis Doctorals en Xarxa)

dc.subject

Decision making

dc.subject

Optimality

dc.subject

Constrained optimization

dc.subject

Bounded rationality

dc.subject

Breadth–depth tradeoff

dc.subject

Intrinsic motivation

dc.subject

Reward hypothesis

dc.subject

Entropy

dc.subject

Reinforcement learning

dc.subject

Goal-directed behavior

dc.subject

Toma de decisiones

dc.subject

Optimalidad

dc.subject

Optimización con restricciones

dc.subject

Racionalidad limitada

dc.subject

Balance amplitud–profundidad

dc.subject

Motivación intrínseca

dc.subject

Hipótesis de recompensa

dc.subject

Entropía

dc.subject

Aprendizaje por refuerzos

dc.subject

Comportamiento orientado a metas

dc.title

Constraints and rewards in behavior and optimal decision making

dc.type

info:eu-repo/semantics/doctoralThesis

dc.type

info:eu-repo/semantics/publishedVersion

dc.subject.udc

dc.contributor.authoremail

jorge.ramirez@upf.edu

dc.contributor.director

Moreno-Bote, Ruben

dc.embargo.terms

none

dc.rights.accessLevel

info:eu-repo/semantics/openAccess

dc.description.degree

Programa de Doctorat en Tecnologies de la Informació i les Comunicacions

Documents

tjrr.pdf

11.18 MB PDF

This item appears in the following Collection(s)

Programa de Doctorat en Tecnologies de la Informació i les Comunicacions [376]