Fleer, Sascha: Scaffolding for learning from reinforcement: Improving interaction learning. 2020
Contents
Abstract
Acknowledgement
Declaration
Contents
1 Introduction
1.1 Outline
I The toolbox
2 Reinforcement learning — a paradigm of human-inspired artificial intelligence
2.1 The basic description of a reinforcement learning problem
2.2 The Bellman equation
2.3 Optimal value functions
2.4 Q-learning using linear function approximators
2.4.1 Approximating the state-action space for a set of discrete actions
2.5 Policy gradient methods
2.5.1 The REINFORCE algorithm
2.6 Going deeper with neural networks
2.6.1 Deep Q-learning
2.6.2 Asynchronous models
2.7 Summary
3 The guiding principle of scaffolding
3.1 The concept of scaffolding in educational psychology
3.1.1 Learning how to ride a bicycle: an example of scaffolded learning
3.2 Teaching devices: employing computer-based tools for scaffolding the learning process of humans
3.2.1 Scaffolding the learning of a foreign language with the help of a computer-based tool
3.3 Scaffolding artificial agents by organizing learning on a meta-level
3.3.1 Recruiting and maintaining the learner's attention
3.3.2 Simplifying the task
3.3.3 Modelling and demonstration
3.3.4 Ongoing diagnosis and assessment
3.3.5 Fading support and eventual transfer of responsibility
3.3.6 Summary
3.4 Reformulating scaffolding as a principle for guiding the learning process of machines
3.4.1 Scaffolding in practice: inject meta-knowledge by compiling individual auxiliaries
3.5 A research map for scaffolding in machine learning
3.5.1 Four research questions for scaffolding an artificial agent
3.6 Summary
II Scaffolding: a universal approach for fostering the learning process
4 Scaffolding attention control by exploiting ``perceptive acting''
4.1 The concept of entropy and mutual information in the context of reinforcement learning
4.1.1 Exploiting mutual information as a ranking criterion for action sets
4.2 Applying the concept to complex environments
4.2.1 Estimating the probability distribution of state transitions
4.2.2 Estimating the entropy & mutual information
4.3 Summary
5 Scaffolding attention control by exploiting ``active visual perception''
5.1 The recurrent attention asynchronous advantage actor-critic model
5.1.1 Training
5.2 Summary
6 Scaffolding the learning of efficient haptic exploration using ``active haptic perception''
6.1 From human haptic perception to robotics
6.2 The haptic attention model
6.2.1 Training
6.3 Summary
7 Scaffolding the agent's internal representation through skill transfer
7.1 The combination of a structured curriculum with transfer learning — 4 strategies of skill transfer
7.1.1 Strategy 1 & 2: simple techniques for skill transfer
7.1.2 Strategy 3 & 4: refining skill transfer by analysing the learner's way of perception
7.2 Summary
III Facilitating the learning process of interaction problems: testing the proposed scaffolding approaches
8 A learning domain for mediated interaction
8.1 The general design concept of the simulation world
8.2 Realization of a 2D simulation world with simplified physics
8.3 Perceiving & acting: defining a suitable state and action space for multi-object interaction scenarios
8.3.1 A general set of discrete actions
8.3.2 Perceiving the environment
8.4 Designing suitable learning scenarios
8.5 Learning with a distance-related sensory input
8.5.1 The construction of a linear Q-learner
8.5.2 Creating a deep Q-learner
8.6 Summary
9 A first scaffold for learning the ``Extension-of-Reach Scenario'': determining the best action set
9.1 Experiments
9.2 Results
9.3 Discussion
10 A second scaffold for learning the ``Extension-of-Reach Scenario'': structuring the learning process
10.1 Experiments
10.1.1 Learning and evaluation
10.1.2 Applying the four transfer learning strategies
10.2 Results
10.3 Discussion
11 Scaffolding the learning process through ``active visual perception'': an attention-based approach
11.1 Experiments
11.2 Results
11.3 Discussion
12 A scaffold for enabling ``active haptic perception'': learning efficient haptic exploration
12.1 Designing the simulation world
12.1.1 The three building blocks of the simulation world
12.1.2 Implementing essential control primitives
12.1.3 The classification task
12.1.4 Creation of the dataset
12.2 Implementation of the haptic attention model
12.3 Experiments
12.4 Results
12.5 Discussion
IV Conclusion
13 Summary, conclusion & outlook
13.1 Four scaffolding approaches — a summary
13.2 Conclusion
13.3 Recommendations for future research
Bibliography
Appendices
A Pseudocode
B Used learning parameters
B.1 Linear Q-learning
B.1.1 State representations
B.1.2 Learning the ``Extension-of-Reach Scenario'' using different coordinate systems
B.2 Deep Q-learning
B.3 Recurrent attention advantage actor-critic model
B.4 Haptic attention model
C Floating Myrmex sensor: experimental results
D Supplementary material