Formal Languages and Automata for Reward Function Specification and Efficient Reinforcement Learning

Simons Institute via YouTube

Classroom Contents

  1. Intro
  2. Acknowledgements
  3. Reinforcement Learning (RL)
  4. Challenges of Real-World RL
  5. Goals and Preferences
  6. Linear Temporal Logic (LTL): a compelling logic for expressing temporal properties of traces
  7. Challenges to RL
  8. Toy Problem Disclaimer
  9. Running Example
  10. Decoupling Transition and Reward Functions
  11. The Rest of the Talk
  12. Define a Reward Function using a Reward Machine
  13. Reward Function Vocabulary
  14. Simple Reward Machine
  15. Reward Machines in Action
  16. Other Reward Machines
  17. Q-Learning Baseline
  18. Option-Based Hierarchical RL (HRL)
  19. HRL with RM-Based Pruning (HRL-RM)
  20. HRL Methods Can Find Suboptimal Policies
  21. Q-Learning for Reward Machines (QRM)
  22. QRM in Action
  23. Recall: Methods for Exploiting RM Structure
  24. QRM + Reward Shaping (QRM + RS)
  25. Test Domains
  26. Test in Discrete Domains
  27. Office World Experiments
  28. Minecraft World Experiments
  29. Function Approximation with QRM
  30. Water World Experiments
  31. Creating Reward Machines
  32. Reward Specification: One Size Does Not Fit All
  33. Construct Reward Machine from Formal Languages
  34. Generate RM using a Symbolic Planner
  35. Learn RMs for Partially-Observable RL
