Machine Learning, Robotics, Reinforcement Learning
TU Darmstadt, FG-IAS
Hochschulstr. 10, 64289 Darmstadt
Office. Room E225, Building S2|02
João joined the Intelligent Autonomous Systems Group as a PhD student in November 2019. He received an MSc degree in Computer Science from the Albert-Ludwigs-Universität Freiburg, and previously completed a Master's degree in Electrical and Computer Engineering at the Instituto Superior Técnico of the University of Lisbon.
His master's thesis, entitled "Nonparametric Off-Policy Policy Gradient", was written at IAS under the supervision of Samuele Tosatto and explored an approach to obtaining an off-policy gradient update with better sample efficiency.
Off-Policy and Offline Reinforcement Learning
- Tosatto, S.; Carvalho, J.; Abdulsamad, H.; Peters, J. (2020). A Nonparametric Off-Policy Policy Gradient, Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS).