João Carvalho

Quick Info

Research Interests

Machine Learning, Robotics, Reinforcement Learning

More Information

Personal Website, Publications, Google Scholar

Contact Information

Mail. João Carvalho
TU Darmstadt, FG-IAS
Hochschulstr. 10, 64289 Darmstadt
Office. Room E325, Building S2|02
Phone. +49-6151-16-25372

João joined the Intelligent Autonomous Systems Group as a PhD student in November 2019. He received an MSc degree in Computer Science from the Albert-Ludwigs-Universität Freiburg, and previously completed a Master's degree in Electrical and Computer Engineering at the Instituto Superior Técnico, University of Lisbon.

His master's thesis, entitled "Nonparametric Off-Policy Policy Gradient", was written at IAS under the supervision of Samuele Tosatto and explored an approach to obtaining an off-policy gradient update with better sample efficiency.

Key References

Measure-Valued Derivatives

  1. Carvalho, J., Tateo, D., Muratore, F., Peters, J. (2021). An Empirical Analysis of Measure-Valued Derivatives for Policy Gradients, International Joint Conference on Neural Networks (IJCNN).

Off-Policy and Offline Reinforcement Learning

  1. Tosatto, S., Carvalho, J., Peters, J. (in press). Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
  2. Tosatto, S., Carvalho, J., Abdulsamad, H., Peters, J. (2020). A Nonparametric Off-Policy Policy Gradient, Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS).
  3. Carvalho, J.A.C. (2019). Nonparametric Off-Policy Policy Gradient, Master's Thesis.


  
