João Carvalho

Quick Info

Research Interests

Reinforcement Learning, Robotics, Manipulation

More Information

Publications, Google Scholar, ORCID

Contact Information

Mail. João Carvalho
TU Darmstadt, FG-IAS
Hochschulstr. 10, 64289 Darmstadt
Office. Room E325, Building S2|02
Phone (work). +49-6151-16-25372


João joined the Intelligent Autonomous Systems group as a PhD student in November 2019. He received an MSc degree in Computer Science from the Albert-Ludwigs-Universität Freiburg, and previously completed a Master's degree in Electrical and Computer Engineering at the Instituto Superior Técnico of the University of Lisbon. His master's thesis was written at IAS under the supervision of Samuele Tosatto and explored an approach to obtaining an off-policy gradient with better sample efficiency.

He is currently working within the IKIDA project to develop algorithms that enable robots to learn from human input.

Robots require a particular set of skills to solve an assembly task, e.g., planning, vision, and manipulation. While they can excel at planning, and fairly good computer vision methods exist, they still lack the fine manipulation skills of humans, e.g., for insertion or placing tasks. These manipulation skills usually have to be hard-coded by a specialized technician rather than learned by the robot. During his PhD, João is looking into new ways to teach low-level manipulation skills to robots through imitation and reinforcement learning. His research topics include variance reduction in policy gradients, exploration for robotics, residual policy learning, imitation learning from video, and active visual search.


Key References

Measure-Valued Derivatives

  1. Carvalho, J., Tateo, D., Muratore, F., Peters, J. (2021). An Empirical Analysis of Measure-Valued Derivatives for Policy Gradients, International Joint Conference on Neural Networks (IJCNN).

Off-Policy and Offline Reinforcement Learning

  1. Tosatto, S., Carvalho, J., Peters, J. (in press). Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
  2. Tosatto, S., Carvalho, J., Abdulsamad, H., Peters, J. (2020). A Nonparametric Off-Policy Policy Gradient, Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS).
  3. Carvalho, J.A.C. (2019). Nonparametric Off-Policy Policy Gradient, Master Thesis.

Student Supervision

Teaching Assistant

  • Robot Learning (WS 2020)
  • Computational Engineering and Robotics (SS 2020, SS 2021)


  
