Journal Papers
  • Tosatto, S.; Carvalho, J.; Peters, J. (2022). Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 44, no. 10, pp. 5996-6010.
Conferences
  • Palenicek, D.; Lutter, M.; Carvalho, J.; Peters, J. (2023). Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning, International Conference on Learning Representations (ICLR).
  • Carvalho, J.; Le, A. T.; Baierl, M.; Koert, D.; Peters, J. (2023). Motion Planning Diffusion: Learning and Planning of Robot Motions with Diffusion Models, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
  • Carvalho, J.; Koert, D.; Daniv, M.; Peters, J. (2022). Adapting Object-Centric Probabilistic Movement Primitives with Residual Reinforcement Learning, 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids).
  • Vorndamme, J.; Carvalho, J.; Laha, R.; Koert, D.; Figueredo, L.; Peters, J.; Haddadin, S. (2022). Integrated Bi-Manual Motion Generation and Control shaped for Probabilistic Movement Primitives, 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids).
  • Carvalho, J.; Tateo, D.; Muratore, F.; Peters, J. (2021). An Empirical Analysis of Measure-Valued Derivatives for Policy Gradients, International Joint Conference on Neural Networks (IJCNN).
  • Tosatto, S.; Carvalho, J.; Abdulsamad, H.; Peters, J. (2020). A Nonparametric Off-Policy Policy Gradient, Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS).
Workshop Papers
  • Carvalho, J.; Peters, J. (2022). An Analysis of Measure-Valued Derivatives for Policy Gradients, Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM).
  • Carvalho, J.; Baierl, M.; Urain, J.; Peters, J. (2022). Conditioned Score-Based Models for Learning Collision-Free Trajectory Generation, NeurIPS 2022 Workshop on Score-Based Methods.
Theses
  • Carvalho, J. A. C. (2019). Nonparametric Off-Policy Policy Gradient, Master's Thesis.