I have graduated and am now a full-time employee at the German Research Center for AI (DFKI), in the research department Systems AI for Robot Learning.
Boris Belousov
Boris Belousov joined IAS as a PhD student in February 2016. He holds an MSc degree in Electrical Engineering from FAU Erlangen-Nürnberg with a major in Communications and Multimedia Engineering and a BSc degree in Applied Mathematics and Physics from Moscow Institute of Physics and Technology with a specialization in Electrical Engineering and Cybernetics.
Boris is interested in optimal control, information theory, robotics, and reinforcement learning. To realize the vision of intelligent systems of the future that autonomously set and accomplish goals, learn from experience, and adapt to changing conditions, Boris develops foundational algorithms based on Bayesian decision theory. He has worked on maximum entropy reinforcement learning, risk-sensitive policy search, active exploration, curriculum learning, distributionally robust optimization, domain randomization, and visuotactile manipulation.
References
Systems AI for Robot Learning
- Lutter, M.; Belousov, B.; Mannor, S.; Fox, D.; Garg, A.; Peters, J. (in press). Continuous-Time Fitted Value Iteration for Robust Policies, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
- Siebenborn, M.; Belousov, B.; Huang, J.; Peters, J. (2022). How Crucial is Transformer in Decision Transformer?, Foundation Models for Decision Making Workshop at Neural Information Processing Systems.
- Galljamov, R.; Zhao, G.; Belousov, B.; Seyfarth, A.; Peters, J. (2022). Improving Sample Efficiency of Deep Reinforcement Learning for Bipedal Walking, 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids).
- Belousov, B.; Abdulsamad, H.; Klink, P.; Parisi, S.; Peters, J. (2021). Reinforcement Learning Algorithms: Analysis and Applications, Studies in Computational Intelligence, Springer International Publishing.
Reinforcement Learning and Tactile Manipulation for Autonomous Robotic Assembly
- Liu, Y.; Belousov, B.; Funk, N.; Chalvatzaki, G.; Peters, J.; Tessmann, O. (2023). Auto(mated)nomous Assembly, International Conference on Trends on Construction in the Post-Digital Era, pp. 167-181, Springer, Cham.
- Belousov, B.; Wibranek, B.; Schneider, J.; Schneider, T.; Chalvatzaki, G.; Peters, J.; Tessmann, O. (2022). Robotic Architectural Assembly with Tactile Skills: Simulation and Optimization, Automation in Construction, 133, pp.104006.
- Zhu, Y.; Nazirjonov, S.; Jiang, B.; Colan, J.; Aoyama, T.; Hasegawa, Y.; Belousov, B.; Hansel, K.; Peters, J. (2022). Visual Tactile Sensor Based Force Estimation for Position-Force Teleoperation, IEEE International Conference on Cyborg and Bionic Systems.
- Funk, N.; Chalvatzaki, G.; Belousov, B.; Peters, J. (2021). Learn2Assemble with Structured Representations and Search for Robotic Architectural Construction, Conference on Robot Learning (CoRL).
- Wibranek, B.; Liu, Y.; Funk, N.; Belousov, B.; Peters, J.; Tessmann, O. (2021). Reinforcement Learning for Sequential Assembly of SL-Blocks: Self-Interlocking Combinatorial Design Based on Machine Learning, Proceedings of the 39th eCAADe Conference.
- Belousov, B.; Sadybakasov, A.; Wibranek, B.; Veiga, F.; Tessmann, O.; Peters, J. (2019). Building a Library of Tactile Skills Based on FingerVision, Proceedings of the 2019 IEEE-RAS 19th International Conference on Humanoid Robots (Humanoids).
- Wibranek, B.; Belousov, B.; Sadybakasov, A.; Peters, J.; Tessmann, O. (2019). Interactive Structure: Robotic Repositioning of Vertical Elements in Man-Machine Collaborative Assembly through Vision-Based Tactile Sensing, Proceedings of the 37th eCAADe and 23rd SIGraDi Conference.
- Wibranek, B.; Belousov, B.; Sadybakasov, A.; Tessmann, O. (2019). Interactive Assemblies: Man-Machine Collaboration through Building Components for As-Built Digital Models, Computer-Aided Architectural Design Futures (CAAD Futures).
Information-Theoretic Active Exploration
- Schneider, T.; Belousov, B.; Chalvatzaki, G.; Romeres, D.; Jha, D.K.; Peters, J. (2022). Active Exploration for Robotic Manipulation, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
- Schneider, T.; Belousov, B.; Abdulsamad, H.; Peters, J. (2022). Active Inference for Robotic Manipulation, 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM).
- Muratore, F.; Gruner, T.; Wiese, F.; Belousov, B.; Gienger, M.; Peters, J. (2021). Neural Posterior Domain Randomization, Conference on Robot Learning (CoRL).
- Belousov, B.; Abdulsamad, H.; Schultheis, M.; Peters, J. (2019). Belief Space Model Predictive Control for Approximately Optimal System Identification, 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM).
- Schultheis, M.; Belousov, B.; Abdulsamad, H.; Peters, J. (2019). Receding Horizon Curiosity, Proceedings of the 3rd Conference on Robot Learning (CoRL).
Maximum Entropy Reinforcement Learning and Stochastic Optimal Control
- Abdulsamad, H.; Dorau, T.; Belousov, B.; Zhu, J.-J.; Peters, J. (2021). Distributionally Robust Trajectory Optimization Under Uncertain Dynamics via Relative Entropy Trust-Regions, arXiv.
- Klink, P.; Abdulsamad, H.; Belousov, B.; D'Eramo, C.; Peters, J.; Pajarinen, J. (2021). A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning, Journal of Machine Learning Research (JMLR).
- Eilers, C.; Eschmann, J.; Menzenbach, R.; Belousov, B.; Muratore, F.; Peters, J. (2020). Underactuated Waypoint Trajectory Optimization for Light Painting Photography, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).
- Lutter, M.; Belousov, B.; Listmann, K.; Clever, D.; Peters, J. (2019). HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints, Conference on Robot Learning (CoRL).
- Belousov, B.; Peters, J. (2019). Entropic Regularization of Markov Decision Processes, Entropy, 21, 7, MDPI.
- Klink, P.; Abdulsamad, H.; Belousov, B.; Peters, J. (2019). Self-Paced Contextual Reinforcement Learning, Proceedings of the 3rd Conference on Robot Learning (CoRL).
- Nass, D.; Belousov, B.; Peters, J. (2019). Entropic Risk Measure in Policy Search, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
- Belousov, B.; Peters, J. (2018). Mean Squared Advantage Minimization as a Consequence of Entropic Policy Improvement Regularization, European Workshop on Reinforcement Learning (EWRL).
- Belousov, B.; Peters, J. (2017). f-Divergence Constrained Policy Improvement, arXiv.
- Belousov, B.; Neumann, G.; Rothkopf, C.; Peters, J. (2016). Catching Heuristics Are Optimal Control Policies, Advances in Neural Information Processing Systems (NIPS / NeurIPS).
Boris serves as a reviewer for JMLR, TMLR, NeurIPS, ICML, AAAI, ICLR, EAAI, CoRL, ICRA, IROS, AURO, RA-L, and T-RO. For a full list of publications, see Publications.
Supervised Theses and Projects
- RL:IP'22w (with Tim Schneider; Alap Kshirsagar), Irina Rath, Dominik Horstkötter, Tactile Active Exploration of Object Shapes
- RL:IP'22w (with Tim Schneider; Alap Kshirsagar), Mario Gomez, Frederik Heller, Object Hardness Estimation with Tactile Sensors
- RL:IP'22w (with Tim Schneider; Alap Kshirsagar), Ruben Spari, Duc Huy Nguyen, Simulation of Vision-Based Tactile Sensors
- RL:IP'22w (with Tim Schneider; Yuxi Liu), Paul Hallmann, Nicolas Nonnengießer, Task and Motion Planning for Sequential Assembly
- BS Thesis (with Tim Schneider; Alap Kshirsagar), Alina Böhm, Robotic Tactile Exploratory Procedures for Identifying Object Properties
- MS Thesis (with Niklas Funk; Anton Savchenko), Paul-Otto Müller, Learning Interpretable Representations for Visuotactile Sensors
- MS Thesis (co-supervised with Niklas Funk), Xiangyu Xu, Visuotactile Grasping From Human Demonstrations
- RL:IP'22s (with Mehrzad Esmaeili; Yuxi Liu), Bingqun Liu, A Digital Framework for Interlocking SL-Blocks Assembly with Robots
- RL:IP'22s (with Tim Schneider; Yuxi Liu), Paul Hallmann, Patrick Siebke, Nicolas Nonnengießer, Multi-View Multi-Marker Object Tracking
- BS Thesis (co-supervised with Junning Huang), Max Siebenborn, Evaluating Decision Transformer Architecture on Robot Learning Tasks
- Seminar Humanoid Robotics, Timo Imhof, A Review of the Decision Transformer Architecture
- MS Thesis (co-supervised with Fabio Muratore), Theo Gruner, Wasserstein-Optimal Bayesian System Identification for Domain Randomization
- MS Thesis (co-supervised with Hany Abdulsamad), Tim Schneider, Active Inference for Robotic Manipulation
- MS Thesis (Funk, N.; Calandra, R.), Frederik Wegner, Learning Vision-Based Tactile Representations for Robotic Architectural Assembly
- RL:IP'20w-21s (with Fabio Muratore), Theo Gruner, Arlene Kühn, Florian Wiese, Likelihood-Free Inference for Domain Randomization
- RL:IP'20w (with Funk, N.; Chalvatzaki, G.), Jan Emrich, Simon Kiefhaber, Probabilistic Object Tracking Using Depth Camera
- RL:IP'20w-21s (with Funk, N.; Chalvatzaki, G.), Leon Magnus, Svenja Menzenbach, Max Siebenborn, Object Tracking for Robotic Assembly
- RL:IP'20w (with Funk, N.; Chalvatzaki, G.; Wibranek, B.), Jan Schneider, Architectural Assembly: Simulation and Optimization
- MS Thesis (co-supervised with Davide Tateo), Jan Rathjens, Accelerated Policy Search
- MS Thesis (with Guoping Zhao), Rustam Galljamov, Sample-Efficient Learning-Based Controller for Bipedal Walking in Robotic Systems
- MS Thesis (co-supervised with Hany Abdulsamad), Tim Dorau, Distributionally Robust Optimization for Optimal Control
- RL:IP'20s (Chalvatzaki, G.; Wibranek, B.), Schneider, J.; Schneider, T., Architectural Assembly w/ Tactile Skills: Simulation and Optimization
- RL:IP'20s (Tosatto, S.; Wibranek, B.), Wietschorke, L.; Liu, Y.; Chen, J., Reinforcement Learning for Architectural Combinatorial Optimization
- MS Thesis (Abdulsamad, H.; Lutter, M.), Markus Semmler, Sequential Bayesian Optimal Experimental Design for Nonlinear Dynamics
- RL:IP'19w (with Michael Lutter), Rustam Galljamov, Searching for a Better Policy Architecture for the OpenAI Bipedal Walker Environment
- RL:IP'19w (co-supervised with Tuan Dam), Maximilian Hensel, A Functional Mirror Descent Perspective on Reinforcement Learning
- RL:IP'19s, Maximilian Hensel, Kai Cui, Likelihood-free Inference in Reinforcement Learning
- BS Thesis (co-supervised with Fabio Muratore), Christian Eilers, Bayesian Optimization for Learning from Randomized Simulations
- MS Thesis (co-supervised with Hany Abdulsamad), Matthias Schultheis, Bayesian Reinforcement Learning for System Identification
- MS Thesis (co-supervised with Hany Abdulsamad), Pascal Klink, Generalization and transferability in reinforcement learning
- RL:IP'18 (Muratore, F.), Eschmann, J.; Menzenbach, R.; Eilers, C., Underactuated trajectory-tracking for long-exposure photography
- MS Thesis, Alymbek Sadybakasov, Learning vision-based tactile skills for robotic architectural assembly
- BS Thesis, Lennart Ebeling, Experimental validation of an MPC-POMDP model of ball catching
- MS Thesis, David Nass, Risk-sensitive policy search with applications to robot-badminton
- MS Thesis, Yunlong Song, Minimax and entropic proximal policy optimization
- RL:IP'17w (Robot Learning: Integrated Project, Winter 2017), Alymbek Sadybakasov, Goal-directed reward generation
- RL:IP'17s (Robot Learning: Integrated Project, Summer 2017), Alymbek Sadybakasov, Interpolation of skills in the game of robopong
- RL:IP'16w (Robot Learning: Integrated Project, Winter 2016), Yunlong Song, Rong Zhi, Learning intuitive physics from videos
- RL:IP'16s (Robot Learning: Integrated Project, Summer 2016), Tang, J.; Staschewski, T.; Gou, H., Playing badminton with robots