# Notation and Symbols Used for Robot Learning

## General Notation

• Vectors are always written in bold lower case, e.g., $\boldsymbol{x}$.
• A vector always denotes a column vector, e.g., $\boldsymbol{x} = [x_1, \dots, x_n]^T$. A row vector is denoted as $\boldsymbol{x}^T$.
• Matrices are always written in bold upper case, e.g., $\boldsymbol{A}$.
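• Gradients are always defined as row vectors, i.e., $\frac{\partial f}{\partial \boldsymbol{x}} = \left[ \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right]$.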
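• The gradient of a vector-valued function $\boldsymbol{f}(\boldsymbol{x}) = [f_1(\boldsymbol{x}), \dots, f_m(\boldsymbol{x})]^T$ is a matrix defined as $\frac{\partial \boldsymbol{f}}{\partial \boldsymbol{x}} = \left[ \frac{\partial f_i}{\partial x_j} \right]_{i,j}$, i.e., row $i$ is the (row-vector) gradient of the component $f_i$.
• The expectation of a function $f(\boldsymbol{x})$ with respect to a distribution $p(\boldsymbol{x})$ is written as $\mathbb{E}_{p}[f(\boldsymbol{x})] = \int p(\boldsymbol{x}) f(\boldsymbol{x}) \, d\boldsymbol{x}$.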
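
The row-vector convention fixes the shape of every derivative: a scalar function of an $n$-dimensional input has a $1 \times n$ gradient, and a function with $m$ outputs has an $m \times n$ gradient matrix. The following is a minimal NumPy sketch of that shape convention. The helper `numerical_jacobian`, the example function `f`, and the use of central finite differences are illustrative assumptions, not part of the notation itself.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference approximation of df/dx with shape (m, n)."""
    x = np.asarray(x, dtype=float)
    m = np.atleast_1d(f(x)).size
    jac = np.zeros((m, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Column j holds the derivative of every output w.r.t. x_j.
        jac[:, j] = (np.atleast_1d(f(x + step)) - np.atleast_1d(f(x - step))) / (2 * eps)
    return jac

# Hypothetical vector-valued function mapping R^3 to R^2.
def f(x):
    return np.array([x[0] * x[1], np.sin(x[2])])

x0 = np.array([1.0, 2.0, 0.0])
J = numerical_jacobian(f, x0)                    # shape (2, 3): one row per output
g = numerical_jacobian(lambda x: x @ x, x0)      # shape (1, 3): a row vector
```

With this layout, a first-order change in the output is simply `J @ dx` for a column vector `dx`, which is the main practical reason for treating gradients as rows.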

## Robotics

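• $\boldsymbol{q}$ ... joint positions, $\dot{\boldsymbol{q}}$ ... joint velocities, $\ddot{\boldsymbol{q}}$ ... joint accelerations
• $\boldsymbol{u}$ ... motor command, controls
• $\boldsymbol{\tau}$ ... 1. torques (a motor command), 2. trajectory, or 3. temporal scaling parameter for movement primitives
• $\boldsymbol{a}$ ... action (often $\boldsymbol{a}$ and $\boldsymbol{u}$ can be used interchangeably)
• $\boldsymbol{s}$ ... state of the agent (used in most RL literature)
• $\boldsymbol{x}$ ... 1. state of the system (used in control literature; often $\boldsymbol{s}$ and $\boldsymbol{x}$ can be used interchangeably), 2. task space coordinates (for example, end-effector coordinates), 3. input sample for supervised learning methods
• $\boldsymbol{y}$ ... 1. state of a dynamical movement primitive, 2. output sample for supervised learning methods
• $\boldsymbol{f}$ ... 1. $\boldsymbol{x} = \boldsymbol{f}(\boldsymbol{q})$ ... forward kinematics, 2. $\ddot{\boldsymbol{q}} = \boldsymbol{f}(\boldsymbol{q}, \dot{\boldsymbol{q}}, \boldsymbol{u})$ (or similar notation for state and control) ... forward dynamics
• $\boldsymbol{J}(\boldsymbol{q}) = \frac{\partial \boldsymbol{f}(\boldsymbol{q})}{\partial \boldsymbol{q}}$ ... Jacobian (of the forward kinematics)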
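
To make the roles of the joint positions, the forward kinematics, and its Jacobian concrete, here is a small sketch for a planar two-link arm. The arm itself, the link lengths in `LINK_LENGTHS`, and the function names are hypothetical choices made only for this illustration.

```python
import numpy as np

LINK_LENGTHS = (1.0, 0.5)  # hypothetical link lengths of a planar two-link arm

def forward_kinematics(q, lengths=LINK_LENGTHS):
    """Map joint positions q (2,) to the end-effector position x (2,)."""
    l1, l2 = lengths
    return np.array([
        l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
        l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1]),
    ])

def jacobian(q, lengths=LINK_LENGTHS):
    """Derivative of the forward kinematics w.r.t. q, shape (2, 2)."""
    l1, l2 = lengths
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ])

q = np.array([0.0, np.pi / 2])      # joint positions
q_dot = np.array([0.1, -0.2])       # joint velocities
x = forward_kinematics(q)           # task space coordinates of the end-effector
x_dot = jacobian(q) @ q_dot         # end-effector velocity
```

The last line shows the usual use of the Jacobian: it maps joint velocities to task space velocities at the current configuration.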

## Machine Learning

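• $\boldsymbol{\theta}$ ... 1. parameter vector, 2. (occasionally) joint angles
• $\boldsymbol{\phi}(\boldsymbol{x})$ ... feature vector of a single sample
• $\boldsymbol{\Phi}$ ... feature matrix containing the feature vectors of all samples (each row is a transposed feature vector $\boldsymbol{\phi}(\boldsymbol{x}_i)^T$)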
• $\lambda$ ... regularization constant, equal to the precision of the prior over the parameters
• $\sigma^2$ ... measurement noise
• $\boldsymbol{X}$ ... matrix of all input vectors (one sample per row)
• $\boldsymbol{Y}$ ... matrix of all output vectors (one sample per row)
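
These quantities typically meet in regularized linear regression. The sketch below builds a feature matrix with one transposed feature vector per row and fits the parameter vector with the standard ridge solution $(\boldsymbol{\Phi}^T \boldsymbol{\Phi} + \lambda \boldsymbol{I})^{-1} \boldsymbol{\Phi}^T \boldsymbol{y}$. The polynomial features, the synthetic data, and the function names are assumptions for illustration, and the regularization constant is treated directly as the ridge coefficient.

```python
import numpy as np

def features(x, degree=3):
    """Hypothetical feature vector [1, x, x^2, ..., x^degree] of a scalar input."""
    return np.array([x ** k for k in range(degree + 1)])

def fit_ridge(X, y, lam=1e-3, degree=3):
    """Return the ridge estimate of the parameters and the feature matrix."""
    # Each row of Phi is the transposed feature vector of one sample.
    Phi = np.stack([features(x, degree) for x in X[:, 0]])
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    theta = np.linalg.solve(A, Phi.T @ y)
    return theta, Phi

# Synthetic data (an assumption): a quadratic corrupted by Gaussian noise.
rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)    # input matrix, one sample per row
sigma = 0.05                                     # standard deviation of the noise
y = 1.0 + 2.0 * X[:, 0] - X[:, 0] ** 2 + sigma * rng.standard_normal(50)

theta, Phi = fit_ridge(X, y)
y_pred = Phi @ theta                             # predictions for all samples
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit matrix inverse, which is generally the more stable choice.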

## Optimal Decision Making

• $\pi(\boldsymbol{s})$ ... deterministic policy
• $\pi(\boldsymbol{a}|\boldsymbol{s})$ ... stochastic policy
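• $\mu^{\pi}(\boldsymbol{s})$ ... state visit distribution of policy $\pi$
• $\mu_0(\boldsymbol{s})$ ... initial state distribution
• $r(\boldsymbol{s}, \boldsymbol{a})$ ... reward function
• $J_{\pi}$ ... expected long-term reward of policy $\pi$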
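• $V^{\pi}(\boldsymbol{s})$ ... value function of policy $\pi$
• $Q^{\pi}(\boldsymbol{s}, \boldsymbol{a})$ ... state-action value function of policy $\pi$
• $V^{*}(\boldsymbol{s})$ ... optimal value function
• $Q^{*}(\boldsymbol{s}, \boldsymbol{a})$ ... optimal state-action value function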
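
The relationship between a policy, its value functions, and the optimal value functions can be sketched on a tiny tabular problem. Everything specific here is an assumption made for illustration: the two-state, two-action MDP, the discounted infinite-horizon setting with discount factor `gamma`, and the array layout `P[s, a, s']`.

```python
import numpy as np

gamma = 0.9  # discount factor (an assumption, not part of the notation list)

# Hypothetical MDP: action 0 always leads to state 0, action 1 to state 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[1.0, 0.0], [0.0, 1.0]]])   # P[s, a, s'] transition probabilities
r = np.array([[0.0, 1.0],
              [0.0, 2.0]])                 # r[s, a] reward function
pi = np.array([[0.5, 0.5],
               [0.5, 0.5]])                # pi[s, a] stochastic policy

# Policy evaluation: solve the linear system V = r_pi + gamma * P_pi V.
P_pi = np.einsum("sa,sat->st", pi, P)      # state transitions under the policy
r_pi = np.sum(pi * r, axis=1)              # expected one-step reward per state
V_pi = np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)
Q_pi = r + gamma * P @ V_pi                # state-action values of the policy

# Value iteration for the optimal state-action value function.
Q_star = np.zeros_like(r)
for _ in range(1000):
    Q_star = r + gamma * P @ Q_star.max(axis=1)
V_star = Q_star.max(axis=1)                # optimal value function
```

Averaging `Q_pi` over the policy recovers `V_pi`, and `V_star` is at least as large as `V_pi` in every state, which is a quick sanity check for implementations of this kind.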