Learning in intelligent systems is a result of direct and indirect interaction with the environment. While humans can learn by way of different states of (inter)action such as the execution or the imagery of an action, the unique potential of each of these states to induce brain- and mind-related changes in the motor action system is still being debated. The contribution of systematically repeating different states of action (e.g., physical and/or mental practice) to the learning of complex motor actions has traditionally been approached by way of performance improvements. More recently, approaches highlighting the role of action representation in the learning of complex motor actions have evolved and may provide additional insight into the learning process. In the present perspective paper, we build on brain-related findings and sketch recent research on learning by way of imagery and execution from a hierarchical, perceptual-cognitive approach to motor control and learning. These findings provide insights into learning in intelligent systems from a perceptual-cognitive, representation-based perspective and as such add to our current understanding of action representation in memory and its changes with practice. Future research should build bridges between these approaches in order to more thoroughly understand functional changes throughout the learning process and to facilitate motor learning, which may have particular importance for cognitive systems research in robotics, rehabilitation, and sports.