Advancements in Sensory-Motor Perception and Biologically-Inspired Hierarchical Learning for Embodied Intelligence
Published in Doctoral Thesis, 2024
Recommended citation: Eleftherios Triantafyllidis (2024). "Advancements in Sensory-Motor Perception and Biologically-Inspired Hierarchical Learning for Embodied Intelligence". Doctoral Dissertation. The University of Edinburgh, February, 2024. Edinburgh, United Kingdom. https://era.ed.ac.uk/handle/1842/41453
This doctoral thesis examines where biology meets robotics. Inspired by the remarkable sensory, dexterous, and cognitive abilities of humans, it explores pathways to augment machines with human-like cognition and dexterity from a neurobiological standpoint.
Thesis Abstract
From a biological perspective, humans possess incredible sensory, dexterous and cognitive abilities. By virtue of these abilities, humans, and more broadly their sensory systems, are able to adapt seamlessly to environmental demands. For robotic systems as a form of embodied intelligence, achieving such adaptation is currently far from trivial. Understanding the biological mechanisms that govern and render humans proficient in interacting with their surroundings could ultimately illuminate new pathways to replicate such human-like cognition and dexterity in machines. Motivated by this, the thesis addresses four main research areas.
The first contribution provides novel insights into the interplay of multimodal interfaces, their impact on the human sensory-motor system and their correlation to the generation of meaningful motor actions. Different sensory modalities are examined through a full factorial comparison of auditory, visual and somatosensory states and their influence on motor performance. Through a series of motor tasks of varying complexity with human subjects, a correlation is established between sensory states and motor actions. The results provide novel evidence of which sensory combinations contribute to enhanced task performance and how these can be harnessed.
The second contribution is the derivation of a novel metric capable of quantifying motor actions stemming from the human sensory-motor system. Measuring human motor performance is complex, and the absence of a standardised metric renders inter-study comparability challenging. To this end, four motor tasks of increasing spatial complexity were devised to establish which spatial variables influence motor performance. The results revealed which spatial variables had the most notable effect, highlighting that existing metrics are inadequate for modelling higher dimensions. To account for this, a novel metric is derived, capable of modelling human motor performance in full 3D space, underlining its value for quantifying commonly seen motor movements and enhancing inter-study comparability.
The third and penultimate contribution builds on the foundation laid by the preceding two, striving to align human capabilities closer to embodied intelligence. To realise this aim, and inspired by biology, the RObotic MAnipulation Network (ROMAN) is introduced. ROMAN is a novel Hybrid Hierarchical Learning (HHL) architecture designed to address the challenges of notably complex long-horizon sequential tasks. ROMAN utilises the exploratory nature of Reinforcement Learning (RL) while simultaneously exploiting the higher-level skills of humans in the form of imitation. Consisting of a plethora of specialising skills, ROMAN’s hierarchical architecture demonstrates versatility in intricate, long-horizon sequential tasks while exhibiting robustness against various levels of sensory uncertainty. By virtue of the HHL employed, ROMAN also exhibits adaptability beyond demonstrated behaviour, featuring failure recovery capabilities and adaptation in avoiding local minima. These results underline the significance of ROMAN for autonomous manipulation tasks necessitating intelligent and adaptive behaviour.
The fourth and concluding contribution investigates the potential of language-guided exploration in augmenting embodied intelligence. In pursuit of this goal, the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework is presented, capable of complementing the existing bio-inspired hierarchy of ROMAN. By harnessing LLMs as an assistive intrinsic reward source alongside the conventional RL paradigm, IGE-LLMs enhances the exploratory process to address intricate settings challenged by sparse rewards and long horizons. Validated on environments that pose exploration and long-horizon challenges, IGE-LLMs exhibits notably higher performance than existing methods and is capable of compensating for the shortcomings of using LLMs in isolation. Moreover, the modularity and robustness of IGE-LLMs are underscored by its ability to complement existing intrinsic reward methods and its insensitivity to most intrinsic scaling parameters. Finally, the framework’s resilience over existing methods is highlighted when faced with increased uncertainties and horizons. Capable of fostering exploration and of automating the orchestration of ROMAN’s intricate macro-actions, IGE-LLMs’ value as a language-guided framework is underlined.
This thesis provides novel findings on harnessing human sensory-motor abilities for generating meaningful motor actions, which can be adequately measured and quantified. Ultimately, these findings inspire the development of a novel bio-inspired learning method to align human capabilities closer to embodied intelligence, which can further be complemented and automated by eliciting language-guided exploration, tailored to address notably intricate, long-horizon tasks with sparse rewards. Nevertheless, to further narrow the gap between humans and machines, a deeper understanding of how to design artificial intelligence inspired by biological insights is needed.