Percepción dinámica del entorno en un robot móvil

Pilar Bachiller
Tecnología de los Computadores y las Comunicaciones, Extremadura
July, 2008

Abstract


During the last few years, attention has become an important issue in machine vision. Studies of attentional
mechanisms in biological vision have inspired many computational models. Most of them follow
the limited-capacity assumption that psychological proposals associate with the role of attention. These
theories hypothesize that the visual system has limited processing capacity and that attention acts as
a filter that selects the information that should be processed at each time. This assumption has been
criticized by many authors, who affirm that the processing capacity of human perceptual systems is enormous.
From this view, there is no need for a stage that selects the information to be processed. Instead, they
explain the role of attention from the perspective of selection for action. According to this new conception,
the function of attention is to avoid behavioral disorganization by selecting the appropriate information
to drive task execution. Such a notion of attention is very interesting in robotics where the aim is to
build autonomous robots that interact with complex environments, keeping multiple behavioral objectives.
Attentional selection for action can guide robot behaviors by focusing on relevant visual targets while
avoiding distracters. Moreover, it can be conceived as a coordination mechanism, since it allows serializing
the actions of, potentially, multiple active behaviors. To exploit these ideas, we propose a visual attention
system based on the selection-for-action theory. It has been designed and tested on a mobile robot endowed
with a stereo vision head.
The proposed system has been modeled as a collection of components collaborating to select, fix and
track visual targets according to different task requirements. The low level components are related to image
acquisition, motor control, as well as computation and maintenance of regions of interest (ROI). Components
of intermediate level are in charge of extracting sets of ROI features related to what (appearance
information) and how (spatial information) matters. These features are used by high level components,
called target selectors (TS), to drive attention according to certain top-down behavioral specifications.
Attention control is not centralized, but distributed among several target selectors. Each of them drives
attention from different top-down specifications to focus on different types of visual targets. At a given
time, overt attention is driven by one TS, while the rest attend covertly to their corresponding targets.
The frequency of overt control of attention of each TS is modulated by high level behavioral units according
to their information requirements. The fixation of a selected target is accomplished by two independent
camera movements: a saccadic and tracking movement in one of the cameras and a vergence movement
in the other. This allows controlling attention from monocular information while keeping stable binocular
fixation. Once this perceptual-motor process is completed, the foveated target is sent to the behavioral
units. Only actions compatible with the focus of attention are then executed, solving the behavior coordination
problem. The whole system works as a control architecture that is attracted towards different visual
targets to keep several behavioral goals. The specific interleaving between actions is given by an implicit
time relation that links internal parameters and external world features.
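
The distributed control described above can be illustrated with a minimal sketch. It is not the system's
actual implementation: the abstract does not state how overt control is scheduled, so the sketch assumes a
smooth weighted round-robin in which each target selector accumulates "urgency" in proportion to the
frequency requested by the behavioral units. All names (TargetSelector, Behavior, attention_cycle,
frequency, urgency) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TargetSelector:
    name: str              # hypothetical label for the type of target it selects
    frequency: float       # relative share of overt control requested by behaviors
    urgency: float = 0.0   # demand accumulated since it last held overt control


@dataclass
class Behavior:
    name: str
    target_type: str       # selector whose target must be in focus for this behavior to act

    def compatible_with(self, focus: str) -> bool:
        return self.target_type == focus


def attention_cycle(selectors, behaviors):
    """Run one cycle: choose the overt selector, then gate behavior actions."""
    total = sum(ts.frequency for ts in selectors)
    # Every selector keeps attending covertly and accumulates urgency.
    for ts in selectors:
        ts.urgency += ts.frequency
    # Assumed scheduling rule: the most urgent selector takes overt control
    # (ties go to the selector listed first).
    overt = max(selectors, key=lambda ts: ts.urgency)
    overt.urgency -= total
    # Only actions compatible with the current focus of attention are executed.
    executed = [b.name for b in behaviors if b.compatible_with(overt.name)]
    return overt.name, executed


if __name__ == "__main__":
    selectors = [TargetSelector("landmark", 1.0), TargetSelector("obstacle", 3.0)]
    behaviors = [Behavior("navigate", "landmark"), Behavior("avoid", "obstacle")]
    for _ in range(4):
        print(attention_cycle(selectors, behaviors))
```

With these illustrative frequencies, the obstacle selector holds overt control in three of every four
cycles, and each behavior acts only in the cycles in which its own target type is in focus.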


ISSN: 1888-0258