Urban Location from Street Signs
Street Plate Detector and Recogniser
Object awareness is investigated to detect and recognise objects of high interest in urban scenarios, such as buildings, infrastructure, people and signs. MOBVIS demonstrates how geo-indexing significantly improves performance in mobile object recognition by exploiting the information in augmented digital city maps. The query image and a GPS-based position estimate are sent to the server, which responds with results from the geo-indexed object recognition. Furthermore, visitors can be given annotations about the point of interest, including history, event and shop-relevant information.
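The geo-indexing idea can be illustrated with a minimal sketch, assuming the server simply discards reference objects that lie farther than a fixed radius from the GPS estimate before any image matching takes place. The object names, coordinates and the 400 m radius are hypothetical and are not taken from MOBVIS.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_index_candidates(objects, lat, lon, radius_m):
    """Keep only the reference objects within radius_m of the position estimate."""
    return [o for o in objects if haversine_m(lat, lon, o["lat"], o["lon"]) <= radius_m]


# Hypothetical reference objects (illustrative coordinates only).
REFERENCE_OBJECTS = [
    {"name": "object_a", "lat": 47.0707, "lon": 15.4383},
    {"name": "object_b", "lat": 47.0737, "lon": 15.4377},
    {"name": "object_c", "lat": 47.0730, "lon": 15.4170},
]

# Hypothetical GPS estimate sent along with the query image.
candidates = geo_index_candidates(REFERENCE_OBJECTS, 47.0710, 15.4385, radius_m=400.0)
candidate_names = [o["name"] for o in candidates]
```

Only the surviving candidates would then need to be compared against the query image, which is where the reduction in search space comes from.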
The user image is automatically localized by relating the image to the MOBVIS image database. Via triangulation, the user's position and orientation are determined, yielding accuracies comparable to GPS. In addition, image-based localization enables novel services, like hyperlinking reality or georeferenced
The illustration shows a query image (blue frame) and some reference images (green frames) used to position and orient the query image and, consequently, the user. Some geometric relations between the query image and one of the reference images are indicated by the dark green lines.
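The triangulation step can be sketched in a much simplified form. The sketch assumes that two reference points have known positions in a local east/north frame and that the compass bearing from the user to each of them is already known. In an image-based system these bearings would have to be derived from correspondences between the query and reference images, which is not shown here. The function names and coordinates are hypothetical.

```python
import math


def unit_vector(bearing_deg):
    """Unit direction (east, north) for a compass bearing (0 = north, clockwise)."""
    b = math.radians(bearing_deg)
    return math.sin(b), math.cos(b)


def triangulate(landmark_a, bearing_a_deg, landmark_b, bearing_b_deg):
    """Intersect two bearing rays to recover the observer position.

    The observer P satisfies P + t_a * d_a = A and P + t_b * d_b = B,
    so t_a * d_a - t_b * d_b = A - B, solved here with Cramer's rule.
    """
    ax, ay = landmark_a
    bx, by = landmark_b
    dax, day = unit_vector(bearing_a_deg)
    dbx, dby = unit_vector(bearing_b_deg)
    det = dbx * day - dax * dby
    if abs(det) < 1e-9:
        raise ValueError("bearings are parallel; position is not determined")
    rx, ry = ax - bx, ay - by
    t_a = (dbx * ry - rx * dby) / det
    return ax - t_a * dax, ay - t_a * day


# Hypothetical setup: one reference point due north, one due east of the user.
position = triangulate((0.0, 10.0), 0.0, (10.0, 0.0), 90.0)
```

With more than two reference points, a least-squares solution over all bearing rays would be the more robust choice.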
MOBVIS introduced new outdoor positioning possibilities offered by the combination of GPS and WLAN positioning, motion estimation by dead reckoning, and state-of-the-art vision positioning. The combination of vision-based technology with incremental positioning has been found to enable continuous position estimates, making it directly comparable to standard techniques such as GPS and WiFi. Interestingly, computer vision has been shown to enable localization accuracies comparable to GPS.
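As a rough illustration of how such sources can be combined, the sketch below advances a position by dead reckoning and fuses several independent position fixes with inverse-variance weighting. This is a generic textbook scheme chosen for illustration, not the method documented for MOBVIS. The local east/north frame, the standard deviations and all function names are assumptions.

```python
import math


def dead_reckon(x, y, heading_deg, distance_m):
    """Advance a local east/north position by one step along a compass heading."""
    h = math.radians(heading_deg)
    return x + distance_m * math.sin(h), y + distance_m * math.cos(h)


def fuse(estimates):
    """Inverse-variance weighted mean of (x, y, sigma_m) position estimates.

    Assumes the estimates are independent and share one sigma for both axes.
    Returns the fused x, y and the fused standard deviation.
    """
    weights = [1.0 / (sigma * sigma) for _, _, sigma in estimates]
    total = sum(weights)
    x = sum(w * e[0] for w, e in zip(weights, estimates)) / total
    y = sum(w * e[1] for w, e in zip(weights, estimates)) / total
    return x, y, math.sqrt(1.0 / total)


# Hypothetical fixes: (east_m, north_m, sigma_m) from GPS and from vision.
gps_fix = (10.0, 0.0, 5.0)
vision_fix = (12.0, 1.0, 5.0)
fused = fuse([gps_fix, vision_fix])

# Between fixes, the position is propagated by dead reckoning.
propagated = dead_reckon(fused[0], fused[1], heading_deg=90.0, distance_m=2.0)
```

A less certain source, such as a WLAN fix with a larger sigma, would automatically receive a smaller weight in the same call.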
Visual Context Awareness
MOBVIS provided a concept of vision-based context that covers how to extract, learn and use contextual features to guide object detection. Three complementary types of contextual features are proposed: viewpoint prior, geometrical context and textural context. The concept aids the detection process, speeding it up and increasing detection accuracy. Examples are shown for pedestrian detection.
Activity is an important source of context information. MOBVIS explored methods for unsupervised activity modelling based on signals from multiple body-worn sensors, including accelerometers. From a given set of data captured over long periods, it was possible to build models that correspond to different everyday activities, including eating and shopping, without requiring prior training, user annotation or information about the number of tasks involved.
Augmented Digital City Maps
Vehicles collect data about urban infrastructure for the definition of map features and points of interest, including geo-referenced images, traffic infrastructure and tourist sight information. Map features are stored in the Mobile Mapping Data Warehouse of Tele Atlas, which provides them to mobile vision services. Standard digital city maps are augmented with these data to support mobile vision services. User track and image reference data are visualised and can be accessed interactively in the MOBVIS user interface.
Geo-Services & Incremental Map Updating
Geo-services are responsible for the interaction with the map-based geo-information knowledge. A complex functional interface to the digital map information has been defined in MOBVIS so that appropriate responses can be given to requests from the MOBVIS system components, e.g. under variation of the spatial scope and the quality of the requested geo-information, and so that specific information can be provided to the vision module for generating object hypotheses. Geo-services enable intelligent filtering of surrounding objects based on user position and orientation for geo-indexed object recognition, and analysis of map features for real-time context detection.
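Filtering by position and orientation can be pictured as a field-of-view test on map objects. The following is a minimal sketch under assumed conditions: objects are given in a local east/north frame, the user heading is a compass angle, and the field of view and range limits are hypothetical defaults. It does not reproduce the actual MOBVIS geo-service interface.

```python
import math


def bearing_deg(x, y, tx, ty):
    """Compass bearing (0 = north, clockwise) from (x, y) to (tx, ty)."""
    return math.degrees(math.atan2(tx - x, ty - y)) % 360.0


def angle_diff_deg(a, b):
    """Smallest absolute difference between two compass angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def visible_objects(objects, x, y, heading_deg, fov_deg=60.0, max_range_m=150.0):
    """Return names of map objects within range and inside the field of view."""
    selected = []
    for name, ox, oy in objects:
        if math.hypot(ox - x, oy - y) > max_range_m:
            continue  # too far away to be a plausible hypothesis
        if angle_diff_deg(bearing_deg(x, y, ox, oy), heading_deg) <= fov_deg / 2.0:
            selected.append(name)
    return selected


# Hypothetical map objects: (name, east_m, north_m).
MAP_OBJECTS = [
    ("ahead", 0.0, 50.0),
    ("behind", 0.0, -50.0),
    ("side", 50.0, 0.0),
    ("far", 0.0, 500.0),
]

hypotheses = visible_objects(MAP_OBJECTS, 0.0, 0.0, heading_deg=0.0)
```

The resulting list is the kind of reduced object set that could be passed to a vision module as object hypotheses.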
MOBVIS supports incremental updating of maps and therefore automated authoring of urban infrastructure, including road furniture, public transport, and public objects, such as coffee shops.
Strategies of attention naturally refer to a cascaded processing of potentially different visual features, each indexing into a certain part of an associated search space. A first step in the cascaded processing is to localise categorical visual features, that is, features that relate to a specific set of objects or, conversely, to background information such as vegetation and cobblestones.
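The cascade can be pictured as a sequence of filters, each one narrowing the set of image regions that later stages need to examine. The sketch below is purely illustrative: the region features, thresholds and stage functions are hypothetical stand-ins for real visual feature detectors.

```python
def run_cascade(candidates, stages):
    """Apply the stages in order; each stage keeps only the candidates it accepts.

    Returns the surviving candidates and the number of survivors after each stage.
    """
    survivors = list(candidates)
    counts = []
    for stage in stages:
        survivors = [c for c in survivors if stage(c)]
        counts.append(len(survivors))
        if not survivors:
            break  # nothing left for later stages to process
    return survivors, counts


# Hypothetical image regions with precomputed feature values.
regions = [
    {"id": 0, "greenness": 0.9, "edge_density": 0.2},  # vegetation-like background
    {"id": 1, "greenness": 0.1, "edge_density": 0.8},  # structured, object-like
    {"id": 2, "greenness": 0.2, "edge_density": 0.1},  # flat, textureless
]


def not_vegetation(region):
    """First stage: reject regions that look like vegetation."""
    return region["greenness"] < 0.5


def has_structure(region):
    """Second stage: keep regions with enough edge structure."""
    return region["edge_density"] > 0.5


survivors, counts = run_cascade(regions, [not_vegetation, has_structure])
survivor_ids = [r["id"] for r in survivors]
```

Rejecting background early reduces the number of regions that later stages have to process.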
MOBVIS developed a multi-cue attention system that combines bottom-up and top-down influences. Sequential attention was developed to exploit geometrical constraints for object recognition, using a concept inspired by human attention and eye movements.
In addition, the extraction of street profiles from 3D information recovery supports indexing into city maps for location awareness.
The context framework used in the Attentive Machine Interface (AMI) defines a cue as an abstraction of logical and physical sensors which may represent a context itself, generating a recursive definition of context. Sensor data, cues and context descriptions are defined in a framework of uncertainty. The architecture of the AMI reflects the enabling of both bottom-up and top-down (attention driven) information processing. Attention enabled by the AMI means focusing operations on a specific detail of a situation that is described by the context.
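The recursive definition can be made concrete with a small data-structure sketch in which a context is built from cues and can itself serve as a cue for a higher-level context. The class names, the example cues and the use of a plain average to combine confidences are assumptions made for illustration. The source does not state how the AMI represents or propagates uncertainty.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Cue:
    """Abstraction of a logical or physical sensor reading with a confidence in [0, 1]."""
    name: str
    confidence: float

    def belief(self) -> float:
        return self.confidence


@dataclass
class Context:
    """A context built from cues; it exposes belief() and so can act as a cue itself."""
    name: str
    cues: List[Union["Cue", "Context"]] = field(default_factory=list)

    def belief(self) -> float:
        # Placeholder combination rule (assumption): average of the parts.
        if not self.cues:
            return 0.0
        return sum(c.belief() for c in self.cues) / len(self.cues)


# Hypothetical cues and contexts.
walking = Cue("accelerometer_walking", 0.9)
outdoors = Cue("gps_fix_available", 0.7)
moving_outdoors = Context("moving_outdoors", [walking, outdoors])

# A context is reused as a cue of a higher-level context (the recursion).
shopping = Context("shopping", [moving_outdoors, Cue("shop_sign_detected", 0.6)])
shopping_belief = shopping.belief()
```

A probabilistic combination rule could replace the average without changing the recursive structure.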