Display Registration for Device Interaction • Nick Pears (1), Patrick Olivier (2), Dan Jackson (2) • Contact email: nep@cs.york.ac.uk • (1) Department of Computer Science, University of York, UK • (2) Culture Lab, King’s Walk, Newcastle University, Newcastle, UK • Presented at VISAPP’09, Madeira, Portugal, January 2008
Outline of the talk • Context of the application. • Display registration. • Marker based registration • Natural registration • Evaluation. • Conclusions.
Direct interaction with a PC • Let’s say you have taken a digital picture with your mobile phone while out walking, and you want to transfer it to your PC when you get home. • It would be nice if you could manage this whole process using the phone itself. • Point the phone camera at the ‘My Documents’ folder, tap to select, twist the phone to open the folder, do the same for the ‘My Pictures’ folder, select the digital picture(s) on the mobile phone using the keypad, and push the phone towards the screen to transfer the images from the phone into the PC folder.
The estate agent example. • You want to get details from an estate agent window, where the properties are displayed on a screen. • Point the phone camera at the property that you are interested in, press the keypad or tap the touch sensitive phone screen, and this triggers the PC connected to the screen to send the property details to your phone wirelessly. • Another action, such as phone push, could trigger an email to be sent to your account, with the property details. • A twist of the phone could open larger images and more details onto the full screen, and so on.
Other examples of large public displays. • At the cinema there is a large public display, with icons representing different movies. • A ticket may be purchased by pointing at the appropriate icon and executing a phone action (push/pull/twist/tap). • Similarly, one could do this to purchase tickets at a railway station… • …or interact with advertising displays in a shopping mall, to check if an item is in stock.
What do you need to make this work? • [Essential] Mobile phone (or PDA) equipped with a camera on the rear of the device. • [Essential] Wireless connection to the PC screen (or display) that you are interacting with. Bluetooth is ideal. • [Desirable] A touch sensitive screen on the mobile phone / PDA.
The key concept: Display Registration • If we can register the correct part of the PC display with the image of that display on the mobile phone, then we have established “display registration”. • This means that, for every pixel on the mobile phone, we know precisely the point on the PC display that it corresponds to or “points to”. • Thus we can choose one pixel, typically the centre pixel of the mobile display, as a pointing device, with which we can initiate any context dependent direct interaction (“push/pull/twist/tap”) with the PC (intelligent display).
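As a minimal sketch of the pointing idea above: once a 3x3 homography from phone-image pixels to PC-display coordinates is known, the centre pixel of the phone image can be mapped to the display point it "points to". The function names and the convention that `H` maps phone image to display are assumptions made for illustration, not taken from the talk.

```python
import numpy as np

def phone_pixel_to_display(H, pixel):
    """Map a phone-image pixel (x, y) to display coordinates.

    H is assumed to be a 3x3 homography from phone image to PC display.
    """
    x, y = pixel
    p = H @ np.array([x, y, 1.0])   # homogeneous transform
    return p[:2] / p[2]             # back to inhomogeneous coordinates

def pointer_position(H, image_width, image_height):
    """Display point corresponding to the centre pixel of the phone image."""
    centre = ((image_width - 1) / 2.0, (image_height - 1) / 2.0)
    return phone_pixel_to_display(H, centre)
```

Any other phone pixel (for example a stylus tap on a touch screen) could be passed to `phone_pixel_to_display` in the same way.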
A ‘cloned’ display on the cell phone. • Essentially we have a copy of a small part of the PC display on the mobile phone – and so we can interact with this, as if we were interacting directly with the PC screen, by echoing such interactions over the Bluetooth link to the appropriate screen position on the PC. • We developed a ‘write through’ application using a PDA – here you hold the PDA up to the PC screen, with the “MS Paint” application running on the PC. You can write text on the PDA, and the handwritten text is echoed onto the PC at the place on the PC screen that the PDA is pointing to, as if you were writing on the PC screen itself.
Direct interaction with a 6 DOF mouse • A side effect of establishing display registration is that, if the cell phone camera is calibrated, we can compute the 6 DOF position and motion of the phone relative to the PC screen, and thus have Wii-like interaction with the screen. • Typically, this manifests itself as push/pull and twist movements of the mobile phone, which can be detected even without calibration.
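The talk does not say how the 6 DOF pose is computed. One standard way, sketched here under stated assumptions, is to factor a plane-to-image homography as H ~ K [r1 r2 t], where K is the camera intrinsic matrix and the display is taken as the plane Z = 0. The function name, the display-to-image direction of `H_display_to_image`, and the use of metric display coordinates are all assumptions for illustration.

```python
import numpy as np

def pose_from_homography(H_display_to_image, K):
    """Recover rotation R and translation t of the display plane (Z = 0)
    in the camera frame, given a homography and camera intrinsics K.

    Assumes H maps metric display coordinates (X, Y) to image pixels.
    """
    B = np.linalg.inv(K) @ H_display_to_image
    # The columns of B are proportional to r1, r2 and t.
    scale = 2.0 / (np.linalg.norm(B[:, 0]) + np.linalg.norm(B[:, 1]))
    if B[2, 2] < 0:  # the display must lie in front of the camera (t_z > 0)
        scale = -scale
    r1, r2, t = scale * B[:, 0], scale * B[:, 1], scale * B[:, 2]
    R_approx = np.column_stack([r1, r2, np.cross(r1, r2)])
    # Project onto the nearest true rotation matrix.
    U, _, Vt = np.linalg.svd(R_approx)
    R = U @ Vt
    if np.linalg.det(R) < 0:
        R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
    return R, t
```

The translation comes out in the same units as the display coordinates. Tracking how `R` and `t` change between frames would give the push/pull and twist motions.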
How do we register the displays? • If the imaging process can be modelled by a pinhole camera, then the relationship between the cell phone image and the PC screen is a planar projectivity or homography. • Thus achieving display registration requires that the homography between the display and image is computed both frequently (10Hz) and accurately. • If we can find four corresponding points, no three collinear, across the image and display, we can compute the homography – more points allow us to use least squares (LS) to increase the accuracy of the estimate. • Two approaches are possible: • Natural (markerless) registration: relatively hard • Marker based registration: relatively easy
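A homography can be estimated from four or more point correspondences with the direct linear transform (DLT), which gives a least-squares solution when more than four points are supplied. The sketch below is a generic textbook version, not the authors' implementation. The function name is hypothetical, and coordinate normalisation (usually recommended for noisy data) is omitted for brevity.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H such that dst ~ H @ src (in homogeneous coordinates).

    src, dst: sequences of N >= 4 corresponding (x, y) points,
    with no three points collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least four corresponding 2D points")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The least-squares solution is the right singular vector
    # belonging to the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale (assumes H[2, 2] != 0)
```

With exactly four points the system is solved exactly. Extra points over-determine it, and the SVD then minimises the algebraic error.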
Natural, markerless, registration • May need care to choose a background and windows that have some texture when imaged. • There are many possible approaches to image registration in the Computer Vision literature. • Corners • Planar invariants based on the cross-ratio • SIFT features for matching across different scales
Our first prototype is marker-based.
System operation • Bluetooth communication link is established. • PC moves markers (four green squares) around its screen, while the user points the phone anywhere on the screen. • The phone segments the markers, using colour, and transmits the positions of the centres of the four squares to the PC. • The PC associates these positions with the centres of the four squares on its screen and computes the plane-to-plane homography between the phone image and PC display. • This homography is then used by the PC to compute where the corners of each individual green square should appear on the PC screen, such that the markers remain approximately constant in position and size on the cell-phone screen. • Marker positions and shapes on the PC screen are updated. • For further cycles of the operation, the targets are switched between ‘filled’ and ‘hollow’ so that we correctly associate pairs of PC target display position and imaged target position.
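The marker update step above can be sketched as follows: choose where each square should sit in the phone image, then map its corners through the image-to-display homography to find where to draw it on the PC screen. The function names, the `half_size` parameter, and the image-to-display direction of the homography are assumptions for illustration. Colour segmentation and the filled/hollow toggling are not shown.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def square_corners(centre, half_size):
    """Corners of an axis-aligned square in phone-image coordinates."""
    cx, cy = centre
    h = half_size
    return np.array([[cx - h, cy - h], [cx + h, cy - h],
                     [cx + h, cy + h], [cx - h, cy + h]])

def next_marker_polygons(H_image_to_display, image_centres, half_size):
    """Display-space quadrilaterals that would appear as fixed-size
    squares at the given centres in the phone image."""
    return [apply_homography(H_image_to_display, square_corners(c, half_size))
            for c in image_centres]
```

Because each marker is drawn as a general quadrilateral on the display, it appears as a square of roughly constant size in the phone image even when the phone is tilted or moved.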
Video 1: write through • PDA based ‘write-through’ application (not evaluated)
Montage talk-through evaluation • Translate, rotate and scale 3 images, using a phone.
Video 2. Photo-montage rearrangement • Phone based ‘photo-montage’ talk-through evaluation.
Video 3. Copy house with phone-mouse • Phone based ‘house-draw’ talk-through evaluation.
Results • Four subjects were asked to complete two tasks (photo-montage and house-draw). • All four could use the system with minimal guidance. • Transient registration errors were distracting, for example, causing erroneous brushstrokes. • The system always re-established registration after losing the markers (phone not correctly pointed at screen). • The direction of scaling seemed counter intuitive. • Rotating and translating were easier than scaling, in general.
System specification • Siemens SX1 cell phone, 130 MHz TI OMAP 310 • Symbian OS V6.1 • Frame rate 8Hz • Camera FOV 30 degrees • Maximum phone speed: 30cm/sec at 45cm from a 17 inch display • Target acquisition time: 2-3s • AVI videos for all tests are at http://irgen.ncl.ac.uk/data/temp/displayreg/TaskVideos/
Conclusions. • We have presented a novel application that allows a cell phone equipped with a camera and Bluetooth to interact directly with a PC or intelligent display. • We have implemented a marker-based prototype and used talk-through evaluations to prove the concept. • Further work is needed to improve the tracking smoothness and reliability of this prototype. This is becoming easier with improved phone technology. • Ultimately we would like this to progress to a marker-less system, which has widespread use on many mobile phone platforms.
Press interest. Recently reported by New Scientist, the ACM technical news alert, PC advisor, several newspapers and many websites around the world. (Due to VISAPP paper abstract.)