Real Object Recognition and Augmentation for n-Screen Convergence Service

Jong-il  Park; Jong-Hyoun  Kim; Soo  Hong Kim

Computer Science and Engineering

p-ISSN: 2163-1484 e-ISSN: 2163-1492

2012; 2(3): 32-36

doi: 10.5923/j.computer.20120203.05

Jong-il Park¹, Jong-Hyoun Kim², Soo Hong Kim³

¹Dep’t of Computer and Information Telecommunication Engineering, Sangmyung University, Seoul, 110-743, South Korea

²Dep’t of Gameware, Kaywon School of Art and Design, Uiwang, 437-712, South Korea

³Dep’t of Computer Software Engineering, Sangmyung University, Cheonan, 330-720, South Korea

Correspondence to: Jong-Hyoun Kim, Dep’t of Gameware, Kaywon School of Art and Design, Uiwang, 437-712, South Korea.

Abstract

Converging n-screen media (Web, IPTV, SmartPhone) enhanced by AR, motion detection, hand shape recognition and non-marker object tracking technologies is a very challenging issue. Our services bring together the virtual and real worlds, individual and social relationships, and in-door and out-door spaces in a practical and closely integrated way. We propose a token-sharing mechanism among media to guarantee synchronization and continuity of the content being played. The Web-side service is the main area for educational work: users develop their own and their friends’ territories by solving educational quests. Our mobile out-door service tracks objects by non-marker image tracking rather than markers, giving users a realistic sense of response during collaborative activities. The in-door IPTV service responds to users’ intuitive input in real time through motion and hand recognition technologies. We expect these approaches to be among the pioneering applications that make virtual and real space seamless, and ultimately to open a new field of learning systems.

Keywords: Object Recognition, 3D Object Augmentation, n-Screen Convergence

Article Outline

1. Introduction

2. Related Work

2.1. N-screen Convergence

2.2. Non-marker based Object Tracking

3. n-Screen Convergence Services

3.1. Service Overview

3.2. SmartPhone@outdoor Service

3.3. IPTV@classroom Service

4. Conclusions

ACKNOWLEDGEMENTS

1. Introduction

We are exploring the convergence of spaces, media, screens and people so that the user’s experience becomes seamless, continuous and simultaneous. In educational environments in particular, this work is useful for consolidating students’ learning, thinking and acting. By blurring the border between online and offline, our experiments aim to motivate students to study naturally and logically.

Our service, an educational adventure game, connects the virtual and real worlds, individual and social relationships, and in-door and out-door spaces in a practical and closely integrated way. We implemented web-based social network content for an in-door personal desk, group competition content with motion detection and hand tracking for an IPTV in a classroom, and mobile content enhanced by Augmented Reality for mobile devices such as the iPhone or iPad used in out-door places. Users develop their own and their friends’ territories by solving educational quests. The SmartPhone service tracks objects by non-marker image tracking rather than markers, so that users feel a realistic response during collaborative activities. The IPTV responds to users’ intuitive input in real time through motion and hand recognition methods.

To guarantee complete synchronization and continuity among all media and the content running on each medium, our services use a token-sharing mechanism. Tokens are read from and written to the media, so each medium has up-to-date values whenever and wherever the game is played. The up-to-date token values are stored per user in a web-side database.

The remainder of this paper is organized as follows. Section 2 reviews existing related work. Section 3 introduces our services, including the overall system structure, the flow of development of the services, and the convergence protocol among the various media. Finally, we conclude with a note on the current status of our projects and future work.

2. Related Work

A variety of research efforts have recently explored n-screen convergence and computationally augmented interfaces that emphasize human interaction.

2.1. N-screen Convergence

The SMILES (Smartphones for Interacting with Local Embedded Systems) project proposes the use of Smart Phones as universal remote controllers[1]. It defines a service discovery protocol built on top of Bluetooth SDP, an interaction mechanism to operate the services discovered, and a payment protocol to pay for their use. Personal Server is a small mobile device that stores the user’s data on a removable Compact Flash card and wirelessly uses any I/O interface available in its proximity (e.g., display, keyboard)[2]. Its main goal is to provide the user with a virtual personal computer wherever the user goes. Unlike Personal Server, which cannot connect directly to the Internet, Smart Phones do not have to carry every piece of data or code that the user may need; they can obtain data and interfaces on demand from the Internet. CoolTown proposes web presence as a basis for bridging the physical world with the World Wide Web[3]. For example, entities in the physical world are fitted with URL-emitting devices (beacons) that advertise the URLs of the corresponding entities. Our model proposes web service presence and makes use of only off-the-shelf hardware.

2.2. Non-marker based Object Tracking

Reference [4] used both edges and optical flow without the need for a known motion model, which is the case in most AR applications. Texture-based feature extraction and optical flow tracking were also combined in a multithreaded manner in [5]. Another approach to speeding up tracking is to use only a subset of the template pixels for pose calculation, which can be selected beforehand in an offline phase. Reference [6] proposed Selective Pixel Integration, where the pixels to be used are randomly selected from those that contain more texture information. There have also been early attempts to apply object tracking technology to online dice and TCG games[7, 8, 9].

Figure 1. Service concepts

3. n-Screen Convergence Services

Our service converges media, spaces and content, so that users can work continuously and seamlessly in any place with the medium dedicated to it. SmartPhone@out-door, IPTV@classroom and Web@home are the key media for our services, as shown in Figure 1. Users develop their own and their friends’ territories by solving educational quests. The SmartPhone service tracks objects by non-marker image tracking rather than markers, so that users feel a realistic response during collaborative activities. The IPTV responds to users’ intuitive input in real time through motion and hand recognition methods.

3.1. Service Overview

Our token-sharing mechanism guarantees synchronization and continuity of the educational content. Each medium gets and puts the values defined in the following Token structure. These values are stored per user in a database. Whenever a medium needs the token values, it can access the database and acquire the up-to-date ones.

typedef struct Token {
    /* Field types are not given in the source; int is assumed here. */
    int member_id;    // member identification
    int book_id;      // magic book identification
    int page_no;      // page number of the magic book
    int card_id;      // card identification for the page
    int territory_no; // number of the territory currently being developed
    int class_id;     // class identification assigned
} token;
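The get/put behaviour described above can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the `int` field types, the `TokenStore` array standing in for the web-side database, and the names `store_init`, `token_put` and `token_get` are all assumptions introduced here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MEMBERS 64 /* hypothetical capacity of the toy store */

/* Field names follow the paper; the int types are an assumption. */
typedef struct Token {
    int member_id;
    int book_id;
    int page_no;
    int card_id;
    int territory_no;
    int class_id;
} token;

/* Hypothetical in-memory stand-in for the web-side database:
   one token slot per member. */
typedef struct TokenStore {
    token slots[MAX_MEMBERS];
    size_t count;
} TokenStore;

void store_init(TokenStore *s) {
    assert(s != NULL);
    s->count = 0;
}

/* put: a medium writes its latest values, overwriting the member's slot.
   Returns 0 on success, -1 if the store is full. */
int token_put(TokenStore *s, const token *t) {
    assert(s != NULL && t != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->slots[i].member_id == t->member_id) {
            s->slots[i] = *t;
            return 0;
        }
    }
    if (s->count == MAX_MEMBERS) return -1;
    s->slots[s->count++] = *t;
    return 0;
}

/* get: a medium reads the up-to-date token of a member.
   Returns 0 if found, -1 otherwise. */
int token_get(const TokenStore *s, int member_id, token *out) {
    assert(s != NULL && out != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->slots[i].member_id == member_id) {
            *out = s->slots[i];
            return 0;
        }
    }
    return -1;
}
```

In this sketch, a medium that resumes the game would call `token_get` with the member's id before playing and `token_put` after each change, so that the next medium sees the same page, card and territory.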

The flow of development of the services is depicted in Figure 2. A sky-blue circle represents a Web@home task, a yellow circle a SmartPhone@out-door task and a green circle an IPTV@classroom task. A user can earn a card, which is a key for making the magic book, by playing alone on the Web@home service, by solving a group competition quiz on IPTV@classroom or by finding predefined objects with SmartPhone@out-door.

Figure 2. Service development flow

Figure 3. Extracting keypoint from source object

Figure 4. Extracting keypoint from target scene

3.2. SmartPhone@outdoor Service

This service supports out-door field activities in which friends may collaborate. After arriving at the destination, the user looks for predefined animals or historic relics. To detect the specified features, the most important step is selecting the key-points of the image, which are found in high-contrast regions such as edges. The red and blue circles in Figures 3 and 4 represent the extracted key-points for the source object and the target scene, respectively.
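As a rough illustration of selecting key-points in high-contrast regions, the sketch below marks pixels of a grayscale image whose central-difference gradient is large. It is a simplified stand-in and not the detector used by the service; the `Keypoint` type, the function name `find_high_contrast` and the threshold parameter are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } Keypoint; /* hypothetical key-point record */

/* Scans the interior of a w-by-h grayscale image (row-major, one byte
   per pixel) and records pixels whose gradient magnitude exceeds thresh.
   Writes at most max_out key-points and returns how many were written. */
size_t find_high_contrast(const unsigned char *img, int w, int h,
                          int thresh, Keypoint *out, size_t max_out) {
    assert(img != NULL && out != NULL && w > 0 && h > 0);
    size_t n = 0;
    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            /* Central differences approximate the image gradient. */
            int gx = img[y * w + x + 1] - img[y * w + x - 1];
            int gy = img[(y + 1) * w + x] - img[(y - 1) * w + x];
            /* Compare squared magnitudes to avoid a square root. */
            if (gx * gx + gy * gy > thresh * thresh && n < max_out) {
                out[n].x = x;
                out[n].y = y;
                n++;
            }
        }
    }
    return n;
}
```

A practical detector would additionally keep only local maxima and work across several scales, which this sketch omits.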

This non-marker based object tracking has two phases: off-line training and real-time recognition. The process of object recognition and tracking in each phase is described in Figure 4. Our approach draws heavily on previous research[10, 11, 12, 13, 14].
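The paper does not spell out how descriptors from the two phases are compared. One common choice in the cited feature-matching literature[11] is nearest-neighbour matching with a distance-ratio test, sketched below under that assumption. The 4-dimensional toy descriptors, the flat array layout and the name `match_descriptor` are illustrative only.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define DESC_DIM 4 /* toy descriptor length, chosen for illustration */

static float sq_dist(const float *a, const float *b) {
    float sum = 0.0f;
    for (int i = 0; i < DESC_DIM; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* trained holds n_trained descriptors stored back to back (off-line phase).
   Returns the index of the trained descriptor that matches query, or -1
   when the best match is not clearly better than the second best. */
int match_descriptor(const float *query, const float *trained,
                     size_t n_trained, float ratio) {
    assert(query != NULL && trained != NULL);
    float best = FLT_MAX, second = FLT_MAX;
    int best_idx = -1;
    for (size_t i = 0; i < n_trained; i++) {
        float d = sq_dist(query, trained + i * DESC_DIM);
        if (d < best) {
            second = best;
            best = d;
            best_idx = (int)i;
        } else if (d < second) {
            second = d;
        }
    }
    /* Distances are squared, so the ratio is squared as well. */
    if (best_idx >= 0 && best < ratio * ratio * second) return best_idx;
    return -1;
}
```

In the real-time phase, each descriptor extracted from the camera frame would be passed as `query`; rejected (ambiguous) matches are simply dropped before the object pose is estimated.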

We call the resulting two-phase algorithm the Image Hot Spot Extraction Algorithm. With it, this service can detect and track the object in any orientation, at tilts of up to about 60 degrees upward or downward, and in cluttered environments, as shown in Figure 5.

Figure 5. Detecting source object within target scene

The 3D model can then be augmented onto the source object detected within the target scene. The model is translated, rotated and scaled along the x, y and z axes. The values in the top-right corner of Figure 6 are the rotation angles about the three axes; each value indicates by how many degrees the target object is rotated relative to the source object.
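Placing the model on the detected object amounts to applying a scale, a rotation and a translation to each model vertex. The sketch below shows this for a single vertex, assuming a uniform scale and a rotation supplied as a 3x3 row-major matrix. The function names and the order in which per-axis rotations are composed are assumptions, since the paper does not specify them.

```c
#include <assert.h>
#include <stddef.h>

/* out = a * b for 3x3 row-major matrices; out must not alias a or b.
   Used to compose per-axis rotations into one matrix. */
void mat3_mul(const double a[9], const double b[9], double out[9]) {
    assert(a != NULL && b != NULL && out != NULL);
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            double sum = 0.0;
            for (int k = 0; k < 3; k++) sum += a[r * 3 + k] * b[k * 3 + c];
            out[r * 3 + c] = sum;
        }
    }
}

/* out = s * (R * p) + t: scale and rotate model vertex p, then move it
   to the detected object's position t. out must not alias p. */
void place_vertex(const double R[9], double s, const double t[3],
                  const double p[3], double out[3]) {
    assert(R != NULL && t != NULL && p != NULL && out != NULL);
    for (int r = 0; r < 3; r++) {
        out[r] = s * (R[r * 3] * p[0] + R[r * 3 + 1] * p[1] +
                      R[r * 3 + 2] * p[2]) + t[r];
    }
}
```

For example, with a 90-degree rotation about the z axis, a scale of 2 and a translation of (10, 0, 0), the vertex (1, 0, 0) is placed at (10, 2, 0).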

Figure 6. 3D model augmentation on the detected source object

Using these recognition and augmentation technologies, we demonstrated an educational app that encourages students’ outdoor activities. A student receives a mission to solve a quest in an iPhone-based game. When the student captures a predefined sign at a predefined location, he/she receives a word card, which is one of the keys for opening the magic book. The flow for getting a card is shown in Figure 7.

Figure 7. Flow of outdoor activity; (a) quest-app main screen (b) go to the destination using GPS (c) capture and recognize a sign (d) get a game card

3.3. IPTV@classroom Service

This service supports a group competition quiz in the classroom, supervised by a teacher. As soon as a question appears on the IPTV, students raise their hands, and the hand tracker picks out the fastest one. Our skin color blob matching method can detect various shapes and motions of hands[15]. This hand tracking method detects skin color, moving areas, faces and hands, as shown in Figure 8.
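The skin color detection step can be illustrated with a fixed RGB threshold rule of the kind often used as a first-pass skin classifier. This is only a sketch: the thresholds and the names `is_skin` and `skin_mask` are assumptions for illustration and are not taken from the blob matching method of [15], which also uses motion and face detection.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if an RGB pixel falls inside an illustrative skin-tone range:
   bright enough, sufficiently colourful, and red-dominant. */
int is_skin(unsigned char r, unsigned char g, unsigned char b) {
    int max = r > g ? (r > b ? r : b) : (g > b ? g : b);
    int min = r < g ? (r < b ? r : b) : (g < b ? g : b);
    return r > 95 && g > 40 && b > 20 &&
           (max - min) > 15 &&
           (r - g) > 15 && r > b;
}

/* rgb holds n_pixels interleaved R,G,B triples. Writes 1 or 0 per pixel
   into mask and returns the number of skin-coloured pixels. */
size_t skin_mask(const unsigned char *rgb, size_t n_pixels,
                 unsigned char *mask) {
    assert(rgb != NULL && mask != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n_pixels; i++) {
        mask[i] = (unsigned char)is_skin(rgb[3 * i], rgb[3 * i + 1],
                                         rgb[3 * i + 2]);
        count += mask[i];
    }
    return count;
}
```

Connected regions of the resulting mask would then be grouped into blobs, and blobs that also overlap the moving area become hand candidates.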

Figure 8. Flow of skin color blob matching

Experiments were performed in many live demonstrations and showed very good tracking performance at close to frame rate. Figure 9 shows some tracking results with hand segmentation. One of the main features of the proposed algorithm is its robustness to fast and large movements: Figure 9(a-d) shows successful tracking of fast movements that cause motion blur.

Figure 9. (a) pointing (b) push (c) pull (d) pass

4. Conclusions

We proposed n-screen convergence services that combine the Web, IPTV and SmartPhone media. These services are enhanced by AR, non-marker object tracking, motion detection and hand tracking. We expect these approaches to be among the pioneering applications that make virtual and real space seamless, and ultimately to open a new field of learning systems. To guarantee complete synchronization and continuity among all media and the content running on them, our services also use a token-sharing mechanism. Tokens are read from and written to the media, so each medium has up-to-date values whenever and wherever the game is played.

We are still working on the following areas.

• New digital interface

• Applying 3D to education

• Virtual advertisement in produced videos

• Interactive mechanism to make real and virtual world seamless

ACKNOWLEDGEMENTS

This research is supported by Sangmyung University Research Fund and Program of the year 2012.

References

[1]	Iftode L, Borcea C., Ravi N., Kang N., and Zhou P. “Smart Phone: An Embedded System for Universal Interactions”, Proceedings of 10th International Workshop on Future Trends in Distributed Computing Systems FTDCS, 2004.
[2]	Want R., et al. “The Personal Server: Changing the Way We Think about Ubiquitous Computing”, In Proceedings of Ubicomp2002, pp. 194–209. Springer LNCS, September 2003.
[3]	Kindberg T., Barton J., et al. “People, places, things: Web presence for the real world”, 3rd IEEE Workshop on Mobile Computing Systems and Applications, December 2000.
[4]	Wagner, D., and Schmalstieg, D., “First steps towards handheld augmented reality”, In Proceedings of Seventh IEEE International Symposium on Wearable Computers, 2003.
[5]	Rohs, M., “Real-world interaction with camera-phones”, In 2nd International Symposium on Ubiquitous Computing Systems (UCS 2004), 2004.
[6]	M. Hachet, J. Pouderoux, and P. Guitton, “A camera-based interface for interaction with mobile handheld computers”, In Proceedings of ACM symposium on 3D interactive graphics and games (I3D’05), 2005.
[7]	Jong-Hyoun Kim, “The Initiative Experiments about New Interface for a Networked Game”, IDC’09, 5th international conference on Digital Content, Multimedia Technology and its Applications, Seoul, Korea, August 26-28, 2009.
[8]	Jong-Hyoun Kim, Teresa Cho, “The initiative experiments for utilizing real cards in online Trading Card Game”, ICCIT’10, Seoul, Korea, Nov.30~Dec.2, 2010.
[9]	Jong-Hyoun Kim, “The Initiative Experiments to make virtual and real space seamless”, IJIPM: International Journal of Information Processing and Management, Vol. 2, No. 1, pp. 133 ~ 139, 2011.
[10]	David G. Lowe, “Object recognition from local scale-invariant features”, in International Conference on Computer Vision, pp. 1150–1157, 1999.
[11]	Lowe, D. “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, vol. 60, no. 2, p. 91-110, 2004.
[12]	Vacchetti, L., Lepetit, V. and Fua, P. “Stable Real-Time 3D Tracking Using Online and Offline Information”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, n. 10, p. 1385-1391, 2004.
[13]	Skrypnyk, I. and Lowe, D. “Scene Modelling, Recognition and Tracking with Invariant Image Features”, Proc. International Symposium on Mixed and Augmented Reality, p. 110-119, 2004.
[14]	Matas, J., Zimmermann, K., Svoboda, T. and Hilton, A. “Learning Efficient Linear Predictors for Motion Estimation”, Proc. Indian Conference on Computer Vision, Graphics and Image Processing, p. 445-456, 2006.
[15]	Jung-Ho Ahn and Jong-Hyoun Kim, “A Stable Hand Tracking Method by Skin Color Blob Matching”, PSR Journal, Pacific Science Review, vol.12, no.2, pp.146-151, 2010.
