| Advisor | Prof. Calton Pu, calton [at] cc.gatech.edu |
| Area | Systems/Distributed Computing |
Advances in computing and communication technologies have allowed a wide range of devices, from workstations to cellular phones, to connect to the Internet and access Web content. Because of this trend, content providers (e.g. Web servers) must provide multiple versions of the same content according to each client device's capabilities, such as display resolution, memory space, CPU power, and network bandwidth, as well as the software running on it. On top of this variety, Web servers must satisfy clients' preferences for content; this is called personalization of content. Some Web servers create multiple versions of their content and store them in their repositories or databases whenever they create new content or modify old content, but this places a heavy burden on Web content authors. To overcome this issue, many research groups have proposed dynamic content adaptation techniques [1, 2, 4, 5, 9], called transcoding, and cache mechanisms [7, 8].
There are three approaches to transcoding content, depending on who controls the adaptation process: server side, client side, and intermediate. Because client devices lack computing power and bandwidth, transcoding in the client-side approach is very limited, or impossible for multimedia content, which needs a very expensive transcoding process. The server-side approach overcomes these problems, but the wide variety of clients mentioned above and the cost of transcoding can put servers under heavy load and therefore cause long delays in transmitting content to clients. In this project, I investigated recent research on completely distributed adaptation that uses the intermediary approach for transcoding and caching content, and identified the issues and problems that remain in that research. Finally, I suggest my own solutions to those problems as future work on this topic.
In this section, I categorize content adaptation models into two areas, transcoding and caching content, according to where each model focuses. In each subsection I briefly discuss the issues raised by the related papers and how the authors approached them. By investigating the representative models below, I identified the issues and problems that remain in the content adaptation area; they are discussed in section 3.
2.1 Transcoding
Tailoring content to match device characteristics or users' preferences requires functionality for transforming content, which is called transcoding. Many research groups have proposed infrastructures and algorithms to resolve the issues and challenges in this area. The authors of [2] focus on dynamic, distributed adaptation of individually injected components in response to system and network conditions in environments with ubiquitous, heterogeneous devices. They propose CANS (composable, adaptive network services), an application-level infrastructure for injecting application-specific components into the network. With CANS, distributed adaptation components can be composed efficiently to generate the target content, and the resulting composition can be changed dynamically to build the best datapath when network conditions, system load, or the user environment change. Modularized adaptation functionality also appears in [1], which focuses on how to connect modules through a flexible mechanism: a clear interface language that describes each module's connections as an input pair and an output pair. When a user requests content, the request message includes the output pair (the type and its properties) of the content. The intermediate server then looks up an appropriate module by matching an output pair with the input pair of a module, and repeats this process until a complete pipeline is built. The authors of [4] use a similar approach in their middleware, the CAP (content adaptation pipeline) architecture. CAP has two distinctive features: it uses XML as the interface language for describing users' characteristics, content, and adaptation commands, and it divides modules by operation rather than by functionality. That is, the pipes consist of a data characterization function, an adaptation command generator, and a content adaptation executor. They have applied the CAP architecture to medical systems at UCLA.
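The pair-matching idea can be illustrated with a minimal sketch. It is not the framework of [1]; the `Module` type, the module names, and the (type, property) values are hypothetical, and a breadth-first search is used here simply as one way to chain modules whose output pair matches the next module's input pair.

```python
from collections import deque
from typing import NamedTuple


class Module(NamedTuple):
    name: str
    input_pair: tuple   # (content type, property), hypothetical encoding
    output_pair: tuple


def build_pipeline(modules, source_pair, requested_pair):
    """Return the module names that turn source_pair into requested_pair, or None."""
    if source_pair == requested_pair:
        return []
    queue = deque([(source_pair, [])])
    seen = {source_pair}
    while queue:
        pair, path = queue.popleft()
        for m in modules:
            # A module can follow only if its input pair matches the current pair.
            if m.input_pair != pair or m.output_pair in seen:
                continue
            new_path = path + [m.name]
            if m.output_pair == requested_pair:
                return new_path
            seen.add(m.output_pair)
            queue.append((m.output_pair, new_path))
    return None


# Hypothetical modules registered at an intermediary.
MODULES = [
    Module("jpeg_to_gif", ("image/jpeg", "color"), ("image/gif", "color")),
    Module("gif_to_gray", ("image/gif", "color"), ("image/gif", "gray")),
    Module("gray_to_wbmp", ("image/gif", "gray"), ("image/vnd.wap.wbmp", "mono")),
]

pipeline = build_pipeline(
    MODULES, ("image/jpeg", "color"), ("image/vnd.wap.wbmp", "mono")
)
print(pipeline)
```

Because the search is breadth-first, the pipeline it returns uses the fewest modules among those registered, and `None` signals that no chain of modules reaches the requested output pair.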
2.2 Caching Content
Since the transcoding process is expensive, especially for complex multimedia data types, many researchers have focused on cache mechanisms located somewhere between the server and the client. If the content cached on a proxy, which is usually located close to the client, exactly matches the user's request, this is the best case in terms of latency and user response time. Even a partial match, in which the system finds higher-fidelity content cached on a proxy that can be transformed into the content the user requested, can save more system resources than the pure transcoding case. [7] proposes a peer-to-peer caching system for content adaptation called Tuxedo. [6] evaluates a flat cache architecture (e.g. peer-to-peer) against a hierarchical one. [8] introduces a programmable proxy that adapts to prevailing system conditions to decide whether to transcode locally or fetch an appropriate version from the server when the cache holds a higher-fidelity version of the content. It uses a simple learning mechanism for this decision.
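The three lookup outcomes described above (exact match, partial match, and miss) can be sketched as follows. This is an illustrative sketch only, not the mechanism of any cited system; it assumes that fidelity can be modeled as an integer where a larger value means higher fidelity, and the `Outcome` names and `lookup` function are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    EXACT_HIT = "serve the cached version as-is"
    PARTIAL_HIT = "transcode a cached higher-fidelity version"
    MISS = "fetch from the origin server"


def lookup(cache, url, requested_fidelity):
    """cache maps url -> set of cached fidelity levels (assumed: larger = higher)."""
    versions = cache.get(url, set())
    if requested_fidelity in versions:
        return Outcome.EXACT_HIT, requested_fidelity
    higher = [f for f in versions if f > requested_fidelity]
    if higher:
        # Assumption: the closest higher-fidelity version is the cheapest source.
        return Outcome.PARTIAL_HIT, min(higher)
    # Lower-fidelity versions cannot be upgraded, so they do not count as a hit.
    return Outcome.MISS, None


cache = {"/photo": {1, 3}}
print(lookup(cache, "/photo", 3))
print(lookup(cache, "/photo", 2))
print(lookup(cache, "/photo", 4))
```

On a partial hit, a proxy such as the one in [8] would still have to decide between transcoding locally and fetching from the server; that decision is left out of this sketch.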
Other related work addresses issues of content adaptation beyond caching and transcoding mechanisms. [3] proposes a method for finding the optimal tradeoff point between transcoding overhead and the storage needed for the various pre-processed content versions, that is, the balance point between CPU cost and I/O cost. [10] suggests adding the well-known on-demand data broadcasting approach and a simple QoS mechanism to a transcoding proxy. All requests are placed in a queue; the system gathers identical requests from the queue, transcodes the content to a certain level, and then broadcasts the transcoded content to all users. The system chooses the level in response to current system conditions.
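The queue-gather-transcode-broadcast cycle can be sketched as below. This is a simplified illustration, not the algorithm of the cited proxy: the load threshold in `pick_level`, the two quality levels, and the `fake_transcode` stand-in are all assumptions made for the example.

```python
from collections import defaultdict


def pick_level(load):
    """Hypothetical policy: lower the quality level when the proxy is busy."""
    return "low" if load > 0.8 else "high"


def serve_queue(queue, transcode, load):
    """queue holds (client, url) requests; returns client -> delivered content."""
    level = pick_level(load)
    waiting = defaultdict(list)
    for client, url in queue:
        waiting[url].append(client)       # gather identical requests
    deliveries = {}
    for url, clients in waiting.items():
        content = transcode(url, level)   # transcode once per distinct URL
        for client in clients:
            deliveries[client] = content  # "broadcast" the single result
    return deliveries


calls = []


def fake_transcode(url, level):
    """Stand-in transcoder that records how often it is invoked."""
    calls.append(url)
    return f"{url}@{level}"


queue = [("alice", "/news"), ("bob", "/video"), ("carol", "/news")]
result = serve_queue(queue, fake_transcode, load=0.9)
print(result)
print(len(calls))
```

Three requests trigger only two transcoding operations here, which is the saving that gathering identical requests is meant to provide.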
Because transcoding is costly, the intermediary, or completely distributed, content adaptation approach offers many advantages. Many research groups have therefore proposed frameworks and infrastructures for this style of content adaptation and have improved the mechanisms that address its issues and challenges. Efficient caching mechanisms have also been proposed to reduce the cost of transcoding. During this project, however, I found that many issues are still not handled. One is the lack of a location algorithm in completely distributed content adaptation systems over wide-area networks, such as [2]. When a module must be connected to another module through their input and output pairs, how does the module look up the other module? The paper omits the details of the location mechanism. Another issue is that distributed content adaptation mechanisms still need optimization to reduce user response time and system overhead, by building the adaptation chain (called a datapath in [2] and a pipeline in [4]) as short as possible and by making adaptation chains parallel where that is allowed. My experience working on [11] motivates me to employ Infopipe concepts to generate optimized, parallel workflows for content adaptation. As the next step in content adaptation research, I will use workflow concepts to further optimize distributed content adaptation and develop an efficient protocol for looking up connection chains.
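As a rough illustration of what parallel adaptation chains could look like, the sketch below runs one chain per content part concurrently. It is not an Infopipe implementation and not a proposed design; it assumes a document can be split into independent parts, and the part names and adaptation steps are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def run_chain(content, chain):
    """Apply each adaptation step of one chain in order."""
    for step in chain:
        content = step(content)
    return content


def adapt_parallel(parts, chains):
    """parts: name -> content; chains: name -> list of steps (callables).

    Assumes the parts are independent, so their chains may run concurrently.
    """
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(run_chain, content, chains.get(name, []))
            for name, content in parts.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Hypothetical page with two independent parts and toy adaptation steps.
parts = {"text": "Hello World", "image": "IMG[1024x768]"}
chains = {
    "text": [str.lower, lambda s: s[:5]],
    "image": [lambda s: s.replace("1024x768", "320x240")],
}

adapted = adapt_parallel(parts, chains)
print(adapted)
```

Steps within a chain still run in sequence; only chains that do not depend on each other overlap, which is the case where parallel execution is allowed.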
[1] Intermediary-based transcoding framework, S.C. Ihde, P.P. Maglio, J. Meyer, and R. Barrett, IBM Systems Journal, 2001
[2] CANS: Composable, Adaptive Network Services Infrastructure, Xiaodong Fu, Weisong Shi, Anatoly Akkerman, and Vijay Karamcheti, In Proc. of the USENIX Symposium on Internet Technologies and Systems, 2001
[3] On Balancing Between Transcoding Overhead and Spatial Consumption in Content Adaptation, Wai Yip Lum and Francis C.M. Lau, MobiCom 2002
[4] An Extensible and Scalable Content Adaptation Pipeline Architecture to Support Heterogeneous Clients, Thomas Phan, George Zorpas, and Rajive Bagrodia, ICDCS 2002
[5] Cooperative Architectures and Algorithms for Discovery and Transcoding of Multi-version Content, Claudia Canali, Valeria Cardellini, Michele Colajanni, Riccardo Lancellotti, and Philip S. Yu, Web Caching Workshop 2003
[6] Tuxedo: A Peer-to-Peer Caching System, Weisong Shi, Kandarp Shah, Yonggen Mao, and Vipin Chaudhary, In Proc. of the 2003 Int'l Conf. on Parallel and Distributed Processing Techniques and Applications (PDPTA)
[7] PTC: Proxies that Transcode and Cache in Heterogeneous Web Client Environments, Aameek Singh, Abhishek Trivedi, Krithi Ramamritham, and Prashant Shenoy, WWW 2003
[8] Middleware Support for Reconciling Client Updates and Data Transcoding, Thomas Phan, George Zorpas, and Rajive Bagrodia, MobiSys 2004
[9] A QoS-Aware Transcoding Proxy Using On-demand Data Broadcasting, Jiun-Long Huang, Ming-Syan Chen, and Hao-Ping Hung, INFOCOM 2004
[10] Infopipe Project, http://www.cc.gatech.edu/projects/infosphere/software/