Terascale computing and large scientific experiments produce enormous quantities of data that require effective and efficient management. Managing scientific data is so demanding that scientists spend much of their time building special-purpose data management solutions rather than on scientific investigation and discovery. The goal of the DOE SciDAC SDM project is to establish an Enabling Technology Center that provides a coordinated framework for the unification, development, deployment, and reuse of scientific data management software.
The SDM Georgia Tech team contributes its expertise to the SciDAC SDM center in the following areas:
- Information Extraction and Wrapper Generation
- Service Discovery, Service Selection, and Service Composition
- Trust and Data Provenance Support in Data-Intensive Web Services for Large Data Sets
- Ontology Generation and Ontology Utilization
- Information Fusion and Information Integration
A unique contribution of the Georgia Tech team to the SDM center is the research and development of the open source XWRAPComposer Code Generator system, which provides one-stop services that help scientists and engineers efficiently collect, fuse, and integrate large amounts of deep web data published through different scientific and engineering channels.
XWRAPComposer differs from other wrapper generation systems in two respects. First, it supports wrapper program workflows, allowing scientists to zoom into the concrete steps of a wrapper execution to understand where and how the data is collected and how it will be fused and integrated.

Second, XWRAPComposer is the first code generation system that can generate code capable of extracting information from multiple linked pages. For further details, please visit the XWRAPComposer home page.
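To make the two ideas above concrete, here is a minimal Python sketch of a wrapper that follows links from one page to others and records each step of its execution. It is only an illustration under stated assumptions: it is not code generated by XWRAPComposer and does not use its API. The page contents, the file names, and the names `LinkCollector`, `FieldCollector`, and `run_wrapper` are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical in-memory "site": an index page linking to two detail pages.
PAGES = {
    "index.html": '<a href="a.html">A</a><a href="b.html">B</a>',
    "a.html": '<span class="name">sample-A</span><span class="size">10</span>',
    "b.html": '<span class="name">sample-B</span><span class="size">20</span>',
}


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


class FieldCollector(HTMLParser):
    """Collects the text of each <span>, keyed by its class attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._current = dict(attrs).get("class")

    def handle_data(self, data):
        if self._current:
            # Accumulate in case the parser delivers the text in pieces.
            self.fields[self._current] = self.fields.get(self._current, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def run_wrapper(start, fetch):
    """Run a two-level extraction and return (records, trace).

    `fetch` maps a page name to its HTML; `trace` lists each workflow step
    so a user can inspect where every record came from.
    """
    trace = [("fetch", start)]
    link_parser = LinkCollector()
    link_parser.feed(fetch(start))
    trace.append(("extract_links", list(link_parser.links)))

    records = []
    for link in link_parser.links:
        field_parser = FieldCollector()
        field_parser.feed(fetch(link))
        trace.append(("extract_fields", link))
        # Fuse the extracted fields with the page they came from.
        records.append({"source": link, **field_parser.fields})
    return records, trace


records, trace = run_wrapper("index.html", PAGES.__getitem__)
```

Returning the trace alongside the records is one simple way to expose the execution steps of a wrapper; a real system would presumably fetch pages over the network and use far richer extraction rules.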
Scientific Data Management Research Group
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332