Current version: 3.3
What is PBIO?
Storing and transmitting data in binary form is often desirable both
to conserve I/O bandwidth and to reduce storage and processing requirements.
However, transmission of binary data between machines in heterogeneous
environments is problematic. When binary formats are used for long-term
data storage, similar problems are encountered with data portability.
PBIO is a general approach to dealing with binary data in storage and
transmission. Essentially, PBIO is a data meta-representation. Users register
the structure of the data that they wish to transmit/store or receive/read,
and PBIO transparently masks the differences between the writer's and the
reader's representations. In particular, PBIO handles
differences in the sizes, locations and even basic types of the fields
in the records to be exchanged. Field matching is by field names (provided
by the application). Records are C-style structures consisting of fields
which can be any of the usual atomic data types or NULL-terminated strings.
A previously registered structure may also serve as the basic type of a
field, allowing the creation of complex nested record types. PBIO also
supports fields which are statically sized one- or two-dimensional arrays
or dynamically sized one-dimensional arrays of any of the above datatypes.
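As an illustration of what registering a record structure involves, the sketch below describes a C structure as a table of field names, base types, sizes and offsets. The names field_desc, sample_rec, sample_fields and find_field are hypothetical and chosen for this example only; they are not PBIO's actual declarations, which are documented in the PBIO paper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one field; not PBIO's own type. */
typedef struct {
    const char *name;   /* name used to match fields between writer and reader */
    const char *type;   /* base type, e.g. "integer", "float", "string" */
    size_t size;        /* size of the field in bytes on this machine */
    size_t offset;      /* byte offset of the field within the structure */
} field_desc;

/* An example application record (a C-style structure). */
typedef struct {
    int    step;
    double temperature;
    char  *label;       /* NULL-terminated string */
} sample_rec;

/* Description of sample_rec, terminated by an entry with a NULL name. */
static const field_desc sample_fields[] = {
    {"step",        "integer", sizeof(int),    offsetof(sample_rec, step)},
    {"temperature", "float",   sizeof(double), offsetof(sample_rec, temperature)},
    {"label",       "string",  sizeof(char *), offsetof(sample_rec, label)},
    {NULL, NULL, 0, 0}
};

/* Look up a field by name; returns NULL if no field has that name. */
static const field_desc *find_field(const field_desc *fields, const char *name)
{
    assert(fields != NULL && name != NULL);
    for (; fields->name != NULL; fields++) {
        if (strcmp(fields->name, name) == 0)
            return fields;
    }
    return NULL;
}
```

Because sizes and offsets are computed by the compiler, the same source yields a different table on each machine, which is the kind of per-machine information a reader needs in order to interpret a writer's records.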
Record meta-information is transmitted once, when record formats are registered.
Thereafter, transmission occurs in the writer's native format and the PBIO
library on the receiver transparently handles discrepancies between the
writer's format and the format required by the reader. In the case of transfers
between homogeneous machines, the only additional overhead imposed by PBIO
is the transmission of a 4-byte format ID. Between heterogeneous machines,
the extent of the overhead depends upon the degree to which the record
formats and atomic type representations differ between the two machines.
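The receiver-side conversion described above can be sketched roughly as follows. This is a simplified illustration restricted to integer fields, not PBIO's implementation: the types int_field and layout and the functions load_int, store_int and convert_record are hypothetical names introduced here. The sketch matches fields by name and adjusts for differences in field size, offset and byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout entry for one integer field (not PBIO's own type). */
typedef struct {
    const char *name;
    size_t size;      /* 1 to 8 bytes */
    size_t offset;    /* byte offset within the record */
} int_field;

/* Hypothetical description of one machine's record layout. */
typedef struct {
    const int_field *fields;
    size_t nfields;
    int big_endian;   /* nonzero if the most significant byte is stored first */
} layout;

/* Read a signed integer of 'size' bytes stored in the given byte order. */
static int64_t load_int(const unsigned char *p, size_t size, int big_endian)
{
    uint64_t v = 0;
    assert(size >= 1 && size <= 8);
    for (size_t i = 0; i < size; i++) {
        size_t idx = big_endian ? i : size - 1 - i;  /* most significant first */
        v = (v << 8) | p[idx];
    }
    /* Sign-extend values narrower than 64 bits. */
    if (size < 8 && (v & ((uint64_t)1 << (size * 8 - 1))))
        v |= ~(uint64_t)0 << (size * 8);
    return (int64_t)v;
}

/* Write the low 'size' bytes of 'value' in the given byte order. */
static void store_int(unsigned char *p, size_t size, int big_endian, int64_t value)
{
    uint64_t v = (uint64_t)value;
    assert(size >= 1 && size <= 8);
    for (size_t i = 0; i < size; i++) {
        size_t idx = big_endian ? size - 1 - i : i;  /* least significant first */
        p[idx] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Fill the reader's record from the writer's record, matching fields by name.
 * Reader fields with no counterpart in the writer's layout are left untouched.
 * Returns the number of reader fields that were filled. */
static size_t convert_record(const layout *w, const unsigned char *src,
                             const layout *r, unsigned char *dst)
{
    size_t matched = 0;
    assert(w != NULL && r != NULL && src != NULL && dst != NULL);
    for (size_t i = 0; i < r->nfields; i++) {
        const int_field *rf = &r->fields[i];
        for (size_t j = 0; j < w->nfields; j++) {
            const int_field *wf = &w->fields[j];
            if (strcmp(rf->name, wf->name) == 0) {
                int64_t v = load_int(src + wf->offset, wf->size, w->big_endian);
                store_int(dst + rf->offset, rf->size, r->big_endian, v);
                matched++;
                break;
            }
        }
    }
    return matched;
}
```

When the two layouts are identical, a conversion of this kind reduces to a plain copy, which is consistent with the low overhead reported above for homogeneous transfers.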
PBIO is described in the following publications:
Efficient Wire Formats for High Performance Computing,
Fabian Bustamante, Greg Eisenhauer, Karsten Schwan and Patrick Widener, SC'2000.
postscript,
PDF
Fast Heterogeneous Binary Data Interchange for Event-based Monitoring,
Beth Plale, Greg Eisenhauer, Lynn K. Daley, Patrick Widener and Karsten
Schwan, Proceedings of the International Conference on Parallel and
Distributed Computing Systems (PDCS2000), August 8-10 2000.
postscript,
PDF
Fast Heterogeneous Binary Data Interchange, Greg Eisenhauer and Lynn
K. Daley, Proceedings of the 9th Heterogeneous Computing Workshop (HCW
2000), pp 90-101.
postscript,
PDF
Morphable Messaging: Efficient Support for Evolution in Distributed
Applications, Sandip Agarwala, Greg Eisenhauer and Karsten Schwan,
submitted to Challenges of Large Applications in Distributed Environments
(CLADE), June 2004.
PostScript [220K] |
PDF Format [129K]
What platforms does PBIO support?
PBIO has been compiled and tested amongst heterogeneous combinations
of the following platforms:
- Sun Solaris 2.x (32 and 64 bit architectures)
- Sun SunOS 4.x
- x86 Solaris 2.x
- SGI Irix 4.x,5.x,6.x (32 and 64 bit architectures)
- IBM AIX
- x86 Linux
- Windows NT
PBIO is distributed with a configure script and should adapt itself
to most rational Unix-based platforms.
Where can I get source and documentation?
PBIO is available under the new BSD license and can be downloaded in
gzip or compress formats directly. A paper describing PBIO and containing basic usage information
is available (pdf).
An earlier version of this paper appeared as College of Computing Tech
Report GIT-CC-94-45,
which has the following bibtex entry:
@TechReport{Eisenhauer94PSD,
  author = "Greg Eisenhauer",
  title = "Portable Self-Describing Binary Data Streams",
  institution = "College of Computing, Georgia Institute of Technology",
  year = "1994",
  number = "GIT-CC-94-45",
  note = "{\it (anon. ftp from ftp.cc.gatech.edu)}",
}
Older versions of PBIO source are available below:
Known limitations
- PBIO will not compile and run on machines which do not support byte-wise
access to memory. Old Crays are the likeliest source of this problem.
- For floating point numbers, PBIO can currently only read values for
which the native machine has a corresponding representation. Given the
widespread use of the IEEE floating point format in modern machines, this
is most likely to be a problem when the source machine supports a different
range of data sizes than the target machine. The future addition of more
general floating point conversion routines to PBIO will alleviate this
problem.
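The floating point limitation can be pictured with the following sketch, which only accepts incoming values whose size matches one of the machine's native floating point types. It is an illustration under stated assumptions, not PBIO code: the function names has_native_float and read_float_as_double are hypothetical, and the incoming bytes are assumed to be already in the native byte order and in the native format for that size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if this machine has a native floating point type that is
 * exactly wire_size bytes long, 0 otherwise. */
static int has_native_float(size_t wire_size)
{
    return wire_size == sizeof(float) ||
           wire_size == sizeof(double) ||
           wire_size == sizeof(long double);
}

/* Read a floating point value of wire_size bytes into a double.
 * Returns 0 on success, or -1 if no native type has that size, in which
 * case *out is left unchanged. */
static int read_float_as_double(const void *src, size_t wire_size, double *out)
{
    assert(src != NULL && out != NULL);
    if (wire_size == sizeof(float)) {
        float f;
        memcpy(&f, src, sizeof f);
        *out = f;                 /* widening conversion, exact */
        return 0;
    }
    if (wire_size == sizeof(double)) {
        memcpy(out, src, sizeof *out);
        return 0;
    }
    if (wire_size == sizeof(long double)) {
        long double ld;
        memcpy(&ld, src, sizeof ld);
        *out = (double)ld;        /* may lose precision or range */
        return 0;
    }
    return -1;                    /* no corresponding native representation */
}
```

A value written as a 16-byte floating point number, for example, would be rejected by this sketch on a machine whose largest native floating point type is 8 bytes.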
Recent PBIO news.
- Significant development on PBIO halted several years ago. A follow-on project, Fast Flexible Serialization (FFS), exists and can currently be obtained as part of the EVPath project.
- Version 3.3 of PBIO was released on Oct 1, 2003. This release
includes PBIO format server authentication and
message morphing support.
- Version 3.2 of PBIO was released on July 31, 2001. This release
includes updated XML support.
- Version 3.1 of PBIO was released on June 2, 1997. This release updated
the documentation and included minor changes to the non-connected PBIO
functionality.
- Version 3.0 of PBIO was released on April 23, 1997. This is the first official
version of PBIO that supports the Windows NT platform. It contains more
extensive support for replacing the bottom level of I/O with other connected
and reliable transports, as well as mechanisms to support the use of PBIO
encodings across non-connected and non-reliable transport layers. In a
minor accommodation to Windows NT programmers, the flags parameter
to open_IOfile() and open_IOfd() has been changed. Previous
versions of PBIO used the set of flags accepted by the Unix open()
call. In PBIO 3.0, the flags parameter is of type char* and valid
strings are "r" and "w", for reading
and writing respectively. For backwards compatibility, we still accept
the old flags values, though your compiler may generate warning messages
about type mismatches.
Debugging PBIO programs.
Some general questions about debugging PBIO programs are
answered in another document.
This page is maintained by PBIO author
Greg Eisenhauer
Last Modified Sep 27, 2010.