Current version: 3.3

## What is PBIO?

Storing and transmitting data in binary form is often desirable both to conserve I/O bandwidth and to reduce storage and processing requirements. However, transmission of binary data between machines in heterogeneous environments is problematic. When binary formats are used for long-term data storage, similar problems are encountered with data portability.

PBIO is a general approach to dealing with binary data in storage and transmission. Essentially, PBIO is a data meta-representation. Users register the structure of the data that they wish to transmit or store (on the writing side) or receive or read (on the reading side), and PBIO transparently masks the differences. In particular, PBIO handles differences in the sizes, locations and even basic types of the fields in the records to be exchanged. Field matching is by field names (provided by the application).

Records are C-style structures consisting of fields which can be any of the usual atomic data types or NULL-terminated strings. A previously registered structure may also serve as the basic type of a field, allowing the creation of complex nested record types. PBIO also supports fields which are statically sized one- or two-dimensional arrays or dynamically sized one-dimensional arrays of any of the above datatypes.

Record meta-information is transmitted once, when record formats are registered. Thereafter, transmission occurs in the writer's native format, and the PBIO library on the receiver transparently handles discrepancies between the writer's format and the format required by the reader. In the case of transfers between homogeneous machines, the only additional overhead imposed by PBIO is the transmission of a 4-byte format ID. Between heterogeneous machines, the extent of the overhead depends upon the degree to which the record formats and atomic type representations differ between the two machines.
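
The name-based field matching described above can be illustrated with a small, self-contained C sketch. This is not PBIO's API: the `FieldDesc` type and the `convert_record()` and `convert_sample()` functions are hypothetical names invented for this example. It assumes both sides share the same byte order and handles only same-size copies and integer resizing. It shows how a reader could rebuild a record from the writer's native layout when the two sides disagree on field order and integer size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor, not PBIO's actual API: one entry per field. */
typedef enum { FIELD_INT, FIELD_FLOAT } FieldKind;

typedef struct {
    const char *name;   /* application-supplied name used for matching */
    FieldKind   kind;
    size_t      size;   /* size in bytes on this side */
    size_t      offset; /* offset within the record on this side */
} FieldDesc;

/* Writer and reader disagree on field order and on the integer's size. */
typedef struct { int32_t count; double value; } WriterRec;
typedef struct { double value; int64_t count; } ReaderRec;

static const FieldDesc writer_fields[] = {
    { "count", FIELD_INT,   sizeof(int32_t), offsetof(WriterRec, count) },
    { "value", FIELD_FLOAT, sizeof(double),  offsetof(WriterRec, value) },
};
static const FieldDesc reader_fields[] = {
    { "value", FIELD_FLOAT, sizeof(double),  offsetof(ReaderRec, value) },
    { "count", FIELD_INT,   sizeof(int64_t), offsetof(ReaderRec, count) },
};

/* Load a signed integer of the given size (same byte order assumed). */
static int64_t load_int(const unsigned char *p, size_t size)
{
    int8_t a; int16_t b; int32_t c; int64_t d;
    switch (size) {
    case 1: memcpy(&a, p, 1); return a;
    case 2: memcpy(&b, p, 2); return b;
    case 4: memcpy(&c, p, 4); return c;
    case 8: memcpy(&d, p, 8); return d;
    default: return 0; /* unsupported size in this sketch */
    }
}

/* Store a signed integer, truncating to the destination size. */
static void store_int(unsigned char *p, size_t size, int64_t v)
{
    int8_t a = (int8_t)v; int16_t b = (int16_t)v; int32_t c = (int32_t)v;
    switch (size) {
    case 1: memcpy(p, &a, 1); break;
    case 2: memcpy(p, &b, 2); break;
    case 4: memcpy(p, &c, 4); break;
    case 8: memcpy(p, &v, 8); break;
    default: break; /* unsupported size in this sketch */
    }
}

/* Fill dst from src by matching field names; returns fields converted. */
static int convert_record(const void *src, const FieldDesc *sf, size_t ns,
                          void *dst, const FieldDesc *df, size_t nd)
{
    int matched = 0;
    for (size_t i = 0; i < nd; i++) {
        for (size_t j = 0; j < ns; j++) {
            if (strcmp(df[i].name, sf[j].name) != 0)
                continue;
            const unsigned char *from = (const unsigned char *)src + sf[j].offset;
            unsigned char *to = (unsigned char *)dst + df[i].offset;
            if (df[i].size == sf[j].size)
                memcpy(to, from, df[i].size);      /* identical representation */
            else if (df[i].kind == FIELD_INT && sf[j].kind == FIELD_INT)
                store_int(to, df[i].size, load_int(from, sf[j].size));
            else
                break;  /* conversion not handled here; field left untouched */
            matched++;
            break;
        }
    }
    return matched;
}

/* Convert one sample record and return the reader-side view. */
static ReaderRec convert_sample(int32_t count, double value)
{
    WriterRec w = { count, value };
    ReaderRec r = { 0.0, 0 };
    int matched = convert_record(&w, writer_fields, 2, &r, reader_fields, 2);
    assert(matched == 2);
    return r;
}
```

Calling `convert_sample(-7, 2.5)` returns a reader record whose `count` is -7 widened to 64 bits and whose `value` is 2.5, even though the writer stored the fields in a different order and with a narrower integer. A real implementation would also need to deal with byte order, strings, nested records and arrays, none of which this sketch attempts.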

PBIO is described in the following publications:

• Efficient Wire Formats for High Performance Computing, Fabian Bustamante, Greg Eisenhauer, Karsten Schwan and Patrick Widener, SC'2000.
• Fast Heterogeneous Binary Data Interchange for Event-based Monitoring, Beth Plale, Greg Eisenhauer, Lynn K. Daley, Patrick Widener and Karsten Schwan, Proceedings of the International Conference on Parallel and Distributed Computing Systems (PDCS2000), August 8-10 2000.
• Fast Heterogeneous Binary Data Interchange, Greg Eisenhauer and Lynn K. Daley, Proceedings of the 9th Heterogeneous Computing Workshop (HCW 2000), pp 90-101.
• Morphable Messaging: Efficient Support for Evolution in Distributed Applications, Sandip Agarwala, Greg Eisenhauer and Karsten Schwan, submitted to Challenges of Large Applications in Distributed Environments (CLADE), June 2004.

## What platforms does PBIO support?

PBIO has been compiled and tested among heterogeneous combinations of the following platforms:

• Sun Solaris 2.x (32 and 64 bit architectures)
• Sun SunOS 4.x
• x86 Solaris 2.x
• SGI Irix 4.x, 5.x, 6.x (32 and 64 bit architectures)
• IBM AIX
• x86 Linux
• Windows NT

PBIO is distributed with a configure script and should adapt itself to most rational Unix-based platforms.

## Where can I get source and documentation?

PBIO is available under the new BSD license and can be downloaded in gzip or compress formats directly. A paper describing PBIO and containing basic usage information is available (pdf). An earlier version of this paper appeared as College of Computing Tech Report GIT-CC-94-45, which has the following bibtex entry:

@TechReport{Eisenhauer94PSD,
author =       "Greg Eisenhauer",
title =        "Portable Self-Describing Binary Data Streams",
institution =  "College of Computing, Georgia Institute of Technology",
year =         "1994",
number =       "GIT-CC-94-45",
note =         "{\it (anon. ftp from ftp.cc.gatech.edu)}",
}


## Known limitations

• PBIO will not compile and run on machines which do not support byte-wise access to memory. Old Crays are the likeliest source of this problem.
• For floating point numbers, PBIO can currently only read values for which the native machine has a corresponding representation. Given the widespread use of the IEEE floating point format in modern machines, this is most likely to be a problem when the source machine supports a different range of data sizes than the target machine. The future addition of more general floating point conversion routines to PBIO will alleviate this problem.

## Recent PBIO news

• Significant development on PBIO halted several years ago. A follow-on project, Fast Flexible Serialization (FFS), exists and can currently be obtained as part of the EVPath project.
• Version 3.3 of PBIO was released on Oct 1, 2003. This release includes PBIO format server authentication and message morphing support.
• Version 3.2 of PBIO was released on July 31, 2001. This release includes updated XML support.
• Version 3.1 of PBIO was released on June 2, 1997. This release updated the documentation and included minor changes to the non-connected PBIO functionality.
• Version 3.0 of PBIO was released on April 23, 1997. This is the first official version of PBIO to support the Windows NT platform. It contains more extensive support for replacing the bottom level of I/O with other connected and reliable transports, as well as mechanisms to support the use of PBIO encodings across non-connected and non-reliable transport layers. In a minor accommodation to Windows NT programmers, the flags parameter to open_IOfile() and open_IOfd() has been changed. Previous versions of PBIO used the set of flags accepted by the Unix open() call. In PBIO 3.0, the flags parameter is of type char* and the valid strings are "r" and "w", for reading and writing respectively. For backwards compatibility, the old flag values are still accepted, though your compiler may generate warning messages about type mismatches.

## Debugging PBIO programs

Some general questions about debugging PBIO programs are answered in another document.