VBD Specs Sheet

Variable Block Database Version 2, Revision C


Topics:

Overview
VBD File Format
Variable Block Headers
VBD CRC Checking
VBD Advisory File/Record Locking
VBD File Specifications
Platform Independent Data Types


Overview

The Variable Block Database is a file format used to store any type of variable-length binary data in re-sizeable blocks. The VBD file format is extremely flexible, allowing it to support both object-oriented database models and relational database models.

The object-oriented database model is based on persistent objects and extends the object-oriented programming principles of abstraction, encapsulation, inheritance, and polymorphism into the database. A persistent object is an object that saves its state between program invocations. Normally an object's data, stored in memory, is lost when the program terminates. Persistent objects store their data both in memory and in a disk file. When the program is terminated and restarted, the persistent object can restore its last state by loading its data from the disk file. This is the basis of an object-oriented database. A persistent object is able to represent any kind of data because there are no restrictions on its format and content. This allows an object-oriented database to define complex data items, arrays, variable lists, and user-defined data types within its record definitions.

The relational database model consists of fixed-length, fixed-format records, which form tables of rows and columns. Each record has a primary key data element to uniquely identify the record in the database. In a relational database the relationship between the rows and columns is represented by data values rather than record addresses. Both the relational and object-oriented database models have their advantages. An object-oriented database is not bound by rows and columns and can be made to represent any type of data. A relational database can easily provide general-purpose query and report writer languages to find, sort, view, and print the records in the database.

By design the VBD file format is not limited to the object-oriented or relational database model. Any type of data can be stored and manipulated in a VBD file. VBD files have this capability because the methods by which any data type is stored or retrieved must be defined within the application.


VBD File Format

Every VBD file contains a file header, a file lock header, an optional static storage area, and a dynamic storage area where variable blocks of data are stored. The file header is composed of six fields and will vary in length depending on whether 32-bit or 64-bit file offsets are used.

(1) Free space field - stores a pointer to the first block of de-allocated heap space
(2) End of file field - stores a pointer to the address of the last byte in the file
(3) Start of heap field - stores a pointer to the start of the dynamic storage area
(4) Highest block field - stores a pointer to the highest allocated block
(5) Signature field - signature used to identify VBD file types
(6) Version field - used to identify the VBD file version number

This header is used to store information needed by the allocation functions, a signature, and a version number. The lock header is used to lock the entire file during multi-threaded read and write operations. The static data area is used to store fixed data that cannot be altered by any of the dynamic allocation routines. The size of the static data area must be specified by the application that creates the VBD file. If no static area is specified then the dynamic data area will start directly after the VBD file lock header.

The VBD file header information starts at file address 0 and reserves enough space at the beginning of the file for the file and lock headers. If a static area is requested by the application, the static area will occupy the number of bytes requested. The dynamic data area will start directly after the static data area and occupy the rest of the file.

The dynamic data area always starts out empty and grows when variable data blocks are allocated. The "free space", "end of file", and "start of heap" pointers stored in the VBD file header are used to maintain the dynamic data area. The "start of heap" pointer stores the address where the dynamic data area starts. The "end of file" pointer marks the end of the file and points to the location where the file can be extended during block allocation.

As new blocks are allocated the "highest block" pointer is used to store the address of the highest allocated variable block. When a block is de-allocated, it is marked deleted and left in the file. The size of the file is determined by the number of blocks allocated and will remain the same no matter how many blocks are deleted. Deleted blocks are maintained in a non-contiguous list. Each block in the list points to the next block in the list, starting at a specified address. The "free space" pointer is used to store the file address of the block where the free space list starts.

The signature and version fields must be set when a new file is created or when an existing file is opened. The signature field is used to determine whether the file is of the correct type. If neither the "VBDBASE" signature nor the "VBDBASE64" signature (used with 64-bit offsets) is found, the file is assumed not to be a valid VBD file. The eighth byte of the signature (the tenth byte for 64-bit files) stores the revision letter and is used for all revision changes. Version 2 sets the revision letter to 'C' and performs compatibility checks to ensure backward compatibility with version 2 revision 'A', revision 'B', and revision 'C' files.

The VBD revision letter is used to determine the amount of overhead per variable block when a new database file is created. When an existing file is opened, the version and revision stored in the file header are used. VBD revision letter zero (denoted by a null value) excludes the persistent checksum value, the persistent file lock header, and the persistent record lock header, for a total block overhead of 16 bytes. Revision 'A' reserves space at the end of each block for an optional persistent checksum value and excludes the persistent file lock header and the persistent record lock header, for a total block overhead of 20 bytes. Revision 'B' includes the persistent checksum value and adds a persistent file lock header. Revision 'C' includes all the features of revisions 'A' and 'B' and adds persistent record lock headers, for a total overhead of 32 bytes per block.
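The signature check and the per-block overhead figures can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the VBD library. The sketch takes the raw signature field as input and does not attempt to locate it inside the file header. The overhead for revision 'B' is not stated in the text; the value returned here is inferred from the features listed for that revision and should be treated as an assumption. The text also does not say how these figures change with 64-bit offsets, so the sketch does not attempt it.

```python
# Per-block overhead components in bytes, as given in the text.
BASE_HEADER = 16   # revision zero: block header only
CHECKSUM = 4       # reserved at the end of each block from revision 'A' on
RECORD_LOCK = 12   # reserved after the block header in revision 'C'

VALID_REVISIONS = ("\0", "A", "B", "C")


def parse_signature(field: bytes):
    """Return (uses_64bit_offsets, revision_letter) for a signature field.

    Raises ValueError if the field does not hold a valid VBD signature.
    """
    # Test the longer signature first because "VBDBASE" is a prefix of it.
    for name, is_64bit in ((b"VBDBASE64", True), (b"VBDBASE", False)):
        if field.startswith(name) and len(field) > len(name):
            revision = chr(field[len(name)])  # eighth or tenth byte
            if revision in VALID_REVISIONS:
                return is_64bit, revision
    raise ValueError("not a valid VBD file signature")


def block_overhead(revision: str) -> int:
    """Return the per-block overhead in bytes for a revision letter."""
    if revision not in VALID_REVISIONS:
        raise ValueError("unknown revision")
    overhead = BASE_HEADER
    if revision in ("A", "B", "C"):
        overhead += CHECKSUM          # 'B' value is inferred, not stated
    if revision == "C":
        overhead += RECORD_LOCK
    return overhead
```

For example, parse_signature(b"VBDBASEC") reports a 32-bit file at revision 'C', and block_overhead("C") returns the 32 bytes quoted above.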

The version field is used to represent the VBD library version number. A version number change indicates that changes have been made to the code library used to create VBD files. Applications can use the version number to conditionally perform certain operations based on its value. This ensures backward compatibility with previous versions of an application.


Variable Block Headers

Variable block headers manage all the variable data blocks created inside the dynamic data area. Block headers are used to mark the start of a variable data block. Every time a block is allocated a block header is written to the file. Allocation works by writing the block header plus the block overhead and then reserving a specified number of bytes for the data that will be stored in the block. Revision 'A' files reserve four bytes at the end of the block to allow the application to store a block checksum. Revision 'C' files reserve twelve bytes following the block header for a record lock. After the space is allocated, the application is responsible for writing the data to the file starting at the address after the space reserved for the block overhead. Each block header contains four fields and will vary in length depending on whether 32-bit or 64-bit file offsets are used.

(1) Check word field - check word used to mark variable blocks
(2) Length field - block length including the object, block header, record lock, and CRC
(3) Status field - stores the status of dynamic data stored in the block
(4) Next deleted block field - stores a pointer to the next deleted block

The VB header "check word" field holds a 32-bit check word that is used for file integrity checks and to maintain synchronization between the VBD creator and the application.

The "length" field stores the length of the object plus the size of the block header, record lock, and the size of the block's CRC checksum value. This field is used to index the file block by block. By reading this value the application will always know where the next block is in sequence. The check word is used to ensure that the next block in sequence is a valid variable block.
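Block-by-block indexing can be sketched as follows. This is an illustration only: it assumes 32-bit offsets, a revision zero layout (a 16-byte header with no record lock or checksum), little-endian storage, and a placeholder check word value. None of these details are specified in this document, and the names HEADER, CHECK_WORD, make_block, and walk_blocks are hypothetical.

```python
import struct

# Check word, length, status, next deleted block: four 4-byte fields.
HEADER = struct.Struct("<Ii4si")
CHECK_WORD = 0xFEFEFEFE  # placeholder value, not taken from the VBD library


def make_block(payload: bytes, status: bytes = b"N") -> bytes:
    """Build one block: a header followed by the payload."""
    length = HEADER.size + len(payload)
    return HEADER.pack(CHECK_WORD, length, status.ljust(4, b"\0"), 0) + payload


def walk_blocks(heap: bytes):
    """Yield (offset, length, payload) for each block in the heap."""
    offset = 0
    while offset < len(heap):
        check, length, _status, _next = HEADER.unpack_from(heap, offset)
        if check != CHECK_WORD:
            raise ValueError(f"no valid block header at offset {offset}")
        if length < HEADER.size:
            raise ValueError(f"corrupt length field at offset {offset}")
        yield offset, length, heap[offset + HEADER.size:offset + length]
        offset += length  # the length field leads to the next block
```

A heap built from make_block(b"abc") + make_block(b"hello") yields two blocks, at offsets 0 and 19.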

The "status" field stores the status of dynamic data stored in this block. Only two bytes of the status field are used. The remaining two bytes of the status field are reserved for future use. The status of a variable data block is determined by one of three byte values stored in the first byte of the status field: 'N' for normal (ASCII 78), 'D' for deleted (ASCII 68), or 'R' for removed (ASCII 82). A block marked 'N' for normal is in use and cannot be reclaimed by any of the allocation routines. A block marked 'D' for deleted still holds valid data, but the block can be overwritten if needed. A block marked 'R' for removed no longer holds its data and can be overwritten if needed. Deleted blocks can easily be restored by marking them 'N' for normal. Once a block is removed, meaning that the original data set is no longer intact, it cannot be restored.
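The three status values can be decoded as in the sketch below. It assumes the status field is four bytes long and that the "first byte" is the first byte as stored in the file. The function names are hypothetical.

```python
STATUS_NAMES = {78: "normal", 68: "deleted", 82: "removed"}  # 'N', 'D', 'R'


def decode_status(status_field: bytes):
    """Return (state, control_byte) for a 4-byte status field."""
    if len(status_field) != 4:
        raise ValueError("status field must be 4 bytes")
    state = STATUS_NAMES.get(status_field[0])
    if state is None:
        raise ValueError("unknown block status")
    # The second byte carries the device control command, if any.
    return state, status_field[1]


def can_reclaim(status_field: bytes) -> bool:
    """Deleted and removed blocks may be overwritten by the allocator."""
    return decode_status(status_field)[0] in ("deleted", "removed")


def can_restore(status_field: bytes) -> bool:
    """Only deleted blocks still hold their original data."""
    return decode_status(status_field)[0] == "deleted"
```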

The second byte of the "status" field stores device control commands. Device control commands allow blocks to be added, changed, removed, and requested by local and remote devices. Device control commands are also used to signal events between local and remote devices. The use of a device control command signifies the use of a device block header. Device block headers contain synchronization and control information and are used to transfer blocks and to signal events between devices. When an application sends a block of raw data the header precedes the block and informs the receiver of the block size and the block status. After reception of a device header the receiver then waits for the block data and processes the data or signals/handles an event according to the status of the block.

The "next deleted block" field in the VB header stores a pointer to the next deleted or removed variable block, and is used only if this block has itself been deleted or removed. All deleted and removed blocks are maintained in a non-contiguous list. Each deleted or removed block in the list points to the next deleted or removed block in the list. The "free space" pointer in the VBD file header is used to store the file address where the head of the free space list is located.
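Walking the free space list can be modeled in memory as shown below. The sketch replaces file reads with a dictionary that maps a block address to the value of its "next deleted block" field, and it assumes that an address of 0 marks the end of the list. The sentinel value and the names are assumptions, not details taken from the VBD library.

```python
END_OF_LIST = 0  # assumed sentinel for "no further deleted block"


def walk_free_list(free_space: int, next_deleted: dict):
    """Yield the address of each block on the free space list, in order."""
    seen = set()
    address = free_space
    while address != END_OF_LIST:
        if address in seen:
            raise ValueError("free space list contains a cycle")
        seen.add(address)
        yield address
        address = next_deleted[address]


# Three deleted blocks scattered through the file; the list starts at 400.
links = {400: 120, 120: 900, 900: END_OF_LIST}
print(list(walk_free_list(400, links)))  # [400, 120, 900]
```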

In order to prevent VBD files from becoming extremely fragmented due to numerous deletions, blocks marked deleted or removed will be reused by the allocation routines. When a new variable block is allocated and the "free space" field is not empty, the allocation routine will walk through the free space list looking for a deleted or removed block of the size to be allocated. One of two methods can be used to prevent fragmentation: the best-fit method or the first-fit method. The best-fit method works by scanning the entire free space list until the best location to reuse a block is found. Best-fit values are calculated based on a percentage of the number of bytes requested. Any block more than 2.5 times larger than the number of bytes requested will not be reused. This ensures that smaller blocks will not use small portions of very large blocks and that all new blocks will use the maximum amount of space in any block that is reused. However, the best-fit method is very costly in terms of speed when the free space list is very large.
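A best-fit scan under these rules might look like the following sketch. It assumes that "best" means the smallest free block that is still large enough, and that block sizes and the request are measured in the same units. The text does not spell out either point, and the function name is hypothetical.

```python
def best_fit(free_blocks, requested, max_ratio=2.5):
    """Return the (address, size) of the best block to reuse, or None.

    free_blocks is an iterable of (address, size) pairs taken from the
    free space list. The whole list is scanned, which is why this method
    is slow when the list is long.
    """
    best = None
    for address, size in free_blocks:
        if size < requested:
            continue  # too small to hold the request
        if size > requested * max_ratio:
            continue  # too large: would waste most of the block
        if best is None or size < best[1]:
            best = (address, size)
    return best
```

With a request of 100 bytes and free blocks of 500, 120, 150, and 90 bytes, the 120-byte block is chosen; the 500-byte block is rejected because it exceeds 250 bytes.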

The first-fit method works by searching the free space list until the first block of the appropriate size is found. The first block large enough to hold two block headers including overhead plus the number of bytes requested with at least one byte left over will be reused. Several splits can occur if the blocks vary greatly in size. When a block is split, the unused portion of the block is assigned a new block header (marked removed) and placed back on the free space list. A small block can cause a very large block to be divided several times, leaving gaps of smaller and smaller blocks. The reclaim method used is application dependent. If all the blocks are more or less the same size, the first-fit method is more efficient in terms of speed. If the blocks vary greatly in size, the best-fit method is more efficient in preventing fragmentation.
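The first-fit rule and the resulting split can be sketched as below. The sketch reads the rule literally: a free block qualifies when its total length covers two lots of block overhead, the requested bytes, and one spare byte. The default overhead of 32 bytes is the revision 'C' figure. The function name and the return shape are hypothetical.

```python
def first_fit(free_blocks, requested, overhead=32):
    """Return ((address, length), (address, length)) or None.

    The first pair is the block handed to the caller and the second is
    the leftover portion, which would receive a new header marked
    removed and go back on the free space list.
    free_blocks is an iterable of (address, total_length) pairs.
    """
    for address, length in free_blocks:
        if length >= 2 * overhead + requested + 1:
            used = overhead + requested
            leftover = (address + used, length - used)
            return (address, used), leftover
    return None  # nothing suitable: the file would have to be extended


print(first_fit([(1000, 80), (2000, 400)], 100))
# ((2000, 132), (2132, 268))
```

The leftover block in this example is itself larger than the request, so a later 100-byte allocation would split it again, which is the repeated division described above.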


VBD CRC Checking

The VBD file functions use a 32-bit CRC checksum routine to detect any bit errors that occur during data storage. The CRC is based on the Ethernet polynomial 0x04C11DB7. A checksum is calculated when data is written to the VBD file; the calculation covers the block header, record lock, and the block data. The calculated checksum is then compared to a checksum of the data actually stored on disk. If the calculated checksum does not match the actual checksum, a bit error has occurred during data storage. All bit errors must be handled by the application since the type of data being stored is not known.
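For illustration, a table-driven CRC-32 over the Ethernet polynomial can be written as follows. The sketch assumes the conventional Ethernet parameters (bit-reflected processing, an initial value of 0xFFFFFFFF, and a final inversion), which make it agree with Python's zlib.crc32. The exact parameters used by the VBD library are not stated in this document, so a match with checksums stored in real VBD files should not be assumed.

```python
import zlib

# 0xEDB88320 is the bit-reversed form of the polynomial 0x04C11DB7.
REFLECTED_POLY = 0xEDB88320


def make_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ REFLECTED_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


CRC_TABLE = make_table()


def crc32(data: bytes, crc: int = 0) -> int:
    """Return the CRC-32 of data, optionally continuing a previous CRC."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def verify(data_read_back: bytes, checksum_at_write: int) -> bool:
    """Compare the checksum computed at write time with the data on disk."""
    return crc32(data_read_back) == checksum_at_write


assert crc32(b"123456789") == zlib.crc32(b"123456789")
```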

Revision 'A', 'B', and 'C' files reserve four bytes at the end of each block that can be used by an application to store a persistent 32-bit checksum with each block. Each time a new block is allocated space is reserved for the number of bytes requested plus four additional bytes. The application is responsible for reading and writing the block checksum since the type of data being stored is not known. The use of a block checksum is optional and does not have to be used to detect bit errors because of the built-in CRC checksum routine used each time any data is stored. Some applications require the use of persistent checksums to maintain the integrity of the file from one program invocation to the next.


VBD Advisory File/Record Locking

The VBD advisory locking scheme enables database files to operate in a multi-thread/multi-machine environment. Both file and record locking are achieved through the use of lock headers. Lock headers operate independently of the I/O subsystem, thus allowing platform independent file and record locking. Additionally, the use of advisory file and record locks ensures that the database engine will maintain maximum flexibility in both single user and multi-user applications. The absence of a mandatory locking protocol places responsibility for adherence to and enforcement of the locking sub-system on the application and not on the database engine itself.

The file lock header is used in revision 'B' and higher to allow applications to lock the entire file during a multi-threaded/multi-machine read or write operation. Record lock headers are used by an application in revision 'C' to lock a specific node during a multi-threaded/multi-machine read or write operation. Both the file and record lock headers consist of three fields and are designed specifically to work with the VBD lock primitives:

(1) Lock protect field - used to serialize access to the lock itself
(2) Read lock field - shared lock
(3) Write lock field - exclusive lock

The "lock protect" field is required to protect the lock values during multiple file access. The lock protect member is required in the VBD platform independent locking sub-system because file and record lock headers are manipulated by the database engine in the same manner as variable blocks.

The "read-lock" field is used in a file lock header to alert competing threads that the file cannot be altered by a write operation until this field is cleared. The "read-lock" field is used in a record lock to prevent competing threads from writing to a specific node until this field is cleared. Read locks are shared, meaning that more than one thread can hold a read lock. The VBD locking scheme will allow a total of 2^32 - 1 or 4,294,967,295 threads to read lock the entire file or a single record.

The "write-lock" field is used in a file lock header to alert competing threads that the file cannot be read or altered by another write operation until this field is cleared. The "write-lock" field is used in a record lock to prevent competing threads from reading or writing to a specific node until this field is cleared. Write locks are exclusive, meaning that only one thread can hold a write lock.
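The shared and exclusive rules can be modeled in memory as shown below. The sketch assumes three 32-bit fields, which is consistent with the twelve bytes reserved for a record lock and with the reader limit quoted above, and it assumes little-endian storage. The class and method names are hypothetical and are not the VBD lock primitives. Because the locks are advisory, nothing here prevents an application from ignoring the result.

```python
import struct

LOCK_HEADER = struct.Struct("<III")  # lock protect, read lock, write lock
MAX_READERS = 2**32 - 1


class LockHeader:
    """In-memory model of a file or record lock header."""

    def __init__(self):
        self.protect = 0     # non-zero while the header is being updated
        self.read_lock = 0   # number of threads holding a shared lock
        self.write_lock = 0  # non-zero while an exclusive lock is held

    def try_read_lock(self) -> bool:
        if self.protect or self.write_lock or self.read_lock >= MAX_READERS:
            return False
        self.read_lock += 1
        return True

    def try_write_lock(self) -> bool:
        if self.protect or self.write_lock or self.read_lock:
            return False
        self.write_lock = 1
        return True

    def release_read(self):
        if self.read_lock:
            self.read_lock -= 1

    def release_write(self):
        self.write_lock = 0

    def pack(self) -> bytes:
        """Serialize the header to its 12-byte stored form."""
        return LOCK_HEADER.pack(self.protect, self.read_lock, self.write_lock)
```

Two readers can hold the lock at once, a writer is refused until both release it, and a held write lock refuses new readers.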


VBD File Specifications

In VBD version 2, revision 'C', file addresses can be represented by 32-bit signed integers, or by 64-bit signed integers if large files are supported by the underlying operating system. If 32-bit offsets are used, the file is allowed to grow to a maximum size of 2.1 GB, with a maximum record size of 2.1 GB minus the total size of the block overhead and data. If 64-bit offsets are used, the file size is limited to the maximum single file size supported by the underlying operating system. The 64-bit model supports a maximum record size of 4.2 GB, which includes the block overhead plus the block data. NOTE: Currently 64-bit support is limited in the VBD library to HPUX 11.0 and Solaris 2.8 platforms, with HPUX 11.0 supporting a maximum single file size of 128 GB. As large file support and 64-bit operating systems become more prevalent, support will be included for all the platforms supported by the 32-bit database engine.


Platform Independent Data Types

Platform independent data types are implemented in the VBD library in order to achieve database interoperability in a heterogeneous environment. Essentially they allow VBD database files to overcome the big-endian and little-endian byte ordering problems encountered when writing integer values to a common database file or device accessed by several different types of hardware architectures. The term "endian" is used to describe the order in which multi-byte numbers are stored in the computer's memory. File addresses are multi-byte numbers used to point to specific locations in a disk file. The byte order in which multi-byte numbers are stored in a computer's memory is specific to each microprocessor. For example, the Intel x86 family is little-endian, meaning that the lowest-order byte is stored first. Hewlett-Packard's PA-RISC and Sun's SuperSPARC are big-endian, meaning the highest-order byte is stored first. The Silicon Graphics MIPS and IBM/Motorola Power PC processors support both little-endian and big-endian ordering (bi-endian). A file address written to a disk file in one system's native byte order will represent a different value when the file is read on a system that uses the opposite byte order.

Value         Big-Endian    Little-Endian
0x12345678    0x12345678    0x78563412
0x1234        0x1234        0x3412
0x5678        0x5678        0x7856
"ABC"         41 42 43      41 42 43

With big-endian ordering, the address of the multi-byte value is its most significant byte (its big end). With little-endian ordering, the address of the multi-byte value is its least significant byte (its little end). Character strings are stored in memory exactly as they appear regardless of the byte ordering used. In a data structure the order of bytes in memory will differ depending on the byte ordering and the particular data type used. If the contents of a data structure are written to disk or a device, the byte ordering will affect the data when it is moved to another platform.
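The effect of byte ordering on a 32-bit value can be demonstrated with Python's standard struct module, which lets a program choose the byte order explicitly. This is an illustration of the general problem and is not VBD library code.

```python
import struct
import sys

value = 0x12345678

big = struct.pack(">I", value)     # highest-order byte first
little = struct.pack("<I", value)  # lowest-order byte first

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12

# Bytes written little-endian and read back as big-endian give the
# swapped value.
misread = struct.unpack(">I", little)[0]
print(hex(misread))     # 0x78563412

# Character strings are unaffected by byte ordering.
print("ABC".encode("ascii").hex(" "))  # 41 42 43

# Byte order of the machine running this script: 'little' or 'big'.
print(sys.byteorder)
```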

In order to gain platform independence, VBD files must use their own representation of 32-bit and 64-bit signed integers for 32-bit and 64-bit file addresses. By manipulating the file addresses in memory before they are written to disk, it is possible to represent 32-bit and 64-bit signed integers independently of the operating system or hardware platform used. This will overcome any byte ordering problems encountered when writing file addresses to a database file shared by multiple platforms.
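One common way to obtain such a representation is to build the stored bytes with shifts and masks, so that the result never depends on the host's native byte order. The sketch below does this for 32-bit signed integers and stores the lowest-order byte first. That byte order and the function names are assumptions made for illustration; this document does not state the order used by the VBD library.

```python
def encode_int32(value: int) -> bytes:
    """Encode a signed 32-bit integer as 4 bytes, lowest-order byte first."""
    if not -2**31 <= value < 2**31:
        raise OverflowError("value does not fit in a signed 32-bit integer")
    unsigned = value & 0xFFFFFFFF  # two's complement bit pattern
    return bytes((unsigned >> shift) & 0xFF for shift in (0, 8, 16, 24))


def decode_int32(data: bytes) -> int:
    """Decode 4 bytes produced by encode_int32 back into an integer."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    unsigned = data[0] | data[1] << 8 | data[2] << 16 | data[3] << 24
    # Restore the sign if the top bit is set.
    return unsigned - 2**32 if unsigned & 0x80000000 else unsigned


stored = encode_int32(0x12345678)
print(stored.hex(" "))       # 78 56 34 12
print(decode_int32(stored))  # 305419896
```

The same pattern extends to 64-bit file addresses by using eight bytes and shifts up to 56 bits.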

The same scheme used to overcome the byte ordering problem encountered with 32 and 64 bit signed integers must also be applied to unsigned integers and floating point values if any of these data types are stored in the file. String values are represented in memory exactly as they appear. Since none of the bytes in a string are reordered in memory, they can be written directly from memory to disk regardless of the platform used to create them.


End Of Document