Data Dictionary

Overview

Broadly defined, a data dictionary is an organized description of data files. A data dictionary describes physical file attributes, such as record lengths and file types, and logical file attributes, such as field names and output masks.

Many software packages, including packages written in PRO/5, include their own data dictionaries. Using the BASIS Data Dictionary allows developers to use the power and convenience of a data dictionary without writing their own data dictionary utilities from scratch. The following are some of the advantages of describing an application's data in a BASIS Data Dictionary:

  • Using the data dictionary can make it much easier to upgrade when file structures change because the application code can be data independent. When a file format changes, it is not necessary to change IOLISTs and string references in numerous programs. Instead, the change is made once in the data dictionary. Modifying the data dictionary does not automatically rebuild data files, but the data dictionary does provide the ease of access necessary for a simple file update program to do the job.

  • The data dictionary provides immediate documentation of data structures, including file layouts and data types.

  • The BASIS ODBC Driver, TAOS/Views, and TAOS: The BBx Developer's Workbench access PRO/5 data files through the BASIS Data Dictionary. These products cannot be used unless a BASIS Data Dictionary has been set up.
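The data independence described in the first advantage above can be illustrated with a minimal sketch. It is written in Python rather than PRO/5 and does not use any BASIS API. The DICTIONARY structure, the read_record helper, the field names, and the fixed-width layout are all hypothetical and chosen only for illustration.

```python
# Hypothetical dictionary entry: alias -> ordered list of (field name, length).
# This is an illustrative stand-in, not the BASIS Data Dictionary format.
DICTIONARY = {
    "CUSTOMER": [("CUST_NUM", 4), ("NAME", 10), ("ZIP_CODE", 5)],
}

def read_record(alias, raw):
    """Slice a fixed-width record into named fields using the dictionary."""
    values = {}
    pos = 0
    for name, length in DICTIONARY[alias]:
        values[name] = raw[pos:pos + length].rstrip()
        pos += length
    return values

# Application code refers to fields by name only.
row = read_record("CUSTOMER", "0001ACME CO   92101")
print(row["NAME"])
```

If the NAME field were later widened, only the DICTIONARY entry would change in this sketch; code that reads row["NAME"] would stay the same. The existing data would still have to be rewritten by a separate update program.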

For information on how to set up data dictionaries, refer to Data Dictionary - Overview.

Concepts and Terms

The following sections cover data dictionary terms and concepts used in this manual.

Alias

All files are referenced in terms of an alias. The alias may (but does not necessarily) match the file name. For example, in the Demonstration Database (described in Demonstration Database), the alias CUSTOMER is used to refer to the data file (DATA)cust.dat. An alias may contain up to 16 characters.

Field

The term "field" can have two different interpretations, depending on the part of the file definition being specified:

  • Usually, the term "field" indicates a series of bytes that have significance when taken as a group. For example, the first four bytes of a record may be the CUSTOMER.CUST_NUM field. A field separator (typically a linefeed character) does not have to follow this field. This type of field is called a logical field. A logical field definition is based on the meaning of the data rather than on how it is stored in the record.

  • When defining the indices (keys) to the file, it may be necessary to specify a physical field number, offset, and length. In this context, the term field refers to blocks of data. These are called physical fields because they are described by the physical structure of the data rather than by its meaning.

It is not necessary to know the difference between logical fields and physical fields when defining a traditional record format where all fields are separated by field separators.
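The distinction between the two kinds of field can be sketched as follows. This is a Python illustration, not PRO/5 code. The sample record, the helper names, and the choice to address a logical field by a 1-based offset and a length are assumptions made for the example.

```python
FIELD_SEP = "\n"  # linefeed field separator, as described above

# Hypothetical record: two physical fields, each ended by a linefeed.
# The first four bytes of the first block hold the customer number.
record = "0001ACME SUPPLY\n92101\n"

def physical_fields(rec):
    """Split a record into physical fields (blocks ended by the separator)."""
    parts = rec.split(FIELD_SEP)
    # Drop the empty string that follows a trailing separator.
    if parts and parts[-1] == "":
        parts.pop()
    return parts

def logical_field(rec, offset, length):
    """Return a logical field given a 1-based byte offset and a length."""
    return rec[offset - 1:offset - 1 + length]

blocks = physical_fields(record)        # two physical fields
cust_num = logical_field(record, 1, 4)  # a logical field with no separator after it
```

Here the logical field CUST_NUM sits inside the first physical field and is not followed by a separator. In a traditional layout, where every field is followed by a separator, each physical field would hold exactly one logical field and the two views would coincide.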

Index

An index is an access method (or key) to the data in a file. When defining the data dictionary for an existing database that uses DIRECT or single-key MKEYED files, define only one index for each file. When defining a new data file, the DD Editor assumes a multi-keyed (MKEYED) file, permitting the definition of up to 16 indices for the file. MKEYED files must have one unique index defined as the primary key. The first index defined is assumed to be the primary key.

The most powerful and flexible method of setting up a database is to define the files as MKEYED. The application may be significantly enhanced if indices are defined for all the fields needed in the application for search and sort routines. For example, a customer file might have a primary index for CUSTOMER_NUMBER, as well as additional indices for CUSTOMER_NAME, ZIP_CODE, PHONE, and CUSTOMER_TYPE.
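As a rough illustration of why several indices help search and sort routines, the following Python sketch keeps one sorted key list per index over a small in-memory table. It does not model how MKEYED files are stored. The sample rows and the build_index helper are hypothetical, and allowing duplicate keys in the secondary indices is an assumption of the sketch.

```python
# Hypothetical customer rows; the field names follow the example above.
customers = [
    {"CUSTOMER_NUMBER": "0002", "CUSTOMER_NAME": "BAKER", "ZIP_CODE": "87110"},
    {"CUSTOMER_NUMBER": "0001", "CUSTOMER_NAME": "YOUNG", "ZIP_CODE": "87110"},
    {"CUSTOMER_NUMBER": "0003", "CUSTOMER_NAME": "ADAMS", "ZIP_CODE": "10001"},
]

def build_index(rows, field):
    """Return (key, row position) pairs sorted by key."""
    return sorted((row[field], pos) for pos, row in enumerate(rows))

primary = build_index(customers, "CUSTOMER_NUMBER")  # first index: primary key
by_name = build_index(customers, "CUSTOMER_NAME")
by_zip = build_index(customers, "ZIP_CODE")

# The primary key must be unique.
primary_is_unique = len({key for key, _ in primary}) == len(primary)

# Reading the rows in name order needs no separate sort step.
names_in_order = [customers[pos]["CUSTOMER_NAME"] for _, pos in by_name]
```

Each additional index gives the application another ready-made ordering of the same records.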

Indices can be defined based on fields in the data file, on specific physical segments of bytes within the record, or both. Usually, an index will be based on a single field in the record and have the same name as that field. Indices that meet these criteria are called dependent indices.

Segment

Indices consist of one or more segments. A segment is either a named field (called a dependent segment) or a specific substring from within the record (called an independent segment). For example, a CUSTOMER_NAME index might be made up of two concatenated dependent segments, LAST_NAME and FIRST_NAME. Alternatively, an index may be made up of the first two bytes of the first physical field along with the first two bytes of the second physical field ([1:1:2]+[2:1:2]).
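The two kinds of segment can be sketched in Python as follows. The sketch reads [1:1:2] as physical field number, offset, and length, with the field number and offset counted from 1. That reading is inferred from the description above. The helper names and sample data are hypothetical. Segments are simply concatenated here, and any padding a real key might require is ignored.

```python
FIELD_SEP = "\n"  # linefeed field separator

def independent_segment(record, field_no, offset, length):
    """Bytes taken from a physical field; field_no and offset are 1-based."""
    field = record.split(FIELD_SEP)[field_no - 1]
    return field[offset - 1:offset - 1 + length]

def dependent_segment(named_fields, name):
    """The whole value of a named (logical) field."""
    return named_fields[name]

# Hypothetical record with two physical fields, and its named fields.
record = "SMITH\nJOHN\n"
named = {"LAST_NAME": "SMITH", "FIRST_NAME": "JOHN"}

# CUSTOMER_NAME index: two concatenated dependent segments.
name_key = (dependent_segment(named, "LAST_NAME")
            + dependent_segment(named, "FIRST_NAME"))

# [1:1:2]+[2:1:2]: two concatenated independent segments.
mixed_key = (independent_segment(record, 1, 1, 2)
             + independent_segment(record, 2, 1, 2))
```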