SELECT Verb

Syntax

SELECT (channelno{,MODE=string}{,ERR=lineref})template
{:fieldname{,fieldname...}} FROM fileid {WHERE expr} {SORTBY expr} {LIMIT first, count}

Description

The SELECT verb opens a channel in read-only mode and provides a filter to the records in an MKEYED file. (Although SELECT will open a channel for other types of files, the results are unpredictable.)

Note: The SELECT verb can be used with all file types. However, the SELECT verb does not work with delimited string files. When using string files, the record template (specified in the code for the SELECT verb and in the data dictionary for SQL) requires that all fields must either be fixed length or variable length padded.

Parameter

Description

channelno

Valid channel number. Access is limited to the READ RECORD verb.

MODE= string

Mode(s) that qualify the OPEN. For example, MODE="CISAM" is required to SELECT records from a C-ISAM-type file. The SKIPLOCKR=1 mode skips the records currently locked by another user. On multitasking and network systems, a separate process reads and sorts the records. This process returns qualifying records to the main process by means of a buffered communications scheme. A special MODE= string, "BLOCK=int", allows the number of records held in the buffer to be specified. PRO/5 assumes a reasonable default block size if this mode is not used.

ERR=lineref

Branch to be taken if an error occurs during execution.

template

DIMed (dimensioned) string template containing the record structure of the file to be read. It can be found starting at byte 65 of the FIN() data on the SELECTed channel.

fieldname

Names of template fields to be returned upon execution of a READ RECORD. If data is required from a limited number of fields, this parameter can reduce the I/O processing load. If this parameter is unspecified, data from all fields in the template is returned.

FROM fileid

Name of the MKEYED file to be opened and filtered.

WHERE expr

Conditional expression, which is evaluated on each record read from the file. If expr evaluates to TRUE, the record is made available for reading on channelno. Within expr, only the PRO/5 verbs and functions listed below are available:

String ADJN(), AND(), CVS(), DATE, FILL(), IOR(), NOT(), STR(), TBL(), and XOR()
Numeric ABS(), ASC(), ATN(), COS(), EPT(), FPT(), INT(), JUL(), LEN(), LOG(), MASK(), MAX(), MIN(), MOD(), NUM(), POS(), SGN(), SIN(), and SQR()

A function that does not appear in the above list, even if user-defined, may be used if it does not contain a reference to the template containing the record structure. In addition, all PRO/5 operators (arithmetic, relational, logical AND and OR, and string concatenation) are allowed, and parentheses may be used to control the order of evaluation. All arithmetic operations are done in the precision active at the time the SELECT operation was started. Subsequent changes to the active mathematical precision do not affect channels already opened by SELECT. The MASK() function should be limited to fully anchored regular expressions.

SORTBY expr

Sort expression. The sort expression is evaluated on each record read from the file and should contain a reference to at least one field defined in template. The value of the expression is used as a key to sort the selected records. The generated key must not be variable in length. If needed, the FILL() function may be used to ensure a constant-length key.

The expression must be a string expression and is limited to the same PRO/5 functions as the WHERE expr. The ADJN() can be used to convert numeric quantities to 8-byte strings to obtain the desired numerical order.

If SORTBY is not used, the order the records are returned in is not guaranteed. Sorting may require additional processing time. PRO/5 sorts in memory, if possible. Otherwise, it uses the temporary, single-keyed MKEYED files in the directory named by the TMPDIR environment variable. If TMPDIR does not exist, the current directory is used.

The SELECT verb recognizes most cases in which the SORTBY expr matches an MKEYED key chain. The chain is used to read the records in order, rather than reading the records in default order and then sorting the selected data. Omitting the SORT expr significantly enhances access speed. For SELECT to recognize that the SORTBY expression matches a key chain, key definitions must correspond to the field positions and widths of the template fields in the SORTBY expr. The SELECT verb correlates the SORTBY expr to the key chain for only simple SORTBY expressions. String concatenation, the ADJN() function, unary minus, and the NOT() function are permitted. Any other function prevents the optimization.

The following table illustrates examples of SORTBY expressions that will be optimized:

SORTBY T.NAME$

Matches the start of the first key.

SORTBY NOT(T.NAME)

Steps through the first chain backward

SORTBY T.NAME+NOT(ADJN(T.B))

Matches the key exactly

SORTBY T.NAME+ADJN(-T.B)

Matches the key exactly

SORTBY NOT(T.NAME$)+ADJN(T.B)

Steps through the first chain backward.

LIMIT first, num

Using the LIMIT clause will specify the offset and the maximum number of records to be read from the channel. When this limit is reached, a READ RECORD on the channel returns an end-of-file error (!ERROR=2).

Optimizing SELECT Operations

SELECT optimizations take advantage of a query on a particular file that may match one of the file's keys according to the following conditions:

  • The file open for selecting must be a multikey MKEYED file.

  • The WHERE expression must involve the comparison of a template field with a string using one of the following operators: =, <, <=, >, or >=.

  • The template field type must be a character string (type "C").

  • Each template field must match a key segment in terms of key number, offset, and length as they are defined in the FIN() information of the file.

Select Optimization Example

The following statement searches directly for a record matching "Jones" and begins returning records, stopping after a record matching "Kilgore" is found:

SELECT(1)A$ WHERE A.NAME$ >= "Jones" AND A.NAME$ <= "Kilgore"

The following statement on a file with the defined segments "[last name]+[first name]" searches directly for the record that matches "Crockett"+"Davey."

SELECT(1)A$ WHERE A.FIRST$ >= "Davey" AND A.LAST$ >= "Crockett"

If a particular key has more than one segment, then multiple template fields in the expression are matched to as many segments as possible.

This may be a narrower bracket condition than simply seeking to the record that matches "Crockett" alone.

It is also possible that within a WHERE expression, all of the template fields of the expression may match to more than one key. If so, SELECT uses a set of rules to choose the best key.

WHERE Optimization Precedence

The best keys for WHERE optimization are chosen with the following precedence:

  1. A complete key, where template fields match each key segment.

  2. Equality expression on a key.

  3. Head and tail brackets on a key. (e.g.: field$ > "str1" AND field$ < "str2")

  4. Head or tail bracket on a key. (e.g.: field$ > "str1")

  5. Primary key.

  6. Unique alternate key.

  7. Non-unique alternate key.

  8. Largest number of segments in a key.

The greatest precedence is given to a key that is "complete." This implies that template field(s) from the expression match each segment of the key. However, bracketing can still be done, even if matches are only found for the first segment or first few segments of a key.

A potentially optimizable WHERE expression, in combination with a potentially optimizable SORTBY expression, requires some shifts in priority.

The problem is that the WHERE and SORTBY expressions may match entirely different keys. In this situation, the WHERE expression receives priority. This means that a subsequent sorting of the records that matched the WHERE expression is required before any records are returned.

However, if a key that matches the SORTBY expression also matches any part of the WHERE expression, that key is chosen as the best key for reading the file.

These optimizations can be disabled selectively with the following special mode strings:

Optimizations Permitted

Mode String

SORTBY only

SELECT (1,MODE="OPT=NOWHERE")...

WHERE only

SELECT (1,MODE="OPT=NOSORT")...

None

SELECT (1,MODE="OPT=NONE")...

These strings can be used in special situations where a key in the SORTBY expression should be used preferentially over one in the WHERE expression, or in order to determine if the optimizations are improving search times.

To accommodate convenient SELECT usage, if either the WHERE or SORTBY expression is CPL(strexpr), the expression in strexpr is evaluated at run time and used as the WHERE or SORTBY expression. The CPL() itself is not evaluated and is used only as a signal to SELECT that the strexpr contains the expression to evaluate. The table below provides additional information on subexpressions in WHERE and SORTBY expressions.

Subexpressions not containing a reference to template must evaluate to a simple string or numeric. Also, these subexpressions may not contain array indices composed of fields from template. Substring references containing fields from template are permitted.

For example, assuming that the template is REC$:

WHERE REC.CODE$(SUB)>"A"

Valid: Substring always O.K.

WHERE ARRAY[I]

Valid: Array not indexed by template field.

WHERE ARRAY[REC.CODE]

Invalid: Array indexed by template field.

WHERE ARRAY[T.NUM]

Valid: Array indexed by field from a different template.

WHERE REC.ARRAY[REC.CODE]

Valid: Array and index are both from template.

WHERE FATTR(A$)>"A"

Valid: Improper function does not refer to template.

WHERE FATTR(REC$)>"A"

Invalid: Improper function refers to template.

Once the SELECT process is started, the channel may be accessed using the READ RECORD verb, with a string template as the input variable. If a list was used, this template must match the list of fieldnames specified in the SELECT statement; otherwise, it must be equivalent to the original template.

Guidelines for Using the SELECT Verb with the Data Server

Guidelines for Using the SELECT Verb with the Data Server include:

  • Specify the full path name of the remote file, complete with the name of the network host. Be sure to include the network host name in the FROM clause. This enables the remote system (with the PRO/5 Data Server) to perform all of the necessary I/O operations locally and return only the requested data over the network connection to reduce network traffic and improve performance.

  • Relying on a PREFIX to locate the remote file causes the SELECT to be performed locally and places heavier demands on the network and increases response time. For example:

Preferred:

SELECT (1) REC$ FROM "/<server>remotefile"

Discouraged:

SELECT (1) REC$ FROM "remotefile"

  • On multitasking systems, PRO/5 allocates a background user slot for each active SELECT.

  • Under the UNIX operating system, the SELECT verb currently requires two semaphores and one shared-memory segment for each active SELECT.

  • Set the TMPDIR environment variable in the shell of the process that is performing the SELECT. If the PRO/5 client is performing the SELECT, TMPDIR should be set in its shell. If the SELECT is to be performed via the Data Server, TMPDIR should be set before the Data Server is invoked. If the Data Server is invoked via a startup script, the script should set TMPDIR before invoking the Data Server.

Example

In the following example, records in the master file that have a balance field greater than the limit field are selected from the file and sorted in order of increasing balance. The MODE="BLOCK=200" sets up a 200-record buffer between the SELECT process and the main PRO/5 process. The READ RECORD in statement 2010 reads the sorted records from the buffer, returning the fields as specified in the SELECT statement, by way of the SELREC$ template.

The parentheses in the WHERE expr are for readability only and are not required.

1000 DIM REC$:"CUST_NUM:C(7),NAME:C(35),ADDRESS1:C(35) 1000:,ADDRESS2:C(35),CITY:C(35),STATE:C(2),ZIP:N(5),
1000:PHONE:C(10),OPEN_DATE:C(8),LAST_PAY_DATE:C(8),
1000:LIMIT:N(7),BALANCE:N(7)"
1010 SELECT(ARMAST,MODE="BLOCK=200")REC$:REC.CUST_
1010:NUM$,REC.NAME$,REC.PHONE$,REC.LIMIT,REC.BALANCE
1010:FROM ARMASTER$ WHERE (REC.BALANCE>REC.LIMIT)
1010:SORTBY ADJN(REC.BALANCE)

2000 FIN$=FIN(ARMAST)
2005 DIM SELREC$:FIN$(65)
2010 READ RECORD(ARMAST,END=4000)SELREC$

See Also

LIMIT Clause

Verbs - Alphabetical Listing