MASK() Function - Scan String for Matching Substring

For this topic's original documentation, see the MASK() Function - Scan String for Matching Substring.

BBj-Specific information

Syntax

MASK(str1{,str2}{,ERR=lineref})

Description

The MASK() function scans a string for a matching substring. Unlike the POS() function, MASK() accepts pattern-matching specifications in str2, similar to those of the UNIX "grep" command.

BBj uses a new regular expression engine that supports most of the masking features of Perl 5.

Parameter      Description

str1           String to be scanned.
str2           Mask describing the substring to be located.
ERR=lineref    Branch to be taken if an error occurs during execution.

The value returned is the starting position in str1 of the substring fitting the mask. A returned value of zero indicates that no match was found. If a match was made, the TCB(16) function returns the length of the matched string.

If str2 is not given, the prior mask is reused. This is important to program performance because a certain amount of time is spent "compiling" the mask string. A program that uses the same mask repeatedly should specify the mask string only once.
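The return value and TCB(16) together identify the matched text. The following is a minimal sketch of that idea, not a definitive implementation; the sample strings and the variable names A$, B$, P, and L are hypothetical.

```bbj
REM Locate a run of digits, then extract it using the match length.
A$="Order 12345 shipped"
P=MASK(A$,"[0-9]+")
L=TCB(16)
IF P>0 THEN PRINT "Matched "+A$(P,L)+" at position "+STR(P)

REM Omitting str2 reuses the mask established by the previous call.
B$="Invoice 678"
P=MASK(B$)
L=TCB(16)
IF P>0 THEN PRINT "Matched "+B$(P,L)+" at position "+STR(P)
```

Saving TCB(16) immediately after each MASK() call keeps the length paired with the call that produced it.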

Most characters in the mask string must simply match themselves. For example, a mask of "abc" only matches the string "abc". In this case, MASK(A$,"abc") and POS("abc"=A$) return the same result.
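As a small illustration of the equivalence described above, the sketch below scans a made-up string with a mask that contains no operators. Both statements should print the same position.

```bbj
REM The sample string is hypothetical.
A$="xyzabcxyz"
PRINT MASK(A$,"abc")
PRINT POS("abc"=A$)
```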

Mask String Operators

Operator  Name           Description

?         question mark  Indicates that the preceding item is optional. For example, a mask of "ab?c" makes the "b" optional and matches "abc" or "ac". Use parentheses for grouping: a mask of "(ab)?c" treats the sequence "ab" as optional and matches either "abc" or just "c".
*         asterisk       Zero or more occurrences of the preceding item.
+         plus           One or more occurrences of the preceding item.
|         pipe           Alternate choices. The "|" is lower in precedence than the above operators.
.         period         Matches any character.
[]        brackets       Match any one of the contained characters.
-         hyphen         Specifies a range within a bracketed character list.
^         caret          Inside brackets, excludes the listed characters. At the start of the mask, matches only at the beginning of the target string.
\         backslash      Forces the next character in the mask to be taken literally. For example, a backslash must be used to match a plus sign.
$         dollar sign    Placed at the end of the mask. Matches only at the end of the target string.
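The effect of the backslash is easiest to see by running the same target string against an escaped and an unescaped mask. This is a sketch with made-up sample strings.

```bbj
REM "+" is an operator here: one or more a's followed by c.
PRINT MASK("aaac","a+c")
REM The escaped plus is literal, so "aaac" does not fit the mask.
PRINT MASK("aaac","a\+c")
REM The literal text a+c does fit the escaped mask.
PRINT MASK("a+c","a\+c")
```

A result of zero from the second statement indicates that no match was found.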

Within a regular expression, the characters in the following table have special meanings. The operators unique to BBj are identified.

Operator Type            Operator   Matches                                           Notes

Positional               ^          Beginning of a line.
                         $          End of a line.
                         \A         Start of the entire string.                       New to BBj
                         \Z         End of the entire string.                         New to BBj
Single Character         .          Any single character except $0A$.
                         \d         Any decimal digit.                                New to BBj
                         \D         Any non-digit.                                    New to BBj
                         \n         Newline ($0A$).                                   New to BBj
                         \r         Return ($0D$).                                    New to BBj
                         \s         Any whitespace character.                         New to BBj
                         \S         Any non-whitespace character.                     New to BBj
                         \t         A tab character ($09$).                           New to BBj
                         \w         Any word character (alphanumeric or _).           New to BBj
                         \W         Any non-word character (not alphanumeric or _).   New to BBj
                         \0         Null ($00$).                                      New to BBj
                         \0xxx      Octal (e.g. \007).                                New to BBj
                         \xXX       Hex (e.g. \xFF).                                  New to BBj
                         \cX        Control character (e.g. \cC = $03$).              New to BBj
                         \Q         Quote (disable) special characters until \E.      New to BBj
                         \E         Ends quoting of special characters.               New to BBj
                         \x         Character x, if x is not one of the escape sequences listed above.
Character Class          [abc]      Any character in the set a, b or c.
                         [^abc]     Any character not in the set a, b or c.
                         [a-z]      Any character in the range a to z, inclusive.
Grouping                 (abc)      Treats the enclosed characters as a group.
Branching (Alternation)  a|b        Matches either a or b.
Repeating                ?          Matches the preceding expression 0 or 1 time.
                         *          Matches the preceding expression 0 or more times.
                         +          Matches the preceding expression 1 or more times.
                         {num}      Matches exactly num repetitions of the preceding expression.
                         {min,max}  Matches at least min, but no more than max, repetitions of the preceding expression.
                         {min,}     Matches min or more repetitions of the preceding expression.

Repeat matching is normally "greedy": each repeat specifier takes the largest possible match that still allows the remainder of the mask to match. If a repeat specifier is immediately followed by a ?, it instead stops at the minimal number of repetitions that can complete the rest of the match.
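The difference between greedy and minimal repetition shows up in the length reported by TCB(16). The sketch below uses made-up sample strings and assumes the repeat specifiers behave as described in the table above.

```bbj
A$="<b>bold</b>"
REM Greedy: ".*" runs to the last ">", so the whole string fits.
PRINT STR(MASK(A$,"<.*>"))+" "+STR(TCB(16))
REM Minimal: ".*?" stops at the first ">", so only "<b>" fits.
PRINT STR(MASK(A$,"<.*?>"))+" "+STR(TCB(16))

REM {num} repeats: three digits, a hyphen, then four digits.
B$="tel 555-1234"
P=MASK(B$,"\d{3}-\d{4}")
IF P>0 THEN PRINT B$(P,TCB(16))
```

Both MASK() calls on A$ report the same starting position; only the match length differs.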

Operator Examples

String Example        Matches

"ab*c"                Any string beginning with "a", ending with "c", and having any number of b's in between.
"ab+c"                Any string beginning with "a", ending with "c", and having at least one "b" in between.
"(ab)*c"              Any string with any number of occurrences of "ab" finally ending with a "c". For example, "c", "abc", "ababababc".
"abc|def"             Either "abc" or "def".
"a(b|c)"              Either "ab" or "ac".
"a(b|c)*"             A string beginning with "a", followed by any number of "b" and "c" (in any combination).
"a.c"                 A string beginning with "a", ending with "c", and with any single character in the middle.
"a.*c"                Any string beginning with "a" and ending with "c".
"[abc]"               Either "a", "b", or "c".
"[abc]+"              Any string of one or more characters containing only a's, b's, and c's.
"[abc][0123456789]"   Any string consisting of an "a", "b", or "c" followed by any digit.
"[0-9]"               Any single digit 0-9. This is the same as "[0123456789]".
"[a-zA-Z]"            Any single upper- or lower-case letter.
"[^0-9]"              Any character that is not a digit.
"^abc"                "abc" only if it occurs at the beginning of the string.
"a\+c"                Only "a+c".
"a+c"                 One or more a's followed by a "c".
"abc$"                "abc" only if it occurs at the end of the string.
"^a[0-9]*c$"          A string that begins with an "a", ends with a "c", and contains only digits in between.
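Anchoring a mask at both ends turns MASK() into a whole-string validity check. The following sketch applies the last mask in the table to three made-up strings; the variable name M$ is hypothetical.

```bbj
M$="^a[0-9]*c$"
REM Fits the mask: "a", digits only, then "c".
PRINT MASK("a123c",M$)
REM Does not fit: "x" is not a digit.
PRINT MASK("a12x3c",M$)
REM Does not fit: the string does not begin with "a".
PRINT MASK("xa123c",M$)
```

A nonzero result means the entire string fits the mask; zero means it does not.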

Example 1

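The following prints the starting position of the match followed by its length. The mask fits the entire string, so the expected output is 1 10.

1000 PRINT STR(MASK("config.bbx",".*bbx"))+" "+STR(TCB(16))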

Example 2

The following displays all file names matching a particular pattern:

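1000 INPUT "MASK: ",M$
1010 LET A=MASK("",M$,ERR=1000); REM - establish mask; on error, prompt again
1020 LET CHAN=UNT; OPEN (CHAN)DIR(""); REM - open the current directory
1030 READ RECORD (CHAN,END=1200)F$; REM - read the next file name
1035 REM - reuse the established mask; print names that match from position 1
1040 IF MASK(F$)=1 THEN PRINT F$
1050 GOTO 1030
1200 END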