MASK() Function - Scan String for Matching Substring
For this topic's original documentation, see the MASK() Function - Scan String for Matching Substring.
BBj-Specific information
Syntax
MASK(str1{,str2}{,ERR=lineref})
Description
The MASK() function scans a string for a matching substring. Unlike the
POS() function, MASK() treats str2 as a pattern: it may contain pattern
matching specifications similar to those of the UNIX operating system
"grep" command.
BBj uses a new regular expression engine that supports most of the masking
features of Perl 5.
| Parameter | Description |
|---|---|
| str1 | String to be scanned. |
| str2 | Substring to be located. |
| ERR=lineref | Branch to be taken if an error occurs during execution. |
The value returned is the starting position in str1 of the substring
fitting the mask. A returned value of zero indicates that no match was
found. If a match was made, the TCB(16) function returns the length of
the string matched. If str2 is not given, the prior mask is reused. This
matters for program performance because a certain amount of time is spent
"compiling" the mask string. When the same mask is applied repeatedly,
specify the mask string only once and omit str2 on the later calls.
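As a minimal sketch of the return value, TCB(16), and mask reuse described above (the sample strings and the variable names P and N$ are illustrative assumptions, not taken from the original documentation):

```bbj
REM Establish the mask once with a throwaway call on an empty string
LET P=MASK("","[0-9]+")
REM Later calls omit str2, so the prior mask is reused
LET N$="order 1234"
LET P=MASK(N$)
IF P>0 THEN PRINT "match at ",P," length ",TCB(16) ELSE PRINT "no match"
LET N$="no digits here"
LET P=MASK(N$)
IF P>0 THEN PRINT "match at ",P," length ",TCB(16) ELSE PRINT "no match"
```

The result of MASK() is stored in P before TCB(16) is read, so the reported length belongs to the match that was just made.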
Most characters in the mask string must simply match themselves. For example,
a mask of "abc" only matches the string "abc". In
this case, MASK(A$,"abc") and POS("abc"=A$) return
the same result.
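The comparison with POS() can be sketched as follows. The target string A$ is an illustrative assumption; the second pair of calls shows where the two functions are expected to diverge once the mask contains an operator such as the period:

```bbj
LET A$="xyzabcxyz"
REM Literal mask: both calls search for the plain text "abc"
PRINT MASK(A$,"abc")
PRINT POS("abc"=A$)
REM Mask with an operator: MASK() treats "." as "any character",
REM while POS() searches for the literal three characters a.c
PRINT MASK(A$,"a.c")
PRINT POS("a.c"=A$)
```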
Mask String Operators
| Operator | Operator name | Description |
|---|---|---|
| ? | question mark | Indicates that the preceding item is optional. For example, a mask of "ab?c" indicates that the "b" is optional. This mask matches "abc" or "ac". Use parentheses for grouping. A mask of "(ab)?c" considers the sequence "ab" optional and matches either "abc" or just "c". |
| * | asterisk | Zero or more occurrences of the preceding item. |
| + | plus | One or more occurrences of the preceding item. |
| \| | pipe | Alternate choices. The pipe is lower in precedence than the operators above. |
| . | period | Matches any character. |
| [] | brackets | Matches any one of the contained characters. |
| - | hyphen | Specifies a range within a bracketed character list. |
| ^ | caret | Inside brackets, excludes the listed characters. At the start of the mask, matches only at the beginning of the target string. |
| \ | backslash | Forces the next character in the mask to be taken literally. For example, a backslash must be used to match a plus sign. |
| $ | dollar sign | At the end of the mask, matches only at the end of the target string. |
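A few of the operators from the table above, exercised in a short sketch. The target strings are illustrative assumptions; each PRINT shows the position returned by MASK(), with zero meaning no match:

```bbj
REM Optional item: "ab?c" is described as matching "abc" or "ac"
PRINT MASK("ac","ab?c")
PRINT MASK("abc","ab?c")
REM Grouping: "(ab)?c" makes the whole sequence "ab" optional
PRINT MASK("c","(ab)?c")
REM Backslash: "\+" stands for a literal plus sign
PRINT MASK("1+2","1\+2")
REM Anchors: ^ ties the match to the start, $ to the end of the target
PRINT MASK("xabc","^abc")
PRINT MASK("xabc","abc$")
```

In the last two lines the target begins with "x", so the caret-anchored mask should report no match, while the dollar-anchored mask can still match the trailing "abc".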
Within a regular expression, the characters in the following table have special meanings. The operators unique to BBj are identified.
| Operator Type | Operator | Matches | Notes |
|---|---|---|---|
| Positional | ^ | Beginning of a line. | |
| | $ | End of a line. | |
| | \A | Start of the entire string. | New to BBj |
| | \Z | End of the entire string. | New to BBj |
| Single Character | . | Any single character except $0A$. | |
| | \d | Any decimal digit. | New to BBj |
| | \D | Any non-digit. | New to BBj |
| | \n | Newline ($0A$) | New to BBj |
| | \r | Return ($0D$) | New to BBj |
| | \s | Any whitespace character. | New to BBj |
| | \S | Any non-whitespace character. | New to BBj |
| | \t | A tab character ($09$). | New to BBj |
| | \w | Any word character (alphanumeric or _). | New to BBj |
| | \W | Any non-word character (anything other than alphanumeric or _). | New to BBj |
| | \0 | Null ($00$) | New to BBj |
| | \0xxx | Octal (e.g. \007) | New to BBj |
| | \xXX | Hex (e.g. \xFF) | New to BBj |
| | \cX | Control character (e.g. \cC = $03$) | New to BBj |
| | \Q | Quote (disable) special characters until \E | New to BBj |
| | \E | Ends quoting of special characters | New to BBj |
| | \x | The character x itself, if \x is not one of the escape sequences listed above. | |
| Character Class | [abc] | Any character in the set a, b or c. | |
| | [^abc] | Any character not in the set a, b or c. | |
| | [a-z] | Any character in the range a to z, inclusive. | |
| Grouping | (abc) | Groups the enclosed characters so that they are treated as a single item. | |
| Branching (Alternation) | a\|b | Matches either a or b. | |
| Repeating | | Matching is normally "greedy": a repeat specifier takes the largest possible match that still allows the remainder of the mask to match. If a repeat specifier is immediately followed by a ?, it instead stops at the minimal number of repetitions that can complete the rest of the match. | |
| | ? | Matches the preceding expression 0 or 1 time. | |
| | * | Matches the preceding expression 0 or more times. | |
| | + | Matches the preceding expression 1 or more times. | |
| | {num} | Matches exactly num repetitions of the preceding expression. | |
| | {min,max} | Matches at least min, but no more than max, repetitions of the preceding expression. | |
| | {min,} | Matches min or more repetitions of the preceding expression. | |
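The repeat rules above can be explored with a short sketch that compares a greedy and a minimal repeat on the same target and then uses a counted repetition with \d. The target strings and the names T$ and P are illustrative assumptions:

```bbj
REM Greedy vs. minimal repetition on the same target string
LET T$="<a><b>"
LET P=MASK(T$,"<.*>")
PRINT "greedy:  pos=",P," len=",TCB(16)
LET P=MASK(T$,"<.*?>")
PRINT "minimal: pos=",P," len=",TCB(16)
REM Counted repetition with the \d class: exactly four digits
LET P=MASK("year 2024","\d{4}")
PRINT "digits:  pos=",P," len=",TCB(16)
```

Under the greedy rule the first mask should span all of T$, while the second should stop at the first ">". The lengths reported by TCB(16) make the difference visible.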
Operator Examples
| String Example | Matches |
|---|---|
| "ab*c" | Any string beginning with "a", ending with "c", and having any number of b's (including none) in between. |
| "ab+c" | Any string beginning with "a", ending with "c", and having at least one "b" in between. |
| "(ab)*c" | Any string with any number of occurrences of "ab" finally ending with a "c". For example, "c", "abc", "ababababc". |
| "abc\|def" | Either "abc" or "def". |
| "a(b\|c)" | Either "ab" or "ac". |
| "a(b\|c)*" | A string beginning with "a", followed by any number of "b" and "c" (in any combination). |
| "a.c" | A string beginning with "a", ending with "c", and with any single character in the middle. |
| "a.*c" | Any string beginning with "a" and ending with "c". |
| "[abc]" | Either "a", "b", or "c". |
| "[abc]+" | Any non-empty string containing only a's, b's, and c's. |
| "[abc][0123456789]" | Any string consisting of an "a", "b", or "c" followed by any digit. |
| "[0-9]" | Any single digit from 0 to 9. This is the same as "[0123456789]". |
| "[a-zA-Z]" | Any single upper- or lower-case letter. |
| "[^0-9]" | Any character that is not a digit. |
| "^abc" | "abc" only if it occurs at the beginning of the string. |
| "a\+c" | Only "a+c". |
| "a+c" | One or more a's followed by a "c". |
| "abc$" | "abc" only if it occurs at the end of the string. |
| "^a[0-9]*c$" | A string that begins with an "a", ends with a "c", and contains only digits in between. |
Example 1
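The following prints the position of the match, followed by the length of the matched string as reported by TCB(16):
1000 PRINT STR(MASK("config.bbx",".*bbx"))+" "+STR(TCB(16))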
Example 2
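The following displays all file names in the current directory that match a particular pattern beginning at the first character of the name:
1000 INPUT "MASK: ",M$
1010 LET A=MASK("",M$,ERR=1000); REM establish the mask; re-prompt if it is invalid
1020 LET CHAN=UNT; OPEN (CHAN)DIR(""); REM open the current directory on an unused channel
1030 READ RECORD (CHAN,END=1200)F$; REM read the next entry; branch to 1200 at end of directory
1040 IF MASK(F$)=1 THEN PRINT F$; REM str2 omitted, so the prior mask is reused
1050 GOTO 1030
1200 END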