SQL — How to Create Custom Functions and Date Formats
Overview
Starting in BBj 19.10, it is relatively easy for someone with some Java ability to implement their own SQL scalar functions, group functions, and date formats, and make them available in a BBj database. The relevant Java classes and interfaces which make this possible are in BBjStartup.jar in the packages com.basis.startup.type.sql and com.basis.startup.type.dateformat. In general, functions must implement a method to type check potential arguments, and methods to actually execute the desired functionality.
Scalar functions are executed per row and provide a single result. Built-in scalar functions include functions like ABS, UCASE, SUBSTRING, and NOW.
Group functions collect values from several rows and then provide a single result. Built-in group functions include functions like AVG, COUNT, and MAX.
Date formats have methods that take a Java Calendar object and return a byte array and vice-versa.
In order to create a custom scalar function, group function, or date format, you must first implement the desired functionality in a Java class that implements the appropriate interface. After you have the functions defined, you need to implement a new Java class with a public default constructor that implements the BBjSQLFactory interface and provides the methods to return an instance of the requested function or date format, given a function or date format name.
Note that internal functions and date formats take precedence, and that group and scalar functions share the same namespace. This means that you should not use the name of an existing group or scalar function, and you cannot use the same name for both a group and scalar function (the group function would take precedence). Function names are normally case insensitive and should therefore not depend on exact case. Date format names can be case sensitive if desired.
Creating a Scalar Function
When creating a custom scalar function you need to put the actual implementation into a Java class that implements the interface BBjSQLScalarFunction. The interface only has two methods that you must implement: validate and execute.
The validate method is called with potential argument type information so that the function can check for the correct number and types of arguments. The method returns the type information for the result of the function, since the result type information may depend on the type information of the arguments provided. For example, you may have a function that accepts either a number or a string as an argument and returns the same type as the argument.
The execute method actually executes the function, taking literal values as arguments and returning a literal value as a result.
Scalar Function Example
Consider the implementation of a sample custom scalar function CHAR_AT, which takes two arguments, a character argument and a numeric argument, and returns the nth character of the character argument (0 based) or NULL if there is no such character. All that we need to do in this case is implement the BBjSQLScalarFunction interface. The class name and package do not matter; the class name is the same as the function name in this example for convenience. In this case the return type is static, so we use a static variable to instantiate it.
package foo; |
Creating a Group Function
When creating a custom group function, you need to put the actual implementation into a Java class that implements the interface BBjSQLGroupFunction. The interface has four methods that you must implement: validate, reset, collect, and getResult.
The validate method is called with potential argument type information so that the function can check for the correct number and types of arguments. The method returns the type information for the result of the function, since the result type information may depend on the type information of the arguments provided. For example, you may have a function that accepts either a number or a string as an argument and returns the same type as the argument.
A group function generally requires internal state, which is managed by the extra methods:
The reset method tells the function that it should discard any collected values, possibly in preparation for a new set.
The collect method is called zero or more times with literal values for each row in a group.
The getResult method actually produces the result of the function, using the collected literal values to produce a literal value.
Group Function Example
Consider the implementation of the following sample custom group function LAST. This function takes a single argument of any type and returns the last collected value or NULL if no values were collected. For this function, we only need to implement the BBjSQLGroupFunction interface. The class name and package do not matter; for this example the class name is the same as the function name for convenience. In this case, the return type comes from the validation since it is always the same as the argument type.
package foo; |
Creating a Date Format
When creating a custom date format you need to put the actual implementation into a Java class that implements the interface BBjDateFormat. The interface has five methods that you must implement: toCalendar, toBytes, getTemplateDataType, getTemplateSize, and getSize.
The toCalendar and toBytes methods are the heart of the implementation since they convert between byte arrays and actual dates represented by Java Calendar objects. The other methods provide information on how the date format bytes should be stored in files.
Date Format Example
Consider an implementation of the following sample custom date format MyMillis. This date format stores the raw milliseconds from a Java Calendar as an eight byte signed long. For this date format, we only need to implement the BBjDateFormat interface. The class name and package do not matter, so for this example, the class name is the same as the date format name for convenience.
The following is an important note regarding the value returned from the toBytes() method and the value passed into the toCalendar() method. Types that store the data in the BBj file record as a number (e.g. I, U, B, N, X, Y, and F) should convert the numeric value to a string and return the byte[] of that numeric string when implementing the toBytes() method. Likewise, values passed into the toCalendar() method (when the underlying record contains numeric data) will be a byte[] representing the number as a string. The SQL engine will set/get the value in the record appropriately for you. See the sample code below for details. For data of type C or O, the methods toBytes() and toCalendar() are responsible for encoding and decoding the character or binary data respectively.
package foo; |
Adding Custom Functions to a Database
Once you have written the appropriate implementations for scalar or group functions you need to ensure that the SQL database knows about them. You accomplish this by implementing the BBjSQLFactory interface and telling a specific database configuration about the implementation. A single BBjSQLFactory implementation can tell the database about any number of scalar and group function implementations. The configuration is per database, not global, for security and encapsulation purposes. The database administrator decides which custom functions are available in an individual database, and those functions will not leak to unrelated databases.
Creating a Factory
In order to link implementations of new scalar functions, group functions, and date formats to a database, you need to implement the BBjSQLFactory interface.
Implement the findScalarFunction method if you want one or more scalar function implementations.
Implement the findGroupFunction method f you want one or more group function implementations.
Implement the findDateFormat and getDateFormats methods if you want one or more date format implementations.
Each method takes a name and optionally returns an instance of the appropriate BBjSQLScalarFunction, BBjSQLGroupFunction, or BBjDateFormat implementation. If the factory does not recognize the name then it should return an Optional.empty(). Normally, every call to a find method will return a new instance of the implementation. However, if you are sure that the implementation does not have any state then it could return the same instance if desired. The BBjSQLFactory implementation must have a public default constructor for the SQL database to create an instance.
Note that scalar and group functions share the same namespace, so you should not create scalar and group functions with the same name. Group functions will take precedence over scalar functions of the same name.
Note that all function names will normally be converted to uppercase before being passed into a factory, but date format names may be case sensitive if desired.
As an example, here is an implementation of a BBjSQLFactory that will link the two example functions above with an SQL database.
package foo; |
Configuring the Database
Now that you have an implementation of BBjSQLFactory that links the appropriate function implementations to names, you need to tell a specific database about the factory implementation.
Classpath Configuration
First, you need to make sure that everything your factory and function implementations need are available in a session specific classpath (SSCP). This SSCP should include the new factory and function implementations as well as any extra JARs that they may require. When compiling the implementations, the BBjSQL objects are in BBjStartup.jar, but BBjStartup.jar does not need to be explicitly added to the SSCP because the BBjSQL objects are always implicitly available to SSCPs. For the purposes of the sample above, there are no other external dependencies, and we will create a single JAR with the factory and function implementations. The implementations can be compiled and packaged in whatever way is most convenient; this may be easiest with a build tool such as gradle and/or an IDE such as Eclipse. However, in order to keep the example simple, we will use the Java JDK directly from the command line.
We start by creating a project directory, changing into that directory, and creating a "src" directory and a package subdirectory "foo" with the class source code above. Create a "build" directory for the class files and compile the source to the "build" directory and create a JAR file from the compiled class files.
mkdir project
cd project
mkdir -p src/foo
# Create CHAR_AT.java in the src/foo directory
# Create LAST.java in the src/foo directory
# Create MyMillis.java in the src/foo directory
# Create MyFactory.java in the src/foo directory
mkdir build
javac -cp $BBJ_HOME/lib/BBjStartup.jar -sourcepath src -d build src/foo/*.java
jar cvf myfactory.jar -C build foo
Now that you have a JAR file "myfactory.jar" with the new implementations, you can create an SSCP in BBj with that JAR file.
-
For this example, just copy "myfactory.jar" to a directory accessible to BBj.
-
Open the BBjServices > Java Settings page and select the Classpath tab.
-
Create a new classpath called "myfactory".
-
Add your myfactory.jar file to this classpath. If your class depends on any additional libraries, make sure to add those as well.
-
Save the changes.
-
Restart BBjServices.
-
Open the database properties for your database.
-
Select the classpath you just created from the available options in the “Scalar/Group Function SSCP” field in the “Custom Functionality” section of the database properties.
-
Enter the full package and class specification in the “BBjSQLFactory Implementation” field on the database properties. It should look something like: foo.MyFactory.
-
Save the database configuration changes.
Now you are ready to use the functions from SQL! If you open an SQL connection to the database above you should be able to test the functions.
create table test (a varchar(10) primary key)
insert into test values ('hello')
insert into test values ('goodbye')
select char_at(a,2) from test
select last(a) from test
You should see the new functions being transparently executed, just like built-in functions.