KaiwuDB supports many types of SQL statements, such as CREATE and INSERT. This article walks through the process of adding a new statement to the KaiwuDB SQL parser (hereinafter "the parser") and implementing it. We'll look at how to use the goyacc tool to update the parser, and how the executor and the query planner work together to execute the statement.
Adding a new SQL statement starts with adding the necessary syntax to the SQL parser. The parser is generated by goyacc, a Go version of the popular yacc parser generator. The grammar definition is located in the pkg/sql/parser/sql.y file. The output of the parser is an abstract syntax tree (AST), whose node types are defined in various files in the pkg/sql/sem/tree directory.
There are three main components to adding a new statement to a SQL parser: adding a new keyword, adding syntax to the parser, and adding a new syntax node type.
In this example, we will add a new statement to KaiwuDB: FROBNICATE. This statement randomly modifies the settings of the database. It has three forms: FROBNICATE CLUSTER, which modifies the cluster settings; FROBNICATE SESSION, which modifies the session settings; and FROBNICATE ALL, which modifies both at once.
Let's start by checking whether all the keywords are defined. Open pkg/sql/parser/sql.y and search for "Ordinary key words". You'll see a series of token definitions in alphabetical order. Since other syntax already uses the SESSION, CLUSTER, and ALL keywords, we don't need to add them, but we do need to create a keyword for FROBNICATE. It should look like this:
%token <str> FROBNICATE
This tells the lexer to recognize the keyword, but we still need to add it to one of the category lists. If a keyword can appear in an identifier position, it must be reserved (which means it has to be quoted for other uses, such as column names). Since our new keyword only appears at the start of a SQL statement, it can't be mistaken for an identifier, so we can safely add it to the list of unreserved keywords. In pkg/sql/parser/sql.y, search for unreserved_keyword: and add | FROBNICATE as follows:
unreserved_keyword:
  ...
| FROBNICATE
  ...
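To make the reserved/unreserved distinction concrete, here is a minimal Go sketch of a keyword table. It is not KaiwuDB's lexer: the `category` type, the `keywords` map, and the `canBeIdentifier` helper are hypothetical names, and the choice of which keywords are marked reserved is an illustrative assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// category says how the grammar treats a word. Hypothetical type for
// illustration; the real parser encodes this in its grammar rules.
type category int

const (
	notKeyword category = iota
	unreserved          // may still be used as a plain identifier
	reserved            // must be quoted to be used as an identifier
)

// keywords is an illustrative table, not the real keyword list.
var keywords = map[string]category{
	"select":     reserved,
	"all":        reserved,
	"cluster":    unreserved,
	"session":    unreserved,
	"frobnicate": unreserved,
}

// classify looks a word up case-insensitively; unknown words map to notKeyword.
func classify(word string) category {
	return keywords[strings.ToLower(word)]
}

// canBeIdentifier reports whether an unquoted word may name a column or table.
func canBeIdentifier(word string) bool {
	return classify(word) != reserved
}

func main() {
	for _, w := range []string{"FROBNICATE", "select", "price"} {
		fmt.Printf("%-10s usable as identifier: %v\n", w, canBeIdentifier(w))
	}
}
```

Because FROBNICATE is placed in the unreserved category here, a column named frobnicate would keep working without quotes, which is the reason the article picks that list.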
Now that the lexer knows all of our keywords, we need to teach the parser how to handle our new statements. There are three places where we need to add references: a list of statement types, a list of statement cases, and a parsing clause.
In the grammar file (pkg/sql/parser/sql.y), you'll find a list of %type declarations. Add a line for our new statement type, something like this:
%type <tree.Statement> frobnicate_stmt
This adds a type declaration for a new statement type, frobnicate_stmt. Note that frobnicate_stmt is just a sample name; you can choose a different one to suit your situation.
Next, we need to add the new statement type to the list of statement cases. Search the grammar file for the stmt rule, which lists the existing statement cases (for example, those for SELECT and INSERT), and add the following case to it:
stmt:
  ...
| frobnicate_stmt // EXTEND WITH HELP: FROBNICATE
  ...
The third piece is a production rule for our statement. In pkg/sql/parser/sql.y, add:
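frobnicate_stmt:
  FROBNICATE CLUSTER { return unimplemented(sqllex, "frobnicate cluster") }
| FROBNICATE SESSION { return unimplemented(sqllex, "frobnicate session") }
| FROBNICATE ALL { return unimplemented(sqllex, "frobnicate all") }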
This lists the three forms of the statement that we allow, separated by the pipe character (|). Each production also has an implementation enclosed in curly braces; for now, each one simply reports an "unimplemented" error.
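What the generated parser does with this rule can be approximated by hand. The sketch below is a hypothetical stand-in, not goyacc output: `parseFrobnicate` and `errUnimplemented` are made-up names, and the function only checks the three allowed forms before reporting that the syntax is unimplemented.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errUnimplemented mirrors the placeholder action attached to each production.
var errUnimplemented = errors.New("unimplemented: this syntax")

// parseFrobnicate accepts exactly the three alternatives of the rule:
// FROBNICATE followed by CLUSTER, SESSION, or ALL.
func parseFrobnicate(sql string) error {
	tokens := strings.Fields(strings.ToUpper(sql))
	if len(tokens) != 2 || tokens[0] != "FROBNICATE" {
		return errors.New("syntax error")
	}
	switch tokens[1] {
	case "CLUSTER", "SESSION", "ALL":
		// The syntax is valid, but the action only reports an error for now.
		return errUnimplemented
	default:
		return fmt.Errorf("syntax error at or near %q", strings.ToLower(tokens[1]))
	}
}

func main() {
	for _, q := range []string{"frobnicate cluster", "frobnicate tables"} {
		fmt.Printf("%-20s -> %v\n", q, parseFrobnicate(q))
	}
}
```

The point of the placeholder is that a valid form and an invalid form fail differently: the first reaches the action, the second never matches a production.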
Finally, add help documentation for our statement. Above the production rule we just added, add the following comment:
// %Help: FROBNICATE - twiddle the various settings
// %Category: Misc
// %Text: FROBNICATE { CLUSTER | SESSION | ALL }
Now our parser can recognize the new statement type and generate contextual help about the new syntax for the user. After recompiling KaiwuDB, try executing the statement. You should get the following result:
$ kwbase sql --insecure -e "frobnicate cluster"
error: at or near "cluster": syntax error: unimplemented: this syntax
SQLSTATE: 0A000
DETAIL: source SQL:
frobnicate cluster
This means that our new syntax has been parsed successfully, but we can't do anything because it hasn't been implemented yet.
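The SQLSTATE line in that output deserves a brief note: 0A000 is the standard SQLSTATE code for feature_not_supported. A rough Go sketch of an error value carrying such a code follows; `sqlError`, `newUnimplemented`, and `sqlState` are hypothetical names and do not reflect KaiwuDB's own error package.

```go
package main

import (
	"errors"
	"fmt"
)

// codeFeatureNotSupported is the SQLSTATE code reported for functionality
// that is recognized but not available.
const codeFeatureNotSupported = "0A000"

// sqlError is a hypothetical error type pairing a message with a SQLSTATE.
type sqlError struct {
	Code string
	Msg  string
}

func (e *sqlError) Error() string { return e.Msg }

// newUnimplemented builds the kind of error a placeholder action reports.
func newUnimplemented(feature string) error {
	return &sqlError{Code: codeFeatureNotSupported, Msg: "unimplemented: " + feature}
}

// sqlState extracts the SQLSTATE from an error chain, or "" if none is set.
func sqlState(err error) string {
	var se *sqlError
	if errors.As(err, &se) {
		return se.Code
	}
	return ""
}

func main() {
	err := fmt.Errorf("syntax error: %w", newUnimplemented("this syntax"))
	fmt.Println(err)
	fmt.Println("SQLSTATE:", sqlState(err))
}
```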
Now that the syntax layer is in place, we need to give the new statement proper semantics. We need an AST node to carry the structure of the statement from the parser to the runtime. As mentioned above, our statement is declared as %type <tree.Statement>, which means it needs to implement the tree.Statement interface, defined in pkg/sql/sem/tree/stmt.go.
We need to write five functions: three for the Statement interface itself (StatementReturnType, StatementType, and StatementTag), one for NodeFormatter (Format), and one for the standard fmt.Stringer (String).
Create a new file for our statement type: pkg/sql/sem/tree/frobnicate.go. In it, put the definition and the formatting of our AST node.
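package tree
// Frobnicate represents a FROBNICATE statement.
type Frobnicate struct{ Mode FrobnicateMode }
// FrobnicateMode selects which settings the statement modifies.
type FrobnicateMode int
const (
	FrobnicateModeAll FrobnicateMode = iota
	FrobnicateModeCluster
	FrobnicateModeSession
)
// Format implements the NodeFormatter interface.
func (node *Frobnicate) Format(ctx *FmtCtx) {
	// The keyword order matches the FrobnicateMode constants above.
	ctx.WriteString("FROBNICATE " + [...]string{"ALL", "CLUSTER", "SESSION"}[node.Mode])
}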
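To try the formatting logic outside the KaiwuDB tree, the following self-contained sketch swaps in a stand-in `FmtCtx` built on `strings.Builder` and spells the mode keywords out in a switch. The stand-in context, the `Mode` field name, and the `asString` helper are assumptions made for illustration; the real `FmtCtx` in pkg/sql/sem/tree has a richer API.

```go
package main

import (
	"fmt"
	"strings"
)

// FmtCtx is a stand-in for the tree package's formatting context.
type FmtCtx struct{ strings.Builder }

// FrobnicateMode selects which settings the statement modifies.
type FrobnicateMode int

const (
	FrobnicateModeAll FrobnicateMode = iota
	FrobnicateModeCluster
	FrobnicateModeSession
)

// Frobnicate is the AST node; Mode is an assumed field name.
type Frobnicate struct {
	Mode FrobnicateMode
}

// Format writes the statement back out as SQL text.
func (node *Frobnicate) Format(ctx *FmtCtx) {
	ctx.WriteString("FROBNICATE ")
	switch node.Mode {
	case FrobnicateModeAll:
		ctx.WriteString("ALL")
	case FrobnicateModeCluster:
		ctx.WriteString("CLUSTER")
	case FrobnicateModeSession:
		ctx.WriteString("SESSION")
	}
}

// asString renders a node through a fresh formatting context.
func asString(node *Frobnicate) string {
	var ctx FmtCtx
	node.Format(&ctx)
	return ctx.String()
}

func main() {
	fmt.Println(asString(&Frobnicate{Mode: FrobnicateModeSession}))
}
```

Formatting an AST node back to SQL is what lets the statement be logged or displayed in the same shape the user typed it.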
To add the statement and string representations for our AST node, open pkg/sql/sem/tree/stmt.go and search for "StatementReturnType implements the Statement interface". You'll see a list of implementations for the different AST node types. Insert the following into it in alphabetical order:
// StatementReturnType implements the Statement interface.
func (node *Frobnicate) StatementReturnType() StatementReturnType { return Ack }
// StatementType implements the Statement interface.
func (node *Frobnicate) StatementType() StatementType { return TypeDCL }
// StatementTag returns a short string identifying the type of statement.
func (node *Frobnicate) StatementTag() string { return "FROBNICATE" }
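One way to catch a missing method early is a compile-time interface assertion. The sketch below uses cut-down stand-ins for the `Statement` interface and its return types; the real definitions live in pkg/sql/sem/tree/stmt.go and may differ, and the constant names `Ack` and `TypeDCL` are assumed here.

```go
package main

import "fmt"

// Stand-ins for the tree package's return-type enums (assumed names).
type StatementReturnType int
type StatementType int

const Ack StatementReturnType = 0
const TypeDCL StatementType = 0

// Statement is a reduced version of the interface every AST statement
// must satisfy.
type Statement interface {
	fmt.Stringer
	StatementReturnType() StatementReturnType
	StatementType() StatementType
	StatementTag() string
}

// Frobnicate is the new AST node.
type Frobnicate struct{}

func (*Frobnicate) StatementReturnType() StatementReturnType { return Ack }
func (*Frobnicate) StatementType() StatementType             { return TypeDCL }
func (*Frobnicate) StatementTag() string                     { return "FROBNICATE" }
func (*Frobnicate) String() string                           { return "FROBNICATE" }

// Compile-time check: the build fails here if a method is missing.
var _ Statement = (*Frobnicate)(nil)

func main() {
	var stmt Statement = &Frobnicate{}
	fmt.Println(stmt.StatementTag(), stmt.StatementReturnType() == Ack)
}
```

If any of the four methods is removed, the assertion line stops compiling, which is a quicker signal than a runtime failure in the planner.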
Next, add the following in alphabetical order:
func (n *Frobnicate) String() string { return AsString(n) }
Now we need to update the parser to return a Frobnicate AST node with the appropriate mode when it encounters our syntax. Return to pkg/sql/parser/sql.y, search for "// %Help: FROBNICATE", and replace the rule with the following:
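frobnicate_stmt:
  FROBNICATE CLUSTER { $$.val = &tree.Frobnicate{Mode: tree.FrobnicateModeCluster} }
| FROBNICATE SESSION { $$.val = &tree.Frobnicate{Mode: tree.FrobnicateModeSession} }
| FROBNICATE ALL { $$.val = &tree.Frobnicate{Mode: tree.FrobnicateModeAll} }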
The special symbol $$.val represents the node value generated by this rule. There are a few other $ symbols that can be used in yacc. One of the more useful forms refers to the node value of a sub-production (for example, in these three rules, $1 would be the token FROBNICATE).
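For readers new to yacc, the meaning of $$ and $1 can be sketched in plain Go. In a generated parser, each grammar symbol on the parse stack carries a value; when a rule is reduced, its action reads the right-hand-side values ($1, $2, and so on) and stores its result in $$. The `symValue` type and `reduceFrobnicate` function below are hypothetical and far simpler than what goyacc emits, and the node uses a plain string for its mode to keep the example short.

```go
package main

import "fmt"

// symValue is a simplified semantic value: tokens carry their text,
// reduced rules carry an AST node.
type symValue struct {
	str string      // set for tokens such as FROBNICATE or CLUSTER
	val interface{} // set for non-terminals such as frobnicate_stmt
}

// Frobnicate is a reduced AST node for this sketch.
type Frobnicate struct{ Mode string }

// reduceFrobnicate plays the role of the action for a rule of the form
// "frobnicate_stmt: FROBNICATE <mode>". rhs[0] corresponds to $1,
// rhs[1] to $2, and the returned value to $$.
func reduceFrobnicate(rhs []symValue) symValue {
	var result symValue // this is $$
	result.val = &Frobnicate{Mode: rhs[1].str}
	return result
}

func main() {
	// Parse stack after shifting the two tokens of "FROBNICATE CLUSTER".
	stack := []symValue{{str: "FROBNICATE"}, {str: "CLUSTER"}}
	first := stack[0].str // what $1 would refer to

	// Reduce: pop the two right-hand-side values, push the rule's value.
	result := reduceFrobnicate(stack[len(stack)-2:])
	stack = append(stack[:len(stack)-2], result)

	fmt.Printf("$1=%q  $$=%+v\n", first, stack[0].val)
}
```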
Next, recompile KaiwuDB and run the new statement again. You should get the following result:
$ kwbase sql --insecure -e "frobnicate cluster"
error: pq: unknown statement type: *tree.Frobnicate
failed running "sql"
Now we see a different error than before. This error comes from the SQL planner, which doesn't know what to do when it encounters the new statement type. We need to teach it what the new statement means. Although our statement won't take part in any query plan, we'll implement it by adding a method to the planner. That is where statement dispatch is centralized, so that is where the semantics go.
Find the source of the error we're currently seeing, and you'll see that it's at the end of a long type switch statement in pkg/sql/opaque.go. Let's add a case there:
case *tree.Frobnicate:
	return p.Frobnicate(ctx, n)
Similarly, add the following to the init() function in the same file, pkg/sql/opaque.go:
&tree.Frobnicate{},
The new case calls a method on the planner itself, which doesn't exist yet. Let's implement it in a new file, pkg/sql/frobnicate.go:
package sql
import (
	"context"
	"github.com/kwbasedb/kwbase/pkg/sql/sem/tree"
	"github.com/kwbasedb/errors"
)
// Frobnicate plans a FROBNICATE statement. For now it only reports an error.
func (p *planner) Frobnicate(ctx context.Context, stmt *tree.Frobnicate) (planNode, error) {
	return nil, errors.AssertionFailedf("we're not quite frobnicating yet...")
}
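The dispatch pattern described above, a type switch that falls through to an error for unknown statement types, can be reproduced in miniature. Everything in this sketch (`Statement`, `planner`, `planNode`, `dispatch`, `run`, `otherStmt`) is a simplified stand-in for the real types in pkg/sql, and it uses only the standard library rather than the errors package imported above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Statement, planNode, and planner are simplified stand-ins.
type Statement interface{ StatementTag() string }

type planNode interface{}

type planner struct{}

// Frobnicate is the new statement type.
type Frobnicate struct{}

func (*Frobnicate) StatementTag() string { return "FROBNICATE" }

// otherStmt is a statement type that has no case in the switch.
type otherStmt struct{}

func (*otherStmt) StatementTag() string { return "OTHER" }

// Frobnicate is the planner method the new case calls; it is still a stub.
func (p *planner) Frobnicate(ctx context.Context, n *Frobnicate) (planNode, error) {
	return nil, errors.New("we're not quite frobnicating yet...")
}

// dispatch routes each statement type to its planner method and rejects
// types it has no case for.
func (p *planner) dispatch(ctx context.Context, stmt Statement) (planNode, error) {
	switch n := stmt.(type) {
	case *Frobnicate:
		return p.Frobnicate(ctx, n)
	default:
		return nil, fmt.Errorf("unknown statement type: %T", stmt)
	}
}

// run plans a single statement and returns only the error.
func run(stmt Statement) error {
	_, err := (&planner{}).dispatch(context.Background(), stmt)
	return err
}

func main() {
	fmt.Println(run(&Frobnicate{}))
	fmt.Println(run(&otherStmt{}))
}
```

Before the new case existed, FROBNICATE took the default branch; with the case in place, the error now comes from the planner method itself.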
Now recompile KaiwuDB and execute the statement again:
$ kwbase sql --insecure -e "frobnicate cluster"
error: pq: we're not quite frobnicating yet...
failed running "sql"
At this point, the error from our planner method reaches the SQL client. All that remains is to give that method a real implementation so that the statement takes effect.