Summary of JAVACC use: Syntax Advanced - Recursive Steps of Aggregate Functions

Aggregate functions

I remember that I should have learned the concepts of sets and related operations on sets in junior high school. Now I will review the following summary.

1. The meaning of the set: some specified objects are set together to become a set, and each object is called an element.

2. Three characteristics of the elements in the set: 1) The certainty of the elements; 2) The mutuality of the elements; 3) The disorder of the elements

For example, the set {1,4,7} is the same set as {1,4,7,7}, {7,4,1}, {4,7,1}

The definition of a collection is actually almost the same as the implementation of the Set collection in java.

Take a look at the set operations in the table below

operatesymbolexample
union{1, 2}∪{red, white} = {1, 2, red, white}
intersect

{1, 2}∩{red, white} ={}

{1, 2, green}∩{red, white, green} = {green}

subtract-

{1, 2}−{red, white} = {1, 2}

{1, 2, green}−{red, white, green} = {1, 2}

{1, 2} -{1,2, white} ={}

Example: (A∩B)∩C = A∩(B∩C)

The result of the set operation operation is still a set, and it can also participate in the next set operation, which is a bit similar to the feeling of recursive function call. This is the core of this article. We need to implement a set of set functions through javacc, including the operation of sets (which can be recursive), and write through this step to raise our level of syntax definition and action implementation in javacc.

Of course, in order to reflect the definition of functions and the characteristics of recursion, the operation of sets is not represented by mathematically defined runes (such symbols are not easy to type), but by "function name (parameter 1, parameter 2...)". .

For example: subtract(u3,intersect(union(u1,u2),u3))

The definition of the set is still reflected in the form of "{1,2,'red'}". Since it is a step, then the variable assignment definition is also indispensable. Let's follow the footstep variables of python, which is simpler.

Example 1: a = 1+5*(7-3)

Example 2: u1 = {1,4,7+9,'red'}

 

Parsing

In the article on the analysis of the four operations that I summarized, I mentioned that the method of parsing is to draw a parsing tree diagram. This time, it is explained by drawing. However, only the parse tree diagram related to the set definition and set operation functions is drawn. After this part is processed, this subtree can be merged into the entire parse tree. This is also the idea of ​​divide and conquer.

set definition syntax tree

The set is composed of 0 to n elements e, and element e is composed of number four (also a syntax tree) or 'string' (common character syntax tree). Therefore, the set definition syntax tree diagram of the above figure is formed.

Set Manipulation Function Syntax Tree

 

funName: union,intersect,subtract

The parameters of the set operation function are composed of at least two set sets. The set set can be a set definition tree, or a set operation (the result of the operation is still a set), so a cyclic nested syntax tree diagram is formed.

It seems that this tree is a bit similar to the previous four operations, and the implementation is different. The operands and results of the four operations are both numbers (double), but the parameter set of the set operation function is two different types from the figure. To test the ability of abstract definition, you can define an interface called operator (xxOperator), the definition of the set and the set operation function are the implementation classes of the interface, then they are the same type in form.

Well, that's it for the analysis of the core point. The syntax trees such as other assignment expression definitions are relatively simple, so I won't draw the analysis here.

step syntax implementation

Zcode.jj

options{
     STATIC = false;
     JDK_VERSION = "1.8";

}
PARSER_BEGIN(Zcode)
package com.javacc.zcode;
import com.javacc.zcode.*;
import com.javacc.zcode.operator.*;
import com.javacc.zcode.operator.bean.*;
import java.io.StringReader;
import java.util.*;

public class Zcode {
    ZpContext zpContext;
    public Zcode(String expr){
         this(new StringReader(expr));
         zpContext = new ZpContext();
    }

}


PARSER_END(Zcode)


SKIP : { " " | "\t" | "\n" }

TOKEN : {
    <NUMBER : <DIGITS>
      | <DIGITS> "." <DIGITS>
     >
  |
    <#DIGITS :(["0"-"9"])+>
}

TOKEN : {
     < LPAREN: "(" >
   | < RPAREN: ")" >
   | < LBPAREN: "{">
   | < RBPAREN: "}">
   | < ADD : "+" >
   | < SUBTRACT : "-" >
   | < MULTIPLY : "*" >
   | < DIVIDE : "/" >
   | < EQUALS : "=">
   | < COMMA : ",">
   | < SEMIC : ";">
   | < SINGLE_QUOT : "'">

   |< UNION : "UNION" | "union">
   |< INTERSECT : "INTERSECT"| "intersect">
   |< SET_SUB : "SUBTRACT" | "subtract">

   |<PRINT : "PRINT" | "print">
   |<PRINTLN : "PRINTLN" | "println">

}
TOKEN : { < EOL : "\n" | "\r" | "\r\n" > }

TOKEN : {
<ID : <LETTER> (<DIGITS>|<LETTER>)*>
}



TOKEN : { < # LETTER : (["_",
                         "~",
                         "a"-"z","A"-"Z"])+ > }
//footstep parsing entry
void  parse():
{

 }
{
  (
      LOOKAHEAD(2,<ID> <EQUALS>)
       assignExp() <SEMIC>  //assignment expression
    | printFunction() <SEMIC> //print function
  )+ //The script contains multiple assignment expressions and printing functions, each of which ends with ";" as a distinction


}
//print function
void printFunction():
{
  int flag;
  //print target
  Object target = null;
  Token vt;
}
{
    (
      <PRINT> {flag=1;}
    | <PRINTLN>{flag=2;}
    )
      <LPAREN>
      (
       vt = <ID> {target = zpContext.getVariable(vt.image);} //variable
       | target = calc() //Arithmetic
       | target = defineSets() //set definition
       | target = setOperFunciton() //set operation function
      )
      <RPAREN>
    {
        PrintOperator print = new PrintOperator(flag,target);
        print.operator();
     }
}

//assignment expression
void assignExp():
{
   Token t;
   double v;
   ZcOperator operator = null;
}
{
   t=<ID>
        <EQUALS>
      (
         v = calc() {zpContext.putVariable(t.image,v);}
       | operator = setOperFunciton() {zpContext.putVariable(t.image,operator);}
       | operator = defineSets() {zpContext.putVariable(t.image,operator);}
      )
 }
//Collection definition: {element,element.....} or {}
ZcOperator defineSets():
{
    Set data = new LinkedHashSet();
    ZcResult elem = null;
}
{
   <LBPAREN>
    (
       elem = element() {data.add(elem.getData());}
       (<COMMA>
         elem = element() {data.add(elem.getData());}
       )*
    )*
   <RBPAREN>
   { return new SetOperator(data);}
}
//Collection element definition
ZcResult element():
{
  Token t;
  double v;
}
{
   v = calc() {return new ZcResult(v);}   //match number four
  | <SINGLE_QUOT> t =  <ID> <SINGLE_QUOT> {return new ZcResult(t.image);} //matches 'string'
}


//Parse set manipulation functions
ZcOperator setOperFunciton():
{
  int flag;
  List<ZcOperator> pList = new ArrayList<ZcOperator>();
  ZcOperator operator = null;
}
{
  (<UNION> {flag=1;}
   |<INTERSECT> {flag =2;}
   |<SET_SUB> {flag =3;}
  )
     <LPAREN>
          operator = setOperBase() {pList.add(operator);} //Action collection
          (
            <COMMA>
            operator = setOperBase() {pList.add(operator);}
          )+
     <RPAREN>

     { return new SetFunctionOperator(flag,pList,zpContext); }
}
//gather
ZcOperator setOperBase():
{
    ZcOperator operator = null;
    Token t;
}
{
       operator = defineSets() {return operator;} //set definition
   |   t = <ID> { return zpContext.getVariable(t.image);} //The variable that the set is assigned to
   |   operator = setOperFunciton() {return operator;} //set manipulation functions
}

//Parse the first-level tree processing addition and subtraction
double calc():
{
 double left;
 double right;

}
{
  left = mutlOrDiv()


  (<ADD>   right = mutlOrDiv() {left += right;}
    | <SUBTRACT> right = mutlOrDiv() {left = left - right;}
     )*

  {
     return left;
   }
}

//Parse the secondary tree to handle multiplication and division
double mutlOrDiv():
{
 double left;
 double right;

}
{
  left = parseBase()
  (<MULTIPLY> right = parseBase() {left *= right ;}
    | <DIVIDE> right = parseBase() {left = left/right;}
     )*
  {
    return left;
   }
}

//Parse tertiary tree
double parseBase() :
{
 Token t = null;
 double num;
}
{
  t = <NUMBER> {return Double.parseDouble(t.image);}
//Handle the four rules in parentheses
 | <LPAREN> num = calc() <RPAREN> {return num;}

}


This grammar file is extended on the basis of four operations. Its core implementation is basically the same as the idea of ​​grammar analysis. The difference is to increase the analysis support for variables, and also increase the analysis of assignment expressions and printing functions, etc. .

ZcOperator.java

/**
 * @description: Operator Definition Interface
 * 
 */
public interface ZcOperator<T> {

    /**
     * operator operation
     * @return ZcResult return result
     */
    ZcResult<T> operator();

}

 ZcResult.java

/**
 * @description: The result class of the operator operation
 */
public class ZcResult <T>{
    private final T data;

    public ZcResult(T data) {
        this.data = data;
    }

    public T getData() {
        return data;
    }

}

ZcVoid.java

/**
 * @description: An operator result class that does not actually return a result
 */
public class ZcVoid  extends ZcResult<ZcVoid>{
    private final static  ZcVoid zcVoid = new ZcVoid();

    public ZcVoid(ZcVoid data) {
        super(data);
    }

    private ZcVoid(){
        super(zcVoid);
    }

    @Override
    public ZcVoid getData() {
        throw new  UnsupportedOperationException("ZcVoid Unsupported");
    }

    public static ZcVoid getInstance(){
        return zcVoid;
    }
}

 

SetOperator.java

/**
 * @description: set definition operator
 */
public class SetOperator implements ZcOperator<Set>{

    /**
     * result set
     */
    private ZcResult<Set> result;

    public SetOperator(Set data){
        result = new ZcResult<Set>(data);
    }

    public ZcResult<Set> operator() {
        return result;
    }
}

SetFunctionOperator.java

/**
 * @description: set operation function operator
 */
public class SetFunctionOperator implements ZcOperator<Set>{
    /**
     * function name tag
     */
    private final int flag;

    /**
     * parameter list in function
     */
    private List<ZcOperator> params;

    /**
     * operational context
     */
    private ZpContext context;

    public SetFunctionOperator(int flag, List<ZcOperator> params, ZpContext context) {
        this.flag = flag;
        this.params = params;
        this.context = context;
    }

    public ZcResult<Set> operator() {
       List<Set> list = new ArrayList<Set>();

       //First, take the result from the parameter list and put it into the list array
        for (ZcOperator op : params){
            //Get from {e,...e} collection definition
            if(op instanceof SetOperator){
                SetOperator setOperator = (SetOperator) op;
                list.add(setOperator.operator().getData());
            }
            //Get the result from an operation on a collection
            if(op instanceof SetFunctionOperator){
                SetFunctionOperator sfp = (SetFunctionOperator) op;
                //Note that sfp.operator() will perform recursive operations
                list.add(sfp.operator().getData());
            }
        }


        Set set = new LinkedHashSet();
        //Use the set tool in the hutool toolkit to perform set operations
        switch (flag){
            //perform a union (and) operation
            case 1:
                if (list.size()>2){
                    set =  CollUtil.unionDistinct(list.get(0),list.get(1),
                            list.subList(2,list.size()).toArray(new Set[0]));
                }else {
                    set = CollUtil.unionDistinct(list.get(0),list.get(1));
                }
                break;
            //Perform an intersection operation
            case 2:
                if (list.size()>2){
                    set = CollUtil.intersectionDistinct(list.get(0),list.get(1),
                            list.subList(2,list.size()).toArray(new Set[0]));
                }else {
                    set = CollUtil.intersectionDistinct(list.get(0),list.get(1));
                }
                break;
            //perform a subtract (difference) operation
            case 3:
               set = (Set) CollUtil.subtract(list.get(0),list.get(1));
               break;

        }

        return new ZcResult<Set>(set);

    }
}

ZpContext.java

/**
 * @description: footstep parsing context
 */
public class ZpContext {

    protected ZpContext(){}

    private Map<String, ZcResult> context = new ConcurrentHashMap<String, ZcResult>(256);


    /**
     *  Does the variable name exist?
     * @param var variable name
     * @return
     */
    public boolean existVariable(String var){
       return context.containsKey(var);
    }

    /**
     * put variable
     * @param var variable name
     * @param data variable
     */
    public void putVariable(String var,Object data){
        context.put(var,new ZcResult(data));
    }

    /**
     * get variable value
     * @param var variable name
     * @return <T> returned result
     */
    public <T> T getVariable(String var){
       ZcResult result = context.get(var);
       //A non-existing variable throws an error
       if (null == result){
          throw new IllegalArgumentException("variable does not exist:"+var);
       }
       return (T) result.getData();
    }

}

PrintOperator.java

/**
 * @description: print function
 */
public class PrintOperator implements ZcOperator{
    /**
     * print method mark
     */
    private int flag;
    /**
     * print target
     */
    private Object target;

    public PrintOperator(int flag, Object target) {
        this.flag = flag;
        this.target = target;
    }

    public ZcResult operator() {
        Object t = target ;
        if(target instanceof ZcOperator){
            ZcOperator zo = (ZcOperator) target;
            t =zo.operator().getData();
        }

        switch (flag){
            case 1:
                System.out.print(t);
            case 2:
                System.out.println(t);
        }
        return ZcVoid.getInstance();
    }
}

test case

public class ZcodeTest {

    @Test
    void test1() throws ParseException {
        String prog =" a = 1+5*(7-3) ; \n"+
                " println(a); \n"+
                " u1 = {1,4,7+9}; \n"+
                " u2 = {'red','yellow'}; \n"+
                " u3 ={10+6,'red','ind'}; \n"+
                " unionSet = union(u1,u2,u3); \n"+
                " println(unionSet); \n"+
                " interSet = intersect(union(u1,u2),u3); \n"+
                " println(interSet); \n"+
                " println(subtract(u3,intersect(union(u1,u2),u3)));";

        Zcode zcode = new Zcode(prog);
        zcode.parse();
    }
}

Results of the

It can be seen from the test case that the footstep syntax parsing ability has already taken the form of a programming language. This is the unique charm of parser generators such as javacc and antlr. Of course, you can also find some grammar parsing examples of programming languages ​​developed by javacc and antlr on their official website for everyone to learn.

  Previous: Summary of JAVACC use (4): LOOKAHEAD's sharp edge in resolving syntax selection conflicts - Programmer Sought

Tags: Java Zookeeper Back-end

Posted by sirish on Tue, 03 May 2022 21:06:58 +0300