A Parser which uses Grammar for evaluation
Parsers are systems which read, analyze & evaluate input data conforming to some formal grammar wikipedia link. Below I present an Arithmetic Expression parser which evaluates an expression comprising of symbols [ + - * / ( ) and numbers ]
The parser is encapsulated as the class ArithExprParser with tokenize & evaluate methods.
After cloning the repo, it can built into executable ‘ArithExprParser’ as shown below.
From the working directory run the below commandg++ -std=c++14 -I./include ./src/*.cpp -o ArithExprParser
This will build the executable ‘ArithExprParser’. Just run the executable to start evaluating expressions.
If you want to use it inside your code, just include the header file ‘ArithExprParser.h’ and link ‘ArithExprParser.cpp’ into your build. A sample usage is shown below
#include <iostream>
#include <ArithExprParser.h>
int main()
{
std::string expr;
ArithExprParser parser;
while(true){
std::cout<<"Enter Expression or type 'Q' to quit:";
std::cin>>expr;
if(expr == "Q")
break;
parser.tokenize(expr);
std::cout<<"Result = "<<parser.evaluate()<<std::endl;
}
}
The order of operations performed to evaluate an arithmetic expression is usually referred to by acronym of BODMAS(Bracket, Order, Division, Multiplication, Addition, Subtraction) or PEMDAS(Paranthesis, Exponents, Multiplication, Division, Addition, Subraction). Here we implement this order by formalizing it into a grammar. Grammars are the classical solution to parsing, as they formally define the structure of the input. Formal grammars are a much more detailed topic and for those interested further pls refer to this wikipedia page Formal Grammars
ArithmeticExpr-Parser consists of 2 sub-systems:-
In order to parse any input expression, we need to breakdown the input into its basic symbols/tokens. In our case the basic tokens of arithmetic expressions are the usual symbols [ + - * / ( ) and numbers ]. For differentiating between - and -sign we follow the below rules
Once tokens are available, the evaluator should follow the order of operations(BODMAS) to evaluate the tokens into the final result. We formalize BODMAS order into the below grammar.
Grammar for evaluating arithmetic expressions:-
* Expression:
* term //should end with a term
* Expression : + : term
* Expression : - : term
* term:
* primary //should end with a primary
* term : * : primary
* term : / : primary
* primary:
* float
* ( expression )
*
The above grammar should be interpreted like this:-
The above structure is naturally recursive and hence we can define functions for each of expression, term & primary. Another advantage of defining the order into a grammar is it is highly intuitive.
The reasoning behind the grammar definition is :-
An example evaluation order is shown below for a sample expression.
Input expression:- 2.3 + 3*(1+3)
Its evaluation is broken into the below steps:-
Arithmetic expression evaluation shows a simple application of grammars. Grammars are used almost universally for parsing. I plan to explore grammars in more detail in future by showing a markup language parser.