Crafting a compiler, part II: A simple compiler


The AC

The ac or adding calculator is the language used in the compiler, this is a simple and relatively way of facilitate the performance and process of the compiler.

What is AC?

Stands for adding calculator, is the language that serves for examining the phases of a compiler and his data structures. We can define a language by his data types, keywords and variables.

  • Data types: Integers and decimal, set by the floating point, allowing five fractional digits in the decimal point.
  • Keywords: Three reserved words, f, which declares a float type, i, declare and integer and p, which print the value of the variable.
  • Variables: only 23 options as letters in the alphabet. 26 minus the three reserved ones.

The type conversion are very common on languages and in AC have certain rules as int to float is very simple but no the other way around. For the cast or translation we the available desk calculator, which is a calculator that uses reverse Polish notation, (RPN). The ac program is translated into the dc program (desk calculator), as stack based languages commonly serve as target because of they compact representation. View as the study of larger systems.

The AC definition

For the definition we us the context free grammar (CFG) as an example of analysis and some productions serves to specify syntax and regular expression to specify symbols of the language.

The following expression is a part of the entire language defined by the context free grammar:

The Stmt represents a generator, which can be replaced as the following symbols that can rewrite the expression, these expressions generator by the main generator are either terminal values or non terminal.

The terminal values are the ones that can not be rewritten, can be Id’s pr reserved words, while the non terminal can be rewrite into another symbols, the non terminals always are on upper case as convention and terminal on lower case starting just with upper case, terminal or non terminal.

Once we have a generator is replaced with a non terminal this can be a symbol defined by the CFG or can be the λ, which means that doesn’t generate anything and ends up with nothing, this helps to solve to close the cycle or ecurivity in some cases. Where you can keep adding symbols or stop whenever you like.

In the ac there are reserved words as in any language, f, i and p are the ones that can no be replicate on the language as Id’s, there expression defined by the CFG are converted a validated into tokens to be converted into a parse tree where can represent the statements of structure.

The phases

The scanner. reads the source of ac program and produce and stream of tokens. This is where data types are recognized and keywords, not their meaning but their existence.

The parser. It receives the tokens, and creates the abstract syntax tree (AST) for the compiler. By a recursive descendant style. Order of the tokens/words.

The AST. It creates the symbol table. The symbol table associates the data type with the information that was previously assigned.

The semantics. The semantic analysis are in charge of the parsing task. Gives meaning to the code and transform the AST to becomes more clear. Replacing most of the cases with a more simplified.

The AST traversed. To generate a translation to the original program. Optimization implementation for the code generation.

The Scanning process

The scanner translates the stream of characters that receive it turn into the stream of tokens, which represents the the end of a terminal symbol. Regular expressions validate that such words or at least keywords/reserved exists. Parts that define a token: the type of data, and the semantic value that provides additional information.

Sometimes there is no need to interpret semantic value like plus (+), but when it comes to another tokens there is a need to reveal id or number has been scanned. Often ignore comments and symbols that serve to format text like blank or tab spaces.

The Parsing

The parsing is the one that verifies that the received stream of tokens is conformed by the grammar language specifies. Simply checks that the sequence that contains the statements are correct value for the sequence represented and previously defined. The CFG sets the correct way to put the words in order.

The Abstract Syntax Tree

This is the product that the parser and the scanner phase generate, when the syntax analysis is being held. They make sure that the compiler’s input in conformed by the language tokens on the context free grammar specifications.

This validates a lot of stuff, for example the correct way to declare a variable in certain data type in order to be used correctly. Sometimes they have codes or certain phrases that helps to reference way easier. For example x.y.z, meaning in java certain project dot package dot field. Most languages are way overloaded in order to mean more than the actual operation. Example: the + operator, because in single language could represent sum number or strings.

The semantic Analysis

The semantic analysis is the post parsing processing. Enforces the language definition that is not easily verified in the syntax analysis. This analysis focuses and depend on the use on two aspects: simple table construction and type checking.

The symbol table

Declared and defined prior use. The semantic process uses to record all different identifiers. and their types. From the AST to record.In this cas the ac has 23 diferent entries due to the 23 alphabet option of identifier’s names. Can be integer, float or unused/null.

The code generation

The final task done by a compiler. consists on generating code source. Consist into doing the ac program suitable and translate to the dc which is a simple stack calculator model.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create a website or blog at WordPress.com

Up ↑

Create your website at WordPress.com
Get started
%d bloggers like this: