CSCU9A5 Week 1

Read Chapters 1 and 2 of Clean Code

You get the drift. Indeed, the ratio of time spent reading vs. writing is well over 10:1. We are constantly reading old code as part of the effort to write new code. Because this ratio is so high, we want the reading of code to be easy, even if it makes the writing harder. Of course there’s no way to write code without reading it, so making it easy to read actually makes it easier to write.

Class names

Classes and objects should have noun or noun phrase names like Customer, WikiPage, Account, and AddressParser. Avoid words like Manager, Processor, Data, or Info in the name of a class. A class name should not be a verb.

Method names

Methods should have verb or verb phrase names like postPayment, deletePage, or save. Accessors, mutators, and predicates should be named for their value and prefixed with get, set, and is according to the JavaBeans standard.

String name = employee.getName();
customer.setName("mike");
if (paycheck.isPosted())...

When constructors are overloaded, use static factory methods with names that describe the arguments. For example,

Complex fulcrumPoint = Complex.FromRealNumber(23.0);  

is generally better than

Complex fulcrumPoint = new Complex(23.0);  

Consider enforcing their use by making the corresponding constructors private.
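
A minimal sketch of what that could look like. Only the names Complex and FromRealNumber come from the example above; the fields, the second factory and the accessors are assumptions added for illustration. The capitalised method name is kept to match the example, although Java convention would spell it fromRealNumber.

```java
// Hypothetical Complex class: only Complex and FromRealNumber appear in the notes.
final class Complex {
    private final double real;
    private final double imaginary;

    // Private constructor: callers are forced to go through the named factories.
    private Complex(double real, double imaginary) {
        this.real = real;
        this.imaginary = imaginary;
    }

    // The name describes the argument, unlike a bare new Complex(23.0).
    static Complex FromRealNumber(double real) {
        return new Complex(real, 0.0);
    }

    // Assumed second factory, included to show why descriptive names help
    // when two "constructors" would otherwise take the same parameter type.
    static Complex FromImaginaryNumber(double imaginary) {
        return new Complex(0.0, imaginary);
    }

    double real() { return real; }
    double imaginary() { return imaginary; }

    public static void main(String[] args) {
        Complex fulcrumPoint = Complex.FromRealNumber(23.0);
        System.out.println(fulcrumPoint.real() + " + " + fulcrumPoint.imaginary() + "i");
    }
}
```

With the constructor private, new Complex(23.0) no longer compiles outside the class, so every call site has to say which kind of number it is building.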

Note: Listing 2-1 and his solution in Listing 2-2 make me cringe.

It's too long! Random methods with very long names make it hard to read on screens, such as in an IDE. You also end up jumping back and forth around the code, trying to figure out which method returns what and where the next bit of code goes. There is also the issue of creating entire methods for a single case. You could easily handle the decision making within the one method with better coding: check whether the count is more than 0 and handle each case there. We already have access to all the variables we need, passed as parameters, so why split this off into many methods? I hate it.

Extended BNF (EBNF)

N ::== a | (b | c)*

This can be read as "N may consist of a, or alternatively a series of zero or more characters, each of which is b or c". Valid strings matching this rule include a, b, c, bcb and the empty string.
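
As a quick way to check which strings the rule accepts, here is a small Java sketch. It relies on the fact that this particular rule has no recursion, so it can be written directly as a regular expression. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class EbnfRuleN {
    // The rule N ::== a | (b | c)* written as a regular expression.
    private static final Pattern N = Pattern.compile("a|(b|c)*");

    static boolean matchesN(String input) {
        // matches() requires the whole input to match, not just a prefix.
        return N.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "", "b", "bccb", "ab", "aa"}) {
            System.out.println("\"" + s + "\" -> " + matchesN(s));
        }
    }
}
```

Note that "ab" and "aa" are rejected: the rule offers a single a or a run of b and c characters, never a mixture.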

Example

We have a simple programming language that allows the user to write arithmetic sums for some existing variables a, b, c, d, or e. We can use +, -, /, * and () on the variables too.

Expression              ::== primary-Expression (Operator primary-Expression)*
primary-Expression      ::== Identifier
                            | (Expression)
Identifier              ::== a | b | c | d | e
Operator                ::== + | - | / | *

For example:

a + (b * c)
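
The grammar above can be turned into a recogniser almost mechanically, with one method per rule. The following is a rough Java sketch under a few assumptions: whitespace is ignored, the parentheses in primary-Expression are literal characters, and the class and method names are invented for illustration. It only answers yes or no; it does not build a tree.

```java
final class ExpressionRecognizer {
    private final String src; // input with all whitespace removed
    private int pos = 0;

    private ExpressionRecognizer(String input) {
        this.src = input.replaceAll("\\s+", "");
    }

    static boolean isValid(String input) {
        ExpressionRecognizer r = new ExpressionRecognizer(input);
        // Valid only if an Expression was recognised and nothing is left over.
        return r.expression() && r.pos == r.src.length();
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Expression ::== primary-Expression (Operator primary-Expression)*
    private boolean expression() {
        if (!primaryExpression()) return false;
        while ("+-/*".indexOf(peek()) >= 0) { // Operator ::== + | - | / | *
            pos++;
            if (!primaryExpression()) return false;
        }
        return true;
    }

    // primary-Expression ::== Identifier | (Expression)
    private boolean primaryExpression() {
        char c = peek();
        if (c >= 'a' && c <= 'e') { // Identifier ::== a | b | c | d | e
            pos++;
            return true;
        }
        if (c == '(') {
            pos++;
            if (!expression()) return false;
            if (peek() != ')') return false;
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValid("a + (b * c)"));
        System.out.println(isValid("a + "));
    }
}
```

The (Operator primary-Expression)* part of the rule becomes a while loop, and the recursive use of Expression inside the brackets becomes a recursive method call.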

Here you can view the Java specification as well as Python's! Go has a very good breakdown of its grammar, with lots of documentation!

Triangle

Features of Triangle

Examples of Triangle (.tri)

#hi.tri
begin
    put('H'); put('i'); put('!')
end

Output: Hi!

#while.tri
let
    var a : Integer
in
begin
    a := 0;
    while a < 5 do
    begin
        put('a');
        a := a + 1;
    end
end

Output: aaaaa

#str.tri
let
    type Str ~ array 10 of Char;
    
    func replicate (c: Char): Str ~
        [c,c,c,c,c,c,c,c,c,c];
        
  var s: Str
in

begin
    s := replicate('*');
    put (s[0]); put(s[9]); puteol()
end

Output: **

AST Examples

See OneNote folder (Year\ 3/AST/) for the answers

Syntactic Analysis

Let's have a look at Java's syntax

int myNumber = 55;

Kind is the category (Identifier, Integer, etc.). Spelling is the actual text used in the code.

Let's break it down into tokens:
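
One plausible breakdown is sketched below in Java, with each token stored as a kind plus a spelling. The Token class is hypothetical, and the kind names are an assumption that loosely follows the token categories in the Java Language Specification (keyword, identifier, operator, literal, separator).

```java
import java.util.List;

final class JavaTokenBreakdown {
    // Assumed kind names, chosen for this sketch only.
    enum Kind { KEYWORD, IDENTIFIER, OPERATOR, INTEGER_LITERAL, SEPARATOR }

    // A token pairs its kind (the category) with its spelling (the source text).
    static final class Token {
        final Kind kind;
        final String spelling;

        Token(Kind kind, String spelling) {
            this.kind = kind;
            this.spelling = spelling;
        }

        @Override
        public String toString() {
            return kind + " \"" + spelling + "\"";
        }
    }

    // Hand-written token list for: int myNumber = 55;
    static List<Token> tokens() {
        return List.of(
            new Token(Kind.KEYWORD, "int"),
            new Token(Kind.IDENTIFIER, "myNumber"),
            new Token(Kind.OPERATOR, "="),
            new Token(Kind.INTEGER_LITERAL, "55"),
            new Token(Kind.SEPARATOR, ";"));
    }

    public static void main(String[] args) {
        tokens().forEach(System.out::println);
    }
}
```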

Below is the same example in Triangle:

let
    var myNumber : Integer
in 
    myNumber := 55

Let's break it down into tokens again:

Scanning a string into tokens

The basic idea is that we have a loop, implemented using recursion, that gobbles up characters from the text until they match a known template for a token.

myNumber := 55+ 10
^

Our starting point is char 'm', so we assume that we are working along an Identifier. We continue working our way through to the end of the word (until we hit a space). At this stage, it is impossible to know if we are working with method names, string literals or anything of that nature.

myNumber := 55+ 10
~~~~~~~~^

We now check that string against a list of reserved words (such as if, end, while). If it were, for example, if, the kind of this token would be set to If. It doesn't match any known reserved word, so this token is labelled as an Identifier with the spelling "myNumber".

myNumber := 55+ 10
~~~~~~~~~^

We now keep moving along the spaces until we hit something other than a space. We've hit a colon. Two tokens in Triangle start with a colon: a plain colon, which separates a variable's name from its type, and colon-equals (the Becomes token). Let's take the next character, as we will know which one we have after reading it.

myNumber := 55+ 10
~~~~~~~~~~^

It is an equals sign, so together we have colon-equals. That tells us that the token is Becomes with the spelling ":=".

myNumber := 55+ 10
~~~~~~~~~~~~^

We hit a 5. Triangle's only numeric type is Integer (handy!), so we know that we can carry along this number until we hit something that isn't a digit (to get the full Integer). In this case, we move along until we hit the plus sign!

myNumber := 55+ 10
~~~~~~~~~~~~~~^

We've hit something that isn't a digit! This means that this token is of kind Integer Literal and has the spelling "55".

myNumber := 55+ 10
~~~~~~~~~~~~~~^

The next character is a plus. Operators can start with this character, so we keep taking characters until we reach something that can't be part of an operator. The next character is a space, so this token is of kind Operator with the spelling "+".

myNumber := 55+ 10
~~~~~~~~~~~~~~~~~~^

We hit another Integer (10) and reach the EOL (end of line). Most methods use recursion to keep reading until we reach the end of a line or the end of the file.
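
The walkthrough above can be sketched as a small scanner in Java. This is a simplified illustration, not the real Triangle scanner: it uses a plain loop instead of recursion, lumps all reserved words into one RESERVED_WORD kind rather than giving each its own kind, and the reserved-word list and operator characters are an assumed, incomplete subset. All class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class MiniScanner {
    enum Kind { IDENTIFIER, RESERVED_WORD, INTEGER_LITERAL, OPERATOR, COLON, BECOMES }

    static final class Token {
        final Kind kind;
        final String spelling;
        Token(Kind kind, String spelling) { this.kind = kind; this.spelling = spelling; }
        @Override public String toString() { return kind + " \"" + spelling + "\""; }
    }

    // Assumed subset of reserved words and operator characters for this sketch.
    private static final Set<String> RESERVED =
        Set.of("if", "end", "while", "let", "var", "in", "begin", "do");
    private static final String OPERATOR_CHARS = "+-*/<>=";

    static List<Token> scan(String text) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; } // skip spaces between tokens
            int start = i;
            if (Character.isLetter(c)) {
                // Gobble the whole word, then decide: reserved word or identifier?
                while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) i++;
                String word = text.substring(start, i);
                tokens.add(new Token(
                    RESERVED.contains(word) ? Kind.RESERVED_WORD : Kind.IDENTIFIER, word));
            } else if (Character.isDigit(c)) {
                // Carry on until the first non-digit.
                while (i < text.length() && Character.isDigit(text.charAt(i))) i++;
                tokens.add(new Token(Kind.INTEGER_LITERAL, text.substring(start, i)));
            } else if (c == ':') {
                // Look at the next character to tell ":" from ":=".
                i++;
                if (i < text.length() && text.charAt(i) == '=') {
                    i++;
                    tokens.add(new Token(Kind.BECOMES, ":="));
                } else {
                    tokens.add(new Token(Kind.COLON, ":"));
                }
            } else if (OPERATOR_CHARS.indexOf(c) >= 0) {
                // Keep taking operator characters until something else appears.
                while (i < text.length() && OPERATOR_CHARS.indexOf(text.charAt(i)) >= 0) i++;
                tokens.add(new Token(Kind.OPERATOR, text.substring(start, i)));
            } else {
                throw new IllegalArgumentException(
                    "Unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : scan("myNumber := 55+ 10")) System.out.println(t);
    }
}
```

Running it on myNumber := 55+ 10 yields five tokens in the same order as the walkthrough: an identifier, Becomes, an integer literal, an operator and a second integer literal.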

Syntactic Analysis: Parsing into an AST

Let's create a theoretical language, Micro-English. Here is the EBNF:

Sentence    ::== Subject Verb Object .
Subject     ::== I | a Noun | the Noun
Object      ::== me | a Noun | the Noun
Noun        ::== cat | mat | rat
Verb        ::== like | is | see | sees
(terminals are English lowercase words, e.g. like, the, etc.)

Bottom Up

   Noun
    |
the cat sees a rat.

Subject
 ____
 | Noun
 |  |
the cat sees a rat.

  S
 ____
 |  N   Verb
 |  |    |
the cat sees a rat.

  S
 ____
 |  N    V      N
 |  |    |      |
the cat sees a rat.

  S          Obj
 ____        ____
 |  N    V   |  N
 |  |    |   |  |
the cat sees a rat.

     Sentence
___________________
  |      |    |   |
  S      |   Obj  |
 ____    |   ____ |
 |  N    V   |  N |
 |  |    |   |  | |
the cat sees a rat.
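
The bottom-up steps drawn above can be imitated with a stack: shift one word at a time, then replace whatever is on top of the stack with a grammar symbol whenever a rule fits. The Java sketch below is a hand-written illustration with ad hoc context checks, not a table-driven parser, and it only recognises sentences rather than building the tree. It assumes the object pronoun is spelled me in lowercase; all names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class MicroEnglishBottomUp {
    // Grammar symbols placed on the stack once words have been reduced.
    enum Sym { NOUN, VERB, SUBJECT, OBJECT, SENTENCE }

    private static final Set<String> NOUNS = Set.of("cat", "mat", "rat");
    private static final Set<String> VERBS = Set.of("like", "is", "see", "sees");

    static boolean isSentence(String text) {
        List<Object> stack = new ArrayList<>(); // holds words (String) and Sym values
        for (String word : text.replace(".", " . ").trim().split("\\s+")) {
            stack.add(word);           // shift the next word
            while (reduce(stack)) { }  // reduce for as long as some rule applies
        }
        return stack.equals(List.of(Sym.SENTENCE));
    }

    // Applies at most one reduction to the top of the stack; true if one was made.
    private static boolean reduce(List<Object> stack) {
        int n = stack.size();
        Object top = stack.get(n - 1);
        Object below = n >= 2 ? stack.get(n - 2) : null;

        if (top instanceof String && NOUNS.contains(top)) { // Noun ::== cat | mat | rat
            stack.set(n - 1, Sym.NOUN);
            return true;
        }
        if (top instanceof String && VERBS.contains(top)) { // Verb ::== like | is | see | sees
            stack.set(n - 1, Sym.VERB);
            return true;
        }
        if ("I".equals(top) && n == 1) {                    // Subject ::== I
            stack.set(0, Sym.SUBJECT);
            return true;
        }
        if ("me".equals(top) && below == Sym.VERB) {        // Object ::== me
            stack.set(n - 1, Sym.OBJECT);
            return true;
        }
        if (top == Sym.NOUN && ("a".equals(below) || "the".equals(below))) {
            // "a Noun" / "the Noun" is a Subject at the start, an Object after a Verb.
            boolean atStart = n == 2;
            boolean afterVerb = n >= 3 && stack.get(n - 3) == Sym.VERB;
            if (atStart || afterVerb) {
                stack.remove(n - 1);
                stack.set(n - 2, atStart ? Sym.SUBJECT : Sym.OBJECT);
                return true;
            }
        }
        if (n == 4 && stack.get(0) == Sym.SUBJECT && stack.get(1) == Sym.VERB
                && stack.get(2) == Sym.OBJECT && ".".equals(top)) {
            stack.clear();             // Sentence ::== Subject Verb Object .
            stack.add(Sym.SENTENCE);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSentence("the cat sees a rat."));
        System.out.println(isSentence("cat sees a rat."));
    }
}
```

For the cat sees a rat. the reductions happen in the same order as the diagrams: Noun, Subject, Verb, Noun, Object, and finally Sentence once the full stop is shifted.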

Key definitions for compilers