Chapter 2:
Data Types, Syntax and Evaluation

In traditional statement-oriented languages, like C or BASIC, you write an algorithm as a series of operations for the computer to perform in sequence. The task of programming in these languages ultimately revolves around writing statements. Each statement is a command or action written for an effect, like storing a value into a variable, or transferring control in a loop. Object-oriented languages like C++ or Java allow you to further structure the program in terms of the interaction of objects, but even so you are ultimately writing statements about actions and effects, whether on user-defined objects or built-in data types.

In contrast, Daisy is an expression-oriented language. In general, an expression is written for a value and not an effect. This is true for imperative languages like C or BASIC too; after all, they also have expressions, and rules for how expressions evaluate to values. But the major distinction here is that in those languages expressions are just a part of statements, and writing a statement is the goal. In Daisy, the expression is the goal, and not an intermediate step. You write programs by combining expressions together to make larger, more complex expressions. Eventually, you wind up with an expression that computes the result that you want and, presto! You have written a program.

For programmers weaned on imperative (statement-oriented) languages, this seemingly small twist on things actually requires a fairly major mind-shift for programming work. You must take a more functional view of the computation. You still think in terms of the input data (if any) and what you want to get as output data (if any), but the process involves the transformation of data. What result data (and in what format) do I want, and how can I write a complex expression

Here are some simple Daisy expressions. An expression can stand by itself, or be combined with other expressions to form a single, bigger (compound) expression. A number, 45; the list containing the symbols "string" and "beans"; and an expression introducing a local variable named x with the initial value 45 and an expression for incrementing x:

45	["string" beans]	let:[x 45 inc:x]
An expression appearing in an evaluation context is reduced to a value. An evaluation context is any place where this reduction can occur. For example, in the top-level evaluation context the above expressions would evaluate to:
45	[string |ubi:beans|]	46
In some scripting languages, non-evaluation is the default behavior and evaluation must be explicitly indicated. For example, in a shell script you might put a $ prefix in front of a variable name. In Daisy, evaluation is the default behavior, so almost any expression will be in an evaluation context and we have to quote expressions if we want to avoid evaluation. We'll get back to quotation later; for now, let's look at some basic syntax for expressions and their evaluation rules.

Comments

Comments are not really expressions, but they are an important part of any language for program documentation and readability. Comments begin with a vertical bar character and run to the end of the current line, like so:
| This is a comment, it will be
| stripped out by the Daisy parser.
Comments are removed by the Daisy parser and thus do not have any effect on the program. The two-character sequence #! (pound sign-exclamation mark) can be used interchangeably with vertical bar to introduce a comment. This is especially useful for writing scripts under Unix, as in:
#!/usr/local/bin/daisy
| This is the beginning of an executable Daisy script under Unix
...

Atomic Data Types

Atomic data types are like primitive data types in languages such as C or Java. They are elemental and cannot be further subdivided into parts. Daisy's set of atomic data types includes numbers, strings, symbols and closures (functions).

Numbers

Daisy has three kinds of numbers: directives, integers and floating point values. A number evaluates to itself. That means that in an evaluation context a number stands for itself, and is not interpreted in any other way. This treatment is consistent with most other programming languages.

Integers are expressed in standard base 10 notation. Floating point values can be expressed in C-style floating point notation (i.e. either decimal or exponential). Both integers and floating point values are currently limited to 32 bits of precision.

Examples:

45              | regular integer
-1              | negative value
-58.23          | floating point value

Directives are unsigned 24-bit values stored in an efficient "unboxed" format. They are expressed by prefixing an unsigned integer with the pound sign (#) character.

#45             | the directive '45'
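
The three kinds of numeric literal can be told apart mechanically. The following Python sketch is only an illustration of the rules stated above, not part of Daisy: the helper name `classify_number` is hypothetical, and it leans on Python's own `int` and `float` parsers, which may accept or reject some spellings differently from the Daisy parser.

```python
# Hypothetical helper, not part of Daisy: classifies a numeric token
# according to the three kinds of numbers described above.
DIRECTIVE_LIMIT = 2 ** 24  # directives are unsigned 24-bit values


def classify_number(token):
    """Return (kind, value) for a numeric token, or raise ValueError."""
    if token.startswith("#"):
        digits = token[1:]
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError("a directive is # followed by an unsigned integer")
        value = int(digits)
        if value >= DIRECTIVE_LIMIT:
            raise ValueError("directive does not fit in 24 bits")
        return ("directive", value)
    try:
        return ("integer", int(token))
    except ValueError:
        # Not an integer, so try decimal or exponential notation.
        return ("float", float(token))
```

Under these assumptions, `classify_number("#45")` yields a directive, `classify_number("-1")` an integer, and `classify_number("-58.23")` a floating point value.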

Strings (Errons)

Daisy has a string type that contains a packed sequence of bytes. However, this type is not very useful for general programming, because it is treated as an error value, or erron, by most Daisy primitive operations. Errons are typically generated when a primitive is given an improper value, such as passing a non-number to an arithmetic primitive, or when an unbound variable (an undefined variable) is evaluated.

Attempting to do something with an erron generally results in a new erron being generated that includes the first erron as a suffix. Since errons themselves are rarely legal values, one generated deep inside a nested expression or function call will expand as it bubbles up through the evaluation stack, resulting in something of a stack dump when it finally gets written out at the top level.
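
The way an erron grows as it bubbles up can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Daisy's implementation: the `Erron` class, the Python `add` function, and the "add: " prefix format are all hypothetical. The only behavior taken from the text is that a primitive given an erron produces a new erron with the old one as a suffix.

```python
class Erron:
    """Stand-in for a Daisy erron: an error value carrying a message."""

    def __init__(self, message):
        self.message = message


def add(x, y):
    # Hypothetical arithmetic primitive; the "add: " prefix is an assumed format.
    for arg in (x, y):
        if isinstance(arg, Erron):
            # Wrap the incoming erron: the old message becomes a suffix.
            return Erron("add: " + arg.message)
    for arg in (x, y):
        if not isinstance(arg, (int, float)):
            return Erron("add: non-number " + repr(arg))
    return x + y


inner = Erron("Unbound identifier: tommy-boy")
outer = add(1, add(2, inner))
print(outer.message)  # add: add: Unbound identifier: tommy-boy
```

Each enclosing call adds one prefix, which is why the final message reads like a trace of the calls the erron passed through.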

Most of Daisy's string-like primitives operate on symbols instead of strings.

Symbols

We have referred to Daisy as a symbolic programming language, which means it is adept at manipulating symbols as well as numbers. Symbols combine the concepts of strings and variables, and serve both purposes in Daisy. Each symbol has a name (which is a string; see above) and a value, or binding (but note that the binding may be undefined). Like a string, a symbol can be passed around as an object and its name can be examined and manipulated with string-like operations.

A symbol appearing in an evaluation context is assumed to be a variable identifier; it evaluates to the value, if any, of the variable it represents. An identifier is introduced syntactically by a sequence of alphanumeric characters beginning with a letter, and optionally containing any of the following characters:

~ % & _ - + / ?
If you want an identifier to begin with a non-letter, or include other characters not in the above set, you can escape the offending character by prefixing it with a back quote (`).

Examples:

tommy-boy
tommy_boy
tommy`!boy      | the identifier 'tommy!boy'
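
One reading of the identifier rule can be written as a regular expression. The Python sketch below is an assumption-laden illustration, not the Daisy lexer: the names `IDENTIFIER` and `identifier_name` are hypothetical, "letter" is taken to mean an ASCII letter, and a back quote is assumed to escape exactly one following character.

```python
import re

# Assumed reading of the rule above: a letter, then letters, digits, or one of
# ~ % & _ - + / ?, where a back quote escapes any single following character.
IDENTIFIER = re.compile(r"(?:[A-Za-z]|`.)(?:[A-Za-z0-9~%&_\-+/?]|`.)*")


def identifier_name(text):
    """Return the identifier's name with escapes removed, or None if invalid."""
    if IDENTIFIER.fullmatch(text) is None:
        return None
    # Drop each back quote and keep the character it escaped.
    return re.sub(r"`(.)", r"\1", text)
```

With this sketch, `identifier_name("tommy`!boy")` gives the name `tommy!boy`, while an unescaped `tommy!boy` is rejected.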

Evaluating an unbound symbol (i.e. a variable without a value) results in an erron:

Unbound identifier: tommy-boy

To prevent a symbol from being interpreted as a variable (i.e. to treat it strictly as a string-like value) it must be quoted. One way to quote a symbol is by enclosing it in double quotes. This also allows you to include spaces and other special characters in the symbol's name.

Examples:

"Red fire truck"
"[This looks [[like a] list but it's] a symbol!"
A quoted symbol evaluates to the symbol itself, not to the value of the symbol. We will return to the topic of quoted expressions later.

Characters

Daisy's characters are just single-character symbols, sometimes referred to as singletons. Daisy has primitives that return a list of singletons corresponding to a multi-character symbol, but unlike, say, C's characters and strings, singletons and symbols are not fundamentally different data types.

When a Daisy operation is said to take a character, it means you should supply a single-character symbol.

Compound Data

Compound data are the set of built-in data structures supported by Daisy. These contain, or are composed of, other compound or atomic data.

Lists

[ E0 E1 ... En ]
[ E0 E1 ... En-1 ! En ]
[ E0 E1 ... En * ]
[ ]
Lists are a built-in, ubiquitous data type in Daisy. Lists are expressed as a sequence of one or more expressions enclosed within square brackets. Daisy lists are composed of cells, each cell containing a head and a tail field (called the car and cdr in Lisp). In a typical list, the head of each cell contains a list element and the tail contains the rest of the list.
[1 2 3 4 5]           | list of numbers
[x y [z delta] 2.5]   | a more complex list
[]                    | nil
The empty list denotes the system null terminator, nil, which is an atomic object. By default, a list is terminated by nil. There are two exceptions. If the final expression is prefixed with an exclamation point, that expression denotes the tail of the list. If the final element is an asterisk, the list has a self-referential, one-cycle (infinitely repeating) tail:
[1 2 3 4 ! 5]         | an "improper" list
[a ! b]               | a "dotted pair"
[x ! [y ! [z ! []]]]  | same as [x y z]
[1 2 3 *]             | [1 2 3 3 3 ...]
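
The cell structure behind these four forms can be modeled in Python. This is a minimal sketch, not Daisy's representation: `Cell`, `NIL`, `build` and `take` are hypothetical names, and Python's `None` stands in for nil.

```python
class Cell:
    """One list cell with a head and a tail (car and cdr in Lisp)."""

    def __init__(self, head, tail):
        self.head = head
        self.tail = tail


NIL = None  # stand-in for nil, the empty list


def build(elements, tail=NIL, star=False):
    """Build cells for [e0 e1 ... ! tail], or [e0 e1 ... *] when star is True."""
    cells = [Cell(e, NIL) for e in elements]
    for cell, following in zip(cells, cells[1:]):
        cell.tail = following
    if not cells:
        return tail
    # A star makes the last cell its own tail; otherwise use the given tail.
    cells[-1].tail = cells[-1] if star else tail
    return cells[0]


def take(lst, n):
    """Collect up to n heads, stopping early at a non-cell tail."""
    out = []
    while isinstance(lst, Cell) and len(out) < n:
        out.append(lst.head)
        lst = lst.tail
    return out


proper = build(["x", "y", "z"])                 # [x y z]
nested = Cell("x", Cell("y", Cell("z", NIL)))   # [x ! [y ! [z ! []]]]
improper = build([1, 2, 3, 4], tail=5)          # [1 2 3 4 ! 5]
cyclic = build([1, 2, 3], star=True)            # [1 2 3 *]
print(take(cyclic, 6))  # [1, 2, 3, 3, 3, 3]
```

In this model `proper` and `nested` have the same shape, the last cell of `improper` has 5 rather than nil as its tail, and the last cell of `cyclic` points back at itself.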
Evaluating a list performs a list construction: a new list is built using the current list as a template. Each expression in the list is evaluated and a list is built of the results. We will have more to say on list construction later.

Other Expressions

Assignment

identifier = exp
An assignment expression gives a value to a symbol. Daisy's treatment of assignment differs significantly from most conventional programming languages. Assignment expressions can only appear at the top level of a program. It is not possible, for example, to assign a variable inside a function body. While this restriction may seem severe, it is necessary to ensure that Daisy has the ability to extract as much implicit parallelism as possible from the program. Daisy programs are typically written in a functional style, where assignment is only used to provide names for the functions that are defined at top level. Many languages return the value (right-hand side) as the result of an assignment expression. In contrast, Daisy returns the left-hand side (the identifier).

Examples:

pi = 3.1415
path = "/usr/local/lib/daisy"

Functions (Closures)

\ Eformals . Ebody
Functions play a central role in Daisy programming. A function expression is introduced by a backslash character followed by two expressions, a formals and a body, separated by a period. The formals expression is either a single identifier or an arbitrarily-nested list of identifiers naming the formal parameters to the function. The body may be any valid Daisy expression. It may contain variables corresponding to the formal parameters.

A function expression evaluates to a closure, which captures the current environment (the current bindings of any local variables in the current lexical scope) with the body expression on the right-hand side. When a closure is applied to an argument (see Application, below) the environment stored within the closure is combined with the function argument to form a new environment. The body expression is then evaluated, using the extended environment to resolve the values for any parameters or local variables.
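
The capture-then-extend behavior described above can be sketched in Python. Everything here is a hypothetical model rather than Daisy's implementation: the names `bind` and `Closure` are invented, environments are plain dictionaries, the body is represented by a Python function that receives the environment, and the handling of mismatched formals and arguments (silently ignored here) is an assumption the text does not address.

```python
def bind(formals, argument, env):
    """Match a formals pattern (a name or nested list of names) against an argument."""
    if isinstance(formals, str):
        env[formals] = argument
    else:
        for sub_formals, sub_argument in zip(formals, argument):
            bind(sub_formals, sub_argument, env)
    return env


class Closure:
    def __init__(self, formals, body, env):
        self.formals = formals
        self.body = body      # here: a Python function taking an environment
        self.env = dict(env)  # environment captured at creation time

    def apply(self, argument):
        # Combine the captured environment with the argument bindings,
        # then evaluate the body in the extended environment.
        extended = bind(self.formals, argument, dict(self.env))
        return self.body(extended)


add_y = Closure("x", lambda env: env["x"] + env["y"], {"y": 3})
total = Closure(["a", ["b", "c"]],
                lambda env: env["a"] + env["b"] + env["c"], {})
```

Here `add_y.apply(4)` resolves `x` from the argument and `y` from the captured environment, and `total.apply([1, [2, 3]])` shows a nested formals list being matched against a nested argument.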

Examples:

add3 = \x. add:[x 3]  | a function that adds 3 to x

Daisy has first-class functions, which means that functions can be handled like any other primitive value; they can be returned as the value of another function or primitive, can be passed as arguments to other functions, can be stored in structures, and so on. That is one reason a function expression does not need a name for the function itself; instead, we simply assign the value of the function expression to a variable like we might any other value.

Examples:

| adder takes a number y, and returns a function
| that takes a number x and returns x + y
adder = \y. \x. add:[x y]

If this terminology sounds confusing, just remember the following:

  1. All functions return a value (which can be a list).

  2. All functions take a single parameter (which can be a list).

  3. Functions can be created anytime, not just at compile time. When you evaluate a function inside some nested variable scope at run time it captures those local variables and can refer to them in the function body. The resulting closed function object is called a closure.

  4. Functions (closures) can (for the most part) be treated like other data.
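
For readers who know Python, point 3 corresponds closely to Python's own closures. The sketch below is an analogy only: it mirrors the `adder` example using Python's `+` in place of Daisy's `add` primitive.

```python
def adder(y):
    # Returns a new function that remembers the y it was created with.
    return lambda x: x + y


add3 = adder(3)
print(add3(4))  # 7
```

Each call to `adder` produces a distinct function object with its own captured `y`, and that object can be stored, passed or returned like any other value.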

Application

Efunc : Earg
An application expression denotes the application of a function object, or closure, to an argument. Application is expressed by two expressions representing these entities separated by an infix colon. The expression on the left-hand side must evaluate to a closure; the expression on the right can evaluate to any valid Daisy value. The closure may enforce further type restrictions on its arguments, of course.

Examples:

add:[x y]
inc:z

Quotation

^Eany
Any Daisy expression can be quoted by prefixing it with a caret or hat character (^). A quoted expression evaluates to itself, stripped of the quote. Quotation is used to suppress evaluation when you want to refer to the expression as a piece of data in an evaluation context, such as the name of a symbol instead of its value.

Examples:

^45                   | a number of
^hello                | examples of  
^[a b [c d] e]        | quoted expressions
^a:b
^^hello               | doubly quoted

As noted previously, symbols can also be quoted with double quotes.
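
The rule that evaluation strips exactly one quote can be modeled as follows. This Python sketch is an illustration under assumptions, not Daisy's evaluator: `Quote` and `evaluate` are hypothetical names, identifiers are represented as Python strings, and only numbers, identifiers and quoted expressions are handled.

```python
class Quote:
    """Hypothetical representation of ^E: a wrapper around an unevaluated expression."""

    def __init__(self, expression):
        self.expression = expression


def evaluate(expression, bindings):
    if isinstance(expression, Quote):
        return expression.expression  # strip exactly one quote, do not evaluate
    if isinstance(expression, str):
        return bindings[expression]   # an identifier: look up its value
    return expression                 # numbers evaluate to themselves
```

In this model, evaluating `Quote("hello")` returns the name `hello` rather than its binding, and evaluating `Quote(Quote("hello"))` returns a value that is still quoted once.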

Parentheses

( Eany )
The compound expressions for application, functions and quotation all have an ambiguous grammar, which is resolved by associating to the right by default. You can override the default grouping by enclosing any expression in parentheses:
foo: bar: baz         | By default these two 
foo: (bar: baz)       |   are the same.
(foo: bar): baz       | Overriding the default.
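
Right association means a chain of applications is evaluated from the rightmost argument outward. The Python sketch below models only that grouping rule. The helper `apply_chain` and the functions `inc` and `double` are hypothetical and stand in for arbitrary closures.

```python
def apply_chain(*terms):
    """Group terms as a right-associative chain: f: g: x means f: (g: x)."""
    *functions, argument = terms
    result = argument
    for function in reversed(functions):
        result = function(result)
    return result


def inc(n):
    return n + 1


def double(n):
    return n * 2


print(apply_chain(double, inc, 5))  # 12
```

The call above corresponds to `double: inc: 5`, that is, `double: (inc: 5)`. The parenthesized form `(double: inc): 5` would instead apply `double` to the function `inc` itself, which only makes sense when the left-hand function accepts and returns functions.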