CS 331 Spring 2013 > Lecture Notes for Monday, February 18, 2013
Now we begin the second section of the course: a look at various programming languages. As we study programming languages, we will consider how they differ in five main areas.
C++ programs spend much of their time modifying values.
Haskell, on the other hand, has no mutable values;
nothing can be modified.
Haskell also has first-class functions,
that is, functions that can be manipulated with the same
ease and versatility as (say) integers can
in almost any programming language.
We thus consider functions to be ordinary values in Haskell.
For flow of control, C++ has selection statements (if, switch)
and loops.
Haskell has no loops;
we use tail recursion.
An important consideration is how flow of control
is used in error handling.
C++ provides throw and try...catch,
for use in handling exceptions;
what do other languages do?
We will also discuss categories of programming languages: functional, concatenative, logic, etc.
The first programming language we will look at is Haskell, named for logician Haskell Curry. Haskell was created as a result of a meeting in 1987. Some members of the functional-programming community decided that their efforts were too fragmented; they created a single language intended to support research or development by large numbers of people. The language was standardized in 1998, with a new standard issued in 2010.
The 1998 standard was implemented in a simple interactive Haskell environment called Hugs; this was supported on all major platforms. There was also a more full-featured compiler: the Glasgow Haskell Compiler, or GHC. Development on Hugs seems to have stopped in the mid 2000s. Today, GHC supports an interactive environment, called GHCi, which is very similar to the old Hugs environment.
Haskell is intended to support functional programming (FP), a programming style which generally has the following characteristics.
One can do functional programming, in some sense, in just about any programming language. However, some languages support it better than others. C offers rather poor support. Support in C++ is improving; the C++11 standard has added features to enable FP. Python has better support, JavaScript’s is even better, and the various dialects of Lisp offer very good support. Haskell is pretty much at the top of the heap.
A functional language is a language designed to support functional programming well. No one calls C a functional language. Opinions vary about JavaScript. But everyone agrees that Haskell is a functional language.
Not only does Haskell support FP, it offers little support for anything else. Haskell is a pure functional language, meaning that it does not allow for side effects. Values in Haskell are not mutable; nothing can be modified.
Haskell also includes first-class functions. A data type is first-class if all operations on its values are fully available at runtime.
For example, in C++, int is a first-class type.
Consider what we can do with ints in C++.
[C++]
int a = b;
cout << 2+3;
Above, we declare a variable of type int and set it equal to another.
Then we operate on two ints to create a new int value that has no name.
Before the 2011 standard, we could do none of these things with functions in C++.
C++11 does allow for unnamed functions,
but it still does not permit us to manipulate functions
with quite the same ease as int values.
But functions are first-class in Haskell. Creating a list of functions, or writing a function that manipulates functions (a higher-order function) are common, ordinary operations in Haskell.
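As a rough sketch of what this looks like in practice, the source file below puts functions in a list and passes functions to other functions. The names double, functionList, applyAll, and twice are made up for illustration; they are not taken from the lecture's source code.

```haskell
-- Illustrative names only; none of these come from the lecture's code.

-- An ordinary function on Integers.
double :: Integer -> Integer
double x = 2 * x

-- A list whose items are functions. The second item is an unnamed
-- (lambda) function; the third is the standard function negate.
functionList :: [Integer -> Integer]
functionList = [double, \x -> x * x, negate]

-- A higher-order function: apply every function in a list to one value.
applyAll :: [Integer -> Integer] -> Integer -> [Integer]
applyAll fs x = map (\f -> f x) fs

-- Another higher-order function: it takes a function and returns
-- a new function that applies the original one twice.
twice :: (Integer -> Integer) -> (Integer -> Integer)
twice f = \x -> f (f x)
```

With this file loaded, the expression "applyAll functionList 3" evaluates to [6,9,-3], and "twice double 5" evaluates to 20.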
As mentioned above, Haskell is a pure functional language (no mutable data or side effects). It has first-class functions and thus supports higher-order functions.
It is difficult to do loops without things like loop counters. And, indeed, Haskell has no iterative constructs. It uses recursion instead, with tail recursion preferred. The latter will generally be optimized using tail call optimization (or TCO).
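To make the idea concrete, here is a minimal sketch of a computation that would be a loop in C++, written with tail recursion instead. The names sumTo and go are hypothetical, and the function assumes its argument is nonnegative.

```haskell
-- Sum of the integers 1 through n, written without a loop.
-- Assumes n >= 0; a negative n would recurse forever.
-- A C++ version would update a loop counter and a running total.
-- Here both are passed along as parameters of a helper function.
sumTo :: Integer -> Integer
sumTo n = go n 0
  where
    -- First parameter: how many numbers are left to add.
    -- Second parameter: the running total so far.
    go 0 total = total
    go k total = go (k - 1) (total + k)  -- tail call: nothing remains to do afterward
```

The recursive call in go is the last thing the function does, which is what makes it a tail call. For example, "sumTo 100" evaluates to 5050.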
Haskell has a simple syntax, without as much “punctuation” as C++. Here is a function call in C++:
[C++]
foo(a, b, c)
Here is a more-or-less equivalent function call in Haskell:
[Haskell]
foo a b c
Haskell has significant indentation. In C++, indenting is only for people who read the code; the compiler ignores it. In Haskell, indenting is one way to tell the compiler where a block begins and ends. Here is a function in C++:
[C++]
int bar(int a)
{
    int b = 7;
    int c = 42;
    return foo(a, b, c);
}
And here is a more-or-less equivalent function in Haskell:
[Haskell]
bar a = foo a b c where
    b = 7
    c = 42
Like C++, Haskell has static typing of both variables and values. Unlike C++, Haskell typing is mostly implicit; that is, types usually do not need to be specified. In C++, the typing is mostly explicit; we write
[C++]
int x = 3;
while in Haskell, we can simply say
[Haskell]
x = 3
The variable x still has a type (Integer in this case),
but the compiler is able to figure this out for us,
using a type inference algorithm
(the Hindley-Milner algorithm).
Similarly, a function in C++:
[C++]
bool blug(int a, int b) { return a == b+1; }
And in Haskell:
[Haskell]
blug a b = (a == b+1)
Haskell still allows for explicit typing, if desired. For example, we can say
[Haskell]
x :: Integer
x = 3
to mark x as an Integer explicitly.
This also allows for our intentions to be communicated to
the compiler.
So this is legal:
[Haskell]
s = "abc"
But this will not compile:
[Haskell]
s :: Integer
s = "abc"  -- Type error!
since "abc" is not an Integer value.
Haskell’s type system is sound,
meaning that operations not defined on a type
are not permitted.
In contrast, the type system of C++ is unsound
(which does not mean “bad”!),
since we can convert any type to any other,
using the various kinds of ..._cast functionality.
Note: Many people like to talk about strong(er) and weak(er) type systems. A type system is generally considered stronger if its rules are applied more strictly. But these terms are not used consistently, and I prefer to avoid them.
By default Haskell does lazy evaluation, meaning that expressions are not evaluated until they need to be. C++ does the opposite, evaluating as soon as an expression is encountered; this is called strict evaluation, or eager evaluation. For example, here is a C++ function:
[C++]
int f(int x, int y) { return x+1; }
Suppose we do “f(g(1), g(2))”.
This is executed as follows.
Function g is called with argument 1,
and it is called with argument 2;
C++ does not specify which of these two calls happens first.
Then function f is called.
Note that the value of g(2) is determined,
but not used.
Here is the corresponding Haskell function:
[Haskell]
f x y = x+1
We can call f as we did in C++,
passing it two values of function g:
“f (g 1) (g 2)”.
But since f never uses its second parameter,
the expression “(g 2)” will never be evaluated.
Indeed, if the return value of f is not used,
then “(g 1)” will never be evaluated either.
Later we will see that lazy evaluation has other interesting consequences.
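One way to observe this behavior is sketched below. Since no function g is defined in these notes, the sketch substitutes the standard Prelude value undefined for the second argument; evaluating undefined raises an error, so getting a result at all shows that the argument was never evaluated. The name lazyResult is made up for this example.

```haskell
-- The function from the text: its second parameter is never used.
f :: Integer -> Integer -> Integer
f x y = x + 1

-- undefined is a standard Prelude value; evaluating it raises an error.
-- Because f never looks at its second argument, that error never occurs.
-- A strict language would evaluate both arguments first and fail here.
lazyResult :: Integer
lazyResult = f 2 undefined
```

With this file loaded, typing "lazyResult" at the GHCi prompt prints 3, with no error.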
GHC is an AOT compiler that usually generates machine code. The GHCi interactive environment does JIT compilation to a bytecode, which is then interpreted.
GHCi allows for the loading of source files, as well as evaluation of Haskell expressions that are entered interactively. This kind of environment is often called a Read-Eval-Print Loop, or REPL (the term comes from Lisp).
Haskell source code is stored in text files
whose name ends in “.hs”.
On the command line, GHC is used much like g++
or any other command-line compiler.
[*ix command line]
ghc myprog.hs -o myprog
If there are no errors, then an executable named “myprog”
will be generated.
Running that file will execute function main in module Main.
Of course, if you are not using the command line, then things will be handled differently. GHC is supported by various IDEs, including Eclipse.
GHCi is a program you run,
which presents you with a prompt.
Commands for the environment all begin with colon (“:”).
Some important ones:
:l FILENAME.hs
    Load a source file.
:r
    Reload the most recently loaded source file.
:t EXPRESSION
    Show the type of an expression.
:i IDENTIFIER
    Show information about an identifier.
:e FILENAME.hs
    Edit a source file in an external editor.
I tend not to use the :e command, preferring to start
an editor in the usual way.
But you may find it convenient, or not.
If you type something at the GHCi prompt that does not begin with a colon, then it is taken as a Haskell expression. This is evaluated, and its value is printed.
When a compiled Haskell program is executed,
function main in module Main is called.
To do this with GHCi, load (:l) the source file,
and then type “Main.main”
(usually just “main” works).
On the other hand, the interactive environment gives you the ability
to call any function defined in a source file,
not just main.
A Haskell expression is a stream of words separated by blanks where necessary, with optional parentheses for grouping. For example:
[Haskell]
2+3
(2+3)*5
reverse "abcde"
map (\ x -> x*x) [1,2,3,4]
Each line above is a single Haskell expression. Type it at the GHCi prompt and press Enter to see its value.
Single-line comments begin with two dashes (“--”)
and continue to the end of the line.
The two dashes must not form part of a legal lexeme;
thus, I encourage you to put a blank after them.
Multi-line comments begin with “{-”
and end with “-}”.
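The short file below sketches both comment forms, along with the reason for the advice about the blank. The variable answer is a placeholder, included only so that there is some code to comment on.

```haskell
-- A single-line comment: two dashes, then a blank.

{- A multi-line comment.
   It continues until the closing delimiter. -}

answer :: Integer
answer = 42  -- A comment may also follow code on the same line.

-- Without the blank, the dashes can join with other symbol characters
-- to form an operator name. For example, neither "-->" nor "|--"
-- begins a comment when it appears in code.
```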
A Haskell identifier
begins with a letter or underscore (“_”)
and includes only letters, digits, underscores
and single quotes (“'”).
(That last character is because Haskell was designed by mathematicians;
they want to be able to write “y'”.)
“Normal” Haskell identifiers begin with
lower-case letters or the underscore;
these name variables and functions.
“Special” identifiers begin with
upper-case letters;
these name types, modules, and constructors.
(Recall that function main goes in module Main.)
Haskell allows new operators to be defined. The names of these must consist of special characters:
! # $ % & * + . / < = > ? @ \ ^ | - ~ :
The “normal” ones, used for infix binary operators, do not begin with colon. The “special” ones, used for constructors, begin with colon.
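As an illustration of the “normal” kind only, here is a sketch that defines a new infix binary operator. The operator name |-| is invented for this example; it is not a standard Haskell operator.

```haskell
-- A made-up operator: the distance between two Integers.
-- Its name uses only special characters and does not begin with a colon.
-- In a type declaration, the operator name is wrapped in parentheses.
(|-|) :: Integer -> Integer -> Integer
a |-| b = abs (a - b)
```

With this file loaded, "7 |-| 10" evaluates to 3. Wrapping the operator in parentheses lets it be called like an ordinary function: "(|-|) 10 7" also evaluates to 3.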
We define a variable (note that there are no mutable values; “variables” cannot vary) by giving its name, an equals sign, and an expression giving its value.
[Haskell]
a = 3
myNineVariable = 4+5
The above are not expressions in Haskell. They are not legal at the GHCi prompt, and must be typed in a source file.
Once the source file is loaded, we can use the identifiers
(I use “>” to represent the GHCi prompt).
[Interactive Haskell]
> a
3
> a+5
8
> :t a
a :: Integer
Note that a has a type,
even though we did not declare it.
But we can declare it if we want.
[Haskell]
b :: Integer
b = 3
Here is an alternate form, which gives the type of the value, rather than the variable.
[Haskell]
c = 3::Integer
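Some important numeric types are the following.
Int
    Fixed-size integer, much like C++ int.
Integer
    Arbitrarily large integer.
Double
    Double-precision floating point, much like C++ double.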
The difference between Int and Integer
can be awe-inspiring.
Note that “^” is the Haskell exponentiation operator.
[Interactive Haskell]
> (3::Int)^1000
-742892767
> (3::Integer)^1000
1322070819480806636890455259752144365965422032752148167664920368226828597346704899540778313850608061963909777696872582355950954582100618911865342725257953674027620225198320803878014774228964841274390400117588618041128947815623094438061566173054086674490506178125480344405547054397038895817465368254916136220830268563778582290228416398307887896918556404084898937609373242171846359938695516765018940588109060426089671438864102814350385648747165832010614366132173102768902855220001
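The exact value printed for the Int computation depends on how many bits an Int has on the machine in use, so the negative number above may not be reproduced everywhere. The sketch below shows the underlying difference without depending on a particular size. The names largestInt, wrapped, and notWrapped are made up, and the wraparound described is what GHC does, not something every Haskell implementation is required to do.

```haskell
-- Int has a fixed size; the standard value maxBound gives its upper limit.
largestInt :: Int
largestInt = maxBound

-- In GHC, adding 1 to the largest Int wraps around to the smallest Int.
wrapped :: Int
wrapped = largestInt + 1

-- Integer has no fixed size, so the same addition gives the expected result.
-- toInteger is the standard conversion from Int to Integer.
notWrapped :: Integer
notWrapped = toInteger largestInt + 1
```

With this file loaded in GHCi, "wrapped == minBound" evaluates to True, while notWrapped is one greater than the largest Int.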
We define a function just like a variable, except that we include parameters.
[Haskell]
square x = x*x
Now we can do this.
[Interactive Haskell]
> square 5
25
The above allows any number as a parameter.
We can restrict this to Integer values,
as follows.
[Haskell]
square2::Integer -> Integer
square2 x = x*x
Note that function application has very high precedence.
[Interactive Haskell]
> square 5 - 3
22
> (square 5) - 3
22
> square (5 - 3)
4
> square -6  -- subtract 6 from function square (makes no sense)
[error message printed here]
> square (-6)
36
A very useful technique for defining functions is pattern matching. For a given argument, the first matching pattern is used. Here is a Fibonacci-number computation function.
[Haskell]
fibo 0 = 0
fibo 1 = 1
fibo n = fibo (n-1) + fibo (n-2)
The above is very inefficient. In C++ we would prefer an iterative version, but Haskell has no loops. But we can still do a fast Fibonacci function.
[Haskell]
fiboFast n = first (advance n [0,1]) where
    advance 0 [a,b] = [a,b]
    advance n [a,b] = advance (n-1) [b,a+b]
    first [a,b] = a
Function advance above is tail-recursive.
Note that we use significant indentation
to define our block.
The where keyword introduces local definitions.
Lazy evaluation allows easy definition of infinite data structures.
[Haskell]
allnonneg = [0..]
The above is a list of all nonnegative integers.
If we print the list
(type “allnonneg” at the GHCi prompt),
then it goes on forever,
but we can look at just the first few.
Here are the first 20.
[Interactive Haskell]
> take 20 allnonneg
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
Now we can apply fiboFast to each number in the infinite list.
[Haskell]
allfibos = map fiboFast [0..]
The result is a list of all Fibonacci numbers. Again, we can print the first 20.
[Interactive Haskell]
> take 20 allfibos
[0,1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,1597,2584,4181]
Notice what is going on here: we have done a computation using an infinite list as an intermediate result. With lazy evaluation, this is not a problem.
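The same idea works when we do not know in advance how many items we want. The sketch below repeats fiboFast and allfibos so that the file stands alone (the type declarations and the inner parameter name k are additions), and then defines a hypothetical function fibosBelow using the standard function takeWhile.

```haskell
-- Definitions repeated from above so that this file stands alone.
-- The type declarations are additions for clarity.
fiboFast :: Integer -> Integer
fiboFast n = first (advance n [0,1]) where
    advance 0 [a,b] = [a,b]
    advance k [a,b] = advance (k-1) [b,a+b]
    first [a,b] = a

allfibos :: [Integer]
allfibos = map fiboFast [0..]

-- All Fibonacci numbers below a given limit (a made-up example function).
-- takeWhile stops at the first item that fails the test, so only a
-- finite part of the infinite list is ever computed.
fibosBelow :: Integer -> [Integer]
fibosBelow limit = takeWhile (\x -> x < limit) allfibos
```

With this file loaded, "fibosBelow 100" evaluates to [0,1,1,2,3,5,8,13,21,34,55,89].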
See haskell_intro.hs for Haskell source code related to today’s lecture.
ggchappell@alaska.edu