CS 331 Spring 2023
A Quick Introduction to Forth
By Glenn G. Chappell
Text updated: 2023-02-28
Table of Contents
- The Forth Programming Language
- The Stack
- Arithmetic
- Compiled Words
- Stack-Effect Notation
- Stack Manipulation
- Strings
- Named Parameters
- Flow of Control: Selection
- Flow of Control: Iteration
- Topics Not Covered
- Copyright & License
1. The Forth Programming Language
History
Forth is a programming language developed in the 1960s by Charles H. Moore at the U.S. National Radio Astronomy Observatory. Forth rose to prominence in the 1970s, finding success in various low-level applications involving controlling hardware connected to a computer. In the late 1970s and early 1980s, Forth’s low memory requirements made it suitable for early microcomputers. A Forth interpreter was often the first nontrivial program to be executed on newly developed computer hardware; in some cases, it still is.
Forth was standardized in 1983. In 1994 ANSI issued another Forth standard. A free implementation of this is available from the GNU project: Gforth. However, there exist many mutually incompatible versions of Forth.
Interest in Forth has waned since the late 1980s, but Forth has influenced a number of prominent programming languages. Notable among these are PostScript, a page-description programming language developed by Adobe Systems beginning in 1982, and Factor, a dynamic programming language that first appeared as a scripting language for games in 2003, and has since grown into a useful general-purpose programming language.
Characteristics
Forth is a concatenative programming language, meaning that the concatenation of two valid Forth programs is another valid Forth program, with the return values of the first part becoming the parameters of the second part. Like most concatenative programming languages, Forth is stack-based, meaning that parameters and return values are all passed via a stack.
Forth programs are generally written in an imperative style: they consist primarily of instructions that tell a computer what to do.
Forth has an extremely simple syntax: programs consist of sequences of words, which are strings of non-space characters separated by space.
Forth was an early example of an extensible programming language. New functionality can be defined; this has equal status with previously defined functionality. In particular, Forth allows programmers to create new flow-of-control constructs and new words that define new words.
ANSI Forth has a very limited notion of type.
The majority of Forth operations involve values that
are essentially ints.
These are used as numbers, booleans, pointers, and,
with some caveats, characters.
There is also support for floating-point values.
However, ANSI Forth has no type checking;
rather, values of different types
are handled via different constructions.
Execution & Conventions Used in This Article
As with many modern programming languages,
Forth can be used interactively.
Forth programs can also be stored in source files
and compiled for later execution.
The filenames of Forth source files traditionally end with
the suffix “.fs”.
The rest of this article describes ANSI Forth. Examples will mostly use the Gforth interactive environment. As we would expect, this environment allows the user to type in Forth code, which is immediately executed.
Source files can be loaded into the interactive environment.
For example, to load the file myprog.fs,
type the following.
[Interactive Forth]
include myprog.fs
Other source files can be loaded in a similar manner. When a file is loaded, the code in it is executed. After loading, words defined in the file can be used interactively.
In the interactive Forth examples in this document,
boldface typewriter-font characters
are those typed by the user.
Brownish non-bold typewriter-font characters
show Gforth output.
As you read this introduction,
I recommend that you follow along with Gforth.
2. The Stack
Typing a number in Gforth pushes that number
on the stack.
The word “.s” shows the current stack,
with the top of the stack to the right.
[Interactive Forth]
37 ok
-3 17 8 100 ok
52 ok
.s <6> 37 -3 17 8 100 52 ok
Above, the number in angle brackets is the current size of the stack.
The “ok”
indicates that all processing is done,
and Gforth is ready for input again.
Note that, to Forth, all space is the same:
blanks and newlines are treated identically—outside
of string literals and comments.
“drop” pops the top item off the stack.
“clearstack” clears the stack.
[Interactive Forth]
drop ok
.s <5> 37 -3 17 8 100 ok
-333 .s <6> 37 -3 17 8 100 -333 ok
drop .s <5> 37 -3 17 8 100 ok
drop .s <4> 37 -3 17 8 ok
drop .s <3> 37 -3 17 ok
clearstack ok
.s <0> ok
3. Arithmetic
The word “+”
pops the top two items off the stack,
adds them,
and pushes the result.
[Interactive Forth]
\ Comments begin with backslash, continue to end-of-line ok
\ I assume the stack is clear now ok
31 8 ok
.s <2> 31 8 ok
+ ok
.s <1> 39 ok
\ We have added 31 and 8 to get 39 ok
The word “.”
pops the top number off the stack and prints it,
followed by a blank.
Thus we do not have to keep looking at the whole stack.
[Interactive Forth]
31 8 + . 39 ok
Words
“-”,
“*”, and
“/”
perform the usual arithmetic operations.
The division is integer division.
Combining these words allows us to do more complex computations.
Suppose we wish to compute (1+2) × (5+6).
The first sum is “1 2 +” in Forth.
This pushes the result (3) on the stack.
The second sum is “5 6 +”.
This pushes another result (11) on the stack.
Now we have 3 and 11 on the stack.
Doing “*” multiplies,
pushing the final result (33).
Lastly,
“.” pops and prints it.
[Interactive Forth]
1 2 + 5 6 + * . 33 ok
Here is an example computation in C++.
[C++]
cout << (23 + 17) * (8 - 27 + 6) + 19 * 25 * 6;
Here is the same computation in Forth.
[Interactive Forth]
23 17 + 8 27 - 6 + * 19 25 * 6 * + . 2330 ok
You may recognize this style of writing an expression as postfix form, also known as reverse Polish notation (RPN). Notice that we do not need to worry about precedence and associativity. Forth needs no parentheses!
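As a further illustration, the following sketch traces the stack step by step while evaluating one more expression. The expression (7 + 5) / (4 - 1) is chosen here purely for illustration, and the comments show what the stack should hold after each line.

```forth
\ Infix: (7 + 5) / (4 - 1)
7 5 +     \ stack: 12
4 1 -     \ stack: 12 3
/         \ stack: 4   (integer division)
.         \ prints 4 and leaves the stack empty
```

Each operand is pushed before the operator that consumes it, so the order of evaluation is exactly the left-to-right order of the words.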
4. Compiled Words
To define a new word, use a colon
(“:”),
followed by the new word,
the code for the new word,
and a semicolon
(“;”).
After that, using the newly defined word executes the given code.
[Interactive Forth]
: triple 3 * ; ok
6 triple . 18 ok
: printTriple triple . ; ok
17 printTriple 51 ok
Forth words are not case-sensitive.
[Interactive Forth]
5 PRINTTRIPLE 15 ok
10 printtriPLE 30 ok
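Since a newly defined word can be used like any other word, definitions can be built on top of earlier definitions. Here is a small sketch of this idea, written as it might appear in a source file. The words double and quadruple are hypothetical names made up for this example.

```forth
\ double: multiply the top of the stack by 2 (hypothetical example word)
: double  2 * ;

\ quadruple: defined by using double twice
: quadruple  double double ;

5 quadruple .    \ prints 20
```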
5. Stack-Effect Notation
Stack-effect notation indicates the effect of a Forth word
by showing a picture of the stack before the word is executed,
followed by two dashes (“--”)
and then a picture of the stack after the word is executed.
Usually, all this is enclosed in parentheses.
For example, the stack effect of our word triple
is “( x -- 3*x )”.
Here are stack-effect descriptions for a few of the words we have covered.
31   ( -- 31 )
drop ( x -- )
+    ( x y -- x+y )
-    ( x y -- x-y )
Anything surrounded by parentheses in Forth is a comment. Thus, stack-effect descriptions can be included when defining a word. Since they are comments, they are aimed only at human readers; the compiler ignores them.
[Interactive Forth]
: triple ( x -- 3*x ) compiled
3 * ; ok
Above, “compiled”
indicates that we are in the middle of defining a compiled word.
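Stack-effect comments are especially helpful for words that take more than one item, or that leave nothing behind. The sketch below, written as a source file, shows one of each. The words average and printAverage are hypothetical names used only for this illustration.

```forth
: average ( a b -- avg )    \ integer average of two numbers
  + 2 / ;

: printAverage ( a b -- )   \ consumes both inputs, leaves nothing
  average . ;

10 20 printAverage          \ prints 15
```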
6. Stack Manipulation
Forth has a number of standard words that manipulate the stack. Here are a few of them, along with their stack-effect descriptions.
drop ( a -- )
dup  ( a -- a a )        \ dup for "duplicate"
swap ( a b -- b a )
rot  ( a b c -- b c a )  \ rot for "rotate"
-rot ( a b c -- c a b )  \ inverse of rot
nip  ( a b -- b )
tuck ( a b -- b a b )
over ( a b -- a b a )
For some of these, prepending a “2”
does the same operation on pairs of stack items.
2drop ( a1 a2 -- )
2dup  ( a1 a2 -- a1 a2 a1 a2 )
2swap ( a1 a2 b1 b2 -- b1 b2 a1 a2 )
2rot  ( a1 a2 b1 b2 c1 c2 -- b1 b2 c1 c2 a1 a2 )
2nip  ( a1 a2 b1 b2 -- b1 b2 )
2tuck ( a1 a2 b1 b2 -- b1 b2 a1 a2 b1 b2 )
2over ( a1 a2 b1 b2 -- a1 a2 b1 b2 a1 a2 )
Note. Prefixing a word with “2”
to get a similar word that works on pairs
is not a general property of Forth.
“drop”
and
“2drop”
are completely separate words
that happen to have related actions.
Similarly, rot and -rot
are completely separate words.
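To see how the stack-manipulation words are used in practice, consider the following sketch, written as a source file. It uses dup to square a number, and swap to reach the second of two parameters. The words square and sumOfSquares are hypothetical names chosen for this example.

```forth
: square ( x -- x*x )
  dup * ;

: sumOfSquares ( a b -- a*a+b*b )
  swap square     \ stack: b a*a
  swap square     \ stack: a*a b*b
  + ;

3 4 sumOfSquares .   \ prints 25
```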
The word “pick”
pops off the top of the stack,
and then uses it as an index for an item
in the stack to duplicate as a new top item.
Indexing starts from 0 (top of stack).
Thus “0 pick”
is the same as “dup”,
and “1 pick”
is the same as “over”.
[Interactive Forth]
966 955 944 933 922 911 900 ok
4 pick ok
.s <8> 966 955 944 933 922 911 900 944 ok
“roll” is similar,
but it brings the indexed item to the top of the stack
without duplicating it.
[Interactive Forth]
966 955 944 933 922 911 900 ok
4 roll ok
.s <7> 966 955 933 922 911 900 944 ok
Thus “0 roll”
does nothing,
“1 roll”
is the same as “swap”,
and “2 roll”
is the same as “rot”.
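The equivalences above suggest how pick and roll can be wrapped in words of our own. Here is a sketch, written as a source file and assuming the stack is empty at the start. The words third and myRot are hypothetical names invented for this example.

```forth
: third ( a b c -- a b c a )   \ copy the third item to the top
  2 pick ;

: myRot ( a b c -- b c a )     \ same effect as rot
  2 roll ;

10 20 30 third .s   \ shows <4> 10 20 30 10
clearstack
```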
7. Strings
To enter a Forth string, start with the word “s"”.
This should be followed by a blank, as usual.
After the blank, the next character begins a string,
which ends with a double quote.
For example the string “abc”
would be entered as “s" abc"”.
These strings may contain blanks, even at the beginning.
The string “  abc def ”
would be entered as
“s"   abc def "”.
Note the two blanks before the “a”
in the string itself, in addition to the single blank that separates them from “s"”.
The result on the stack is two values: a pointer to the start of the string, and its length.
[Interactive Forth]
s" Hello" ok
.s <2> 138245848 5 ok
The value 138245848 above
is the pointer.
If you enter the above string,
then the pointer you get will probably have a different value.
The 5 is the length of the string.
If a string is represented on the stack as address-length,
then we can print it using “type”.
[Interactive Forth]
\ type ( addr len -- ) ok
s" Hello, world!" ok
type Hello, world! ok
The most common thing to do with a string is output it.
So Forth has the word
“."”,
which works just like
“s"”,
except that it prints the string
and leaves nothing on the stack.
[Interactive Forth]
." Hello, world!" Hello, world! ok
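Both string words can be combined inside a compiled word. The following sketch, written as a source file, takes a string in address-length form and prints it between two fixed pieces of text. The word greet is a hypothetical name used only for illustration.

```forth
: greet ( addr len -- )   \ prints "Hello, " then the string, then "!"
  ." Hello, " type ." !" cr ;

s" world" greet     \ prints Hello, world!
```

Here the fixed text is printed with the dot-quote word, while the string passed on the stack is printed with type.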
8. Named Parameters
A useful variation on stack-effect notation allows for named parameters. Replace the parentheses in the stack-effect notation with braces. The parameters are popped off the stack and given the listed identifiers as (local) names.
Here is triple, redone.
[Forth]
: triple { x -- 3*x } x 3 * ;
The part of the stack-effect notation after the double dash
(“3*x” above)
is treated as a comment and ignored.
We can handle a passed string using two named parameters. Here is a word that takes a string (address & length) and prints it twice.
[Forth]
: type2x { addr len -- } addr len type addr len type ;
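Named parameters are most useful when a word needs the same value several times, or needs several values out of order. As a sketch, here is a word that computes b*b - 4*a*c from three parameters, written as a source file for Gforth. The word discriminant is a hypothetical name chosen for this example. Note that the last name listed receives the top of the stack.

```forth
: discriminant { a b c -- d }   \ d = b*b - 4*a*c
  b b *  4 a * c *  - ;

1 5 6 discriminant .   \ prints 1
```

Writing this with only stack-manipulation words is possible, but the named version is arguably easier to read.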
9. Flow of Control: Selection
In C++ we have the if-else control structure.
[C++]
if (CONDITION)
{
    THENCODE;
}
else
{
    ELSECODE;
}
Here is the Forth equivalent.
[Forth]
CONDITION if
    THENCODE
else
    ELSECODE
endif
The word “if” pops the top of the stack.
If this value is non-zero, then THENCODE
is executed;
otherwise, ELSECODE is executed.
In this context, comparison operators are useful. These are the following:
= <> < <= > >=
The not-equal operator is “<>”.
But if you prefer the C/C++/Java version, then you can easily define it.
[Forth]
: != <> ;
Each comparison operator pops the top two items off the stack,
compares them,
and pushes the result:
−1 for true and 0 for false.
Bitwise AND, OR, and NOT are the words
“and”,
“or”,
and
“invert”.
When used with values −1 and 0, these function
as logical operators.
Here is an example.
[Forth]
: tenToTwenty { x -- }
  x 10 >= x 20 <= and if
    x . s" is in the range [10, 20]." type
  else
    s" Alas! " type
    x . s" is NOT in the range [10, 20]." type
  endif ;
[Interactive Forth]
14 tenToTwenty 14 is in the range [10, 20]. ok
26 tenToTwenty Alas! 26 is NOT in the range [10, 20]. ok
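A selection construct can also leave a value on the stack, rather than printing. Here is a sketch, written as a source file, of a word that leaves the larger of its two parameters. The word maxOf is a hypothetical name used for illustration.

```forth
: maxOf { a b -- m }   \ leaves the larger of a and b
  a b > if
    a
  else
    b
  endif ;

3 9 maxOf .    \ prints 9
9 3 maxOf .    \ prints 9
```

Both branches push exactly one item, so the word has the same stack effect whichever branch is taken.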
10. Flow of Control: Iteration
Forth has a large number of iteration constructs. In this section we look at a general-purpose loop.
Consider the following C++ code.
[C++]
while (true)
{
    LOOPBODY1;
    if (!CONDITION) break;
    LOOPBODY2;
}
Here is the Forth equivalent.
[Forth]
begin
    LOOPBODY1
    CONDITION
while
    LOOPBODY2
repeat
The word “begin” marks the start of the loop.
The word “repeat” marks the end.
When the word “while” is executed,
the top of the stack is popped.
If this is zero, then the loop exits:
execution moves to the code just after “repeat”;
otherwise, execution continues
after the “while”.
For example, here
is a Forth word that takes a parameter called n.
It prints “Howdy!” n times.
The word “cr”
prints a newline.
[Forth]
: multiHowdy ( n -- )  \ Prints "Howdy!" n times
  cr
  begin
    dup
  while
    s" Howdy!" type cr
    1 -
  repeat
  drop ;
As mentioned above, Forth has a number of iteration constructs. There are others that would make the above code rather clearer.
As in most programming languages, Forth flow-of-control structures can be nested.
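The same loop construct can accumulate a result on the stack. The following sketch, written as a source file, adds up the numbers from 1 to n while keeping a running sum beneath the counter. The word sumTo is a hypothetical name for this example, and the code assumes that n is not negative. With a negative n the counter never reaches zero.

```forth
: sumTo ( n -- sum )   \ sum of 1..n; assumes n >= 0
  0 swap               \ stack: sum n
  begin
    dup                \ stack: sum n n
  while                \ exit the loop when n is zero
    tuck +             \ stack: n sum+n
    swap 1 -           \ stack: sum+n n-1
  repeat
  drop ;               \ discard the zero counter, leaving sum

4 sumTo .   \ prints 10
```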
11. Topics Not Covered
This has only been a brief introduction. There is more to say about Forth—but not a lot more, as it is a simple programming language. Below are a few topics that were not covered here.
- Other Iteration Constructs (counted loop, etc.)
- Recursion
- The Dictionary & Redefining Words
- Variables
- Dynamic Allocation & Arrays
- Input
- File I/O
- Floating Point
- Other Stacks
- Defining New Control Structures
- Defining New Defining Words
12. Copyright & License
© 2014–2023 Glenn G. Chappell
A Quick Introduction to Forth
is licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License.