CS 331 Spring 2025
Identifiers, Values, and Variables
This is a brief discussion of identifiers and values as they appear in programming languages, the relationship between the two, and related terminology.
Identifiers
An identifier is the name of something in source code: the name of a variable, function, class, module, namespace, type, etc.
Declarations
In many programming languages, in order to use an identifier, we must first declare it: that is, formally introduce it, indicating that it is an identifier, and possibly what kind of identifier. Consider the following C++ code.
[C++]
class Foo {
    …
};
int bar;
The above code contains declarations of two identifiers:
- Foo, the name of a new type.
- bar, the name of a variable of type int.
Namespaces
Every identifier lies in some namespace. As a general rule, it is only a problem for two entities to have the same name if the two names lie in the same namespace.
Here is more C++ code.
[C++]
double x;
int x;  // ERROR!

void ff()
{
    int x;  // No error
    …
}

int ff(int n)  // No error
{
    …
}
In the above code, the first two variables named “x” are problematic, because they lie in the same namespace. On the other hand, variable x inside the first function ff is not a problem, since it lies in a different namespace: the namespace for variables local to the first function ff.
The two functions named ff have names that do lie in the same namespace. However, this is not a problem; C++ allows such functions to exist as long as they have different signatures (a function’s signature includes its parameter types and passing methods). The two functions named “ff” form an example of overloading: when a single name in a single namespace refers to two (or more) different entities.
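The namespace and overloading rules above can be tried out in a small sketch. The function bodies and the demo function are assumptions made for illustration; the original code elides them.

```cpp
#include <cassert>

double x = 2.5;     // x in the global namespace
// int x;           // ERROR if uncommented: a second x in the same namespace

int ff()
{
    int x = 10;     // No error: this x lies in the namespace local to this ff
    return x;
}

int ff(int n)       // No error: same name, different signature (overloading)
{
    return n + 1;
}

// Hypothetical usage: the compiler selects an overload from the arguments.
void demo()
{
    assert(ff() == 10);   // calls the ff with no parameters
    assert(ff(4) == 5);   // calls the ff taking an int
    assert(x == 2.5);     // the global x is untouched by the local x
}
```

Calling demo() from main runs the three checks.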
Scope
The code from which an identifier is accessible forms the identifier’s scope.
For example, the scope of a C++ local variable declared inside a block begins with the variable’s declaration and ends at the end of the block, marked by a right brace (}).
In code that is part of an identifier’s scope, the identifier is said to be in scope; otherwise it is out of scope. When execution passes from the former to the latter, the identifier is said to go out of scope.
C++, along with Haskell, uses static scope: the scope of an identifier is determined before runtime (in both programming languages, scope is determined by the compiler).
Most (all?) statically scoped programming languages use lexical scope: the scope consists of a fixed portion of the text of the program’s source code.
Some programming languages use dynamic scope: scope determined at runtime. For many such programming languages, whether an identifier is in scope depends only on when code is executed at runtime.
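As a rough illustration of block scope and static scope in C++, consider the following sketch. All of the names in it (readX, caller, blockScope) are hypothetical.

```cpp
#include <cassert>

int x = 1;              // Global x

int readX()
{
    return x;           // Static scope: this x is always the global x
}

int caller()
{
    int x = 2;          // Local x, in scope only until the end of caller
    (void)x;            // Silence unused-variable warnings
    return readX();     // readX does not see caller's local x
}

int blockScope()
{
    int total = 0;
    {
        int y = 5;      // Scope of y begins at its declaration ...
        total += y;
    }                   // ... and ends at this right brace
    // total += y;      // ERROR if uncommented: y is out of scope
    return total;
}

// Hypothetical usage.
void demo()
{
    assert(caller() == 1);
    assert(blockScope() == 5);
}
```

Here caller returns 1, because the compiler resolves the x inside readX to the global variable. In a dynamically scoped language, analogous code could see the caller's x instead.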
Values
A value might be a number or a string or a Boolean or some kind of object.
Values & Expressions
[C++]
int x = 77;
cout << "Sum: " << x+2;
An expression is an entity that has a value. The following are expressions in the above C++ code:
77
cout
"Sum: "
cout << "Sum: "
x
2
x+2
cout << "Sum: " << x+2
Literals
A literal is a representation of a fixed value in source code. The value itself is represented, not an identifier bound to the value, and not a computation whose result is the value.
Here are some literals in C++ and Lua.
C++ Literals | |
---|---|
Literal | Type |
42 | int |
42.5 | double |
42.5f | float |
false | bool |
'A' | char |
"zebra" | char[] |
Lua Literals | |
---|---|
Literal | Type |
42.5 | number |
false | boolean |
"zebra" | string |
[=[xy]=] | string |
{ 1, 2 } | table |
nil | nil |
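The types in the C++ table can be checked by the compiler. The sketch below uses decltype and static_assert, which are standard C++11. One detail goes beyond the table: the type of the string literal is, more precisely, an array of six const char, because it includes the terminating null character.

```cpp
#include <type_traits>

static_assert(std::is_same<decltype(42), int>::value, "42 is an int");
static_assert(std::is_same<decltype(42.5), double>::value, "42.5 is a double");
static_assert(std::is_same<decltype(42.5f), float>::value, "42.5f is a float");
static_assert(std::is_same<decltype(false), bool>::value, "false is a bool");
static_assert(std::is_same<decltype('A'), char>::value, "'A' is a char");

// A string literal is an lvalue, so decltype reports a reference to the
// array: five letters plus the terminating null character.
static_assert(std::is_same<decltype("zebra"), const char (&)[6]>::value,
              "\"zebra\" is an array of 6 const char");
```

If any of these claims were wrong, the code would fail to compile.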
Lifetime
When a value comes into existence, we say it is constructed (noun form: construction). This mostly happens at runtime, although compile-time values also exist in some programming languages.
A value continues to exist until it is destroyed (noun form: destruction).
The time between construction and destruction of a value is the value’s lifetime, that is, the period during which the value exists.
Variables
Variables & Binding
A variable is an identifier that can be associated with a value. Creating such an association is called binding. A variable can be bound to a value, which makes it a bound variable.
Note that, despite the name, the value of a variable might not actually vary. For example, a C++ variable declared const, along with any variable at all in Haskell, has a fixed value once it is bound.
In Haskell, we bind a variable using an equals sign (=).
[Haskell]
x = 77             -- Bind variable x to the value 77

addem x y = x+y
addTwo = addem 2   -- Here, the first parameter of addem is bound,
                   --  but the second is not.
A variable that is not bound within an expression is said to be free in that expression. For example, in the Haskell expression n*n, variable n is a free variable.
A Haskell lambda function with one parameter n is written “\ n -> EXPR”, where EXPR is an expression in which n is free. Therefore, “\ n -> n*n” is a valid Haskell lambda function.
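For readers more comfortable with C++, here is a loose analog of the Haskell code above, written with C++ lambdas. This is only a sketch of the idea; C++ does not bind parameters one at a time the way Haskell does, so the binding of the first parameter is written out by hand.

```cpp
#include <cassert>

int addem(int x, int y)
{
    return x + y;
}

// Analog of "addTwo = addem 2": the first parameter of addem is bound to 2,
// while the second parameter remains a parameter of the new function.
const auto addTwo = [](int y) { return addem(2, y); };

// Analog of "\ n -> n*n": n is free in the body n * n taken by itself,
// and the lambda's parameter list is what binds it.
const auto square = [](int n) { return n * n; };

// Hypothetical usage.
void demo()
{
    assert(addTwo(5) == 7);
    assert(square(4) == 16);
}
```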
Scope + Lifetime
Remember:
- An identifier has a scope.
- A value has a lifetime.
Because a bound variable involves both an identifier and a value, scope and lifetime are both applicable.
For example, for a C++ automatic variable (normal local variable), the lifetime of its value ends when the identifier goes out of scope. This fact, along with the execution of the value’s destructor when it is destroyed, forms the basis for the RAII programming idiom.
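The link between scope and lifetime for automatic variables can be observed directly. In this sketch, the Tracker class and the eventLog variable are hypothetical names; each Tracker records its own construction and destruction.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> eventLog;   // Records events in the order they occur

class Tracker {
public:
    explicit Tracker(const std::string & name)
        : name_(name)
    {
        eventLog.push_back("construct " + name_);
    }

    ~Tracker()
    {
        eventLog.push_back("destroy " + name_);
    }

private:
    std::string name_;
};

void useTrackers()
{
    Tracker a("a");
    {
        Tracker b("b");
    }   // b goes out of scope here, so its value is destroyed here
    eventLog.push_back("after inner block");
}       // a goes out of scope here, so its value is destroyed here
```

After a call to useTrackers, the log holds "construct a", "construct b", "destroy b", "after inner block", "destroy a", in that order. RAII relies on exactly this guarantee: cleanup code placed in a destructor runs when the owning variable goes out of scope.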
Implementation
General
At runtime, a value is typically implemented as a block of memory large enough to hold its internal representation.
When a value is set, it is computed, and its representation is stored in the memory block.
[C++]
int n = ff(1) + gg(2);
When the above C++ is executed, functions ff and gg are called. Their return values are added together. Say the result is 37. This is stored in the memory location for variable n.
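To watch this eager evaluation happen, we can give ff and gg bodies that count their calls. The bodies below are invented for illustration, chosen so that the sum is 37; callCount and eagerDemo are likewise hypothetical.

```cpp
#include <cassert>

int callCount = 0;   // Number of calls made to ff and gg so far

int ff(int a)
{
    ++callCount;
    return a + 11;   // ff(1) is 12
}

int gg(int b)
{
    ++callCount;
    return b + 23;   // gg(2) is 25
}

int eagerDemo()
{
    int n = ff(1) + gg(2);   // Both functions are called right here
    assert(callCount == 2);  // The work was done before n was ever used
    return n;                // 12 + 25 = 37
}
```

Note that C++ does not specify which of ff and gg is called first, only that both calls are complete before n is initialized.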
Lazy Evaluation
But what if evaluation is lazy? Recall that lazy evaluation means that an expression is only evaluated when its value is needed. By default, evaluation in the programming language Haskell is lazy. Here is Haskell code that performs the same computation as the above C++ code.
[Haskell]
n = (ff 1) + (gg 2)
When lazy evaluation is done, the usual implementation has an unevaluated value hold a thunk: a reference to code whose execution computes the value.
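A thunk can be imitated in C++. The following is a minimal sketch, not a description of how any Haskell implementation actually works. The Thunk class is hypothetical, it assumes that T is default-constructible, and ff and gg are given the same invented bodies as before.

```cpp
#include <cassert>
#include <functional>
#include <utility>

template <typename T>
class Thunk {
public:
    explicit Thunk(std::function<T()> compute)
        : compute_(std::move(compute)), evaluated_(false), value_()
    {}

    // Return the value, running the deferred code on the first call only.
    const T & force()
    {
        if (!evaluated_) {
            value_ = compute_();
            evaluated_ = true;
            compute_ = nullptr;   // The code reference is no longer needed
        }
        return value_;
    }

    bool isEvaluated() const
    {
        return evaluated_;
    }

private:
    std::function<T()> compute_;  // Reference to code that computes the value
    bool evaluated_;
    T value_;
};

int callCount = 0;                // Number of calls made to ff and gg so far
int ff(int a) { ++callCount; return a + 11; }
int gg(int b) { ++callCount; return b + 23; }

// Hypothetical usage: nothing is computed until force is called.
void demo()
{
    Thunk<int> n([] { return ff(1) + gg(2); });
    assert(callCount == 0);       // Neither function has been called yet
    assert(n.force() == 37);      // Now the value is needed
    assert(callCount == 2);
    assert(n.force() == 37);      // Second use: the stored value is reused
    assert(callCount == 2);
}
```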
Now consider Haskell lists. Internally, their implementation will be something like a Linked List.
Recall: a linked list is composed of nodes, each of which contains a data item and some kind of reference (pointer?) to the next node; at the end of the list, the latter has a null value. A nonempty linked list is referred to by a reference (pointer?) to its first node.
In the context of lazy evaluation, any of these items and references might actually hold a thunk (shown as an upper-case T).
For a nonempty Linked List, the two parts of the first node are the list’s head and tail, respectively. So the head is the first item in the list, and the tail is a list of all the other items. For example, for the list [3,1,5,3], the head is 3, and the tail is the list [1,5,3].
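The head and tail of an ordinary (non-lazy) linked list might be sketched in C++ as follows. The names Node, List, cons, head, tail, and makeExample are all hypothetical, and a null pointer stands for the empty list.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node {
    int item;                      // The data item
    std::shared_ptr<Node> next;    // The next node; null at the end of the list
};

using List = std::shared_ptr<Node>;   // A null List is the empty list

// Make a list whose head is item and whose tail is rest.
List cons(int item, List rest)
{
    return std::make_shared<Node>(Node{item, std::move(rest)});
}

int head(const List & xs)
{
    assert(xs);                    // Only a nonempty list has a head
    return xs->item;
}

List tail(const List & xs)
{
    assert(xs);                    // Only a nonempty list has a tail
    return xs->next;
}

// Build the list [3,1,5,3].
List makeExample()
{
    return cons(3, cons(1, cons(5, cons(3, nullptr))));
}
```

For the list returned by makeExample, head gives 3, and tail gives the list [1,5,3], whose own head is 1.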
Haskell has literals for lists of consecutive integers. For example, [5..] means the list whose first item is 5, with following items increasing by one. So the items in this list are 5, 6, 7, 8, 9, 10, etc. The head of this list is 5, and the tail is [6..].
Now let’s consider what the following Haskell code might do, internally.
[Haskell]
xs = [5..]    -- A list holding values 5, 6, 7, 8, etc.
r = xs !! 2   -- "!!" is lookup by index (zero-based)
              -- We would expect that r is 7
To begin with, xs simply holds a thunk. And so does r. If the values of these two variables are never needed, then no evaluation is ever performed. Done.
But what if the value of r turns out to be needed? For example, we might run a program that defines xs and r as above, and then prints the value of r. In Haskell, this printing might be written as follows.
[Haskell]
main = putStrLn $ show r
If a value is to be printed, then it first needs to be computed. Let’s see what this might look like, internally.
- We want item 2 of xs.
- Again, xs holds a thunk. Evaluate xs. It is a nonempty list: [5..].
- Item 2 of xs is item 1 of the tail of xs.
- Evaluate the tail of xs: [6..] (that would be the list holding 6, 7, 8, 9, etc).
- Item 1 of the tail of xs is item 0 of the tail of the tail of xs.
- Evaluate the tail of the tail of xs: [7..].
- And the value we want is the head of that, that is, the head of [7..]. Evaluate.
- Result. The value of r is 7.
Note that the Haskell list [5..] is actually infinite; it goes on forever. And yet, the only memory we need for the above evaluation is enough to hold a pointer and three list nodes. This illustrates how lazy evaluation, handled internally using thunks, can allow for infinite data structures.
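The evaluation traced above can be imitated in C++. This is a simplified sketch under stated assumptions: only the tail of each node is held as a thunk (the head is stored directly), and all names (LazyNode, LazyList, from, itemAt, nodesBuilt) are hypothetical. It is not meant to describe the internals of any actual Haskell implementation.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

struct LazyNode;
using LazyList = std::shared_ptr<LazyNode>;

int nodesBuilt = 0;   // Number of list nodes constructed so far

struct LazyNode {
    int head;                              // First item (stored directly)
    std::function<LazyList()> tailThunk;   // Code that computes the tail
    LazyList tailCache;                    // The tail, once it is evaluated

    LazyNode(int h, std::function<LazyList()> t)
        : head(h), tailThunk(std::move(t))
    {
        ++nodesBuilt;
    }

    // Evaluate the tail on first use; afterward, reuse the stored result.
    LazyList tail()
    {
        if (!tailCache) {
            tailCache = tailThunk();
            tailThunk = nullptr;
        }
        return tailCache;
    }
};

// Analog of the Haskell list [n..]: only the first node is built now.
LazyList from(int n)
{
    return std::make_shared<LazyNode>(n, [n] { return from(n + 1); });
}

// Analog of xs !! i: evaluate just enough tails to reach item i.
int itemAt(LazyList xs, int i)
{
    assert(i >= 0);
    while (i > 0) {
        xs = xs->tail();
        --i;
    }
    return xs->head;
}

// Hypothetical usage, mirroring xs = [5..] and r = xs !! 2.
void demo()
{
    nodesBuilt = 0;
    LazyList xs = from(5);
    int r = itemAt(xs, 2);
    assert(r == 7);
    assert(nodesBuilt == 3);   // Nodes for 5, 6, and 7 only
}
```

Even though from(5) stands for an infinite list, demo constructs only three nodes, matching the count given above.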