CS 331 Spring 2025
Identifiers, Values, and Variables

This is a brief discussion of identifiers and values as they appear in programming languages, the relationship between the two, and related terminology.

Identifiers

An identifier is the name of something in source code: the name of a variable, function, class, module, namespace, type, etc.

Declarations

In many programming languages, in order to use an identifier, we must first declare it: that is, formally introduce it, indicating that it is an identifier and possibly what kind of identifier. Consider the following C++ code.

[C++]

class Foo {
    …
};

int bar;

The above code contains declarations of two identifiers: Foo, which names a class, and bar, which names a variable of type int.

Namespaces

Every identifier lies in some namespace. As a general rule, it is only a problem for two entities to have the same name if the two names lie in the same namespace.

Here is more C++ code.

[C++]

double x;
int x;         // ERROR!

void ff()
{
    int x;     // No error
    …
}

int ff(int n)  // No error
{
    …
}

In the above code, the first two variables named “x” are problematic, because they lie in the same namespace.

On the other hand, variable x inside the first function ff is not a problem, since it lies in a different namespace: the namespace for variables local to the first function ff.

The two functions named ff have names that do lie in the same namespace. However, this is not a problem; C++ allows such functions to exist as long as they have different signatures (a function’s signature includes its parameter types and passing methods).

The two functions named “ff” form an example of overloading: when a single name in a single namespace refers to two (or more) different entities.
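The namespace and overloading rules above can be tried out in a small sketch. The names describe, localX, and the global x are hypothetical, chosen only for illustration; they do not come from the course code.

```cpp
#include <cassert>
#include <string>

// Overloading: two functions share the name "describe" in the same
// namespace. This is allowed because their parameter types differ.
std::string describe(int)    { return "int"; }
std::string describe(double) { return "double"; }

int x = 1;                // x in the global namespace

int localX()
{
    int x = 2;            // No error: this x is local to localX
    assert(::x == 1);     // the global x still exists, reached via ::
    return x;             // refers to the local x
}
```

A call such as describe(7) selects the int version and describe(7.5) selects the double version; the compiler chooses based on the argument type.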

Scope

The code from which an identifier is accessible forms the identifier’s scope.

For example, the scope of a C++ local variable declared inside a block begins with the variable’s declaration and ends at the end of the block—a right brace (}).

In code that is part of an identifier’s scope, the identifier is said to be in scope; otherwise it is out of scope. When execution passes from the former to the latter, the identifier is said to go out of scope.
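As a minimal sketch of these terms, the following C++ function marks where a block-local variable comes into scope and where it goes out of scope. The names scopeDemo, total, and inner are made up for this illustration.

```cpp
#include <cassert>

int scopeDemo()
{
    int total = 0;
    {
        int inner = 5;    // scope of inner begins at its declaration
        total += inner;   // inner is in scope here
    }                     // inner goes out of scope at the right brace
    // total += inner;    // would not compile: inner is out of scope
    assert(total == 5);
    return total;
}
```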

C++, along with Haskell, uses static scope: the scope of an identifier is determined before runtime (in both programming languages, scope is determined by the compiler).

Most (all?) statically scoped programming languages use lexical scope: the scope consists of a fixed portion of the text of the program’s source code.

Some programming languages use dynamic scope: scope determined at runtime. For many such programming languages, whether an identifier is in scope depends only on when code is executed at runtime.
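C++ itself is statically scoped, so dynamic scope can only be imitated in it. The following is a rough emulation, under the assumption that bindings are kept in a runtime table; the names bindings, DynBind, inScope, lookup, readX, callerA, and callerB are all hypothetical. It is meant only to show that, under dynamic scope, whether a name is in scope depends on what is executing when the name is looked up.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical runtime table: each name maps to a stack of values.
// The top of a stack is the name's current binding.
std::map<std::string, std::vector<int>> bindings;

// Binds a name for as long as the DynBind object exists.
class DynBind {
public:
    DynBind(const std::string & name, int value) : name_(name)
    { bindings[name_].push_back(value); }

    ~DynBind()
    { bindings[name_].pop_back(); }

    DynBind(const DynBind &) = delete;
    DynBind & operator=(const DynBind &) = delete;

private:
    std::string name_;
};

bool inScope(const std::string & name)
{
    auto it = bindings.find(name);
    return it != bindings.end() && !it->second.empty();
}

int lookup(const std::string & name)
{
    assert(inScope(name));
    return bindings[name].back();
}

// readX introduces no binding of its own. Whether "x" is in scope,
// and what its value is, depends on which caller is active at runtime.
int readX()
{
    return inScope("x") ? lookup("x") : -1;   // -1 means: x is out of scope
}

int callerA() { DynBind b("x", 10); return readX(); }
int callerB() { DynBind b("x", 20); return readX(); }
```

Called directly, readX finds no binding for x. Called through callerA or callerB, the same text of readX sees 10 or 20 respectively, which is exactly what lexical scope rules out.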

Values

A value might be a number or a string or a Boolean or some kind of object.

Values & Expressions

[C++]

int x = 77;
cout << "Sum: " << x+2;

An expression is an entity that has a value. Expressions in the above C++ code include 77, "Sum: ", x, 2, x+2, and the full output expression cout << "Sum: " << x+2.

Literals

A literal is a representation of a fixed value in source code. The value itself is represented, not an identifier bound to the value, and not a computation whose result is the value.

Here are some literals in C++ and Lua.

C++ Literals

Literal     Type
42          int
42.5        double
42.5f       float
false       bool
'A'         char
"zebra"     char[]

Lua Literals

Literal     Type
42.5        number
false       boolean
"zebra"     string
[=[xy]=]    string
{ 1, 2 }    table
nil         nil
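The C++ entries above can be checked by the compiler itself. The sketch below uses decltype with compile-time assertions; it is only one way to confirm the types. Note that the precise type of the string literal is an array of const char whose length includes the terminating null character, so char[] in the table is best read as shorthand.

```cpp
#include <type_traits>

// Each assertion is checked at compile time; the code compiles only
// if the literal has the stated type.
static_assert(std::is_same<decltype(42),    int>::value,    "42 is an int");
static_assert(std::is_same<decltype(42.5),  double>::value, "42.5 is a double");
static_assert(std::is_same<decltype(42.5f), float>::value,  "42.5f is a float");
static_assert(std::is_same<decltype(false), bool>::value,   "false is a bool");
static_assert(std::is_same<decltype('A'),   char>::value,   "'A' is a char");

// A string literal is an lvalue, so decltype yields a reference to
// the array type: 5 letters plus the terminating null character.
static_assert(std::is_same<decltype("zebra"), const char (&)[6]>::value,
              "\"zebra\" is an array of 6 const char");
```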

Lifetime

When a value comes into existence, we say it is constructed (noun form: construction). This mostly happens at runtime, although compile-time values also exist in some programming languages.

A value continues to exist until it is destroyed (noun form: destruction).

The time between construction and destruction of a value is the value’s lifetime, that is, the period during which the value exists.

Variables

Variables & Binding

A variable is an identifier that can be associated with a value. Creating such an association is called binding. A variable can be bound to a value, which makes it a bound variable.

Note that, despite the name, the value of a variable might not actually vary. For example, a C++ variable declared const, along with any variable at all in Haskell, has a fixed value once it is bound.

In Haskell, we bind a variable using an equals sign (=).

[Haskell]

x = 77            -- Bind variable x to the value 77

addem x y = x+y
addTwo = addem 2  -- Here, the first parameter of addem is bound,
                  --  but the second is not.

A variable that is not bound within an expression is said to be free in that expression.

For example, in the Haskell expression n*n, variable n is a free variable.

A Haskell lambda function with one parameter n is written “\ n -> EXPR”, where EXPR is an expression in which n is free.

Therefore, “\ n -> n*n” is a valid Haskell lambda function.

Scope + Lifetime

Remember: an identifier has a scope, and a value has a lifetime.

Because a bound variable involves both an identifier and a value, scope and lifetime are both applicable.

For example, for a C++ automatic variable (normal local variable), the lifetime of its value ends when the identifier goes out of scope. This fact, along with the execution of the value’s destructor when it is destroyed, forms the basis for the RAII programming idiom.
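The link between scope and lifetime for an automatic variable can be observed directly. The sketch below assumes a hypothetical class Tracer that records its own construction and destruction in a log; neither Tracer nor raiiDemo comes from the course code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: logs when its value is constructed and destroyed.
class Tracer {
public:
    Tracer(std::vector<std::string> & log, const std::string & name)
        : log_(log), name_(name)
    { log_.push_back("construct " + name_); }

    ~Tracer()
    { log_.push_back("destroy " + name_); }

    Tracer(const Tracer &) = delete;
    Tracer & operator=(const Tracer &) = delete;

private:
    std::vector<std::string> & log_;
    std::string name_;
};

std::vector<std::string> raiiDemo()
{
    std::vector<std::string> log;
    {
        Tracer a(log, "a");          // lifetime of a's value begins
        log.push_back("in block");
    }                                // a goes out of scope: destructor runs
    assert(log.back() == "destroy a");
    log.push_back("after block");
    return log;
}
```

The returned log reads: construct a, in block, destroy a, after block. RAII relies on this ordering: cleanup code placed in a destructor runs when the identifier goes out of scope, with no explicit call.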

Implementation

General

At runtime, a value is typically implemented as a block of memory large enough to hold its internal representation.

When a value is set, it is computed, and its representation is stored in the memory block.

[C++]

int n = ff(1) + gg(2);

When the above C++ is executed, functions ff and gg are called. Their return values are added together. Say the result is 37. This is stored in the memory location for variable n.

Image not displayed: Memory Usage with a Variable

Lazy Evaluation

But what if evaluation is lazy? Recall that lazy evaluation means that an expression is only evaluated when its value is needed. By default, evaluation in the programming language Haskell is lazy. Here is Haskell code that performs the same computation as the above C++ code.

[Haskell]

n = (ff 1) + (gg 2)

When lazy evaluation is done, the usual implementation has an unevaluated value hold a thunk: a reference to code whose execution computes the value.
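To make the idea of a thunk concrete, here is a rough C++ imitation. It is not how a Haskell implementation actually represents thunks. The class Lazy, the counter callCount, and the bodies of ff and gg are assumptions made for this sketch; ff and gg are defined so that ff(1) + gg(2) is 37, matching the earlier example. The sketch also assumes the value type has a default constructor.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// A value that is either unevaluated (holds code) or evaluated (holds a result).
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> compute)
        : compute_(std::move(compute)) {}

    bool evaluated() const { return done_; }

    // Evaluate on first use; later calls reuse the stored result.
    const T & force()
    {
        if (!done_) {
            assert(compute_ != nullptr);
            value_ = compute_();     // run the thunk's code
            compute_ = nullptr;      // the code is no longer needed
            done_ = true;
        }
        return value_;
    }

private:
    std::function<T()> compute_;     // the thunk: a reference to code
    T value_{};                      // valid only once done_ is true
    bool done_ = false;
};

int callCount = 0;                   // counts calls to ff and gg
int ff(int a) { ++callCount; return a + 10; }
int gg(int a) { ++callCount; return a + 24; }

// Counterpart of the Haskell definition n = (ff 1) + (gg 2)
Lazy<int> makeN()
{
    return Lazy<int>([] { return ff(1) + gg(2); });
}
```

Creating the Lazy<int> calls neither ff nor gg. The first call to force runs both and stores 37; a second call to force returns the stored value without calling them again.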

Image not displayed: Memory Usage with Lazy Evaluation

Now consider Haskell lists. Internally, their implementation will be something like a linked list.

Recall: a linked list is composed of nodes, each of which contains a data item and some kind of reference (pointer?) to the next node; at the end of the list, the latter has a null value. A nonempty linked list is referred to by a reference (pointer?) to its first node.

Image not displayed: Linked List

In the context of lazy evaluation, any of these items and references might actually hold a thunk (shown as an upper-case T).

Image not displayed: Linked List with Thunks

For a nonempty linked list, the two parts of the first node are the list’s head and tail, respectively. So the head is the first item in the list, and the tail is a list of all the other items. For example, for the list [3,1,5,3], the head is 3, and the tail is the list [1,5,3].

Haskell has literals for lists of consecutive integers. For example, [5..] means the list whose first item is 5, with following items increasing by one. So the items in this list are 5, 6, 7, 8, 9, 10, etc. The head of this list is 5, and the tail is [6..].

Now let’s consider what the following Haskell code might do, internally.

[Haskell]

xs = [5..]     -- A list holding values 5, 6, 7, 8, etc.
r = xs !! 2    -- "!!" is lookup by index (zero-based)
               -- We would expect that r is 7

To begin with, xs simply holds a thunk.

Image not displayed: Lazy Evaluation Step 1

And so does r. If the values of these two variables are never needed, then no evaluation is ever performed.

Done.

But what if the value of r turns out to be needed? For example, we might run a program that defines xs and r as above, and then prints the value of r. In Haskell, this printing might be written as follows.

[Haskell]

main = putStrLn $ show r

If a value is to be printed, then it first needs to be computed. Let’s see what this might look like, internally.

We want item 2 of xs. Again, xs holds a thunk.

Image not displayed: Lazy Evaluation Step 1

Evaluate xs. It is a nonempty list: [5..].

Image not displayed: Lazy Evaluation Step 2

Item 2 of xs is item 1 of the tail of xs. Evaluate the tail of xs: [6..] (that would be the list holding 6, 7, 8, 9, etc).

Image not displayed: Lazy Evaluation Step 3

Item 1 of the tail of xs is item 0 of the tail of the tail of xs. Evaluate the tail of the tail of xs: [7..].

Image not displayed: Lazy Evaluation Step 4

And the value we want is the head of that, that is, the head of [7..]. Evaluate.

Image not displayed: Lazy Evaluation Step 5

Result. The value of r is 7.

Note that the Haskell list [5..] is actually infinite; it goes on forever. And yet, the only memory we need for the above evaluation is enough to hold a pointer and three list nodes. This illustrates how lazy evaluation, handled internally using thunks, can allow for infinite data structures.
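The walk-through above can be imitated in C++ with a linked list whose tail is built only on demand. This is a simplified sketch, not a description of Haskell's internals: a null tail pointer stands in for an unevaluated thunk, the head of each node is stored eagerly, and the first node is created as soon as the list is. The names Node, from, at, and nodesCreated are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

int nodesCreated = 0;   // counts how many list nodes have been built

// One node of a lazily built list of consecutive integers.
struct Node {
    int head;
    std::unique_ptr<Node> tailCache;   // null: tail not yet evaluated

    explicit Node(int h) : head(h) { ++nodesCreated; }

    // Evaluate the tail on first use, then reuse it.
    Node & tail()
    {
        if (!tailCache)
            tailCache.reset(new Node(head + 1));
        assert(tailCache->head == head + 1);
        return *tailCache;
    }
};

// Counterpart of the Haskell list [start..]
std::unique_ptr<Node> from(int start)
{
    return std::unique_ptr<Node>(new Node(start));
}

// Counterpart of xs !! index (zero-based)
int at(Node & xs, std::size_t index)
{
    Node * p = &xs;
    for (std::size_t i = 0; i < index; ++i)
        p = &p->tail();                // forces one more node
    return p->head;
}
```

With xs created by from(5), the call at(*xs, 2) returns 7 and leaves exactly three nodes in memory, even though the list is conceptually infinite.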