TclWise, a Tcl book with free chapters online

TCLWISE

An introduction to the Tcl programming language


7. MORE ON PROCEDURES

Tcl procedures are not new to the reader: we have already written some simple procedures using the proc command. Still, there are details that must be investigated before we can write non-trivial programs. This chapter explains what a local variable is for a Tcl procedure, how to write procedures with a variable number of arguments or with default arguments, how to write recursive procedures, and finally what global variables are and how to access them from Tcl procedures.

7.1 Local variables

Tcl programs can create new variables using commands like set, append, lappend, and so on. If a variable is created inside a procedure, it is called a local variable, and its visibility and lifetime are closely tied to the procedure call.

The following foobar procedure creates two variables called a and b, and returns a two-element list with the value of a as first element, and the value of b as second element:


proc foobar {} {
    set a foo
    set b bar
    list $a $b
}

Every time we call the procedure (which takes no arguments), it returns the two-element list "foo bar". To understand what a local variable is, let's follow what happens inside Tcl every time foobar is called. First the command set a foo is executed: it creates a local variable a with the string "foo" as content. The same happens when the second line of the procedure is executed, creating the local variable b. Finally the values of the two variables are used as arguments of the list command. Because list $a $b is the last command of the foobar procedure, the Tcl interpreter is now ready to return its value to the caller, and the procedure can exit. What happens to the local variables a and b? They are simply destroyed and their values are lost. This happens every time foobar is called: on every call these two variables are created and then destroyed.

So we can say: in Tcl local variables are variables created inside a procedure (i.e. while a procedure is running), and destroyed once the creating procedure is ready to return to the caller.

So far we know the lifetime of a local variable, but what about its visibility? A local variable can be accessed only by code inside the procedure that created it; in other words, the visibility of a local variable is limited to the creating procedure (this rule actually has an important exception, but you will have to wait for the next chapter to know more).
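
The lifetime and visibility rules above can be checked with a small experiment. This is only a sketch: the peek procedure is a hypothetical helper added for illustration, and info exists is a core command that reports whether a variable is visible in the current context.

```tcl
proc foobar {} {
    set a foo
    set b bar
    list $a $b
}

# Hypothetical helper: 'a' was local to foobar, so it is not visible here.
proc peek {} {
    info exists a
}

puts [foobar]   ;# prints: foo bar
puts [peek]     ;# prints: 0
puts [foobar]   ;# prints: foo bar (a and b are created again on each call)
```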

7.2 Top level

The previous section explained the behaviour of variables created inside procedures, but in Tcl it is also possible to run code outside any procedure, in a context called top level. For example, when you start tclsh and type a Tcl command, the command is executed at top level. In general, any Tcl code that does not appear inside a procedure runs at top level. Look at the following program as an example:


puts "Here the code is running at top level"
proc foo {} {
	puts "But not here!"
}
foo

The program calls the puts command while at top level, then the proc command, still at top level, and finally the foo command. At this point we are inside the foo procedure (no longer at top level). Because the top level is something like a procedure that Tcl automatically calls to start the program, and that never returns while the program is running, variables created at top level are never destroyed. These variables are called global variables.
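
One way to see which context a piece of code runs in is the core command info level, which is not introduced in this chapter and is used here only as an illustration. It returns 0 at top level and the nesting depth inside procedures. The procedure names whereami and outer are hypothetical.

```tcl
# Remember the level this script itself runs at (0 when run at top level).
set top [info level]

proc whereami {} {
    info level
}

proc outer {} {
    whereami
}

puts $top          ;# 0 when the script is run at top level
puts [whereami]    ;# one level deeper than $top
puts [outer]       ;# two levels deeper than $top
```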

7.3 Global variables

Global variables are somewhat special: not only are they never destroyed (unless the program explicitly destroys one via the unset command), they can also be created and accessed from Tcl procedures (i.e. it's possible to create or access global variables while not at top level). The global command makes this possible. In the first example we create a global variable using set at top level, then access it from a procedure using the global command:


set PI 3.1415926536


proc area radius {
    global PI
    expr $radius*$radius*$PI
}

The top-level set command creates the global variable PI containing an approximate value of pi. The area procedure uses the global variable to compute the area of a circle with a given radius (passed as the only argument to the area procedure). Once the command global PI has been called, the area procedure is free to use the PI variable.

A procedure may also create a global variable in a similar way:


proc createPI {} {
    global PI
    set PI 3.1415926536
}


createPI
puts $PI

As you can guess, the output of this program is "3.1415926536". The createPI command creates the PI global variable, which can then be accessed directly from top level by the puts $PI command.

It's important to understand that while global variables are useful for holding some important state of the program, they should be used wisely: procedures that use global variables tend to be less reusable in other contexts, tend to have subtle side effects, and in general don't help much in the attempt to write clean and readable code.
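
To make the role of the global command concrete, here is a minimal sketch contrasting a procedure that declares a variable global with one that does not. The names counter, bump and noBump are hypothetical and chosen only for this illustration.

```tcl
set counter 0

# Declares counter as global, so incr changes the top-level variable.
proc bump {} {
    global counter
    incr counter
}

# No global command: this counter is a new local variable,
# destroyed when the procedure returns.
proc noBump {} {
    set counter 100
}

bump
bump
noBump
puts $counter   ;# prints: 2
```

The call to noBump leaves the global untouched, which is also why a forgotten global declaration can be a source of subtle bugs.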

7.4 Procedure arguments and pass by value

Arguments of procedures are a special kind of local variable. The only thing that is special about these variables is that they are created automatically every time the procedure is called, and their value is set to the value of the corresponding argument passed to the command. Look at the following code:


proc myproc x {
    set x ""
}
set list "1 2 3 4 5"
myproc $list
puts $list

The procedure myproc takes one argument x and sets it to the empty string. It's just a dummy procedure useful only for this example. The code creates a variable list holding a list of five numbers, then calls myproc $list. What happens at this point? $list is expanded to its value, then the myproc procedure is called, so it's like calling directly:


myproc "1 2 3 4 5"

What this means is that we always pass strings as procedure arguments. At this point the execution of myproc starts: the first thing it does is create a local variable x, the argument of the procedure, with "1 2 3 4 5" as its value. The first line of myproc then sets the x variable to the empty string. Finally the puts $list command prints the value of the list variable on the screen, so the program outputs "1 2 3 4 5".

What's the point here? That in Tcl, arguments to procedures are always passed by value. This means that myproc can't alter the value of the list variable: this value is just expanded, passed to myproc, and then assigned to the x argument. Unless a procedure requires the name of a variable as argument, rather than a value, you are always safe: the procedure cannot alter the value of the caller's variables.

This makes Tcl programming very safe. You may not remember at all how mystrangeprocedure works internally, but you know that the following code fragment can't alter the value of the list stored in the l variable:


set l [list foo bar]
mystrangeprocedure $l

mystrangeprocedure can mess with its argument as much as it likes, but still the value of the variable l will continue to be a two element list "foo bar".

At this point you may wonder how it is possible in Tcl to write a procedure like the incr command, which is able to increment a variable living in the context of the calling procedure. In the next chapter we will see that in Tcl rules are made to be broken, thanks to the great flexibility and introspection offered by the language.
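
A consequence of pass by value is that a procedure hands back a changed value through its return value, and the caller decides whether to store it. A minimal sketch, where withExtra is a hypothetical procedure name:

```tcl
# The argument l is a local copy: lappend changes only that copy.
proc withExtra {l} {
    lappend l extra
    return $l
}

set l [list foo bar]
set l2 [withExtra $l]

puts $l    ;# prints: foo bar
puts $l2   ;# prints: foo bar extra
```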

7.5 Procedures with a variable number of arguments

Remember our attempt to write a + procedure? This is the code we wrote in a previous chapter:


% proc + {a b} {expr $a+$b}
% + 3 4
7

This code is ok, but it works with just two arguments. If I want to sum three numbers, instead of writing + 1 2 3 I have to write + 1 [+ 2 3]. I'm sure you don't like this limitation, and I agree. In Tcl it is very simple to write procedures with a variable number of arguments, so we can write a new version of the + procedure that accepts from one to any number of arguments. This is the code:


proc + {x args} {
    foreach e $args {
        set x [expr $x+$e]
    }
    return $x
}

As usual, we can test it interactively in tclsh to make sure it works as expected:


% + 10 20
30
% + 1 2 3 4 
10
% + 50
50
%

Now let's see how it works. As you can see, the proc command was called with {x args} as second argument (that is, the argument list of the procedure we are creating). The second element of this argument list is args, and it is a special one: if the last argument of the argument list is exactly the string args, the procedure can accept any number of additional arguments, which are stored as a list in the args argument when the procedure is called.

In the example of the + procedure, if we call it with a single argument, that argument is assigned to x and args is an empty list. If we call it with two arguments, the first is assigned to x and the second becomes the only element of the list args, and so on. The following table shows what x and args contain for a different number of arguments passed to +:


+                  ;# error, wrong number of arguments for procedure
+ 10               ;# x <- "10", args <- ""
+ 10 11            ;# x <- "10", args <- [list 11]
+ 10 11 20         ;# x <- "10", args <- [list 11 20]

This is why the implementation of the + procedure uses foreach in order to iterate over the list of arguments, adding all the arguments to the first one (contained in x).

Note that as long as args appears as the last argument, any number of normal arguments may be provided (including zero), for instance {x y z args} is a valid argument list for a procedure that will require from 3 to an infinite number of arguments. The first three arguments will be assigned to x, y, z, all the remaining arguments will be put in a list and assigned to args.
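
The way arguments are split between x and args can be observed directly. The following sketch uses a hypothetical procedure showargs that only reports what it received:

```tcl
proc showargs {x args} {
    return "x=$x args=<$args> count=[llength $args]"
}

puts [showargs 10]         ;# prints: x=10 args=<> count=0
puts [showargs 10 11]      ;# prints: x=10 args=<11> count=1
puts [showargs 10 11 20]   ;# prints: x=10 args=<11 20> count=2
```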

7.6 Procedures with default arguments

Another useful feature of the proc command is the ability to write procedures with default arguments. Default arguments can be omitted when calling the procedure, and will default to a value specified when the procedure was created. There are many cases where this is desirable, and we have already shown core commands where this feature is used (for example in the join command, used to join elements of a list into a string, the joinString argument can be omitted and defaults to a single space).

As an example we can write a procedure that increments every element of a list by one:


proc lincr l {
	set result {}
	foreach e $l {
	    lappend result [expr $e+1]
	}
	return $result
}

This is some output:


% lincr {10 20 30}
11 21 31
% lincr {5 6}
6 7

Note that we need to initialize the result variable to the empty list to be sure the function will work well if the input is an empty list, otherwise the result variable may not be created because lappend will never be called, and return $result will generate an error.

What if I want to increment the elements by 10 and not just by 1? We can rewrite the procedure so that it takes an additional argument called increment, and use its value inside the foreach loop to create the new list.


proc lincr {l increment} {
	set result {}
	foreach e $l {
	    lappend result [expr $e+$increment]
	}
	return $result
}

Again this works well; cut and paste the code into tclsh and try it yourself:


% lincr {10 20 30} 1
11 21 31
% lincr {10 20 30} 5
15 25 35

But imagine that, after using this procedure in your programs for some time, you discover that 80% of the time you need to increment just by one. Why can't you just write lincr $mylist instead of lincr $mylist 1, and specify the increment only when it differs from the common case? This is where default arguments enter the scene:


proc lincr {l {increment 1}} {
	set result {}
	foreach e $l {
	    lappend result [expr $e+$increment]
	}
	return $result
}

The new version is exactly the same as the previous one, except for the argument list of the procedure, which is now {l {increment 1}}. In short, if one of the arguments in the argument list is itself a two-element list, the first element is interpreted as the name of the argument and the second as the default value to give to the argument if it is not specified. Now the procedure can accept one or two arguments: with just one argument it increments the list elements by one, with two it increments them by the specified value:


% lincr {1 2 3}
2 3 4
% lincr {1 2 3} 10
11 12 13

A procedure may have multiple default arguments, but they should all be at the end of the argument list. A default placed in the middle, as in {a {b 10} c}, is of no use, because a value for b must still be supplied every time in order to reach c. It's fine to have several defaults in a row, as in {a {b 10} {c 20}}: if you specify just one argument, b defaults to 10 and c to 20; if you specify one more argument, it is used to set the value of b; finally, if you add another one, it is used for c.

An exception to this rule is that the special args argument may follow the default arguments, which lets you write a procedure with a variable number of arguments where some of the last named arguments have a default value:


% proc foo {a {b 10} {c 20} args} {puts "$a - $b - $c - $args"}
% foo 5 
5 - 10 - 20 - 
% foo 5 1 2
5 - 1 - 2 - 
% foo 5 1 2 a b c d
5 - 1 - 2 - a b c d
%

Just a final note: what if the default value is stored in a variable, or you want to compute it at run time with command substitution? If the argument list uses { } grouping, variable and command substitution will not happen, so you need to write the procedure in a different way using the list command, like this:


set value 100
proc myproc [list a b [list c $value]] {
	puts "$a $b $c"
}

The second argument of proc is a list, so you can create it at run time. Default arguments and procedures with a variable number of arguments are important because it's nice to write procedures that require less typing for the base cases and can accept any number of arguments when it makes sense (as in the case of the + procedure).
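
One detail worth checking with this technique is when the default is computed. In the following sketch (the procedure returns its arguments instead of printing them, so the result is easy to inspect), the argument list is built once, when proc runs, so a later change to the value variable does not affect the default:

```tcl
set value 100
proc myproc [list a b [list c $value]] {
    return "$a $b $c"
}

# Changing the variable now has no effect: the default was already
# substituted into the argument list when the procedure was created.
set value 200

puts [myproc 1 2]     ;# prints: 1 2 100
puts [myproc 1 2 3]   ;# prints: 1 2 3
```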

7.7 Recursion

A Tcl procedure can call itself, which makes it possible to write recursive procedures. Recursion is important because many problems are trivial to express in terms of a simpler case of themselves. Our first example of a recursive procedure computes the maximum element of a list of integers. We know how to solve the case of a list of length one (the max is just the only element), and using recursion we can solve the problem for a list of any length:


proc lmax l {
    if {[llength $l] == 1} {
        lindex $l 0
    } else {
        if {[lindex $l 0] > [lmax [lrange $l 1 end]]} {
            lindex $l 0
        } else {
            lmax [lrange $l 1 end]
        }
    }
}

The procedure can be read: if the list length is one, the max is the only element it contains, otherwise split the list in two parts, the first element, and the rest of the list. If the first element is greater than the max of the rest, it is the max of the list, otherwise the max of the list is the max of the rest of the list. The procedure will work with every non empty list of integers:


% lmax {1 50 34 25 61 7 8 9}
61
% lmax {1 2 3}
3
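
Note that lmax as written may compute the max of the rest of the list twice: once in the comparison and once more in the else branch. A possible variant, shown here only as a sketch under the name lmax2 (a hypothetical name), stores that result in a local variable so each level of the recursion makes a single recursive call:

```tcl
proc lmax2 l {
    set first [lindex $l 0]
    if {[llength $l] == 1} {
        return $first
    }
    # Compute the max of the rest only once and keep it in a local variable.
    set restMax [lmax2 [lrange $l 1 end]]
    if {$first > $restMax} {
        return $first
    }
    return $restMax
}

puts [lmax2 {1 50 34 25 61 7 8 9}]   ;# prints: 61
puts [lmax2 {1 2 3}]                 ;# prints: 3
```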

Another example of recursive procedure is the classical Fibonacci function, that's defined as:


FIB(1) = FIB(2) = 1
FIB(N) = FIB(N-2)+FIB(N-1) (for N > 2)

The Tcl implementation of the Fibonacci function is the following:


proc fib n {
    if {$n < 3} {
        return 1
    } else {
        expr {[fib [expr {$n-2}]]+[fib [expr {$n-1}]]}
    }
}

Before looking at the details of the procedure, note that we put expr's expression inside braces. You may wonder how variable and command substitution can take effect if grouping with braces prevents them: the trick is that the expr command performs its own round of variable and command substitution on the argument we pass, and that it is much faster when the expression is grouped with braces. So instead of writing expr $a+$b, you can write expr {$a+$b}, letting the compiler optimize the code to run faster.
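
A small sketch of the two forms follows. For plain numbers they give the same result; the last two lines illustrate the extra round of substitution that happens without braces, using a variable x whose content is deliberately not a number (an example chosen for illustration only):

```tcl
set a 3
set b 4
puts [expr $a+$b]     ;# prints: 7
puts [expr {$a+$b}]   ;# prints: 7

set x "1+1"
# Without braces Tcl substitutes first, so expr sees the text 1+1*2.
puts [expr $x*2]              ;# prints: 3
# With braces expr substitutes $x itself and finds a non-numeric operand,
# so the command fails; catch returns 1 to signal the error.
puts [catch {expr {$x*2}}]    ;# prints: 1
```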

As for the fib procedure, everything should be clear: it is a plain Tcl translation of the mathematical definition of the Fibonacci function. It is interesting to note that this procedure ends up computing the same values multiple times. For example, in order to compute FIB(5) the value of FIB(2) is computed 3 times. This is not optimal because FIB(2) always has the same value, so it's useless to compute it more than once. We will see in the next chapters how to write a Tcl procedure that automatically caches already computed values of recursive procedures. The technique of caching the already computed values of a procedure is called memoization. Thanks to the introspection capabilities of Tcl, we will be able to write a memoize procedure that, used as the first command of a procedure, turns that procedure into a memoized one.
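
Until then, the idea can be sketched by hand with the tools of this chapter. This is not the memoize procedure mentioned above, just a manual cache kept in a global array; the names fibCached and fibCache are hypothetical.

```tcl
proc fibCached n {
    global fibCache
    if {$n < 3} {
        return 1
    }
    # Reuse the value if this n was already computed.
    if {[info exists fibCache($n)]} {
        return $fibCache($n)
    }
    # set returns the stored value, which becomes the result of the procedure.
    set fibCache($n) [expr {[fibCached [expr {$n-2}]] + [fibCached [expr {$n-1}]]}]
}

puts [fibCached 10]   ;# prints: 55
puts [fibCached 30]   ;# prints: 832040
```

With the cache each value is computed only once, at the price of the global state discussed in section 7.3.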

7.8 Recursion limit

In order to trap an infinite recursion before it is too late, the Tcl interpreter generates an error when a given recursion depth (the number of nested calls) is reached. For example, the following procedure exits with an error when called:


proc infinite {} {
    infinite
}
infinite

Without the limit, infinite would keep calling infinite forever.

The limit may sometimes be a problem: you may need to write code that performs a recursion deeper than the default limit allows. In order to change it, use the following Tcl command:


interp recursionlimit {} $newlimit

where $newlimit is the value of the new recursion limit. If you want to check the current recursion limit, call this command without the last argument:


% interp recursionlimit {}
1000

Note that enlarging this limit too much will not always let you write recursive procedures with the desired recursion depth: Tcl is implemented in C, and the interpreter calls itself, so what may overflow is the C stack itself. If this happens you will see an operating system error such as a stack overflow or a segmentation fault.

Sometimes it's possible to write procedures where the recursive call is the last thing done before the procedure returns; this is called tail recursion. As we will see in an advanced chapter of this book, it is possible to write a version of proc that performs what is called tail recursion optimization, which makes it possible to write recursive procedures running in constant space.
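
The following sketch puts the pieces together, assuming the default limit of 1000 shown above. It traps the error with catch, then raises the limit temporarily for a moderately deep recursion. The depth procedure and the numbers 1500 and 5000 are arbitrary choices made for this illustration.

```tcl
proc infinite {} {
    infinite
}

# catch returns 1 when the command fails; msg receives the error message.
set failed [catch {infinite} msg]
puts $failed
puts $msg

# A procedure that nests n calls deep and returns n.
proc depth n {
    if {$n == 0} {
        return 0
    }
    expr {1 + [depth [expr {$n - 1}]]}
}

set old [interp recursionlimit {}]
set tooDeep [catch {depth 1500}]    ;# fails under the default limit of 1000
interp recursionlimit {} 5000
set deep [depth 1500]               ;# succeeds with the larger limit
interp recursionlimit {} $old       ;# restore the previous limit
puts "$tooDeep $deep"
```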
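
To show what a tail-recursive procedure looks like, here is a sketch of a list sum written with an accumulator passed as a default argument. The names lsum and acc are hypothetical. No optimization is performed here: the procedure still uses one nested call per list element, it is only shaped so that the recursive call is the last command executed.

```tcl
proc lsum {l {acc 0}} {
    if {[llength $l] == 0} {
        return $acc
    }
    # The recursive call is the last command: nothing is left to do
    # in this call once it returns.
    lsum [lrange $l 1 end] [expr {$acc + [lindex $l 0]}]
}

puts [lsum {1 2 3 4}]   ;# prints: 10
puts [lsum {}]          ;# prints: 0
```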