When is Scripting really Programming?

I recently updated some analysis scripts in Simics to use the new OS awareness framework in Simics 4.4. While doing so, I completely restructured the code, ending up with something looking suspiciously like a regular Python program. It had declarations, classes, and variables, did not rely on global variables, and was fairly robust to changes in the target. This got me thinking about the difference between “scripting” and “programming”. When I was a computer science undergraduate in the 1990s, I recall the difference was very simple. Programming was done in C or other real languages (with the corresponding heated debate on whether anything except C or assembly counted), and scripting was done in bash. For some reason, it does not feel that simple today.

Let’s use Simics as an example of the many shades of grey that can be found between “pure scripting” and “real programming”. Simics has a command-line interface (CLI) where users can interact directly with the simulator. The simplest form of Simics scripting is to put commands that would otherwise be typed one by one into a file, and have Simics interpret the file rather than typing all the commands manually. Here is a simple example from the early days of Simics, using a script to do an automated login into a SunFire machine.

run-command-file "sunfire-1p.simics"
con0.break "login:"
continue
con0.unbreak "login:"
con0.input "root\n"
con0.break "password:"
continue
con0.input "nemo\n"

This script uses fixed object names (con0) to locate objects in the simulation, and can only be run on its own since it starts and stops the simulation. The idea is to set up a breakpoint, run until it hits, do something, and then set up another breakpoint. To me, this is an example of “pure scripting”.

Over time, the Simics CLI has evolved and become much more capable. It now supports variables, lists of items, conditional statements, and loops. Here is an example script which iterates over all processors of type ppc440gp in a system, and reduces the clock frequency of everything except the first two such processors. It also does all of this conditionally, controlled by another CLI variable.

if( $reduce_compute_node_speed ) {
    foreach $c in (get-object-list ppc440gp) {
        if ( ($c != cpu0) & ($c != cpu1)) {
            $c->freq_mhz = $reduced_freq_mhz
        }
    }
}

This is starting to look a lot like programming. It is certainly not a list of commands that a user could just as well type at the CLI. The next step in the evolution of Simics scripting was to add parallel execution of scripts. Parallelism is crucial for managing the scripting of multiple independent machines inside a single simulation. Each machine can have its own little script snippet running in parallel, without caring what the other scripts and other parts of the simulation setup are doing. The login task from the first example would look something like the script below.

# The script sets the variable $system to point
# to the top-level of the created target machine
run-command-file "sunfire-1p.simics"
script-branch {
    # Use a local variable for the console object
    local $con = $system.console1.con
    $con.wait-then-write "login:" "root\n"
    $con.wait-then-write "password:" "nemo\n"
}

A user could even execute this script several times over, inside a single Simics session, each time creating a new SunFire machine attached to the virtual network, and with its own separate script branch running. Since all names are variables, the naming of each machine and its objects does not affect the script at all.

Simics script branches can also communicate and synchronize using two parallel programming primitives, barriers and fifos. Using these primitives, scripts can perform tricks like kicking off the boot of multiple target machines, waiting until all of them have booted, starting a program on one, and, once that program has started, starting a program on another target machine. Essentially, you end up with a real threaded program running inside of Simics, despite it being “just scripting”…
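To make that coordination pattern concrete, here is a rough analogy in plain Python rather than Simics CLI. It is only a sketch: it uses the standard threading.Barrier and queue.Queue in place of Simics barriers and fifos, and the function name run_boot_sequence and the machine names are made up for the illustration.

```python
import queue
import threading


def run_boot_sequence(machine_names):
    """Boot every machine, then start a program on the first two, in order.

    Returns the list of events in the order they were logged.
    """
    if len(machine_names) < 2:
        raise ValueError("need at least two machines")

    events = queue.Queue()   # thread-safe event log
    started = queue.Queue()  # fifo announcing that the first program is up
    # One party per machine plus the coordinating branch.
    all_booted = threading.Barrier(len(machine_names) + 1)

    def boot(name):
        events.put(("booted", name))
        all_booted.wait()    # report in and wait for the others

    def start_first():
        all_booted.wait()    # released only when every machine has booted
        events.put(("start", machine_names[0]))
        started.put(machine_names[0])

    def start_second():
        started.get()        # blocks until the first program has started
        events.put(("start", machine_names[1]))

    threads = [threading.Thread(target=boot, args=(n,)) for n in machine_names]
    threads.append(threading.Thread(target=start_first))
    threads.append(threading.Thread(target=start_second))
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    log = []
    while not events.empty():
        log.append(events.get())
    return log
```

Calling run_boot_sequence(["m0", "m1", "m2"]) logs the three boot events in whatever order the threads happen to run, always followed by the start on m0 and then the start on m1. The barrier enforces "everyone has booted" and the fifo enforces "the first program is up".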

This brings us to the limit of what is reasonable to do with the Simics CLI. To achieve more complicated tasks, the recommendation is to use Python. Python can be used both inline on the command line (prefixed by an @ sign) and as stand-alone Python files. Inline Python extends the expressiveness of the command line, giving access to the Simics simulator API as well as to computations that are not possible (or are clumsy) in the plain command-line language. An example is shown below, using the Python “time” module to clock how long a Simics run takes in terms of wall-clock time.

# $program, $program_args are set from other
# Simics scripts before going into this code
script-branch {
    local $con = $system.console0.con
    local $prompt = "# "
    $cmd = ("./%s %s\n" % [$program,$program_args] )
    $con.wait-then-write $prompt $cmd
    @exp_start = time.time()
    $con.wait-for-string $prompt
    @exp_end = time.time()
    @exp_time = exp_end - exp_start
    @print "\nRun took %d seconds\n" % (exp_time)
}
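The timing part of that script can also be written as a small reusable helper in a stand-alone Python file. The sketch below is independent of Simics, and the name timed is invented for the illustration. It uses time.perf_counter (available in Python 3.3 and later) instead of the time.time() calls in the listing above, since it is intended for measuring intervals.

```python
import time


def timed(action):
    """Run action() and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = action()
    elapsed = time.perf_counter() - start
    return result, elapsed
```

A caller wraps whatever should be measured in a function or lambda, for example timed(lambda: run_experiment()), where run_experiment is a placeholder for the work being clocked.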

Finally, let’s get back to the “script” that triggered my thinking on just what a script is and when scripts become proper programs. The whole script is really a system consisting of several different scripts in CLI and Python, with most of it used to set up the target machine and get the software under investigation running. Once the software is up and running, we get to the core functionality in Python. A piece of the code is shown below, which listens for the start of the target program on the target by registering a callback with the Simics OS awareness system. It also uses callbacks to listen for magic instructions within the target program, as well as a callback to note when the program has terminated. Using these mechanisms, it computes the number of network packets processed per second in the target program.

Python Code Example

This code certainly looks much more like a program to me, even though some people would call it a script just because it uses Python.
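For readers who want a feel for that callback structure without the Simics API, here is a minimal stand-alone sketch. Everything in it is hypothetical: the class name PacketRateMonitor, the method names, and the assumption that the target executes one magic instruction per processed packet are made up for the illustration. The real script registers its callbacks with the Simics OS awareness framework, which is not shown here.

```python
class PacketRateMonitor:
    """Counts packets between program start and exit (hypothetical sketch)."""

    def __init__(self):
        self.start_time = None
        self.end_time = None
        self.packets = 0

    def on_program_start(self, now):
        # Called when the target program starts; 'now' is a time in seconds.
        self.start_time = now
        self.end_time = None
        self.packets = 0

    def on_magic_instruction(self):
        # Assumption: one magic instruction marks one processed packet.
        if self.start_time is not None and self.end_time is None:
            self.packets += 1

    def on_program_exit(self, now):
        # Called when the target program has terminated.
        self.end_time = now

    def packets_per_second(self):
        # Returns None until a complete, non-empty run has been observed.
        if self.start_time is None or self.end_time is None:
            return None
        elapsed = self.end_time - self.start_time
        if elapsed <= 0:
            return None
        return self.packets / elapsed
```

In a real setup the simulator would invoke the three on_ methods. For experimenting, they can be called by hand: a start at time 10.0, fifty magic instructions, and an exit at time 12.0 give a rate of 25 packets per second.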

I recently listened to an interview with Matt Mackall from the Mercurial project, and jumped when he first called Python a scripting language, and then said that all of Mercurial is written in it. If an entire application is written in Python, I think calling it a scripting language is slightly incorrect. To me, Python is a modern high-level programming language with some nice properties, like built-in strings and dynamic typing, which make it very easy to use for scripting applications. But it is also a real, complete programming language that allows neatly structured code to be written.

The conclusion that I think can be drawn from this evolution of scripting from simple repetition of fixed tasks to multithreaded parallel scripts and full Python object-oriented programs is that there is no clear separation between “scripting” and “programming”. As the target systems get more complex and the tasks we are trying to solve get more complex too, what used to be simple scripts will tend to evolve into much more flexible and program-like constructions.
