
The Perl Journal

Volumes 1–6 (1996–2002)


(2001) Constants in Perl. The Perl Journal, vol 5(5), issue #21, Fall 2001.

Constants in Perl


Modules Used
constant in Perl 5.004 and later
B::Deparse in Perl 5.00502 and later

Wherein our protagonist explains how constants will clarify your source code, and also make your programs run ever so quickly.

Ex Librîs Pascalatôrium

In a used bookstore not too long ago, I happened on a book from 1979, about coding practices. It had the verbosely Victorian-sounding title of PASCAL with Style: Programming Proverbs: Principles of Good Programming with Numerous Examples to Improve Programming Style and Proficiency. I bought it and read it, and saw that it had common sense advice like this (translated into Perl terms):

  • Use clear spelled-out symbol names -- "$hours" or "count_lines()" instead of "$h" or "ctls()".

  • Use named constants -- i.e., instead of "$cycles * 29.53", say "$cycles * LUNATION_DAYS".

  • Don't make your code convoluted in order to get microscopic (and possibly non-existent) gains in efficiency. The microsecond that you think you'll save isn't worth the hour that I know it'll take me to make sense of your contorted code.

Each bit of advice was explained in detail, and there was lots of example code in Pascal. But after finishing the book, I said to myself "well, this is all just obvious!", and I put the book away and expected to never think of it again.

But weeks later, I was having to debug a program that was misbehaving strangely (with no guess as to how or why), and as I read through the source code, I came upon this line:

$diff = abs($stamp - $stamp_mirror) / 86400;

Now, I can hazard a guess about what this does: $stamp and $stamp_mirror are timestamps from two different files, and abs(x-y) is how we get the always positive difference between their timestamps. And, in the way of such things, those timestamps are in seconds, not days, so we divide by the conversion factor of 86,400, which is the normal number of seconds in a day. Or is it? If the program were behaving correctly, I wouldn't give that line a second thought, but since something is wrong and I don't know what, my well-practiced code-auditing paranoia kicks in, and I blaze through this line of thought:

I can neither mentally calculate nor remember offhand what the right figure is for seconds in a day, so I'm free to obsess: if 86,400 weren't actually the right number, I'd never know it just by looking at it! Maybe the programmer typed this from memory, but got a digit wrong. Maybe the real figure is 86,500, and the above code is bad. Or maybe 86,400 is the number of seconds in a twenty-three hour day (such as occurs once a year, if/when we go onto Daylight Savings Time). But somehow that's the right thing to do in this case, because if we assumed all days were 24 hours long, then something we do later with $diff would cause some dazzling catastrophe on the DST change day.

Or maybe 86,400 isn't even supposed to be the number of seconds in a day, but instead is some other conversion factor of obscure significance:

  • Four pi times the number of sidereal days in a sunspot cycle?

  • The number of inches in a nautical mile?

  • The kilocalories in a gross of Abba-Zabas?

By this time I'm just dizzy with anxiety, Abba-Zabas, and astronomy, so I just have to go and figure out what that number is:

 % perl -e "print 60*60*24"
  86400
So yes, 86,400 is the number of seconds in a normal day.

And that means that my little fugue state was apparently all for nothing: the line in question does exactly what I thought it did (but worried that it didn't!), and so it isn't interesting. The whole episode has distracted me from the task of finding why the larger program is misbehaving -- and it so exhausted me that I can't bear to read the next line of code (something about ($< % 3 and exec 'cat')||dump, whatever that does!). And as I blearily give up bug-swatting for the day, I suddenly remember the 22-year-old Pascal book that I'd read, and I wish that the person who wrote "86400" had instead followed the Pascallers' advice to use a named constant!

Constant Symbols in Perl

The first attempt at improving that code could be in replacing the conversion factor 86,400 with some sort of named entity:

$diff = abs($stamp - $stamp_mirror) / $seconds_per_day;
By having given the conversion factor a name, specifically the symbol "$seconds_per_day", then at least I know what it's supposed to actually denote. Of course, somewhere higher up in the program code, there will have to be a line defining that variable:

my $seconds_per_day = 86400;

This, however, only feeds The Fear, because the next time I read the source, I have to repeat the whole episode with making sure that 60 * 60 * 24 is 86,400 and not 84,600 or 85,400 or other likely suspects. Moreover, maybe somewhere else in the code we accidentally alter the value of $seconds_per_day! Maybe somewhere the programmer means to evaluate $seconds_per_day * 7 but accidentally types "$seconds_per_day *= 7", which does in fact give you seven times $seconds_per_day, but also then stores that larger number back into $seconds_per_day! That's the problem with wanting a constant symbol but using a variable -- it could vary, which is the last thing we'd want a constant to do. But if we want a named constant, we can get it, using the relatively underappreciated constant pragma:

 use constant SECONDS_PER_DAY => 86400;

This tells the "constant" module to make you a constant in the current package, calling the constant "SECONDS_PER_DAY". (It's conventional for constant names to be in all uppercase.) Once you've defined that constant, you can use the symbol "SECONDS_PER_DAY" (with no leading "$") to mean the value 86400, as in:

  $diff = abs($stamp - $stamp_mirror) / SECONDS_PER_DAY;

The best part about this is that if we try altering the value of the constant, as in:

  $foo = (SECONDS_PER_DAY *= 7);

then Perl refuses to even compile this. (The error message you get depends on the version of Perl you're using; under the version I have on hand, I get the somewhat cryptic message "Can't modify non-lvalue subroutine call in multiplication (*)".) Also, if you've got "use strict" at the top of your program (which you really really should), then if you try using that constant but get the name wrong, Perl catches that:

  use strict;
  use constant SECONDS_PER_DAY => 86400;
  ...
  $foo = SECONDS_IN_A_DAY * 7;
That dies in compilation, with the error "Bareword 'SECONDS_IN_A_DAY' not allowed while 'strict subs' in use...".

However, I'm back again to worrying whether the number of seconds in a day really is 86,400. So I'd prefer that the constant be declared this way:

 use constant SECONDS_PER_DAY => 60 * 60 * 24;

The work of figuring out 60 * 60 * 24 is done just once, when we call constant.pm to get it to put that value into a new constant symbol called "SECONDS_PER_DAY".
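
As a minimal sketch of the points so far, the following puts the named constant to work and then uses string eval to show that both mistakes discussed above are refused. The timestamp values are made up for illustration, and the exact wording of the error messages depends on your version of Perl.

```perl
use strict;
use warnings;
use constant SECONDS_PER_DAY => 60 * 60 * 24;

# Hypothetical timestamps, exactly three days (259_200 seconds) apart.
my ($stamp, $stamp_mirror) = (1_000_000_000, 1_000_259_200);
my $diff = abs($stamp - $stamp_mirror) / SECONDS_PER_DAY;
print "Difference: $diff days\n";

# String eval compiles its argument at run time, so a compile-time
# error lands in $@ instead of killing this program.
my $modified = eval 'my $foo = (SECONDS_PER_DAY *= 7); 1';
print "Assignment refused: $@" unless $modified;

# Same trick for a misspelled constant name under "use strict".
my $misspelled = eval 'my $foo = SECONDS_IN_A_DAY * 7; 1';
print "Misspelling refused: $@" unless $misspelled;
```

In a real program you would not wrap these in eval. The point is only that Perl rejects both lines before they could ever run.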

Deparsing

It makes you wonder -- what is Perl actually doing when I use a constant symbol like "SECONDS_PER_DAY" here, as in this line:

$diff = abs($stamp - $stamp_mirror) / SECONDS_PER_DAY;

Is it doing the same thing as if I had a $seconds_per_day -- i.e., does it have to look up a symbol's value every time it evaluates that line?

It used to be that to find out anything about Perl internals, one would have to read the Perl source, which is no small task. But one can now glean a lot of information about how Perl parses programs by using the B::Deparse module, which takes the in-memory compiled form of your program, and re-expresses it as Perl source. The compiled form is what actually gets run by the Perl interpreter, not your source code.

Suppose we have a simple program, sec1.pl, that prints how many days have passed since the epoch:

% cat sec1.pl
use constant SECONDS_IN_DAY => 60 * 60 * 24;
print "There have been ", int(time() / SECONDS_IN_DAY),
 " days since the epoch.\n";
 
% perl sec1.pl
There have been 11458 days since the epoch.

% perl -MO=Deparse sec1.pl
sub SECONDS_IN_DAY () {
    package constant;
    $scalar;
}
print 'There have been ', int time / 86400,
 " days since the epoch.\n";
sec1.pl syntax OK
The "sub SECONDS_IN_DAY" thing is a bit of a distraction here, but the interesting and important part is that we had "int(time() / SECONDS_IN_DAY)" in source, but B::Deparse tells us that Perl took this to mean something which it reiterates as "int time / 86400". In other words, where we had a constant symbol, Perl substituted the value itself, before it even ran the program.

Constants, Literal Constants, and Constant Symbols

So far I've been using the term "constant" in its most common sense, meaning a "constant symbol" (AKA a "named constant"), i.e., a symbol that refers to a slot in memory that can't change. But to keep things straight from here on out, we need to distinguish constant symbols from literal constants.

You already know what a literal is, although you might not know that word for it. Literals are the 24 in $x = 24;, or the "stuff" in print "stuff". When Perl parses those statements, it needs to put the value 24 or "stuff" someplace in memory so that if/when it has to run those commands, it will have someplace from which to copy a value into $x, or someplace from which to send a value to print. So Perl simply allocates a piece of memory for each of those, and it marks each value as a constant, just to keep programmers from changing them -- even if they try using roundabout syntax:

my $x = \"foo";  # reference to a constant
$$x .= "bar";
That dies with "Modification of a read-only value attempted", because all literal constants are read-only values. (Again, they wouldn't be very good constants if you could go changing them!) A constant symbol is just a symbol (currently implemented as a parameterless subroutine) declared by "use constant symbolname => value". And as we saw with B::Deparse, whenever Perl sees a constant symbol actually used, it compiles it as if you had used a literal constant instead, so that these turn into exactly the same thing in memory:

use constant SECONDS_IN_DAY => 60 * 60 * 24;
print "File $f is ", SECONDS_IN_DAY * -M $f, "s old!\n";
 # -M returns the age in days.
print "File $f is ", 86400 * -M $f, "s old!\n";
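
The read-only behavior of literal constants described above can be checked without crashing the program. This is a small sketch that traps the error with a block eval; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $x = \"foo";                        # reference to a literal constant

# Block eval traps the run-time error so we can look at it.
my $ok = eval { $$x .= "bar"; 1 };
print "Refused: $@" unless $ok;

# A copy of the literal's value is an ordinary variable, and can change.
my $copy = $$x;
$copy .= "bar";
print "Copy is now $copy\n";
```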

Constant Folding

Now, some would say that we should all avoid having any numeric constant literals in programs, since when one sees:

  $size = $h * $w * 4;

one always has to wonder: why four? Whereas if one says:

  use constant BYTES_PER_PIXEL => 4;
  $byte_size = $pixels_high * $pixels_wide
            * BYTES_PER_PIXEL;

then things are rather clearer.

However, that policy does bloat the code a bit, and if that's the only place you use a 4 with that meaning, then I figure a comment will do just fine, to explain the literal constant's meaning:

 $byte_size = $pixels_high * $pixels_wide
           * 4; # it takes four bytes to store a pixel
And similarly, instead of:

  use constant SECONDS_IN_DAY => 60 * 60 * 24;
  print "File $f is ", SECONDS_IN_DAY * -M $f, "s old!\n";
I am usually happy writing just this:

  print "File $f is ",  60 * 60 * 24  # seconds in a day
   * -M $f, "s old!\n";

especially if I'm unlikely to be using the seconds-per-day conversion factor anywhere else in the program. As to the "60 * 60 * 24" bit, I used to presume that Perl had to figure out 60 * 60 * 24 every time it evaluated that line. In the scheme of things, that would be no big deal, since having to evaluate -M $f (age of the file $f, in days) is going to take much longer than a bit of simple math. But I once happened to have a line like that in a program which I ran through B::Deparse for some other reason. And much to my surprise, I saw this:

  % cat bim_skalabim.pl
  ...stuff...
  print "File $f is ",  60 * 60 * 24  # seconds in a day
   * -M $f, "s old!\n";
  ...stuff...
  % perl -MO=Deparse bim_skalabim.pl
  ...stuff...
  print "File $f is ", 86400 * -M($f), "s old!\n";
  ...stuff...
I had "60 * 60 * 24 * -M $f", and through some magic, Perl compiled the "60 * 60 * 24" part as a plain 86400.

I tried experimenting a bit to see how this all worked:

  % perl -MO=Deparse -e "print 2 * 3 * 5 * 7 * $x"
  print 210 * $x;
  % perl -MO=Deparse -e "print 2 * 3 + 5 * 7 * $x"
  print 6 + 35 * $x;
  % perl -MO=Deparse -e "print $x / 2 ** 9"
  print $x / 512;
So whatever magic is happening, we see that it applies to all the simple arithmetic operators...

  % perl -MO=Deparse -e "print $x / 2 ** (9 + sin(3))"
  print $x / 564.613575929232;
  % perl -MO=Deparse -e "print 4 * atan2(1,1)"
  print(3.14159265358979);
...and apparently also applies to the more complicated cases of "**" (exponentiation) and the trig operators. And it applies even to some string operations:

  % perl -MO=Deparse -e "print 'foo'.'bar'"
  print 'foobar';
  % perl -MO=Deparse -e "print 'foo'.'bar' . $x"
  print 'foobar' . $x;
  % perl -MO=Deparse -e "print sprintf('%d is %s',12,lc('TWELVE')). $x"
  print '12 is twelve' . $x;
But of course, not all operations can be dealt with this way:

  % perl -MO=Deparse -e "print rand(123)"
  print rand 123;
That is, even though the input to the rand operator is constant, that doesn't mean you can figure out its value once and for all -- while addition and sin and string concatenation are the kind of operations that give constant output for constant input, rand isn't like that.

It turns out that this "magic" optimization that I'd stumbled on is not some occult phenomenon at work, but something well-known to people who go making compilers:

"If the value of all operands of an operation are known at compile-time, and cannot change at execution-time, then the operation can be performed by the compiler and the result inserted in place of the code to perform the operation in the object code. Such operations are known as compile-time arithmetic, a very common optimization strategy. This particular strategy is called constant folding."

-- Pyster's Compiler Design and Construction, p 163

In other words, when you have an operation that you know gives constant output when given constant input, and you observe that all its inputs are constants, then you can figure out its value right there and pretend that you had simply been given that value directly. So when Perl sees source that says (2 * 3 + 5 * 7) * $x, it will parse it into an operations tree that looks like this:

      *
     / \
    +   $x
   / \
  *   *
 /\   /\
2  3 5  7
It's simple, then, for Perl to look at the tree, see that there's a "constantable" operation "*" that takes two constant values as input, 2 and 3 -- and that means we can figure out 2 * 3 right there and fold that part of the tree up into just one node:

    *
   / \
  +   $x
 / \
6   *
    /\
   5  7
And then we can do the same thing for the "*" node with the constantable operation "*" with its constant operands 5 and 7, folding that treelet up into a single node of constant value 35. And then, moving up the tree, we see that that leaves us with a "+" operation with two constant operands 6 and 35. That can be folded into 41:

  *
 / \
41 $x
And that leaves us with a constant-foldable operation "*". But we can't fold any more, because not all the operands are constants; $x is a variable. So we're left with 41 * $x in the code tree -- and that's exactly what Deparse shows us:

  % perl -MO=Deparse -e "print( (2 * 3 + 5 * 7) * $x )"
  print 41 * $x;

When We Can Fold, But Don't

A bit more experimentation shows that constant folding doesn't currently apply in every case where it could apply:

  % perl -MO=Deparse -e "print 2 * 3 * $x * 5 * 7"
  print 6 * $x * 5 * 7;
Now, I know that 2 * 3 * $x * 5 * 7 is the same as $x * (2 * 3 * 5 * 7), so why doesn't Perl fold that into $x * 210? The -p switch to B::Deparse is handy here: it just makes it so that when Perl dumps the op tree as Perl source, it'll provide parens almost everywhere it can (whereas without the -p, we'll get parens only where necessary). So consider:

  % perl -MO=Deparse,-p -e "print $w * $x * $y * $z"
  print(((($w * $x) * $y) * $z));
Or, drawn as a tree...

      *
     / \
    *   $z
   / \
  *   $y
 / \
$w  $x
So 2 * 3 * $x * 5 * 7 must originally parse as this:

        *
       / \
      *   7
     / \
    *   5
   / \
  *   $x
 / \
2   3
Now, remember how we applied our simple constant folding: just take a constant-foldable operation (like "*"), all of whose operands are constant values, and fold that, and then keep doing that until you can't do it anymore. Starting out, we see the lowest "*", with operands 2 and 3, is foldable, so we do that:

      *
     / \
    *   7
   / \
  *   5
 / \
 6  $x
However, once that's done, there are no other foldable operations in the tree -- even though all the operations in the tree are constant-foldable operations (all instances of "*"), none of them have constants as both operands, so our constant-folding stops. And the tree as drawn above, is exactly what Deparse tells us it's left with:

  % perl -MO=Deparse,-p -e "print 2 * 3 * $x * 5 * 7"
  print((((6 * $x) * 5) * 7));
Now, we could try teaching Perl to use a knowledge of algebra to rearrange the expression to try to group constants together. This, however, is a relatively expensive operation to have the compiler try, and in real workaday code, it rarely pays off. In short, if you want to be as sure as possible that your constants fold, group them together as much as possible, and maybe even put parens around them, to try to force them to be evaluated as a single group (and hence a separate, all-constant-foldable sub-branch in the tree). And when in doubt, deparse it!

  % perl -MO=Deparse -e "print $x * (2 * 3 * 5 * 7)"
  print $x * 210;
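
The bottom-up folding walked through above can be sketched as a toy in Perl itself. This is not Perl's actual implementation, only an illustration of the idea: the tree representation and the names fold, show, is_const, and %apply are all invented for this example.

```perl
use strict;
use warnings;

# A node is a number (a constant), a string like '$x' (a variable),
# or an array ref of the form [operator, left, right].
my %apply = (
    '*' => sub { $_[0] * $_[1] },
    '+' => sub { $_[0] + $_[1] },
);

sub is_const {
    my $node = shift;
    return !ref($node) && $node =~ /^-?\d+(?:\.\d+)?$/;
}

# Fold the children first; then, if both came out constant,
# replace this whole node with the computed value.
sub fold {
    my $node = shift;
    return $node unless ref $node;
    my ($op, $left, $right) = @$node;
    $left  = fold($left);
    $right = fold($right);
    return $apply{$op}->($left, $right)
        if is_const($left) && is_const($right);
    return [$op, $left, $right];
}

# Render a tree with full parentheses, much as Deparse's -p does.
sub show {
    my $node = shift;
    return $node unless ref $node;
    my ($op, $left, $right) = @$node;
    return '(' . show($left) . " $op " . show($right) . ')';
}

# (2 * 3 + 5 * 7) * $x
my $grouped = ['*', ['+', ['*', 2, 3], ['*', 5, 7]], '$x'];
print show(fold($grouped)), "\n";    # (41 * $x)

# 2 * 3 * $x * 5 * 7, as a left-leaning tree
my $scattered = ['*', ['*', ['*', ['*', 2, 3], '$x'], 5], 7];
print show(fold($scattered)), "\n";  # (((6 * $x) * 5) * 7)
```

The second tree stops folding after 2 * 3 for the same reason Perl does: once a variable is in a subtree, no node above it has two constant operands.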
Incidentally, not all operations are like "*" in that $w op $x op $y op $z parses as ((($w op $x) op $y) op $z), i.e., producing a left-leaning tree. Yes, "*" is that way, and so it's called "left-associative", but there are right-associative operators, i.e., where $w op $x op $y op $z parses as ($w op ($x op ($y op $z))), forming a right-leaning tree. Exponentiation, "**", is an example:

  % perl -MO=Deparse,-p -e "print $w ** $x ** $y ** $z"
  print(($w ** ($x ** ($y ** $z))));
And the "=" (assignment) operator is an example, too:

  % perl -MO=Deparse,-p -e "$w = $x = $y = $z"
  ($w = ($x = ($y = $z)));
That is, copy $z's value into $y, $x, and $w.
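
Associativity changes results, not just the shape of the tree. A short sketch with made-up numbers shows the difference between the default grouping and the opposite grouping forced with parens:

```perl
use strict;
use warnings;

my $right_default = 2 ** 3 ** 2;      # parsed as 2 ** (3 ** 2), i.e. 2 ** 9
my $left_forced   = (2 ** 3) ** 2;    # 8 ** 2
my $left_default  = 10 - 4 - 3;       # parsed as (10 - 4) - 3
my $right_forced  = 10 - (4 - 3);     # 10 - 1
print "$right_default $left_forced $left_default $right_forced\n";
# prints "512 64 3 9"

my ($w, $x, $y);
my $z = 7;
$w = $x = $y = $z;                    # parsed as $w = ($x = ($y = $z))
print "$w $x $y\n";                   # prints "7 7 7"
```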

My favorite right-associative operator (and one that constant-folds, in an interesting new way) is the much under-appreciated "condition ? x : y" operator, which means: evaluate condition, and if it's true, then evaluate and return x, otherwise evaluate and return y. It's useful in situations like:

 print "Hey ", @name_bits ? diminutive($name_bits[0]) : 'you';
 # If we have name bits on hand, use the diminutive
 #  of first one, otherwise just use "you".
The trick is that you can nest x?y:z operators, and they'll do what you expect:

  % cat cond_test.pl
  print
     $condition1 ? $value1
   : $condition2 ? $value2
   : $condition3 ? $value3
   : $otherwise_value
  ;
  % perl -MO=Deparse,-p cond_test.pl
  print(($condition1 ? $value1 : ($condition2 ? $value2 :
  ($condition3 ? $value3 : $otherwise_value))));
In other words, that above block means that if $condition1 is true (which you can replace with any expression), then use the value of $value1 (likewise, any scalar expression); otherwise fall through to seeing whether $condition2 will let you use $value2; or whether $condition3 will let you use $value3; or otherwise you give up and use $otherwise_value. (See footnote 1.)
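
Here is a small runnable instance of that chained layout. The size_word function and its thresholds are invented purely for illustration.

```perl
use strict;
use warnings;

# size_word() is a made-up helper: it picks the first label
# whose condition is true, falling through to 'huge'.
sub size_word {
    my $n = shift;
    return
        $n <   10 ? 'small'
      : $n <  100 ? 'medium'
      : $n < 1000 ? 'large'
      :             'huge';
}

print "$_ is ", size_word($_), "\n" for 7, 42, 512, 86400;
```

This prints small, medium, large, and huge for the four inputs, in that order.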

Now, since we could constant-fold x * y * z, what do we do with x ? y : z, you might ask? After all, they're both just operators. Deparse shows us:

 % perl -MO=Deparse -e "print 0 ? $x : $y"
 print $y;
 % perl -MO=Deparse -e "print 1 ? $x : $y"
 print $x;
 % perl -MO=Deparse -e "print 'raspberries' ? $x : $y"
 print $x;
In other words, Perl sees that since the condition is a constant, we know which of the consequents we'll end up evaluating every time -- if the constant-condition is true (like 1 or -57 or 'raspberries'), we want the first consequent, otherwise we want the second. So we can just replace it all with the expression for the consequent we want, and throw out the code for the consequent that we know will never be evaluated.

Now, remember that since constant symbols are a kind of constant, everything I've said about constant folding with literal constants applies to constant symbols too:

 % cat named_constants.pl
 use constant BUNCH => 3;
 print "That's a bunch of arguments!\n" if @ARGV >= BUNCH;
 print "One more than a bunch is ", BUNCH + 1, "\n";
 print "A bunch is ",
   (BUNCH <2) ? 'less' : 'not less', " than a pair\n";
 % perl -MO=Deparse named_constants.pl
 sub BUNCH () {
    package constant;
    $scalar;
 }
 print "That's a bunch of arguments!\n" if @ARGV >= 3;
 print 'One more than a bunch is ', 4, "\n";
 print 'A bunch is ', 'not less', " than a pair\n";
Note that the BUNCH + 1 was constant-folded to 4, and the (BUNCH <2) ? 'less' : 'not less' was constant-folded to 'not less'.
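
The same check can be made from inside a program rather than from the command line, using B::Deparse's coderef2text method, which returns the deparsed body of a subroutine as a string. This is a sketch only: the exact text returned (including any pragma lines it shows) varies between Perl versions, so no output is reproduced here.

```perl
use strict;
use warnings;
use B::Deparse;
use constant BUNCH => 3;

my $deparser = B::Deparse->new;

# Deparse two anonymous subs that use the constant symbol.
my $sum_text  = $deparser->coderef2text(sub { return BUNCH + 1 });
my $cond_text = $deparser->coderef2text(
    sub { return BUNCH < 2 ? 'less' : 'not less' }
);

print "BUNCH + 1 deparses as:\n$sum_text\n\n";
print "The conditional deparses as:\n$cond_text\n";
```

If folding happened, the first body should mention 4 rather than BUNCH, and the second should contain only the 'not less' branch, with no "?" left in it.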

Folding All Conditionals

We saw that the "x?y:z" conditional operator was constant-foldable -- but Perl has two other kinds of basic conditional structures:

  • The short-circuit operators: "x && y" and "x || y" (or their variants "x and y" and "x or y", which are different only in that "and" and "or" have lower precedence). These form expressions, so you can use them in the middle of any statement.

  • The structures: "if(cond) { block... }", "statement if cond", and the derivative forms with "unless" and "else" and "elsif". These form statements, so can't be in the middle of an expression. (I.e., you can't say "$x = if($cond) { block } else { block }", although you can always wrap any statements in a do { ... } to get an expression.)

Since constant-folding applies to the "x?y:z" structure, does it also apply to the other two kinds of conditionals? Let's see:

 % cat fold_if.pl
 if(0) {
   print "Mmmm Abba-Zaba\n";
 } else {
   sleep 5;
   print STDERR "I started at $^T\n";
 }
 1 && print time-$^T, "s elapsed\n";
 % perl -MO=Deparse fold_if.pl
 sleep 5;
 print STDERR "I started at $^T\n";;
 print time - $^T, "s elapsed\n";
So that means that you can sort of nullify any code by putting "if(0) { ... }" around it -- it is still parsed (so it had better be good code), but it gets discarded once Perl sees that it's something it'd never execute.

But this is really most effective when you combine it with named constants:

 % cat fold_if2.pl
 use constant DEBUG => 2;   # our "debug level"
 DEBUG and print STDERR "Start: ", scalar(localtime), "\n";
 print STDERR "Arguments: @ARGV\n" if DEBUG > 1;
 print "...Pretend real work gets done here...\n";
 sleep 3;
 DEBUG and print time-$^T, "s elapsed\n";

 % perl -MO=Deparse fold_if2.pl
 sub DEBUG () { package constant; $scalar; }
 print STDERR 'Start: ', scalar localtime, "\n";
 print STDERR "Arguments: @ARGV\n";
 print "...Pretend real work gets done here...\n";
 sleep 3;
 print time - $^T, "s elapsed\n";

If we change the first line of the program to "use constant DEBUG => 1;", and then deparse it:

 % perl -MO=Deparse fold_if2.pl
 sub DEBUG () { package constant; $scalar; }
 print STDERR 'Start: ', scalar localtime, "\n";
 '???';
 print "...Pretend real work gets done here...\n";
 sleep 3;
 print time - $^T, "s elapsed\n";
 fold_if2.pl syntax OK
The '???' thing looks strange here, but perldoc B::Deparse explains it as just "a sign that perl optimized away a constant value". It shows up a whole lot more when we change the first line of the program to "use constant DEBUG => 0;", and then deparse it all:

 % perl -MO=Deparse fold_if2.pl
 sub DEBUG () { package constant; $scalar; }
 '???';
 '???';
 print "...Pretend real work gets done here...\n";
 sleep 3;
 '???';
Effectively, it's as if your input program had consisted of just:

 print "...Pretend real work gets done here...\n";
 sleep 3;
because those are the only statements that are actually executed at run-time.

Optimizing with use constant DEBUG => 0

Although it's straightforward how (and when) it's clearer to define a SECONDS_PER_DAY and use that instead of a bare 86400, it's less obvious why we should discard this:

 my $Debug = 0;
 print STDERR "Arguments: @ARGV\n" if $Debug > 1;
...in favor of this:

 use constant DEBUG => 0;
 
 print STDERR "Arguments: @ARGV\n" if DEBUG > 1;
An anecdote will explain the difference: Some time ago, I was debugging an early version of HTML::TreeBuilder (a module that's discussed in TPJ #19, and in my forthcoming book from O'Reilly). For HTML::TreeBuilder to correctly parse a piece of HTML source, a lot of different methods in several different classes have to cooperate. When it works, it works great; but when it doesn't, it's no easy task figuring out which method in which class went wrong.

In desperation, I went through just about every line in HTML::TreeBuilder, and added a line just before it, explaining what it was about to do (or after it, explaining what it'd just done), like this:

 $Debug > 1 and print $indent, $node->tag, "'s parent is a ",
   $node->parent->tag, " -- about to check whether that's legal.\n";
or like this rather more convoluted case:

  if($Debug > 2) {  # Say who called the current routine:
   require Data::Dumper;
   local $Data::Dumper::Indent = 0;
   local $Data::Dumper::Useqq  = 1;
   printf "Called from %s line %s, with args %s",
    (caller)[1,2], Data::Dumper::Dumper(\@_);
  }
Now, evaluating a single "$Debug > 1", is nothing next to the overhead of actually performing the print statement -- when $Debug actually is greater than 1. But for the vast majority of times that people are running HTML::TreeBuilder on an HTML document, $Debug is 0, and stays 0, so all the evaluation and reevaluation of the "$Debug > 1" lines and "$Debug > 2" lines is wasted overhead. On the off chance that it might speed up the module (and HTML::TreeBuilder really needed speeding up), I tried changing the initial "my $Debug = 0" to "use constant DEBUG => 0", and then search-and-replaced "$Debug" to "DEBUG". That meant that for most users most of the time, all the dozens of "$Debug and print..." lines and "if($Debug){...}" blocks might as well not be there.

I then benchmarked the before and after versions of the module, and found that the simple change from $Debug to DEBUG made HTML::TreeBuilder work about ten percent faster -- a much appreciated speed payoff for just the few minutes it took me to make the change and run some benchmarks.

That's an extreme case, since the average program or module isn't typically so thick with "$Debug > 1 and print..." lines as HTML::TreeBuilder was. But since learning that the "CONSTANT > level and print..." statements and the "if(CONSTANT > level) {... }" blocks get optimized away before the program is even run, I've felt freer to use more such statements, and that's made my programs both clearer, and easier to debug. They're clearer because the content of the print lines has some of the same explanatory value as comments. And the programs are easier to debug, because if one of them misbehaves, I can simply change its "use constant DEBUG => 0" line to have a 2 or a 3, and I then follow along through the resulting output, to see where unexpected things start happening.
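
To try this kind of comparison yourself, the standard Benchmark module will do. The following is only a rough sketch with an artificial loop, not the HTML::TreeBuilder benchmark described above. The loop body, the iteration count, and the labels are all made up, and the numbers you get will depend on your machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use constant DEBUG => 0;

my $Debug = 0;    # the variable-based equivalent of DEBUG
my $count = 0;    # stand-in for "real work"

# Run each version for about one CPU second and compare rates.
cmpthese(-1, {
    # The test on $Debug is evaluated on every pass through the loop.
    variable => sub {
        for (1 .. 1000) { $Debug > 1 and $count++; $count++ }
    },
    # The test on DEBUG is folded away when this sub is compiled.
    constant => sub {
        for (1 .. 1000) { DEBUG > 1 and $count++; $count++ }
    },
});
```

Because the loop does almost nothing besides the debug test, any difference shown here overstates what you would see in a program that does real work between tests.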

Constants Based on Expressions

Now, most constants in typical programs do come from literals and all-literal math expressions, like so:

 use constant DEBUG => 2;
 use constant INDENT => "\t\t";
 use constant SECONDS_IN_DAY => 60 * 60 * 24;
However, you can easily take your values from some other sources, including Perl magic variables:

 use constant DEBUG => $ENV{'DODAD_DEBUG'};
 use constant IS_WIN => ($^O =~ m/Win32/);
 use constant HAS_CONFIG => (-e ".dodadrc");
This, however, won't work:

 my $x = 'stuff';
 use constant THING => $x;
... because while Perl will have compiled the "my $x = 'stuff';" line before it parses and compiles the "use constant THING => $x;" line, it won't have executed the "my $x = 'stuff';" line, so $x is still undefined at the moment the constant is made. (And if or when it does execute it, it'll be too late.) The short story here is that defining a constant based on a variable is itself a fishy concept, and is best avoided -- except for the variables, like %ENV or $^O, which Perl pre-defines.
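
The timing problem can be seen directly. In this sketch, THING ends up undefined, while a second constant gets its value because the assignment is pushed into a BEGIN block so that it runs during compilation. The names OTHER_THING and $y are invented for the example, and the BEGIN approach is shown only to illustrate the timing, not as a recommendation.

```perl
use strict;
use warnings;

my $x = 'stuff';
use constant THING => $x;         # runs at compile time: $x is declared
                                  # but has not been assigned yet

my $y;
BEGIN { $y = 'stuff' }            # this assignment happens at compile time
use constant OTHER_THING => $y;   # so $y already has its value here

my $thing = THING;
print "THING is ", (defined $thing ? $thing : 'undef'), "\n";
print "OTHER_THING is ", OTHER_THING, "\n";
```

This prints "THING is undef" and then "OTHER_THING is stuff".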

That even applies to @ARGV -- and you can make constants out of command-line switches using my Getopt::constant module, available in CPAN. With that module, the sort of optimizations that we saw applying to the "if(DEBUG)..." lines, can be made to apply based on any number and combination of constants from the command line switches, as with:

 use Getopt::constant (
   # Declare the @ARGV switches we accept, and their defaults:
   'octal'   =>  0,  # This is set to 1 if user says -octal
   'margin'  => 78,  # Overridable with -margin=65
   ':prefix' => 'C_',
     # So the value of the "margin" and "octal" switch-options
     #  shows up as constants called "C_margin" and "C_octal".
 );
 ...
 $item = sprintf(C_octal ? '%03o' : '%02x', $value);
 ...
 $output =~ s/\s+// if C_octal and C_margin < 70;
 ...
You can even generate constants of your own without the constant.pm library at all, based on any value -- say, a value read from a configuration file, or even pulled off a Web page. This, however, usually requires tinkering with BEGIN { ... } blocks, and/or symbol-table manipulation -- and each of those topics, to say the least, merits an article of its own.

References

Ledgard, Henry F., John F. Hueras, and Paul A. Nagin. 1979. PASCAL with Style: Programming Proverbs: Principles of Good Programming with Numerous Examples to Improve Programming Style and Proficiency. Hayden Book Company, Rochelle Park, New Jersey.

Pyster, Arthur B. 1980. Compiler Design and Construction. Van Nostrand Reinhold, NY NY.

Abba-Zabas are made by the Annabelle Candy Company. https://www.annabelle-candy.com/abbazaba.htm

The modern-day equivalent of the good advice in PASCAL with Style can be found in such sources as these:

perlstyle.pod (Documentation in the standard Perl distribution.)

Scott, Peter and Ed Wright. 2001. Perl Debugged. Addison-Wesley, Boston.

Hunt, Andrew and David Thomas. 1999. The Pragmatic Programmer. Addison-Wesley, Boston.

1 Yes, it's just like Lisp's (cond (condition1 value1)(condition2 value2) ... (t otherwise_value)) construct.

Sean M. Burke is an obsessive origamist. That explains his constant folding.
