A More Direct Explanation of the Problem

Subject:      Re: Critique/comments for web board?
From:         mjd@op.net (Mark-Jason Dominus)
Date:         Thu, 16 Sep 1999 21:46:20 GMT
Message-ID:   <7rroeb$s1n$1@monet.op.net>
Newsgroups:   comp.lang.perl.misc

In article <37e05478_1@news2.one.net>, David Wall <darkon@one.net> wrote:
>
>In article <7rnome$ho2@dfw-ixnews19.ix.netcom.com>, ebohlman@netcom.com (Eric Bohlman) wrote:
>>2) You're using symbolic references just to reduce typing.  This is a 
>>false economy.  Use a hash instead, and use strict.
>
>I'm not sure I understand offhand what you mean here.  I guess I shouldn't 
>have done something like the following?
>
># grab the message fields and put them in their own variables
># so I don't have to type so much when maintaining this code
>foreach $field ( keys %{$Messages{$id}} ) {
>   $$field = $Messages{$id}{$field};
>}

Right.

>What's so bad about symbolic references?  [Time passes while I look it up in 
>the Camel]  Hmm, I suppose I could accidentally mix up a hard and a symbolic 
>reference, and then have one hell of a time figuring out the bug, eh?

No. The real problem is that if your string contains something unexpected, it will sabotage a totally unrelated part of the program, and then you will have one hell of a time figuring out the bug.

Suppose something goes wrong and one of the `field names' in one of the messages gets mangled. Maybe there was some error return you didn't detect, and so one of the keys turns out to be 0, or maybe some function got called in the wrong context and you ended up with a field name of 4 for some reason, or something gets quoted that shouldn't be and the field names end up being *.

Then your loop goes blithely ahead and tries to modify $0 or $4 or $*, which are reserved variables with special meanings, and then something bizarre and disgusting happens---the details depend on the variable name. For example, if you accidentally modify $*, suddenly some of your regexes might start matching when they shouldn't. Try debugging that.

If the field name comes out to be /, then you end up modifying $/, and suddenly every filehandle read in the rest of the program comes out bizarre and wrong. The read operations might appear in a part of your program very far away from the place that the bug occurred, and it could take you days to figure it out---or you might never figure it out.
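Here is a minimal sketch of that failure. The %message hash, its contents, and the count_records helper are made up for illustration; only the assignment loop is taken from the code under discussion. It reads the same three-line text before and after the loop runs over a message with a mangled key.

```perl
use strict;
use warnings;

# Hypothetical message in which one field name was mangled into "/".
my %message = ( SUBJECT => 'Hello', '/' => 'oops' );

# Hypothetical helper: count the records readline finds in a string.
sub count_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return scalar @records;
}

my $text   = "first\nsecond\nthird\n";
my $before = count_records($text);    # $/ still has its default, "\n"

{
    no strict 'refs';                 # the loop only runs with strict refs off
    foreach my $field ( keys %message ) {
        $$field = $message{$field};   # for the key "/", this assigns to $/
    }
}

my $after = count_records($text);     # $/ is now "oops"

print "records before: $before, after: $after\n";
```

The second read returns the whole text as a single record, because the separator it is now looking for never appears. Nothing near the read itself hints at why.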

If there's a field name which is accidentally i, and you go and tamper with $i, then you might not even notice the problem, until one day someone adds some code that does

	for $i (...) { 
	   ...
	}

and then your $$field assignment smashes the loop variable, and the program goes into an infinite loop, and the whole thing is mystifying.
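A small sketch of how this could play out, assuming the loop is a C-style for over the package variable $i (a foreach loop aliases its variable to each list element, so the effect there would differ). The %message contents and the safety cap are made up so that the demonstration terminates.

```perl
use strict;
use warnings;

# Hypothetical message with a field that happens to be named "i".
my %message = ( i => 0 );

our $i;                # package variable used as a loop counter
my $iterations = 0;

for ( $i = 0; $i < 3; $i++ ) {
    no strict 'refs';
    foreach my $field ( keys %message ) {
        $$field = $message{$field};   # resets $main::i to 0 on every pass
    }
    $iterations++;
    last if $iterations >= 10;        # safety cap so the demo terminates
}

print "loop body ran $iterations times instead of 3\n";
```

Without the cap the loop would never finish, since $i is set back to 0 before each increment.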

The real problem is that $$field = ... is not confined at all, and if something goes wrong, it can smash any variable in the entire program, which will have a bizarre effect, possibly on something far away.

This is the sort of problem that variable namespaces are supposed to prevent. Code A is not supposed to be able to tamper with the variables of code B that is far away. A hash is a portable namespace, and if you use a hash here, you are safe from all these weird problems:

	# grab the message fields and put them in their own variables
	# so I don't have to type so much when maintaining this code
	my %f;
	foreach my $fieldname ( keys %{$Messages{$id}} ) {
	   $f{$fieldname} = $Messages{$id}{$fieldname};
	}

Now you have to write $f{SUBJECT} or whatever. That is only three characters extra, and you no longer run the risk of smashing variables in distant parts of the program.

Also, when you write it like this, it becomes immediately apparent that you can make it shorter and simpler at the same time:

	# grab the message fields and put them in their own variables
	# so I don't have to type so much when maintaining this code
	my %f = %{$Messages{$id}};

This is shorter, simpler, safer, and more efficient. That is a couple of big wins and a medium-sized win, and the cost in return is that you have to write

	$f{SUBJECT}

instead of

	$SUBJECT

So as usual there are tradeoffs, but in this case the tradeoffs are rather lopsided. The soft-reference technique has big problems, and trivial benefits.
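To make the comparison concrete, here is a sketch of the hash version fed the same kind of mangled keys. The contents of %Messages and the message id are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical message store; several field names are mangled.
my %Messages = (
    42 => { SUBJECT => 'Hello', '/' => 'oops', 0 => 'bad', i => 7 },
);
my $id = 42;

# Copy the fields into a private hash.
my %f = %{ $Messages{$id} };

# The mangled names are just keys in %f; $/, $0 and $i are not touched.
my $separator_intact = ( $/ eq "\n" ) ? 'yes' : 'no';

# %f is a copy, so changing it leaves the stored message alone.
$f{SUBJECT} = 'Changed';

print "input separator intact: $separator_intact\n";
print "mangled key stored as data: $f{'/'}\n";
print "original subject: $Messages{$id}{SUBJECT}\n";
```

The bad field names are still wrong, but now they are wrong data sitting in one hash, which is a far easier thing to find than a clobbered special variable.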

>So instead I should just go ahead and use $Messages{$id}{$field}?  Hey, 
>whatever happened to Laziness? 

I'm all for laziness. I'm too lazy to spend half my life tracking down monster bugs caused by accidentally modified global variables. That's why we have private variables in the first place: To prevent exactly this sort of error.
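As a rough illustration of that protection, the sketch below shows two things: with strict refs in force the symbolic assignment is refused outright, and with strict refs off it can only reach the package variable, never a my variable of the same name. The variable names and values are made up.

```perl
use strict;
use warnings;

my $SUBJECT = 'private value';   # lexical: not in the symbol table
my $field   = 'SUBJECT';

# With strict refs in force, the symbolic assignment dies at run time.
my $ok    = eval { $$field = 'smashed'; 1 };
my $error = $ok ? '' : $@;

# With strict refs off, the same assignment writes to the package
# variable $main::SUBJECT and leaves the lexical $SUBJECT alone.
my $global;
{
    no strict 'refs';
    $$field = 'smashed';
    $global = $$field;           # read back the package variable
}

print "strict refs error: $error";
print "lexical \$SUBJECT: $SUBJECT\n";
print "package \$SUBJECT: $global\n";
```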

I hope this makes the potential problem clearer.



mjd-perl-misc@plover.com