October 19, 2017

What's In a Namespace?

Python programmers talk about namespaces a lot. The Zen of Python* ends with
Namespaces are one honking great idea—let’s do more of those!
and if Tim Peters thinks namespaces are such a good idea, who am I to disagree?

Resolution of Unqualified Names

Python programmers learned at their mothers' knees that Python looks up unqualified names in three namespaces: first, the local namespace of the currently-executing function or method; second, the global namespace of the module containing the executing code; third and last, the built-in namespace that holds the built-in functions and exceptions. (Functions defined inside other functions add one more step: the enclosing functions' namespaces are consulted after the local one and before the global one.) So, it makes sense to understand the various namespaces that the interpreter can use. Note that when we talk about name resolution we are talking about how a value is associated with an unadorned name in the code.

At the top level of a module, including the main module of a running program, there is no separate local namespace. A name must be present in either the module's global namespace or, if not there, in the built-in namespace that holds functions like len, the standard exceptions, and so on. In other words, in module-level code (for example when __name__ == '__main__') the local and global namespaces are the same.
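A quick check makes this concrete. The following is a minimal sketch, meant to be run as a script; the function name inside is made up for illustration.

```python
# In module-level code, locals() and globals() are the very same dictionary.
print(locals() is globals())   # True when run as a script


def inside():
    # Inside a function the local namespace is a separate thing.
    return locals() is globals()


print(inside())                # False
```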

When the interpreter compiles a function it keeps track of names which are bound inside the function body (this includes the parameters, which are established in the local namespace before execution begins) and aren't declared as either global or (in Python 3) nonlocal. Because it knows the local names, the interpreter can assign each of them a pre-defined place in the stack frame (where local data is kept for each function call), and does not generally need to perform a lookup. This is the main reason local access is faster than global access.
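You can see the result of that bookkeeping on a function's code object. This is a small sketch; RATE and total are illustrative names, not anything from a real library.

```python
RATE = 3


def total(price):
    tax = price * RATE   # tax is bound here, so it is local; RATE is not
    return price + tax


# Local names (parameters first) get fixed slots recorded by the compiler.
print(total.__code__.co_varnames)   # ('price', 'tax')

# Names that must be looked up at run time are listed separately.
print(total.__code__.co_names)      # ('RATE',)
```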

Although the interpreter identifies local names by the presence of bindings within a function body, there is nothing to stop you writing code that references the names before they are bound. Under those circumstances you will see an UnboundLocalError exception raised with a message like "local variable 'b' referenced before assignment".
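Here is a minimal sketch of that situation; the function name show_then_bind is invented for the example. The exact wording of the message varies between Python versions.

```python
b = 10


def show_then_bind():
    # The assignment further down makes b local to the whole function body,
    # so this reference fails even though a global b exists.
    print(b)
    b = 20


try:
    show_then_bind()
except UnboundLocalError as exc:
    message = str(exc)
    print(message)
```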

For names that are not local, something very like a dictionary lookup takes place first in the module's global namespace and then in the built-ins. If neither search yields a result then the interpreter raises a NameError exception with a message like "name 'nosuch' is not defined".
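The search order can be modelled in a few lines. This is only a sketch of the idea, not how the interpreter is implemented; lookup is a hypothetical helper and module_globals stands in for a module's global dictionary.

```python
import builtins


def lookup(name, module_globals):
    """Mimic the global-then-built-in search for a name that is not local."""
    if name in module_globals:
        return module_globals[name]
    if hasattr(builtins, name):
        return getattr(builtins, name)
    raise NameError(f"name {name!r} is not defined")


print(lookup("len", {}) is len)        # found in the built-ins
print(lookup("len", {"len": "mine"}))  # a global binding shadows the built-in

try:
    lookup("nosuch", {})
except NameError as exc:
    print(exc)
```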

Resolution of Qualified Names

Resolution of qualified names (those consisting of a sequence of names or expressions delimited by dots, such as os.path.join) starts by locating the first object (in this case os) in the standard way described above. Thereafter the mechanism can get complex because, like many Python features, you can control how it works for your own objects by defining __getattr__ and/or __getattribute__ methods, and because descriptors (primarily used in accessing properties) can cloud the picture.

In essence, though, the mechanism is that the interpreter, having located the object bound to the unqualified name, then makes a getattr call for the second name (in this case, path) in that namespace, yielding another object, against which a further getattr call is made with the third component of the name, and so on. If at any point a getattr fails then the interpreter raises an AttributeError exception with a message such as "'module' object has no attribute 'name'."
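The chain of getattr calls can be written out by hand. This sketch ignores the __getattr__ and descriptor subtleties mentioned above; resolve is a hypothetical helper written for the example.

```python
import os
from functools import reduce

# os.path.join, resolved one attribute at a time.
path_module = getattr(os, "path")
join_function = getattr(path_module, "join")
print(join_function is os.path.join)   # True


def resolve(obj, dotted):
    """Apply getattr once for each dot-separated component."""
    return reduce(getattr, dotted.split("."), obj)


try:
    resolve(os, "path.nosuch")
except AttributeError as exc:
    print(exc)
```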

Understanding Expression Values

Once you understand the mechanisms for looking up the values of names it becomes a little easier to understand how Python computes expression values. Once a name is resolved there may be other methods to apply such as __getitem__ for subscripting or __call__ for function calls. These operations also yield values, whose namespaces can again be used to look up further names. So, for example, when you see an expression like

    e.orig.args[0].startswith('UNIQUE constraint failed')

you understand that the name e.orig.args is looked up by going through a sequence of namespaces and evaluates to a sequence object (a tuple, if e.orig is an exception), to which a subscripting operation is applied to get the first element, in whose namespace the name startswith is resolved (hopefully to something callable) to a value that is finally called with a string argument.
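Taken one step at a time, the evaluation looks roughly like this. The object e below is a hypothetical stand-in built from SimpleNamespace, and the message text is invented; only the shape matters.

```python
from types import SimpleNamespace

# Hypothetical stand-in for an error object that wraps another exception.
e = SimpleNamespace(
    orig=SimpleNamespace(args=("UNIQUE constraint failed: example",))
)

orig = getattr(e, "orig")                     # e.orig
args = getattr(orig, "args")                  # .args
first = args[0]                               # subscripting, via __getitem__
method = getattr(first, "startswith")         # .startswith
result = method("UNIQUE constraint failed")   # the call, via __call__

# The step-by-step value matches the one-line expression.
print(result == e.orig.args[0].startswith("UNIQUE constraint failed"))
```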

Ultimately, by decomposing the expressions in this way you end up only dealing with one object at a time. Knowing how these mechanisms work in principle can help you to decipher complex Python code.

* Just type import this into a Python interpreter, or enter python -m this at the shell prompt, and hit return.

August 12, 2015

Pro tip: Use CDPATH in Your Shell

This is a tip I first picked up about thirty years ago (my God!) when I worked at Sun Microsystems and used the C shell fairly extensively. Fortunately there's no need to do so any more, as many of its more desirable features have been incorporated into bash.

There's nothing worse, even with tab-completion, than having to stab through a sequence of directories to find the location you want. If you're anything like me there are typically three or four main directories that you use for about 80% of the work that you do.

The CDPATH environment variable works pretty much like your PATH setting, except that instead of being used to locate executables it's used to locate directories. It comes into play when you issue a cd or pushd command. So, for example, on my personal machine my .bash_profile contains the following line:

    export CDPATH=.:~/Projects:~/Projects/Python/

So when I try to change to a new directory the shell first looks in the current directory (it can cause real confusion if you don't look there first), then in my Projects directory, then in its Python sub-directory. So the command

    pushd PytDj

takes me to my ~/Projects/Python/PytDj project directory with no need to specify the path. I estimate this saves me at least a minute a day, so over thirty years it's saved me a substantial amount of time. Try it, and see what you think.
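For the curious, the search order can be modelled in Python. This is a simplified sketch of the idea rather than the shell's actual code: resolve_cd_target is a hypothetical helper, the directory tree is a throwaway built in a temporary directory, and expanduser stands in for the tilde expansion that the shell performs when CDPATH is assigned.

```python
import os
import tempfile


def resolve_cd_target(name, cdpath, cwd):
    """Return the directory a CDPATH-style search would pick, or None."""
    # Absolute names and names starting with ./ or ../ skip the CDPATH search.
    if os.path.isabs(name) or name.split("/")[0] in (".", ".."):
        bases = [""]
    else:
        # Try each CDPATH entry in order, then the plain relative name.
        bases = cdpath.split(":") + [""]
    for base in bases:
        candidate = os.path.normpath(
            os.path.join(cwd, os.path.expanduser(base), name)
        )
        if os.path.isdir(candidate):
            return candidate
    return None


# Demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    projects = os.path.join(root, "Projects")
    python_dir = os.path.join(projects, "Python")
    os.makedirs(os.path.join(python_dir, "PytDj"))
    cdpath = ".:" + projects + ":" + python_dir
    found = resolve_cd_target("PytDj", cdpath, cwd=root)
    print(found == os.path.join(python_dir, "PytDj"))
```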

June 14, 2015

A Short Musical(?) Introduction to Infinity

Wondering how best to convey the bigness of infinity, I came upon the idea of using a musical exposition. Please forgive the inadequate nature of the performance - like present-giving it's the thought that counts. It's a monologue on the nature of the smallest infinite cardinal. Enjoy. Or not ...


May 20, 2015

What Kind of Geek am I? (1)

I've been pondering why progress sometimes seems hard to make when I am learning a new piece of technology, which other people apparently just pick up and start using with no trouble at all. I now wonder whether it's a combination of personality and knowledge.

In common with many developers I like to understand things thoroughly. For me this means being able to construct satisfactorily self-consistent mental models that I can, effectively, "unit test" by applying what-if techniques. Unfortunately my history as a developer means that I have a detailed knowledge of system architecture. Without claiming to be fully up-to-date (for example the last time I built logic was in 1987, from Schottky TTL clocked at 12 MHz), when I use the term "full-stack" I include a reasonably detailed knowledge of all aspects of system architecture, including the hardware, operating system, language support libraries and application.

In other words, I am (often subconsciously, thanks to many more than 10,000 hours of practice at this computer game) aware of lots of things that can go wrong at many different levels.

I suspect this causes me, without specific motivation, to choose test cases that go “near the edge of the envelope”. It's frustrating when I'm learning because it often reveals deficiencies (most often in the documentation) of the “system under test” that can block progress; but I also believe it helps to integrate my newly-learned knowledge, and leaves me with valuable knowledge that you just can't pick up in the middle of the road.