Python Scopes and Namespaces

20 Mar 2016

Understanding the namespace and scoping rules of a programming language is essential to using the language effectively. For Python, you can approach this subject in two ways: one is to read its wonderful documentation; the other is to play with the language and be enlightened. I will do both and summarize what I find.

Read The Docs

Reading the documentation is an important first step in hacking on a software system and understanding how it works. That is why good documentation is as important in software engineering as good code. So here we go:

First, some definitions from the execution model section of Python’s documentation:

What is a code block?

A block is a piece of Python program text that is executed as a unit. The following are blocks: a module, a function body, and a class definition. Each command typed interactively is a block.

A code block is executed in an execution frame. A frame contains some administrative information (used for debugging) and determines where and how execution continues after the code block’s execution has completed.

A code block (and only a code block) creates a scope:

A scope defines the visibility of a name within a block. If a local variable is defined in a block, its scope includes that block. If the definition occurs in a function block, the scope extends to any blocks contained within the defining one, unless a contained block introduces a different binding for the name (shadowing).

Now that we know what a scope is, how is the scope of a name determined? It is determined by the principle of the nearest enclosing scope:

When a name is used in a code block, it is resolved using the nearest enclosing scope. The set of all such scopes visible to a code block is called the block’s environment. If a name is bound in a block, it is a local variable of that block. If a name is bound at the module level, it is a global variable. If a variable is used in a code block but not defined there, it is a free variable (for example, a global variable used in a function body, but not defined in the function).

To be more specific about the scoping rules, some people call this the LEGB rule, which stands for Local, Enclosing, Global and Built-in. Names are looked up in that order (LEGB), and the first scope that has a binding for the name wins; the same idea is described in the Python tutorial written by Python’s creator himself.
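
As a rough illustration of the LEGB lookup order, here is a minimal sketch. It is written in Python 3 syntax (unlike the Python 2 experiments later in this post), and every name in it is hypothetical, chosen only for this example.

```python
# Python 3 syntax. All names are hypothetical and exist only for illustration.
name = 'global'            # G: bound at module level


def outer():
    name = 'enclosing'     # E: bound in the enclosing function

    def inner_local():
        name = 'local'     # L: bound in the innermost function
        return name

    def inner_enclosing():
        return name        # no local binding, so the enclosing one is found

    return inner_local(), inner_enclosing()


def use_global():
    return name            # no local or enclosing binding: module level wins


def use_builtin():
    return len('abc')      # 'len' is bound nowhere above: found in builtins


results = (outer(), use_global(), use_builtin())
print(results)  # (('local', 'enclosing'), 'global', 3)
```

Each function resolves the same lookup one step further out, which is all the LEGB acronym is meant to capture.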

Hands-On Experiments

Now let me get my hands dirty so that the information sinks in. Let us start with something simple:

# a global scope and a function scope
ng = 'a global'

def f():
    nl = 2
    print ng, nl

f()

The output is:

a global 2

In the example above, ng is a free variable in function f. The Python interpreter first looks for ng in the local scope created by f and does not find it. It then searches the next enclosing scope, which here is already the global scope, and finds ng bound to the value 'a global'. No surprise whatsoever. Now something more interesting:

# a global scope and a function scope
ng = 'a global'

def f():
    nl = 2
    print ng, nl
    ng = 3
    print ng, nl

f()

Produced output:

Traceback (most recent call last):
  File "./scopes.py", line 14, in <module>
    f()
  File "./scopes.py", line 8, in f
    print ng, nl
UnboundLocalError: local variable 'ng' referenced before assignment

Wow! This is surprising if you come from programming languages in which variables are declared explicitly. The behavior above shows that ng is ‘declared’ local by a name-binding operation (here, an assignment statement) anywhere in the function body, and that the local ng is then in effect throughout the entire function scope. As a result, ng no longer refers to the global ng inside f, which is why the first reference to ng is an access to an unbound local variable!

If you want to rebind the global ng inside a function, you need to use the global keyword:
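
One way to see that the decision is made for the whole function, before any line of it runs, is to inspect the compiled code object. The sketch below uses Python 3 syntax and two hypothetical helper functions; it assumes CPython, where local names are recorded in the `co_varnames` attribute of a function's code object.

```python
# Python 3 syntax; reads_only and reads_then_assigns are hypothetical helpers.
ng = 'a global'


def reads_only():
    return ng              # never assigned here, so ng is looked up globally


def reads_then_assigns():
    value = ng             # fails: ng is local to this whole function
    ng = 3
    return value


# Local names are recorded on the code object when the function is compiled.
print('ng' in reads_only.__code__.co_varnames)          # False
print('ng' in reads_then_assigns.__code__.co_varnames)  # True

raised = False
try:
    reads_then_assigns()
except UnboundLocalError:  # a subclass of NameError
    raised = True
print(raised)              # True
```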

ng = 'a global'

def f():
    global ng
    nl = 2
    print ng, nl
    ng = 3
    print ng, nl

f()
print ng

Now the output is as expected:

a global 2
3 2
3

If you do not use the global keyword, a name binding in a local scope always creates a local variable that is visible only within that scope. In fact, Python 2 can rebind names in only two scopes: the local scope (with an assignment statement) and the module-global scope (with a global declaration). There is no way to rebind (assign to) a name in an enclosing function scope that lies between the two. This is one of Python’s annoying quirks, and it has been fixed in Python 3 with the nonlocal statement. It is also one of the reasons why you should use Python 3 (or learn another programming language!).

Now let us investigate the scope of a class block:
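
For comparison, here is a minimal sketch of how Python 3 lets a nested function rebind a name in its enclosing function scope. It uses the nonlocal statement, which does not exist in Python 2; make_counter and the other names are hypothetical.

```python
# Python 3 only: the nonlocal statement is not available in Python 2.
def make_counter():
    count = 0              # bound in the enclosing function scope

    def increment():
        nonlocal count     # rebind the enclosing name instead of a new local
        count += 1
        return count

    return increment


counter = make_counter()
first, second = counter(), counter()
print(first, second)  # 1 2
```

Without the nonlocal line, count += 1 would make count local to increment and raise UnboundLocalError, for the same reason as the ng example above. In Python 2, a common workaround was to mutate a container held in the enclosing scope rather than rebind the name.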

class Foo(object):
    nclass = 'pig'

    def f(self):
        print nclass

foo = Foo()
foo.f()

Can you guess what will happen?

Traceback (most recent call last):
  File "./scopes.py", line 25, in <module>
    foo.f()
  File "./scopes.py", line 22, in f
    print nclass
NameError: global name 'nclass' is not defined

Wow! A surprise again?! It seems that a class scope does not extend into the methods of the class! Here is the relevant quote from the documentation:

The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods – this includes generator expressions since they are implemented using a function scope.

That means the following will also fail:

class A:
    a = 42
    b = list(a + i for i in range(10))
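
To check that claim without stopping the script, the sketch below wraps the class definition in try/except. It uses Python 3 syntax, and class B shows one possible workaround (an assumption of this sketch, not something taken from the documentation): pass the class-level value into a function explicitly.

```python
# Python 3 syntax; the names failed and B are hypothetical.
try:
    class A:
        a = 42
        # range(10) is evaluated in the class block, but 'a + i' runs in the
        # generator's own function scope, which cannot see the class name 'a'.
        b = list(a + i for i in range(10))
except NameError:
    failed = True
else:
    failed = False

print(failed)  # True


class B:
    a = 42
    # Workaround sketch: the argument 'a' is evaluated in the class block and
    # handed to the lambda, whose parameter is visible to the comprehension.
    b = (lambda a: [a + i for i in range(10)])(a)


print(B.b[0], B.b[-1])  # 42 51
```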

You might then ask: how do we access class variables from within methods? I personally prefer this way:

class Foo(object):
    nclass = 'pig'

    def f(self):
        print self.__class__.nclass

foo = Foo()
foo.f()

This style makes it clear that the code is accessing a class variable (shared by all instances of the same class), and it works even with inheritance.

Finally, I want to mention a few more quirks of Python scopes, as if things were not messy enough.
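
The inheritance remark can be checked with a small sketch. It uses Python 3 syntax and returns the value instead of printing it; the subclasses Bar and Baz are hypothetical additions for this example.

```python
# Python 3 syntax; Bar and Baz are hypothetical subclasses.
class Foo:
    nclass = 'pig'

    def f(self):
        # Look the name up on the instance's class, not in an enclosing scope.
        return self.__class__.nclass


class Bar(Foo):
    nclass = 'cow'         # overrides the class variable


class Baz(Foo):
    pass                   # inherits nclass unchanged


print(Foo().f(), Bar().f(), Baz().f())  # pig cow pig
```

Because the lookup starts from the class of the instance, a subclass that overrides nclass gets its own value, and one that does not falls back to the value defined on Foo.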

Python delimits blocks of code by indentation, which makes them look like distinct code blocks visually. This style, together with scoping knowledge from other programming languages, might give you the false idea that some of these blocks also create scopes. For example, you might expect a for loop, a with statement or a try ... except ... else ... finally statement to create its own scope, wouldn't you?

The aforementioned constructs do not create code blocks as recognized by Python (as defined at the beginning of this post); they are blocks only syntactically, to your eyes. That means, for example, that for loops leak their loop variables:

for i in xrange(10):
    print i,
print '\ni after loop:', i

Gives you:

0 1 2 3 4 5 6 7 8 9 
i after loop: 9

The loop variable (i) is accessible even AFTER ‘exiting’ the loop! This behavior may or may not be what you want. The same logic applies to with and try ... except ... finally statements; the names they bind within the ‘block’ remain usable after the statement ends (but still within the same lexical scope).
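
Here is a minimal sketch of the same effect for with and try statements. It uses Python 3 syntax and an in-memory io.StringIO object in place of a real file; all variable names are hypothetical.

```python
# Python 3 syntax; io.StringIO stands in for a real file.
import io

with io.StringIO('hello') as stream:
    text = stream.read()

# Both names bound inside the with statement are still visible here.
print(text, stream.closed)   # hello True

try:
    number = int('42')
finally:
    cleanup_ran = True

# Names bound in the try and finally suites outlive the statement too.
print(number, cleanup_ran)   # 42 True
```

One Python 3 difference is worth keeping in mind: the name bound by except ... as name is deleted when the except clause ends, so that particular name does not survive the statement there.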

Summary

Python’s scopes and namespaces really show you that it is a different language, and this is one of its known weaknesses: the behavior seems inconsistent across different cases. With these quirks in mind, the take-aways are the LEGB rule and the list of real code blocks that actually create scopes (whisper: module, function body, class definition, etc.; and not loops, with statements or try ... except statements).