Migrate to Python 3 with RHSCL

Although most enterprise Python applications still use Python 2 (e.g. Python 2.4 on RHEL 5 or Python 2.6 on RHEL 6), Python 3 has already become a mature variant and is worth considering. Why, you ask?

  • The Python 3 series is being actively developed upstream, while Python 2 now only gets security fixes and bug fixes. Python 2.7 is the last minor release of the 2.X series and there will be no Python 2.8. This is very important, since Python 3 will be getting new modules (check the new asyncio module coming in 3.4, for example) and optimizations, while Python 2 will just stay where it is and will be abandoned sooner or later.
  • Although the initial Python 3.0 release had worse performance than Python 2, upstream has kept improving it and Python 3.3 is comparable to Python 2.7 performance-wise.
  • Python 3 is already adopted by major libraries and frameworks: Django since version 1.5, SciPy since 0.9.0, mod_wsgi since 3.0, …

Migrating projects to Python 3 takes some time, but with RHSCL it’s as easy as it can get. Read on to get information about changes in the language itself and about the suggested approach to using RHSCL as a migration helper.

What Changed in Python 3.3 Compared to Python 2.7

This section summarizes the most significant changes between Python 2.7 and Python 3.3, both in language syntax and in the standard library. Discussing all the differences is beyond the scope of this article; for a complete listing, see the release notes of the Python 3.X series.

The most important changes are:

  • The print statement has been removed; print() is now available only as a function.
  • In the except part of try/except blocks, it is no longer possible to use the form with a comma (except exc, var); you must always use the form with as: except exc as var.
  • The long type has been “renamed” to int: int now behaves the same as long did, and there is no long type any more.
  • The dictionary methods iterkeys(), itervalues() and iteritems() have been removed in favour of keys(), values() and items(). In Python 3, these three methods don’t return lists, but “views”. If you need the result of any of them as a list, you need to explicitly wrap it with list().
  • Similarly, the built-in functions map(), filter() and zip() return iterators instead of lists, so if you need lists, you need to wrap them in list(), too.
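
The changes listed above can be sketched in a few lines of Python 3. This is only an illustration; the dictionary and its values are made up for the example:

```python
# Illustrates the language changes listed above; run with Python 3.
prices = {"apple": 3, "pear": 5}

# print is a function, so its arguments go in parentheses.
print("items:", len(prices))

# "except ... as ..." is the only accepted form.
try:
    prices["plum"]
except KeyError as err:
    missing = str(err)

# int has arbitrary precision; there is no separate long type.
big = 2 ** 100
big_is_int = isinstance(big, int)

# keys() returns a view, which has no sort() method;
# wrap it in list() when a real list is needed.
names = list(prices.keys())
names.sort()

# map() returns an iterator, so wrap it in list() as well.
doubled = list(map(lambda p: p * 2, sorted(prices.values())))
```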

Text handling and text types have undergone some major changes, too:

  • In Python 2, there was a basestring type, which was a superclass to str and unicode types.
  • In Python 3, there is a str class (a decoded unicode string; equivalent of unicode in Python 2) and bytes (an encoded string or binary data; equivalent of str in Python 2).
  • In Python 3, str and bytes don’t share enough functionality to have a common superclass. When converting between these two, you must always explicitly encode/decode.
  • The u"..." unicode literal was removed in Python 3.0 and reintroduced in Python 3.3. However, since string literals are unicode by default, using “u” has no effect; it was re-added only to ease writing code compatible with both Python 2 and Python 3. On the other hand, you must always use b"..." to create bytes literals.
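
Here is a small sketch of what the explicit encode/decode step looks like on Python 3.3 or later. The sample string is arbitrary and UTF-8 is just an assumed encoding:

```python
# Text versus binary data in Python 3 (sample values are illustrative).
text = u"caf\xe9"                 # the u prefix is accepted but has no effect
data = text.encode("utf-8")       # str -> bytes, always explicit
roundtrip = data.decode("utf-8")  # bytes -> str, always explicit

# Mixing the two types raises TypeError instead of converting silently.
try:
    text + b"!"
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

In Python 2, the equivalent concatenation of unicode and str would have triggered an implicit ASCII conversion, which is exactly the kind of hidden behaviour Python 3 removes.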

Finally, there were some changes and renames in the standard library, notably:

  • ConfigParser module was renamed to configparser.
  • Queue module was renamed to queue.
  • SocketServer module was renamed to socketserver.
  • urllib module was split into urllib.request, urllib.parse, and urllib.error.
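
If you don’t want any extra dependency, one common way to cope with these renames is to try the Python 3 name first and fall back to the Python 2 one. The following is only a sketch of that pattern; note that in Python 2 the URL parsing functions lived in a separate urlparse module, and the URL used here is just an example:

```python
# Try the Python 3 names first and fall back to the Python 2 names.
try:
    import configparser
    import queue
    import socketserver
    from urllib.parse import urlparse
except ImportError:  # Python 2
    import ConfigParser as configparser
    import Queue as queue
    import SocketServer as socketserver
    from urlparse import urlparse

# The rest of the code can use one set of names on both versions.
host = urlparse("http://www.python.org/about/").netloc
```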

How to Port Your Code

There are literally dozens of great articles about porting code to Python 3 or making it run on both major versions of Python (which is perfectly possible). Personally, I find the upstream “pyporting” document to be a great starting point. It not only thoroughly explains various differences and suggests the best way to deal with them, it also has a nice list of other community resources.

My personal recommendations for porting to Python 3 are:

  • If you want your code to work both on Python 2 and Python 3, use the six library. As its documentation states, six provides you with a thin compatibility layer for writing code compatible with both Python 2 and Python 3 – it can help you handle string types, import renamed modules easily and work around changed syntax in a simple and uniform way.
  • If you just want to port your application to Python 3 and don’t want to maintain backwards compatibility for Python 2, I’d suggest using the 2to3 script. It is shipped as part of both Python 2 and Python 3 distributions and can do most of the automated conversions for you.
  • Have plenty of unit tests and, every time you make a change to the source code, run the tests with all Python versions you want to support.
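
To illustrate the last recommendation, here is a minimal sketch of a test module that uses only the standard unittest module and runs unchanged on Python 2.7 and Python 3. The error_message() helper is hypothetical, invented for this example:

```python
import unittest


def error_message(exc):
    # Hypothetical helper: formats an exception the same way on 2 and 3.
    return "Error! %s" % exc


class ErrorMessageTest(unittest.TestCase):
    def test_formats_exception(self):
        self.assertEqual(error_message(IOError("no route")),
                         "Error! no route")


# Build and run the suite explicitly so the result can be inspected.
suite = unittest.TestLoader().loadTestsFromTestCase(ErrorMessageTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the same file with each interpreter you want to support quickly shows whether a change broke one of them.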

Example

Let’s look at a simple code example and how you would go about porting it. First, let’s see the old version that only runs on Python 2:

# This script fetches webpage given as a first argument
# on command line and prints it on stdout.
import urllib
import sys

try:
    h = urllib.urlopen(sys.argv[1])
except IOError, e:
    print "Error! %s" % e
    sys.exit(1)
print h.read()

This short code example has quite a few deficiencies from a Python 3 perspective: urllib has been split into multiple modules, so the urllib.urlopen call won’t work; print can’t be used as a statement any more; and the comma form of the except clause doesn’t exist in Python 3. So how do you do the porting?

Make Porting Easier with Red Hat Software Collections

From the perspective of porting to Python 3, Red Hat Software Collections bring great value, since they include both Python 2.7 and Python 3.3 with exactly the same set of extension packages. This means that you can test your script/application by running it the same way in two different environments like this:

scl enable python27 "python script.py http://www.python.org"
scl enable python33 "python script.py http://www.python.org"

… and just see what happens (the Python 3 run fails, of course). As I’ve mentioned, it is ideal to have plenty of unit tests and run them instead of running the script/application itself – but we don’t have that luxury in our example, so we’ll just try to port the application, re-run it and see what happens:

import sys

import six
from six.moves.urllib import request # use six to get the right import

try:
    h = request.urlopen(sys.argv[1])
except IOError as e: # use correct except syntax
    print("Error! %s" % e) # use print function
    sys.exit(1)
# decode the string to get str on Python 3
# (we should have decoded even on Python 2 to get unicode!)
print(h.read().decode('utf-8'))

Before running the script again, you’ll have to install the “six” module into both collections (I prefer a “user” installation so that user-installed modules don’t mix with the system ones):

scl enable python27 "easy_install --user six"
scl enable python33 "easy_install --user six"

Now you can run the script under both interpreter versions and see that it works exactly the same.

So here you go: migrating your Python application to Python 3 is as easy as it can get.
