April 28, 2006

Tutorials from PyCon

When I finally made time to visit comp.lang.python for the first time in two or three weeks I found a post that said

I would like to know if anybody can point me to the site, where it is possible to find the tutorial "Using Databases in Python" which is mentioned by Steve Holden here: http://tinyurl.com/ectj8
So of course I had to make sure that the material was available on the web. As a result you can now download my tutorials in either PDF or Open Office Impress format (yes, for once I eschewed using the obvious Microsoft products, and found that the Open Office component was a more-than-acceptable clone).

Using Databases in Python: Impress PDF

An Introduction to wxPython: Impress PDF

These materials are available under a Creative Commons license. Thank Guido van Rossum for inspiring me with the liberal terms of the Python license.

April 24, 2006

Squidoo: Just Another Web Phenomenon?

Seth Godin, a marketing whizz if ever there was one, recently started promoting a new web service called Squidoo that's aimed at making web authorship easier. The idea is to build "lenses", which are promoted as offering ways to look at the world. I just noticed there are graphics you can use to link to your lens.
So take a look at my lens, Python is Amazing. And if you find something amazing I haven't included, please feel free to let me know!

April 20, 2006

Shameless Self-Promotion

An Amazon review of Python Web Programming reminds me that the book wasn't written just to explain Python ... thanks, Sheila!

20 of 21 people found the following review helpful:

Excellent example snippets; Clear explanations, February 24, 2002
Reviewer: Sheila King "desk-worker mom who needs to exercise" (L.A. County, California)
If you are going to be using Internet protocols, doing network programming, or web programming with Python, and these are new topics for you, I would highly recommend this book.

The book starts with a brief overview of the Python language. The author's intention is that someone with a fairly extensive programming background in other languages would be able to pick up enough Python from this overview to be able to do the rest of the programming in the book. Perhaps so. I already know Python, but did find the summary in the front informative.

I really like the fact that nearly every page has a code snippet on it. Examples are brief and to the point. The author explains each line of code and has a very direct and clear way of explaining things. I found the explanations easy to read and understand.

After the brief Python Language overview, comes an overview of sockets and socket programming. I've been trying to learn a bit about the whole topic of sockets by searching the web and nothing I found on the web explained it as clearly as this book. I now appreciate the difference between TCP and UDP protocols and have an idea of the situations in which I would want to use each. If you want to learn low-level sockets, or how to write your own socket protocols, this is not the book you are looking for. This book basically assumes you will go with either TCP or UDP (and ignores the other types of sockets available in the Python socket library). However, these will probably suit most people's needs.

The author then walks you through each of the Internet data-handling libraries in Python, such as the telnetlib, ftplib, poplib, smtplib and so on. He gives examples of working code for each library, showing first how to implement clients, and later on how to implement servers. If you want to work with these libraries, these explanations should be very helpful.

Later in the book, Holden addresses using databases in Internet programming, using XML and writing your own web-application framework. I haven't yet had a chance to go through these chapters in detail (I've skimmed them only). But there is a LOT of stuff there. One thing the author does at the beginning of each new section, is give an overview of the topic (such as an overview of why you might want to use a database, how databases work, or why you might want to work with a web framework). For me, I really appreciate this type of overview. It helps give me a context for the new information, and helps me to make better sense of it. I read through some of the database chapters where he explains how the SQL query language works, and again, I have to say it is one of the best explanations I've read. (Most explanations I've read about SQL have just convinced me I wanted to steer clear of it.)

Another nice thing, is how he sort of "works you up to" SQL. He starts out with regular Python code, and shows how parts of it are similar to working with an SQL database, and then eventually transitions into the full SQL language. He also addresses database design and efficiency.

Overall, I'd say if you want a good overview of the topics mentioned here, want to understand the reasoning behind their use, and want to be able to understand good design and efficiency, then this book should really help you out.

April 16, 2006

End of Internet Prematurely Flagged

In "Tech Blogosphere has peaked" [the blogosphere deserves a capital?] Phil Sim suggests (I paraphrase) that
"anybody who's going to have a blog has one by now, after two years as a journalist you get stale, lots of bloggers are going back to real life"
and graphs the "daily reach" of memeorandum.com, thereby confirming that for him it's all about the eyeballs. When I look back at my own blogging history I see that there are frequently months when I have written nothing at all in my blog, and hey, here I am still blogging.

If blogging is just "look at me, Ma!" then the sooner it stops the better. Blogging works best not when it's a publicity channel but when it's a reasonably consistent, selective window onto the world of the blogger. Like journalism, much blog content is ephemeral, and the blogging world needs to remember that. If it's not on page one of my del.icio.us then it's history. If you want the history it's sometimes there, but remember it could be a revisionist history as blog posts can be changed at any time.

April 15, 2006

Test-Driven Development

Some code, for a change. I recently taught an introductory Python class to some fairly experienced programmers, and we had an hour or so left at the end of the class to try a problem. We'd been discussing test-driven development, so we arrived at the idea of creating a problem that was fairly simple in scope and then writing tests and a solution to the problem.

The idea was to write the tests first, though it will be no surprise to those who've done this before that the nature of the tests changed as errors came to light during development. The problem was as follows:
Given a directory structure of arbitrary shape, locate all JPEG images and copy them into a named destination directory. [Not specified but implied: the files should continue to exist in their original positions]
It turned out that the level was fairly well chosen. None of the students managed to complete the task, but they all had a fairly clear sense of where they were going by the end of the exercise, giving them something to work on independently after I'd gone. Of course I had to provide them with a "model solution", which I'm happy to say I just managed to create in the time allotted.

Here is the test harness for the jpegcopy requirement:
import jpegcopy
import unittest
import os

# Two spellings of the base directory; the second assignment is the one used.
BASEDIR = '/c/Steve/Projects/BrightonHove'
BASEDIR = 'c:/Steve/Projects/BrightonHove'
INDIR = os.path.join(BASEDIR, "input")
OUTDIR1 = os.path.join(BASEDIR, "output1")
OUTDIR2 = os.path.join(BASEDIR, "output2")
EXPECTED = ['%s.jpg' % s for s in "f1 f2 f3 f4 f5 f6".split()]

class TestJpegCopy(unittest.TestCase):

    def setUp(self):
        """Ensure both output directories are empty."""
        for d in OUTDIR1, OUTDIR2:
            fl = os.listdir(d)
            if fl:
                try:
                    for f in fl:
                        os.unlink(os.path.join(d, f))
                except OSError:
                    raise ValueError("Cannot empty directory %s" % d)

    def testEmpty(self):
        # An empty directory contains no JPEGs, so nothing is copied.
        n0 = jpegcopy.main(OUTDIR1, OUTDIR1)
        self.assertEqual(n0, 0)

    def testDir1(self):
        n1 = jpegcopy.main(INDIR, OUTDIR1)
        self.assertEqual(n1, 6)
        # os.listdir() does not guarantee an order, so sort before comparing.
        self.assertEqual(sorted(os.listdir(OUTDIR1)), EXPECTED)

    def testDir2(self):
        n2 = jpegcopy.main(INDIR, OUTDIR2)
        self.assertEqual(n2, 6)
        self.assertEqual(sorted(os.listdir(OUTDIR2)), EXPECTED)

    def tearDown(self):
        for d in OUTDIR1, OUTDIR2:
            for f in os.listdir(d):
                os.unlink(os.path.join(d, f))


if __name__ == "__main__":
    unittest.main()
Nothing too fancy here. The tests are parameterised by the directory names defined at the top. We set each test up by clearing both the output directories. We first test that an empty directory yields no copied files. Then we test that we can put the JPEGs into two different directories, verifying each time that six files are copied and that each output directory ends up with the same expected list of files. We tear down each test by deleting the contents of both directories.

This will probably show my ignorance, highlighting the fact that test-driven methods don't yet come naturally to me. I'll be happy to integrate suggestions for improving test coverage. My solution follows.
"""Copy jpegs from a recursive to a flat directory structure."""

import os
import shutil

def main(indir, outdir, debug=0):
    count = 0
    for f in os.listdir(indir):
        if os.path.isdir(os.path.join(indir, f)):
            # Recurse into subdirectories, copying into the same flat outdir.
            count += main(os.path.join(indir, f), outdir, debug=debug)
        elif f.endswith(".jpg"):
            count += 1
            shutil.copyfile(os.path.join(indir, f),
                            os.path.join(outdir, f))
    if debug:
        print("Returning %d for %s" % (count, indir))
    return count

if __name__ == "__main__":
    main("input", "output1", debug=1)
As you can see I have put a simple test inline; this script is not intended to be run as a main program, but the debug output was useful sometimes when tests failed for obscure reasons.
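For comparison, here is a rough sketch of the same idea using os.walk instead of explicit recursion. It is not the solution from the class: the name copy_jpegs and the JPEG_SUFFIXES tuple are made up for illustration, it is written for Python 3, and it assumes (unlike the version above) that ".JPG" and ".jpeg" should also count as JPEGs. Like the original, it silently overwrites when two subdirectories hold files with the same name, and it assumes the destination directory is not inside the input tree.

```python
"""Sketch: flatten JPEGs with os.walk instead of explicit recursion."""
import os
import shutil

# Assumption: these suffixes count as JPEG; the original checks only ".jpg".
JPEG_SUFFIXES = (".jpg", ".jpeg")


def copy_jpegs(indir, outdir):
    """Copy every JPEG found under indir into outdir; return the count."""
    count = 0
    # os.walk yields each directory in the tree along with its file names.
    for dirpath, dirnames, filenames in os.walk(indir):
        for name in filenames:
            # Compare case-insensitively so "photo.JPG" is picked up too.
            if name.lower().endswith(JPEG_SUFFIXES):
                shutil.copyfile(os.path.join(dirpath, name),
                                os.path.join(outdir, name))
                count += 1
    return count
```

Calling copy_jpegs("input", "output1") would play the same role as main("input", "output1") above, returning the number of files copied.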

Again, if readers can suggest improvements I'll incorporate them as I have time. You should be able to copy the code from your browser window and paste it into an editor. Thanks to the MoinMoin developers for colorize.py.
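One possible improvement to test coverage, offered as a sketch rather than a finished answer: let the tests build their own fixture in a temporary directory so they do not depend on a hard-coded BASEDIR, and check the implied requirement that the originals stay where they were. The code below is Python 3. The class name TestJpegCopyTemp and the fixture file names are invented for illustration, and the main function is a stand-in carrying the same logic as jpegcopy.main (minus the debug flag) so that the example runs on its own.

```python
"""Sketch: jpegcopy tests that build their own fixture in a temp directory."""
import os
import shutil
import tempfile
import unittest


def main(indir, outdir):
    """Stand-in for jpegcopy.main: same logic as the post, without debug."""
    count = 0
    for f in os.listdir(indir):
        path = os.path.join(indir, f)
        if os.path.isdir(path):
            count += main(path, outdir)
        elif f.endswith(".jpg"):
            count += 1
            shutil.copyfile(path, os.path.join(outdir, f))
    return count


class TestJpegCopyTemp(unittest.TestCase):

    def setUp(self):
        # Hypothetical fixture: two JPEGs at different depths, one non-JPEG.
        self.root = tempfile.mkdtemp()
        self.indir = os.path.join(self.root, "input")
        self.outdir = os.path.join(self.root, "output")
        os.makedirs(os.path.join(self.indir, "sub"))
        os.makedirs(self.outdir)
        self.sources = [os.path.join(self.indir, "f1.jpg"),
                        os.path.join(self.indir, "sub", "f2.jpg")]
        for path in self.sources + [os.path.join(self.indir, "notes.txt")]:
            with open(path, "w") as fh:
                fh.write("x")

    def tearDown(self):
        # Remove the whole temporary tree, fixture and output alike.
        shutil.rmtree(self.root)

    def test_copies_only_jpegs(self):
        self.assertEqual(main(self.indir, self.outdir), 2)
        # Sort because os.listdir() makes no promise about ordering.
        self.assertEqual(sorted(os.listdir(self.outdir)),
                         ["f1.jpg", "f2.jpg"])

    def test_originals_remain(self):
        main(self.indir, self.outdir)
        for path in self.sources:
            self.assertTrue(os.path.exists(path))
```

Saved to a file, this can be run with python -m unittest; to test the real module, drop the stand-in and import jpegcopy instead.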

April 6, 2006

Do We Need Speed?

Well, finally the news is out. The reason things have been so quiet on this blog (and why I've been missing from comp.lang.python pretty much since PyCon) is that all available spare time has been going into trying to get the Need for Speed sprint off the ground. To my knowledge this event is unique, though an assertion like that invites correction from the better-informed.

Commercial organisations are free to use the output from open source projects without putting anything back (vide Industrial Light and Miserliness), and by and large this is expected -- open source licensing terms are fairly explicit, and few are even as "draconian" as the GPL (which currently says in essence that if you distribute GPL-derived products you have to make the source available). By the same token, of course, there's nothing to stop people from supporting open source projects if they want to.

Hopefully this little gap in the curtain will help to trigger a realisation among commercial software developers that they can make a huge difference to open source projects from which they might benefit (or from which they already have benefited) by the application of what are, in strictly commercial terms, relatively modest funds. The various Foundations controlling some of the better-known open source projects do make their own efforts, but without significant external funding (such as that achieved by the PyPy project) and management expertise (extremely variable) it can be difficult to make progress. Heck, even management training might be a useful contribution from the world of pointy hair (I'm sorry, I'll wash my mouth out with soap later).

Of course we'll have to wait and see. It could be that I am completely wrong, that hardly any of the invited developers will bother to come, and that the sprint will really show that open source people don't want to play in the commercial playground at all. In which case I guess I'll have egg on my face.

If I'm right, though, it will show that there are benefits to be had on both sides of the equation and that, although open source developers don't do it primarily for the money, they don't necessarily object to working with commercial developers when there is a sufficient alignment of goals. I feel incredibly fortunate to have been given the chance to put this belief to the test, and I can't wait to see how this effort goes.