
Custom Error Pages with TurboGears 2

Posted by Tim Freund Tue, 21 Oct 2008 03:35:00 GMT

Your TurboGears 2 application is not perfect. Just for a second, let's pretend like it is perfect. Even with all of its perfection, your application will need to deal with bad incoming links and malformed data. Is it ready? It is only a matter of time before your users receive a 404 or 500 response from your most wonderful application. Whether the source of trouble is a bug in the code or a bad incoming link, why leave users lost in the dark? Custom error pages can put them back on track when something goes wrong, and they are very easy to implement. We will create one in the following few paragraphs.

We will use Turtle Goals to demonstrate the techniques in this tutorial. It is open source and fairly simple. Please feel free to follow along in the Turtle Goals source, or work along in your own TurboGears 2 project.

Look in your controllers directory. See that file named error.py? That's the key to a custom error page. The Routes package does a bit of work behind the scenes to send any request that generates an error through the ErrorController, so by customizing the ErrorController, we can customize the resulting 404 and 500 pages. The default document method produces a standard Pylons error page. It looks nice, but it probably doesn't look right compared to the rest of your project.

TurboGears projects default to the Genshi template engine, and that is the engine used by Turtle Goals. Let's create a new template, error.html, and save it in the templates directory.

So now all we need to do is add the expose decorator to the document method, and return a dictionary of appropriate values. Done, with enough time to check Reddit before your next meeting, right? Well, almost. Look closely at the ErrorController definition, and you will see that it is not a standard TurboGears controller. It extends a Pylons controller class, WSGIController, and that causes it to behave differently from our other controllers. At least it should extend WSGIController according to a post on the mailing list. Apparently there is a bug in the quickstart template, and you may need to change the ErrorController definition to extend WSGIController. I was easy to convince: as soon as I made the suggested change, my error page started working. Back to the point: the expose decorator will do you no good inside of the WSGIController. It is up to us to render the template to a string and return that string. Fortunately TurboGears provides a method to do just that:

from tg import render
rendered_template = render.render_genshi("error.html", {})
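The empty dictionary in that call is where the template's values would go. As a rough sketch, the helper below builds such a dictionary from a status code. The function name error_context, its keys, and the wording of the messages are all made up for illustration; none of them come from TurboGears or Turtle Goals, and the code is written for a current Python 3.

```python
from http import HTTPStatus


def error_context(status_code, prefix=""):
    """Build a dictionary of values for a hypothetical error.html template."""
    try:
        reason = HTTPStatus(status_code).phrase
    except ValueError:
        # Not a status code the standard library knows about.
        reason = "Unknown Error"

    if status_code == 404:
        message = "We could not find the page you asked for."
    elif status_code >= 500:
        message = "Something went wrong on our end."
    else:
        message = "Your request could not be completed."

    return {
        "code": status_code,
        "reason": reason,
        "message": message,
        "home_url": prefix + "/",  # somewhere to send the lost user
    }
```

The result would take the place of the {} passed to render.render_genshi, so the template can show a friendly message and a link back to the front page.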

Of course there are also methods like render.render_mako and render.render_jinja if you prefer those other template engines. Here's the full listing of our modified error.py:

There's one other small matter to deal with: configuration. There are two relevant configuration values in your application: debug and full_stack. To use your custom 404 error but still get the interactive debug page for 500 errors, try the following:

debug = true
full_stack = true

The interactive debug page is inappropriate for production environments. When your application is deployed into production, use these settings instead:

debug = false
full_stack = true
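The combinations this section covers fit in a small lookup table. The sketch below only summarizes the post's own description; the names are hypothetical, and the one combination the post never describes is reported as unknown rather than guessed.

```python
# Keys are (debug, full_stack); values paraphrase the behaviour described in this post.
ERROR_PAGE_BEHAVIOUR = {
    (True, True): "custom 404 page, interactive debug page for 500 errors",
    (False, True): "custom error page, no interactive debug page (production)",
    (False, False): "generic error pages",
}


def describe_error_pages(debug, full_stack):
    """Return the post's description for a pair of configuration values."""
    key = (bool(debug), bool(full_stack))
    return ERROR_PAGE_BEHAVIOUR.get(key, "not described in the post")
```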

If you have both set to false, you will get generic error pages that end users will run from, screaming. All done, for real this time. And you still have time to check Reddit, but you may want to check out these links instead:

What is your strategy for designing and implementing custom 404 and 500 error pages?


Cover Your Nose When You Test

Posted by Tim Freund Wed, 24 Sep 2008 06:06:00 GMT

My mom always reminded me to cover my nose when I sneezed. Now I take it a step further, and I cover my nose when I test.

Nose and the nosetests command are used to run unit tests for Pylons and TurboGears 2 applications, as well as a multitude of other Python applications and libraries. Although nose is great at running tests and reporting back issues, it doesn't natively show developers what isn't being tested. For that, we need a code coverage tool.

Just because nose doesn't handle code coverage reports natively doesn't mean this will be a difficult task. Ned Batchelder's coverage package provides exactly those reports, and nose ships with a plugin to enable it. To install coverage and invoke the reports, you could do something like this:

$ easy_install coverage
$ nosetests --with-coverage  

Name                                         Stmts   Exec  Cover   Missing
--------------------------------------------------------------------------
_strptime                                      228    149    65%   23, 80, 84-89, 155, 169-170, 175, 189, 237, 280-294, 303-304, 306, 314-323, 329, 332, 353-360, 366, 368, 374-388, 393-427, 431-432, 443-446
encodings.ascii                                 19      0     0%   9-42
ez_setup                                       103     11    10%   53-62, 80-104, 117-151, 156-190, 197-222, 226-229
fixture                                         10      9    90%   38
fixture.base                                   124     33    26%   10-19, 25-28, 48-49, 56, 60, 64, 98, 103-104, 122-217
fixture.dataset                                301    225    74%   41, 51-52, 55-59, 65, 76-77, 80, 98-110, 141, 144, 225-229, 232-239, 243-248, 277, 285-289, 323-325, 447-450, 456-457, 461, 468, 479, 483, 529-530, 560-564, 571-574, 580, 631, 720-722, 725-737, 740-744, 752-753
... lots of lines cut to keep you from going blind ...
zope.interface.ro                               22     22   100%
zope.interface.verify                           45     29    64%   46, 51, 56-61, 66, 70, 75-79, 88, 93, 104, 107, 109, 111
--------------------------------------------------------------------------
TOTAL                                        17257   7646    44%
----------------------------------------------------------------------

Good Gravy, man! That's a big coverage report!?

As cool as it is to know that we can get coverage reports for the entire Pylons/TG2 stack, how about we focus for just a minute on only our project. That's the problem with kids today, no focus.

$ nosetests --with-coverage --cover-package=YOUR_PACKAGE_NAME

OK, that's better. Now the report only shows the coverage for code inside of our package. But, seriously, do we really need to type those arguments every time?

Of course not. Sitting there in the root of your Pylons or TurboGears 2 application is a file named setup.cfg. If I were a betting man, I'd say you've never opened it. Ever. Let's knock the dust off and take a look at it in any decent text editor. We're looking for a section named [nosetests], and since you've never changed the file, it is probably at line 8 and looks just like this:

[nosetests]
with-pylons=test.ini

We can add any additional options for nose in this section. Now is your chance to spring into action. Add the following lines to the [nosetests] section:

with-coverage=true
cover-package=YOUR_PACKAGE_NAME
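If you would rather script the change than edit the file by hand, the standard library can do it. This is a minimal sketch, assuming Python 3's configparser; the function name enable_coverage is hypothetical, and note that configparser discards comments and normalizes spacing when it writes the file back out.

```python
import configparser
import io


def enable_coverage(setup_cfg_text, package_name):
    """Return setup.cfg text with coverage options added to [nosetests]."""
    parser = configparser.ConfigParser()
    parser.read_string(setup_cfg_text)

    if not parser.has_section("nosetests"):
        parser.add_section("nosetests")

    # Same two options shown above, without the leading "--".
    parser.set("nosetests", "with-coverage", "true")
    parser.set("nosetests", "cover-package", package_name)

    output = io.StringIO()
    parser.write(output)
    return output.getvalue()


original = "[nosetests]\nwith-pylons=test.ini\n"
updated = enable_coverage(original, "YOUR_PACKAGE_NAME")
```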

Save setup.cfg and run your tests. You will see a code coverage report at the end of the test run, and the last two columns will be the most interesting. They show the percentage of the code that was covered and the lines of code that were not covered, respectively. Depending on the outcome of your coverage report, you may be feeling rather smug right now. Stop it, we're not Rails developers, and there is still work to do.
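The Cover column can be reproduced from the Stmts and Exec columns. Judging only from the figures in the report earlier in this post, the percentage appears to be truncated rather than rounded (33 of 124 statements shows as 26%, not 27%). That is an inference from the sample output, not something the coverage documentation is quoted on here, and the helper name is hypothetical.

```python
def cover_percent(statements, executed):
    """Percentage of statements executed, truncated to a whole number."""
    if statements == 0:
        # Nothing to execute; treat an empty module as fully covered.
        return 100
    return executed * 100 // statements


# (Stmts, Exec) pairs copied from the sample report above.
sample_rows = {
    "_strptime": (228, 149),
    "fixture.base": (124, 33),
    "TOTAL": (17257, 7646),
}
percentages = {name: cover_percent(*row) for name, row in sample_rows.items()}
```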

On the other hand, if your code coverage report leaves you feeling a little ashamed of just how much code is uncovered, don't despair. Just knowing what code coverage is and caring about the results of a coverage report already puts you into a minority of all programmers. Pick a block of untested code and write a test. All of a sudden, your numbers are better and you'll start to leave those fly-by-the-seat-of-their-pants programmers in the dust.

Remember that this setup.cfg trick will work with any Python application or library that is testable with nose. We happen to be focusing on Pylons and TurboGears because that's what I've been busy using lately. How do you use nose to help you write better code?


Python and Jabber: Head to Head Library Review

Posted by Tim Freund Tue, 12 Aug 2008 03:42:00 GMT

Python is known as a batteries included language, but sometimes it can feel like a home improvement project gone bad. You know, the kind of project where there seem to be too many parts for the work that remains? I had that feeling as I tried to figure out which Jabber/XMPP library would suit my application best. Like any programmer worth their salt, my first tool for research is a visit to Google. Two searches, python jabber and python xmpp, seemed to turn up three viable projects.

There’s also a reference to Twisted’s support for XMPP in the first page of search results, but we’ll save Twisted for another day. I haven’t grasped Twisted enough to feel comfortable writing about it, and the application that I am working on doesn’t use Twisted in any way. Keep an eye on these guys for more information regarding Twisted’s XMPP support.

All three of the options seem like reasonable choices, so we will review each of the packages for the following criteria:

  • Installation
  • Ease of Use
  • Support Options

Installation

The trial begins with an empty virtualenv environment. We have a blank slate into which we can install the projects. We will first attempt easy_install $PROJECT_NAME, and then easy_install $DOWNLOAD_URL. Any library that won’t easily install with easy_install is going to be a source of grief for any team or project that depends on it.

jabber.py

(xmpp)tim@prime:~/src/xmpp$ easy_install jabber.py
...
error: Could not find suitable distribution for Requirement.parse('jabber.py')
(xmpp)tim@prime:~/src/xmpp$ easy_install "http://downloads.sourceforge.net/jabberpy/jabberpy-0.5-0.tar.gz?modtime=1075826815&big_mirror=0" 
...
Finished processing dependencies for jabber.py==0.3-1

It appears that jabber.py isn’t in the PyPI index at all, and when we attempt to download and install jabber.py version 0.5-0, we are told that we have just installed version 0.3-1. I downloaded jabberpy-0.5-0.tar.gz and confirmed that setup.py reports the distribution as version 0.3-1. One other slightly confusing issue: the README file reports jabber.py as a GPL project, but the home page and setup.py files report it as LGPL. The xmpppy project also refers to jabber.py as LGPL. Best 3 out of 4 licensing?

The good news: jabber.py installation is painless on Windows, OS X, and Linux, as long as a download URL is provided.

PyXMPP

(xmpp)tim@prime:~/src/xmpp$ easy_install PyXMPP
...
Reading http://pypi.python.org/simple/
Reading http://pypi.python.org/simple/pyxmpp/
Reading http://pyxmpp.jabberstudio.org/
No local packages or download links found for PyXMPP
error: Could not find suitable distribution for Requirement.parse('PyXMPP')
(xmpp)tim@prime:~/src/xmpp$ easy_install http://pyxmpp.jajcus.net/downloads/pyxmpp-1.0.0.tar.gz
...
Finished processing dependencies for pyxmpp==1.0.0

PyXMPP is in the PyPI index, but the download URL listed there is out of date, and easy_install does not discover the current one from the page at http://pyxmpp.jajcus.net. The package installs without issues once the download URL is passed to easy_install, assuming you are not using Windows. Windows users will likely see something like “Python was built with Visual Studio version 7.1, and extensions need to be built with the same version of the compiler, but it isn’t installed.” And with that, we’ve exceeded the scope of this tutorial.

Before we go further, are you running OS X? I know you OS X users were laughing at the Windows guys in the last paragraph because their inferior system left them hanging. Sure, PyXMPP installed just fine on your MacBook Pro, but open an iTerm session right now and try to run a PyXMPP application. You will probably see ImportError: No module named libxml2, and if you don’t, you’ve run into this problem before and fixed it yourself. Let this be a lesson: karma always finds a way. Google may hold the answer to this problem; however, I would guess that Cuil does not.

So now we have, what, maybe a handful of Linux users still chomping at the bit to try PyXMPP? Let’s get going. One more package to install, and we’re off to the races.

(xmpp)tim@prime:~/src/xmpp$ easy_install dnspython
...
Finished processing dependencies for dnspython

xmpppy

Hey, you Windows and OS X users, you can come back now. This next one is a piece of cake for everybody:

(xmpp)tim@prime:~/src/xmpp$ easy_install xmpppy
...
Installed /home/tim/pyenvs/xmpp/lib/python2.5/site-packages/xmpppy-0.4.1-py2.5.egg
Processing dependencies for xmpppy
Finished processing dependencies for xmpppy

The latest version of the xmpppy project installs automatically with easy_install.

Ease of Use

We are about to write some code, and this is where any head to head competition can turn subjective. For the sake of this comparison, I have written a small base class that does the following:

  1. Create a chat bot
  2. Send an arbitrary message to an arbitrary user on a supplied list every 10 seconds
  3. Accept messages from users: print the message to the console, and thank the user.

Although the messages sent in this demonstration are fairly silly, the concept can easily be applied to real world applications. For instance, bug trackers could send notifications to developers, and developers could request status changes with the help of a chat bot.
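To make the three requirements concrete, here is a library-neutral skeleton. It is an illustration only and not the base class used for this review; every class and method name in it is hypothetical, and the actual XMPP calls are left to a subclass.

```python
import random
import threading


class ChatBotBase(object):
    """Library-neutral bot skeleton; a subclass supplies the real XMPP calls."""

    def __init__(self, recipients, messages, interval=10.0):
        self.recipients = list(recipients)
        self.messages = list(messages)
        self.interval = interval
        self._stopped = threading.Event()

    def send_message(self, recipient, body):
        """Deliver one message; each library-specific subclass overrides this."""
        raise NotImplementedError

    def send_random_message(self):
        """Requirement 2: an arbitrary message to an arbitrary listed user."""
        recipient = random.choice(self.recipients)
        body = random.choice(self.messages)
        self.send_message(recipient, body)
        return recipient, body

    def handle_message(self, sender, body):
        """Requirement 3: print the incoming message and thank the sender."""
        print("%s says: %s" % (sender, body))
        self.send_message(sender, "Thanks for the message!")

    def run(self):
        """Send on a timer until stop() is called from another thread."""
        # Event.wait returns False on timeout and True once stop() sets the flag.
        while not self._stopped.wait(self.interval):
            self.send_random_message()

    def stop(self):
        self._stopped.set()


class RecordingBot(ChatBotBase):
    """Stand-in subclass that records messages instead of sending them."""

    def __init__(self, *args, **kwargs):
        ChatBotBase.__init__(self, *args, **kwargs)
        self.sent = []

    def send_message(self, recipient, body):
        self.sent.append((recipient, body))
```

Keeping send_message as the only library-specific hook means the timer and the thank-you logic can be exercised without a Jabber server, as RecordingBot does.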

We will create an implementation for each of the libraries under review. Before we begin, you may want to review the base class below. You can also check out the whole package from subversion.

jabber.py

Let’s dive in with jabber.py because it comes first alphabetically. That’s the kind of rigorous science we’re doing here, we alphabetize.

The JabberPyBot subclass proceeds in a pretty straightforward way that matches my experiences with writing Jabber client software in Java. We create a Client, connect, authenticate, and prepare to send and receive messages in the __init__ method.

I was able to send messages in very little time, but receiving messages was a little tricky. We registered a message handler when we created our Client object, and I thought that registering the handler would cause a glitter-laden jabber fairy to fly in on gossamer wings and sprinkle a little “we’ll handle these incoming messages automagically” dust over the code.

I ran the code a second time. Surely those dropped incoming messages were a fluke, but running unchanged code a second time produced exactly the same results. I was shocked. Shocked! Truthfully, I was just delaying the inevitable RTM moment. After reading through the example a second time, I saw the process method. That’s the ticket, when called it processes queued messages and fires off the appropriate handlers. We wire the process method into a receiver method that will run in its own thread, and we’re in business.
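The receiver idea can be sketched without a live server. The snippet below assumes only that the client object exposes a process(timeout) method, as described above; the helper name start_receiver and the StubClient class are hypothetical stand-ins, not part of jabber.py.

```python
import threading


def start_receiver(client, stop_event, timeout=1):
    """Call client.process() in a background thread until stop_event is set."""

    def receive():
        while not stop_event.is_set():
            # process() is expected to handle queued messages and fire handlers.
            client.process(timeout)

    thread = threading.Thread(target=receive)
    thread.daemon = True  # do not keep the interpreter alive on exit
    thread.start()
    return thread


class StubClient(object):
    """Stand-in for a connected client; counts calls instead of reading a socket."""

    def __init__(self, stop_event, limit=3):
        self.stop_event = stop_event
        self.limit = limit
        self.calls = 0

    def process(self, timeout):
        self.calls += 1
        if self.calls >= self.limit:
            self.stop_event.set()
```

With a real client, the main thread would keep sending messages and set the event when it is time to shut down.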

PyXMPP

The PyXMPP client was the last of the three clients written. So much for rigorous, alphabetized, science, eh? This code was a nice change of pace since the jabber.py and xmpppy clients were so similar, as you will soon find out.

The code for the PyXMPP implementation is just as short as our other two implementations, but something feels a little funny about it. Notice how we need to keep digging into our connection (JabberClient) object and grabbing its stream object to get stuff done? I don’t know about you, but that feels a little dirty. Matter of fact, there’s even a name for the “rule” that we are breaking. Take note, and during your next code review at the office when you see something similar you can say “I don’t know about that, Barry, the AbstractDatabaseManagerManagerFactory doesn’t adhere to the Law of Demeter very well, and I think it will be a maintenance nightmare down the road. Are you going to support this stuff when the API changes in the future?” But who are we kidding? Barry is such an old crank that it’s easier to fix his stuff after it’s been deployed than argue with him about his crazy design right now at 10:55. It is so close to lunch, and we can’t be late for that. Man, I’m hungry. A gyro sounds really good.

What were we talking about? Oh, right, our PyxmppBot and the code that talks to more than just its immediate friends. I feel like we’re giving the library a quick brush off just because it doesn’t adhere to the world view that we’ve adopted in the other two client examples. Let’s take another look at the PyXMPP example. We have formed the opinion that our JabberBot “has-a” connection/client object and interacts with it, but the PyXMPP programmers seem to think that a bot “is-a” client/connection object. They have been doing this stuff longer than we have, so let’s give it one more try.

That feels a little bit cleaner. If Demeter is happy, I’m happy.
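The difference between the two designs is easier to see in miniature. The classes below are hypothetical stand-ins, not the PyXMPP API; they only show how many objects the bot has to reach through in each case.

```python
class Stream(object):
    """Hypothetical stand-in for a library's stream object."""

    def __init__(self):
        self.sent = []

    def send(self, stanza):
        self.sent.append(stanza)


class Client(object):
    """Hypothetical stand-in for a library's client/connection object."""

    def __init__(self):
        self.stream = Stream()


class HasAClientBot(object):
    """The bot owns a client and reaches through it to the stream."""

    def __init__(self):
        self.client = Client()

    def say(self, text):
        self.client.stream.send(text)  # two hops: bot -> client -> stream


class IsAClientBot(Client):
    """The bot is a client, so the stream is one of its own attributes."""

    def say(self, text):
        self.stream.send(text)  # one hop: bot -> its own stream
```

Both bots deliver the same message; the second one simply talks to an immediate friend instead of a friend of a friend.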

xmpppy

The xmpppy client was shockingly similar to the jabber.py client, but the similarities were explained away upon reading the xmpppy home page where it mentions that some of the code and API decisions came directly from jabber.py. That just goes to show, when in doubt, read the docs.

One thing I didn’t quite understand about the xmpppy API was the choice of method names. Why are some methods given capitalized names, while others are lowercase? This is probably a silly complaint on my part, but I just know I will spend a lot of time second guessing the capitalization of method names if I use this library in my applications.

Reading the xtalk.py source provided most of the pointers that I needed to get our client up and running, with only one minor issue that stood in the way. The example calls a method named SendInitPresence which was since changed to sendInitPresence. Again, this made me wonder about the naming conventions in use.
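Code that has to cope with either spelling could look the method up by name. This is only a sketch: the helper and both stub classes are hypothetical, and which xmpppy releases carry which spelling is not something this post establishes.

```python
def send_initial_presence(client):
    """Call whichever spelling of the method this client object offers."""
    for name in ("sendInitPresence", "SendInitPresence"):
        method = getattr(client, name, None)
        if callable(method):
            return method()
    raise AttributeError("client has no initial-presence method")


class CapitalizedStub(object):
    """Stand-in for a client that uses the capitalized name."""

    def SendInitPresence(self):
        return "capitalized"


class LowercaseStub(object):
    """Stand-in for a client that uses the lowercase name."""

    def sendInitPresence(self):
        return "lowercase"
```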

Support and Growth Options

The example we worked through today was pretty basic. Where do these packages leave us as our skill and requirements grow?

jabber.py

Jabber.py is dead, dead, dead, and they are kind enough to tell us that right on the front page. If that’s the case, why did we look into it? Well, it works, and it isn’t licensed under the GPL. The working bit matters to everyone, and the GPL bit matters to some. If you have fairly simple needs and a list of requirements that includes “NO GPL!”, then this may just be the project for you.

PyXMPP

PyXMPP is under active development, and it is a capable performer that seems to have a lot of power available for developers. For instance, as your bots grow, you can transfer your efforts into server side Jabber components, and the PyXMPP API will support your efforts. Of course, the fact that a large majority of developers will need to jump through a few hoops to install the library will be a source of frustration.

xmpppy

The xmpppy project’s last release was in December of 2007, and the mailing list remains active. It installs easily on Windows, OS X, and Linux, and it is a fairly easy-to-grasp library that could be quickly integrated into existing applications. The fact that it is GPL licensed could cause heartache for some projects or corporations, so proceed with caution if you are working in such a situation.

Conclusion

Sometimes the best way to pick a library is to sit down and write a little bit with each of the contenders. A little bit of code, and a little bit of reading goes a very long way toward a sound design decision. Sometimes these experiments lead us to question our initial assumptions. In this case, I am reconsidering Twisted to handle my XMPP needs since it is mature, under active development, and installs quickly on all of my potential target platforms.

What did I do right? What did I get wrong? What did I miss? Your comments are appreciated!


Schema Migrations in TurboGears with Migrate

Posted by Tim Freund Mon, 06 Aug 2007 00:42:00 GMT

I learned about the importance of schema migrations the hard way. At my previous job, I helped a team upgrade a Java web application. The upgrade involved schema changes, and I had the forethought to script the upgrade and thoroughly test it on the development database. Even with that preparation, the night of the upgrade would teach me two important questions that new developers should always ask of their team. Does the development database schema match the production database schema? And do you know how to restore the database from backups should anything go wrong? The answers to both questions on that night were no and oh no. Sometime after midnight things started working.

A better way existed. I first learned of schema migrations a few months earlier when exposed to ActiveRecord::Migrations. After using them on several projects, I was itching to have the same capability in my Python and Java projects. The Pythonic answer came in the form of Migrate, a schema migration library for SQLAlchemy, and direct support for TurboGears was added with TGMigrate. Having Migrate integrated with my projects greatly reduces my blood pressure on deployment days. Of course, I still make sure that my database backups are working.

Those interested in integrating Migrate with a TurboGears project might enjoy the screencast I just completed on the topic. If you use SQLAlchemy but avoid Migrate, I would be interested to hear what is holding you back.

Schema refactoring and migration was one of three topics at the last DotNext Kansas City Tech Coffee meeting. Notes on the schema migration talk were posted to Squidoo.

My apologies to those with small screens. I will record my next screencast in a smaller window. Your comments on today's screencast are appreciated.


Change Your Identity in TurboGears with Entry Points

Posted by Tim Freund Thu, 14 Jun 2007 03:15:00 GMT

Identity defines who we are. Identity is made up of all the little distinguishing traits that differentiate one person from another. We've all changed our identity throughout our lives. We change from student to graduate, single to married, dogless to dogged, and more, but that's not what we're talking about today. Identity is the authentication and authorization framework for TurboGears, and it is easy to extend.

At the core of the Identity framework is an IdentityProvider. The IdentityProvider interfaces with an authentication and authorization repository to determine two things: are you who you say you are, and do you belong where you are trying to go. The framework comes with two providers, one each for SQLObject and SQLAlchemy.

We will customize an IdentityProvider to authenticate against an IMAP server in the few steps that follow. This would be helpful for writing a web mail application, and the concept can be applied to other authentication mechanisms as well, including LDAP, Radius, and others.

Action Plan

  1. Quickstart a project
  2. Create an identity provider
  3. Define an entry point for the identity provider
  4. Configure the application for the new provider
  5. Finish the identity provider
  6. Test
  7. Relax

The code for this tutorial is available from subversion or as a tar.gz file. It is released under the MIT license. No TurboGears installation? Install it like so.

Step 1: Quickstart a Project

If you don't have an existing TurboGears project to experiment with, now would be a great time to start one.

tim@iris ~/src $ tg-admin quickstart -s -i iddemo
...
tim@iris ~/src/ $ cd iddemo
tim@iris ~/src/iddemo $ tg-admin sql create

Note the -s and -i flags. This is a project with support for SQLAlchemy and Identity.

Step 2: Create an Identity Provider

Any object can be an identity provider as long as it supplies the following methods: validate_identity, validate_password, load_identity, anonymous_identity, authenticated_identity, but it isn't always necessary to write one from scratch. Extending an existing provider often gets an application authenticating as required without much trouble. We will extend the SqlAlchemyIdentityProvider in this example to authenticate against an IMAP server.

iddemo/identity.py
from turbogears.identity.saprovider import SqlAlchemyIdentityProvider

class ImapSqlAlchemyIdentityProvider(SqlAlchemyIdentityProvider):
    pass
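Since any object with the right methods qualifies, a quick duck-typing check can catch a missing method before the framework does. A minimal sketch: the method names come from the list above, while the helper and the IncompleteProvider class are hypothetical.

```python
REQUIRED_METHODS = (
    "validate_identity",
    "validate_password",
    "load_identity",
    "anonymous_identity",
    "authenticated_identity",
)


def missing_provider_methods(candidate):
    """Return the required method names that the candidate does not supply."""
    return [
        name
        for name in REQUIRED_METHODS
        if not callable(getattr(candidate, name, None))
    ]


class IncompleteProvider(object):
    """Stand-in provider that only implements one of the five methods."""

    def validate_password(self, user, user_name, password):
        return False
```

An empty list means the object supplies everything named above; it says nothing about whether the methods behave correctly.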

Step 3: Define an Entry Point

The Identity Framework uses an entry point named turbogears.identity.provider to decide what class to use when authenticating users. We are about to define a new option for this entry point, but further reading on the subject of entry points is recommended. Scroll to the bottom of this entry for a couple of relevant links. It's OK, we have the time.

This entry point step won't be necessary in the future, thanks to this patch, but entry points are a powerful tool and worth learning, regardless.

setup.py
setup(
    name="iddemo",
... (more setup stuff) ...
    entry_points="""
    [turbogears.identity.provider]
    imapsa = iddemo.identity:ImapSqlAlchemyIdentityProvider
    """,
... (more setup stuff) ...
)
  

To let the setuptools system know about this new identity provider, run the following:

tim@iris ~/src/iddemo $ python setup.py develop
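An entry point is essentially a named pointer of the form module:object, grouped under a heading. The sketch below parses the same text passed to setup() and shows how such a pointer can be resolved with the standard library. It illustrates the idea only and is not how setuptools implements it; the helper names are hypothetical.

```python
import configparser
import importlib
import textwrap

ENTRY_POINTS = """
    [turbogears.identity.provider]
    imapsa = iddemo.identity:ImapSqlAlchemyIdentityProvider
    """


def parse_entry_points(text):
    """Map each group name to a {name: "module:object"} dictionary."""
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep entry point names case-sensitive
    parser.read_string(textwrap.dedent(text))
    return {group: dict(parser.items(group)) for group in parser.sections()}


def load_target(target):
    """Import the module before the colon and fetch the attribute after it."""
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


groups = parse_entry_points(ENTRY_POINTS)
```

Calling load_target on the imapsa value would import iddemo.identity, so it only works once the project from this tutorial is importable.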

Step 4: Configure the Application

With our imapsa option defined for the turbogears.identity.provider entry point, we can now configure the application to call the new provider. There is a value named identity.provider in app.cfg. We will replace the existing value with imapsa. While app.cfg is open, add the other three lines in the following example. They will be explained in the next step.

iddemo/config/app.cfg
...

identity.provider='imapsa'

identity.imapprovider.imap_authoritative=True
identity.imapprovider.server="localhost"
identity.imapprovider.port=143

...

  

The application is now ready to run with the new identity provider. Restart the application if it is currently running so that the configuration change will take effect.

Step 5: Finish the Identity Provider

Now let's dig in and implement the new authentication behavior.

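iddemo/identity.py
...

# Imports the listing below relies on. The module paths assume a
# TurboGears 1.0 layout and are not part of the original listing.
import imaplib
import logging

from turbogears.config import get
from turbogears.database import session
from turbogears.identity.saprovider import SqlAlchemyIdentity
from turbogears.util import load_class

log = logging.getLogger("iddemo.identity")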

class ImapSqlAlchemyIdentityProvider(SqlAlchemyIdentityProvider):
    def __init__(self):
        SqlAlchemyIdentityProvider.__init__(self)
        
        # These three lines get the configuration parameters we set in app.cfg
        self.imap_authoritative = get("identity.imapprovider.imap_authoritative", False)
        self.server = get("identity.imapprovider.server", "localhost")
        self.port = get("identity.imapprovider.port", 143)

        # These four lines make the user and visit classes available for
        # later use
        user_class_path = get("identity.saprovider.model.user", None)
        self.user_class = load_class(user_class_path)
        visit_class_path = get("identity.saprovider.model.visit", None)
        self.visit_class = load_class(visit_class_path)


    def validate_identity(self, user_name, password, visit_key):
        if self.validate_password(None, user_name, password):
            user = session.query(self.user_class).get_by(user_name=user_name)
            if not user:
                if self.imap_authoritative:
                    user = self.user_class()
                    user.user_name = user_name
                    user.save()
                    session.flush()
                else:
                    return None
            link = session.query(self.visit_class).get_by(visit_key=visit_key)
            if not link:
                link = self.visit_class(visit_key=visit_key, user_id=user.user_id)
                session.save(link)
            else:
                link.user_id = user.user_id
            session.flush()
            return SqlAlchemyIdentity(visit_key, user)
        return None

    def validate_password(self, user, user_name, password):
        rc = False
        try:
            imapcon = imaplib.IMAP4(self.server, self.port)
        except:
            log.error("Could not establish connection to server at %s:%d" % (self.server, self.port))
            return rc

        try:
            if imapcon.login(user_name, password)[0] == 'OK':
                rc = True
        except:
            # Probably threw an error for invalid username/password
            log.info("Passwords don't match for user: %s", user_name)
        imapcon.shutdown()
        return rc

Take a look at each method to see what it accomplishes.

__init__ invokes the constructor of the SqlAlchemyIdentityProvider and then collects the configuration parameters required for authentication.

validate_identity first invokes validate_password. If password validation succeeds, the user is selected from the database. If the user does not exist in the database, the provider will create the user when the imap_authoritative option is set. Should you not require this capability, you can remove this method entirely and rely upon the implementation in SqlAlchemyIdentityProvider. Finally, this method links the user with the current visit_key.

validate_password handles all IMAP access. It is the perfect method to override if all you need to do is change the identity authentication mechanism.
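The IMAP check can also be written as a standalone function, which makes it easy to try without TurboGears or a mail server. This is a sketch for a current Python 3, not the provider's code: the function name, the connect parameter, and the StubConnection class are hypothetical, and it catches specific exceptions and calls logout() where the listing above uses bare except clauses and shutdown().

```python
import imaplib
import logging

log = logging.getLogger(__name__)


def imap_password_ok(user_name, password, server="localhost", port=143,
                     connect=imaplib.IMAP4):
    """Return True if the IMAP server accepts the user name and password."""
    try:
        connection = connect(server, port)
    except (OSError, imaplib.IMAP4.error):
        log.error("Could not establish connection to server at %s:%d",
                  server, port)
        return False

    try:
        status, _ = connection.login(user_name, password)
        return status == "OK"
    except imaplib.IMAP4.error:
        # imaplib raises this when the server rejects the credentials.
        log.info("Login rejected for user: %s", user_name)
        return False
    finally:
        try:
            connection.logout()
        except Exception:
            pass  # the connection may already be unusable


class StubConnection(object):
    """Stand-in for imaplib.IMAP4 that accepts one hard-coded password."""

    def __init__(self, server, port):
        self.server = server
        self.port = port

    def login(self, user_name, password):
        if password != "secret":
            raise imaplib.IMAP4.error("LOGIN failed")
        return ("OK", [b"Logged in"])

    def logout(self):
        return ("BYE", [b"Logging out"])
```

Passing the connection class as a parameter is a testing convenience; in the provider, the server and port would come from the values set in app.cfg.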

Step 6: Test

Open controllers.py and change the identity decorator to require an authenticated user when accessing the index page.

iddemo/controllers.py
class Root(controllers.RootController):
    @expose(template="iddemo.templates.welcome")
    @identity.require(identity.not_anonymous())
    def index(self):
        import time
        flash("Your application is now running")
        return dict(now=time.ctime())

The moment of truth has arrived. Point a browser at the application. You should see a login page, and a valid IMAP username/password should provide access to the application.

Step 7: Relax

We're done here. Go tell your friends and family how cool you are.

Any feedback is appreciated. Leave a comment here, post a response on your own blog, or send an email to tim -at- achievewith -dot- us. You can also usually find me lurking in #turbogears on irc.freenode.net as timphnode.

For more information, follow these links:


Produce PDF Pages with TurboGears, Cheetah, and ReportLab

Posted by Tim Freund Thu, 22 Feb 2007 02:10:00 GMT

HTML is king of Web 2.0, and JavaScript is HTML's chief advisor and errand boy. But the printed page still matters, especially when working with Web 0.2 requirements and processes, and PDF documents provide crisp and consistent print output across platforms.

Follow along with the rest of this document to see how to produce PDF documents with the help of TurboGears, Cheetah, and ReportLab. We will generate form letters in response to a job opening, and in Part II we will print mailing labels for our letters.

The code for this tutorial is available from subversion or as a tar.gz file. It is released under the MIT license.

The Tools

  • ReportLab is a Python library that creates professional PDF documents with minimal developer effort. It can be used in any Python application. The <br/> tag doesn't seem to work with the stable release, so try the snapshot instead.
  • Cheetah is a flexible template engine that can generate arbitrary documents. It is supported by most Python web frameworks. TurboGears support is provided by the TurboCheetah package.
  • TurboGears is a web framework that aims to make creating web apps faster, easier, and more fun. It brings together a wide range of powerful Python libraries into a coherent whole.

The Exercise

We will generate acceptance and rejection letters on behalf of Non Compos Mentis Research, a mysterious scientific research organization located somewhere underneath the Midwestern United States. They were seeking applicants for an "Evil Genius, Sr." position after their longtime employee, Dr. Phinius Fraggenblam, had an accident that led to early "termination." All of the interviews are complete, the Board has made its decisions, and we have all of the applicant information we need in a model class named pdfdemo.model.Applicant.

This demonstration breaks into three logical steps. We need to

  • Create a static PDF just to prove that we can
  • Insert data into the PDF
  • Serve the PDF to a web browser

Step 1: Generate a PDF from a Text File Using ReportLab

ReportLab allows programmers to specify every detail as they write code to generate PDF documents. For the [pragmatic | impatient | lazy] developer, ReportLab also provides an abstraction layer named PLATYPUS, short for "Page Layout And Typography Using Scripts." PLATYPUS provides a collection of "flowable" elements including paragraphs and page breaks that can be inserted into templated documents.

The Paragraph is a powerful Flowable element. In addition to accepting a style definition from reportlab.lib.styles, the Paragraph element also understands a small set of intra-paragraph XML markup.

Two brief examples follow:

pdfdemo.letters.reject_plain
<doc>
<para>Dear Sir/Madam:</para>

<para>After careful consideration, we regret to inform you that your application
was <font face="times" color="red">rejected</font>.</para>

<para>Please try again next year.  <b>Or else</b>.</para>

<para>Best regards,</para>

<para>The Board<br/>
   Non Compos Mentis Research</para>
</doc>
   
pdfdemo.letters.accept_plain
<doc>
<para>Dear Sir/Madam:</para>

<para>Your application rose to the top of the millions and millions that we 
received.  We are pleased to extend to you a position on our dynamic team.</para>

<para>We will come for you in the night, unannounced.  Be prepared.</para>

<para><i>Congratulations!</i></para>

<para>The Board<br/>
   Non Compos Mentis Research</para>
</doc>
   

Given the above, we can generate a static PDF with two pages: one acceptance letter, one rejection letter.

pdfdemo.printjobs.generate_plain_document()
# Imports assumed for the pdfdemo.printjobs module; adjust to your project layout.
# ElementTree ships with Python 2.5+; older installs use the elementtree package.
from xml.etree import ElementTree
from reportlab.lib import styles
from reportlab.platypus import SimpleDocTemplate, Paragraph, PageBreak
from pdfdemo import letters

def generate_plain_document():
    """Generates a PDF with two pages, an accept and reject letter.
    
    Output is hardcoded to output.pdf in the current working directory
    """
    # PDF data is binary, so open the file in binary mode
    output_file = open("output.pdf", "wb")
    document = SimpleDocTemplate(output_file)
    components = []
    
    sheets = styles.getSampleStyleSheet()
    # increase font size to 14 points
    sheets['Normal'].fontSize = 14
    # increase line height to 16 points
    sheets['Normal'].leading = 16
    # increase space after a paragraph to 16 points.
    sheets['Normal'].spaceAfter = 16
    paragraphStyle = sheets['Normal']
    
    for letter in (letters.accept_plain, letters.reject_plain):
        # There are multiple paragraph elements (para) defined in our 
        # documents, and each Paragraph object can only accept one.
        # Fortunately ElementTree is a required component of TurboGears,
        # and it makes short work of splitting up the document by para element.
        doc = ElementTree.XML(letter)
        for para in doc.findall('para'):
            # On Python 3, tostring returns bytes unless encoding="unicode" is passed.
            components.append(Paragraph(ElementTree.tostring(para), paragraphStyle, None))
        components.append(PageBreak())
    
    document.build(components)
    output_file.close()
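
The paragraph-splitting step can be tried on its own, without ReportLab installed. The following is a minimal sketch, assuming Python 3 and a shortened letter invented for illustration; split_paragraphs is a hypothetical helper name, not part of the pdfdemo code.

```python
from xml.etree import ElementTree

# A shortened letter for illustration; the <doc> root element is required
# because an XML document can have only one top-level element.
LETTER = """<doc>
<para>Dear Sir/Madam:</para>
<para>Please try again next year.  <b>Or else</b>.</para>
<para>The Board<br/>
Non Compos Mentis Research</para>
</doc>"""


def split_paragraphs(letter_xml):
    """Return each <para> element of the letter as its own markup string."""
    root = ElementTree.XML(letter_xml)
    # findall("para") matches direct children of the root only.
    # encoding="unicode" asks for str rather than bytes (Python 3);
    # strip() drops the whitespace that trails each element in the source.
    return [ElementTree.tostring(para, encoding="unicode").strip()
            for para in root.findall("para")]


paragraphs = split_paragraphs(LETTER)
for markup in paragraphs:
    print(markup)
```

Each string in the list keeps its intra-paragraph markup (the b and br tags), which is what a Paragraph object expects to receive.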

Step 2: Use a Cheetah Template to Insert Custom Data into the PDF

Now there is the small matter of inserting custom bits of data into the text before ReportLab does the voodoo that it does so well. Note that ReportLab is generating the PDF from a text string.

Also note that Cheetah can generate arbitrary text strings from templates and data with grace and aplomb. Since TurboGears is our target platform, TurboCheetah is installed, and that makes things even easier. Data available to a Cheetah template is passed as a dictionary into the TurboCheetah render method. Variables are rendered in Cheetah templates like so: ${my_variable.some_property}.

Below are examples of a Cheetah template, and a function invoking the TurboCheetah.render method.

pdfdemo.letters.accept.tmpl
<doc>
<para>${applicant.prefix} ${applicant.full_name()}<br/>
${applicant.address1}<br/>
#if $applicant.address2 != None
${applicant.address2}<br/>
#end if
${applicant.city}, ${applicant.state} ${applicant.postal}</para>

<para>Dear ${applicant.prefix} ${applicant.full_name()}:</para>

<para>Your application rose to the top of the millions and millions that we 
received.  We are pleased to extend to you a position on our dynamic team.</para>

<para>We will come for you in the night, unannounced.  Be prepared.</para>

<para><i>Congratulations!</i></para>

<para>The Board<br/>
Non Compos Mentis Research</para>
</doc>

pdfdemo.printjobs.generate_lettersv1()
# Assumed import path; the TurboCheetah package provides this class.
from turbocheetah import TurboCheetah

def generate_lettersv1(applicants):
    tcheetah = TurboCheetah()
    pages = []
    # render the templates
    for person in applicants:
        letter_data = {"applicant": person}
        # convert status to a string because render doesn't
        # appreciate unicode template names
        template_name = "pdfdemo.letters.%s" % str(person.status).lower()
        rendered = tcheetah.render(letter_data, template=template_name)
        pages.append(rendered)
    
    paragraphStyle = get_stylesheets()['Normal']
    # assemble the PDF from the rendered templates
    components = []
    for page in pages:
        page = ElementTree.XML(page)
        for para in page.findall('para'):
            components.append(Paragraph(ElementTree.tostring(para), 
                                        paragraphStyle, 
                                        None))
        components.append(PageBreak())
    
    # generate the PDF; PDF data is binary, so open in binary mode
    output_file = open("output.pdf", "wb")
    document = SimpleDocTemplate(output_file)
    document.build(components)
    output_file.close()
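
The data flow in this step (a dictionary goes in, rendered text comes out, and the template is chosen from the applicant's status) can be tried without Cheetah installed. The sketch below swaps in the standard library's string.Template as a stand-in for Cheetah, so it is not the TurboCheetah API. Unlike Cheetah, string.Template cannot follow attributes such as ${applicant.prefix}, so the data is a flat dictionary. The template text, applicant records, and render_letter helper are all invented for illustration.

```python
from string import Template

# Hypothetical stand-ins for the Cheetah templates, keyed by the same
# lower-cased status names that generate_lettersv1 uses.
TEMPLATES = {
    "accept": Template("<doc><para>Dear $prefix $full_name:</para>"
                       "<para>Congratulations!</para></doc>"),
    "reject": Template("<doc><para>Dear $prefix $full_name:</para>"
                       "<para>Please try again next year.</para></doc>"),
}


def render_letter(applicant):
    """Pick a template from the applicant's status and fill in the data."""
    template = TEMPLATES[str(applicant["status"]).lower()]
    # substitute() ignores unused keys such as "status".
    return template.substitute(applicant)


# Made-up applicant records; the real data lives in pdfdemo.model.Applicant.
applicants = [
    {"prefix": "Dr.", "full_name": "Ada Example", "status": "ACCEPT"},
    {"prefix": "Mr.", "full_name": "Bob Sample", "status": "Reject"},
]
pages = [render_letter(person) for person in applicants]
```

Each entry in pages is an XML string with a single doc root, ready for the same para-splitting loop used above.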

Step 3: Render the PDF to a Browser with TurboGears

The hard work is done, but the code still writes a file to the file system, and we need to view the PDF in a browser. ReportLab can work with any object that acts like a file, including a StringIO buffer. A new function, generate_lettersv2, accepts a file-like object rather than opening a file itself. The controller then calls generate_lettersv2 with a StringIO object as the "file". This is a breeze.

pdfdemo.controllers.Root.letters()
    @expose(content_type="application/pdf")
    def letters(self, **kw):
        # build the PDF in memory instead of on disk
        letters_file = StringIO()
        generate_lettersv2(Applicant.select(), letters_file)
        # copy the bytes out before closing the buffer
        pdf = letters_file.getvalue()
        letters_file.close()
        return pdf
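
Since generate_lettersv2 is not shown here, the pattern it relies on can be sketched with the standard library alone. This is a minimal illustration, assuming Python 3; write_report is a hypothetical stand-in that writes plain bytes rather than a PDF, and it only demonstrates how accepting a file-like object lets the caller choose the destination.

```python
import io


def write_report(lines, out):
    """Write each line to `out`, any object with a write() method.

    Hypothetical stand-in for generate_lettersv2: it never opens or
    closes the destination, so the caller decides where the bytes go.
    """
    for line in lines:
        out.write(line.encode("utf-8") + b"\n")


# In a controller: collect the bytes in memory and return them.
buffer = io.BytesIO()
write_report(["accept", "reject"], buffer)
payload = buffer.getvalue()  # must be read before close()
buffer.close()

# From a script, the same function can write to a real file:
# with open("output.txt", "wb") as handle:
#     write_report(["accept", "reject"], handle)
```

The function that produces the output stays the same in both cases; only the object handed to it changes.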

Some really old browsers require the URL of a PDF document to end with .pdf. This is accomplished by implementing a default method for the controller.

pdfdemo.controllers.Root.default()
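    @expose()
    def default(self, *path, **kw):
        # Map a URL such as /letters.pdf onto the letters() method.
        # Guard against an empty path before looking at path[0].
        if path and '.' in path[0]:
            method_name = path[0].split('.')[0]
            method = getattr(self, method_name, None)
            # only dispatch to methods that are exposed to the web
            if method is not None and getattr(method, 'exposed', False):
                return method(**kw)
        raise cherrypy.NotFound()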
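
The suffix-stripping dispatch can be exercised outside of a web server. The following is a minimal sketch in plain Python 3, with a local NotFound exception standing in for cherrypy.NotFound and a hand-set exposed attribute standing in for what an expose decorator marks; the Root class and its methods here are invented for illustration. It guards against an empty path and skips methods that are not marked as exposed.

```python
class NotFound(Exception):
    """Stand-in for cherrypy.NotFound so the sketch runs on its own."""


class Root:
    def letters(self, **kw):
        return "pdf bytes for %s" % kw.get("status", "everyone")
    letters.exposed = True  # what an expose decorator would mark

    def secret(self):
        return "not meant for the web"

    def default(self, *path, **kw):
        # "letters.pdf" -> "letters"; names without a dot fall through.
        if path and "." in path[0]:
            method = getattr(self, path[0].split(".")[0], None)
            if method is not None and getattr(method, "exposed", False):
                return method(**kw)
        raise NotFound(path)


root = Root()
result = root.default("letters.pdf", status="accept")
print(result)
```

A request for secret.pdf raises NotFound in this sketch because secret carries no exposed marker, even though the attribute exists on the controller.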

Get Up and Stretch. Grab a Soda.

This article has detailed the steps required to produce templated PDF documents using TurboGears, Cheetah, and ReportLab. In the next installment we will generate mailing labels so that the letters we generated today are not lost due to poor handwriting skills. The Non Compos Mentis organization does not tolerate mistakes.

Any feedback is appreciated, both on the technical content and the silly back story. Leave a comment here, post a response on your own blog, or send an email to tim -at- achievewith -dot- us. You can also usually find me lurking in #turbogears on irc.freenode.net as timphnode.



System.out deemed Unreliable

Posted by Tim Freund Wed, 15 Nov 2006 04:47:00 GMT

Actually, System.out fulfills its purpose perfectly. Data goes in through print methods, and goes out to the console every single time. Before you complain about the misleading title, let me finish. Although great at printing, System.out is a completely unreliable test framework.

Who in their right mind uses System.out as a test framework? At no place in the javadocs does it mention code testing as an alternative use of System.out. How about a quick show of hands. How many people have written code like this:

System.out.println("widget.name == " + session.getWidget().getName());

I have, I admit. Quite often, actually. Printing a few statements is often the easiest way to become familiar with a new API. When faced with particularly confusing libraries, especially closed source libraries, it is often helpful to do a little bit of detective work. Often this detective work morphs into a tiny standalone program:

public class CogAdapter{
  /*
   * other code goes up here...
   */ 

  public static void main(String[] args){
    // the intuitive naming scheme in the catalog is awesome... ugh.
    Cog c = CogswellCogFactory.getCog(CogswellCatalog.X43RNT);
    Sprocket s = SpacelySprocketFactory.getSprocket();

    try{
      c.attach(s);
      System.out.println("cog.attach(sprocket) worked");
      System.out.println("cog.attachedObjects.length == " + 
                          c.getAttachedObjects().size());
    } catch(RatioException re){
      System.out.println("cog & sprocket ratios incompatible");
      System.out.println("cog.ratio == " + c.getRatio());
      System.out.println("sprocket.ratio == " + s.getRatio());
    } catch(JetsonException je){
      System.out.println("manufacturing defect in sprocket: " + je.getMessage());
    }
  }
}

Right, right, cogs don’t really attach directly to sprockets. That’s why this example is in the CogAdapter class. Duh. I didn’t want to pass up a fun Jetsons reference just because sprockets and cogs are not directly interchangeable. The object model of Fred Flintstone’s rock quarry just wasn’t a good fit for use in this example.

At this point System.out has evolved into the cornerstone of a miniature testing framework. Not a good testing framework, but one nonetheless. This output-driven test framework requires labor-intensive manual intervention for the tests to have any value. A person must run the tests by hand and then check the printed results against the intent of the code.

When I would write tests like this, I would get the test to work, satisfy my original curiosity, and then the test would never run again. At least not until the vendor upgraded their library and changed behavior. Even then, I had to actually remember that I wrote such a test when first working on the project before that test would again be called into service.

I saw the answer to these problems for years before I acted to remedy the situation. These little System.out driven test programs were perfect candidates to become JUnit tests, but I didn’t make that connection. I had only ever heard the “Test First” mantra, and I was obviously already knee deep in source code, so I thought that unit tests weren’t for me. Watching two of Mike Clark’s presentations at an NFJS conference in 2004 got me excited about integrating unit tests into my existing code.

How does JUnit improve upon our System.out test framework?

  • JUnit tests can run one at a time, or all at once.
  • Every major Java IDE understands how to run JUnit.
  • Every major build system supports running JUnit tests, either through built-in plugins or trivial customization.
  • The mind-numbing task of verifying results is left up to the computer.

Let’s transform the Sprocket/Cog test above into a real JUnit test.

import junit.framework.TestCase;

public class CogAdapterTest extends TestCase {
  public void testAttach(){
    Cog c = CogswellCogFactory.getCog(CogswellCatalog.X43RNT);
    Sprocket s = SpacelySprocketFactory.getSprocket();

    try {
      c.attach(s);
      assertNotNull(c.getAttachedObjects());
      // size() matches the call used in CogAdapter.main above
      assertEquals(1, c.getAttachedObjects().size());
    } catch(RatioException re){
      // fails the test, reporting both values, when the ratios differ
      assertEquals(c.getRatio(), s.getRatio());
    } catch(JetsonException je){
      fail("Incompetent Manufacturer");
    }
  }
}

No System.out references in sight. Now the computer will only produce output if something goes wrong, and once JUnit is scheduled to run as a part of the build process, your project will never go untested again.

System.out is great at printing data, but it makes a lousy test framework.

Ready to learn more? Check out these resources:

JUnit

Pragmatic Unit Testing in Java with JUnit or in C# with NUnit.

A Dozen Ways to Get the Testing Bug in the New Year by Mike Clark.


Prototype for the Proto-programmer

Posted by Tim Freund Tue, 06 Jun 2006 04:05:00 GMT

So you’re a web designer afraid of a little code?  Or maybe you really are a developer, but there are some static pages that need a little dynamic boost.  The Prototype JavaScript library enables dynamic behavior in pages created by those of you who don’t think you can code, as well as those of you who aren’t allowed to code.  Perhaps a case study is in order:

Zack runs a website for a shelter that finds homes for older dogs.  Dogs whose owners are "moving", or who ran away from home, and even the seldom seen blind seeing eye dog.  Don’t worry, the blind seeing eye dogs are obviously retired.  It is tough to place these animals in a caring home since so many people would prefer to spend hundreds of dollars on a pedigreed animal, complete with a lifetime of bad hips and an increased likelihood of tumors, but Zack and the crew at Red Rover Rescue carry on.  At least once a week they take an assortment of animals out to local events and pet stores, hoping that some kind soul will take Miss Growlypants home.  It is very helpful to have a calendar of events on their home page so that volunteers and potential adopters alike know when and where the dogs will be out.

Zack knows HTML, CSS, and a little bit of JavaScript.  He has access to pre-installed calendar programs, but he doesn’t know how to marry his dynamic calendar with his oh so static shelter web page.

Can we help Zack?  Of course we can, otherwise this would be the lamest tutorial in the world.  There is a popular JavaScript library called Prototype.  Prototype smooths over a lot of inconsistencies among different web browsers and makes accomplishing some rather dynamic stuff with JavaScript quite simple.  Among its many tricks, Prototype can load files from the server, or the output of programs running there, into the current page.  That’s just what Zack needs.  Zack does the following to make things work:

  1. Downloads the Prototype library and puts the prototype.js file in his javascripts directory
  2. Inserts the following line into the HEAD section of his HTML file to load the library:
<script src="./javascripts/prototype.js" type="text/javascript"></script>

Prototype is nice, but it isn’t quite magic.  Zack still needs to write a little bit of JavaScript to glue everything together.  He wants the calendar loaded when the page loads, so he writes the following function:

// the first argument is the name of the div where the results of the remote call will end up
// the second argument is the url to retrieve
// we'll talk about the third argument in a later lesson
function loadFirstTarget(){
  new Ajax.Updater('firsttarget', './lesson01-calendar.html', {asynchronous:true, method:"GET"});
}

Noting that the Ajax.Updater will look for a div with the id ‘firsttarget’, Zack adds one to his page where he wants the calendar to appear:

<div class="target" id="firsttarget"></div>

Zack then changes the body element of his page to have an onload attribute:

<body onload="loadFirstTarget();">

There’s one other thing—Zack wants people to be able to click a button to see the dog of the month.  He adds another function to the page:

function loadSecondTarget(){
  new Ajax.Updater('secondtarget', '/lesson01-dotm.html', {asynchronous:true, method:"GET"});
}

And then he creates a form with a single button that fires loadSecondTarget() as well as the ‘secondtarget’ div:

<form>
  <input type="button" value="See the Dog of the Month!" onclick="loadSecondTarget();">
</form>

<div class="target" id="secondtarget"></div>

At this point, you probably want to see the whole thing in action.

To recap, let’s put you in Zack’s shoes.  If the following fits your situation then you just learned a new trick that you can apply to your site today:

  1. You have a pretty static website.
  2. You have some dynamic web applications like a calendar or some forums.
  3. You’d like to take some of the content from the dynamic bits and put them in the static bits, but you don’t want to turn the static bits dynamic with a bunch of programming.
  4. JavaScript doesn’t scare you.

If that’s the case, you should roll up your sleeves and implement the technique outlined in this tutorial, if you haven’t done so already. 

So what do you think?  Leave a comment to let me know how it works out for you.  You can also contact me directly if you’d like to talk about work that needs to be done on your own site that you’d rather not do yourself.
