Planet Pypefitters

July 03, 2009

Chris McDonough

Random Things I've Learned While Writing Tech Docs

Here are some random things I've learned while writing technical documentation:

  • When talking about concepts, using singulars makes for clearer reading than using plurals. For example, rather than composing a sentence like this:

    "Applications built with foobar may be frobnobbed using fleebars."

    It's usually clearer to say something like:

    "An application built using a foobar can be frobnobbed using a fleebar."

    Personally, I find keeping to the singular keeps things concrete, and prevents the text from careening into the abstract too quickly.

  • Don't be afraid to repeat yourself. Use pronouns sparingly. For example, it's often clearer to say: "The cat sleeps on the sofa. The cat is a tommy" rather than "The cat sleeps on the sofa. It is a tommy." Is the cat or the sofa a tommy? The first form is repetitive, but it's crystal clear that the cat is a tommy.
  • Examples beat any amount of purely narrative explanation.

by chrism at July 03, 2009 03:33 AM

June 18, 2009

Tres Seaver

They Call it Stormy Monday, but Tuesday's Just as Bad

I just uploaded a tutorial on using Storm with BFG to the BFG tutorial bin. The application is traversal-based, and uses the storm.zope-based transaction integration, in combination with repoze.tm2, to make for a very pleasing little CRUD app.

by tseaver at June 18, 2009 07:04 PM

June 05, 2009

Repoze Project

repoze.who / repoze.what Featured in Python Magazine

Gustavo Narea reports:

A few months ago I wrote an article on repoze.who and repoze.what, which has just been published on Python Magazine: http://pymag.phparch.com/c/issue/view/98.

I believe it's a good resource for those who are new to both frameworks (even if they aren't familiar with WSGI or auth in general yet), as well as for current users to better understand how repoze.who/what work and so make the most out of them.

June 05, 2009 09:53 PM

June 04, 2009

Repoze Project

repoze.urchin 0.1 release

I just whipped up a tiny bit of middleware for OSI's version of the KARL project: they needed to put markup for Google Analytics into each page served. We wanted to avoid adding anything to the core KARL software which imposed such a policy on other users of KARL; we might have added it via the "customization package" for OSI, but it would have still required changing the core software in ways that seemed too invasive.

This seemed like a natural use for WSGI middleware, which could intercept the outbound response and add the additional markup, without requiring any changes to the application. It turns out that the implementation is simple, cheap and generic: it would be useful for any WSGI-based deployment which wants to use Google Analytics.
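
To make the idea concrete, here is a minimal sketch of response-rewriting middleware. It is not the repoze.urchin code: the function name inject_before_body_close and the snippet passed to it are made up for illustration, and the sketch buffers the whole response in memory, which a production middleware might not want to do.

```python
def inject_before_body_close(app, snippet):
    """Wrap a WSGI app; insert `snippet` (bytes) before the last </body>.

    Hypothetical helper for illustration only, not the repoze.urchin API.
    """
    marker = b"</body>"

    def middleware(environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Remember what the application wanted to send.
            captured["status"] = status
            captured["headers"] = list(headers)
            # Simplification: data passed to the legacy write() callable
            # is ignored by this sketch.
            return lambda data: None

        result = app(environ, capture)
        try:
            body = b"".join(result)  # buffer the whole response
        finally:
            if hasattr(result, "close"):
                result.close()

        headers = captured["headers"]
        is_html = any(
            name.lower() == "content-type"
            and value.lower().startswith("text/html")
            for name, value in headers
        )
        if is_html and marker in body:
            head, sep, tail = body.rpartition(marker)
            body = head + snippet + sep + tail
            # The body grew, so Content-Length has to be recomputed.
            headers = [(n, v) for n, v in headers
                       if n.lower() != "content-length"]
            headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers)
        return [body]

    return middleware


def demo_app(environ, start_response):
    # Stand-in application used only to exercise the sketch.
    body = b"<html><body>hi</body></html>"
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]


wrapped = inject_before_body_close(demo_app, b"<script>/* analytics */</script>")
```

The application itself is untouched; only the deployment wiring decides whether the wrapper is in the pipeline.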

So, I released it to the Cheeseshop, with the minimal docs also served from there.

Enjoy!

- Tres Seaver

June 04, 2009 04:05 PM

June 03, 2009

Paul Everitt

Fun and profit with middleware


Yesterday we were working on the KARL project, doing some post-deployment housekeeping.  Specifically, we had a checkout of the templates that had a local customization (injecting Google Analytics) that we didn’t want to check in.  At least not to the software repository itself.  We wanted a demo build of KARL to have no analytics, and certainly not OSI’s account.

The most logical thing would have been to throw ZPT at it and get a little snippet of HTML to jam in just before closing the body tag.  But that would mean going to a number of places and injecting calls, plus we’d have to grab the configuration data from somewhere for the right snippet.

Tres and Chris Rossi argued for middleware: something that would watch the outgoing HTML and inject Google Analytics in the appropriate circumstances.  An hour later, Tres had written repoze.urchin, which takes its data as parameters in the Paste configuration file and then hacks the HTML on the way out.

When to use and not use middleware is an art that I’m still learning about.  The two biggest rules appear to be: don’t solve a problem with middleware if the application won’t run without it, or if the middleware requires access to information inside the application.

by Paul Everitt at June 03, 2009 07:32 PM

June 01, 2009

Paul Everitt

The marketing value of developer docs


Last week at the Plone Symposium East I gave a talk on the KARL project that I’ve been working on.  The basic meme: Plone and its large ecosystem provide a ton of value when your needs match up with its bulls-eye.  What should you do when your needs don’t fit so well into Plone-the-product’s box?

My talking point was, we need to discourage expanding Plone’s bulls-eye to cover generic platform development of any possible application.  Instead, encourage the meme that the technologies (and effort expended learning them) can be used to make a targeted product.

The KARL project adopted that thinking in its switch to BFG (to good effect, as we then focused on building the best KARL we could.)  In describing BFG’s goals, I lifted one directly from Chris:

Documentation: The lack of formal documentation of a feature or API is a bug.

I then went on to explain that Chris released the documentation for BFG before releasing the software, and has made an enormous, constant effort at keeping the wide-ranging docs (API, narrative, example applications) up-to-date as he has refactored.

In making the point, I posited that “Friendly, ample docs make a positive first impression” is part of the reason for swift uptake of Django and other Python web frameworks.  Chris pointed me to a survey that makes that point in spades.

Further confirmation came during the BFG tutorial Tres and I did last week.  Eric Rose clicked the link to the BFG docs in the comprehensive BFG Wiki tutorial and had a very visible positive reaction on his face.

by Paul Everitt at June 01, 2009 06:31 PM

May 31, 2009

Ben Bangert

Making a better TextArea

I’ve been working on some Javascript to enhance the TextArea elements on the PylonsHQ Snippets section, and have noticed that… well, TextAreas suck.

The hack I’ve seen is to use one of the newer features of browsers, the editable or ‘design’ attributes for div elements I believe. This lets one build a very snazzy set of features, such as syntax highlighting, code completion, etc., but I don’t think I needed to go that far.

I only have one main design goal: this TextArea is for the user to enter reStructuredText, so it’d be awesome if the TextArea acted in a way that made rst a bit nicer. The obvious two things that came to mind:

  1. Tab key indents 4-spaces
  2. Hitting return on an indented line will retain the indentation on the next line

I’ve actually put some Javascript, cobbled together from various parts of the net, along with an ‘enter’ key handler I wrote myself, on bitbucket.

Course, it’d also be nice to have a button or key combo that will indent/unindent a selection in the TextArea as well.

Anyone else have any Javascript they’ve cobbled together in the past to make TextArea a little nicer for reStructuredText?

by ben at May 31, 2009 12:42 AM

May 23, 2009

Ian Bicking

WebOb decorator

Lately I’ve been writing a few applications (e.g., PickyWiki, and revisiting a request-tracking application, VaingloriousEye), and I usually use no framework at all. Pylons would be a natural choice, but given that I am comfortable with all the components, I find myself inclined to assemble the pieces myself.

In the process I keep writing bits of code to make WSGI applications from simple WebOb -based request/response cycles. The simplest form looks like this:

from webob import Request, Response, exc

def wsgiwrap(func):
    def wsgi_app(environ, start_response):
        req = Request(environ)
        try:
            resp = func(req)
        except exc.HTTPException as e:
            resp = e
        return resp(environ, start_response)
    return wsgi_app

@wsgiwrap
def hello_world(req):
    return Response('Hi %s!' % (req.POST.get('name', 'You')))

But each time I write it, I change things slightly, implementing more or fewer features. For instance, handling methods, or coercing other responses, or handling middleware.

Having implemented several of these (and reading other people’s implementations) I decided I wanted WebOb to include a kind of reference implementation. But I don’t like to include anything in WebOb unless I’m sure I can get it right, so I’d really like feedback. (There’s been some less than positive feedback, but I trudge on.)

My implementation is in a WebOb branch, primarily in webob.dec (along with some doctests).

The most prominent way this is different from the example I gave is that it doesn’t change the function signature; instead it adds an attribute .wsgi_app, which is the WSGI application associated with the function. My goal with this is that the decorator isn’t intrusive. Here’s the case where I’ve been bothered:

class MyClass(object):
    @wsgiwrap
    def form(self, req):
        return Response(form_html...)

    @wsgiwrap
    def form_post(self, req):
        handle submission

OK, that’s fine, then I add validation:

@wsgiwrap
def form_post(self, req):
    if req not valid:
        return self.form
    handle submission

This still works, because the decorator allows you to return any WSGI application, not just a WebOb Response object. But that’s not helpful, because I need errors…

@wsgiwrap
def form_post(self, req):
    if req not valid:
        return self.form(req, errors)
    handle submission

That is, I want to have an optional argument to the form method that passes in errors. But I can’t do this with the traditional wsgiwrap decorator; instead I have to refactor the code to have a third method that both form and form_post use. Of course, there’s more than one way to address this issue, but this is the technique I like.
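
To illustrate the non-intrusive idea in isolation, here is a minimal sketch that does not use WebOb at all and is not the webob.dec implementation. It assumes, purely for the example, that the wrapped function takes the WSGI environ and returns a (status, headers, body) tuple; it also only handles plain functions, not methods.

```python
def wsgiwrap(func):
    """Attach a WSGI app as func.wsgi_app and return func unchanged.

    Sketch only: the (status, headers, body) return convention is an
    assumption made for this example.
    """
    def wsgi_app(environ, start_response):
        status, headers, body = func(environ)
        start_response(status, headers)
        return [body]

    func.wsgi_app = wsgi_app
    return func  # the original signature is preserved


@wsgiwrap
def hello(environ, name="You"):
    body = ("Hi %s!" % name).encode("utf-8")
    return "200 OK", [("Content-Type", "text/plain")], body


# Python callers can still pass extra arguments directly...
direct = hello({}, name="Bob")
# ...while a WSGI server would be handed hello.wsgi_app instead.
```

Because the decorated function keeps its own signature, other code can call it with extra arguments (such as errors) without a third helper method.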

The one other notable feature is that you can also make middleware:

@wsgify.middleware
def cap_middleware(req, app):
    resp = app(req)
    resp.body = resp.body.upper()
    return resp

capped_app = cap_middleware(some_wsgi_app)

Otherwise, for some reason I’ve found myself putting an inordinate amount of time into __repr__. Why I’ve done this I cannot say.

by Ian Bicking at May 23, 2009 02:35 AM

May 21, 2009

Paul Everitt

Some info about KARL, the project I’ve been working on


For the last few years I’ve been working with some great folks at the Open Society Institute on a project called KARL.  It’s now open source and has a website with some preliminary information, which means I can chat about it in advance of my presentation next week at the Plone Symposium.

In a nutshell, KARL is a collaboration system for projects and organizations.  We are just wrapping up KARL3 (a rewrite to convert from Zope/Plone to a Zope-like BFG application) and we’re doing the migration work.  There’s quite a bit to chat about, so look for some more blog posts as we finish up the process.

by Paul Everitt at May 21, 2009 12:18 AM

May 17, 2009

Chris McDonough

Personal Priority Inversion

According to my Ohloh stats:

by chrism at May 17, 2009 08:58 PM

May 14, 2009

Noah Gift

My Kiwi Foo Experience



In November, I moved to New Zealand, and I was very happy to have been graciously invited to Kiwi Foo by Nat Torkington. As a Python programmer, it was quite a bit of fun to mingle with a wide range of people on the cutting edge of technology.

It was held near Auckland, New Zealand, and it was an incredibly fun weekend. The Webstock conference was held the same weekend, so a lot of the people in town for that conference also came to Kiwi Foo. This was great, as I got to meet several interesting people who might not have been there otherwise.



One thing I regret a tad is that I didn't do a talk on anything in Python, as the board filled up so quickly I couldn't squeeze anything in by the time I decided to talk about something. I guess this is one of the lessons you learn as a newbie to a Foo-style camp: you have to be quick!

One of the interesting things about being one of the few, if not the only, people who made their living programming in Python was listening to some of the conversations about Python that sprang up here and there. Google App Engine in particular seems to be bringing a huge assortment of people to the Python language, and these aren't your everyday Joe Blows; a lot of them help make the technical decisions for their company.


While I wasn't able to give a talk on Python, I do think I convinced quite a few people to take a second look. My "sales technique", which I have been polishing for a while, revolves around getting someone to easy_install the IPython shell, and then walking them through how to write a program in Python. If you haven't tried this technique yet, do; it has a very big wow factor.

To wrap up, I really enjoyed myself, and was happy to hear so much talk about Python, as a fly on the wall in a country on the other side of the world. If I am so lucky as to be invited back to Kiwi Foo next year, I would be a very happy man, and, more importantly, I would shove in a talk on Python Generators right after I dropped off my bags.

A couple of funny things to note on the way out. First, the whole camp had a group debate in the auditorium on the topic, "Is New Zealand Fucked?". It was a riot, and worth the trip alone. Those Kiwis have foul mouths, I think one person managed to fit "Fuck" into a sentence 6 times, and I love them for it. Second, when one of many discussions on App Engine popped up, I heard someone say, "I think it is going to fail." I replied, "I hope not, because I have been writing a book about it for the last year" :)

by Noah Gift (noah.gift@gmail.com) at May 14, 2009 11:43 AM

May 09, 2009

Repoze Project

Register now for the BFG Tutorial at Plone Symposium East 2009

We will be presenting a half-day tutorial on repoze.bfg at the 2009 Plone Symposium East, hosted by the awesome WebLion group at Penn State. The tutorial is a hands-on (bring a laptop, and be ready to code) introduction to developing applications using BFG.

The tutorial runs on Thursday, 26 May 2009 (exact times TBD). The cost for the tutorial is $75.00. Register now!

Paul will also be talking about an ongoing BFG customer project, and how it relates to Plone. This talk starts at 11:00 AM on Friday, 28 May 2009.

Hope to see you there!

May 09, 2009 06:38 PM

May 03, 2009

Noah Gift

April 30, 2009

Repoze Project

Six Feet Up on Integrating Zine and Plone Using repoze.zope2 and Deliverance

Calvin Hendryx Parker of Six Feet Up writes a blog entry about integrating the Zine blog engine together with Plone and Deliverance, using repoze.zope2. The blog entry itself is hosted on that setup.

April 30, 2009 04:49 PM

April 29, 2009

Noah Gift

Python Functional Programming Antipatterns: When Closures Can Be A Solution In Search of A Problem (PART 1)

One of the things I don't like about closures [1] (via nested functions) is how they obscure intent in code. For example, if you just want to retain state, why use a closure if you could just use a class? Sure, a closure sounds cooler, but a class or a regular group of functions is often more flexible and readable than a closure.


Example 1: Simple Persistent State
Closure That Stores State: (1A)



In [39]: def outer():
....:     x = 1
....:     def inner():
....:         return x
....:     return inner
....:

In [40]: func = outer()

In [41]: func
Out[41]:

In [42]: func()
Out[42]: 1



Class That Stores State (1B)



In [46]: class State(object):
....:     def __init__(self):
....:         self.x = 1
....:     def func(self):
....:         return self.x
....:
....:

In [47]: func = State().func

In [48]: func()
Out[48]: 1





Score: +1 Class
Summary: If you simply want to persist state, why use a closure that
has an odd signature, when you can simply use a class?


Example 2: Flat is better than nested, closures taken to an extreme with multiple nesting

Nested function w/ function that operates on state (2A)




In [18]: def way_outer():
....:     wo = 1
....:     def inner():
....:         i = 2
....:         def way_inner():
....:             return i + wo
....:         return way_inner
....:     return inner
....:

In [19]: out = way_outer()

In [20]: out
Out[20]:

In [21]: out()
Out[21]:

In [22]: way_way_out = out()

In [23]: way_way_out()
Out[23]: 3



Class functions that operate on state (2B)



In [24]: class Addition(object):
....:     def __init__(self):
....:         self.x = 1
....:         self.y = 2
....:     def add(self):
....:         return self.x + self.y
....:
....:

In [25]: a = Addition().add

In [26]: a
Out[26]: [bound method Addition.add of __main__.Addition object at 0x78db70]

In [27]: a()
Out[27]: 3




Score: +2 Class
Summary: You might be thinking "duh" with this example, but yes, that is exactly the point! Why nest things if you don't have to? Especially since anyone who looks at your highly nested closure will go WTF, as they should. Why make things more complex than they need to be? Just because you can use closures doesn't mean you should, or that it makes your code intuitive and readable.

Example 3: Delaying Execution of a Function
Delayed execution of a function using closures (3A)



In [29]: def outer():
....:     def inner(x):
....:         return x + 1
....:     return inner
....:

In [30]: func = outer()

In [31]: func(3)
Out[31]: 4



Delayed execution of a function using class: Example (3B)


In [1]: class DelayCall(object):
...:     def __init__(self, x):
...:         self.x = x
...:     def logic(self):
...:         return self.x + 1
...:     def delay(self):
...:         return self.logic
...:
...:

In [3]: d = DelayCall(3)

In [4]: func = d.delay()

In [5]: func()
Out[5]: 4




Delayed execution of a function that takes args using a lambda in a class: Example (3C)


In [19]: class LambdaDelayCall(object):
....:     def logic(self, x):
....:         return x + 1
....:     def delay(self):
....:         return (lambda x: self.logic(x))
....:
....:

In [20]: l = LambdaDelayCall()

In [21]: func = l.delay()

In [22]: func(5)
Out[22]: 6



Delayed execution of a function that takes args using a lambda in a class w/ __call__: Example (3D)


In [19]: class LambdaDelayCall(object):
....:     def logic(self, x):
....:         return x + 1
....:     def __call__(self):
....:         return (lambda x: self.logic(x))
....:
....:

In [37]: l = LambdaDelayCall()

In [40]: func = l()

In [41]: func(3)
Out[41]: 4





Score: +3 Class
Summary: While 3A is shorter, I think it is much less clear than 3B, 3C or 3D, the class examples. I think of a closure in terms of an outer function that needs to operate on an inner function, such as in the case of a decorator. Using a closure just to delay execution of a function doesn't seem right to me, and I feel like it obscures the code.

Conclusion:

Closures are useful when you want to modify the state of an inner function and return it, as in the case of decorators. Lambdas, from Learning Python 3rd edition, "are often used as a way to inline a function definition, or to defer execution of a piece of code." While closures are useful and powerful, with power comes responsibility. [1] Nested functions aren't that clear; try a class, or maybe you don't even need a closure? Make sure you use Python for good, not evil. Stay tuned for the next installment.
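
As a small illustration of the decorator case, where a closure does earn its keep, here is a sketch. The names count_calls and add_one are made up for this example; the outer function receives the function to wrap, and the inner wrapper closes over both that function and a piece of state.

```python
import functools


def count_calls(func):
    """Decorator (hypothetical example): count how often func is called."""
    calls = {"n": 0}  # state captured by the closure below

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls["n"] += 1
        return func(*args, **kwargs)

    wrapper.calls = calls  # expose the counter for inspection
    return wrapper


@count_calls
def add_one(x):
    return x + 1


first = add_one(3)
second = add_one(5)
```

Here the nesting is doing real work: the wrapper has to exist per decorated function, so a flat function could not replace it.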

[1] Added note about "nested functions" 04/29/2009


References:
Bruce Eckel on Decorators
Zen of Python
Functools Partial
Dive Into Python: Lambda

by Noah Gift (noah.gift@gmail.com) at April 29, 2009 12:08 AM

April 26, 2009

Noah Gift

Python Artificial Intelligence SIG Weekly Update: 04/26/2009

I am going to attempt to do weekly updates based on the newly formed Python AI SIG, http://groups.google.com/group/aipy . Sunday, New Zealand time, is going to be my day to summarize what is going on.

So far the hot topics are as follows:

1. Getting a code repository set up somewhere to share ideas.
2. Getting some domain set up so we can share and categorize what we learn. Jeff Rush mentioned possibly ai.python.org and using Sphinx. I kind of like that idea.
3. Filtered RSS reader: It seems like getting a bot to pre-filter RSS is low-hanging fruit.
4. Continuous monitoring of the body: We are discussing the feasibility of continuous monitoring of the body. One gotcha so far: what device do we use?

by Noah Gift (noah.gift@gmail.com) at April 26, 2009 07:30 AM

April 24, 2009

Paul Everitt

BFG Tutorial at Plone Symposium next month


The Plone Symposium East 2009 is at Penn State University again next month. Fine conference last year, really enjoyed the people, the atmosphere, the convenience (walkability), and the conversation.

We (Chris/Tres/Paul or some combination thereof, with perhaps other bfgers) are giving a Developing Using BFG tutorial at the conference. It’s gonna rock. Gold, baby, gold.

I’m also giving a conference talk on the large BFG project I’ve been working on, which I suppose I’ll chatter more about in the coming days.

by Paul Everitt at April 24, 2009 06:53 PM

Shane on BFG


Nice article from Shane Hathaway about his experience on a project using BFG.  More on this subject later.

by Paul Everitt at April 24, 2009 05:57 PM

Website for BFG: It’s Alive!


I’ve been working non-stop on a large BFG (*) app for a number of months.  Well, almost non-stop.  I spent a week with Chris on a different BFG project.

Thus, I didn’t do a dang thing to help it, but here it is: a shiny new website for the BFG microframework.  Congratulations to Carlos and Chris for getting this up and running.  (Carlos writes about the site launch.)

BFG has been pretty unique in a number of ways, one of which is apropos for this site launch: Chris actually published the copious, well-written BFG docs just before releasing BFG’s code.  He’s been insistent on keeping the docs up-to-date.

Having a site is a big help.  Thanks guys!

(*) BFG is a Python web application “micro” framework that takes patterns, ideas, and some code from WSGI, WebOb, and Zope.  Keywords: fast, light, documented, tested, and non-judgemental.

by Paul Everitt at April 24, 2009 05:52 PM

Getting back into blogging


Delete faux post.  (Check).  Change title to something faux witty.  (Check).  Upload headshot from picture when boating with my son.  (Check).

Find a way to migrate old links to my blog to this, then get this into new planets and stuff.  Uhhh.  Check?

by Paul Everitt at April 24, 2009 05:35 PM

April 23, 2009

Ben Bangert

Beaker 1.3 is juicy caching goodness

Beaker 1.3 is out; actually, it’s been out for a while and I’m just now getting around to blogging the fact. It’s a shame I’ve been too busy lately to blog this earlier, because in addition to some bug fixes it has some nice new features that make it even easier to use in any Python script/application/framework.

First, to air my dirty laundry, the important bug fixes in Beaker 1.3:

  • Fixed bug with (non-cookie-only) sessions not being timed out
  • Fixed bug with cookie-only sessions sending cookies when they weren’t supposed to
  • Fixed bug with non-auto sessions not properly storing their last accessed time

The worst thing with the first two of these is that they were regressions that snuck in despite unit tests that exercised the code fairly decently. They’re fixed now along with more comprehensive tests to help prevent such regressions occurring again.

Beaker has always had sessions and caching, but except for Pylons I’ve yet to see anyone actually use Beaker’s caching utility. I’ve seen the SessionMiddleware used in other WSGI-based frameworks, but not the caching, which is kind of a shame since it:

  • Supports various backends: database, file-based, dbm file-based, memcached
  • Has locking code to ensure a single-writer, multiple reader model (This avoids the dreaded dog-pile effect that caching systems such as the one in Django experience!)

For clients that hit the cached function while it’s already being regenerated, Beaker serves the old copy until the new content is ready. This avoids the dog-pile effect, and keeps the site snappy for as many users as possible. Since the lock used is disk-based though, this does mean you only avoid the effect per machine (unless you’re locking against NFS or a SAN), so if you have 20 machines in a cluster, the worst the dog-pile effect can get is that you’ll have 20 new copies generated and stored.
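
The serve-the-old-copy behaviour can be sketched in a few lines. This is not Beaker’s code: the class name StaleWhileRegenerating is invented for the example, and it uses an in-process threading.Lock where Beaker is described above as using a disk-based lock.

```python
import threading
import time


class StaleWhileRegenerating(object):
    """Single-value cache: one caller regenerates, the rest get the old copy.

    Illustrative sketch only; names and structure are assumptions.
    """

    def __init__(self, create, expire):
        self._create = create      # callable producing a fresh value
        self._expire = expire      # lifetime in seconds
        self._lock = threading.Lock()
        self._value = None
        self._stamp = None         # time the value was stored

    def get(self):
        if self._stamp is not None and time.time() - self._stamp < self._expire:
            return self._value     # still fresh

        if self._stamp is None:
            # Nothing cached yet: there is no old copy, so callers wait.
            with self._lock:
                if self._stamp is None:
                    self._value = self._create()
                    self._stamp = time.time()
                return self._value

        # Expired: try to become the single regenerating caller.
        if self._lock.acquire(False):
            try:
                self._value = self._create()
                self._stamp = time.time()
            finally:
                self._lock.release()
        # Callers that lost the race fall through with the old copy.
        return self._value


counter = {"n": 0}


def build():
    counter["n"] += 1
    return counter["n"]


cache_entry = StaleWhileRegenerating(build, expire=3600)
```

With a long expiry, repeated get() calls invoke build only once; when the entry has expired and another caller holds the lock, get() returns the previous value instead of blocking.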

Now, in Beaker 1.3, to try and encourage its use a bit more, I’ve added a few decorators to make it easier to cache function results. Also, at Mike Bayer’s suggestion, there are now cache regions to make it easier to define various caching policy short-cuts.

Cache Regions

Cache regions are just pre-defined sets of cache instructions to make it easier to use with your code. For example many people have a few common types of cache parameters they want to use:

  • Long-term, likely to a database back-end (if used in a cluster)
  • Short-term, not cached as long, perhaps to memcached

To set these up, just tell Beaker about the regions you’re going to define, and give it the normal Beaker cache parameters for each region. For example, in this Pylons app, I define 2 cache regions in the INI:


beaker.cache.regions = short_term, long_term
beaker.cache.short_term.type = ext:memcached
beaker.cache.short_term.url = 127.0.0.1:11211
beaker.cache.short_term.expire = 3600

beaker.cache.long_term.type = file
beaker.cache.long_term.expire = 86400

Note: For those wondering about multiple memcached servers, just put them in as the url with a semi-colon separating them.

If you want to use the caching outside of Pylons without middleware (i.e., as a plain library), that’s a bit easier now as well:


from beaker.cache import CacheManager
from beaker.util import parse_cache_config_options

cache_opts = {'cache.data_dir': './cache',
              'cache.type': 'file',
              'cache.regions': 'short_term, long_term',
              'cache.short_term.type': 'ext:memcached',
              'cache.short_term.url': '127.0.0.1:11211',
              'cache.short_term.expire': '3600',
              'cache.long_term.type': 'file',
              'cache.long_term.expire': '86400',
}

cache = CacheManager(**parse_cache_config_options(cache_opts))

And your cache instance is now ready to use. Note that using this cache object is thread-safe already, so you just need to keep one around in your framework/app (Can someone using Django explain where you’d keep a reference to this object around so that you could get to it in a Django view?).

New Cache Decorators

To make it easier to use caching in your app, Beaker now includes decorators for use with the cache object. Given the above caching setup, let’s assume you want to cache the output of an expensive operation:


# Get that cache object from wherever you put it, maybe it's in environ or request?
# In Pylons, this will be: from pylons import cache
from wherever import cache

def regular_function():
    # Do some boring stuff

    # Cache something
    @cache.region('short_term', 'mysearch')
    def expensive_search(phrase):
        # Do lookup with the phrase variable
        return something
    return expensive_search('frogs')

The second argument to the region decorator, ‘mysearch’, isn’t required unless you have two functions of the same name in the same module, since Beaker records the namespace of the cache using the function name + module + extra args. For those wondering what a Beaker namespace is, it’s a single cache ‘block’. That is, let’s say you wanted to cache 4 versions of the same thing, varying them depending on the input parameter. Beaker considers the thing to be a namespace, and the things that change the thing being cached are the cache keys.

Only un-named arguments are allowed on the function being cached. These act as the cache keys so that if the arguments change, a new copy is cached to those arguments. This way you can have multiple versions of the function output cached depending on the argument it was called with.

If you want to use arbitrary cache parameters, use the other decorator:

# Get that cache object from wherever you put it, maybe it's in environ or request?
# In Pylons, this will be: from pylons import cache
from wherever import cache

def regular_function():
    # Do some boring stuff

    # Cache something
    @cache.cache('mysearch', type='file', expire=3600)
    def expensive_search(phrase):
        # Do lookup with the phrase variable
        return something
    return expensive_search('frogs')

This allows you to toggle the cache options per use as desired.

If there’s anything else I can do to make it easier to use Beaker in your application, be sure to let me know. (Yes, I know more docs would help; this blog post was a first attempt to help out on that front, and more docs are on the way!)

by ben at April 23, 2009 05:33 AM

April 21, 2009

Noah Gift

An "Adaptable" Commandline Tool and Web Service Generator

I have been hacking on a Commandline tool and Web Service Generation framework called Adapt. Last night, Adam Shand and I hacked on getting Phase 2 together, which is making a web service automatically serve out the same URLs as a commandline tool's options, which are defined in a config file.

The not-even-alpha-quality code is here: http://bitbucket.org/noahgift/adapt/

The basic problems I am trying to solve are these:

1. Automatically convert existing Python command line tools into Web Services.
2. Create a plug and play WSGI application so I can talk to other WSGI goodies like LDAP auth, debugging, etc.
3. Create Yet Another Web Framework in Python (although it is pretty unconventional)
4. Automatically adapt existing shell scripts, aliases and Python scripts into a dynamically created command line tool.
5. Declarative mapping of actions via a simple config file.
6. Let people who are completely ignorant of Python create command line tools and web services by running a tool.
7. Let people do BOTH, in one step.

Is this crazy...yes. But, it is also fun. Once I complete code coverage to 100% and move common code into a library, and make some "frameworkish" code, I will release it. Feel free to help if you think this sounds fun too.


I named it after my blog post "Adapt or Die", because it was a direct offshoot of my philosophy on tools.

by Noah Gift (noah.gift@gmail.com) at April 21, 2009 10:40 AM

April 20, 2009

Ian Bicking

Treating configuration values as templates

A while back I described fassembler, and one of the things I liked in it is how the configuration works. It uses a conventional declarative INI style but also allows arbitrary code, so that defaults can be based on each other.

Here’s a basic example of a default configuration:

[some_app]
port_offset = 10
port = {{int(section.DEFAULT['base_port'])+int(port_offset)}}

Then if another configuration file defines base_port then this will all resolve. You can do this in Python, but you don’t get sections, and you have to define everything in just the right order. So while base_port will probably be defined in a deployment-specific configuration, it has to be defined before these other derivative settings are defined. On the other hand, you want deployment-specific configuration to take precedence… so there’s really no good ordering.

Anyway, the implementation really isn’t that hard. I use Tempita as the templating language because, well, I wrote it, and because it’s simple and appropriate for small strings. For the configuration parsing, ConfigParser will do.

Here’s what the basic code looks like in ConfigParser:

from ConfigParser import ConfigParser
from tempita import Template

class TempitaConfigParser(ConfigParser):

    def _interpolate(self, section, option, rawval, vars):
        ns = _Namespace(self, section, vars)
        tmpl = Template(rawval, name='%s.%s' % (section, option))
        value = tmpl.substitute(ns)
        return value

Actually instead of using tempita.Template, we could just do eval(rawval, {}, ns), it would just require a lot more quoting (every value would have to be a valid Python expression). Either with that or Tempita the implementation of _Namespace will look the same.

Here’s a simple implementation:

from UserDict import DictMixin

class _Namespace(DictMixin):
    def __init__(self, config, section, vars):
        self.config = config
        self.section = section
        self.vars = vars

    def __getitem__(self, key):
        if key == 'section':
            return _Section(self)
        if self.config.has_option(self.section, key):
            return self.config.get(self.section, key)
        if self.vars and key in self.vars:
            return self.vars[key]
        raise KeyError(key)

    def __setitem__(self, key, value):
        if self.vars is None:
            self.vars = {key: value}
        else:
            self.vars[key] = value

We’ve introduced a magic variable section, which is used to refer to other sections. It looks like this:

class _Section(object):
    def __init__(self, namespace):
        self._namespace = namespace

    def __getattr__(self, attr):
        if attr.startswith('_'):
            raise AttributeError(attr)
        return _Namespace(self._namespace.config, attr, self._namespace.vars)

With these I think you get many of the benefits of using Python code as your configuration format, while still having the benefits of a more declarative approach to configuration, one that allows for forward and backward references.
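
To make the idea concrete, here is a minimal sketch of the eval() variant mentioned above, written against Python 3's configparser rather than the Python 2 ConfigParser used in the post. It does not use Tempita: a small regular expression finds each {{...}} span and evaluates it. The names EvalInterpolation and _EXPR are made up for this sketch, and since it evaluates arbitrary code it should only be pointed at configuration you trust.

```python
import re
from configparser import ConfigParser, Interpolation

# Matches one {{ expression }} span (hypothetical helper for this sketch).
_EXPR = re.compile(r"\{\{(.*?)\}\}")


class _Namespace:
    """Name lookup for eval(): options of one section, plus the magic 'section'."""

    def __init__(self, parser, section):
        self.parser = parser
        self.section = section

    def __getitem__(self, key):
        if key == "section":
            return _Section(self.parser)
        if self.parser.has_option(self.section, key):
            return self.parser.get(self.section, key)
        # KeyError lets eval() fall back to builtins such as int.
        raise KeyError(key)


class _Section:
    """section.NAME returns the namespace of the section called NAME."""

    def __init__(self, parser):
        self._parser = parser

    def __getattr__(self, attr):
        if attr.startswith("_"):
            raise AttributeError(attr)
        return _Namespace(self._parser, attr)


class EvalInterpolation(Interpolation):
    """Evaluate every {{...}} span as a Python expression when a value is read."""

    def before_get(self, parser, section, option, value, defaults):
        ns = _Namespace(parser, section)
        return _EXPR.sub(lambda m: str(eval(m.group(1), {}, ns)), value)


parser = ConfigParser(interpolation=EvalInterpolation())
parser.read_string("""
[some_app]
port_offset = 10
port = {{int(section.DEFAULT['base_port']) + int(port_offset)}}
""")
# A deployment-specific source can be read later and still feed the default.
parser.read_string("[DEFAULT]\nbase_port = 8000\n")
port = parser.get("some_app", "port")
```

Because the expression is only evaluated when the value is read, the order in which the two sources are loaded does not matter, which is the forward and backward reference property described above.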

A full implementation has several more things than I show here, but you can see the full example in my recipes. It also has an example of using INITools instead of ConfigParser to give more accurate filenames and line numbers when there is an exception, while otherwise using the same interface.

by Ian Bicking at April 20, 2009 06:11 AM

April 09, 2009

Noah Gift

Python Artificial Intelligence Special Interest Group Mailing List

From the title of my blog, "Artificial Code", you can tell I am interested in Artificial Intelligence. I have noticed that quite a few people in the Python community are interested in AI too, as evidenced by Raymond's AI talk at PyCon, a recent blog post by Tennessee, and even Guido himself :). So I wondered if there was a mailing list. My Google searches revealed there wasn't, so I started one:

http://groups.google.com/group/aipy

I think it would be fun to share ideas about artificial intelligence and then make real world implementations. We can also share how everyone thinks that people interested in AI are crazy and that it is an unsolvable problem :) If you are interested in both AI and writing Python implementations, then join the list, and let's get to work.

Some of the things I am interested in are solving "low hanging fruit" problems that can be done this year, or sooner.

by Noah Gift (noah.gift@gmail.com) at April 09, 2009 10:52 AM

April 08, 2009

Noah Gift

Short-Circuiting Python Module Lookup Gets 2066% Performance Improvement

Recently I ran nosetests and it took 26 seconds to run. Running it under strace on Ubuntu generated about 53K lines of output.

After some help from a couple of co-workers, I found that I could get the following performance boost. Note that this is on a massive NFS installation, not my laptop...

Total Elapsed Time: 2066 % speed improvement
Lines of strace output: 3050| 1695 % reduction in calls to file system

This was accomplished through a series of tests:

#1 Hijack setuptools by explicitly crafting sys.path to include only the eggs it needs. If you don't use --multi-version it will traverse EVERY egg and put them in your sys.path.
#2 Unset my Python path, eliminating all of the places that were there by default.
#3 Write a bootstrap script that calls python -S "myscript.py" so site-packages don't get looked at.

I was pretty shocked at how big of an improvement this made. The strace output shrank to about 3K lines, which might be even smaller if I only looked at open files.
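
The three steps can be sketched as a small launcher, shown here in Python 3. This is an illustration under assumptions, not the bootstrap script from the post: run_isolated and the egg path are hypothetical names, and the only interpreter features relied on are the -S switch and the PYTHONPATH environment variable.

```python
import os
import subprocess
import sys


def run_isolated(script_args, egg_paths, python=sys.executable):
    """Run a Python child process with -S and a hand-built search path."""
    env = dict(os.environ)
    # Steps 1 and 2: discard any inherited PYTHONPATH and list only
    # the eggs this particular script needs.
    env["PYTHONPATH"] = os.pathsep.join(egg_paths)
    # Step 3: -S skips the automatic "import site", so site-packages
    # and its .pth files are never scanned at startup.
    cmd = [python, "-S"] + list(script_args)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Hypothetical egg path; the child just reports whether site was skipped.
result = run_isolated(
    ["-c", "import sys; print(sys.flags.no_site)"],
    ["/tmp/hypothetical-1.0.egg"],
)
```

The trade-off is the one noted in the post: every dependency has to be listed by hand, so the speed comes at the cost of flexibility.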

There are a few downsides to this, obviously flexibility, but I need the speed. Has anyone done anything similar to hijack Python imports to speed up the time it takes to fire up an interpreter? My hunch is I can go much, much farther.

by Noah Gift (noah.gift@gmail.com) at April 08, 2009 10:01 AM

April 01, 2009

Tres Seaver

Announcing: Zope 4.0 project

On behalf of the Zope community, I am pleased to announce the creation of the "Zope 4.0" project. After extensive discussion with the Zope wizards in conclave at PyCon 2009, the new project's website has been launched!

by tseaver at April 01, 2009 11:48 AM

March 29, 2009

Noah Gift

ReSTless a Restructured Text Preview Cocoa Application

(Note, sorry Planet Python for the previous, accidental cross post from my workout blog...)


One thing I had almost forgotten to blog about: ReSTless. One of my Atlanta area friends, Aaron Hillegass, wrote a really sweet and useful app to demo how to call Python from Cocoa in our book, Python For Unix and Linux System Administration. It turns out to be a really, really useful application that I use all the time to preview Restructured Text while I write it.

I have a link to the source code at the bottom of this post, as well as a link to the docutils source. One thing to watch out for, is to make sure you adjust the MyDocument.m source file to look at where you actually have rst2html.py. In my situation I used this location when I compiled:



[task setLaunchPath:@"/usr/local/bin/rst2html.py"];



One of the other cool things it does is show you how to connect to a piped Python script. Nice work Aaron!

References:
Python For Unix and Linux System Administration Book
Source Code For Python For Unix and Linux System Administration
Latest Docutils Source Code

by Noah Gift (noah.gift@gmail.com) at March 29, 2009 12:52 PM

Week Ending March 29

Monday/5k, Tuesday/6k, Wed/Off, Thursday/6k, Friday/6k, Sat/Off, Sun/25K+
Feeling: 5 days is pretty decent although I could probably do 6. I am very glad I took Saturday off as I had a fairly easy go at running 25k or 16 miles. Here is the map of my hella sweet 25k run around the tip of the North Island of New Zealand:
Notes: I got a tad thirsty, but stopped for water at a public faucet. I also had some pretty killer death-defying, water-defying jumps over rocks. This old man still has the juice :)
http://mapmyrun.co.nz/view_my_run_127983

by Noah Gift (noah.gift@gmail.com) at March 29, 2009 07:30 AM

2009 The Year of WSGI: Adapt or Die

I was just talking to some people recently about running and how the human body adapts to ever increasing loads. Most people can run marathons if they just expose their body to the load of running on a regular basis, it isn't that tricky. Most of those people could probably run an ultramarathon, if they trained for it, I am training for one next year here. There are even some people like Dean Karnazes, that run 50 marathons in 50 states, in 50 consecutive days.

One of the reasons I like running is that it demonstrates to me, on a daily basis no less, the simple axiom "Adapt or Die". If I stop running I will get fat and out of shape, but if I keep running I will keep adapting. One of the things that shocks first-time runners is discovering that they can all of a sudden run for an hour and a half and forget they were even running. This is because our bodies are amazing things that adapt to what we expose them to. If you run enough, it becomes easy to run for hours on end with fairly little effort.

I want that same adaptation in the tools I use. I have been using Unix and Linux for over a decade now, and I like their adaptability. You can reuse existing tools and create pipes, or write your own tools and pipe them into existing tools. It is a surprisingly flexible and powerful system. I like Python because it continuously changes itself, incorporating new modules like the multiprocessing module in Python 2.6, or even completely reinventing itself as it is doing in Python 3k, converting to Unicode and adding many other backwards-incompatible changes.

To me WSGI is also an adaptable tool. It truly symbolizes "adapt or die", in both implementation and philosophy. Because WSGI is both a philosophy and a technology, it draws a different crowd of developers than a pure web framework crowd. Developers that are drawn to WSGI want to use a constantly evolving best-of-breed component stack. For example, SQLAlchemy is developed by a very passionate developer, Michael Bayer, who obviously takes pride in knocking the shit out of one particular problem. Jonathan Ellis solved another problem when he wrote SQLSoup, which dynamically creates mapped objects from on-the-fly database queries with ZERO mapper code. This IS rocket science! Those are the kind of tools I want to use. They are flexible like Unix pipes, and they work in lots of different contexts.

One of the downsides to WSGI and WSGI-inspired frameworks like Pylons, Turbogears 2, BFG and Werkzeug is that they are a bit more difficult to deal with, partly because they don't hide as much of the complexity as something like Django.

On the other hand, I feel that using something like Django or Ruby on Rails is a form of Technical Debt. Technical debt can be an extremely useful concept, just like business debt, as long as you can get a bigger total payoff than the eventual interest you will need to pay on your "loan". One of the side effects of hiding complexity is some form of "lock-in" to the "Django" or "Rails" way, which is a bit too Alan Greenspan on an Ayn Rand Objectivism high for me. The Fountainhead was a cool book, but it was fiction after all. To me web frameworks often offer a false choice to a problem I don't have.

WSGI-inspired frameworks, or just plain tools like WebOb, urlrelay, and repoze.who, embody a different kind of philosophy. They remind me a lot of tools like ls, cat, rm, etc. They do a specific job, but they can also work in a vast array of combinations the authors never dreamed of. In the hands of experienced developers, WSGI tools and frameworks, just like Unix tools, are quite handy. Many of these developers are using WSGI, but you just don't hear about it. Take SQLAlchemy book author Rick Copeland, who uses a tweaked out version of Turbogears 2.0 to write massively complex web apps for Predictix, or Jonathan "Metaclass" LaCour, who solves mind-strangling problems in SQLAlchemy, Elixir, Pylons and Turbogears for ShootQ and major airlines...yup, those kind of airlines. If you're doing something crazy in WSGI, hopefully you will comment on this post.
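
The comparison to Unix pipes can be illustrated with a small sketch in Python 3 that uses only the standard library's wsgiref. The application and the two middleware functions (hello_app, add_header, uppercase) are hypothetical names invented for this example. Each piece does one job and knows nothing about the others, yet they compose by simple wrapping.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A complete WSGI application: one callable, one job."""
    body = b"hello\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def add_header(app, name, value):
    """Middleware that appends one response header to whatever app sends."""
    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            return start_response(status, headers + [(name, value)], exc_info)
        return app(environ, patched_start_response)
    return middleware


def uppercase(app):
    """Middleware that upper-cases the response body (length is unchanged)."""
    def middleware(environ, start_response):
        return [chunk.upper() for chunk in app(environ, start_response)]
    return middleware


def call(app):
    """Invoke a WSGI callable once, without a server, and collect the result."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


# Compose the pieces the way you would chain commands with a pipe.
pipeline = uppercase(add_header(hello_app, "X-Served-By", "sketch"))
status, headers, body = call(pipeline)
```

Either middleware could wrap any other WSGI application unchanged, which is the "combinations the authors never dreamed of" property in miniature.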

I am predicting 2009 will be the year of WSGI, where the WSGI frameworks and tools ultimately establish their philosophical and technical superiority over the walled garden development philosophy, or Ian Bicking's Solution 1 in his 2005 WSGI presentation. If you're a Python developer, adapt....or die...

References

Pylons A Hacker's Framework: http://magazine.redhat.com/2008/11/05/introducing-pylons-a-hackers-web-framework/
Multiprocessing module: http://www.ibm.com/developerworks/aix/library/au-multiprocessing/
Intro to SQLAlchemy: http://www.ibm.com/developerworks/aix/library/au-sqlalchemy/
SQLAlchemy Book: http://oreilly.com/catalog/9780596516147/
Pylons Book Hard Copy: http://www.apress.com/book/view/9781590599341
Pylons Book Online: http://pylonsbook.com/
Ian Bicking WSGI Presentation 2005: http://ianbicking.org/docs/pycon2005/wsgi-presentation/slides.html

by Noah Gift (noah.gift@gmail.com) at March 29, 2009 12:57 AM

March 25, 2009

Repoze Project

Sprints at PyCon 2009

We are looking forward to working with a number of folks at the sprints following PyCon this year.

There are seventeen folks signed up to date to work on Pylons, Repoze, TurboGears, and the other WSGI-centric Python web frameworks. One exciting opportunity is to have a lot of cross-pollination between the frameworks, looking for ways to work together and share more code.

March 25, 2009 03:20 AM

March 16, 2009

Tres Seaver

Categorizing Packages: Clarifying Terms

Some of the friction which comes up on the zope-dev list, apparently due to different goals, might in fact be due to some confusion in the terms we use to talk about our goals, and about the shared software we manage toward those goals. In the interests of reducing the friction, I would like to sketch out how I am using those terms.

by tseaver at March 16, 2009 09:02 PM

March 13, 2009

Repoze Project

Mikko Ohtamaa: Setting up Plone 3.2 and Repoze, hackyish

Mikko Ohtamaa provides a blog entry about how to set up Plone 3.2 under repoze.zope2.

March 13, 2009 06:58 PM

Noah Gift

Dynamic Package Management Configuration For Library End Users Considered Harmful

I was just reading, by pure chance, a post by Michele Simionato, in which he mentioned that twill, at least at the time of the article, bundled inside its package the two dependencies it used. I think this is a wise choice.

I think it is about time the Python community embraces the fact that dynamically configuring dependencies during network installation is a really, really bad idea for end users of that library. For the actual developers of the library, using something like easy_install is absolutely brilliant, as it allows you to automatically pull in packages that your library depends on. It should stop here though.

From an engineering standpoint, if you want to ensure something like 99.99999% reliability for your product, you don't perform network transactions with clients in an unknown state. You do that BEFORE you release the package to the general public, and then you release that bundle to the general public.

There is a downside to this, as it makes for more work for the developers of the library. They have to actually download each of the dependencies, test their library, and package it up as a big tar file. In addition, a user cannot easily swap out one of the dependencies; they would need to download the bundle again. I think this hassle is well worth it compared to the "flexibility" that is currently offered. What makes it worse is that many library developers have no idea that they turn hundreds or thousands of users away from their framework or library, because they do a complex network transaction with clients in an unknown state and things blow up.

If you're a developer of a web framework or library that depends on a huge pile of dependencies, consider offering a plain tar file prepopulated with all of your dependencies, or a plain tar file with a bootstrap.py of virtualenv that bootstraps the files inside of your tar file. It looks like Pip is a step in the right direction, but I still think the burden should be placed on the developers, NOT the end users of the library. If a package install blows up while downloading, the author of the package is the best person to troubleshoot this, not the thousands of people who try out the package for the first time.
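
As a rough sketch of what installing from such a bundle can look like with a current pip (an assumption, since the post predates these workflows), the command below restricts pip to files already unpacked on disk. The helper offline_install_command, the ./deps directory, and the package name mylibrary are all hypothetical; --no-index and --find-links are existing pip options.

```python
import sys


def offline_install_command(bundle_dir, requirement, python=sys.executable):
    """Build a pip command that installs only from files in bundle_dir."""
    return [
        python, "-m", "pip", "install",
        "--no-index",                 # never contact a package index
        "--find-links", bundle_dir,   # resolve dependencies from the bundle
        requirement,
    ]


# Hypothetical bundle directory and package name, for illustration only.
cmd = offline_install_command("./deps", "mylibrary")
```

Passing the resulting list to subprocess.run would perform the install with no network transaction, so a failure on the client cannot come from a download going wrong.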

Please, stop the insanity!

by Noah Gift (noah.gift@gmail.com) at March 13, 2009 11:19 AM

March 12, 2009

Noah Gift

Wellington Python User Group Notes March 2009: Multiprocessing, Optparse, Timeline Visualization, and Enterprise Pylons

At the third ever Wellington, NZ Python User Group meeting, we talked about the multiprocessing module, command line tool tricks and visualization. It was a hoot, here are the notes.

First, a couple of co-workers, Teijo Holzer and John van Leeuwen, and I gave a three part presentation on the multiprocessing module.

I gave an example of a tool I wrote to asynchronously fork Net-SNMP called "multicore snmp". Teijo gave a presentation on shared data structures, which are available via a simple import of Process, Value, and Array from the multiprocessing module.

Finally, John gave a presentation about how generator expressions fit quite nicely with Process Pools. Brush up on the nested generator expressions, because functional programming is back with a bang. All in all the crowd was quite blown away with how "sick" the multiprocessing module was, in a good way.

Next, Stephen Judd gave a talk on how he creates a powerful boilerplate template for command line tools using optparse, profile, doctests, and friends. He promised to post the code to the NZPUG wiki.

Finally, Richard Clark gave a quick presentation on Event Timeline charting. He showed some pretty cool graphs generated from timelines using, I believe, Chart Directory...but I could be wrong. He also mentioned Panda 3D, which graphs some killer 3D and has a simple Python API. Check out the Hello World example!

Start slight diatribe:

We wrapped up the meeting with some impromptu talk on Web Frameworks, and I mentioned Pylons/Turbogears is the one to watch at the moment. For me WSGI and WSGI middleware like Repoze.who or Urlrelay + SQLAlchemy + a really experienced, mature, and smart developer community, makes Pylons enterprise ready from day one, for almost any task you could throw at it. Combine that with SQLSoup, which can do crazy stuff like doing a join on two different tables on two different databases living on the same server in MySQL, and it makes it a no brainer for almost anything I want a web framework for.

Here, now, is some proof about Pylons being able to do incredible stuff. One slightly secret, skunkworks Turbogears 2.0/Pylons project is ShootQ, which was created by Python mastermind Jonathan LaCour. I totally agree with Adrian Holovaty when he says it is not about the tools you use, it is about the sites you create.

ShootQ rocks, and Jonathan took the enterprise quality toolset of Turbogears 2.0/Pylons, and created an awesome showcase site for anyone that wants to question if Turbogears 2.0/Pylons is the real deal. Banks, film studios, and airlines, as well as Web 2.0 companies are using it, and not only does it work well, but it solves complex problems that are not easily solvable in other web frameworks! I like to call it the hacker's framework, but you could also get away with calling it Python's enterprise web framework too. I would even go so far as to say, if you work at a major corporation and you decide to use a Python web framework, you need to look at Pylons, or you haven't done due diligence. It is that good.

End slight diatribe:

Back on topic, if you're on our side of the world in Wellington on the second Thursday of each month...drop by!

by Noah Gift (noah.gift@gmail.com) at March 12, 2009 10:51 AM

March 11, 2009

Noah Gift

Review of Expert Python Programming by Tarek Ziadé

I must at first admit some bias with this review, as Tarek and I both had Shannon -jj Behrens as technical editors on our books: Expert Python Programming and Python For Unix and Linux System Administration. Packt Publishing also gave me a free copy, which was nice. This kind of cancelled itself out though, as I liked it so much I bought a PDF version so I could quickly refer to it over and over again.

I think the book rocks! I learned a lot of real world stuff that I had not been exposed to before reading his book. It is a much different book than say, the Python Cookbook, or Learning Python, but that is a good thing! It is good to read widely different types of books on Python. I read a lot of books, probably too many in fact, and I can't think of a Python book that I read that I didn't like in some way.

Having written a book (actually, I am on my second), I can attest it is tough. Python is a language that most, if not all, people taught themselves outside of work time. Python authors are perhaps some of the most abnormally motivated people you will ever meet as a result, and I really haven't read a bad Python book yet. This book in particular showcases Tarek's unique set of skills as a programmer. I gather this book is modeled after exactly how he writes code, and that is a good thing!

I was especially happy with how Tarek covered things tangentially associated with Python like buildbot, and mercurial, as it means I could drop this book off to someone in a company, and say, "take a look at a book that covers the latest developer techniques in Python", and be satisfied they could stick with just reading that book for a while, and learn a whole lot about real world Python programming.

If you are a serious Python programmer, or want to be a serious Python programmer, this is a book you need to have on your bookshelf. Thanks for writing it Tarek!

by Noah Gift (noah.gift@gmail.com) at March 11, 2009 10:35 AM

March 10, 2009

Noah Gift

Python Packaging Issues Survey

I did my civic duty and filled out the Python Packaging Survey set up by Tarek and Massimo. Here were my biggest issues:

1. End users of a library should not download a library, say a web framework, which then downloads a horde of components. This is just so, so wrong from an engineering standpoint. Imagine configuring the Space Shuttle parts as it began to launch. This should happen at some point, sure, but not EVERY SINGLE TIME you install a library. The failure rate for some applications installing from PyPI for me is over 70%, when it should be close to zero.

2. It is damn tricky to register a package on PyPI. Why?

3. CPAN has like a billion mirrors. We have PyPI.

4. There is no real quality control, or feedback mechanism for a package. It would be AWESOME, if the install mechanism sent a traceback up on the same page as the package entry on PyPI, as a mark of shame, when a package failed to install. I bet there are some packages that would get buried in tracebacks.

5. Operations are not atomic. Often I will get partially through an install with easy_install, something blows up, and then I have a big pile of Python to sift through.

6. There should be some parity with operating system package management tools. Why not have the package manager also make RPMs automatically, and let them be aware of each other? I like RPMs and deb packages....they just work.

7. Put a customer satisfaction survey up on the site. Ban anonymous comments, but allow registered users, under their real names, to say what they thought. Did the package suck beyond all recognition, or was it manna from heaven?

Btw, I really enjoy easy_install most of the time, but when I have to deal with something that pulls down 50 other things, I want to scream...my God, what have you done, why, why.....

by Noah Gift (noah.gift@gmail.com) at March 10, 2009 11:31 AM

Review of Essential SQLAlchemy

I have been meaning to write a review on Essential SQLAlchemy by Rick Copeland for a while now, so I am just going to do it.

Recently, I have been writing code that needs to talk to legacy databases. The kind of work I have to do, usually requires the creation of a bunch of command line tools, and so using a standalone ORM is really the only option. I pulled out my copy of Essential SQLAlchemy and got to work.

First, I read through the beginning of the book, and then skimmed through examples of some of the queries. Finally, I got into the SQLSoup section in Chapter 10, and really got down and dirty. If you haven't yet used SQLSoup, and you have a legacy database, then you are in for a treat. Rick describes how powerful and easy SQLSoup is to use in Chapter 10, and gives it a great treatment.

Basically, you do something like this:



from sqlalchemy.ext.sqlsoup import SqlSoup
db = SqlSoup('mysql:///SarahPalinIsANutter.db')



You then simply query Tables as attributes, and you get back objects "magically"


for row in db.ShootingWolves.all():
    print row.FiftyCaliberAmmo



One topic I wish Rick had covered more, though, was how to do joins on different databases on the same database server...especially if this is possible with SQLSoup. Jonathan Ellis, if you're out there, how do I do the equivalent of this SQL with two MySQL databases in SQLSoup:


SELECT p.ShotID, s.shot_ID FROM Production.Production p JOIN
Shots.Shots s on p.ShotID = s.shot_ID;



Update: You can do this. I just didn't know how:


db.schema = 'production'
p = db.production
db.schema = 'shots'
s = db.shots



Jonathan Ellis has this new piece of SQLSoup goodness in the trunk now:



p = db.entity('production', 'production')
s = db.entity('shots', 'shots')



where entity is

def entity(self, tablename, schema=None):

Excellent! I love SQLSoup even more, if that was possible.
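
For readers who want to try the shape of that cross-database join without a MySQL server, here is a small Python 3 sketch that swaps in the standard library's sqlite3 module. It is not SQLSoup: two in-memory SQLite databases are attached under the names production and shots as stand-ins for the two MySQL databases, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each ATTACH creates a separate in-memory database with its own name,
# playing the role of one MySQL database on the same server.
conn.execute("ATTACH DATABASE ':memory:' AS production")
conn.execute("ATTACH DATABASE ':memory:' AS shots")
conn.execute("CREATE TABLE production.production (ShotID INTEGER)")
conn.execute("CREATE TABLE shots.shots (shot_ID INTEGER)")

# Invented sample data: only IDs 2 and 3 exist on both sides.
conn.executemany("INSERT INTO production.production VALUES (?)",
                 [(1,), (2,), (3,)])
conn.executemany("INSERT INTO shots.shots VALUES (?)",
                 [(2,), (3,), (4,)])

# The same join as in the SQL above, qualified by database name.
rows = conn.execute(
    "SELECT p.ShotID, s.shot_ID "
    "FROM production.production p "
    "JOIN shots.shots s ON p.ShotID = s.shot_ID "
    "ORDER BY p.ShotID"
).fetchall()
```

The point of the sketch is only that a table is addressed as database.table on both sides of the join, which is what the schema argument to entity selects in the SQLSoup version.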



I also wrote an entry level article on SQLAlchemy here, for IBM Developerworks. I do cover the declarative syntax, so you can refer to this article, the book, and the excellent online docs for SQLAlchemy and be pretty much set.

Update: Oh, and you should buy this book. It is a must have for any SQLAlchemy developer!

by Noah Gift (noah.gift@gmail.com) at March 10, 2009 11:05 AM

March 04, 2009

Noah Gift

Evil Python with the Property Decorator

I was showing Python today to someone who has a background in Perl, and got to the topic of how the property decorator is nice syntactic sugar for generating a read-only attribute that gets calculated when the attribute is accessed. He accidentally thought of this example next:


In [1]: class EvilPython(object):
   ...:     def __init__(self, x = 3):
   ...:         self._x = x
   ...:     @property
   ...:     def x(self):
   ...:         return self._x * random.random()

In [2]: import random

In [29]: val = EvilPython()

In [30]: val.x
Out[30]: 2

In [31]: val.x
Out[31]: 2

In [32]: val.x
Out[32]: 1

In [33]: val.x
Out[33]: 0

In [34]: val.x
Out[34]: 1



I had never thought to try to make "Evil" out of using properties, but creating random values for an attribute of an object could really frustrate someone for quite some time. Don't try this at home!
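
For anyone who wants to reproduce the effect outside an interactive session, here is a standalone Python 3 sketch of the same class. As written, random.random() returns a float, so each read of x yields a float somewhere in the range from 0 up to (but not including) 3.

```python
import random


class EvilPython:
    def __init__(self, x=3):
        self._x = x

    @property
    def x(self):
        # Recomputed on every attribute access, so two reads rarely agree.
        return self._x * random.random()


val = EvilPython()
# Five reads of the "same" attribute give five different values.
readings = [val.x for _ in range(5)]

# The property has no setter, so assignment fails.
try:
    val.x = 10
    assignable = True
except AttributeError:
    assignable = False
```

The attribute looks like plain data to the caller, which is exactly why a value that changes on every read is so hard to debug.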

by Noah Gift (noah.gift@gmail.com) at March 04, 2009 10:34 AM

March 02, 2009

Chris McDonough

Setup.py Blues

Hey, imagine, you want to run the setup.py of a Distutils-packaged package and your current directory is not the package's directory:

  [chrism@vitaminf env26]$ bin/python ../supervisor2/setup.py develop
  running develop
  error: error in 'egg_base' option: 'src' does not exist or is not a directory

Oh well of course. That makes sense. This is a really advanced use case. I mean, it's really hard to... you know..... not do the most idiotic thing on the planet. So, let's just cd to the package directory when we want to run its setup.py, over and over, forever, til the end of time.

Sometimes I think we might have been better off without any Python packaging solution.
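
One way to stop typing cd by hand is a tiny wrapper that does it for you. This is a Python 3 sketch, not anything Distutils or setuptools provides; run_setup is a hypothetical helper name. It runs a setup.py with the working directory set to the directory that contains it.

```python
import os
import subprocess
import sys


def run_setup(setup_py, *args, python=sys.executable, **kwargs):
    """Run a setup.py from its own directory, wherever the caller is."""
    setup_py = os.path.abspath(setup_py)
    return subprocess.run(
        [python, setup_py, *args],
        # Relative paths such as 'src' now resolve next to setup.py.
        cwd=os.path.dirname(setup_py),
        **kwargs
    )
```

With this, something like run_setup("../supervisor2/setup.py", "develop") would behave as if it had been invoked from inside the package directory.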

by chrism at March 02, 2009 02:18 PM

February 23, 2009

Ben Bangert

Pylons 0.9.7 Released

I’m pleased to announce, after a rather lengthy release candidate period, that Pylons 0.9.7 is finally out. Pylons 0.9.7 brings a good number of changes to Pylons from 0.9.6 while still retaining a fairly hefty amount of backwards compatibility to ensure a mostly painless upgrade.

Some helpful documentation on the new release:

Major changes in 0.9.7:

  • Switched to using WebOb for the request/response object
  • Various performance improvements to object initialization
  • Beaker and Routes updates
  • Middleware improvements, and optimizations

This is a huge step forward for Pylons, and I’d like to thank all of the contributors who have helped make Pylons what it is today. We’ve knocked off more bugs for this release than any before, which shows just how far the Pylons community has come:

  • 0.9.5 tickets: 45
  • 0.9.6 tickets: 64
  • 0.9.7 tickets: 160

And we have finally made a huge dent in the historical “lack of docs” problem that Pylons previously suffered from with the new Sphinx generated docs and a comprehensive Pylons book.

The full changelog below describes the major changes (look for the bits marked with WARNING that might affect backwards compatibility).

0.9.7 (February 23, 2009)

  • WARNING: A new option is available to determine whether or not an action's arguments should be automatically attached to 'c'. To turn off this implicit behavior, set config['pylons.c_attach_args'] = False in environment.py. This is set to True by default.
  • WARNING: Fixed a minor security hole in the default Pylons error page that could result in an XSS security hole.
  • WARNING: Fixed a security hole in the default project template to use the StaticURLParser to ensure arbitrary files can’t be sent.
  • WARNING: Refactored PylonsApp to remove legacy PylonsApp, moved session/cache and routes middleware into the project template. This will require projects to be updated to include those 3 middleware in the projects middleware.py.
  • Changed to using WebTest instead of paste.fixture for app testing.
  • Added render_mako_def to render def blocks within a mako template.
  • Changes to cache_decorator and cached_template to support Beaker API changes in version 1.1. 1.0.3 is still supported.
  • Fix HEAD requests causing an Exception as if no content was returned by the controller. Fixes #507. Thanks mvtellingen, Petr Kobalicek.
  • Fix a crash when returning the result of ``etag_cache`` in a controller. Fixes #508.
  • “response” flag has been removed from pylons.decorators.cache.beaker_cache, as it sends all headers along unconditionally including cookies; additionally, the flag was taking effect in all cases previously so prior versions of beaker_cache are not secure.

    In its place, a new option “cache_headers” is provided, which is a tuple of specific header names to be cached. It defaults to ('content-type', 'content-length').

  • “invalidate_on_startup” flag added to beaker_cache, which provides a “starttime” to the cache such that when the application is started or restarted, the cache entry is invalidated.
  • Updating host to use 127.0.0.1 for development binding.
  • Added option to specify the controller name with a controller variable in the controller’s module. This name will be used for the controller class rather than the default naming scheme.
  • setup.py egg_info now restores projects’ paster_plugins.txt, allowing paster shell to work again after the egg-info directory was lost. fixes #282. Thanks sevkin.
  • The paste_deploy_config.ini_tmpl template is now located at package/config/deployment.ini_tmpl for new projects.
  • Project’s default test fixtures no longer hardcode test.ini; the ini file used can now be specified via the nosetests --with-pylons argument (defaults to test.ini in setup.cfg). Fixes #400.
  • @validate now defaults to translating FormEncode error messages via Pylons’ gettext catalog, then falls back to FormEncode’s. fixes #296. Thanks Max Ischenko.
  • Fixed SQLAlchemy logging not working in paster shell. Fixes #363. Thanks Christoph Haas.
  • Added optional engine initialization, to prevent Buffet from loading if there’s no ‘buffet.template_engines’ in the config.
  • Updated minimal template to work with Tempita and other new templating changes.
  • Fixed websetup to parse location config file properly when the section isn’t ‘main’. Fixes #399.
  • Added default Mako filter of escape for all template rendering.
  • Fixed template for Session.remove inclusion when using SA. Fixed render_genshi to properly use fragment/format options. Thanks Antonin Enfrun.
  • Remove template engine from load_environment call.
  • Removing template controller from projects. Fixes #383.
  • Added signed_cookie method to WebOb Request/Response sub-classes.
  • Updated project template to setup appropriate template loader and controller template to doc how to import render.
  • Added documentation for render functions in pylons.templating.
  • Adding specific render functions that don’t require Buffet.
  • Added forward controller.util function for forwarding the request to WSGI apps. Fixes #355.
  • Added default input encoding for Mako to utf-8. Suggested in #348.
  • Fixed paster controller to raise an error if the controller for it already exists. Fixes #279.
  • Added __init__.py to template dir in project template if the template engine is genshi or kid. Fixes #353.
  • Fixed jsonify to use application/json as it's the proper mime-type and now used all over the net.
  • Fixed minimal template not replacing variables properly. Fixes #377.
  • Fixed @validate decorator to no longer catch exceptions should they be raised in the action that is supposed to display a form. Fixes #374.
  • Fixed paster shell command to no longer search for egg_info dir. Allows usage of paster shell with installed packages. Suggested by Gavin Carothers.
  • Added mimetype function and MIMETypes class for registering mimetypes.
  • WARNING: Usage of pylons.Response is now deprecated. Please use pylons.response instead.
  • Removed use of WSGIRequest/WSGIResponse and replaced with WebOb subclasses that implement methods to make it backwards compatible with the Paste wsgiwrappers.
  • Fixed missing import in template controller.
  • Deprecated function uses string substitution to avoid Nonetype error when Python optimization is on. Fixes #334.
  • E-tag cache no longer returns Content-Type in the headers. Fixes #323.
  • XMLRPCController now properly includes the Content-Length of the response. Fixes #310, thanks Nicholas.
  • Added SQLAlchemy option to template, which adds SQLAlchemy setup to the project template.
  • Switched project templating to use Tempita.
  • Updated abort/redirect_to to use appropriate Response object when WebOb is used.
  • Updated so that 404’s properly return as Response objects when WebOb is in use instead of WSGIResponse.
  • Added beaker_cache option to avoid caching/restoring global Response values that were present during the first cache operation.
  • Adding StatusCodeRedirect to handle internal redirects based on the status code returned by the app. This replaces the use of ErrorDocuments in projects.
  • Refactored error exceptions to use WebError.
  • WSGIController now uses the environ references to response, request, and the c object for higher performance.
  • Added optional use of WebOb instead of paste.wsgiwrapper objects.
  • Fixed bug with beaker_cache defaulting to dbm rather than the beaker cache app-wide default.
  • The --with-pylons nose plugin no longer requires a project to have been registered with setuptools to work.
  • The config object is now included in the template namespace.
  • StaticJavascripts now accepts keyword arguments for StaticURLParser. Suggested by Marcin Kasperski.
  • Fix pylons.database.AutoConnectHub’s doInTransaction not automatically connecting when necessary. Fixes #327.
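
For readers unfamiliar with the idea behind the signed_cookie entry above, here is a minimal sketch of signing and verifying a cookie value with an HMAC. It is not the Pylons or WebOb implementation; the function names, the secret, and the "value|signature" layout are all assumptions made for illustration.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical secret; a real app would load it from config


def sign_value(value, secret=SECRET):
    """Return 'value|signature', where signature is an HMAC-SHA256 hex digest."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "%s|%s" % (value, digest)


def verify_value(signed, secret=SECRET):
    """Return the original value if the signature matches, otherwise None."""
    value, sep, digest = signed.rpartition("|")
    if not sep:
        return None  # no separator, so there is no signature to check
    expected = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information about the expected digest
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

The point of the signature is that the client can read the cookie but cannot change it without the server noticing.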

by ben at February 23, 2009 08:29 PM

February 21, 2009

Noah Gift

Process Based Asynchronous Python Net-SNMP

I have been meaning to wrap up Net-SNMP for quite some time with the fancy new multiprocessing library in Python 2.6, and I finally got around to making a prototype:

http://code.google.com/p/multicore-snmp/

This may be exciting to .0001 of Python programmers, but I think SNMP is pretty sweet. The current API is a bit rough, but I plan on polishing things up shortly and making it into a PyPI release. If you want to give it a go, just look at the main function, and pass in a list of hosts that are either hostnames or full instances of SnmpSession.

One of the main motivations for wrapping it up in the multiprocessing library is that the current Net-SNMP bindings are synchronous. Using the multiprocessing library effectively makes Net-SNMP actually usable from Python, and opens it up to working on a box with, say, 8 cores. I based my initial API on one of the processing pool examples in the official docs.
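
The general pattern looks roughly like the sketch below. This is not the code from the multicore-snmp project: query_host is a hypothetical stand-in for a blocking Net-SNMP call, and the host names are made up. It only shows how a process pool spreads synchronous calls across workers.

```python
import multiprocessing


def query_host(hostname):
    """Stand-in for a blocking SNMP GET; a real version would call the Net-SNMP bindings."""
    # Hypothetical result format: (hostname, fake value)
    return hostname, "sysDescr for %s" % hostname


def query_all(hostnames, processes=4):
    """Run the blocking query for every host in a pool of worker processes."""
    pool = multiprocessing.Pool(processes=processes)
    try:
        # map() blocks until every worker has returned its (hostname, value) pair
        return dict(pool.map(query_host, hostnames))
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(query_all(["router1", "switch2", "server3"]))
```

Each worker is a separate process, so one slow host does not hold up the queries to the others.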

If you want to help or have ideas about how the API should work, let me know.

by Noah Gift (noah.gift@gmail.com) at February 21, 2009 04:51 AM

February 14, 2009

Chris McDonough

zope.pipeline

Shane Hathaway has been working on a system that can be used to compose a Zope publisher configuration out of WSGI middleware implementations.

This is an implementation of an idea that I wanted to pursue last year but Shane beat me to it, and it looks pretty nifty!

I'm not certain that all the current Zope publisher behavior belongs in endware but it's interesting to see it take shape. One fallout from such an effort may be that more stuff that's currently "locked up" in the Zope publisher may fall out as reusable WSGI middleware, which could only mean good things.

by chrism at February 14, 2009 12:03 PM

February 11, 2009

Noah Gift

My Robust Publishing System: HG, Sphinx, reStructuredText, and Latex

I have been extremely impressed with Sphinx lately. I currently use it for most of my documentation and writing needs. I have a very simple workflow that publishes documentation in the form of a Sphinx web bundle and a PDF, all automatically. Here is my workflow:

1. Get Sphinx: http://sphinx.pocoo.org/
2. Download Latex: http://www.latex-project.org/
3. Start a new project, using the sphinx-quickstart command
4. Create a document in reStructuredText, and link to it from the index.rst file you created.
5. I then just type the ZSH alias:


buildcommit


This is what my .zshenv looks like:


#web dev documentation
# explicit cd, so the aliases do not depend on zsh's AUTO_CD option
alias buildweb="cd /usr/home/ngift/public_html/notes;sphinx-build source build"
alias webcheckin="cd /usr/home/ngift/public_html;hg ci -m 'checkin web'"
alias makepdf="cd /usr/home/ngift/public_html/notes;sphinx-build -b latex source build;cd build;make"
alias buildcommit="buildweb;webcheckin;makepdf"


6. I then get a nice PDF manual that matches my beautiful web site. The Sphinx developers have done an exceptional job, and the tool makes self-publishing trivial. In addition, my lazy alias is awesome because it automatically checks every changed file in to my hg repository and builds the web pages and the PDF in about 5-10 seconds.
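
If you would rather drive the two Sphinx builds from Python than from shell aliases, a minimal sketch might look like the following. The function names and the default "source" and "build" directory names are assumptions for illustration; the sketch covers only the two sphinx-build steps, not the hg checkin or the final make.

```python
import subprocess


def sphinx_command(builder, source_dir="source", build_dir="build"):
    """Return the sphinx-build argument list for one builder ('html' or 'latex')."""
    return ["sphinx-build", "-b", builder, source_dir, build_dir]


def build_all(project_dir):
    """Build the HTML site, then the LaTeX sources, inside project_dir."""
    for builder in ("html", "latex"):
        # check_call raises CalledProcessError if sphinx-build exits with an error
        subprocess.check_call(sphinx_command(builder), cwd=project_dir)
```

Calling build_all on the notes directory would then run both builds in order and stop at the first failure.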

by Noah Gift (noah.gift@gmail.com) at February 11, 2009 09:34 AM

February 09, 2009

Chris McDonough

The Zope Book Needs A Maintainer

For years I have hosted an updated copy of the Zope Book (2.7 edition) and from what I gather from my webserver logs it's still in pretty heavy usage. That said, as may be evidenced by the version number (Zope 2 is at version 2.12 now), I have ceased maintaining it. It needs a new home. Are you looking to adopt it?

By the way, External Editor also still needs a maintainer.

by chrism at February 09, 2009 08:03 PM

February 08, 2009

Chris McDonough

Amendments for "What's Your Web Framework Doing"

Sorry folks, it turns out that the discard_first_request flag in repoze.profile wasn't working correctly in my last round of tests, which skewed the absolute number of profile lines emitted by each profile run for all frameworks. The previous conclusion counted profile lines which were only meaningful for the first request. This was due to a bug in repoze.profile. Since they only happen on the first request, these lines are not indicative of reality, so I've redone the tests.

I've released a fixed version of repoze.profile (0.7), and I've updated the numbers using this version. This time, however, the first request is actually discarded, so its function calls don't show up in the profiler output. The resulting numbers, linked to detailed output, are below:

Links to those numbers lead to repaired profile output as well as enough information to reproduce the test on your own system. I've amended the previous blog post with a pointer to this one. Note that instead of using ab -n1000 -c4, I used ab -n1001 -c4 in order to show even numbers for most of the lines in each profile result.

The relative numbers between frameworks are not wildly different (previously they had been bfg 50, django 105, pylons 163, and grok 834; the relative "ranking" remained the same). However, Grok definitely got the short end of the stick on the last round of tests; it now fares better by a little less than 3X (apologies, Grok folks). It should also be noted that the Pylons trunk fares better on this test if you also use the Routes trunk and the WebOb trunk. Django fared better by half in this round as well.

by chrism at February 08, 2009 03:51 PM

February 07, 2009

Chris McDonough

What is *Your* Web Framework Doing Under the Hood?

NOTE : The numbers present in this blog post have been removed. They just weren't indicative of reality. New numbers are available from http://plope.com/whatsitdoing2 . However, I've retained the narrative here for context.

It can be a bit useless to benchmark web application frameworks. When you're committed to a particular framework, either it works or it doesn't for your particular application; often raw speed is not really a concern. You're probably not going to switch web frameworks in the middle of a project in order to get a 15% or even a 50% or 100% speed increase: you've got too much investment in the code that works under the framework to consider it. In my experience, very few people truly understand more than one web framework, and they tend to use that framework for everything even if it's slightly less optimal for any specific task; this is because the "switching cost" to go to another one is so high. So benchmarks aren't really all that interesting in the "real" web world; it all depends on context.

But if you haven't chosen a web framework yet (is there anyone?), or if you're falling out of love with your current web framework and you're considering using a different one, you might be able to learn something from profiling an application running under various frameworks nonetheless, even if you ignore the raw speed of the framework itself.

One measure of what you're going to be faced with when your web application framework doesn't work as advertised is the complexity of what it does to render a very simple page. If it does a lot of work to render a very simple page, you might need to understand a lot if it breaks or if you need to extend it. If it does very little work, it's likely it will be easier to fix and/or extend than one that does a lot of work. Additionally, it's usually a corollary that the less work an application server does to render a response, the faster it will render that response. But that's not the point here; we're only concerned about work done.

I am the primary author of one of the frameworks shown here (repoze.bfg). As a result, "lies and benchmarks" applies, of course, more than it otherwise would. But I've done my level best to make each framework do as little work as possible to render the page. If I've not, I'm sure the various framework authors will tell me how to improve things.

I've done the following:

  • I've created four "hello world" applications (one for each framework tested). These applications are available at http://svn.repoze.org/whatsitdoing in the directories named after their respective framework. The four frameworks that I wrote applications for were repoze.bfg, Grok, Pylons, and Django.
  • I ensured I could run each using a WSGI server in order to be able to use http://pypi.python.org/pypi/repoze.profile.
  • I placed the http://pypi.python.org/pypi/repoze.profile WSGI middleware into the pipeline in various ways within each application I tested.
  • I ran "ab" against the "hello world" page of a running instance of each application (via "ab -n1000 -c4") with the profiling middleware turned on within the WSGI pipeline.
  • I scraped the output of the "__profile__" page provided by repoze.profile after the "ab" run for each framework was completed. This output gives a rough indication of how much work was done during the run of "ab". repoze.profile uses the Python "profile" module to peek in to see what an application is doing under the hood.
  • I counted the number of lines outputted by the profiler (not including header information). This is a rough estimate of how "broad" the software is. Each line represents a specific function called. So the more lines, the more functions called, and (to some extent) the more you'll need to understand when it doesn't work right or when you need to change what it does or just plain-old understand what it does. By this measure, fewer lines is probably better. Of course, some frameworks might defer doing "one-time" work until the first request that others don't, so this isn't a perfect metric. That said, why would they? Why not get it over with at startup time?
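
To get a feel for the "count the profiler lines" metric without installing anything, here is a rough sketch. It is not how repoze.profile works internally: it uses the standard library's cProfile rather than the "profile" module, calls a toy WSGI app directly instead of going through a server and "ab", and all the names in it are made up for illustration.

```python
import cProfile
import pstats


def hello_app(environ, start_response):
    """A trivial WSGI application standing in for a framework's hello-world page."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!"]


def count_profiled_functions(app, requests=10):
    """Profile `requests` calls to a WSGI app and return how many distinct functions ran."""
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}

    def start_response(status, headers):
        return None

    # Warm-up request, left out of the profile so one-time setup work is not counted.
    list(app(environ, start_response))

    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(requests):
        list(app(environ, start_response))
    profiler.disable()

    # Stats.stats maps (file, line, function name) to call and timing data,
    # so its length is the number of distinct functions the profiler saw.
    return len(pstats.Stats(profiler).stats)
```

For the toy app the count is tiny; pointing the same function at a real framework's WSGI application object is where the number starts to say something about breadth.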

Here are the results:

EDITED (see amended numbers at http://plope.com/whatsitdoing2)

Each link above shows how the application was configured, what software versions were involved, how the application was invoked, as well as the SVN link to the source code for the application, and any config tweaks attempted to make it faster. You can run these yourself if you want to; the results files and code taken together should contain all the information needed to replicate the results.

My blog signup is broken, and people without accounts can't comment, so if you don't have an account, please mail me if you want an account here in order to respond here.

EDIT: shameless self-promotion: I'll be giving a repoze.bfg tutorial this year at PyCon. Please sign up if you're interested.

by chrism at February 07, 2009 03:22 PM

February 05, 2009

Repoze Project

Carlos de la Guardia: Screencast, Plone + repoze.bfg

Once again, kudos to Carlos for his screencast on using Plone in conjunction with repoze.bfg.

Carlos writes that the BFG version, which publishes content mirrored to an RDBMS from the Plone CMS backend, renders pages at around 100x faster than Plone itself.

He is hoping to release the software, developed in conjunction with a big project for the Chilean Library of Congress, in time for PyCon, where he will be presenting about the topic. Way to go, Carlos!

February 05, 2009 07:10 PM

January 22, 2009

Noah Gift

Article on Developing an iPhone Application with Google App Engine

Jonathan Saggue and I just completed an article for IBM Developerworks on writing an iPhone application that uses Google App Engine as a back end. The complete source for the application is available here, as well as a framework to help deal with caching.

One additional cool thing about Google App Engine is that it is ridiculously easy to prototype applications or extend development environments. Objective C is a reasonable language, but it is much slower to write code in than Python. Using Google App Engine can become a way to prototype parts of the application to speed up development, and/or augment what something was intended to do.

by Noah Gift (noah.gift@gmail.com) at January 22, 2009 07:13 AM

January 21, 2009

Ben Bangert

New PylonsHQ Site Launches

The new PylonsHQ site has now launched!

The new site is running on the latest Pylons 0.9.7 code-base backed by the CouchDB database. New features that have been added:

Unfortunately, we were unable to integrate the Wiki’s auth, so that will still require a separate login for now.

Comments are enabled throughout the site to ensure that feedback isn’t missed, and of course there are many more features planned that are coming soon. The site isn’t quite 100% complete, as a few links here and there are likely still broken (like the tutorial links on the front page). I’ll be putting out frequent updates to remedy this and any other little bits that need more polish.

Enjoy!

by ben at January 21, 2009 11:08 PM

January 20, 2009

Repoze Project

Repozecast 3: Wherein We Cut It Short

Repozecast 3 (mp3), a podcast about Repoze, is very short; less than 10 minutes. It talks about repoze.bfg. The reason it's so short is that we talked for another forty-five minutes after the Handy Zoom recorder had run out of space. We'll try to recover the ground we lost in subsequent episodes. Have fun!

- Chris McDonough

January 20, 2009 04:49 AM

January 19, 2009

Repoze Project

Pycon US 2009: The Big F'n Tutorial

Pycon US 2009 is in Chicago this year again. I love going there. This year at the conference, I'll be presenting a tutorial on the repoze.bfg web framework along with Chris Perkins. The tutorial is on Thursday, March 26. If you're interested in repoze.bfg, this is a great way to learn how it works.

Chris Perkins will also be presenting a tutorial on ToscaWidgets: Test Driven Modular Ajax on the same day.

- Chris McDonough

January 19, 2009 08:50 PM

Chris McDonough

Big F'n Tutorial at Pycon US 2009

Pycon US 2009 is in Chicago this year again. I love going there. This year at the conference, I'll be presenting a tutorial on the repoze.bfg web framework along with Chris Perkins. The tutorial is on Thursday, March 26. If you're interested in repoze.bfg, this is a great way to learn how it works.

Chris Perkins will also be presenting a tutorial on ToscaWidgets: Test Driven Modular Ajax the same day.

by chrism at January 19, 2009 08:16 PM

Repoze Project

repoze.what 1.0 Final Released!

The repoze.what authorization framework has its first stable release!

repoze.what, which is the default authorization framework in TurboGears 2, was initially a TurboGears-specific repoze.who plugin (tg.ext.repoze.who) to support authorization based on the groups the authenticated user belongs to and the permissions granted to such groups, written by Chris McDonough, Florent Aide and Christopher Perkins.

The plugin evolved into a framework for arbitrary WSGI applications which allows developers to store the groups and permissions of the application in other source types (not only databases), just to name one of the features implemented as a TurboGears-independent project.

The code sample below illustrates how this fully documented and tested framework (yes, its code coverage is at 100%) can be used:

# Sample use in TurboGears 2; pay attention to the line with the "@require"
class RootController(BaseController):
    # ...

    @expose('algo.templates.index')
    @require(predicates.has_permission('manage', msg=_('Only for managers')))
    def manage_permission_only(self):
        return dict(page='managers stuff')

In the example above, only people with the "manage" permission will be granted access to the "manage_permission_only" action. Also, if access is denied (i.e., user doesn't have the "manage" permission), she will be redirected to the login form and the message "Only for managers" will be displayed; a behavior that is fully customizable.

This groups/permissions-based authorization pattern is just the default pattern supported in repoze.what, and you can extend it to support your own pattern by creating so-called "predicates".
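
To make the "predicate" idea concrete without requiring repoze.what to be installed, here is a self-contained sketch of the pattern. It deliberately does not use repoze.what's real classes or signatures; every class, method, and key name below is hypothetical and only mirrors the general shape of "a small object that decides whether access is allowed".

```python
class NotAuthorized(Exception):
    """Raised when a predicate is not met (hypothetical name)."""


class Predicate(object):
    """Base class for access checks; subclasses implement evaluate()."""

    message = "Access denied"

    def evaluate(self, credentials):
        raise NotImplementedError

    def check(self, credentials):
        # Raise with the predicate's message when the check fails.
        if not self.evaluate(credentials):
            raise NotAuthorized(self.message)


class HasPermission(Predicate):
    """The default groups/permissions style of check."""

    def __init__(self, permission, msg=None):
        self.permission = permission
        if msg:
            self.message = msg

    def evaluate(self, credentials):
        return self.permission in credentials.get("permissions", ())


class InAnyGroup(Predicate):
    """A custom pattern: the user must belong to at least one listed group."""

    message = "You are not in any of the required groups"

    def __init__(self, *groups):
        self.groups = set(groups)

    def evaluate(self, credentials):
        return bool(self.groups & set(credentials.get("groups", ())))
```

A new authorization pattern is then just another subclass with its own evaluate method; the code that guards a controller action only ever calls check.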

Planning for the upcoming major release of the package has already started, so please don't hesitate to report the features you want to see in this release!

Special thanks go to Chris McDonough for his support throughout the development of repoze.what.

-- Gustavo Narea.

January 19, 2009 07:08 PM

January 16, 2009

Ian Bicking

Woonerf and Python

At TOPP there’s a lot of traffic discussion, since a substantial portion of the organization is dedicated to Livable Streets initiatives. One of the traffic ideas people have gotten excited about is Woonerf. This is a Dutch traffic planning idea. In areas where there’s the intersection of lots of kinds of traffic (car, pedestrian, bike, destinations and through traffic) you have to deal with the contention for the streets. Traditionally this is approached as a complicated system of rules and right-of-ways. There’s spaces for each mode of transportation, lights to say which is allowed to go when (with lots of red and green arrows), crosswalk islands, concrete barriers, and so on.

A problem with this is that a person can only pay attention to so many things at a time. As the number of traffic controls increases, the controls themselves dominate your attention. It’s based on the ideal that so long as everyone pays attention to the controls, they don’t have to pay attention to each other. Of course, if there’s a circumstance the controls don’t take into account then people will deviate (for instance, crossing somewhere other than the crosswalk, or getting in the wrong lane for a turn, or the simple existence of a bike is usually unaccounted for). If all attention is on the controls, and everyone trusts that the controls are being obeyed, these deviations can lead to accidents. This can create a negative feedback cycle where the controls become increasingly complex to try to take into account every possibility, with the addition of things like Jersey barriers to exclude deviant traffic. At least in the U.S., and especially in the suburbs or in complex intersections, this feeling of an overcontrolled and restricted traffic plan is common.

Copenhagen retail street

So: Woonerf. This is an extreme reaction to traffic controls. An intersection designed with the principles of Woonerf eschews all controls. This includes even things like curbs and signage. It removes most cues about behavior, and specifically of the concept of "right of way". Every person entering the intersection must view it as a negotiation. The use of eye contact, body language, and hand signals determines who takes the right of way. In this way all kinds of traffic are peers, regardless of destination or mode of transport. Also each person must focus on where they are right now, and not where they will be a minute from now; they must stay engaged.


Code as Jersey Barrier

So, I was reading a critique of Python where someone was saying how they missed public/private/protected distinctions on attributes and methods. And it occurred to me: Python’s object model is like Woonerf.

Python does not enforce rules about what you must and must not do. There are cues, like leading underscores, the __magic_method__ naming pattern, or at the module level there’s __all__. But there are no curbs, you won’t even feel the slightest bump when you access a "private" attribute on an instance.

This can lead to conflicts. For example, during discussions on installation, some people will argue for creating requirements like "SomeLibrary>=1.0,<2.0", with the expectation that while version 2.0 doesn’t exist, so long as you install something in the 1.x line it will maintain compatibility with your application. This is an unrealistic expectation. Do you and the library maintainer have the same idea about what compatibility means? What if you depend on something the maintainer considers a bug?
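
For concreteness, here is roughly what a requirement like that checks. This is a simplified sketch, not how setuptools parses version specifiers; the function names are made up, and only plain dotted integer versions are handled.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower="1.0", upper="2.0"):
    """Return True if lower <= version < upper, comparing versions as tuples."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)
```

The check is purely about numbers: it can tell you that 1.4.2 falls inside the range, but not whether 1.4.2 still behaves the way your application expects.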

Practically, you can’t be sure that future versions of a library will work. You also can’t be sure they won’t work; there’s nothing that requires the maintainer of the library to break your application with version 2.0. This is where it becomes a negotiation. If you decide to cross without a crosswalk (use a non-public API) then okay. You just have to keep an eye out. And library authors, whether they like it or not, need to consider the API-as-it-is-used as much as the API-they-have-defined. In open source in particular, there are a lot of ways to achieve this communication. We don’t use some third party (e.g., a QA team or language features) to enforce rules on both sides (there are no traffic controls), instead the communication is more flat, and speaks as much to intentions as mechanisms. When someone asks "how do I do X?" a common response is: "what are you trying to accomplish?" Often an answer to the second question makes the first question irrelevant.

Woonerf is great for small towns, for creating a humane space. Is it right for big cities and streets, for busy people who want to get places fast, for trucking and industry? I’m not sure, but probably not. This is where a multi-paradigm approach is necessary. Over time libraries have to harden and become more static; innovation should happen on top of them and not in the library. Sometimes we create third party controls through interfaces (of one kind or another). I suppose in this case there is a kind of negotiation about how we negotiate: there’s no one process for how to build negotiation-free foundations in Python. But it’s best not to harden things you aren’t sure are right, and I’m pretty sure there’s no "right" at this very-human level of abstraction.

by Ian Bicking at January 16, 2009 07:09 PM

January 15, 2009

Repoze Project

Carlos de la Guardia Takes up the "Songlist" Meme

Kudos to Carlos for his writeup on implementing the "songlist" app, now circulating in the Python blogosphere, using repoze.bfg.

Carlos writes that the BFG version, implemented in ~50 lines of Python and ZCML, benchmarks at 1060 requests per second, compared to the 1780 r/s of the "raw" WSGI version on the same machine.

January 15, 2009 09:15 PM

Noah Gift

What is The Best Version Control System To Extend In Python?

I have a project that is coming up that involves extending version control and moving data around. I am a Python developer who loves writing Python code, so I was curious to consider not just Subversion and pysvn, but also Mercurial, Bazaar, and Git.

It is one thing to buy into the benefits of distributed version control, but when you want to actually extend a system, it does make me look long and hard at the tools made in Python. Bazaar, in particular, seems to have a nice plugin architecture. Does anyone have an experience they would like to share as a Python developer?

by Noah Gift (noah.gift@gmail.com) at January 15, 2009 06:52 AM

Ian Bicking

Cultural Imperialism, Technology, and OLPC

A couple of posts have got me thinking about cultural imperialism lately: a post by Guido van Rossum about "missionaries" and OLPC, and, not about OLPC at all, a post by Chris Hardie and a speech by Wade Davis.

Some of the questions raised: are we destroying cultures? If so, what can we do about it? Must we be hands off? I will add these questions: is it patronizing to make these choices for other people, no matter how enlightened we try to be? How much change is inevitable? Can we help make the change positive instead of resisting change?

More specifically: what is the effect of OLPC on cultures where it is introduced? Especially small cultures, cultures that have been relatively isolated, cultures that are vulnerable. The internet Quechua community is pretty slim, for example. Introducing the internet into a community will lead the children to favor Spanish more strongly, and identify with that more dominant culture over their family and community culture.

Criticisms like Guido’s are common:

I’m not surprised that the pope is pleased by the OLPC program. The mentality from which it springs is the same mentality which in past centuries created the missionary programs. The idea is that we, the west, know what’s good for the rest of the world, and that we therefore must push our ideas onto the "third world" by means of the most advanced technology available. In past centuries, that was arguably the printing press, so we sent missionaries armed with stacks of bibles. These days, we have computers, so we send modern missionaries (of our western lifestyle, including consumerism, global warming, and credit default swaps) armed with computers.

This kind of criticism is easy, because it doesn’t have any counterproposal. It’s not saying much more than "you all suck" to the people involved.

Cultural imperialism is a genuine phenomenon. In an attempt to subjugate or assimilate, the dominant culture may explicitly and cynically enforce its cultural norms, through its religion, requiring all schools to operate in the dominant language, even going as far as suggesting how we arrange ourselves during sex.

But it’s not clear to me that what’s happening now is cultural imperialism. It’s more market-oriented homogenization. Food manufacturers don’t use high-fructose corn syrup because they want to make us fat; they just give us what we want, and they are enabling our latent tendency to become obese. Similarly I think the way culture is spread currently encourages homogeneity, without explicitly attempting to destroy culture.

This is where I think a protectionist stance, the idea that we should just be hands-off, is patronizing. People aren’t abandoning their cultures because they are stupid and they are being manipulated. People make decisions, what they think is the best decision for themselves and their families. These decisions lead them to leave rural areas, learn the dominant language, try to conform through education, and even just lead them to enjoy a dominant culture which is often far more entertaining than a smaller and more traditional culture.

The irony is that once they’ve done this they’ve traded their position for a place in the bottom rung of the dominant society. And it’s true that in many cases they’ve made these decisions because they’ve been forced out of their traditional life by political and legal systems they don’t understand. But to blame it all on oppression is to be blind to the many concrete benefits of our modern world. Corrugated metal roofs are simply superior to thatched roofs, and we can get all romantic about traditional building processes and material independence, but we do so from homes with roofs that don’t leak. Leaking roofs are just objectively unpleasant. And frankly people like TV, you don’t have to tell people to like TV, it just happens.

So I believe that assimilation pressure is natural and inevitable in our times.

What then of technology, of the internet and laptops?

I believe OLPC takes an important stance when it selects open source and open licensing for its content. It is valuing freedom, but more importantly encouraging self-determination, trying to build up a user base that can act as peers in this project, not as simply receivers of first-world largess. But it will be culturally disruptive. And I’m okay with that. In a patriarchal culture, giving girls access to this technology will be destructive to that power structure. Yay! I believe in the moral rightness of that one girl making her own choices, finding her own truths, more than I believe in the validity of the culture she was born into. If you believe people should be able to make their own choices (so long as they are aware of the real consequence of their choices), then you must allow for them to choose to abandon their own cultures for something they find more appealing. They might know better than you if that’s a good choice. I think we all hope that instead they transform their own cultures, but that’s not our choice to make.

What I find unpleasant is if they leave a true identity to find themselves in a place of cultural subservience. If they feel they can’t preserve the part of their culture they most value. Perhaps because of discrimination they feel they must hide their past, or they build up a sense of self-loathing. Perhaps they become isolated, unable to find peers that understand where they come from. And perhaps there is no higher culture at all that they can use to exalt their understanding of the world — do they have a literature? Do they have non-traditional music forms of their own? Do they have a forum where people who share their perspective can have serious discussions? Cultures aren’t destroyed so much as they are starved out of existence.


I think assimilation is inevitable, and can be positive. If we were all able to speak to each other, with some shared second or third language, I think the world would be a better place. I’m not a Christian, but I’m not afraid of anyone knowing The Bible. There’s no piece of culture that I would want to deny from anyone. Each new song, each new book, each new idea… I believe they will all make you a better person, if only in a small way.

And on the internet our culture is cumulative. There’s only so many hours of programming on TV or the radio, only so many pages in a newspaper. On the internet the presence of one kind of culture does not exclude any other. There’s room for a Quechua community as much as any other. But the online Quechua community won’t have exclusive rights to its members like a traditional culture claims; children will live between cultures.

Cumulative culture is not a promise that anyone will care. Languages can still die, cultures can still die, identities become forgotten. If these smaller cultures are going to be preserved, they must adapt to the partially-assimilated status of their members. There must be new art and new ideas and new identities. This is why I believe in the laptop project, because it can enable the creation and sharing of these new ideas. I think it will give smaller cultures a chance to survive; there are no promises, and literature doesn’t write itself, but maybe there is at least a chance.

This is also why I am more skeptical of mobile phones, audio devices, and any device that doesn’t actively enable content creation. Mobile phones are not how culture is made. They let people chat, consume information, communicate in a 12-key pidgin. But the mobile phone user is not a peer in a world wide web of information. The mobile phone user lives on a proprietary network, with a proprietary device, and while it perhaps breaks down some hierarchies through disintermediation, it does so in a transient way. The uptake is certainly faster, but the potential seems so much lower.

I don’t know if OLPC will be successful. That’s as unclear now as ever. But it’s trying to do the right thing, and I think it’s a better chance than most for maintaining or improving the richness of the world’s culture.

by Ian Bicking at January 15, 2009 12:58 AM

January 14, 2009

Ian Bicking

Modern Web Design, I Renounce Thee!

I’m not a designer, but I spend as much time looking at web pages as the next guy. So I took interest when I came upon this post on font size by Wilson Miner, which in turn is inspired by the 100e2r (100% easy to read) standard by Oliver Reichenstein.

The basic idea is simple: we should have fonts at the "default" size, about 16px, no smaller. This is about the size of text in print, read at a reasonable distance (typically closer up than a screen):

http://blog.ianbicking.org/wp-content/uploads/images/typesize_comparison2.jpg

Also it calls out low-contrast color schemes, which I think are mostly passé, and I will not insult you, my reader, by suggesting you don’t entirely agree. Because if you don’t agree, well, I’m afraid I’d have to use some strong words.

I think small fonts, low contrast, and huge amounts of whitespace are a side effect of the audience designers create for.

This makes me think of Modern Architecture:

http://blog.ianbicking.org/wp-content/uploads/images/300px-seagram.jpg

This is a form of architecture popular for skyscrapers and other dramatic structures, with their soaring heights and other such dramatic adjectives. These are buildings designed for someone looking at the building from five hundred feet away. They are not designed for occupants. But that’s okay, because the design isn’t sold to occupants, it is sold to people who look at the sketches and want to feel very dramatic.

Similarly, I think the design pattern of small fonts is something meant to appeal to shallow observation. By deemphasizing the text itself, the design is accentuated. Low-contrast text is even more obviously the domination of design over content. And it may very well look more professional and visually pleasing. But web design isn’t for making sites visually pleasing, it is for making the experience of the content more pleasing. Sites exist for their content, not their design.

In 100e2r he also says let your text breathe. You need whitespace. If you view my site directly, you’ll notice I don’t have big white margins around my text. When you come to my site, it’s to see my words, and that’s what I’m going to give you! When I want to let my text breathe with lots of whitespace this is what I do:

http://blog.ianbicking.org/wp-content/uploads/images/500px-my-white-desktop.jpg

Is a huge block of text hard to read? It is. And yeah, I’ve written articles like that. But the solution?

WRITE BETTER

Similarly, it’s hard to read text if you don’t use paragraphs, but the solution isn’t to increase your line height until every line is like a paragraph of its own.

The solution to the drudgery of large swathes of text is:

  1. Make your blocks of text smaller.
  2. Use something other than paragraphs of text.

Throw in a list. Do some indentation. Toss in even a stupid picture. Personally I try to throw in code examples, because that’s how we roll on this blog.

That’s good writing, that’s content that is easy to read. It’s not easy to write, and I’m sure I miss the mark more often than not. But you can’t design your way to good content. If you want to write like this, if you want to let the flow of your text reflect the flow of your ideas, you need room. Huge margins don’t give you room. They are a crutch for poor writing, and not even a good crutch.

So in conclusion: modern design be damned!

by Ian Bicking at January 14, 2009 07:16 AM

January 12, 2009

Noah Gift

Can Armin Ronacher and Pocoo Save The World?

I started playing around with Zine recently and have to say I am quite impressed with Armin, and the whole Zine. Every time I look at pocoo.org I think, WTF, you guys are INSANE. Keep up the incredible work with all of the stuff you guys are working on.

by Noah Gift (noah.gift@gmail.com) at January 12, 2009 09:31 AM

January 11, 2009

Ian Bicking

Atompub as an alternative to WebDAV

I’ve been thinking about an import/export API for PickyWiki; I want something that’s sensible, and works well enough that it can be the basis for things like creating restorable snapshots, integration with version control systems, and being good at self-hosting documentation.

So far I’ve made a simple import/export system based on Atom. You can export the entire site as an Atom feed, and you can import Atom feeds. But whole-site import/export isn’t enough for the tools I’d like to write on top of the API.

WebDAV would seem like a logical choice, as it lets you get and put resources. But it’s not a great choice for a few reasons:

  • It’s really hard to implement on the server.
  • Even clients are hard to implement.
  • It uses GET to get resources. This is probably its most fatal flaw. There is no CMS that I know of (except maybe one) where the thing you view in the browser is the thing that you’d actually edit. To work around this CMSes use User-Agent sniffing or an alternate URL space.
  • WebDAV is worried about "collections" (i.e., directories). The web basically doesn’t know what "collections" are, it only knows paths, and paths are strings.
  • (In summary) WebDAV uses HTTP, but it is not of the web.

I don’t want to invent something new though. So I started thinking of Atom some more, and Atompub.

The first thought is how to fix the GET problem in WebDAV. A web page isn’t an editable representation, but it’s pretty reasonable to put an editable representation into an Atom entry. Clients won’t necessarily understand extensions and properties you might add to those entries, but I don’t see any way around that. An entry might look like:

<entry>
  <content type="html">QUOTED HTML</content>
  ... other normal metadata (title etc) ...
  <privateprop:myproperty xmlns:privateprop="URL" name="foo" value="bar" />
</entry>

While there is special support for HTML, XHTML, and plain text in Atom, you can put any type of content in <content>, encoded in base64.
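As a rough sketch of the base64 case, here is how a client might wrap arbitrary bytes in an entry and read them back. The function names make_entry and read_content are made up for illustration, and the entry leaves out the id and updated elements a complete Atom entry would carry:

```python
import base64
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def make_entry(title, body, content_type):
    """Build an Atom entry whose <content> holds base64-encoded bytes.

    Hypothetical helper: a real entry would also need <id> and <updated>.
    """
    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = title
    content = ET.SubElement(entry, "{%s}content" % ATOM_NS, type=content_type)
    content.text = base64.b64encode(body).decode("ascii")
    return entry


def read_content(entry):
    """Return (content_type, raw bytes) from an entry built by make_entry."""
    content = entry.find("{%s}content" % ATOM_NS)
    return content.get("type"), base64.b64decode(content.text)


# Placeholder bytes standing in for a real image.
entry = make_entry("logo", b"\x89PNG fake image bytes", "image/png")
xml_text = ET.tostring(entry, encoding="unicode")
```

Parsing xml_text again and passing the result to read_content should give back the original content type and bytes.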

To find the editable representation, the browser page can point to it. I imagine something like this:

<link rel="alternate" type="application/atom+xml; type=entry"
 href="this-url?format=atom">

The actual URL (in this example this-url?format=atom) can be pretty much anything. My one worry is that this could be confused with feed detection, which looks like:

<link rel="alternate" type="application/atom+xml"
 href="/atom.xml">

The only difference is "; type=entry", which I’m betting a lot of clients don’t pay attention to.
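A client that does pay attention could tell the two links apart with something like the following sketch. EntryLinkFinder and find_entry_links are names invented here, and the parameter check is deliberately naive (it would miss a quoted form such as type="entry"):

```python
from html.parser import HTMLParser


class EntryLinkFinder(HTMLParser):
    """Collect alternate links that explicitly say type=entry."""

    def __init__(self):
        super().__init__()
        self.entry_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        if attrs.get("rel") != "alternate":
            return
        # Split "application/atom+xml; type=entry" into media type and params.
        parts = [p.strip().lower() for p in (attrs.get("type") or "").split(";")]
        if parts[0] == "application/atom+xml" and "type=entry" in parts[1:]:
            self.entry_links.append(attrs.get("href"))


def find_entry_links(html):
    """Return the href of every entry (not feed) link in the page."""
    finder = EntryLinkFinder()
    finder.feed(html)
    return finder.entry_links


page = """
<html><head>
<link rel="alternate" type="application/atom+xml" href="/atom.xml">
<link rel="alternate" type="application/atom+xml; type=entry"
 href="this-url?format=atom">
</head><body></body></html>
"""
links = find_entry_links(page)
```

Here links ends up holding only the entry URL; the plain feed link is skipped because it has no type=entry parameter.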

The Atom entries then can have an element:

<link rel="edit" href="this-url" />

This is a location where you can PUT a new entry to update the resource. You could allow the client to PUT directly over the old page, or use this-url?format=atom or whatever is convenient on the server-side. Additionally, DELETE to the same URL would delete.
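From the client side, the update and delete might be prepared as in this sketch. The host wiki.example is a placeholder, build_update and build_delete are hypothetical helpers, and nothing is sent until the request is passed to urllib.request.urlopen:

```python
import urllib.request


def build_update(edit_url, entry_xml):
    """Prepare a PUT that replaces the resource with a new Atom entry."""
    return urllib.request.Request(
        edit_url,
        data=entry_xml.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/atom+xml; type=entry"},
    )


def build_delete(edit_url):
    """Prepare a DELETE for the same edit URL."""
    return urllib.request.Request(edit_url, method="DELETE")


# http://wiki.example/ is a placeholder host, not a real service.
edit_url = "http://wiki.example/this-url?format=atom"
update = build_update(edit_url, "<entry>...</entry>")
delete = build_delete(edit_url)
# urllib.request.urlopen(update) would actually send the request.
```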

This handles updates and deletes, and single-page reads. The next issue is creating pages.

Atompub makes creation fairly simple. First you have to get the Atompub service document. This is a document with the type application/atomsvc+xml and it gives the collection URL. It’s suggested you make this document discoverable like:

<link rel="service" type="application/atomsvc+xml"
 href="/atomsvc.xml">

This document then points to the "collection" URL, which for our purposes is where you create documents. The service document would look like:

<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>SITE TITLE</atom:title>
    <collection href="/atomapi">
      <atom:title>SITE TITLE</atom:title>
      <accept>*/*</accept>
      <accept>application/atom+xml;type=entry</accept>
    </collection>
  </workspace>
</service>

Basically this indicates that you can POST any media to /atomapi (both Atom entries, and things like images).
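Reading that document on the client is not much work. A minimal sketch, assuming the service document above has already been fetched as a string (find_collections is a name made up for this example):

```python
import xml.etree.ElementTree as ET

APP_NS = "http://www.w3.org/2007/app"

SERVICE_DOC = """\
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>SITE TITLE</atom:title>
    <collection href="/atomapi">
      <atom:title>SITE TITLE</atom:title>
      <accept>*/*</accept>
      <accept>application/atom+xml;type=entry</accept>
    </collection>
  </workspace>
</service>
"""


def find_collections(service_xml):
    """Return a list of (href, [accepted media ranges]) pairs."""
    root = ET.fromstring(service_xml)
    collections = []
    for coll in root.iter("{%s}collection" % APP_NS):
        accepts = [(a.text or "").strip()
                   for a in coll.findall("{%s}accept" % APP_NS)]
        collections.append((coll.get("href"), accepts))
    return collections


collections = find_collections(SERVICE_DOC)
```

The href is relative, so a real client would still resolve it against the URL the service document came from.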

To create a page, a client then does a POST like:

POST /atomapi
Content-Type: application/atom+xml; type=entry
Slug: /page/path

<entry xmlns="...">...</entry>

There’s an awkwardness here, that you can suggest (via the Slug header) what the URL for the new page is. The client can find the actual URL of the new page from the Location header in the response. But the client can’t demand that the slug be respected (getting an error back if it is not), and there’s lots of use cases where the client doesn’t just want to suggest a path (for instance, other documents that are being created might rely on that path for links).

Also, "slug" implies… well, a slug. That is, some path segment probably derived from the title. There’s nothing stopping the client from putting a complete path in there, but it’s very likely to be misinterpreted (e.g. translating /page/path to /2009/01/pagepath).
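About the best a client can do today is suggest the path and then check what came back. A sketch of that, with wiki.example as a placeholder host and build_create and slug_was_respected as hypothetical helper names:

```python
import urllib.request
from urllib.parse import urlsplit


def build_create(collection_url, entry_xml, wanted_path):
    """Prepare a POST to the collection, suggesting a path via Slug."""
    return urllib.request.Request(
        collection_url,
        data=entry_xml.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/atom+xml; type=entry",
            "Slug": wanted_path,
        },
    )


def slug_was_respected(wanted_path, location):
    """Compare the path we suggested with the Location the server chose."""
    return urlsplit(location).path == wanted_path


# http://wiki.example/ is a placeholder host, not a real service.
create = build_create("http://wiki.example/atomapi",
                      "<entry>...</entry>", "/page/path")
# After sending, the Location response header holds the actual URL.
kept = slug_was_respected("/page/path", "http://wiki.example/page/path")
mangled = slug_was_respected("/page/path",
                             "http://wiki.example/2009/01/pagepath")
```

The check only happens after the page already exists, which is exactly the awkwardness: the client finds out too late that its suggestion was ignored.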

But I digress. Anyway, you can post every resource as an entry, base64-encoding the resource body, but Atompub also allows POSTing media directly. When you do that, the server puts the media somewhere and creates a simple Atom entry for the media. If you wanted to add properties to that entry, you’d edit the entry after creating it.

The last missing piece is how to get a list of all the pages on a site. Atompub does have an answer for this: just GET /atomapi will give you an Atom feed, and for our purposes we can demand that the feed is complete (using paging so that any one page of the feed doesn’t get too big). But this doesn’t seem like a good solution to me. GData specifies a useful set of queries for feeds, but I’m not sure that this is very useful here; the kind of queries a client needs to do for this use case aren’t things GData was designed for.

The queries that seem most important to me are queries by page path (which allows some sense of "collections" without being formal) and by content type. To allow incremental updates on the client side, it should also be possible to filter these queries by last-modified time (i.e., all pages created or modified since I last looked). Reporting queries (date of creation, update, author, last editor, and custom properties) of course could be useful, but don’t seem as directly applicable.

Also, often the client won’t want the complete Atom entry for the pages, but only a list of pages (maybe with minimal metadata). I’m unsure about the validity of abbreviated Atom entries, but it seems like one solution. Any Atom entry can have something like:

<link rel="self" type="application/atom+xml; type=entry"
 href="url?format=atom" />

This indicates where the entry exists, though it doesn’t suggest very forcefully that the actual entry is abbreviated. Anyway, I could then imagine a feed like:

<feed>
  <entry>

    <content type="some/content-type" />
    <link rel="self" href="..." />
    <updated>YYYY-MM-DDTHH:MM:SSZ</updated>
  </entry>
  ...
</feed>

This isn’t entirely valid, however — you can’t just have an empty <content> tag. You can use a src attribute to use indirection for the content, and then add Yet Another URL for each page that points to its raw content. But that’s just jumping through hoops. This also seems like an opportunity to suggest that the entry is incomplete.

To actually construct these feeds, you need some way of getting the feed. I suggest that another entry be added to the Atompub service document, something like:

<cmsapi:feed href="URI-TEMPLATE" />

That would be a URI Template that accepted several known variables (though frustratingly, URI Templates aren’t properly standardized yet). Things like:

  • content-type: the content type of the resource (allowing wildcards like image/*)
  • container: a path to a container, i.e., /2007 would match all pages in /2007/…
  • path-regex: some regular expression to match the paths
  • last-modified: return all pages modified at the given date or later

All parameters would be ANDed together.
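On the server, the ANDing could look roughly like this sketch. It assumes pages are plain dicts with path, type and modified keys, which is an invention for the example and not how PickyWiki stores anything; matches and utc are likewise hypothetical names:

```python
import fnmatch
import re
from datetime import datetime, timezone


def matches(page, content_type=None, container=None,
            path_regex=None, last_modified=None):
    """Return True only if the page satisfies every filter that was given."""
    # content-type: wildcard match such as image/*
    if content_type is not None and not fnmatch.fnmatchcase(
            page["type"], content_type):
        return False
    # container: /2007 matches everything under /2007/
    if container is not None and not page["path"].startswith(
            container.rstrip("/") + "/"):
        return False
    # path-regex: regular expression searched against the path
    if path_regex is not None and not re.search(path_regex, page["path"]):
        return False
    # last-modified: keep pages modified at the given time or later
    if last_modified is not None and page["modified"] < last_modified:
        return False
    return True


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample data.
pages = [
    {"path": "/2007/intro", "type": "text/html", "modified": utc(2008, 12, 1)},
    {"path": "/2007/logo.png", "type": "image/png", "modified": utc(2009, 1, 10)},
    {"path": "/2008/logo.png", "type": "image/png", "modified": utc(2009, 1, 10)},
]
recent_2007_images = [
    p["path"] for p in pages
    if matches(p, content_type="image/*", container="/2007",
               last_modified=utc(2009, 1, 1))
]
```

A parameter left out of the query simply isn't checked, so an empty query matches every page.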

So, open issues:

  • How to strongly suggest a path when creating a resource (better than Slug)
  • How to rename (move) or copy a page (it’s easy enough to punt on copy, but I’d rather move be a little more formal than just recreating a resource in a new location and deleting the original)
  • How to represent abbreviated Atom entries

With these resolved I think it’d be possible to create a much simpler API than WebDAV, and one that can be applied to existing applications much more easily. (If you think there’s more missing, please comment.)

by Ian Bicking at January 11, 2009 11:27 PM