Posts Tagged ‘Python’

Debugging python (multi)processing

Thursday, January 7th, 2010

My goal is to get a pdb shell inside the worker processes I spawned with Process() from python-processing. The “classic” approach to spawning the pdb shell (a plain pdb.set_trace() call) fails miserably:

(Pdb) > /home/redduck666/dev/abj/bin/feeds.py(639)__init__()
-> self.timeout = timeout
Process Process-3:2:
Traceback (most recent call last):
  File "/var/lib/python-support/python2.5/processing/process.py", line 227, in _bootstrap
    self.run()
  File "/var/lib/python-support/python2.5/processing/process.py", line 85, in run
    self._target(*self._args, **self._kwargs)
  File "./feeds.py", line 639, in __init__
    self.timeout = timeout
  File "./feeds.py", line 639, in __init__
    self.timeout = timeout
  File "/usr/lib/python2.5/bdb.py", line 48, in trace_dispatch
    return self.dispatch_line(frame)
  File "/usr/lib/python2.5/bdb.py", line 66, in dispatch_line
    self.user_line(frame)
  File "/usr/lib/python2.5/pdb.py", line 144, in user_line
    self.interaction(frame, None)
  File "/usr/lib/python2.5/pdb.py", line 187, in interaction
    self.cmdloop()
  File "/usr/lib/python2.5/cmd.py", line 130, in cmdloop
    line = raw_input(self.prompt)
ValueError: I/O operation on closed file

The problem here is that processing closes standard input for the processes it spawns (hence the ValueError from raw_input above), so a straightforward approach like that will not work. For the same reason, using sys.__std(out|in|err)__ will not work: sys.__stdin__ is that same, already closed, file object.

The solution for me was to explicitly tell Python to use my current stdin/stdout by reopening them through /dev/stdin and /dev/stdout. The ‘r+’ mode is needed because pdb has to read from stdin.

pdb.Pdb(stdin=open('/dev/stdin', 'r+'), stdout=open('/dev/stdout', 'r+')).set_trace()

I use this on Linux; AFAIK it should work across the Unix world (and is probably horribly broken on Windows).
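For repeated use, the one-liner can be wrapped in a small helper. This is only a sketch under a few assumptions: the names make_debugger and set_trace are made up here, the default paths assume a Unix-like system, and it is written for Python 3, where opening a non-seekable device such as a terminal with 'r+' in text mode is refused, so it uses separate 'r' and 'w' handles (on Python 2.5 keep 'r+' as above).

```python
import pdb
import sys


def make_debugger(stdin_path='/dev/stdin', stdout_path='/dev/stdout'):
    """Build a Pdb instance wired to freshly opened streams.

    The default paths assume a Unix-like system that exposes
    /dev/stdin and /dev/stdout.
    """
    return pdb.Pdb(stdin=open(stdin_path, 'r'),
                   stdout=open(stdout_path, 'w'))


def set_trace():
    """Start debugging in the caller's frame, even inside a worker process."""
    make_debugger().set_trace(sys._getframe().f_back)
```

Calling set_trace() inside the worker's target function then behaves like pdb.set_trace() does in the parent process.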

GAE — too much magic!

Monday, August 17th, 2009

This is a rant post about GAE (Google App Engine): it is too magical.

My first problem with it is that pdb simply doesn’t work. Why not? Because GAE hijacks your stdout and prints it on the web page. This by itself should ring bells. Seriously, a CGI wrapper? *checks calendar* What do you mean it’s 2009? Are you sure you didn’t mean 1999? Apparently I’m not the only one to have this problem, and people have come up with a solution: explicitly pass the original file objects to pdb. It usually boils down to something like:

def set_trace():
    import pdb, sys
    debugger = pdb.Pdb(stdin=sys.__stdin__,
        stdout=sys.__stdout__)
    debugger.set_trace(sys._getframe().f_back)

Great, now I have a debugger that works. Until it doesn’t. For example, trying to debug any POST request has been fruitless for me:

> /home/redduck666/dev/abj/st/trunk/views/auth.py(91)post()
-> login = self.request.get('login')
(Pdb) print self.request
(Pdb) print 'wtf??'

This is using the set_trace defined above. As you can see, it doesn’t print anything to stdout; instead the output shows up on the web page once ‘c’ is hit in pdb. Why? Who knows?

Then there is the datastore initialization problem I’ve already been rambling about. Out of the box, you can’t write to the GAE datastore from a Python script. You’d think an engineer-driven web company would get this right, e.g. do the initialization in the model classes rather than in some magical part of dev_appserver.py. But fear not, it gets better! For example, my frontend developer is complaining that sometimes she can log in with the username and sometimes she cannot! The best part is that I have a script, executed before dev_appserver.py, to create some dummy data, and it wasn’t working. Then I cleaned my datastore dir and, surprise surprise, the exact same script executed in the exact same way works now. If that ain’t magic, I don’t know what is.

What happens if something goes wrong with the appcfg.py update? Who knows?

Checking if new version is ready to serve.
Closing update: new version is ready to start serving.
2009-08-16 05:23:29,819 ERROR appcfg.py:1272 An unexpected error occurred. Aborting. 
Traceback (most recent call last):
  File "/root/abj/GAE/1.2.3/google/appengine/tools/appcfg.py", line 1265, in DoUpload
    self.Commit()
  File "/root/abj/GAE/1.2.3/google/appengine/tools/appcfg.py", line 1141, in Commit
    self.StartServing()
  File "/root/abj/GAE/1.2.3/google/appengine/tools/appcfg.py", line 1194, in StartServing
    app_id=self.app_id, version=self.version)
  File "/root/abj/GAE/1.2.3/google/appengine/tools/appengine_rpc.py", line 344, in Send
    f = self.opener.open(req)
  File "/usr/lib/python2.5/urllib2.py", line 387, in open
    response = meth(req, response)
  File "/usr/lib/python2.5/urllib2.py", line 498, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.5/urllib2.py", line 425, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 506, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: Internal Server Error
Rolling back the update.
Error 500: --- begin server output ---
 
Server Error (500)
A server error has occurred.
--- end server output ---

Things i hate in django

Sunday, July 26th, 2009

First a disclaimer: I use and like Django, and occasionally I even advocate it. IMHO its advantages greatly outweigh its disadvantages. To paraphrase brian d foy, never trust someone who can’t find things to hate in a thing he loves. Here is my attempt at explaining the things I hate about Django.

My first problem is called reverse. Imagine a situation where, say, “pie.views” imports something that has a problem: reverse will give you “pie.views” as the source of the problem. On the upside, it will report the error :) . A real-world example of this is trying to run the “social” project from the newsmixer source code; after you get through the initial problems it throws:

ViewDoesNotExist at /
 
Tried index in module pie.views. Error was: 'module' object has no attribute 'register'

The problem being that “pie.views” doesn’t mention “register” anywhere in its source code :) .

Reverse ain’t really that powerful either. For example, if you have URLs à la http://www.kiberpipa.org/sl/event/2009-jun-11/727/luka-princic-crosshibrid/ and the only thing in there you care about is the id (727 in this case), well, you’re out of luck with reverse :) .

My next major class of things I hate can be grouped under the name “bundle”. Say you like the forms library and want to use it in your non-web project?

ImportError: No module named django.utils.html

Oops! It doesn’t work without the rest of Django. This is a really small example; for a much bigger one let’s have a look at one of Django’s killer features, the admin interface. It doesn’t work without the authentication, which in turn requires the Django ORM. Those things are by themselves quite a big part of Django. I’d love to see Django broken down into reusable packages.

When I say that authentication requires the ORM, I mean that a User row has to exist in the SQL table, no matter what your custom auth backend does. Suppose you want to write an LDAP auth module which grants permissions based on the department people are in. The corner case here is: what happens when a person changes department? To handle that case reliably, at every login you have to delete the person’s permissions and then grant again the permissions which belong to their current department :/.
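The reset-and-regrant step itself is simple. Here is a minimal sketch using plain sets instead of Django’s permission tables; the department names, the permission names and sync_permissions are all made up for illustration, and a real auth backend would have to do the equivalent against the stored User.

```python
# Hypothetical mapping from LDAP department to permission names.
DEPARTMENT_PERMISSIONS = {
    'sales': {'view_orders', 'add_order'},
    'support': {'view_orders', 'close_ticket'},
}


def sync_permissions(current_permissions, department):
    """Reset a user's permissions to exactly those of their department.

    Whatever was granted before is thrown away first, so nothing from an
    old department survives a move.
    """
    current_permissions.clear()
    current_permissions.update(DEPARTMENT_PERMISSIONS.get(department, set()))
    return current_permissions


# A user who logs in while in sales, then again after moving to support.
perms = sync_permissions(set(), 'sales')
perms = sync_permissions(perms, 'support')
```

Doing this unconditionally at every login is wasteful, but it avoids having to detect the department change at all.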

DJANGO_SETTINGS_MODULE. WTF? :) manage.py does handle it, but in deployment scenarios you don’t have that luxury. Is it really that hard to do some checks and try to determine it automatically? :)
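Outside manage.py, the usual workaround is to set the variable yourself at the top of the entry script, before anything from Django that reads the settings gets imported. A minimal sketch, assuming a hypothetical settings module called mysite.settings:

```python
import os

# 'mysite.settings' is a placeholder; use the dotted path of your own
# settings module. setdefault() leaves the variable alone if the
# environment already provides one.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

settings_module = os.environ['DJANGO_SETTINGS_MODULE']
```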

Having full-text search on the Django docs page would make me happier; for example, searching for examples of .extra() is more difficult than it has to be :) . While we are at the Django web site, having membership management (password reset/change) for Trac would be helpful.

To conclude, I can say that given Django’s size I actually expected to have more things to rant about.

Avahi thoughts

Tuesday, June 30th, 2009

First a short intro to Avahi: basically it is a Zeroconf implementation for Linux. To make a long story short, through the use of multicast it is able to discover services as they appear (as well as scan the network for those services).

This is a tale of a hacker deciding to take a Pydra ticket. After verifying that Avahi has Python bindings, I headed to their home page looking for docs. There is a “ProgrammingDocs” page, which looked like a good sign. Imagine my horror when I read on that page:

Though no real documentation about the DBUS API is available, you may browse the DBUS introspection data online

OK, so they pretty much don’t have docs. It can’t be that bad to work with, right? Their API is too f**** complicated :-) . This is supposed to be the simplest of the “client” examples, and this is the simplest of the “publisher” examples. Fortunately the publisher wrapper is very nice to work with, but the fact that it exists hints that there is something wrong with the API.

As far as the protocol goes, the first thing that struck me is that you can’t really distinguish between server and client; both take an active role. The “publisher” advertises its service, while the “client” (or whatever you want to call it) starts the glib event loop and waits for asynchronous callbacks to happen. If we ignore the fact that you are forced to use a certain event loop, having continuous discovery is not a bad thing: it provides a way to do discovery even if some of the involved parties are temporarily down, as well as the ability to see them as they join the network. What I’m saying is that forcing people to use an event loop is a bad thing (as it is big overkill in simple cases).

As for my code, I chose to make the master the one discovering nodes. The reason is that this way I don’t have to make the node tell the master “hey, I’m alive, use me” (and, of course, implement the appropriate extension to the protocol). So basically the master is looking for nodes; when a node is put on the network it advertises itself, and the master finds it and adds it to its Node list.
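Stripped of the Avahi and glib plumbing, the master side boils down to two callbacks. The sketch below is not the actual Pydra code and makes no Avahi calls; Master, on_node_found and on_node_lost are hypothetical names, the addresses and port are invented, and the calls at the bottom stand in for what the service browser would trigger from the event loop.

```python
class Master(object):
    """Keeps a list of known nodes, fed by discovery callbacks."""

    def __init__(self):
        self.nodes = {}  # service name -> (host, port)

    def on_node_found(self, name, host, port):
        # Called by the discovery layer when a node advertises itself.
        self.nodes[name] = (host, port)

    def on_node_lost(self, name):
        # Called when the advertisement disappears from the network.
        self.nodes.pop(name, None)


# Simulated discovery events: two nodes join, then one goes away.
master = Master()
master.on_node_found('node-1', '192.168.1.10', 11880)
master.on_node_found('node-2', '192.168.1.11', 11880)
master.on_node_lost('node-1')
```

The nodes never talk to the master first; everything the master knows comes from the advertisements it sees.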


metaclasses (hopefully) explained

Friday, June 26th, 2009

“Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t.”

—Tim Peters

This quote illustrates the perceived complexity of metaclasses in Python. They really aren’t that complex, though; this post attempts to explain them in simple terms understandable to anyone familiar with OOP (familiarity with Django won’t hurt).

Basically they are just classes whose instances are classes, and their main purpose (same as for any other class) is to customize their instances.

class MyMeta(type):
    def __new__(cls, name, bases, attrs):
        print 'in MyMeta.__new__'
        return type.__new__(cls, name, bases, attrs)
 
class Boo(object):
    __metaclass__ = MyMeta
 
print 'end'

This is the most basic and completely useless example of a metaclass. The interesting things here:

  • metaclasses are supposed to inherit from the default (for new style classes) metaclass, “type”
  • the class attribute __metaclass__ tells Python which metaclass the current class should use (doh)
  • the metaclass is executed as soon as the class is defined (here ‘in MyMeta.__new__’ is printed before ‘end’, although no Boo object is ever created), which means it can permanently change your class (so every object of that class uses the changed one)
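For readers on Python 3, here is a sketch of the same demo in Python 3 syntax, where the __metaclass__ attribute is no longer honored and the metaclass is passed as a keyword in the class statement instead. The events list and the added_by_meta attribute are additions for illustration, to make both the ordering and the “permanent change” visible.

```python
events = []


class MyMeta(type):
    def __new__(mcs, name, bases, attrs):
        # Runs when the class statement finishes, not when objects are made.
        events.append('creating ' + name)
        attrs['added_by_meta'] = True  # every instance will see this
        return super().__new__(mcs, name, bases, attrs)


class Boo(metaclass=MyMeta):
    pass


events.append('end')
```

After this runs, events records the class creation before 'end', and Boo carries the extra attribute without a single Boo() having been created.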

But since everything so far was just theory, let’s have a look at some real-world examples. I explored metaclasses through Django, so the examples are from there:

class ModelFormMetaclass(type):
    def __new__(cls, name, bases, attrs):
        ...

class ModelForm(BaseModelForm):
    __metaclass__ = ModelFormMetaclass

First a short explanation of ModelForm: basically you pass it a Django Model (an object abstracting an SQL table) and it generates the HTML form for it. You use it by subclassing ModelForm and passing it your Model class as one of the inputs. (Yes, this is an inaccurate explanation; let’s keep it simple.)

Let’s start with a reminder: __metaclass__ is just a class attribute, so when you use ModelForm (by subclassing it) your class inherits this metaclass. What the metaclass does is parse the Model you provided and generate the form fields in your class. This way (as opposed to manually writing the fields) you:

  • have cleaner/less code
  • are more future proof (when/if the Model changes you don’t have to change your form)

def modelform_factory(....):
    ....
    return ModelFormMetaclass(class_name, (form,), form_class_attrs)

This is from the same file as the code snippet above, and illustrates another way to use metaclasses: calling one directly to generate a class. The use case is that, based on the model, you can get an arbitrary number of forms for it; to be able to do so you have to generate the ModelForm on the fly (since you don’t know which Model you are going to get).
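Both ways of using a metaclass can be tried without Django. The following is a toy sketch in Python 3 syntax, not Django’s implementation: Article, ToyFormMetaclass, ToyForm, base_fields and toy_form_factory are all invented names, and a “model” here is just a class with a list of field names.

```python
class Article(object):
    """Stand-in for a model: just a list of field names."""
    fields = ['title', 'body']


class ToyFormMetaclass(type):
    def __new__(mcs, name, bases, attrs):
        # Look for an inner Meta class naming the model.
        meta = attrs.get('Meta')
        model = getattr(meta, 'model', None)
        # Generate the form fields once, when the class itself is created.
        attrs['base_fields'] = list(model.fields) if model else []
        return super().__new__(mcs, name, bases, attrs)


class ToyForm(metaclass=ToyFormMetaclass):
    pass


# Way 1: subclassing; the metaclass is inherited from ToyForm.
class ArticleForm(ToyForm):
    class Meta:
        model = Article


def toy_form_factory(model):
    """Way 2: call the metaclass directly to build a form class on the fly."""
    meta = type('Meta', (object,), {'model': model})
    return ToyFormMetaclass(model.__name__ + 'Form', (ToyForm,), {'Meta': meta})


GeneratedForm = toy_form_factory(Article)
```

ArticleForm and GeneratedForm end up with the same generated fields; the only difference is whether the class was written by hand or assembled at runtime.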

Another possible use case for generating classes on the fly is a template tag to which you pass the model you want and which spits out the form for it. As a side note, this requires light Python abuse (globals() and inner classes).

OK, now we know what metaclasses are and how to use them, and we have seen some real-world examples. But what are the alternatives? Well, two actually: __new__ and class decorators.

You can use your class’s own __new__ to kind of get the same customization, at object creation time. The options are limited, but the first of the two real-world examples is still possible that way. Doing so probably says: I refuse to acknowledge that metaclasses can do anything, and I’d rather abuse something I’m familiar with. A real downside is that the code in __new__ is executed once per object creation. Say you create 10 objects: a feature implemented in your class’s __new__ (skipping the metaclass) runs 10 times, as opposed to once when done in the metaclass. Here is the code showing what I am talking about:

class M(type):
    def __new__(*args, **kw):
        print 'M.__new__'
        return type.__new__(*args, **kw)
 
class A(object):
    __metaclass__ = M
 
class B(object):
    def __new__(*args, **kw):
        print 'B.__new__'
        return object.__new__(*args, **kw)
 
for i in xrange(10):
    A()
    B()

The other alternative is class decorators. The major problem with them is that they are only available from 2.6 up. The other problem (same as with __new__) is that they don’t allow you to blatantly abuse Python the way metaclasses do (say, mess up the MRO). On the upside, I can’t really think of a real-world use case where you would need more power than class decorators offer :-)
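For comparison, here is a minimal class decorator sketch. The names register, applied and Worker are made up for illustration; the point is only that the decorator body runs once, when the class is defined, no matter how many objects are created afterwards.

```python
applied = []


def register(cls):
    """Class decorator: called once with the freshly created class."""
    applied.append(cls.__name__)
    cls.registered = True  # one-time transformation of the class
    return cls


@register
class Worker(object):
    pass


# Creating objects does not run the decorator again.
workers = [Worker() for _ in range(10)]
```

Unlike with a metaclass, type(Worker) is still plain type here; the decorator only touches the finished class.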

So, to conclude: use metaclasses when you want to do a one-time transformation of classes, and listen to Tim: if you don’t know why you would use metaclasses, then don’t.
