Archive for the ‘python’ Category

There is an ongoing discussion at the office with a team member who refuses to use dynamic languages. He claims that most of his errors are typographical and that the compiler catches them. Since a dynamic language does not catch these errors until runtime, he throws an entire group of languages out the window. He also claims that to get the same level of checking with a dynamic language you would have to write more unit tests than normal to avoid introducing unhandled runtime exceptions.

So I decided to do a little test over the weekend. I created a very simple Number class in Python and C++. Using the exact same TDD development process, I implemented some very basic operations including division, addition, subtraction, etc. I ended up with 12 tests, exactly the same for the C++ and Python implementations, covering 100% of the execution paths. I decided that compilation (in the case of C++) and passing tests together counted as success.

Then I went back and inserted common typographical errors: mistypes, extra = signs, not enough = signs, miseplled_varaibles, etc. The end result was that I could not get my unit tests to pass while the Python code contained a typo that would cause an unhandled runtime exception. Granted, in C++ the compiler did catch a lot of things for me, but the point here is I didn’t have to create any extra tests to get that same level of confidence in my Python code.

Morning Tutorial

In the AM I attended “py.Test: rapid testing with minimal effort”. I was planning to attend Python 401, but that filled up before I registered. I learned some new things about py.test; having never read its docs, that wasn’t hard.

I learned about the -f switch (--looponfail). It reruns the tests continually as source files change, and only reruns failing tests or new tests.

The generative tests using yield were something I had known about but never used, and now I know exactly where I will be applying them. I have a program that takes a dozen or so different command line arguments and switches. I will generate a text file with all possible combinations. Then I can use that file to run through the command line tests.
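A rough sketch of how the combinations could be produced, using itertools.product. The switches in OPTIONS, check_command_line, and test_all_combinations are all invented for this example; the real program has its own options. The last function uses the yield style from the tutorial, which newer pytest releases have dropped in favour of parametrized tests.

```python
import itertools

# Hypothetical switches for an imaginary command line tool.
OPTIONS = [
    ("--verbose", [False, True]),         # boolean switch: absent or present
    ("--format", [None, "csv", "json"]),  # None means "leave it off"
    ("--jobs", [None, "4"]),
]


def all_command_lines(options):
    """Yield one argv list for every combination of option values."""
    names = [name for name, _ in options]
    value_lists = [values for _, values in options]
    for combo in itertools.product(*value_lists):
        argv = []
        for name, choice in zip(names, combo):
            if choice is True:
                argv.append(name)            # bare switch
            elif choice not in (None, False):
                argv.extend([name, choice])  # option with a value
        yield argv


def check_command_line(argv):
    # Placeholder check; a real test would run the program with argv.
    assert all(isinstance(arg, str) for arg in argv)


def test_all_combinations():
    # Generative (yield) style: one reported test per yielded pair.
    for argv in all_command_lines(OPTIONS):
        yield check_command_line, argv


for argv in all_command_lines(OPTIONS):
    print(" ".join(argv))
```

Writing the printed lines to a text file gives the combinations file described above.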

Afternoon Tutorial

In the afternoon I attended the “Advanced SQLAlchemy” tutorial. I am not sure if I was the target audience for this tutorial or not. I have been using SQLAlchemy for a while now. The topics covered showed promise. Maybe I built it up too much in my head. A tutorial by Bayer himself, the man who wrote it: I should be blown away here. I wanted to leave this tutorial thinking, “Wow, look how stupid I was, look at how much easier this is when you use FOO or BAR.” Or, “Wow, I never knew you could do that.” Sadly, I have to say I had neither of those moments.

First let me say this: this isn’t a critique of either of the tutorial instructors or of the content at an academic level. Both were quality.

The best thing I saw was a nice shorthand way of doing some things with declarative_base. The coverage of inheritance mapping wasn’t really mind-blowing: single inheritance was a sparse matrix approach using exclude, and joined inheritance was just a Strategy pattern.

Transactions were covered for what seems like people who had never worked with transactions. The deadlock example that was given using SessionExtension was nice and practical and really the only thing that made me go “ahhhh” as I knew I could refactor the current way I was dealing with concurrency and databases with SQLAlchemy.

Summary

Overall, it was a good day. The coverage in the tutorials was very good. The dialog with the other attendees was really the best part: helping people work through the examples and answering questions that they were either too shy or too embarrassed to ask in front of the whole class.

Just like with OOPSLA 2008, I like to keep a personal log of what I did at the conference each day. Today was my first day at PyCon 2009. We arrived early this morning (around 9 AM). I managed to get registered and pick up my fun bag, which included a PyCon shirt, a Launchpad shirt, and a CD for OpenSolaris among other things.

We ate at the convention center restaurant across the street from the Hyatt Regency O’Hare .. yeah don’t go there. After that, we took a nap, we had been up since the day before (our flight in to Chicago left Florida at 5 AM).

We stopped at Red Bar, the bar inside the hotel and had a few drinks and spoke with John Moulder (spelling?) who was also attending Pycon. We laughed a little since he works for the government and his last name was Moulder.

I have 2 tutorials tomorrow: an AM tutorial about py.test and an afternoon tutorial on SQLAlchemy. Looking forward to both. Even though I am disappointed the Python 401 tutorial was full, I am sure the py.test tutorial will be a fine substitute and equally informative.

UPDATE / 13 March 2009: snakefight 0.3 now has a --include-jar option; prefer that to using my hack.

After reading P. Jenvey’s blog post about Deploying Pylons Apps to Java Servlet Containers I immediately downloaded the Jython 2.5 beta and installed snakefight to give it a try. One of our services where I work is a Pylons based application. It is deployed using paster and Apache ProxyPass. Our main application is written in Java and is deployed as a war under Jetty. So if I can get my Pylons application built as a war and deployed that way, it would greatly simplify our deployment process.

$ sudo /opt/jython25/bin/easy_install snakefight
$ /opt/jython25/bin/jython setup.py develop
$ /opt/jython25/bin/jython setup.py bdist_war --paste-config dev_r2.ini
... output of success and stuff ...
$ cp dist/project-0.6.8dev.war /opt/jetty/webapps

Now I visit my local server and hit the project context. I get some database errors, which I kind of expected. So for the time being, I’ll be running this directly using Jython to speed up the debugging process. A quick googling of my DB issues turns up zxoracle for SQLAlchemy, which uses Jython’s zxJDBC. I install that into sqlalchemy/databases as zxoracle.py, change the oracle:// lines in my .ini file to read zxoracle://, and give it another go. Now it can’t find the 3rd party Oracle libraries (ojdbc.jar).

$ cd ./dist
$ jar xf project-0.6.8dev.war
$ cd WEB-INF/lib
$ ls
# no ojdbc.jar as expected ...
$ cd ~/project
$ export CLASSPATH=/opt/jython25/jython.jar:/usr/lib/jvm/java/jre/lib/ext/ojdbc.jar
$ /opt/jython25/bin/jython /opt/jython25/bin/paster serve --reload dev_r2.ini

Now it is looking a little better and it is able to find the jar, but there is still a DB issue, now with the SQLAlchemy library. Not having a ton of time to investigate, I decide to try rolling back my SQLAlchemy version for Jython. Turns out rolling back to 0.5.0 fixed the issue. I’ll be investigating why it was breaking with 0.5.2 soon ™. So now I rerun it, and get a new error.

AttributeError: 'ZXOracleDialect' object has no attribute 'optimize_limits'

I decide I am just going to go into zxoracle.py and add optimize_limits = False to the ZXOracleDialect. No idea what this breaks or harms, but I do it anyway and rerun the application. Success! Everything is working now. Not liking the idea of having to manually insert the Oracle jar into WEB-INF/lib, and not really wanting to muck around with environment variables, I also implemented a quick and dirty include-java-libs option for snakefight; the diff for command.py is below. This allows me to pass in a : separated list of jars to include in WEB-INF/lib. EDIT: The diff I posted isn’t needed since I put it on my hg repo. You can grab it from here.
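For anyone curious what handling a : separated list of jars might look like, here is a rough Python 3 sketch. It is not snakefight’s code and not the patch mentioned above; copy_jars and its arguments are names I made up for illustration.

```python
import os
import shutil


def copy_jars(jar_list, war_root):
    """Copy each jar named in a ':' separated string into WEB-INF/lib.

    jar_list -- e.g. "/path/a.jar:/path/b.jar" (hypothetical paths)
    war_root -- directory holding the unpacked war layout
    Returns the list of destination paths.
    """
    lib_dir = os.path.join(war_root, "WEB-INF", "lib")
    os.makedirs(lib_dir, exist_ok=True)

    copied = []
    for jar in jar_list.split(":"):
        if not jar:
            continue  # tolerate empty entries such as a trailing ':'
        target = os.path.join(lib_dir, os.path.basename(jar))
        shutil.copy(jar, target)
        copied.append(target)
    return copied
```

Splitting on : mirrors how CLASSPATH is written on Unix, which is why it felt like the natural separator.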

So now I am back to building my war. Just as before.

$ /opt/jython25/bin/jython setup.py bdist_war --paste-config dev_r2.ini --include-java-libs /opt/jython25/extlibs/ojdbc.jar
running bdist_war
creating build/bdist.java1.6.0_12
creating build/bdist.java1.6.0_12/war
creating build/bdist.java1.6.0_12/war/WEB-INF
creating build/bdist.java1.6.0_12/war/WEB-INF/lib-python
running easy_install project
adding eggs (to WEB-INF/lib-python)
adding jars (to WEB-INF/lib)
adding WEB-INF/lib/jython.jar
adding Paste ini file (to dev_r2.ini)
adding Paste app loader (to WEB-INF/lib-python/____loadapp.py)
generating deployment descriptor
adding deployment descriptor (WEB-INF/web.xml)
created dist/project-0.6.8dev-py2.5.war
$ cp dist/project-0.6.8dev-py2.5.war /opt/jetty/webapps
$ sudo /sbin/service jetty restart

And presto! I am in business. My Pylons application is deployed under Jetty and all the Selenium functional tests are passing. I am sure there is probably an easier, neater, or cleaner way to do all this, but this was my first iteration through and also my first time ever deploying a WAR to a Java servlet container, so all in all I am happy with the results. Performance seems about the same as when running the application with paster serve, but Jetty does use a little more memory than before (expected, I guess).

Heading to PyCon this year. Looking forward to the tutorials and the great lineup of keynotes. I highly recommend attending this year; it looks like one of the best PyCons in a while. I’ll be attending the Advanced SQLAlchemy tutorial and the py.test tutorial. I was hoping to get into the Python 401 tutorial, but registered late and it was already full.

The keynotes I am looking forward to

  • Building tests for large, untested codebases by C. Titus Brown
  • Metaprogramming with Decorators and Metaclasses by Bruce Eckel
  • Topics of Interest by Ian Bicking

So if you are a Python hacker, get over to http://us.pycon.org, sign up and get yourself there! It is gonna be a great conference this year.

I need to concatenate a set of PDFs, so I will take you through my standard issue Python development approach when doing something I’ve never done before in Python.

My first instinct was to google for pyPDF. Success! So I forgo reading any docs and just give the old easy_install a try.

$ sudo easy_install pypdf

Another success! Ok, a couple help() calls later and I am ready to go. The end result is surprisingly small and seems to run fast enough even for PDFs with 50+ pages.

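from pyPdf import PdfFileWriter, PdfFileReader

def append_pdf(input, output):
    # Copy every page of the input reader onto the end of the output writer
    for page_num in range(input.numPages):
        output.addPage(input.getPage(page_num))

output = PdfFileWriter()
append_pdf(PdfFileReader(open("sample.pdf", "rb")), output)
append_pdf(PdfFileReader(open("sample.pdf", "rb")), output)

# Write the combined document and close the file so it is flushed to disk
combined = open("combined.pdf", "wb")
output.write(combined)
combined.close()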

So I am playing around in Firefox with XMLHttpRequest, looking into a way to facilitate a server update to a client without having to refresh the page or use JavaScript timers. So the long-lived HTTP request seems the way to go.

This little app will at most have 20-30 connections at once, so I am not worried about the open connection per client. The data it calculates is rather large and intensive to gather, so I paired it with the cache decorator snippet found on ActiveState and used in Expert Python Programming. This example feeds a cached datetime string. The caching lets different clients receive the same data while the cached value is still fresh. There is some lag between the updates since the clients all start their sleep at different points; there may be a way around this though.

So here is my basic index.html.

<body>
<em>This will push data from the server to you every 5 seconds .. enjoy!</em>
<ul id="container"></ul>

<script>
var div = document.getElementById('container');
function handleContent(event)
{
  var xml_packet = event.target.responseXML.documentElement;
  div.innerHTML += '<li>' + xml_packet.childNodes[0].data + '</li>';
}
(function () {
    var xrequest = new XMLHttpRequest();
    xrequest.multipart = true;
    xrequest.open("GET","/server/index",false);
    xrequest.onload = handleContent;
    xrequest.send(null);
})();

</script>
</body>

Now the controller code itself.

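import datetime
import time

# `response` and `BaseController` come from the Pylons project boilerplate;
# `memorize` is the caching decorator shown further down.

class ServerController(BaseController):
    def index(self):
        # Each new part of the multipart response replaces the previous one
        response.headers['Content-type'] = 'multipart/x-mixed-replace;boundary=test'
        return data_stream()

def data_stream(stream=True):
    # Send one part immediately, then another every 5 seconds
    yield datetime_string()

    while stream:
        time.sleep(5)
        yield datetime_string()

@memorize(duration=15)
def datetime_string():
    # Build one part: boundary, part headers, blank line, XML body
    content = '--test\nContent-type: application/xml\n\n'
    content += '<?xml version=\'1.0\' encoding=\'ISO-8859-1\'?>\n'
    content += '<message>' + str(datetime.datetime.now()) + '</message>\n'
    content += '--test\n'

    return content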

Also the decorator code for good measure.

import hashlib
import pickle
import time

# Shared store: key -> {'value': ..., 'time': ...}
cache = {}

def is_old(entry, duration):
    # True once the entry is older than `duration` seconds
    return time.time() - entry['time'] > duration

def compute_key(function, args, kw):
    # Hash the function name and its arguments into a cache key
    key = pickle.dumps((function.func_name, args, kw))
    return hashlib.sha1(key).hexdigest()

def memorize(duration=10):
    def _memorize(function):
        def __memorize(*args, **kw):
            key = compute_key(function, args, kw)

            # Serve the cached value while it is still fresh
            if key in cache and not is_old(cache[key], duration):
                return cache[key]['value']
            result = function(*args, **kw)
            cache[key] = {'value': result, 'time': time.time()}
            return result
        return __memorize
    return _memorize
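
If you want to watch the expiry behaviour without waiting around, here is a small Python 3 sketch of the same idea with a clock you can control. It is a simplified variant, not the ActiveState snippet: memoize_for, the clock parameter, fake_now, and expensive are all names invented for this example, and it only handles hashable positional arguments.

```python
import functools
import time


def memoize_for(duration, clock=time.time):
    """Cache results for `duration` seconds, measured by `clock`."""
    def decorator(function):
        cache = {}  # args -> (timestamp, value)

        @functools.wraps(function)
        def wrapper(*args):
            now = clock()
            entry = cache.get(args)
            if entry is not None and now - entry[0] <= duration:
                return entry[1]          # still fresh, reuse it
            value = function(*args)      # missing or stale, recompute
            cache[args] = (now, value)
            return value
        return wrapper
    return decorator


# A fake clock so the example does not need to sleep.
fake_now = [0.0]
calls = []


@memoize_for(15, clock=lambda: fake_now[0])
def expensive(n):
    calls.append(n)  # record every real computation
    return n * 2


expensive(3)          # computed
fake_now[0] = 10.0
expensive(3)          # served from the cache
fake_now[0] = 16.0
expensive(3)          # entry expired, computed again
print(len(calls))
```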

Full working demo will be available in the HG repos shortly.

I get made fun of on a daily basis for this but I am addicted to GUI Green-Bar testing. When I say that I literally mean a simple little Green/Red progress bar that shows me my pass/fail tests. I am addicted to it. I need it. Eclipse C++ and CUTE had spoiled me and now I desire the same thing for Python. Don’t get me wrong, I don’t practice “metrics driven development“, but for me personally, it is a motivator, an easy and clearly defined goal in my test driven approach, make that bar go full green.

I’ve spent the last few hours on Google and misc blogs looking for GUI Green bar testing for modern Python and have been unsuccessful in finding anything. So I ask anyone who happens to read this blog if you know of any plugins for any IDEs or text editors that support this for Python.

In the meantime I started my first Eclipse plugin project ever, in hopes I can hack my way through enough Java and pull enough from the PyDev extension that I can make a simple green bar for Eclipse that parses nosetests output or something.
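The parsing half is the easy part. Here is a rough Python 3 sketch that reads the unittest-style summary that nosetests prints at the end of a run (“Ran N tests in ...” followed by OK or FAILED (...)) and draws a text bar. The summarize and render_bar functions and the sample output are made up for this example; a real plugin would do the drawing in the IDE.

```python
import re


def summarize(output):
    """Pull totals out of a unittest-style summary."""
    ran = re.search(r"^Ran (\d+) tests? in", output, re.MULTILINE)
    total = int(ran.group(1)) if ran else 0

    bad = 0
    status = re.search(r"^FAILED \((.*)\)", output, re.MULTILINE)
    if status:
        for name, count in re.findall(r"(\w+)=(\d+)", status.group(1)):
            if name in ("failures", "errors"):
                bad += int(count)

    return {"total": total, "passed": total - bad,
            "green": total > 0 and bad == 0}


def render_bar(summary, width=20):
    """Draw a text progress bar: '#' for passing, '-' for the rest."""
    total = summary["total"]
    filled = width * summary["passed"] // total if total else 0
    label = "GREEN" if summary["green"] else "RED"
    return "[" + "#" * filled + "-" * (width - filled) + "] " + label


sample = "Ran 12 tests in 0.004s\n\nFAILED (failures=2, errors=1)\n"
print(render_bar(summarize(sample)))
```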

As I am sure most of you have heard, Python 3.0 (final) has been released. For me, this means spending some nights updating ongoing development projects for the language changes and freezing some projects in maintenance mode with their own copy of Python 2.6 (or in some cases 2.4).

Some highlights

  • print is now a function: print("5x5", "is", 5*5, sep=" ")
  • annotations for function arguments and return values (I create a lot of libraries, so this is great!)
  • extended unpacking: x, y, *z = [1,2,3,4,5] now x is 1, y is 2, and z is [3, 4, 5]
  • <> removed, use != (personal favorite cause I hate <>)
  • from module import * is no longer allowed inside functions

See the whole list here: http://docs.python.org/3.0/whatsnew/3.0.html
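
A few of the highlights above in one short Python 3 snippet. The scale function and the StringIO buffer are just props for the example.

```python
import io

# print is a function, so it takes keyword arguments such as sep and file
buffer = io.StringIO()
print("5x5", "is", 5 * 5, sep=" ", file=buffer)

# extended unpacking: the starred name collects whatever is left over
x, y, *z = [1, 2, 3, 4, 5]


# annotations on arguments and the return value
def scale(value: float, factor: float = 2.0) -> float:
    return value * factor


print(buffer.getvalue(), end="")
print(x, y, z)
print(scale.__annotations__)
```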

So I’ve been exercising lately. The last visit to the Dr. was a wake up call. 300 lb .. WHAT?! So, taking an iterative approach to exercise, I’ve managed to work up to a 6-day-a-week cardio routine and a 3-day-a-week strength training routine and have gone from just over 300 lb to 270 lb in a couple of months. A couple of friends asked how I started and what I do, so I figured I’d break it down here.

I started by walking, 3 times a week for a couple of weeks. Then after reading a Lifehacker post about a morning routine and metabolism, I added my very fast morning routine and actually eating breakfast to my day. Made it a whole week, Monday-Friday. Then I bumped up the walking to Monday through Saturday. Did the morning routine and the walking for another week, then I bumped up the walking to targeted cardio 3 out of the 6 days and did that for another week. Added strength training 1 out of the 6 days and did that for another week. Switched from walking to cardio on all 6 days for 1 more week. Then strength training 3 days out of the week. And slowly, over about 3 months, I built up to this routine.

Every morning
I do this right after eating breakfast
2 sets, push-ups, 30 sec OR failure
2 sets, scissor kicks, 1 minute OR failure

Monday through Saturday
2 minute warm up
20 minutes of cardio (140-160 heart rate, check with your Dr.)
2 minute cool down

Strength training
This adds another 20 minutes to my workout, so schedule for that
Monday - Push (chest, triceps)
Wednesday - Pull (bicep, shoulders)
Friday - Legs (legs and stomach)

Now, it is just habit. Also, I am sure there is some study somewhere to prove or disprove this, but I find it makes my mind much sharper. I am more alert throughout the day and I sleep much better at night. Not to mention clothes fitting better and just feeling better. My personal goal is 250.

Python tip
Prefer xrange to range when you need an arbitrary number of iterations. In Python 2, range actually builds the whole list for you, which consumes memory and is more costly than xrange. (In Python 3, range is lazy like xrange, and xrange itself is gone.)
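
A quick way to see the difference, sketched in Python 3, where range plays the role of Python 2’s xrange and list(range(...)) plays the role of Python 2’s range. The variable names are only for the example, and sys.getsizeof measures just the container itself, not the integers a list points to.

```python
import sys

n = 1000000

lazy = range(n)         # produces numbers on demand, like Python 2's xrange
eager = list(range(n))  # builds every element up front, like Python 2's range

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)

print("lazy range object:", lazy_size, "bytes")
print("fully built list: ", eager_size, "bytes")
```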