
Sometimes putting into words the practices you use can expose obvious flaws where you need improvements and provide a starting point for introducing new practices into your development. Analogies are often used to express sometimes foreign, sometimes complex ideas by bringing them into a realm that is familiar.

Idea
My development is fueled by an idea. The idea is sometimes bad, sometimes good, and rarely great, but nonetheless this is where I start. The idea usually stems from a problem: I want to be able to do Foo, or I’d really like it if Bar was able to do this.

Whatever it may be, I think it is important to take the time to outline your idea. Give yourself some initial requirements. Give yourself some structure to build on. You need a solid foundation for a project to be seen through to completion, and even with that, most fail to make it.

Now don’t get me wrong, I am not saying you need a full set of business requirements with risk analysis, input from legal, and a marketing firm hired to ensure there is sufficient market share to capture to justify the risk of development. I am just saying you need to have some kind of outline, some set of initial requirements, and your motivation. These are easy to create and only take a minute.

Once it makes it past the idea stage, it means I’ve written it down on paper, I’ve created a loose set of requirements for the project, and I still think it is a decent enough idea to continue.

Design
The design process is a unique beast. There is plenty of reading material out there about design and about architecting software. I am by no means a great architect. I know enough UML to not be confused by it, but I couldn’t punch my way out of a wet Visio box.

My approach to design is brief at best. I hop a plane to 30,000 ft and look around. I make some notes of the obvious things I see. Oh look, a Factory pattern; oh look, a white fluffy Strategy pattern; hey, that skyscraper looks a lot like a Reactor pattern and that lake looks like a Facade. I like patterns, even though some people seem to think they hinder innovation and keep developers from coming up with new awesomeness. I make some notes and polish my lead balls to ensure my development compass points North. That is it.

As a rule of thumb, I try to spend no more than a day on this phase. Depending on the scope of the project and the requirements you may require more or less time, but that is really up to your gut. As soon as you feel comfortable laying down code, start laying down code.


Now The Analogy - Growing Software.
I am a giant fan boi when it comes to test driven and agile processes; I mix and match the different processes to meet my needs. My favorite analogy for describing my process, which I like to call organic development, involves growing plants. I am sure the term organic development already exists and means something completely different, but I don’t care, I like it for this.

  • Idea - Decide you want to plant something and read up on what you’re planting.
  • Design - Sun? Shade? Raised bed?
  • Coding - Watering, fertilizing, repotting and transplanting.

So with the idea, you read about what you are planting. You maybe look at some other people who have planted the same type of plants and read about their experiences, if they are available.

Now the design. You pick out your pots or bed type. You decide if you are gonna use natural light or artificial light, how much water, and the type of delivery mechanism. I hear they have these cool water globes now!

Now you can plant the seed. You water and weed, and as your plant grows you prune, stake in supports, and continue to care for it until it bears fruit. With the proper care and handling you can produce a healthy, beautiful, low maintenance plant.

Keeping with this software-is-a-plant analogy, there are some things to watch out for. Water too much and you run the risk of killing your plant (developer burnout). Prune too much and you can kill it (premature optimization). Basically, in the early stages of a plant’s life mistakes can easily kill it, and I think this is very similar to software.

As your plant takes root it becomes easier and easier to maintain. Things like transplanting, ripping a plant out to the roots and replanting in a new bed are risky, but if done right, can produce a larger, healthier plant. Like software, sometimes completely gutting a prototype might be the right thing to do, but you increase the risk of killing the project.

But plants, like software, can also be very forgiving. As long as you catch early that you are giving your plant too much or too little water, pruning, fertilizer, soil to grow in, etc., you should be able to salvage it. Though it will take more work and time, an over- or under-watered plant can be saved from the brink of death with extended care. But sometimes there is nothing you can do: like an over-pruned bonsai tree, once the damage is done, even if you continue to grow your plant, it will not feel right for many, many years.

So anyway, that is my crazy, abrupt, all-over-the-place abstract take on software. What’s yours?

On a side note, an email commenter, Linda, requested I have more pictures in my posts, so this one is for you, Linda.

EDIT: Fixed the strong tags. WP decided to do something weird with the formatting. Also fixed some typos.

There is an ongoing discussion at the office with a team member who refuses to use dynamic languages. He claims that most of his errors are typographical and that they are caught by the compiler. Since a dynamic language would not catch these errors until runtime, he throws an entire group of languages out the window. He also claims that to ensure the same level of checking with a dynamic language you would have to create more unit tests than normal to prevent introducing unhandled runtime exceptions.

So I decided to do a little test over the weekend. I created a very simple Number class in Python and C++. Using the exact same TDD development process, I implemented some very basic operations including division, addition, subtraction, etc. I ended up with 12 tests. The tests were exactly the same for both the C++ and Python implementations and covered 100% of the execution paths. I decided that compilation (in the case of C++) and passing of the tests determined success.

Then I went back and inserted common typographical errors: mistypes, extra = signs, not enough = signs, miseplled_varaibles, etc. The end result was that I was unable to keep my unit tests passing while introducing syntax that would induce an unhandled runtime exception in Python. Granted, in C++ the compiler did catch a lot of things for me, but the point here is I didn’t have to create any extra tests to ensure that same level of confidence in my Python code.

I want a new theme for my blog. This site runs Wordpress 2.7. So please link your recommendations in the comments. I’d like something simple and clean but a little more pleasing to the eye than this current theme.

I’ve been flipping through the different ones on the Wordpress site, but there are so many and I have zero skills when it comes to what looks good and what doesn’t; I like Christmas colors if that tells you anything.

Morning Tutorial

In the AM I attended “py.Test: rapid testing with minimal effort”. I was planning to attend Python 401, but that filled up before I registered. I learned some new things about py.test that I didn’t know about; having never read the docs for it, that wasn’t hard.

I learned about the -f switch (loop on fail). Basically this continually runs the tests as source units change. It only reruns failing tests or new tests.

The generative tests using yield were something I had known about but never used, and now I know exactly where I will be applying them. I have a program that takes a dozen or so different command line arguments and switches. I will generate a text file with all possible combinations. Then I can use that file to run through the command line tests.
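Roughly what I have in mind, as a sketch only: the three switches, parse_args, and the little run_generated driver are hypothetical stand-ins, and the combinations are built in memory rather than read from a text file. Under the py.test of this era a test_-prefixed function that yields (check, argument) pairs gets collected as one test per pair; newer pytest releases dropped yield tests in favor of parametrize, so the driver here just walks the generator by hand.

```python
import itertools

# Hypothetical switches for an imaginary command line tool.
SWITCHES = ["--verbose", "--dry-run", "--force"]


def parse_args(argv):
    """Stand-in for the real program's argument handling."""
    return dict((name, name in argv) for name in SWITCHES)


def all_combinations(switches):
    """Yield every subset of the switches, from none to all of them."""
    for size in range(len(switches) + 1):
        for combo in itertools.combinations(switches, size):
            yield list(combo)


def check_parse(argv):
    """Single check: each switch is reported on exactly when it was passed."""
    options = parse_args(argv)
    for name in SWITCHES:
        assert options[name] == (name in argv)


def generate_combination_checks():
    """Generative test body: one (check, argument) pair per combination."""
    for argv in all_combinations(SWITCHES):
        yield check_parse, argv


def run_generated(generator):
    """Minimal driver standing in for the test runner; returns the case count."""
    count = 0
    for check, argv in generator():
        check(argv)
        count += 1
    return count
```

With three switches the generator produces eight cases, so a dozen switches would mean 4096; writing them to a file first, as planned above, keeps a failing combination easy to find again.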

Afternoon Tutorial

In the afternoon I attended the “Advanced SQLAlchemy” tutorial. I am not sure if I was the target audience for this tutorial or not. I have been using SQLAlchemy for a while now. The topics covered showed promise. Maybe I built it up too much in my head. A tutorial by Bayer himself, and he wrote it; I should be blown away here. I wanted to leave this tutorial thinking, “Wow, look how stupid I was, look at how much easier this is when you use FOO or BAR.” Or, “WOW, I never knew you could do that.” Sadly, I have to say I had neither of those moments.

First let me say this: this isn’t a critique of either of the tutorial instructors or of the content at an academic level. Both were quality.

I have to say the best thing I saw was a nice shorthand way of doing some things with declarative_base. The coverage of inheritance mapping wasn’t really mind-blowing: single inheritance was a sparse matrix approach using exclude, and the joined inheritance was just a Strategy pattern.

Transactions were covered at a level that seemed aimed at people who had never worked with transactions. The deadlock example that was given using SessionExtension was nice and practical, and really the only thing that made me go “ahhhh”, as I knew I could refactor the current way I was dealing with concurrency and databases with SQLAlchemy.

Summary

Overall, it was a good day. The coverage in the tutorials was very good. The best part, really, was the dialog with the other people attending the tutorials: helping them work through the examples and answering some questions that they were either too shy or too embarrassed to ask in front of the whole class.

Just like with OOPSLA 2008, I like to keep a personal log of what I did at the conference each day. Today was my first day at PyCon 2009. We arrived early this morning (around 9 AM). I managed to get registered and pick up my fun bag, which included a PyCon shirt, a Launchpad shirt, and a CD for OpenSolaris, among other things.

We ate at the convention center restaurant across the street from the Hyatt Regency O’Hare .. yeah, don’t go there. After that, we took a nap; we had been up since the day before (our flight into Chicago left Florida at 5 AM).

We stopped at Red Bar, the bar inside the hotel and had a few drinks and spoke with John Moulder (spelling?) who was also attending Pycon. We laughed a little since he works for the government and his last name was Moulder.

I have 2 tutorials tomorrow: an AM tutorial about py.test and an afternoon tutorial on SQLAlchemy. I am looking forward to both. Even though I am disappointed the Python 401 tutorial was full, I am sure the py.test tutorial will be a fine substitute and just as informative.

UPDATE / 13 March 2009: snakefight 0.3 now has a --include-jar option, prefer that to using my hack.

After reading P. Jenvey’s blog post about Deploying Pylons Apps to Java Servlet Containers I immediately downloaded the Jython 2.5 beta and installed snakefight to give it a try. One of our services where I work is a Pylons based application. It is deployed using paster and Apache ProxyPass. Our main application is written in Java and is deployed as a war under Jetty. So if I can get my Pylons application built as a war and deployed that way, it would greatly simplify our deployment process.

$ sudo /opt/jython25/bin/easy_install snakefight
$ /opt/jython25/bin/jython setup.py develop
$ /opt/jython25/bin/jython setup.py bdist_war --paste-config dev_r2.ini
... output of success and stuff ...
$ cp dist/project-0.6.8dev.war /opt/jetty/webapps

Now I visit my local server and hit the project context. I get some database errors, which I kind of expected. So for the time being, I’ll be running this directly using Jython to speed up the debugging process. A quick googling of my DB issues turns up zxoracle for SQLAlchemy, which uses Jython’s zxJDBC. I install that into sqlalchemy/databases as zxoracle.py, change the oracle:// lines in my .ini file to read zxoracle://, and give it another go. Now it can’t find the 3rd party Oracle libraries (ojdbc.jar).

$ cd ./dist
$ jar xf project-0.6.8dev.war
$ cd WEB-INF/lib
$ ls
# no ojdbc.jar as expected ...
$ cd ~/project
$ export CLASSPATH=/opt/jython25/jython.jar:/usr/lib/jvm/java/jre/lib/ext/ojdbc.jar
$ /opt/jython25/bin/jython /opt/jython25/bin/paster serve --reload dev_r2.ini

Now it is looking a little better and it is able to find the jar, but there is still a DB issue, now with the SQLAlchemy library. Not having a ton of time to investigate, I decide to try rolling back my SQLAlchemy version for Jython. Turns out rolling back to 0.5.0 fixed the issue. I’ll be investigating why it was breaking with 0.5.2 soon ™. So now I rerun it, and get a new error.

AttributeError: 'ZXOracleDialect' object has no attribute 'optimize_limits'

I decide I am just going to go into zxoracle.py and add optimize_limits = False to the ZXOracleDialect. No idea what this breaks or harms, but I do it anyway and rerun the application. Success! Everything is working now. Not liking the idea of having to manually insert the Oracle jar into WEB-INF/lib, and not really wanting to muck around with environment variables, I also implemented a quick and dirty include-java-libs option for snakefight; the diff for command.py is below. This allows me to pass in a : separated list of jars to include in WEB-INF/lib. EDIT: The diff I posted isn’t needed since I put it on my hg repo. You can grab it from here.
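The optimize_limits error and my fix for it come down to plain Python attribute lookup, which can be sketched without SQLAlchemy at all. Everything below is a stand-in: Compiler, DialectWithoutFlag, and DialectWithFlag are hypothetical classes, not the real zxoracle or SQLAlchemy code. It only illustrates why a class-level default makes the AttributeError go away; it says nothing about whether False is the right value.

```python
class Compiler(object):
    """Hypothetical caller that reads a flag from whatever dialect it is given."""

    def __init__(self, dialect):
        self.dialect = dialect

    def uses_optimized_limits(self):
        # Raises AttributeError if the dialect never defined the attribute.
        return self.dialect.optimize_limits


class DialectWithoutFlag(object):
    """Stand-in for a dialect written before the attribute was expected."""


class DialectWithFlag(object):
    """Same stand-in with the one-line patch applied."""

    optimize_limits = False  # class-level default shared by every instance


def flag_lookup_works(dialect):
    """Return True if the caller can read the flag, False if lookup fails."""
    try:
        Compiler(dialect).uses_optimized_limits()
    except AttributeError:
        return False
    return True
```

Here flag_lookup_works(DialectWithoutFlag()) comes back False and flag_lookup_works(DialectWithFlag()) comes back True, which matches the before and after of the patch.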

So now I am back to building my war. Just as before.

$ /opt/jython25/bin/jython setup.py bdist_war --paste-config dev_r2.ini --include-java-libs /opt/jython25/extlibs/ojdbc.jar
running bdist_war
creating build/bdist.java1.6.0_12
creating build/bdist.java1.6.0_12/war
creating build/bdist.java1.6.0_12/war/WEB-INF
creating build/bdist.java1.6.0_12/war/WEB-INF/lib-python
running easy_install project
adding eggs (to WEB-INF/lib-python)
adding jars (to WEB-INF/lib)
adding WEB-INF/lib/jython.jar
adding Paste ini file (to dev_r2.ini)
adding Paste app loader (to WEB-INF/lib-python/____loadapp.py)
generating deployment descriptor
adding deployment descriptor (WEB-INF/web.xml)
created dist/project-0.6.8dev-py2.5.war
$ cp dist/project-0.6.8dev-py2.5.war /opt/jetty/webapps
$ sudo /sbin/service jetty restart

And presto! I am in business. My Pylons application is deployed under Jetty and all the selenium functional tests are passing. I am sure there is probably an easier, neater, or cleaner way to do all this, but this was my first iteration through and also my first time ever deploying a WAR to a Java servlet container, so all in all I am happy with the results. Performance seems about the same as when running the application with paster serve, but Jetty does use a little more memory than before (expected, I guess).

I use hg (Mercurial) for version control. Since switching to hg I have adopted the following process. I also do this for my Git projects at work.

  • I create a local branch to work in.
  • I set up my External Tools in Eclipse to run my test suite.
  • The output of my test suite gets committed to my local branch.
  • I squash the local branch messages when I merge in to master.
  • I add some insightful commit message for my master commit. Like, I haz changes.

So yesterday, I roll up my sleeves and prepare to dive into an older project that smells like rotten potatoes. The plan of attack is to take this project and bring it up-to-date with Python 2.6, Pylons 0.9.7, and SQLAlchemy 0.5.2 and, in the process, refactor and extend where needed, of course letting the tests drive. I start my work and wand waving, and 2-3 hours in I’ve removed about 200 lines of cruft and copy-paste inheritance and extended flexibility by further encapsulating some behavior using the Strategy pattern. I’ve got 47 tests (including functional doctests) passing and I’m green bar and happy with my time spent. So now it is time to merge this baby back into master.

My test suite external tool performs the hg add . and I keep my .hgignore pretty up-to-date for Python projects, so I feel confident doing that. I open up the terminal to check out the change sets and start the merge, and I notice I missed a binary format in my .hgignore. So I now have about 15 unwanted files staged for adding. Being lazy and knowing my last commit was when I just ran my test suite, I blindly run:

$ ^R hg revert <enter> <enter> (Ctrl-R, hg revert - shell previous command search)
$ hg revert -a --no-backup
# ...my work being destroyed because I was lazy and not paying attention
# whimpering

It is at this point my day goes from great to awful. I face palm as I watch the uncommitted changes I’ve been making over the last 3 hours get reverted. As I mentioned, this project was older; in fact, it was started before the migration to hg and I never updated the External Tools runnable for this project in Eclipse to do the new hg add / commits. So every time I thought I was committing when I was running the tests, I was in fact not. Fortunately for me, I did have some buffers open and was able to recover the end result in about 45 minutes of hacking, but I did lose all of my change history, which was very, very disappointing (not to mention scary).

So if I had any advice after this, it would be to ensure your older projects are up-to-date with how you do things now and that they follow your current development process before you start refactoring. I guess the one-liner could be: when refactoring a project, start with the tool set first.
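A cheap guard I could have had in place, sketched here under assumptions: it takes the text that hg status prints (a one-letter state code, a space, then the path) and lists the files marked M, which are the ones whose uncommitted edits a revert would throw away. The function names and the sample paths are made up, and capturing the hg status output is left out so the sketch runs without Mercurial installed.

```python
def modified_files(status_output):
    """Return paths that `hg status` marks as modified ('M').

    Files marked 'A' (added) or '?' (unknown) are skipped because they
    carry no committed version whose local edits could be overwritten.
    """
    paths = []
    for line in status_output.splitlines():
        if line.startswith("M "):
            paths.append(line[2:])
    return paths


def safe_to_revert(status_output):
    """True only when no tracked file has uncommitted modifications."""
    return not modified_files(status_output)
```

If the test suite tool had really been committing, modified_files would have come back empty; any path in that list is a sign the last commit did not happen.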

Heading to PyCon this year. Looking forward to the tutorials and the great lineup of keynotes. I highly recommend attending this year; it looks like one of the best PyCons in a while. I’ll be attending the Advanced SQLAlchemy tutorial and the py.test tutorial. I was hoping to get into the Python 401 tut, but registered late and it was already full.

The keynotes I am looking forward to:

  • Building tests for large, untested codebases by C. Titus Brown
  • Metaprogramming with Decorators and Metaclasses by Bruce Eckel
  • Topics of Interest by Ian Bicking

So if you are a Python hacker, get over to http://us.pycon.org, sign up, and get yourself there! It is gonna be a great conference this year.

I need to concatenate a set of PDFs, so I will take you through my standard issue Python development approach when doing something I’ve never done before in Python.

My first instinct was to google for pyPDF. Success! So I forgo reading any docs and just give the old easy_install a try.

$ sudo easy_install pypdf

Another success! Ok, a couple help() calls later and I am ready to go. The end result is surprisingly small and seems to run fast enough even for PDFs with 50+ pages.

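from pyPdf import PdfFileWriter, PdfFileReader

def append_pdf(reader, writer):
    # Copy every page of the source PDF onto the end of the output document.
    for page_num in range(reader.numPages):
        writer.addPage(reader.getPage(page_num))

output = PdfFileWriter()
# The source files need to stay open until the output has been written.
append_pdf(PdfFileReader(open("sample.pdf", "rb")), output)
append_pdf(PdfFileReader(open("sample.pdf", "rb")), output)

with open("combined.pdf", "wb") as combined:
    output.write(combined)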

So I am playing around in Firefox with XMLHttpRequest, looking into a way to facilitate a server update to a client without having to refresh the page or use Javascript timers. So the long-lived HTTP request seems the way to go.

This little app will at most have 20-30 connections at once, so I am not worried about the open connection per client. The data it calculates is rather large and intensive to gather, so I paired it with the cache decorator snippet found on ActiveState and used in Expert Python Programming. This example feeds a cached datetime string. The caching lets different clients receive the same data for as long as the cache entry is valid. There is some lag between the updates since the clients all start their sleep at different points; there may be a way around this though.

So here is my basic index.html.

<body>
<em>This will push data from the server to you every 5 seconds .. enjoy!</em>
<ul id="container"></ul>

<script>
var div = document.getElementById('container');
function handleContent(event)
{
  var xml_packet = event.target.responseXML.documentElement;
  div.innerHTML += '<li>' + xml_packet.childNodes[0].data + '</li>';
}
(function () {
    var xrequest = new XMLHttpRequest();
    xrequest.multipart = true;
    xrequest.open("GET","/server/index",false);
    xrequest.onload = handleContent;
    xrequest.send(null);
})();

</script>
</body>

Now the controller code itself.

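import datetime
import time

from pylons import response  # BaseController comes from the project's lib/base.py

class ServerController(BaseController):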
    def index(self):
        response.headers['Content-type'] = 'multipart/x-mixed-replace;boundary=test'
        return data_stream()

def data_stream(stream=True):
    yield datetime_string()

    while stream:
        time.sleep(5)
        yield datetime_string()

@memorize(duration=15)
def datetime_string():
    content = '--test\nContent-type: application/xml\n\n'
    content += '<?xml version=\'1.0\' encoding=\'ISO-8859-1\'?>\n'
    content += '<message>' + str(datetime.datetime.now()) + '</message>\n'
    content += '--test\n'

    return content

Also the decorator code for good measure.

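import hashlib
import pickle
import time

# Maps a hashed call signature to {'value': ..., 'time': ...}.
cache = {}

def is_old(entry, duration):
    return time.time() - entry['time'] > duration

def compute_key(function, args, kw):
    # __name__ is available on every Python version; func_name is Python 2 only.
    key = pickle.dumps((function.__name__, args, kw))
    return hashlib.sha1(key).hexdigest()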

def memorize(duration=10):
    def _memorize(function):
        def __memorize(*args, **kw):
            key = compute_key(function, args, kw)

            if (key in cache and not is_old(cache[key], duration)):
                return cache[key]['value']
            result = function(*args, **kw)
            cache[key] = {'value': result, 'time':time.time()}
            return result
        return __memorize
    return _memorize

Full working demo will be available in the HG repos shortly.