Cuberick: November 2008

Saturday, November 29, 2008

Syntax Highlighting In Blogger

I have pulled various tricks to get syntax highlighting in my code samples, but until now I only had highlighting for Ruby code. I think I finally have a good solution, and it works for any C-style language.

Luka Marinko's blog has simple instructions to get this to work. Thanks Luka! Now here is some nonsense code to show what kind of highlighting you get with Java, Python, and Ruby. All the useless imports and requires are just so we can see those kinds of statements getting highlighted.
JAVA


import java.io.*;

class HelloWorld {
    public static void main(String[] args) {
        hello();
    }

    public static void hello() {
        System.out.println("hello world");
    }
}

RUBY


require 'rubygems'
class HelloWorld
  def initialize
    puts "hello world"
  end
end

HelloWorld.new

PYTHON


import sys
class HelloWorld:
    def __init__(self):
        print "hello world"

HelloWorld()

Unit Test Your Google App Engine Models

I've been working on a project using Google App Engine (GAE) called "les Freres Jacques" that puts images of people's faces onto the cover of this old French LP. Look for it to be out in a couple of months. Anyway, my favorite way to start any project is by doing TDD. Unfortunately I'm new to Python and GAE simultaneously, so I had to do plenty of research to figure out how to unit test a GAE app. Most important for me was the ability to test my models, which are based on Google's datastore API. What follows is some information to get you started writing unit tests against a GAE model. First, a list of the tools you need to install.

Nose is a tool for running your python unit tests.
NoseGAE is a plugin for nose that bootstraps the GAE environment.

The easiest way to install these is with python's easy_install, which as far as I can tell is similar to ruby's 'gem' program and perl's 'cpan' program... though I don't know if it resolves dependencies automatically. Anyway, on OSX easy_install is installed by default so you can simply type

sudo easy_install nose

sudo easy_install nosegae

Now let's create a test to exercise a simple GAE model object. Here is a file called test_simple_model.py:

import unittest
from google.appengine.api.users import User
from test_example.simple_model import SimpleModel

class TestSimpleModel(unittest.TestCase):
    def test_creation(self):
        # Save a model for a user, then fetch it back by that user.
        user = User(email="test@foo.com")
        model = SimpleModel(goo_user=user)
        model.put()
        fetched_model = SimpleModel.all().filter('goo_user =', user).fetch(1)[0]
        self.assertEquals(fetched_model.goo_user, user)

The nose tool we installed earlier gives us a program called nosetests to run. When you call it, it looks through your project and runs all your tests. We should call it now with the Google App Engine switch.

nosetests --with-gae

Wheee!! It is a lovely failing test.


======================================================================
ERROR: Failure: ImportError (No module named simple_model)
----------------------------------------------------------------------
Traceback (most recent call last):
...
from test_example.simple_model import SimpleModel
ImportError: No module named simple_model

----------------------------------------------------------------------
Ran 1 test in 0.002s

FAILED (errors=1)

Good. Now we need to write some code to get our test passing. Here is what I wrote in a file called simple_model.py.

from google.appengine.ext import db

class SimpleModel(db.Model):
    goo_user = db.UserProperty()

Now when I run

nosetests --with-gae

I get the lovely

.
----------------------------------------------------------------------
Ran 1 test in 0.008s

OK

and I am happy because I see how I can do TDD with GAE! Here is a list of references I used to figure this stuff out. Hope you find them useful. You can find the full source of this example here.

Update: 11/30/08
The datastore persists between tests which isn't usually what I want to happen. I submitted an issue on the nose-gae issue tracker. In the meantime here is a workaround to make sure the datastore is flushed between runs. Add this method to your test class and call it in your setUp method.


from google.appengine.api import apiproxy_stub_map
from google.appengine.api import datastore_file_stub

def clear_datastore(self):
 # Use a fresh stub datastore.
 apiproxy_stub_map.apiproxy = apiproxy_stub_map.APIProxyStubMap()
 stub = datastore_file_stub.DatastoreFileStub('appid', '/dev/null', '/dev/null')
 apiproxy_stub_map.apiproxy.RegisterStub('datastore_v3', stub)

Update: 1/29/09
After reading Dom's well-researched and documented post on testing App Engine applications, I thought I had better spruce up my own post by adding a citation. The code from the clear_datastore method above comes from this message posted on the Google App Engine Google Groups mailing list.
Update: 5/17/09
There is currently an issue with nosegae. See defect 18. There are patches that fix the issue posted there. I downloaded the source code, removed the subversion directories, patched the code and ran

easy_install .

in the root directory of the code. That fixed the issue for me.
Update: 5/17/09
As of GAE SDK 1.2.1 the appid of your datastore stub must match your appID. Make sure your call to DatastoreFileStub uses your actual app id.

Tuesday, November 25, 2008

Deep Clean Your Subversion Working Copy

awk and xargs. Deep cleaning action for those hard to reach places. Sometimes you just want your subversion working copy to be clean. If you have files with changes you can just revert things, but what about files that aren't under version control? Those are harder to clean up, especially if they are spread all over the place. Here is what I do when I want my working copy to be identical to the repo:

svn st | awk '{print $2}' | xargs rm -rf

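That will remove all files that are out of sync with the repository, locally modified files included, not just unversioned ones. Be aware that awk splits on whitespace, so paths containing spaces will not be handled correctly. Then simply update to restore things you deleted and get up to date.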

svn up
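If you only want to get rid of unversioned files, or you have paths with spaces in them, one option is to ask Subversion for machine-readable output and pick out the entries you care about. The following is only a rough sketch of that idea, not the recipe above: it assumes the layout printed by svn status --xml (entry elements with a path attribute and a wc-status child whose item attribute names the state), and the sample document and the function name unversioned_paths are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape that `svn status --xml` is assumed to produce.
SAMPLE = """
<status>
  <target path=".">
    <entry path="notes draft.txt">
      <wc-status props="none" item="unversioned"></wc-status>
    </entry>
    <entry path="lib/model.rb">
      <wc-status props="none" item="modified" revision="12"></wc-status>
    </entry>
    <entry path="tmp/build.log">
      <wc-status props="none" item="unversioned"></wc-status>
    </entry>
  </target>
</status>
"""


def unversioned_paths(status_xml):
    """Return the paths reported as unversioned, in document order."""
    root = ET.fromstring(status_xml)
    paths = []
    for entry in root.iter("entry"):
        wc_status = entry.find("wc-status")
        if wc_status is not None and wc_status.get("item") == "unversioned":
            paths.append(entry.get("path"))
    return paths


# Print the candidates instead of deleting them; review before removing anything.
for path in unversioned_paths(SAMPLE):
    print(path)
```

Because each path arrives as a whole attribute value, a name like "notes draft.txt" stays in one piece, which is exactly where the awk column trick falls down.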

Access Google App Engine Development Server Remotely

By default Google AppEngine's (GAE) development webserver doesn't accept remote requests. If you'd like to make your devel GAE application available remotely you can use a simple proxy to forward requests. I chose to use pen proxy because it is sooooo simple. To install pen on ubuntu do

sudo apt-get install pen

To do it on OSX with MacPorts do

sudo port install pen

Once it is installed, start up your GAE application. Then start up pen like this:

pen 8079 localhost:8080

Now pen will forward requests from port 8079 to your app running at 8080.
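To make it clearer what a forwarding proxy is doing here, this is a minimal Python 3 sketch of the same idea: accept a connection on one port, open a second connection to the app, and copy bytes in both directions. It is not how pen is implemented (pen does a good deal more, load balancing for one), and the names listen, pipe, handle and forward_forever are invented for this example.

```python
import socket
import threading


def listen(port, host="0.0.0.0"):
    """Open a listening socket; 0.0.0.0 accepts remote connections."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()
    return server


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, target):
    """Connect to the target and shuttle bytes in both directions."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    worker = threading.Thread(target=pipe, args=(client, upstream), daemon=True)
    worker.start()
    pipe(upstream, client)
    worker.join()
    client.close()
    upstream.close()


def forward_forever(server, target):
    """Accept connections on server and forward each one to target."""
    while True:
        client, _ = server.accept()
        threading.Thread(target=handle, args=(client, target), daemon=True).start()


# Rough equivalent of "pen 8079 localhost:8080" (blocks until interrupted):
# forward_forever(listen(8079), ("localhost", 8080))
```

The part that matters for remote access is the listening address: the forwarder accepts connections from other machines and relays them to the app on localhost.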

Monday, November 24, 2008

A Software Journeyman

For the past year or two I have been hearing the words 'software' and 'craftsmanship' used together more and more frequently. Uncle Bob has been talking about it for most of a decade (click the craftsman link). So today when I heard about Corey Haines' pair programming tour I decided these two things must be connected. Corey is traveling through Ohio, Michigan, Illinois and Indiana. Every place he stops he is going to meet different software craftsmen and pair with them. He is a true Journeyman software craftsman. Wikipedia has this to say about Journeymen craftsmen.

In parts of Europe, as in later medieval Germany, spending time as a journeyman (Geselle), moving from one town to another to gain experience of different workshops, was an important part of the training of an aspirant master. Carpenters in Germany have retained the tradition of traveling journeymen until today, although only a small minority still practice it.

Very cool! If you want a chance to hear from aspirant master craftsman, Corey Haines, he will be talking about his travels as a Journeyman programmer and his chain of command gem at the December meeting of the Chicago Ruby User Group. Details are here.

Friday, November 21, 2008

ChiPy Meeting at ThoughtWorks

The December Chicago Python User Group will be hosted at ThoughtWorks! This will be the best meeting of any kind, ever. Be there! Meeting and RSVP details to follow. Oh, and if the flyer doesn't make sense to you, I usually help out with the Chirb meetings at our office and I couldn't help doing a sort of mashup with a reference to _why's foxes. See! Ruby and Python can co-exist at ThoughtWorks. Hope you like the flyer. I made it with inkscape.

Fine Print:
My loving imitation of _why's foxes appears courtesy of the CC license of the original work.

Therefore this work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

Saturday, November 15, 2008

Update Bash History in Realtime

I have long been annoyed by the behavior of Bash's history file. If you use multiple terminals all history is lost except that of the last terminal closed. The correct behavior should be to save all history from all terminals! An easy way to make that happen is just to save commands to the history file in realtime. Thanks to the Linux Commando I know how to make this happen. Here is the secret:

shopt -s histappend
PROMPT_COMMAND="history -a;$PROMPT_COMMAND"

Put those lines in your .bash_profile or .bashrc. The first line tells bash to append to the history file instead of completely overwriting it. The second line calls history -a every time the prompt is shown, which essentially appends the last command to the history file. So simple. I wish I had known this years ago!

Tuesday, November 11, 2008

Lazy Quantifiers and Negative Lookaheads

'The time has come,' the walrus said, 'to talk of many things: of shoes and ships - and sealing wax - of cabbages and kings.' 'Lazy Quantifiers' and 'Negative Lookaheads'. The names somehow remind me of characters Lewis Carroll would think up or insults Safire would dream up for Agnew. These are the Tweedledee and Tweedledum of regexes. Today I had a regex challenge and it was a doozy! I wanted to use a regex to verify certain HTML elements all showed up in a specific HTML table. The problem boiled down to matching a < TR > to a < /TR > without getting hung up on any tags in between.

Let's make this concrete and offer the text we want to match (extra spaces are added between angle brackets so blogger won't swallow up my html):

< TR > < TD id = "tweedledee"> tweedledum < /TD > < TD > Error < /TD >< /TR >

Given that HTML, I want a regex that could identify a table row that has a column with tweedledee in it and an error message. The first thing I thought of didn't work. I imagined a regex something like this

/< TR[^(<\/TR)]*tweedledee[^(<\/TR)]*error/mi

Which to me meant: match something that starts out like a table row, then some arbitrary number of characters that are NOT the end of a table row, then tweedledee, then some other arbitrary number of characters that are NOT the end of a table row, then error, all while ignoring newlines and case. The problem there is with my understanding of character classes. As you know [] denote a character class. One important feature of a regular expression character class is that it can only match one character at a time. A class such as [^(<\/TR)] does not mean "anything but the sequence < /TR"; it means any single character other than (, <, /, T, R or ). Ohhhh Noooo! So how do we make this work? At first I thought the answer was the lazy quantifier (as opposed to standard greedy quantifiers), but according to my trusty book, "Mastering Regular Expressions," by Friedl, the appropriate Tweedle brother is the Negative Lookahead. A Negative Lookahead, denoted (?! ...), is a positional matcher that is successful if at a given point in a regex it cannot match what is to the right of it. After confirming that Ruby does indeed support the Negative Lookahead I came up with the following solution.

/< TR>((?!<\/TR).)*?tweedledee((?!<\/TR).)*?error/mi

Which says match something that starts out like a table row, then any character which is not the end of a table row, but only up to the point that tweedledee occurs, then any character which is not the end of a table row, but only up to the point that error occurs, ignoring newlines and case. So it turns out I needed to use both Tweedle brothers to solve my dilemma. Now when am I going to get a chance to use the other lookarounds, the positive and negative lookbehinds? And will I ever be able to look at a pair of TD tags again without thinking of Tweedledee and Tweedledum? Let's ask the king. 'Begin at the beginning,' the King said, very gravely, 'and go on till you come to the end: then stop.'
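For anyone who wants to poke at the two patterns, here is a small sketch of the same idea in Python rather than Ruby. Some assumptions to be aware of: Python's re.DOTALL stands in for Ruby's /m (dot matches newlines), re.IGNORECASE stands in for /i, the extra spaces added for Blogger are left out, and the sample strings and the name has_error_row are invented for the example.

```python
import re

FLAGS = re.DOTALL | re.IGNORECASE

# First attempt: the character class only excludes single characters,
# so it cannot express "anything but the sequence </TR".
NAIVE = re.compile(r"<TR[^(</TR)]*tweedledee[^(</TR)]*error", FLAGS)

# Negative lookahead: consume one character at a time, but only while the
# text ahead is not "</TR"; the lazy *? stops as soon as the next literal fits.
ROW_WITH_ERROR = re.compile(
    r"<TR>((?!</TR).)*?tweedledee((?!</TR).)*?error", FLAGS
)


def has_error_row(html):
    """True if one table row contains tweedledee followed by error."""
    return ROW_WITH_ERROR.search(html) is not None


# Both markers inside a single row.
SAME_ROW = '<TR><TD id="tweedledee">tweedledum</TD><TD>Error</TD></TR>'

# The markers are split across two rows, so no single row should match.
SPLIT_ROWS = '<TR><TD id="tweedledee">ok</TD></TR>\n<TR><TD>Error</TD></TR>'

print(has_error_row(SAME_ROW))
print(has_error_row(SPLIT_ROWS))
print(NAIVE.search(SAME_ROW))
```

The lookahead version accepts the single row and rejects the split rows, because it refuses to step over a closing row tag. The character class version cannot even get past the first opening angle bracket of the cell.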

Navigating Larger Ruby Codebases with Vim

Lately I've been doing lots of work with Ruby, and Vim has been my editor. Working with a larger codebase makes it important that your editor helps you find things easily. Here are three of the most important code navigation functions that you simply cannot do without:

Jump to definition. In a large codebase you will find many method calls to methods you've never seen. Being able to quickly drill down through that unfamiliar code and to get back to where you started is massively important. This applies to unfamiliar classes, constants and variables as well.
Find method usages. Often it is nice to look at how a method is used elsewhere in the code. Or if you are refactoring a method you frequently need to modify the code that uses that method.
Plain old search. For sifting through a big codebase you have to be able to do a text search and to be able to quickly jump back and forth between possible matches.

All of this is standard in an IDE, but how do we do it in vim?

For 'Jump to definition' functionality Vim uses Ctags. What exactly is ctags? Ctags is a tool for indexing a source tree of language objects so that they can be found quickly with a text editor. Supported languages include: Assembler, AWK, ASP, BETA, Bourne/Korn/Zsh Shell, C, C++, COBOL, Eiffel, Fortran, Java, Lisp, Lua, Make, Pascal, Perl, PHP, Python, REXX, Ruby, S-Lang, Scheme, Tcl, Vim and Yacc. To install on OSX do "sudo port install ctags"; on debian/ubuntu do "sudo apt-get install ctags". Usage is simple. To index your code go to the root of your source tree and type "ctags -R ." (the trailing dot is not punctuation; it means the current directory). That will create a file called tags which is an index of your source code. Now when you open vim from that directory you can use ctags. Basic usage is to move your cursor over a method, class, or variable you want to see defined. Then type "ctrl ]" and it takes you to the definition. If you want to go back to where you came from type "ctrl t" and you return. Vim keeps a stack of places you have jumped so you can drill down deeply into the code with "ctrl ]" and then find your way back with "ctrl t". The manual is here. A nice vim plugin to use with this is autotags. Autotags automates the maintenance of your tags files so that you don't have to keep running the ctags command when you change the code. Try it.
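If you are curious what is inside that tags file, the sketch below reads a few lines of one into a lookup table, which is roughly what the editor consults when you jump. It assumes the usual tab-separated layout (tag name, file, then an ex search address, optionally followed by ;" and extra fields), and the sample lines and the helper names parse_tag_line and build_index are invented for illustration.

```python
# Made-up sample in the assumed tags layout: name, file, ex address, extras.
SAMPLE_TAGS = (
    "!_TAG_FILE_FORMAT\t2\t/extended format/\n"
    "HelloWorld\tlib/hello_world.rb\t/^class HelloWorld$/;\"\tc\n"
    "hello\tlib/hello_world.rb\t/^  def hello$/;\"\tf\tclass:HelloWorld\n"
)


def parse_tag_line(line):
    """Split one tags line into (name, path, address)."""
    name, path, rest = line.rstrip("\n").split("\t", 2)
    # Extended-format lines append ;" and tab-separated fields after the address.
    address = rest.split(';"\t', 1)[0]
    return name, path, address


def build_index(lines):
    """Map each tag name to the list of places where it is defined."""
    index = {}
    for line in lines:
        if line.startswith("!_TAG_"):
            continue  # header lines describe the file, not the code
        name, path, address = parse_tag_line(line)
        index.setdefault(name, []).append((path, address))
    return index


index = build_index(SAMPLE_TAGS.splitlines())
print(index["hello"])
```

Each entry pairs a file with a search pattern rather than a line number, which is why small edits to a file do not immediately invalidate the index.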

Find method usages is a different problem, and unfortunately ctags won't help with it. This website seems to have a complete discussion of possible solutions to this requirement. I played with idutils a bit, but it doesn't support ruby. In the end I gave up on a vim/ruby solution to find method usages and I fall back on general searching to fulfill this requirement. Lame-ish.

A general text search, however, is well supported in Vim. I like to use the Grep plugin. It has good quality documentation so I won't bother with a lengthy description of how to install or use it. My typical use case is to do a recursive grep across the codebase for something using the :Rgrep command. Then I can look over the various matches, and the Grep plugin lets me jump around among the files. It is a very civilized way to search! I should probably point out that on my OSX setup I had to add the following lines to my vimrc

"fix grep
:let Grep_Find_Use_Xargs = 0
:let Grep_Default_Filelist = '*.rb'

Even on a codebase with a couple hundred ruby files grep seems blazingly fast so not being able to do indexed searches wouldn't be a problem for all but the largest code bases. One thing you will have to figure out once you start using the grep plugin is what vim buffers are and how they work. The Vim docs say a buffer is a file loaded into memory for editing and that all opened files are associated with a buffer. I recommend reading this for more info. Understanding buffers will make you more efficient and change the way you work.
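Since a plain recursive search is also my fallback for finding method usages, here is a small sketch of what such a search boils down to, written in Python so it can be tried outside Vim. It is not how the Grep plugin works internally; the function name rgrep and the default '*.rb' file pattern (borrowed from the vimrc setting above) are choices made for this example.

```python
import re
from pathlib import Path


def rgrep(root, pattern, filelist="*.rb"):
    """Return (path, line number, line) for every matching line under root."""
    regex = re.compile(pattern)
    matches = []
    for path in sorted(Path(root).rglob(filelist)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    matches.append((str(path), number, line.rstrip("\n")))
    return matches


# Word boundaries keep a search for "header" from also matching "headers".
for path, number, line in rgrep(".", r"\bheader\b"):
    print("%s:%d: %s" % (path, number, line))
```

The result mixes the definition with the call sites, so it is a rough stand-in for a real find-usages feature, but on a few hundred files it is usually good enough.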

The common thread among all these tools and plugins is that time spent outside of your editor is time not spent coding. If you find yourself opening and closing vim every time you edit a file during a coding session, you are doing it wrong. Would you open and close eclipse every time you wanted to edit a different file? Well maybe you would if it was as fast as vim ;) But seriously. Spend some time learning how to navigate your codebase efficiently and you will become more effective.

I guess to try and put some kind of summary on this I'll just say that while it can require some effort to learn and configure, Vim can make you a very effective programmer, and with the right plugins in your arsenal I think you can outdo an IDE, even on larger codebases.

Wednesday, November 5, 2008

A Difficult Road

McCain's concession speech was the highlight of his campaign. He was kind. He was gracious. It was refreshing. Thanks John, and let me highlight one point he made in that speech.

"The road was a difficult one from the outset..."

How right you are, John. The road was difficult. Check out this map of a poll taken 2 years ago.

In that poll Obama got only 28 electoral votes. Two years spent organizing and campaigning transformed that map. Obama ended up with 364. A difficult road, indeed.

Saturday, November 1, 2008

Shoes Apps With Multiple Windows


Creating a shoes app with multiple windows is as easy as using the window method. Here is a simple example

#!/usr/bin/env open -a Shoes.app
class Examplish < Shoes
  url "/", :index
  url "/new", :new_window

  def index
    para link "make a new window", :click => "/new"
  end

  def new_window
    window do
      para "a new window"
    end
  end
end
Shoes.app

However there are a few gotchas hidden in this goodness. First notice that if you execute this code the original window is now blank. At first I was surprised, but what is happening is that since you have clicked a link shoes expects you to visit a new url. We aren't supplying one. Simply add

visit "/"

after the window block to tell shoes to forward you on to the same page you were at and now we have the expected behavior.

Now, the big gotcha is that inside the window block the self object is completely different than it is outside the window block. Even more frustrating is that methods defined on the parent aren't available to the child. This can be baffling and a cause of duplication, and it turns out a workaround is difficult. For example, let's say we have a method called header in our main window that puts a header on the page. Since self is redefined inside the new window we no longer have access to that method, so we have to re-implement header inside the window block. We need a place to put shared code! It won't be easy to find one. We can't use a module because all the methods we need to add content to the page, things like 'para', 'link', 'stack', 'flow', only exist on classes that extend Shoes, and we need a reference to the specific instance of Shoes.app that is associated with our window.

How can we have shared code when the methods we depend on only exist on a particular instance of Shoes.app? I haven't really found a satisfactory workaround to this problem. The best I have come up with is to put all shared code into a separate class. That clearly solves the problem of giving it a place that is accessible, but what about the problem of accessing all the magic shoes functions? For that I make our new class a proxy of self (the Shoes.app object). Here is code to illustrate what I mean.

#!/usr/bin/env open -a Shoes.app
class Examplish < Shoes
  url "/", :index
  url "/new", :new_window

  def index
    @proxy ||= ProxyHelper.new(self)
    stack do
      @proxy.header
      para link "make a new window", :click => "/new"
    end
  end

  def new_window
    window do
      @proxy ||= ProxyHelper.new(self)
      stack do
        @proxy.header
        para "a new window"
      end
    end
    visit "/"
  end
end

class ProxyHelper
  def initialize(app)
    @app = app
  end

  def header
    banner "Shared Code!"
  end

  # Anything not defined here is forwarded to the wrapped Shoes.app object.
  def method_missing(meth, *args, &block)
    @app.send(meth, *args, &block)
  end
end

Shoes.app

Notice how you pass the self or Shoes.app object in when you create the proxy object. Then with a bit of simple method_missing magic you can now call all the magic shoes functions seamlessly... well, almost. I have noticed that the link method doesn't work unless you call it directly on the Shoes.app object. And that brings up a bigger issue with this solution. Since all the urls are defined on the other Shoes.app object you can't use any of the navigation routes. Bummer! This is a major bummer! I hope I am missing something and somebody can tell me the right way to go about this, because if not it seriously limits the usefulness of multiple windows in shoes. To my mind the proper behavior inside of a window block is to make sure the new instance of Shoes.app has access to the methods defined on the parent window as well as all the urls. I guess the Shoes::Widget is the only supported way to share code among windows, but as cool as the Widgets are, they don't seem to fit the pattern I am talking about. I'm going to take this up with the shoes mailing list; they can probably sort this out.
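The delegation trick itself is not specific to Ruby or Shoes. As a rough illustration of the pattern only, here is the same shape in Python, where __getattr__ plays the part of method_missing. The App class is a made-up stand-in that just records what was drawn; it is not the Shoes API, and nothing here addresses the url routing problem described above.

```python
class App:
    """Hypothetical stand-in for one window: it owns the drawing methods."""

    def __init__(self):
        self.drawn = []

    def banner(self, text):
        self.drawn.append(("banner", text))

    def para(self, text):
        self.drawn.append(("para", text))


class ProxyHelper:
    """Holds shared code and forwards everything else to the wrapped app."""

    def __init__(self, app):
        self._app = app

    def header(self):
        # banner is not defined on the proxy, so lookup falls to __getattr__.
        self.banner("Shared Code!")

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, like method_missing.
        return getattr(self._app, name)


main_window = App()
new_window = App()

# One proxy per window: the shared header draws on whichever app it wraps.
ProxyHelper(main_window).header()
child = ProxyHelper(new_window)
child.header()
child.para("a new window")

print(main_window.drawn)
print(new_window.drawn)
```

The shared code lives in one place, and which window it draws on is decided by the object handed to the proxy when it is created.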