Wednesday, June 23, 2010

Testing image generation with Cucumber and Sikuli


Image recognition can be a useful weapon in your testing arsenal. Our current project is a Rails app that generates advertisements. We use Sikuli's image recognition to verify that the text and images we expect in an advertisement actually appear in the generated ad. To integrate Sikuli with our build we simply created a Cucumber step that calls out to Sikuli. Let's cut to the chase and see a sample test.

Scenario: Dealer places a Campaign Ad
  Given "Bill Murray"'s dealership "Ghostbusters" exists in the system
  When I log in as "Bill Murray"
  And I book a "Year End" campaign ad
  Then the ad proof will look like "ghostbusters_tag.png"
Cool. The magic Sikuli step is the one that looks at the ad proof. Here is the step implementation.
Then /^the ad proof will look like "([^\"]*)"$/ do |filename|
  feature_dir = File.expand_path(File.dirname(__FILE__) + '/../../features')
  sikuli_script_dir = "#{feature_dir}/sikuli"
  verification_image = "#{feature_dir}/images/#{filename}"
  result = `/Applications/Sikuli-IDE.app/Contents/MacOS/JavaApplicationStub #{sikuli_script_dir}/verify_existance.skl '#{verification_image}'`
  result.should match(/hallelujah/)
end
Pretty straightforward. Call Sikuli and pass it a Sikuli script and the verification image. The Sikuli script prints the word 'hallelujah' if it looks at your app and determines that the verification image exists anywhere on the screen. Ok. Let's drill down into what the Sikuli script looks like. Get your Python goggles on...
import sys
image = sys.argv[1]
switchApp("Firefox.app")
type(Key.DOWN, KEY_CMD)
if exists(Pattern(image).similar(0.91)):
    print "hallelujah"
else:
    print "bugger"
This one is brief, but I'll explain a bit. Sikuli is a cross-platform tool for scripting your computer and interacting with it via image recognition. In this script we make sure that Firefox is the app in focus. The image we care about is at the bottom of the page, so Command+Down tells Firefox to scroll to the bottom. Finally it looks for the image match. The 0.91 similarity number is basically telling Sikuli to be damn sure. We print 'hallelujah' if there is a match.

Ok. That is what I am doing, but in practice how does it work? Well, it has actually been pretty trouble-free, and it even finds bugs and problems. Amazing! However, it isn't perfect. Here is my long list of things that could be improved:
  • Sikuli's documentation on how to run it from the command line is out of date.

  • You have to compile the Python down to an .skl file if you want to run from the command line. Annoying.

  • Use the Sikuli IDE's screenshot mechanism. Other screenshot tools cause trouble.

  • Set the similarity number high to prevent false positives. For example, at 0.85 similarity, Sikuli couldn't tell the difference between the word Ghostbusters and Ohostbusters.

  • Creating the screenshots for tests is tedious.

  • My Sikuli script is naive, which means that if you have multiple Firefox windows open it could fail erroneously.

  • A popup warning or some always-on-top window could cause the test to fail erroneously (see the retry sketch after this list).

  • If the browser scales the image, your test will fail erroneously.
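Those last few items are really all flavours of flakiness: anything unexpected on screen at the wrong moment will sink the scenario. One cheap mitigation, and this is just a sketch rather than something lifted from our actual step file, is to retry the Sikuli call a couple of times before giving up:
# A minimal sketch (not our production step) that reuses the paths from the
# step definition above and retries a few times before failing the scenario.
def ad_proof_matches?(verification_image, attempts = 3)
  feature_dir = File.expand_path(File.dirname(__FILE__) + '/../../features')
  script = "#{feature_dir}/sikuli/verify_existance.skl"
  sikuli = "/Applications/Sikuli-IDE.app/Contents/MacOS/JavaApplicationStub"
  attempts.times do
    return true if `#{sikuli} #{script} '#{verification_image}'` =~ /hallelujah/
    sleep 2 # give a stray dialog a chance to disappear or focus to come back
  end
  false
end
The Then step from earlier can then assert on the return value of this helper instead of matching the raw output directly.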
This has been indispensable for our application, because it lets us know if we have broken the integration with InDesign. It also gives us a way to test that our InDesign server is behaving. Image recognition is a bit of an edge case when it comes to testing web applications, but it is an interesting tool for testing and one that is highly adaptable. For image-generation applications, including charts or custom JavaScript controls, I think image recognition would be the only way to test. It would be super cool to have an image recognition gem, because then we would be able to do these kinds of tests without calling out to Sikuli.
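To make that last wish a bit more concrete, here is a rough, hypothetical sketch of what such a gem might grow out of, using the chunky_png gem to do a naive pixel diff. It is nowhere near Sikuli's fuzzy template matching, and we have not actually built this, but it shows the shape of the API I would want:
require 'chunky_png'

# Hypothetical helper: returns the fraction of identical pixels between two
# same-sized PNGs. A real image recognition gem would need fuzzy matching and
# sub-image search; this is just the simplest possible starting point.
def image_similarity(expected_path, actual_path)
  expected = ChunkyPNG::Image.from_file(expected_path)
  actual   = ChunkyPNG::Image.from_file(actual_path)
  return 0.0 unless expected.width == actual.width && expected.height == actual.height

  matching = 0
  expected.height.times do |y|
    expected.width.times do |x|
      matching += 1 if expected[x, y] == actual[x, y]
    end
  end
  matching.to_f / (expected.width * expected.height)
end

# image_similarity('ghostbusters_tag.png', 'rendered_ad.png') # => 0.0 .. 1.0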



Just for fun I included this last screenshot. It is the Sikuli IDE's nifty little image recognition troubleshooting tool. Here I have turned the similarity number down to 0.5, and Sikuli is showing 3 matches for my test image of the cucumber on the page. The bright red match is the strongest and the pink and purple ones are weaker.

Monday, June 21, 2010

Lines of code vs. Lines of test

My last project was a rescue mission: a 20k-line Rails app with only about 34% line coverage from the tests. No wonder the previous team had failed. Five months later we had made a successful rescue and were up to about 71% line coverage. Watching those stats go from totally-screwed to mostly-un-screwed felt good. Metric_fu provided all the cool and useful graphs for that project: rcov, flog, reek, etc. One graph that was missing was lines of code vs. lines of test. I was sure that graph would have made a big beautiful X-marks-the-spot, because LOC had steadily gone down and LOT had steadily risen. The moment those lines crossed would have been a great opportunity for toasting the health of the codebase.

Well, good news for those of you looking for excuses to toast your codebase. The brand spanking new Metric_fu 1.4 features a froody new LOC vs LOT graph.
I've been using it on my new project. This codebase is healthy, so no X-marks-the-spot action here, but it is still interesting. From our graph it looks like things started out with 2 lines of test for every line of code and have been working their way towards 3 lines of test for every line of code. (Our rcov percentage has stayed at about 98% the whole time.) Does that mean that as a codebase grows you need an increasing ratio of test code to production code? Given a sample of two projects I can't really make any useful generalizations. But what I can do is recommend metric_fu to everyone. Graphs are a great way to start talking about how to improve your codebase and your process, and metric_fu makes it easy to get started. And remember, always use your Vanna White hands when you are exhibiting your graphs to your teammates. If it can make a game show as lame as Wheel of Fortune successful, then it may work for you.
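For anyone who wants to try it, the setup is small. Here is a rough sketch of a Rakefile configuration block for metric_fu 1.x; the metric and graph names here are from memory, so check the metric_fu README for the current list rather than trusting this verbatim:
require 'metric_fu'

# Hedged sketch of a metric_fu configuration block; option names may vary by version.
MetricFu::Configuration.run do |config|
  config.metrics = [:churn, :flog, :flay, :reek, :rcov, :stats]
  config.graphs  = [:flog, :flay, :reek, :rcov, :stats]
end
Running rake metrics:all then generates the reports and graphs.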

Thursday, June 17, 2010

Build your views before working on the controller


For complex forms you can save time by building your view before you write tests and code for the controller action. For example, take the following view code:


- form_for :ad_request, @ad_request, :url => ad_request_feature_path do |f|
  = f.error_messages :class => 'error'
  .form
    #publication.form-item.first
      - f.fields_for :car_selections do |cs|
        = cs.label :headline, "Headline:"
        = cs.text_field :headline


Unless you bleed Rails you probably can't guess what the params hash posted to the controller will look like. Fortunately for lazybones like me, you don't have to guess. Simply create the view, run up your app in dev mode, and post the form. Of course you get an error, but if you look in your app logs you are rewarded with a real-life example of your params hash, like this:
{"ad_request"=>{"car_selections_attributes"=>{"0"=>{"headline" => "Mini for $15,000 Driveaway", "id"=>"1"}}}
Great. Guess what you should do with that nugget? That's right! Use it as the starting point of your controller test:


it "should save headline" do
car_selection = Factory.create(:car_selection, :id => 1)
ad = Factory.create(:ad_request, :ad_type => 'retail', :car_selections => [car_selection])
post :feature, {"ad_request"=>{"car_selections_attributes"=>{"0"=>{"headline" => "Mini for $15,000 Driveaway", "id"=>"1"}}}
ad.car_selections[0].headline.should == "Mini for $65,000 Driveeaway"
end
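One related gotcha worth noting: the car_selections_attributes key in that params hash only saves anything if the model accepts nested attributes. Roughly speaking (the model and association names here are inferred from the form above, not copied from our app), you need something like:
class AdRequest < ActiveRecord::Base
  has_many :car_selections
  # Without this, Rails does not know what to do with the nested
  # car_selections_attributes hash that fields_for posts.
  accepts_nested_attributes_for :car_selections
end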


This is one of those things that is almost too pedestrian to even talk about, but it can save you time and pain in the long run, especially for the occasional nasty form with deeply nested objects.