Panasonic Youth rob sanheim writes about software, business, ruby, music, stuff and things



Tarantula 0.0.5 Released - the “Naked Aardvark” release

Announcing version 0.0.5 of Tarantula.

Tarantula is a big fuzzy spider. It crawls your Rails application, fuzzing data to see what breaks. It can verify HTML validation across all your pages, ensure you don’t have 404s, and pretty much anything else you want via custom handlers.

Don’t let the version number fool you: we’ve been using Tarantula across many projects at Relevance and it’s very stable. This release fixes a number of annoying bugs, including namespace conflicts with other classes due to Rails dependency loading, and brings an improved gem spec with correct dependencies and some cleanup of the HTML reporter.

Install it via GitHub:

gem install relevance-tarantula --source http://gems.github.com

or via Rails 2.1+ gem handling:

config.gem "relevance-tarantula", :source => "http://gems.github.com"


Scp or rsync failing with no error message? Check your startup scripts…

The other day I was having issues trying to scp/rsync data, with no real error message to help debug things. Turns out that any output produced by your startup scripts will break rsync/scp hard. I had some simple ‘echo’ statements that printed when different scripts were being loaded…turns out scp/rsync don’t like that.

My capistrano task was a very simple call out to the ‘get’ helper, which just uses scp under the hood. The task ran and looked as if it completed, only nothing was ever transferred and the scp progress bar never came up. Sometimes it would block and do nothing, which was real fun, too.

The solution was simple - change all the bash scripts we use to not echo anything when running. I deployed the new scripts to all the servers I needed to scp with, and the issue was resolved.

Since this is a known issue in the faq, it won’t be fixed or improved with a better error message. It’s just something you need to be aware of and work around, either by detecting whether the session has an interactive terminal before sending output or by removing the output statements from your startup scripts altogether.


Git Clone vs cp -R --> WTF?

I knew git was fast, and I even knew it was faster than a lot of plain Linux local file operations. Still, this blew me away:

rsanheim@ares:~/src/personal/oss $ du -hd 0 insoshi/
 26M    insoshi/

rsanheim@ares:~/src/personal/oss $ time git clone insoshi/ /tmp/insoshi
Initialize /tmp/insoshi/.git
Initialized empty Git repository in /private/tmp/insoshi/.git/
Checking out files: 100% (2193/2193), done.

real    0m3.826s
user    0m0.251s
sys     0m0.658s

rsanheim@ares:~/src/personal/oss $ time cp -R insoshi/ /tmp/insoshi_cp

real    0m9.065s
user    0m0.114s
sys     0m1.442s

Ok, so a 26 meg repo takes more than twice as long (about 2.4x) to copy via a recursive cp as it does via a local git clone. That's a fairly small repo, so let's try something bigger:

rsanheim@ares:~/src/relevance $ du -hd 0 rails
 75M    rails

rsanheim@ares:~/src/relevance $ time git clone rails /tmp/rails2
Initialize /tmp/rails2/.git
Initialized empty Git repository in /private/tmp/rails2/.git/

real    0m2.321s
user    0m0.151s
sys     0m0.465s

rsanheim@ares:~/src/relevance $ time cp -R rails/ /tmp/rails

real    0m7.133s
user    0m0.067s
sys     0m1.505s

At 75 megs, the rails repo still clones ~3 times faster than it copies.

Obviously, this is not scientific at all, but the point is pretty clear. Git is doing some magic that lets it move files around locally 2 to 3 times faster than a plain copy. From looking at the man page, I would guess it has something to do with git using hardlinks for things in .git/objects when cloning locally. My linux fu falls down a bit here -- what are the ramifications of using hard links versus doing a "real" copy?
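
As a rough answer to the hard link question, here is a minimal Ruby sketch. The method name `compare_link_and_copy` and the file names are made up for illustration, and this is not what git itself runs. It shows the basic trade-off: a hard link is just a second name for the same inode, so no bytes get written again, but an in-place edit through one name shows up under the other. That is much less scary for .git/objects than it sounds, since git never modifies an object file after writing it. Hard links also only work within a single filesystem, and `git clone --no-hardlinks` forces real copies if you want them.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper for illustration: builds one file, one hard link,
# and one real copy in a temp dir, then reports how they relate.
def compare_link_and_copy
  Dir.mktmpdir do |dir|
    original = File.join(dir, "original.txt")
    linked   = File.join(dir, "linked.txt")
    copied   = File.join(dir, "copied.txt")

    File.write(original, "v1")
    File.link(original, linked)     # second name for the same inode
    FileUtils.cp(original, copied)  # new inode, bytes written again

    link_shares_inode = File.stat(original).ino == File.stat(linked).ino
    copy_shares_inode = File.stat(original).ino == File.stat(copied).ino

    # An in-place write through one name is visible under the other name,
    # but not in the real copy.
    File.open(linked, "a") { |f| f.write("+edit") }
    seen_via_original = File.read(original)
    seen_via_copy     = File.read(copied)

    # Deleting one name keeps the data alive while another name remains.
    File.delete(original)
    survives_delete = File.exist?(linked)

    {
      link_shares_inode: link_shares_inode,
      copy_shares_inode: copy_shares_inode,
      seen_via_original: seen_via_original,
      seen_via_copy: seen_via_copy,
      survives_delete: survives_delete
    }
  end
end

result = compare_link_and_copy
result.each { |name, value| puts "#{name}: #{value.inspect}" }
```

So the practical ramifications are small for a clone: the two repos share disk blocks for existing objects, and deleting either repo leaves the other intact.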

(This also makes me want to try out gitbak even more...)


Quick: Find the Bug or Gotcha with named_scope

Think fast! Where's the bug?

named_scope :active, :conditions => ["activated_at <= ?", DateTime.now.utc.to_s(:db)]

Looks fine, right? Maybe you've hit this already, and you see it immediately.

The symptoms are that the DateTime.now always seems to be a bit off - maybe you just restarted your server and it's only a few minutes off.

The bug is that DateTime.now gets evaluated at the time the class is loaded, not when the finder is run. What makes this easy to miss is that it will always work fine in tests and development, as everything is constantly getting reloaded there.

The fix, obvious once you've spent a combined time of over an hour trying to figure out what is going on:

named_scope :active, lambda { { :conditions => ["activated_at <= ?", DateTime.now.utc.to_s(:db)] } }
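
To see the difference without booting Rails, here is a minimal plain-Ruby sketch. `FakeModel`, `eager_scope`, and `lazy_scope` are made-up names standing in for a model and its scopes; this is not how named_scope is implemented, only an illustration of when the timestamp gets evaluated in each form.

```ruby
# Plain-Ruby illustration (no Rails); all names here are hypothetical.
class FakeModel
  # Evaluated once, when the class body is loaded.
  EAGER_CONDITIONS = ["activated_at <= ?", Time.now.utc]

  # Wrapped in a lambda, so the timestamp is computed on every call.
  LAZY_CONDITIONS = lambda { ["activated_at <= ?", Time.now.utc] }

  def self.eager_scope
    EAGER_CONDITIONS
  end

  def self.lazy_scope
    LAZY_CONDITIONS.call
  end
end

first_eager = FakeModel.eager_scope.last
first_lazy  = FakeModel.lazy_scope.last
sleep 0.05
second_eager = FakeModel.eager_scope.last
second_lazy  = FakeModel.lazy_scope.last

puts "eager timestamp moved: #{second_eager != first_eager}"
puts "lazy timestamp moved:  #{second_lazy != first_lazy}"
```

The eager version keeps handing back the timestamp from class load time, which is exactly the "a bit off" symptom on a long-running server.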


Notes on testing Bj (Background Job)

Some thoughts and random notes on testing Bj within a Rails integration test (or spec).

  • You have to turn transactions off for the scope of the test, or suffer very confusing issues, since Bj itself wraps the job submittal within a transaction. The way I did this was just overriding the use_transactional_fixtures method in the one specific spec.

    describe Foo do
      # turn off transactional fixtures for this one spec
      def self.use_transactional_fixtures
        false
      end
      # ... examples ...
    end

  • Remember, bj = background job. This may seem obvious, but whatever you submit to bj will be running in an entirely different process, so in your spec you need to wait for that job to complete before trying to assert things. You can do something as simple as this:

    MAX_TIME = 10.0
    seconds = 0.0
    while job.pending?
      sleep 0.5
      seconds += 0.5
      # give up instead of hanging the suite forever
      raise "job still pending after #{MAX_TIME} seconds" if seconds > MAX_TIME
      job.reload
    end
    # normal assertions here

    This gives your job up to 10 seconds to finish, and will time out if it takes too long, which usually means something has gone wrong.

  • You now have to watch multiple logs to figure out what is going on. So tail your test.log and tail the bj log as well, and run the script in isolation to make sure you understand where exceptions and syntax errors will go. I wasted some time scanning logs when I really needed to check the job.stderr field that bj populates, so be sure to output that for common test failures.
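
If several specs need to wait on jobs, the polling from the second note can be pulled into a small helper. This is only a sketch: `wait_until` and `FakeJob` are hypothetical names, and `FakeJob` exists purely so the example runs on its own. It assumes nothing about Bj beyond the `pending?` and `reload` calls used in the loop above.

```ruby
# Hypothetical helper: polls the block until it returns true,
# raising if max_seconds pass first.
def wait_until(max_seconds, interval = 0.5)
  deadline = Time.now + max_seconds
  until yield
    raise "condition not met within #{max_seconds} seconds" if Time.now >= deadline
    sleep interval
  end
  true
end

# Stand-in for a job record, for illustration only: it reports
# pending until reload has been called enough times.
class FakeJob
  def initialize(reloads_needed)
    @reloads_left = reloads_needed
  end

  def reload
    @reloads_left -= 1 if @reloads_left > 0
    self
  end

  def pending?
    @reloads_left > 0
  end
end

job = FakeJob.new(3)
wait_until(2.0, 0.01) { !job.reload.pending? }
# normal assertions here
```

In a real spec the block would stay the same, with `job` being whatever record you got back when submitting the work, and the timeout bumped to something like the 10 seconds used above.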

Overall, I've been pleased with bj, besides some open questions I've still been working out by perusing the source. Check it out if you need an easy-to-use persistent job queue.


CapGun and LogBuddy updated to 0.0.5

Some long overdue releases of cap_gun and log_buddy - both have been updated to version 0.0.5. Both are now available as gems on github.com/relevance as well as from rubyforge.

CapGun gives you super simple deployment notifications from Capistrano. LogBuddy gives you a log helper through all objects, and can also log the name of the thing passed in along with its value -- saving you on typing and making debugging quicker.

CapGun got a fix so it does not attempt to display the rails_env if it's not defined - this should clean up any strangeness in notifications if you saw something like "my_app was deployed to ".

LogBuddy got some minor tweaks and improved specs.

Both libraries now use Echoe, since Hoe complains about readme.txt when I want to use readme.rdoc, dammit. Both now only have a dev dependency on echoe to play nice with RubyGems 1.2.

You can install them via github or rubyforge:

sudo gem install log_buddy
sudo gem install cap_gun

or

gem sources -a http://gems.github.com
sudo gem install relevance-log_buddy
sudo gem install relevance-cap_gun

Please log bugs or issues at our Trac.


Git 1.5.6 released

Git 1.5.6 has been released, and there are a lot of usability fixes and tweaks which should make the upgrade worth your while. Looking at the detailed list of changes since 1.5.5, it looks like submodules have been getting quite a bit of love from many contributors, so it might be time to give them another shot. Scroll down or search in the announcement for the part starting with "Changes since v1.5.5" and look through there for some of the submodule improvements that are coming.

The directions posted here worked fine for me to upgrade my existing source based installation in /usr/local.


Git lessons learned

Lessons learned from day to day use with various ruby and rails projects.

* Submodules completely suck when things get complex - I'm moving away from submodules and using direct exports for now, until I have time to research braid or piston 2.0. For more details on this, see this or this post on the github group.

* Use capistrano 2.2, not 2.3! 2.3 breaks git support

* Always use :remote_cache for deployments -- super fast with git

* If you have weird errors, it probably means you need to pull - when in doubt pull to make sure you have the latest

* Branch more locally - I've been burned a few times when I've started work in master and then regretted it later when I wished my work wasn't in mainline (yes, it's possible to fix this after the fact, but that gets into more advanced git usage)


Refactotum Rails Conf 2008

I'm in Portland for Rails Conf with over 80% of the Relevance crew. We were testing out our "plane number" yesterday, but thank goodness American didn't let us down.

We'll be speaking today about how to contribute to open source at Refactotum, from 1:30 to 5. We will cover some tools to help you find the code with the most technical debt, go over example refactorings, and then spend the rest of the session going from project to project and helping out as folks hit obstacles. Please bring a laptop with any projects checked out that you'd like to hack on during the session (git preferred but not necessary).

Hope to see you there!


Testing Velocity Part 2 - Why do we test?

A couple weeks ago, I began a series on keeping your test suite fast and effective. I now am going to digress a bit, take a step back and view the big picture to establish context.

Before addressing test performance and what makes up a good test, we should ask ourselves why is it that we write tests at all? If we want to be effective, we should always stay conscious of the overall goal of testing, as well as the specific goals behind each test in context.

Some would argue that tests are primarily a design tool. Or that tests are a living, breathing, specification for our code. Others would say it's primarily a means to drive and maintain quality. Some may say that tests are useful to ensure that the really difficult parts of our system work, or to keep lax developers in line.

Testing is *only* valuable and useful insofar as it supports software as a cooperative game. When you think "cooperative game", imagine rock-climbing, or ? To quote Alistair Cockburn:

Software development is a (resource-limited) cooperative game of invention and communication. The primary goal of the game is to deliver useful, working software. The secondary goal, the residue of the game, is to set up for the next game. The next game may be to alter or replace the system or to create a neighboring system.

So is testing for design, or quality, or correctness, or communication? The answer is that it's for *all* of those things (and more) as long as it helps deliver working software and prepare for the next game! So when someone asks, "Why do you write tests? Why do you care about the speed of your tests?" The first answer is, "It depends." It depends on the software you are delivering, on the teammates and domain you are working with, and on what the next game (if any) is. Since every software project will be different, clearly how you write the code and its tests will differ. An embedded system targeting your phone will have a much different test suite than a large enterprise web app.

This digression is to clarify debates I've heard (and been involved in) over issues like what a "proper" unit test should do, how much setup is okay, mocks or stubs or fake objects, and when it's okay to mock/stub versus when it's not okay. There are no hard-and-fast answers to any of these questions. You need to consider context: What is being built? What are the current technical issues? Does the client want to run (or create) acceptance tests? How slow is the test suite currently, and is it impacting dev speed? How large is the system, and how much larger do you expect it to be? Who will be maintaining the system after it's released, and what will their skill level be?

Answer those types of questions before making statements like "this spec is doing too much setup and runs too slow," or "we shouldn't be stubbing in a functional test like this." It will ensure that your discussions and debates will stay grounded and useful instead of becoming endless religious debates. Keep context and the cooperative game model of software in the back of your mind, and I will too as I continue this series and try to lay out some practical, overall guidelines.

