Friday, June 1, 2012

Okay Folks, let's keep it CLEAN!!

Okay.  You’ve just spent a substantial chunk of your life scouring the skies for birds, taking careful systematic counts of everything you see.  (For some of you, we’re talking hundreds of hours spent in one place!)  The daily forms for each count day are in your possession, the data is submitted to HawkCount, and all the original paperwork is filed away for safekeeping.  Mission accomplished!  But wait!  Just how “clean” is your data?
Most veteran hawkwatchers I know pride themselves on the quality of their fieldwork.  They’ve developed their ID skills after years of practice, and putting a name with reliable ease to most every raptor they see is something they have every right to be proud of.  But at the end of the day, what happens to all those paper forms with the tiny boxes of numbers they’ve scrawled all over them?  Does your hawkwatch have a procedure for handling data?  If it doesn’t, I implore you to think seriously about whether it might need one.
I raise this issue now as the end of spring hawkwatching season draws near, because I’ll admit I’m easily impressed by the large stack of count sheets presently on my desk.  After all, I worked hard to collect nearly all of that data, and I try very hard to be a careful, conscientious counter.  But an integral part of the job of counting hawks is to see that the data collected can actually be used, and for this to happen, it must be correct!  And for it to be correct, it must be checked thoroughly, line by line, to see that the paper forms are faithfully transcribed electronically to HawkCount.  So these count sheets on my desk are not yet a finished product, despite appearances to the contrary.  I’ll admit that this is possibly the least glamorous aspect of counting hawks I can imagine, but it’s of paramount importance to the science side of it.  And I think it’s much too easy, especially in this age of near-real-time HawkCount posting, to come back at the end of the day, quickly bang out the day’s results for all your eager fans waiting to see them, and be done with it.  But after spending a full day on the hill, you’re probably tired.  You can barely see straight!  And now you’re going to take aim with your mouse and cursor at more little boxes on your computer screen and expect perfection.  This is unrealistic.  I don’t care who you are, you’re going to make mistakes!  And this, too, is part of the job of counting hawks.
So I’d like to make a special request of you: if you don’t already, make a point of taking as much pride in the correctness of your data as you do in your skills at identifying birds.  Whether you audit the data yourself at a later time or designate someone willing (and able) to do it, just ensure that it gets done.  And if you must do it yourself, try to approach it with fresh eyes rather than a mind clouded by fatigue, which is why it’s almost always a bad idea to audit the data yourself on the very day it was collected.
My personal ritual is to export the submitted data from HawkCount as an MS Excel worksheet (ask your site coordinator to do this for you if you don’t have direct access to your HawkCount profile), and then I’ll tote my laptop with me down to a coffee shop that offers free wireless internet.  In the midst of a caffeinated buzz while wearing headphones to drown out the surrounding din, I’ll step through the spreadsheet on the computer cell by cell while tracing through the stack of daily forms sitting before me with an index finger.  (I imagine it might be entertaining to watch me work!)  My reasons for preferring coffee shops with WiFi are twofold: a) it’s perfectly acceptable in many coffee shops (e.g., Starbucks) for one person to spread out his paperwork and things all over a table for several hours at a time and only order a few soy cafés au lait, and b) having internet access means I can correct errors on HawkCount as I discover them, and it also allows me to take breaks and screw off a little when my eyes begin to glaze over.  Admittedly, coffee houses are not cheap in an absolute sense.  But we’re not talking about making them a daily habit.  We’re talking about spending $10 on your pleasure as an *investment* in the quality of your count data, and this begins to look especially cheap given the amount of time you’ve already invested in counting birds.  (And if coffee is not your thing, do what you can to make the job slightly more pleasurable/rewarding if you find the task as monotonous as I do.)
So getting back on track: accurate counts are at the core of what we do.  Do what you possibly can to make sure they really count!
Good Hawkwatching,

1 comment:

  1. Hello Arthur,

    as you say, checking your database is essential. I'll give an example from "my" own count-programme to support the point you are making. As you know, I (together with three colleagues) count waders (shorebirds) and waterbirds (wildfowl) in a large part of the SW Netherlands. I will use last April as an example; we counted 99,966 waterbirds and waders (82 species) in 185 different areas. The counters write the numbers of birds per species/area in their notebooks, and later the same day these are summarized on count sheets. After transferring the notes in their notebooks to the count sheet, the counter "reads" the form in order to find what we call level 1 mistakes (wrong additions, numbers in the wrong area column, that sort of thing). The data is then entered into a preliminary database by two very experienced professionals. We then make a printout of the entered data. The layout of these prints follows the layout of the count sheets in order to spot the differences more easily. Like you, I sit down with a cup of coffee (no latte please) and start working. The first part is pretty easy because it only involves database questions that will help me find mistakes. I usually look for species missing from the count that should have been seen in April, missing areas, and numbers that are either too high or too low (compared to results from earlier years). That way it is easy to find the most obvious mistakes. After that I too start reading, in my case with the help of a ruler or a blank piece of paper to help me stay on the correct line. In April I found 113 mistakes, ranging from complete areas that were overlooked to missing species and wrong numbers. Even after the data check there will still be some minor mistakes in the corrected set. Most of these smaller mistakes will be found when we start working with the database, for example when writing a report.

    The problem with this process is that to most people this is very tedious, so you will have to really love your database in order to keep checking, month after month, year after year.

    The best way to start loving your database is USING it. There is nothing as awful as spotting a mistake in a graph, caused by an error in your database, on the day the report returns from the printers. In order to get people to care more about data hygiene, they should be encouraged to play around with their data. Teach them (or give them the tools) to make a graph or a table, and to compare weeks, years, species, daily timing, etc. With every question you ask and every answer you get from your data, you will love your database more and be more willing to invest a little time to keep it warm, safe and correct. I suspect this is already the norm for Hawk Watch data. However, it seems to me that these days more and more counting programmes give their counters a tool to enter their data via the web. This results in a database that is up to date, but at the price of being very hard to validate.
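
For anyone who would like to try the "database questions" described in the comment above (species that should have been seen but are missing, areas with no data, and numbers far too high or too low compared to earlier years), here is a minimal sketch in Python. It is only an illustration, not the commenter's actual queries and not anything HawkCount provides: the function name `screen_counts`, the dictionary layout, and the factor-of-five threshold are all assumptions made for the example.

```python
from statistics import median


def screen_counts(current, history, expected_species, expected_areas, ratio=5.0):
    """Flag the most obvious problems in one month's entered counts.

    current: dict mapping (area, species) -> count entered for this month.
    history: dict mapping (area, species) -> list of counts from earlier years.
    ratio:   hypothetical threshold; a count more than `ratio` times above or
             below the historical median is flagged for a second look.
    """
    seen_species = {species for (_, species) in current}
    seen_areas = {area for (area, _) in current}

    flags = {
        # Species expected this month that appear nowhere in the entered data.
        "missing_species": sorted(set(expected_species) - seen_species),
        # Areas that should have been counted but have no rows at all.
        "missing_areas": sorted(set(expected_areas) - seen_areas),
        # Counts that look too high or too low compared to earlier years.
        "suspicious": [],
    }

    for key, count in sorted(current.items()):
        past = history.get(key)
        if not past:
            continue  # nothing to compare against
        typical = median(past)
        if typical > 0 and (count > typical * ratio or count < typical / ratio):
            flags["suspicious"].append((key, count, typical))

    return flags


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    current = {("A", "Dunlin"): 1200, ("A", "Curlew"): 40, ("B", "Dunlin"): 15}
    history = {
        ("A", "Dunlin"): [100, 120, 110],
        ("A", "Curlew"): [35, 50, 45],
        ("B", "Dunlin"): [10, 20, 15],
    }
    print(screen_counts(current, history,
                        ["Dunlin", "Curlew", "Redshank"], ["A", "B", "C"]))
```

A flag here is only a prompt to go back to the paper form. A genuinely big day will trip the threshold too, so this kind of screening narrows down where to look but does not replace the line-by-line read.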