A day in the life of Dan, the Harvester Man

Posted on 24 November 2011 by Dan

A harvester in a corn field.

Harvest Time by Gerry Balding, on Flickr

This is my first blog post. Hi, I'm Dan. I'm a content analyst in the DigitalNZ team, but people around here call me "The Harvester Guy".

Most people (including me before I started here) associate the word "harvester" with those big agricultural combine harvester things. If you do a Google image search for "harvester" you get a pretty awesome assortment of these machines, with all sorts of attachments, shapes and sizes for harvesting different crops. I particularly like the look of the one with the scorpion digger claw that walks on spider legs.

Anyway, back to explaining what I do… 

When a new set of NZ-relevant content (a new crop) is identified and an agreement is reached with the content provider, it's my job to harvest that content into the DigitalNZ system. This content can come in a wide range of forms, from basic Excel spreadsheets to giant metadata-rich RDF/OAI repositories to simple sitemaps of regular web pages.

Some content sources have so much metadata (information) about each record that the challenge is how to deal with it all, and where to put all the potentially useful bits of information. Other content sources may offer great images or resources but contain very little associated information. Sometimes it's a struggle to find even a title for each record. With over 25 million items (!!!) harvested into DigitalNZ so far, we have built up quite a collection of tools and techniques for dealing with this vast variety of content.
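To give a flavour of the "struggle to find even a title" problem, here is a minimal sketch in Python. It is not the actual DigitalNZ harvester code: the function name `pick_title`, the field names in `TITLE_FIELDS` and the URL fallback are all assumptions made up for illustration. The idea is simply to try a few likely metadata fields in order and, if they are all empty, guess a title from the record's web address.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical field names (an assumption); each real source uses its own.
TITLE_FIELDS = ("title", "dc:title", "name", "caption")


def pick_title(record: dict) -> str:
    """Return the best available title for a harvested record."""
    # Try each candidate field in order and keep the first non-empty one.
    for field in TITLE_FIELDS:
        value = str(record.get(field) or "").strip()
        if value:
            return value

    # No usable field: fall back to the last path segment of the URL,
    # minus its file extension, with separators turned into spaces.
    path = urlparse(record.get("url") or "").path
    stem = posixpath.splitext(posixpath.basename(path))[0]
    if stem:
        return stem.replace("-", " ").replace("_", " ").strip().title()

    # Nothing to go on at all.
    return "Untitled"
```

A metadata-rich record would be handled by the first loop, while an image with nothing but a web address would get a rough title derived from its file name.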

So in a way I do get to drive around in those awesome harvesting machines (in my head anyway... the spider scorpion one), harvesting up great NZ content into DigitalNZ and making it easier to find, share and use.

