Archive for the ‘News’ Category

How to enable warnings, part one of many

Tuesday, August 5th, 2008 by Galen Charlton

Today I sent an RFC about turning on warnings in all of Koha’s Perl scripts and modules. Of course, turning on warnings by itself does nothing except fill up the Apache log; the trick is to quell them by fixing the underlying problems.

As an example, consider misc/migration_tools/bulkmarcimport.pl. On the plus side, it already contains a use warnings; statement; on the minus side, it is commented out. So close!

After enabling warnings and running bulkmarcimport -d -file test.mrc, we get

deleting biblios
Use of uninitialized value in pattern match (m//) at \
 misc/migration_tools/bulkmarcimport.pl line 118.

Some of the most common messages you see after turning on warnings are complaints about uninitialized variables. So what’s the variable in question?


if ($format =~ /XML/i) {

OK, so what is $format and why is it uninitialized?


my ($version, $delete, $test_parameter, $skip_marc8_conversion, $char_encoding, $verbose, $commit, $fk_off,$format);

$|=1;

GetOptions(
    'commit:f'    => \$commit,
    'file:s'    => \$input_marc_file,
    'n:f' => \$number,
    'o|offset:f' => \$offset,
    'h' => \$version,
    'd' => \$delete,
    't' => \$test_parameter,
    's' => \$skip_marc8_conversion,
    'c:s' => \$char_encoding,
    'v:s' => \$verbose,
    'fk' => \$fk_off,
    'm:s' => \$format,
);

$format starts off with an undefined value when it is declared by my, and if you don’t supply the -m switch when running bulkmarcimport.pl, it stays undefined.

Fortunately, the script’s usage tells us what the default value of $format should be:


  m      format, MARCXML or ISO2709 (defaults to ISO2709)

Thus, we can fix that particular warning with this patch:


--- a/misc/migration_tools/bulkmarcimport.pl
+++ b/misc/migration_tools/bulkmarcimport.pl
@@ -2,7 +2,7 @@
 # Import an iso2709 file into Koha 3

 use strict;
-#use warnings;
+use warnings;
 #use diagnostics;
 BEGIN {
     # find Koha's Perl modules
@@ -30,7 +30,8 @@ use IO::File;
 binmode(STDOUT, ":utf8");

 my ( $input_marc_file, $number, $offset) = ('',0,0);
-my ($version, $delete, $test_parameter, $skip_marc8_conversion, $char_encoding, $verbose, $commit, $fk_off,$format);
+my ($version, $delete, $test_parameter, $skip_marc8_conversion, $char_encoding, $verbose, $commit, $fk_off);
+my $format = "ISO2709";

 $|=1;

We’re not done with bulkmarcimport.pl, but this is a start. More anon.

BarCampOhio aka LibraryCampOhio

Friday, August 1st, 2008 by atz

Code4Lib folks and other techs around Ohio should consider checking out the upcoming BarCampOhio / LibraryCampOhio event on Monday, August 11, 2008.

The idea is for a self-organized collaborative exchange, not a keynotable conference. So come prepared to share. OCLC will be hosting, with some very knowledgeable and interesting folks in attendance. And if you’re not into that, then you’ll still have me. :)

Hug a sysadmin at your library

Friday, July 25th, 2008 by Galen Charlton

In honor of SysAdminDay, I’d like to give a shout-out to library sysadmins everywhere. Not only do they keep the networks running, the PCs together, the OPAC up and running during a Slashdotting (hey, you never know, it just might happen), the obscure 20-year-old CD-ROM databases alive, and the various dis-integrated library applications tied together, they give the catalogers sharing the basement offices somebody to talk to!

Encoding and decoding XML data as path sequences

Friday, July 4th, 2008 by Chris Catalfo

Lately I’ve been thinking about how to represent information about XML paths and data as a string.

For example, I’d like to be able to record the origin of this data:


<titleInfo type="alternative">
<title>Special edition using XSLT</title>
</titleInfo>

as something like this (with id and data as properties in a JSON object):


{"id":"titleInfo-2@type=alternative\\title-1","data":"Special+edition+using+XSLT"}

I could then take the preceding id string, extract the provenance of the data, and recreate the original XML document.

Here’s how I’ve tried encoding the XML path and data using an XSLT stylesheet:

For each text element, create an id consisting of:

  1. Each ancestor (except the root)
  2. A dash to delimit the ancestor element’s name from its position
  3. The integer position of that node in the XML file (using )
  4. Each of the ancestor’s attributes, in the form @attrname=attrvalue
  5. A backslash to be used as a path delimiter
  6. The text element’s name

With this id, I believe I now have everything I need to reconstruct the node that the data referenced by that id came from.
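As a rough illustration of why the id carries enough information, here is a minimal JavaScript sketch that splits such an id back into its steps and rebuilds it. The function names parsePathId and buildPathId are hypothetical, and the sketch assumes that element names, attribute names, and attribute values never contain a backslash, "@", or "=".

```javascript
// Parse an id such as "titleInfo-2@type=alternative\title-1" into path steps.
// Assumption: names and values contain no "\", "@" or "=" characters.
function parsePathId(id) {
  return id.split("\\").map(function (step) {
    // The first piece is "name-position"; any remaining pieces are "attrname=attrvalue".
    var pieces = step.split("@");
    var match = /^(.+)-(\d+)$/.exec(pieces[0]);
    if (!match) {
      throw new Error("Malformed step: " + step);
    }
    var attributes = {};
    pieces.slice(1).forEach(function (pair) {
      var eq = pair.indexOf("=");
      if (eq < 0) {
        throw new Error("Malformed attribute: " + pair);
      }
      attributes[pair.slice(0, eq)] = pair.slice(eq + 1);
    });
    return {
      name: match[1],
      position: Number(match[2]),
      attributes: attributes
    };
  });
}

// Inverse of parsePathId: rebuild the id string from the parsed steps.
function buildPathId(steps) {
  return steps.map(function (step) {
    var attrs = Object.keys(step.attributes).map(function (key) {
      return "@" + key + "=" + step.attributes[key];
    }).join("");
    return step.name + "-" + step.position + attrs;
  }).join("\\");
}
```

Each parsed step holds the element name, its position, and its attributes, which is what a reconstruction routine would need to recreate the element and place the text inside it.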

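After playing around with this a bit, I realized that what I’d done was basically reinvent XPath! In XPath, the preceding path in the id string would be represented (relative to the root element, with 1-based positions and the attribute test written as a predicate) as:

titleInfo[2][@type='alternative']/title[1]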
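The mapping from an id string to an XPath location path can be sketched in JavaScript. This assumes the positions in the id are 1-based counts among same-named siblings (XPath positions are also 1-based) and that attribute values contain no single quotes. The function name pathIdToXPath is hypothetical, and the result is relative to the root element because the id omits the root.

```javascript
// Convert a path id into an XPath location path relative to the root element.
// Assumption: positions in the id are 1-based, like XPath positions.
function pathIdToXPath(id) {
  return id.split("\\").map(function (step) {
    var pieces = step.split("@");
    var match = /^(.+)-(\d+)$/.exec(pieces[0]);
    if (!match) {
      throw new Error("Malformed step: " + step);
    }
    // The position becomes the first predicate, e.g. titleInfo[2].
    var predicates = "[" + match[2] + "]";
    // Each attribute becomes a further predicate, e.g. [@type='alternative'].
    pieces.slice(1).forEach(function (pair) {
      var eq = pair.indexOf("=");
      if (eq < 0) {
        throw new Error("Malformed attribute: " + pair);
      }
      predicates += "[@" + pair.slice(0, eq) + "='" + pair.slice(eq + 1) + "']";
    });
    return match[1] + predicates;
  }).join("/");
}

// pathIdToXPath("titleInfo-2@type=alternative\\title-1")
// returns "titleInfo[2][@type='alternative']/title[1]"
```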

OK…so next idea is to see if there are libraries out in the wild wild web for creating XML documents from XPath expressions (and not just querying XML documents). I see that the Perl module XML::XPath may offer a solution.

I also wonder if this is how XForms libraries keep track of what parts of an XML document have been edited….

Gitting Used to Git

Friday, June 6th, 2008 by Andrew Moore

I have been using git a lot more efficiently recently, and I want to share some of the more advanced things that may help you get used to using git, too.

First, it helps me a lot to have some things in color. I have found these four config changes to make it a lot easier to scan git output quickly. The “diff” one is especially handy.

  • git config --global color.branch auto
  • git config --global color.status auto
  • git config --global color.diff auto
  • git config --global color.interactive auto

Second, I have found “git add --interactive” to be pretty useful. If you have changed several files and only want to commit some of them, this will present a menu-driven interface to let you pick the files to add. Even better, if you have edited a file in two places and only want to include one “chunk” in your commit, this lets you specify that. It’s great if you have added some debug code at the top or bottom that you don’t want to commit.

Next, I’ve been using “git rebase --interactive” to re-order and combine my patches to make them more readable. If you have a long string of small commits that you want to organize better, you can run “git rebase --interactive HEAD~20”. This will open an editor with the last 20 commits in it. You can reorder the lines to reorder the commits. You can also “squash” the lines to merge commits together. This will help you make more readable sets of commits.

Finally, if you have a commit that you want to split up, use “git rebase --interactive” to “edit” it. Then, “git reset HEAD^” to put yourself “back in time” to that spot. Then, you can choose only a subset of the files or patches to commit, commit them, and then optionally commit the rest.

For more help on using git, I have really found the gitcasts to be a tremendous help.

Some of these features require a newish version of git, so if yours doesn’t seem to be working like this, I recommend an upgrade.

git ‘er done!

Deciding on an API for Biblios

Saturday, May 24th, 2008 by Chris Catalfo

As I continue to work on Biblios in anticipation of its release (soon, I hope!), it is about time to decide on an API.

I have already put into place a simple macro system for batch editing of bibliographic records. The macro language is JavaScript, and macros use a MarcRecord JavaScript object to manipulate MARCXML records.

Here is a simple example (record is a MarcRecord instance):


// Check to see if record has 856.  If so, add subfield $u with url.  If not, add a new 856 with url.
if( record.hasField('856') ) {
    record.field('856').subfield('u', 'http://www.google.com');
}
else {
    record.addField( new Field('856', '', '', [ new Subfield('u', 'http://www.google.com')]) );
}
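To try a macro like this outside Biblios, one could use stand-in objects. The sketch below is not the Biblios implementation: the MarcRecord, Field, and Subfield definitions are hypothetical minimal versions that support only the calls used in the macro above, and the choice to have subfield(code, value) overwrite an existing subfield (rather than append a second one) is an assumption.

```javascript
// Hypothetical stand-ins for the Biblios classes; the real ones may differ.
function Subfield(code, value) {
  this.code = code;
  this.value = value;
}

function Field(tag, ind1, ind2, subfields) {
  this.tag = tag;
  this.ind1 = ind1;
  this.ind2 = ind2;
  this.subfields = subfields || [];
}

// With a value: set the subfield, adding it if missing (assumed behavior).
// Without a value: return the current value, or undefined if absent.
Field.prototype.subfield = function (code, value) {
  var existing = this.subfields.filter(function (sf) {
    return sf.code === code;
  })[0];
  if (value === undefined) {
    return existing ? existing.value : undefined;
  }
  if (existing) {
    existing.value = value;
  } else {
    this.subfields.push(new Subfield(code, value));
  }
  return value;
};

function MarcRecord(fields) {
  this.fields = fields || [];
}

// Return the first field with the given tag, or undefined.
MarcRecord.prototype.field = function (tag) {
  return this.fields.filter(function (f) {
    return f.tag === tag;
  })[0];
};

MarcRecord.prototype.hasField = function (tag) {
  return this.field(tag) !== undefined;
};

MarcRecord.prototype.addField = function (field) {
  this.fields.push(field);
};

// The macro from the post, wrapped in a function (addUrl is a hypothetical name).
function addUrl(record, url) {
  if (record.hasField('856')) {
    record.field('856').subfield('u', url);
  } else {
    record.addField(new Field('856', '', '', [new Subfield('u', url)]));
  }
}
```

Calling addUrl(new MarcRecord(), 'http://www.google.com') exercises the else branch, and calling it on a record that already has an 856 exercises the first branch.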

I would like to provide access to Biblios’ main functions for use by plugins. Here are a few ideas for API functions:

  • Run a search
  • Run the current search but limited to something
  • Save all search results to a folder
  • Save record with id n to a particular folder
  • Edit record with id n
  • Run a macro on all records in a folder
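To make the last idea concrete, here is one possible shape for "run a macro on all records in a folder". Nothing here is an existing Biblios API: FolderStore, saveRecord, and runMacro are hypothetical names, folders are modeled as an in-memory Map, and a macro is assumed to be a function that receives one record.

```javascript
// Hypothetical in-memory folder store: folder name -> array of records.
function FolderStore() {
  this.folders = new Map();
}

// Save a record to a folder, creating the folder on first use.
FolderStore.prototype.saveRecord = function (folderName, record) {
  if (!this.folders.has(folderName)) {
    this.folders.set(folderName, []);
  }
  this.folders.get(folderName).push(record);
};

// Apply the macro to every record in the folder; return how many were processed.
FolderStore.prototype.runMacro = function (folderName, macro) {
  var records = this.folders.get(folderName) || [];
  records.forEach(function (record) {
    macro(record);
  });
  return records.length;
};
```

Returning the number of processed records gives a plugin a simple way to report what a batch edit touched; a real API would probably also need to report per-record failures.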

I’d be interested to hear what others think: what they’re used to in other cataloging software and what commands/tools that software might be missing which could be ultimately included in Biblios.

Closing in on Koha 3.0

Sunday, May 4th, 2008 by Andrew Moore

Now that we’ve had a beta version of Koha 3.0 out for a little while, there is increased interest in getting a final version of Koha 3.0 put together soon. Paul recently started a discussion on the Koha developers list about what we need to do to get a release out the door. This includes deciding on the last-minute features we would like to include to make it a cohesive, useful product, what bugs absolutely need to be fixed, and the logistics involved in maintaining that version while we set our sights on the next version of Koha. I think that in the coming days and weeks we will see this discussion continue, along with a flurry of activity as we try to find the balance between completeness and timeliness.

How Good is Google Book Services? Ask your mother.

Tuesday, April 29th, 2008 by atz

Despite not being even remotely Irish, my mother likes to make a traditional corned-beef and cabbage dinner for our family on St. Patrick’s Day, and this year was no exception. (Sorry, no pix.) My mother is a five-foot-tall head reference librarian in a local public library system and she commands the type of incredible memory that you would expect from her profession. She makes use of this in another tradition, the singing of a particular St. Patrick’s Day hymn taught to her by nuns in grade school. Suffice it to say, my mother has not been in grade school for a while, and for a song sung only once a year, it seems remarkable to me that anybody can remember all the words without any help. I mean, how far can you get into Auld Lang Syne or Good King Wenceslas? If you say “all the way, no problem”, please remind yourself if you are currently, have ever been, or are about to become a reference librarian.

So this song is essentially about St. Patrick and the persevering quality of Ireland’s faith in him, but the pace is pretty quick and the lyrics are twisty and complicated. I’ve never heard this song anywhere else, and I didn’t know the title, but I wanted to find the lyrics. Since Google Book Services had just come out, I decided this would be my test. Searching on “St. Patrick” was never going to do it. The best I could do was remember the small, odd phrase “is bright with us yet.” In fact it was that phrase that made me want to revisit the lyrics in the first place.

GBS’s first hit was The Hymn-book of the Modern Church: Studies of Hymns and Hymn-writers (1905) by Arthur Edwin Gregory, a compendium of various hymns including two of the “Romanist” verses I was looking for, and a discussion of the author and his relative merits. Interestingly, it does not offer the title of the song, but Google’s results took me directly to the correct page so that immediately I was looking at the relevant content. Very impressive.

The second hit was even better: The Parochial Hymn Book (1897). 3 verses and a complete four-part vocal arrangement! The title, as it turns out, is “St. Patrick’s Day”, not the most helpful string to search against.

The specificity of the texts themselves is most remarkable. When the first rounds of “electronic texts” were circulated, many people were unimpressed with the experience of reading screens of flat ascii text, objecting to the sterile quality as not-bookish enough. The difference between that time and today is stark. Google’s text is not an abstract vision of the PHB’s content, rather it is photographic images from a particular book in a particular stack, with all the peculiarities of the physical original (save, perhaps, smell).

It has provenance: a hand-inked calligraphic block claims it for Andover-Harvard Theological Library of the Harvard Divinity School, with both a stamp and a bookplate noting that it comes from the estate of one Rev. Charles Hutchins on May 24th, 1939. You can even see Harvard’s call number penciled in on the verso page. The effect of these details is to greatly reinforce the validity of the text. Contrast this with a posting on any of a thousand interchangeable lyrics sites. Which would you regard as desirable or authoritative?

My experience was very much like those I suspect any fan of libraries has enjoyed, the feeling of discovering a tangible artifact of another time and place that was produced and preserved specifically so that you might encounter it, and have the information sought. Even 111 years later.

I chatted up my colleague Chris in New Zealand, one of the original Koha developers, about my GBS results. He was fairly impressed. For his test, he put in his father’s name, Ian Cormack, and was promptly returned as the first hit a link to his academic article “Creating an Effective Learning Environment for Maori Students” in Mai i Rangiatea: Maori Wellbeing and Development (1997). So there you have it: two texts separated by 100 years of time, and half the Earth in distance, accurately and immediately retrieved from one repository, for free.

Note: Google has built on to Book Services a bunch of other features, including ratings, reviews, tags and a My Library feature. In my My Library, you can see the three texts I mentioned. There is also a “Find this book in a library” link to OCLC’s WorldCat that tells me the closest (known) copy is 160 miles away.

For comparison, my search at Amazon came up empty. Which suggests another question….

Paranoia Alert!! Assume your library had a copy of the same song in one hymnbook or another. Without GBS, based on my limited query data, how long would it take me to retrieve it at your library? I should add that my search began at 10:30PM.

Perhaps most striking is that GBS would be preferable even if I were sitting in Harvard’s library where the original still resides!

New Kid on the Block

Tuesday, March 18th, 2008 by Andrew Moore

Hi, I’m Andrew Moore, the newest addition to the LibLime development team. After my first day at LibLime yesterday, I’ve actually made my first tiny addition to Koha today. I’m excited to work with the rest of the team and help improve Koha as much as I can. Although I’ve been writing Perl for a few years, I don’t have much experience with library technologies. I fully expect to goof something up spectacularly real soon. So, keep an eye out for that and please go easy on me when I do!

LibLime bibliography at LibraryThing

Friday, February 29th, 2008 by Galen Charlton

During her interview, one of the people who’s starting at LibLime next week asked for a list of books to purchase. I’ve put up a short reading list at LibraryThing and asked other LibLimers to contribute.

Of course, not everything worth reading is found in a book. Links to good web pages and blogs are easy enough to share, but how about LibraryThing for journal articles? What’s out there for sharing citations and articles?