Thursday 15 September 2011

VertNet and the GBIF Integrated Publishing Toolkit

(A guest post from our friends at VertNet, cross-posted from the VertNet blog)

This week we’d like to discuss the current and future roles of the GBIF Integrated Publishing Toolkit (IPT) in VertNet. IPT is a Java-based web application that allows a user to publish and share biodiversity data sets from a server. Here are some of the things IPT can do:

  1. Create Darwin Core Archives. In our post about data publishing last week, we wrote about Darwin Core being the “language of choice” for VertNet. IPT allows publishers to create Darwin Core data records from either files or databases and to export them in zipped archive files that contain exactly what is needed by VertNet for uploading.
  2. Make data available for efficient indexing by GBIF. VertNet has an agreement with its data publishers that, by participating, they will also publish data through GBIF. GBIF keeps our registry of data providers and uses this registry to find and update data periodically from the original sources to make it available through the GBIF data portal. IPT gives data publishers an easy means of keeping their data up-to-date with GBIF.
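
For readers curious about what such a zipped archive holds, here is a minimal sketch in Python. It is not how IPT builds archives internally; it only illustrates the general shape of a Darwin Core Archive: a delimited data file plus a meta.xml descriptor that maps each column to a Darwin Core term. The function name build_archive, the file name occurrence.txt, and the sample record are assumptions made up for this illustration.

```python
import csv
import io
import zipfile

DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"


def build_archive(rows, fields):
    """Return the bytes of a zip holding occurrence.txt and meta.xml.

    Hypothetical helper for illustration; the first entry in `fields`
    is treated as the record identifier column.
    """
    # Data file: tab-delimited, one header line, one record per line.
    data = io.StringIO()
    writer = csv.writer(data, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for row in rows:
        writer.writerow([row.get(name, "") for name in fields])

    # Descriptor: map every column after the id to a Darwin Core term URI.
    field_lines = "\n".join(
        f'    <field index="{i}" term="{DWC_TERMS}{name}"/>'
        for i, name in enumerate(fields)
        if i > 0
    )
    meta = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" '
        'linesTerminatedBy="\\n" ignoreHeaderLines="1" '
        f'rowType="{DWC_TERMS}Occurrence">\n'
        '    <files><location>occurrence.txt</location></files>\n'
        '    <id index="0"/>\n'
        f'{field_lines}\n'
        '  </core>\n'
        '</archive>\n'
    )

    # Zip both files in memory and hand back the raw bytes.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", data.getvalue())
        archive.writestr("meta.xml", meta)
    return buffer.getvalue()


# Made-up sample record, not real collection data.
records = [
    {"occurrenceID": "demo-1", "scientificName": "Puma concolor", "country": "Mexico"},
]
archive_bytes = build_archive(records, ["occurrenceID", "scientificName", "country"])
```

Writing archive_bytes to a file ending in .zip gives something you can open and inspect by hand. A real archive produced by IPT typically also carries dataset metadata, which this sketch leaves out.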

IPT can help with the data publishing process in other ways as well:

  • standardizing terms
  • validating records before they get published
  • adding default values for fields that aren’t in the original data

To get a better understanding of IPT’s capabilities, take a look at the IPT User Manual.

Why are we using IPT?

VertNet has a long waiting list of organizations (65 to date) that have expressed interest in making their data publicly accessible through VertNet. In the past, these institutions would have needed their own server and specialized software (DiGIR) for publishing to the separate vertebrate networks. We’d rather not require any of these participants to buy servers if they don’t have to. As an interim solution, we’re using the IPT to make data available online while we build VertNet. We have installed, at the University of Kansas Biodiversity Institute, an IPT that can act as a host for as many collections as are interested. The service is shared, yet organizations can maintain their own identity and data securely within this hosted IPT. This is a big win for us at VertNet, because there will be fewer servers to maintain and we can get more collections involved more quickly.

Going forward…

Well before it is complete, VertNet will support simple and sustainable publishing by letting publishers upload records from text files in Simple Darwin Core form. Because of this, the IPT will not be a required component of data publishing for VertNet. Rather, we see IPT as a great tool to facilitate the creation of Darwin Core Archives, which we will be able to use to upload data to VertNet.

Interested in publishing now with IPT?

We currently have two institutions sharing their collections with VertNet and GBIF through the VertNet IPT, and we’re working with several more.

So, if you are or would like to be a vertebrate data publisher and would like to make your data accessible as Darwin Core Archives sooner rather than later, VertNet’s IPT might be the solution for you! Learn more about the process on the VertNet web site or email Laura Russell and Dave Bloom.

Posted by Laura Russell, VertNet Programmer; John Wieczorek, Information Architect; and Aaron Steele, Information Architect
