
Issues list: Vocabularies in the Virtual Observatory

IVOA Note

Working Group
Semantics
This version
Issues and errata as of Revision 712, 2008-07-29 11:37:20 +0100 (Tue, 29 Jul 2008)
Editors
Norman Gray
Alasdair J G Gray


1. Introduction

This is the list of errata (2. Errata) and major issues (3. The issues list) for the vocabularies work; the latest version of the document itself is at http://www.ivoa.net/Documents/latest/Vocabularies.html. This does not include minor issues more concerned with the fine details of maintaining and distributing the vocabularies; such minor issues might be better handled using the Volute issues list.

The list is present here as a record of the points at general issue, and (post standardisation) as a reference pointing to some rationale for the design decisions in the standardised document.

2. Errata

There is as yet not a single erratum for the document (does this make it temporarily perfect?).

3. The issues list

The sections below are intended to be a log of the various options and eventual conclusions, rather than a summary of the arguments. For those, see the linked online discussions.

CLOSED: [masterformat-1] Format of the master vocabulary

[Issue summarised here in some detail since it hasn't had much airing on-list]

The distributed (and normative) SKOS files are generated, to a greater or lesser extent, from master files. The generation might be a conversion from some completely different format, as with the IAUT files, which originate in the easily parsed native format of the Lexicon application that originally managed them (see Shobbrook and Shobbrook discussion), or it might be a relatively lightweight processing step that adds missing but mechanically inferrable relations.
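As an illustration of the lightweight end of that range, the following is a minimal sketch of a post-processing step that adds the inverse of each hand-written skos:broader, skos:narrower or skos:related link. It assumes the master data has already been read into plain (subject, predicate, object) tuples; the function name, the example.org URIs and the tuple representation are invented for this sketch and are not part of any IVOA tool.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples of strings.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# SKOS properties for which a link in one direction implies a link in the other.
INVERSES = {
    SKOS + "broader": SKOS + "narrower",
    SKOS + "narrower": SKOS + "broader",
    SKOS + "related": SKOS + "related",  # skos:related is symmetric
}


def add_inferable_relations(triples):
    """Return the input triples plus the inverse of every broader/narrower/related link."""
    completed = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSES.get(predicate)
        if inverse is not None:
            completed.add((obj, inverse, subject))
    return completed


# Hypothetical master data: only the 'broader' direction was written by hand.
master = {
    (
        "http://example.org/vocab#spiralGalaxy",
        SKOS + "broader",
        "http://example.org/vocab#galaxy",
    ),
}
distributed = add_inferable_relations(master)
```

Here `distributed` holds both the original skos:broader triple and the generated skos:narrower triple, so the maintainer need only edit one direction of each link.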

Question: what should be the format of the master files?

Possible resolution 1: nothing mandated in the document -- the format of the master file should be whatever is most convenient, as long as the generated and distributed files are valid SKOS. [This says: there is no need for the IVOA to specify this, as it's purely private to the vocabulary maintainers]

Possible resolution 2: SKOS, in Turtle notation, possibly requiring some post-processing to add omitted-but-inferrable relations. This is easy to read and write, and it is simple enough that it would be feasible to create from scratch a parser for the relevant subset of it, if that were somehow necessary. [This says: what we're distributing -- SKOS -- might as well be the format we edit, so we mandate that, for the sake of simplicity]

Possible resolution 3: some more fundamental no-punctuation format, such as that for the Lexicon program. [This says: we want to be completely technology-agnostic, and even SKOS is too hard to parse, post-apocalypse]

Resolution: option (1) above – nothing mandated. Only the distribution format is to be specified (no objections on the list, agreed at Semantics Session Trieste InterOp, May 2008).

Discussion (such as it was): 2008 Feb 4 (+ thread).

CLOSED: [distformat-2] Format of the distributed vocabularies

Question: in which format should vocabularies be distributed?

Possible resolution 1: the standard simply mandates that they be distributed in at least one well-known RDF format (which means either RDF/XML or Turtle, which is equivalent to N3 for this purpose). This implies that an RDF parser will, realistically, be required in order to process the vocabulary files.

Possible resolution 2: the standard requires them to be distributed in a format which is parseable as RDF, but which is also regular enough that it's usefully interpretable as ‘normal’ XML.

Resolution: variation of option (1) above – distribution must be in the RDF/XML serialisation, optionally accompanied by a Turtle serialisation. More rationale has been added to the document. (Agreed at Semantics Session, Trieste InterOp, May 2008.)
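To make the distinction between the two options concrete, the following sketch reads concept labels from an RDF/XML file using only a generic XML parser, which is the kind of processing option (2) had in mind. The sample fragment and its example.org URIs are hypothetical, and the code works only because the fragment is written in a deliberately regular style: RDF/XML can serialise the same graph in many other shapes (for example with rdf:Description elements and explicit rdf:type properties), and for those an RDF parser would be needed, as option (1) notes.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical vocabulary fragment, written in a deliberately regular RDF/XML style.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/vocab#galaxy">
    <skos:prefLabel xml:lang="en">galaxy</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/vocab#spiralGalaxy">
    <skos:prefLabel xml:lang="en">spiral galaxy</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/vocab#galaxy"/>
  </skos:Concept>
</rdf:RDF>
"""


def read_labels(rdfxml_text):
    """Map each concept URI to its first preferred label, using only an XML parser."""
    root = ET.fromstring(rdfxml_text)
    labels = {}
    # Only top-level skos:Concept elements are recognised; other RDF/XML shapes are missed.
    for concept in root.findall(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        label = concept.find(f"{{{SKOS}}}prefLabel")
        if uri is not None and label is not None:
            labels[uri] = label.text
    return labels


labels = read_labels(SAMPLE)
```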

Discussion: 2008 Jan 21, 28 (+ thread), 2008 Feb 4.

CLOSED: [versioning-3] Identifying vocabulary versions

Question: do vocabulary users refer to a concept URI with an explicit version, or to a constant URI which always refers to the latest version?

Possible resolution 1: users always refer to the same concept URI, as for example in http://myvocab.org/myvocab#mytoken, and this refers, either by redirection or server-internal URI rewriting, to the latest version of the vocabulary. The Dublin Core metadata set at http://purl.org/dc/terms/ does this [std:dublincore].

Possible resolution 2: users refer to a concept URI without a version; this URL returns a vocabulary with a versioned namespace (this can probably be excluded, since it violates the good practice of having a namespace be retrievable at its own URL).

Possible resolution 3: users will refer to concepts which have a version explicit within the namespace, as for example in http://myvocab.org/myvocab-v1.1#mytoken (the precise location of the version number or date in the URI is arguably a distribution/maintenance detail).

Resolution: option (1) above – the normal case will be to refer to an unversioned URI. (Agreed at Semantics Session, Trieste InterOp, May 2008.)
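The relationship between the unversioned URI and the current version can be sketched as a simple lookup. This is only a model of the mapping: a real deployment would implement it with HTTP redirection or server-internal rewriting, and the table and function names here are invented. Since the fragment (#mytoken) is never sent to the server, only the namespace document is mapped, and the fragment is re-attached afterwards.

```python
from urllib.parse import urldefrag

# Hypothetical table kept by the vocabulary publisher: unversioned namespace
# document -> the versioned document that is currently the latest.
LATEST = {
    "http://myvocab.org/myvocab": "http://myvocab.org/myvocab-v1.1",
}


def resolve_latest(concept_uri, latest=LATEST):
    """Return the versioned URI that an unversioned concept URI currently denotes."""
    document, fragment = urldefrag(concept_uri)
    # Namespaces that are not in the table pass through unchanged.
    versioned = latest.get(document, document)
    return f"{versioned}#{fragment}" if fragment else versioned


resolved = resolve_latest("http://myvocab.org/myvocab#mytoken")
```

Under this scheme, publishing a new version means changing only the table entry; URIs already written into users' data stay the same.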

References: see [berrueta08], [sauermann08].

Discussion: 2008 Jan 21, 28 (+ threads), 31, 2008 Feb 4.

CLOSED: [maintenance-4] Who maintains vocabularies?

Question: By whom, and by what process, are vocabularies maintained?

This is a different issue from CLOSED: [versioning-3] Identifying vocabulary versions: that issue is concerned with how versions are identified, whereas this one is concerned with who manages the changes which become necessary as a vocabulary evolves.

Option 1: the vocabularies in the standardised document are regarded purely as examples, with no normative force and no specified maintenance process.

Option 2: the document's vocabularies are normative, and the document should define a maintenance process, possibly modelled on the UCD process [std:ucdmaint].

Option 3: the document's vocabularies are normative, but not claimed to be more than merely adequate. They will not be developed as part of this standard's evolution, but instead be maintained by other interest groups, either within or outwith the IVOA process.

Are there minimal standards of curation which conforming vocabularies must abide by? For example, need we require vocabulary maintainers to use the <skos:changeNote> mechanisms, or just rely on their good sense?

Resolution: Option 3. The final published standard will include a number of SKOS vocabularies produced as part of this process. These will be usable and citable, and the community will be encouraged to use them, but they will not be maintained after the standard is complete. Instead, the ‘owners’ of the underlying vocabularies (for example the UCD maintenance group) will be encouraged to maintain the SKOS version alongside their other forms. In particular, the IVOA-T vocabulary will be developed and maintained in a parallel standard to this one. (Agreed at Semantics Session, Trieste InterOp, May 2008.)

Discussion: 2008 Jan 31, 2008 Feb 14.

CLOSED: [vocabset-5] What vocabularies are included in the standard?

Irrespective of the resolution to issue CLOSED: [maintenance-4] Who maintains vocabularies?, there will be a set of vocabularies included in the document, either as samples, or as an initial specification. Question: What should this set contain?

There are six vocabularies which have been associated with the draft standardisation process: the A&A, AOIM, UCD1, IAU-93, constellations and IVOAT vocabularies named in the resolution below.

In addition, there are multiple informal keyword lists associated with the VOEvent arena (see Roy's message and Rob's). These haven't been SKOSified at all, and Rick's excellent suggestion is that these be left as homework for the VOEvent group.

Resolution: include five vocabularies. The A&A, AOIM, UCD1, IAU-93 and constellations vocabularies will be finished and immediately usable (see the resolution on maintenance in [maintenance-4]). The IVOAT vocabulary will be developed in a parallel process to this vocabularies standard: it will be referred to, and a snapshot of it may be included in the standard, but it will be clearly marked as a work-in-progress. (Agreed at Semantics Session, Trieste InterOp, May 2008.)

Discussion: 2008 Jan 31, 2008 Feb 4, 2008 Feb 4 (VOEvent list), 7, 14; wiki page.

CLOSED: [mappings-6] Inclusion of mappings in vocabularies

Should mappings between vocabularies be in this standard, and if so, how closely bound should they be to the vocabulary itself?

The early-2008 editors' draft of the SKOS standard [std:skosref] included inter-vocabulary mappings, which had hitherto been separate from the intra-vocabulary links in the SKOS core. The question of mappings in the SKOS standard is still (early 2008) in flux.

Question: how do we accommodate this uncertainty in the IVOA Vocabularies standard? And how do we advise mappings to be published?

Consideration 1: The mappings spec is still in flux, and likely to remain so for some time after the SKOS core document is standardised.

Consideration 2: Norman would hope to see the situation developing where there are multiple third-party mappings between vocabularies, maintained by specific communities, or which describe mappings at different levels of granularity, or which represent significant (publication-worthy?) labour on the part of individuals, adding value to the network of vocabularies.

Resolution: include mappings in this standard, using the current SKOS working draft version. This would require the standard to be updated if the format of mappings changes in the SKOS standard.

Crucially, vocabularies and the mappings between them are conceptually separate entities, although they will in practice likely be maintained together.
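A minimal sketch of that separation follows: the mapping set is held as its own collection of triples, which refers to concepts in two vocabularies but belongs to neither, so a third party could publish it independently. The namespaces and concept names are hypothetical. The property names exactMatch and broadMatch are assumed to be those of the SKOS mapping properties; since the mapping part of the specification is still in flux, they should be checked against whichever draft is current.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical namespaces for two vocabularies; neither is a real IVOA URI.
VOCAB_A = "http://example.org/vocabA#"
VOCAB_B = "http://example.org/vocabB#"

# A mapping set, published separately from either vocabulary.
mappings = {
    (VOCAB_A + "spiralGalaxy", SKOS + "exactMatch", VOCAB_B + "SpiralGalaxies"),
    (VOCAB_A + "barredSpiral", SKOS + "broadMatch", VOCAB_B + "SpiralGalaxies"),
}


def translate(concept_uri, mapping_triples, predicate=SKOS + "exactMatch"):
    """Return the concepts that mapping_triples link to concept_uri via predicate."""
    return sorted(
        obj
        for subject, pred, obj in mapping_triples
        if subject == concept_uri and pred == predicate
    )


targets = translate(VOCAB_A + "spiralGalaxy", mappings)
```

Because `translate` takes the mapping set as an argument, an application can swap in a different community's mappings, or one at a different level of granularity, without touching either vocabulary.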

Appendices

Bibliography

[berrueta08] Diego Berrueta and Jon Phipps.
Best practice recipes for publishing RDF vocabularies. W3C Working Draft, January 2008. [Online].
[sauermann08] Leo Sauermann and Richard Cyganiak.
Cool URIs for the semantic web. W3C Interest Group Note, March 2008. [Online].
[std:dublincore] DCMI Usage Board.
DCMI metadata terms. DCMI Recommendation, 2006. [Online].
[std:skosref] Alistair Miles and Sean Bechhofer, editors.
SKOS reference. W3C Working Draft, June 2008. [Online].
[std:ucdmaint] Andrea Preite Martinez and Sébastien Derriere, editors.
Maintenance of the list of UCD words. IVOA Recommendation, May 2006. [Online].