Slide - American Library Association
Linked Library Data
Tuning Library Metadata for the [Semantic] Web
Presented 2011-03-16
ALCTS RDA Webinar Series
Corey A Harper
Topical Overview
Semantic Web & RDF Intro
Linked Open Data
[Linked] Library Data
Resource Description and Access (RDA)
Beyond MARC
As RDF Vocabularies
Broader Interoperability
Small steps forward…
2011-03-16
Harper - Linked Library Data - RDA Webinar Series
Hosted by the Association for Library Collections and Technical Services
Semantic Web
Tim Berners-Lee’s (TBL’s) original vision
“Weaving the Web” (1999)
Then: Focus on Machine Reasoning
Scientific American Article
Now: Focus on things & links
Reasoning & Inferencing less central
Semantic Web
Originally:
Metadata standard built on XML
Metadata about “Web” things (documents)
Eventually:
Metadata about all sorts of things
And about relationships between things
What are the “things”?
Semantic Web Terminology
Resource: Any “thing”
Class: Abstraction of a type of thing
Individual: An instance of a class
Property: An attribute of an individual
Statement/Triple:
A Resource (subject)
A Property (predicate / verb)
A Value (object) - Nodes
Graph: Visual Representation of statements
Ontology: A domain-specific collection of classes and properties
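To make the terminology concrete, here is a minimal sketch in plain Python (no RDF library) of a few statements written as subject / predicate / object triples. The `http://example.org/` URIs and the `author` and `title` property names are hypothetical placeholders rather than terms from a real vocabulary; only the `rdf:type` URI is a real RDF term.

```python
from collections import namedtuple

# A statement (triple): a resource, a property, and a value.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

EX = "http://example.org/"  # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A graph is simply a set of statements.
graph = {
    Triple(EX + "book/1", RDF_TYPE, EX + "Book"),            # individual of class Book
    Triple(EX + "book/1", EX + "author", EX + "person/7"),   # value is another resource
    Triple(EX + "book/1", EX + "title", "Weaving the Web"),  # value is a literal string
    Triple(EX + "person/7", RDF_TYPE, EX + "Person"),
}


def individuals_of(cls, triples):
    """Return the resources stated to be instances of the class `cls`."""
    return sorted(t.subject for t in triples
                  if t.predicate == RDF_TYPE and t.object == cls)


def properties_of(resource, triples):
    """Return the (property, value) pairs stated about `resource`."""
    return sorted((t.predicate, t.object) for t in triples
                  if t.subject == resource)
```

Calling `individuals_of(EX + "Book", graph)` lists the one book, and `properties_of(EX + "book/1", graph)` returns its three statements.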
Semantic Web Terminology
Nodes: The Subjects and Objects in a Graph
Arcs: The Predicates in a Graph
Domains and Ranges: Constraints on Nodes
Domain: What things can be subjects
Range: What things (or strings) can be objects
Literals: Values as strings rather than things
Named Graphs: Graphs with URIs treated as nodes.
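The domain, range and literal ideas above can be sketched as a small check on a single triple. This is an illustration under stated assumptions: the namespace, classes and property declarations are hypothetical, and the check follows the slide's "constraints" framing, whereas in RDFS itself domain and range declarations are formally used to infer the types of nodes rather than to reject data.

```python
EX = "http://example.org/"  # hypothetical namespace

# Which class each node belongs to (normally stated with rdf:type triples).
types = {EX + "book/1": EX + "Book", EX + "person/7": EX + "Person"}

# Hypothetical property declarations: the domain constrains subjects,
# the range constrains objects. "Literal" marks a string-valued range.
schema = {
    EX + "author": {"domain": EX + "Book", "range": EX + "Person"},
    EX + "title": {"domain": EX + "Book", "range": "Literal"},
}


def is_literal(value):
    """Sketch only: treat anything that is not an HTTP URI as a literal."""
    return not value.startswith("http://")


def conforms(subject, predicate, obj):
    """Check one triple against the declared domain and range."""
    rule = schema[predicate]
    if types.get(subject) != rule["domain"]:
        return False
    if rule["range"] == "Literal":
        return is_literal(obj)
    return types.get(obj) == rule["range"]
```

For example, `conforms(EX + "book/1", EX + "author", EX + "person/7")` holds, while swapping subject and object fails the domain check.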
Linked Open Data
Use URIs as names for things
Use HTTP URIs so that people can look up those names.
When someone looks up a URI, provide useful information.
Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData.html
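The four principles can be sketched with a toy, in-memory "web", where looking up a URI returns a description whose values may themselves be URIs. Everything here is an assumption made for illustration: the `example.org` URIs, the property names, and the `look_up` function, which stands in for a real HTTP request.

```python
# A toy "web": each URI names a thing and maps to a description of it.
# All URIs and property names are hypothetical.
WEB = {
    "http://example.org/book/1": {
        "title": "Weaving the Web",
        "author": "http://example.org/person/7",
    },
    "http://example.org/person/7": {
        "name": "Tim Berners-Lee",
        "birthPlace": "http://example.org/place/london",
    },
    "http://example.org/place/london": {"label": "London"},
}


def look_up(uri):
    """Stand-in for an HTTP GET on the URI; returns useful information."""
    return WEB.get(uri, {})


def discover(start):
    """Follow links between descriptions to discover more things."""
    seen, queue = [], [start]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for value in look_up(uri).values():
            if value.startswith("http://"):  # a link to another thing
                queue.append(value)
    return seen
```

Starting from the book's URI, `discover` reaches the person and then the place, purely by following the links each description includes.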
Data in the Cloud
Hubs in the May 2008 Version:
FOAF
DBpedia
Geonames
MusicBrainz
Myriad Sources coming online:
Thomson Reuters
New York Times
British Broadcasting Corporation
Government Data (UK, US and more)
Google and Facebook
More and More Library, Archive and Museum Data
DBpedia
Structured Wikipedia Data
Genres, Influences, External Links
Multi-lingual / Multi-script labels
Rich Semantics
Many linkages to other datasets
DBpedia Model
Partial basis in data entry conventions
Infoboxes and Infobox Templates
Metadata Entry Format
Partial source of Ontology
Class Structure
Vocabulary Design
DBpedia
3.4 Million “things” described
Ontology based on “infoboxes”
1.5 million things classified
http://wiki.dbpedia.org/Ontology
Approx. 50,000 “Properties”
Approx. 1,200 defined in ontology
What *things* are in our data???
…Library data is extremely complicated
Library Metadata
Rich stores of MARC, MODS, &c.
Robust Controlled Vocabularies
Subject Heading lists
Code lists
Thesauri
Emerging data model in FR*
Bibliographic Vocabs
Bibliographic Ontology (Bibo)
Zotero, Omeka, EPrints and Others
FRBR – unofficial
And now Official (Thank you IFLA!)
ISBD
Resource Description and Access (RDA)
Linked Library [Archive, Museum] Data
LIBRIS (Swedish Union Catalog)
Library of Congress (LCSH, OSI)
German National Library
Hungarian National Library
British Library
Europeana
Archives Hub & LOCAH
Library Authority Data
“Include links to other URIs, so that they can discover more things.”
Short of providing and linking to URIs, this *is* authority data.
This is what our authority files are for.
Library Controlled Vocabularies: Benefits
Reputation - Trusted Tradition
Mature - Time tested and carefully developed
General & Comprehensive - Cover large knowledge spaces
SKOS
Simple Knowledge Organization System
Properties and Classes for describing Controlled Vocabulary
Heavily used in Linked Library Data
id.loc.gov
Virtual International Authority File (VIAF)
Diagram labels: skos:primaryTopic, bibo:book, skos:subject
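As a rough illustration of what SKOS properties describe, the sketch below models a tiny controlled vocabulary as triples and walks its hierarchy. The SKOS namespace and the `prefLabel`, `altLabel` and `broader` properties are real SKOS terms; the `example.org` concept URIs and the labels are hypothetical and are not taken from id.loc.gov or VIAF.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/subjects/"  # hypothetical vocabulary namespace

# Each concept has a URI, labels, and links to broader concepts.
triples = [
    (EX + "cats", SKOS + "prefLabel", "Cats"),
    (EX + "cats", SKOS + "altLabel", "Felis catus"),
    (EX + "cats", SKOS + "broader", EX + "mammals"),
    (EX + "mammals", SKOS + "prefLabel", "Mammals"),
    (EX + "mammals", SKOS + "broader", EX + "animals"),
    (EX + "animals", SKOS + "prefLabel", "Animals"),
]


def pref_label(concept):
    """Return the preferred label stated for a concept, if any."""
    for s, p, o in triples:
        if s == concept and p == SKOS + "prefLabel":
            return o
    return None


def broader_chain(concept):
    """Walk skos:broader links upward and return the labels encountered."""
    chain = []
    current = concept
    while True:
        parents = [o for s, p, o in triples
                   if s == current and p == SKOS + "broader"]
        if not parents:
            return chain
        current = parents[0]  # sketch: follow only the first broader concept
        chain.append(pref_label(current))
```

Because each heading is a URI rather than a string, the same concept can carry several labels and still be linked unambiguously from bibliographic descriptions.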
Other Vocabularies
Thesaurus for Economics
French Subject Headings
Swedish Subject Headings
IconClass (not on web yet)
OCLC Terminology Services
Dewey Decimal Classification
Virtual International Authority File
Metadata Authority Description Schema (MADS)
Resource Description and Access
Current focus on MARC
Much criticism
Within MARC, not a tremendous change
Different problems outside of MARC
Possible focus outside of MARC
RDA as realization of FRBR
RDA as Metadata Vocabularies
RDA as related to Bibo
RDA as Metadata Vocabularies
RDA elements, roles and vocabularies have been provisionally registered
IFLA FRBRer and ISBD elements and vocabularies have been officially registered
Discussions about long term maintenance of both RDA and the vocabularies
Effort to create multi-language RDA Vocabularies
Slide Adapted from Diane Hillmann
Metadata Registries
Formerly NSDL Registry
Now “Open Metadata Registry”
Managing Vocabularies
Providing Vocabulary Services
RDA – Now adding translations
IFLA Work
FRBR, FRAD, FRSAD, ISBD
RDA as realization of FRBR
What will this look like?
Probably *won’t* be stored in MARC
Overly constrained by FRBR?
Properties have FRBR domains & ranges
Unofficial “Generalized” properties
Non-FRBR metadata
Similar to DCMI’s range constraints…
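One way to picture the difference between FRBR-constrained and "generalized" properties is the sketch below. It is an assumption-laden illustration, not the registered RDA vocabulary: the namespace, the property names `titleOfTheWork` and `title`, the `Work` and `Photograph` classes, and the `generalizes_to` link are all hypothetical.

```python
EX = "http://example.org/rda/"  # hypothetical namespace, not the real registry

# Hypothetical property declarations. The constrained property carries a
# FRBR-style domain; its generalized counterpart leaves the domain open.
properties = {
    EX + "titleOfTheWork": {"domain": EX + "Work", "generalizes_to": EX + "title"},
    EX + "title": {"domain": None, "generalizes_to": None},
}


def usable(prop, subject_class):
    """A property is usable if it has no domain or the domain matches."""
    domain = properties[prop]["domain"]
    return domain is None or domain == subject_class


def pick_property(prop, subject_class):
    """Fall back to the generalized property when the domain does not fit."""
    if usable(prop, subject_class):
        return prop
    return properties[prop]["generalizes_to"]
```

A description that does not follow the FRBR model (here, a hypothetical `Photograph` class) can still use the generalized property, which is the appeal of the unconstrained versions for non-FRBR metadata.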
Support Free Range Metadata!
Photo Credit: http://www.flickr.com/photos/ciwf/3217378769/
BIBO and RDAVocab
Open question re: alignment
Simplified view of Bib Data is useful
Interlinking with more general data
Interlinking with non-library domain data
FRBR as internal model for library domain
Examples
Why Does This Matter?
Our descriptions no longer stand alone!
Connect our data with the rest of the Web
Allow others to reuse more easily
FOAF, Geonames
DBpedia
MusicBrainz
New York Times, Thomson Reuters
Government Data - data.gov
British Broadcasting Corporation
Other Library, Archive and Museum Data
Conclusions
Distributed bibliographic control environment
Linking Data
Focus on identification over description
“In short, by treating values as non-literal resources and assigning URIs to them we give ourselves (and others) the hooks on which to hang further descriptions.” - Andy Powell
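The point of the quotation can be sketched by comparing a literal value with a non-literal one. The namespace, property names and identifiers below are hypothetical and chosen only for illustration.

```python
EX = "http://example.org/"  # hypothetical namespace

# Literal value: the author is only a string, so nothing more can be said.
with_literal = [(EX + "book/1", EX + "author", "Berners-Lee, Tim")]

# Non-literal value: the author is a resource with its own URI...
with_uri = [
    (EX + "book/1", EX + "author", EX + "person/7"),
    # ...so anyone can hang further descriptions on that URI.
    (EX + "person/7", EX + "name", "Berners-Lee, Tim"),
    (EX + "person/7", EX + "birthPlace", EX + "place/london"),
]


def further_descriptions(triples, book):
    """Collect statements made about the values of a book's properties."""
    values = {o for s, p, o in triples if s == book}
    return [(s, p, o) for s, p, o in triples if s in values]
```

With the literal form, `further_descriptions` finds nothing to follow; with the URI form, it returns the two statements about the person.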
Future Work
“Records” in Linked Library Data
Vocabulary Alignment and Interoperability
DCMI planning in this space
General Metadata Interoperability
Application Profiles?
Archival Data for *context* - (EAC-CPF)
W3C Linked Library Data Incubator
Collecting, Curating and Clustering over 50 Use Cases
Mining use cases for functional requirements and design patterns
Recommendations to W3C
Should lead to Working Groups
http://www.w3.org/2005/Incubator/lld/
Other Activities
ALCTS/LITA Linked Library Data IG
IFLA Semantic Web IG
https://wiki.d-nb.de/x/vA10Ag
Open Knowledge Foundation
http://okfn.org/
CKAN Linked Library Data Group:
http://ckan.net/group/lld
Thanks!
Questions?
corey.harper@nyu.edu
212.998.2479
@chrpr