One of the mainstream annotation formats for adding structured data to web site markup is RDFa (the others are HTML5 Microdata and JSON-LD, along with the older but limited microformats). RDFa (RDF in attributes) makes it possible to write RDF triples in (X)HTML, XML, or SVG markup as attribute values. The full RDFa syntax (RDFa Core) provides basic and advanced features for experts to express complex structured data in the markup, such as human relationships, places, and events. Those who want to express fairly simple structured data in their web documents can use the less expressive RDFa Lite, a minimal subset of RDFa that is easier to learn and suitable for most general scenarios. RDFa Lite supports the following attributes: vocab, typeof, property, resource, and prefix. In host languages that allow the href and src attributes, RDFa Lite supports them too.
A bunch of numbers has a different meaning in a
math lesson than in the telephone book, while a
word often has a different meaning in a poem than
in real life. The meaning of words depends on the
context, so in order to make computers understand
the field or area (knowledge domain), we have to
identify the machine-readable
vocabulary that defines the terminology of the
domain. In RDFa, the vocabulary can be identified
by the vocab attribute, the type of
the entity to describe is annotated by the
typeof attribute, and the properties
with the property attribute (see
Listing 1).
Listing 1. Basic Machine-Readable Annotation of a Person in RDFa
<p vocab="http://schema.org/" typeof="Person">
My name is <span property="name">Leslie Sikos</span> and you can find out more about me by visiting <a property="url" href="http://www.lesliesikos.com">my web site</a>.
</p>
Once the preceding code is published and
indexed, search engines will find the “web site of
Leslie Sikos” more efficiently. To uniquely
identify this entity on the Web, the resource
attribute is used (see Listing 2). The
resource attribute is one of the
options to set the object of statements, which is
particularly useful when referring to resources
that are not navigable links, such as the ISBN
of a book.
Listing 2. A Unique Identifier of the Entity in RDFa
<p vocab="http://schema.org/" typeof="Person" resource="#sikos">
My name is <span property="name">Leslie Sikos</span> and you can find out more about me by visiting <a property="url" href="http://www.lesliesikos.com">my web site</a>.
</p>
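To make the effect of vocab, typeof, resource, and property concrete, the following Python sketch lists the triples that the markup of Listing 2 roughly corresponds to. It is a deliberately minimal illustration built on the standard library's html.parser, not a conforming RDFa processor: the class name RdfaLiteSketch is hypothetical, nesting and prefixes are ignored, and the resource value is kept as written instead of being resolved against the document's base address.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class RdfaLiteSketch(HTMLParser):
    """Hypothetical helper: collects (subject, predicate, object) tuples."""

    def __init__(self):
        super().__init__()
        self.vocab = ""          # default vocabulary set by vocab
        self.subject = "_:b0"    # anonymous subject unless resource is given
        self.triples = []
        self._pending = None     # predicate waiting for the element's text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "vocab" in attrs:
            self.vocab = attrs["vocab"]
        if "typeof" in attrs:
            # resource names the entity; typeof states its type
            self.subject = attrs.get("resource", "_:b0")
            self.triples.append(
                (self.subject, RDF_TYPE, self.vocab + attrs["typeof"]))
        if "property" in attrs:
            predicate = self.vocab + attrs["property"]
            if "href" in attrs:
                # a link target becomes the object
                self.triples.append((self.subject, predicate, attrs["href"]))
            else:
                # otherwise the element's text becomes the object
                self._pending = predicate

    def handle_data(self, data):
        if self._pending is not None:
            self.triples.append((self.subject, self._pending, data.strip()))
            self._pending = None


LISTING_2 = (
    '<p vocab="http://schema.org/" typeof="Person" resource="#sikos">'
    'My name is <span property="name">Leslie Sikos</span> and you can find out '
    'more about me by visiting <a property="url" '
    'href="http://www.lesliesikos.com">my web site</a>.</p>'
)

parser = RdfaLiteSketch()
parser.feed(LISTING_2)
parser.close()
for triple in parser.triples:
    print(triple)
```

The three tuples describe the type, the name, and the URL of the entity identified by #sikos.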
The vocabulary declaration makes it possible to
omit the full URI from each property
(name refers to
http://schema.org/name,
url abbreviates
http://schema.org/url). However, if
you add RDFa annotation for more than one
real-world object or person, you can declare the
namespace of the vocabulary on the
html element of your (X)HTML document
(e.g., <html
xmlns:foaf="http://xmlns.com/foaf/0.1/"
…>) and associate it with a prefix that
can be reused throughout the document. Every time
you use a term from the vocabulary declared
on the top of your document, you add the prefix
followed by a colon, such as
foaf:name, schema:url,
etc. Using prefixes is not only handy but sometimes the only way to annotate your markup. For example, if you need terms from more than one vocabulary, additional vocabularies can be specified by the prefix attribute (see Listing 3). Terms from your most frequently used vocabulary (defined in the vocab attribute value) need no prefix. Terms from any other vocabulary take the prefix you declare either as the attribute value of the prefix attribute or on the html element with an xmlns attribute (xmlns: followed by the prefix name, with the namespace URI as its value). Note that RDFa 1.1 keeps the xmlns-based declaration only for backward compatibility and marks it as deprecated, so the prefix attribute is the safer choice in new markup.
Listing 3. Using the Term “Textbook” from the FaBiO Ontology
<p vocab="http://schema.org/" typeof="Person" prefix="fabio: http://purl.org/spar/fabio/"
resource="#sikos">
My name is <span property="name">Leslie Sikos</span> and you can find out more about me by visiting <a property="url" href="http://www.lesliesikos.com">my web site</a>. I am the author of <a property="fabio:Textbook" href="http://lesliesikos.com/mastering-structured-data-on-the-semantic-web/">Mastering Structured Data on the Semantic Web</a>.
</p>
To make search engines “understand” that the
provided link refers to a textbook of Leslie Sikos,
we used the machine-readable definition of
“textbook” from the FaBiO ontology. If you need
more than one additional vocabulary for your
RDFa annotations, you can add them to the attribute
value of the prefix attribute as a
space-separated list.
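The way a prefix attribute value is read can be sketched in a few lines of Python. The function names parse_prefixes and expand are hypothetical, and the sketch assumes a well-formed attribute value that alternates between "name:" and URI tokens; a real RDFa processor handles more cases.

```python
def parse_prefixes(prefix_attr):
    """Turn 'fabio: http://... foaf: http://...' into a dictionary."""
    tokens = prefix_attr.split()
    # tokens alternate between 'name:' and the namespace URI
    return {name.rstrip(":"): uri
            for name, uri in zip(tokens[0::2], tokens[1::2])}


def expand(term, vocab, prefixes):
    """Expand a term to a full URI using the prefixes or the default vocab."""
    prefix, colon, local = term.partition(":")
    if colon and prefix in prefixes:
        return prefixes[prefix] + local   # prefixed term, e.g. fabio:Textbook
    if colon:
        return term                       # assumed to be a full URI already
    return vocab + term                   # unprefixed term uses vocab


prefixes = parse_prefixes(
    "fabio: http://purl.org/spar/fabio/ foaf: http://xmlns.com/foaf/0.1/")
print(expand("fabio:Textbook", "http://schema.org/", prefixes))
print(expand("name", "http://schema.org/", prefixes))
```

With the declarations of Listing 3, name still resolves through the vocab attribute, while fabio:Textbook resolves through the prefix attribute.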
The most frequently used vocabulary namespaces are
predefined in RDFa parsers, so you can omit them in
your markup and still be able to use their terms in
RDFa annotations (see Table 1).
| Prefix | URI | Vocabulary |
|---|---|---|
| cc | http://creativecommons.org/ns# | Creative Commons Rights Expression Language |
| ctag | http://commontag.org/ns# | Common Tag |
| dcterms | http://purl.org/dc/terms/ | Dublin Core Metadata Terms |
| dc | http://purl.org/dc/elements/1.1/ | Dublin Core Metadata Element Set, Version 1.1 |
| foaf | http://xmlns.com/foaf/0.1/ | Friend of a Friend (FOAF) |
| gr | http://purl.org/goodrelations/v1# | GoodRelations |
| ical | http://www.w3.org/2002/12/cal/icaltzd# | iCalendar terms in RDF |
| og | http://ogp.me/ns# | Facebook OpenGraph |
| rev | http://purl.org/stuff/rev# | RDF Review |
| sioc | http://rdfs.org/sioc/ns# | SIOC Core |
| v | http://rdf.data-vocabulary.org/# | Google Rich Snippets |
| vcard | http://www.w3.org/2006/vcard/ns# | vCard in RDF |
| schema | http://schema.org/ | schema.org |
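One way to picture the predefined prefixes is as a fallback table that a parser consults when a prefix has not been declared in the markup. The sketch below copies three rows from the table; the names PREDEFINED and resolve are hypothetical, and letting local declarations take precedence is modeled here with ChainMap as an illustration rather than as a description of any particular parser.

```python
from collections import ChainMap

# A few rows copied from the table of predefined prefixes
PREDEFINED = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "schema": "http://schema.org/",
}


def resolve(term, declared=None):
    """Expand prefix:name, checking local declarations first."""
    prefixes = ChainMap(declared or {}, PREDEFINED)
    prefix, _, local = term.partition(":")
    return prefixes[prefix] + local   # raises KeyError for unknown prefixes


print(resolve("foaf:name"))
print(resolve("fabio:Textbook", {"fabio": "http://purl.org/spar/fabio/"}))
```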
More sophisticated annotations require
additional attributes that are supported by RDFa
Core only. Beyond the RDFa Lite attributes, RDFa
Core supports the about,
content, datatype,
inlist, rel, and
rev attributes.
The current subject is the web address of the
document or a value set by the host language, such
as the base element in (X)HTML. As a result, any
metadata written in a document will concern the
document itself by default. The about
attribute can be used to change the current subject
and state what the data is about, making the
properties inside the document body become part of
a new object rather than referring to the entire
document (as they do in the head of the
document).
If some displayed text is different from the
represented value, a more precise value can be
added using the content attribute,
which is a character data (CDATA) string to supply
machine-readable content for a literal. A value can
also optionally be typed using the
datatype attribute (see Listing 4).
Declaring the type ensures that machines can
interpret strings, dates, numbers, etc., rather
than considering them as a character sequence.
Listing 4. Using the content and datatype Attributes
<html xmlns="http://www.w3.org/1999/xhtml" prefix="xsd: http://www.w3.org/2001/XMLSchema# dc: http://purl.org/dc/terms/">
<head>
<title>Leslie’s Blog</title>
</head>
<body>
<h1 property="dc:title">Leslie’s Blog</h1>
<p>
Last modified: <span property="dc:modified"
content="2015-05-25T10:54:00-09:30"
datatype="xsd:dateTime">25 May 2015</span>.
</p>
</body>
</html>
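The benefit of content and datatype becomes visible on the consuming side. The following Python sketch shows how a consumer might pick the machine-readable value over the displayed text and interpret it according to the declared type. The function literal_value is hypothetical, and only xsd:dateTime is handled; everything else is returned as a plain string.

```python
from datetime import datetime

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def literal_value(text, content=None, datatype=None):
    """Prefer the content attribute over the displayed text, then apply the type."""
    lexical = content if content is not None else text
    if datatype == XSD_DATETIME:
        # fromisoformat understands the offset form used in Listing 4
        return datetime.fromisoformat(lexical)
    return lexical


modified = literal_value(
    "25 May 2015",
    content="2015-05-25T10:54:00-09:30",
    datatype=XSD_DATETIME,
)
print(modified.isoformat())
```

Without the content attribute, the consumer would be left with the string "25 May 2015", which carries neither a time nor a time zone.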
In RDFa, the relationship between two resources
(predicates) can be expressed using the
rel attribute (see Listing 5).
Listing 5. Describing the Relationship Between Two Resources in RDFa
This document is licensed under the
<a prefix="cc: http://creativecommons.org/ns#" rel="cc:license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons By-NC-ND License</a>.
When a predicate is expressed using
rel, the href or
src attribute on the same element
identifies the object of the statement (see
Listing 6).
Listing 6. Using href to Identify the Object
<link about="mailto:leslie@example.com" rel="foaf:knows" href="mailto:christina@example.com">
Reverse relationships between two resources
(predicates) can be expressed with the
rev attribute. The rel
and rev attributes can be used on any
element individually or together. Combining
rel and rev is
particularly useful when there are two different
relationships to express, such as when a photo is
taken by the person it depicts (see Listing 7).
Listing 7. Combining the rel and rev Attributes
<img about="http://www.example.com" src="koalahug.jpg" rev="dc:creator" rel="foaf:img" />
If a triple predicate is annotated using
rel or rev only, but no
href, src, or
resource is defined on the same
element, the represented triple will be
incomplete.
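The direction of rel and rev can be summarized in a small Python sketch. The function link_triples is hypothetical and covers a single element only: it returns nothing when no object is available on that element, which simplifies how RDFa Core treats incomplete triples.

```python
def link_triples(subject, obj=None, rel=None, rev=None):
    """Return the triples expressed by rel and rev on one element."""
    triples = []
    if obj is None:
        # no href, src, or resource on the element: nothing complete to report
        return triples
    if rel:
        triples.append((subject, rel, obj))   # subject -> object
    if rev:
        triples.append((obj, rev, subject))   # object -> subject (reversed)
    return triples


# The attribute values of Listing 7
for triple in link_triples("http://www.example.com", "koalahug.jpg",
                           rel="foaf:img", rev="dc:creator"):
    print(triple)
```

For Listing 7 this gives one statement saying that the image depicts the resource and a second, reversed statement saying that the resource is the creator of the image.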
The inlist attribute indicates that
the object generated on the element is part of a
list sharing the same predicate and subject (see
Listing 8). Only the presence of the
inlist attribute is relevant; its
attribute value is always ignored.
Listing 8. Using the inlist Attribute
<p prefix="bibo: http://purl.org/ontology/bibo/ dc: http://purl.org/dc/terms/" typeof="bibo:Website">
The web site <span property="dc:title">Andrew Peno Graphic and Fine Artist</span> by <a inlist="" property="dc:creator" href="http://www.andrewpeno.com">Andrew Peno</a> and <a inlist="" property="dc:creator" href="http://www.lesliesikos.com">Leslie Sikos</a>.
</p>
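In RDF terms, objects collected with inlist form an ordered collection built from rdf:first and rdf:rest links that ends in rdf:nil. The Python sketch below spells out that structure for the two creators of Listing 8. The function list_triples and the blank node labels (_:site, _:b0, _:b1) are hypothetical and chosen only for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_triples(subject, predicate, items):
    """Chain the items into an RDF collection attached to subject/predicate."""
    if not items:
        return [(subject, predicate, RDF + "nil")]
    nodes = [f"_:b{i}" for i in range(len(items))]   # one list node per item
    triples = [(subject, predicate, nodes[0])]
    for i, item in enumerate(items):
        # the last node points to rdf:nil to close the list
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


creators = list_triples(
    "_:site",
    "http://purl.org/dc/terms/creator",
    ["http://www.andrewpeno.com", "http://www.lesliesikos.com"],
)
for triple in creators:
    print(triple)
```

Without inlist, the same markup would produce two independent dc:creator statements with no guaranteed order.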
RDFa DOM API
The RDFa DOM API provides programmatic access to structured data expressed in RDFa on a web page.
You can read more about RDFa in the book Mastering Structured Data on the Semantic Web.
