RDFa is one of the mainstream annotation formats for adding structured data to web site markup (the others are HTML5 Microdata and JSON-LD, along with the older but limited microformats). RDFa (RDF in attributes) makes it possible to write RDF triples in (X)HTML, XML, or SVG markup as attribute values. The full RDFa syntax (RDFa Core) provides basic and advanced features for experts to express complex structured data in the markup, such as human relationships, places, and events. Those who want to express fairly simple structured data in their web documents can use the less expressive RDFa Lite, a minimal subset of RDFa that is easier to learn and suitable for most general scenarios. RDFa Lite supports the following attributes: vocab, typeof, property, resource, and prefix. In host languages that allow the href and src attributes, RDFa Lite supports them too.
A series of numbers means one thing in a math lesson and another in a telephone book, and a word often has a different meaning in a poem than in everyday use. Because the meaning of words depends on context, computers can interpret the field or area (knowledge domain) only if we identify the machine-readable vocabulary that defines the terminology of that domain. In RDFa, the vocabulary is identified by the vocab attribute, the type of the entity being described is annotated with the typeof attribute, and its properties with the property attribute (see Listing 1).
Listing 1. Basic Machine-Readable Annotation of a Person in RDFa
<p vocab="http://schema.org/" typeof="Person">
My name is <span property="name">Leslie Sikos</span> and you can find out more about me by visiting <a property="url" href="http://www.lesliesikos.com">my web site</a>.
</p>
Once the preceding code is published and indexed, search engines will find the “web site of Leslie Sikos” more efficiently. To uniquely identify this entity on the Web, the resource attribute is used (see Listing 2). The resource attribute is one of the options for setting the object of statements, which is particularly useful when referring to resources that are not navigable links, such as the ISBN of a book.
Listing 2. A Unique Identifier of the Entity in RDFa
<p vocab="http://schema.org/" typeof="Person" resource="#sikos">
My name is <span property="name">Leslie Sikos</span> and you can find out more about me by visiting <a property="url" href="http://www.lesliesikos.com">my web site</a>.
</p>
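As a rough illustration of what a processor derives from Listing 2, the following Python sketch walks the markup with the standard library's html.parser and collects triples. It is a toy under stated assumptions, not a conforming RDFa parser: the names LiteSketch and extract and the base address http://example.org/about are made up for the example, and scoping rules, proper blank-node handling, and most attribute combinations are ignored.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The markup of Listing 2, with the paragraph closed.
MARKUP = (
    '<p vocab="http://schema.org/" typeof="Person" resource="#sikos">'
    'My name is <span property="name">Leslie Sikos</span> and you can find out '
    'more about me by visiting <a property="url" '
    'href="http://www.lesliesikos.com">my web site</a>.</p>'
)


class LiteSketch(HTMLParser):
    """Toy extractor for a small slice of RDFa Lite (hypothetical helper)."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.vocab = ""
        self.subject = base    # by default, statements concern the document
        self.triples = []
        self._pending = None   # predicate waiting for the element's text
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "vocab" in a:
            self.vocab = a["vocab"]
        if "typeof" in a:
            # resource names the entity; the "_:b0" placeholder stands in for
            # the blank node a real processor would create when it is absent.
            if "resource" in a:
                self.subject = urljoin(self.base, a["resource"])
            else:
                self.subject = "_:b0"
            self.triples.append((self.subject, RDF_TYPE, self.vocab + a["typeof"]))
        if "property" in a:
            predicate = self.vocab + a["property"]
            if "href" in a:
                # A link target becomes the object of the statement.
                obj = urljoin(self.base, a["href"])
                self.triples.append((self.subject, predicate, obj))
            else:
                # Otherwise the element's text becomes a literal object.
                self._pending = predicate
                self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            literal = "".join(self._text)
            self.triples.append((self.subject, self._pending, literal))
            self._pending = None


def extract(markup, base):
    parser = LiteSketch(base)
    parser.feed(markup)
    parser.close()
    return parser.triples


for triple in extract(MARKUP, "http://example.org/about"):
    print(triple)
```

With the assumed base address, the subject of all three statements is the document address followed by #sikos, which is the effect of the resource attribute described above.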
The vocabulary declaration makes it possible to omit the full URI from each property (name refers to http://schema.org/name, url abbreviates http://schema.org/url). However, if you add RDFa annotations for more than one real-world object or person, you can declare the namespace of the vocabulary on the html element of your (X)HTML document (e.g., <html xmlns:foaf="http://xmlns.com/foaf/0.1/" …>) and associate it with a prefix that can be reused throughout the document. Every time you use a term from a vocabulary declared at the top of your document, you add the prefix followed by a colon, such as foaf:name, schema:url, etc. Using prefixes is not only handy but sometimes the only way to annotate your markup. For example, if you need terms from more than one vocabulary, additional vocabularies can be specified with the prefix attribute (see Listing 3). You can refer to any term from your most frequently used vocabulary (defined in the vocab attribute value) without a prefix, and to terms from your second vocabulary with the prefix you define in the value of the prefix attribute, or on the html element with the xmlns attribute followed by the prefix name and the namespace URI.
Listing 3. Using the Term “Textbook” from the FaBiO Ontology
<p vocab="http://schema.org/" typeof="Person" prefix="fabio: http://purl.org/spar/fabio/"
resource="#sikos">
My name is <span property="name">Leslie Sikos</span> and you can find out more about me by visiting <a property="url" href="http://www.lesliesikos.com">my web site</a>. I am the author of <a property="fabio:Textbook" href="http://lesliesikos.com/mastering-structured-data-on-the-semantic-web/">Mastering Structured Data on the Semantic Web</a>.
</p>
To make search engines “understand” that the provided link refers to a textbook of Leslie Sikos, we used the machine-readable definition of “textbook” from the FaBiO ontology. If you need more than one additional vocabulary for your RDFa annotations, you can add them to the value of the prefix attribute as a space-separated list.
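The space-separated form of the prefix attribute value can be read as alternating "name:" and URI tokens. The Python sketch below shows one way to split such a value and to expand terms against it; parse_prefix_attr and expand are hypothetical helper names, and the second vocabulary in the sample value (FOAF) is added only to show a list with more than one entry.

```python
def parse_prefix_attr(value):
    """Turn 'p1: uri1 p2: uri2' into {'p1': 'uri1', 'p2': 'uri2'}."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("expected alternating 'prefix:' and URI tokens")
    mapping = {}
    for name, uri in zip(tokens[0::2], tokens[1::2]):
        if not name.endswith(":"):
            raise ValueError(f"prefix token {name!r} must end with a colon")
        mapping[name[:-1]] = uri
    return mapping


def expand(term, prefixes, vocab):
    """Expand 'prefix:local' via prefixes, or a bare term via vocab."""
    prefix, colon, local = term.partition(":")
    if colon:
        # Raises KeyError for an undeclared prefix, which keeps the sketch honest.
        return prefixes[prefix] + local
    return vocab + term


# Sample attribute value; the FOAF entry is an illustrative addition.
prefixes = parse_prefix_attr(
    "fabio: http://purl.org/spar/fabio/ foaf: http://xmlns.com/foaf/0.1/"
)
print(expand("fabio:Textbook", prefixes, "http://schema.org/"))
print(expand("name", prefixes, "http://schema.org/"))
```

Bare terms fall back to the vocabulary given in vocab, while prefixed terms are resolved through the mapping, mirroring the division of labor between the two attributes.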
The most frequently used vocabulary namespaces are
predefined in RDFa parsers, so you can omit them in
your markup and still be able to use their terms in
RDFa annotations (see Table 1).
Prefix | URI | Vocabulary |
---|---|---|
cc | http://creativecommons.org/ns# | Creative Commons Rights Expression Language |
ctag | http://commontag.org/ns# | Common Tag |
dcterms | http://purl.org/dc/terms/ | Dublin Core Metadata Terms |
dc | http://purl.org/dc/elements/1.1/ | Dublin Core Metadata Element Set, Version 1.1 |
foaf | http://xmlns.com/foaf/0.1/ | Friend of a Friend (FOAF) |
gr | http://purl.org/goodrelations/v1# | GoodRelations |
ical | http://www.w3.org/2002/12/cal/icaltzd# | iCalendar terms in RDF |
og | http://ogp.me/ns# | Facebook OpenGraph |
rev | http://purl.org/stuff/rev# | RDF Review |
sioc | http://rdfs.org/sioc/ns# | SIOC Core |
v | http://rdf.data-vocabulary.org/# | Google Rich Snippets |
vcard | http://www.w3.org/2006/vcard/ns# | vCard in RDF |
schema | http://schema.org/ | schema.org |
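One way to picture how predefined prefixes interact with prefixes declared in the markup is a two-level lookup in which local declarations are consulted first. The following Python sketch assumes that precedence and uses a hypothetical resolve helper; only four rows of Table 1 are copied into it.

```python
from collections import ChainMap

# A few rows copied from Table 1; a real parser would carry the full list.
PREDEFINED = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "schema": "http://schema.org/",
}


def resolve(curie, declared=None):
    """Expand 'prefix:local', letting declared prefixes shadow predefined ones."""
    prefixes = ChainMap(declared or {}, PREDEFINED)
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local


# No declaration in the markup: the predefined mapping is used.
print(resolve("foaf:name"))

# A document that rebinds dc to the Dublin Core terms namespace.
print(resolve("dc:modified", {"dc": "http://purl.org/dc/terms/"}))
```

The second call shows why an explicit declaration still matters when a document wants a predefined prefix to point somewhere else.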
More sophisticated annotations require additional attributes that are supported by RDFa Core only. Beyond the RDFa Lite attributes, RDFa Core supports the about, content, datatype, inlist, rel, and rev attributes.
By default, the current subject is the web address of the document or a value set by the host language, such as the base element in (X)HTML. As a result, any metadata written in a document will concern the document itself unless stated otherwise. The about attribute can be used to change the current subject and state what the data is about, making the properties inside the document body become part of a new object rather than referring to the entire document (as they do in the head of the document).
If the displayed text differs from the value it represents, a more precise value can be added using the content attribute, which holds a character data (CDATA) string that supplies machine-readable content for a literal. A value can also optionally be typed using the datatype attribute (see Listing 4). Declaring the type ensures that machines can interpret strings, dates, numbers, etc., rather than treating them as plain character sequences.
Listing 4. Using the content and datatype Attributes
<html xmlns="http://www.w3.org/1999/xhtml" prefix="xsd: http://www.w3.org/2001/XMLSchema# dc: http://purl.org/dc/terms/">
<head>
<title>Leslie’s Blog</title>
</head>
<body>
<h1 property="dc:title">Leslie’s Blog</h1>
<p>
Last modified: <span property="dc:modified"
content="2015-05-25T10:54:00-09:30"
datatype="xsd:dateTime">25 May 2015</span>.
</p>
</body>
</html>
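The effect of content and datatype in Listing 4 can be sketched in a few lines of Python: the content value replaces the displayed text, and the declared datatype decides how the resulting string is interpreted. The helper name literal_value is hypothetical, and only two XML Schema datatypes are handled for illustration.

```python
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def literal_value(text, content=None, datatype=None):
    """Pick the machine-readable string, then interpret it by datatype."""
    # content, when present, takes the place of the displayed text.
    lexical = content if content is not None else text
    if datatype == XSD + "dateTime":
        return datetime.fromisoformat(lexical)
    if datatype == XSD + "integer":
        return int(lexical)
    # Untyped values stay plain strings.
    return lexical


modified = literal_value(
    "25 May 2015",
    content="2015-05-25T10:54:00-09:30",
    datatype=XSD + "dateTime",
)
print(modified.isoformat())
```

Without the content attribute, the sketch would have to work from "25 May 2015", which carries neither a time nor a time zone.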
In RDFa, the relationship between two resources (predicates) can be expressed using the rel attribute (see Listing 5).
Listing 5. Describing the Relationship Between Two Resources in RDFa
This document is licensed under the
<a prefix="cc: http://creativecommons.org/ns#" rel="cc:license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons By-NC-ND License</a>.
When a predicate is expressed using rel, the href or src attribute is used on the element of the RDFa statement to identify the object (see Listing 6).
Listing 6. Using href to Identify the Object
<link about="mailto:leslie@example.com" rel="foaf:knows" href="mailto:christina@example.com">
Reverse relationships between two resources (predicates) can be expressed with the rev attribute. The rel and rev attributes can be used on any element, individually or together. Combining rel and rev is particularly useful when there are two different relationships to express, such as when a photo is taken by the person it depicts (see Listing 7).
Listing 7. Combining the rel and rev Attributes
<img about="http://www.example.com" src="koalahug.jpg" rev="dc:creator" rel="foaf:img" />
If a triple predicate is annotated using rel or rev only, but no href, src, or resource is defined on the same element, the represented triple will be incomplete.
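The direction of rel and rev can be made concrete with a small Python sketch: rel produces a statement from the subject to the object, rev produces one from the object back to the subject. The function name triples_for is hypothetical, prefixed names are left unexpanded, and the relative image address is kept as written to keep the example short. When no object is available, the sketch simply emits nothing, which is a simplification of how a processor treats an incomplete triple.

```python
def triples_for(subject, obj, rel=None, rev=None):
    """Return the statements implied by rel and rev on one element."""
    if obj is None:
        # No href, src, or resource: nothing complete can be stated here.
        return []
    triples = []
    if rel:
        triples.append((subject, rel, obj))   # forward relationship
    if rev:
        triples.append((obj, rev, subject))   # reverse relationship
    return triples


# Values taken from Listing 7.
for triple in triples_for(
    "http://www.example.com",
    "koalahug.jpg",
    rel="foaf:img",
    rev="dc:creator",
):
    print(triple)
```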
The inlist attribute indicates that the object generated on the element is part of a list sharing the same predicate and subject (see Listing 8). Only the presence of the inlist attribute is relevant; its attribute value is always ignored.
Listing 8. Using the inlist Attribute
<p prefix="bibo: http://purl.org/ontology/bibo/ dc: http://purl.org/dc/terms/" typeof="bibo:Website">
The web site <span property="dc:title">Andrew Peno Graphic and Fine Artist</span> by <a inlist="" property="dc:creator" href="http://www.andrewpeno.com">Andrew Peno</a> and <a inlist="" property="dc:creator" href="http://www.lesliesikos.com">Leslie Sikos</a>.
</p>
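To illustrate the grouping that inlist asks for, the following Python sketch collects objects that share a predicate into one ordered list instead of emitting them as separate statements. The helper names has_inlist and group_objects are hypothetical, the statements are written out by hand from Listing 8 rather than parsed, and the subject is left implicit.

```python
def has_inlist(attrs):
    """Only the presence of the attribute matters; its value is ignored."""
    return "inlist" in attrs


def group_objects(statements):
    """Split (predicate, object, attrs) items into plain pairs and lists."""
    plain = []
    lists = {}
    for predicate, obj, attrs in statements:
        if has_inlist(attrs):
            # Document order is preserved within each predicate's list.
            lists.setdefault(predicate, []).append(obj)
        else:
            plain.append((predicate, obj))
    return plain, lists


# Hand-transcribed from Listing 8.
statements = [
    ("dc:title", "Andrew Peno Graphic and Fine Artist", {}),
    ("dc:creator", "http://www.andrewpeno.com", {"inlist": ""}),
    ("dc:creator", "http://www.lesliesikos.com", {"inlist": ""}),
]
plain, lists = group_objects(statements)
print(plain)
print(lists)
```

The two creators end up in a single list under dc:creator, in the order in which they appear in the markup.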
RDFa DOM API
The RDFa DOM API provides programmatic access to structured data expressed in RDFa on a web page.
You can read more about RDFa in the book Mastering Structured Data on the Semantic Web.