Tutorial
SportsML is XML
SportsML is an XML-conforming vocabulary. This means that SportsML uses the constructs standardized by XML to describe elements of content within a document, and the descriptive attributes of that content.
This tutorial covers the most widely used sections of SportsML. For details about each element, consult the documentation page.
For more information on how XML works, visit our page listing XML resources.
SportsML is a logical representation of sports data, and is not meant to dictate how that sports data is formatted. If a publisher wants to use SportsML to identify how two teams fared
in a particular game, the <sports-event> and <team> elements would be used:
<sports-event>
<event-metadata
event-status="post-event"
/>
<team>
<team-metadata>
<name
first="New York"
last="Mets"
/>
</team-metadata>
<team-stats
score="4"
event-outcome="win"
/>
</team>
<team>
<team-metadata>
<name
first="Atlanta"
last="Braves"
/>
</team-metadata>
<team-stats
score="2"
event-outcome="loss"
/>
</team>
</sports-event>
This SportsML fragment could then be rendered in HTML for a clean display:
| New York Mets | 4 | final |
| Atlanta Braves | 2 | |
Note that the SportsML event-status attribute of "post-event" has been used by the SportsML-to-HTML rendering processor to indicate that this score is a final score.
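As a rough illustration of what such a rendering processor might do, here is a minimal Python sketch using only the standard library. The function names and the STATUS_LABELS table are assumptions made for this example, not part of SportsML; only the "post-event" to "final" mapping comes from the text above. The status is read from whichever child element carries an event-status attribute, so the sketch does not depend on the metadata element's name.

```python
import xml.etree.ElementTree as ET
from html import escape

# The fragment from this tutorial, condensed onto fewer lines.
FRAGMENT = """
<sports-event>
  <event-metadata event-status="post-event"/>
  <team>
    <team-metadata><name first="New York" last="Mets"/></team-metadata>
    <team-stats score="4" event-outcome="win"/>
  </team>
  <team>
    <team-metadata><name first="Atlanta" last="Braves"/></team-metadata>
    <team-stats score="2" event-outcome="loss"/>
  </team>
</sports-event>
"""

# Hypothetical display labels; only "post-event" -> "final" comes from the tutorial.
STATUS_LABELS = {"post-event": "final"}


def event_status(event):
    """Return the event-status value from whichever child element carries it."""
    for child in event:
        if "event-status" in child.attrib:
            return child.get("event-status")
    return None


def score_rows(event):
    """Build (team name, score, note) rows; the status note goes on the first row."""
    label = STATUS_LABELS.get(event_status(event), "")
    rows = []
    for index, team in enumerate(event.findall("team")):
        name = team.find("team-metadata/name")
        full_name = f"{name.get('first')} {name.get('last')}"
        score = team.find("team-stats").get("score")
        rows.append((full_name, score, label if index == 0 else ""))
    return rows


def to_html(rows):
    """Render the rows as a bare HTML table."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


rows = score_rows(ET.fromstring(FRAGMENT.strip()))
print(to_html(rows))
```

The SportsML fragment carries no layout information; the decision to show "final" in the first row is made entirely by the renderer, and a different publisher could present the same data another way.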
Basic Structure of SportsML
The root element in SportsML is <sports-content>, which contains a required <sports-metadata> section, followed
by zero or more of the following:
- <sports-event>
- <tournament>
- <schedule>
- <standing>
- <statistic>
- <article>
The first five of these items hold XML structures built upon various combinations of <team> and <player> elements. The <article> element is intended to hold a news story, which should preferably adhere to the News Industry Text Format (NITF).
Data structures for these items are outlined as follows:
| <sports-event> | A set of teams or a set of players, followed by optional information about officials/referees, play-by-play actions, highlights, and awards. |
| <tournament> | Broken into tournament-divisions, which have rounds of sports-events. |
| <schedule> | A structured set of sports-events. |
| <standing> | A set of teams or players. |
| <statistic> | Also a set of teams or players. |
| <article> | A container for an NITF news article. |
Each of these structures has an envelope for metadata. For example, <event-metadata> holds such properties as when and where the event takes place, and whether the game has started or not.
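The top-level layout described in this section can be checked with a few lines of Python. This is only a sketch of the rule as stated here (a <sports-content> root, a leading <sports-metadata>, then any number of the six item types). It is not a substitute for validating against the DTD, and the function name structure_problems is invented for this example.

```python
import xml.etree.ElementTree as ET

# The six item types this tutorial lists under <sports-content>.
ALLOWED_ITEMS = {
    "sports-event", "tournament", "schedule",
    "standing", "statistic", "article",
}


def structure_problems(root):
    """Return a list of human-readable problems with the top-level layout."""
    problems = []
    if root.tag != "sports-content":
        problems.append(f"root is <{root.tag}>, expected <sports-content>")
    children = list(root)
    if not children or children[0].tag != "sports-metadata":
        problems.append("first child must be <sports-metadata>")
    for child in children[1:]:
        if child.tag not in ALLOWED_ITEMS:
            problems.append(f"unexpected <{child.tag}> under <sports-content>")
    return problems


good = ET.fromstring(
    "<sports-content><sports-metadata/><sports-event/><schedule/></sports-content>"
)
bad = ET.fromstring("<sports-content><sports-event/><team/></sports-content>")

print(structure_problems(good))  # no problems reported
print(structure_problems(bad))   # missing metadata, stray <team>
```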
Keys and Identifiers
Behind SportsML is a comprehensive strategy for unambiguously identifying which player, team, league, sport, and event is being covered.
These values are generally stored in attributes we call "keys." For example, a team-key might equal "t.7".
Where does one go to look up which team has the key of "t.7"? In what we call a Resource File.
The Resource File is an XML file that lists and defines which keys are allowed where.
The IPTC has compiled its own contents for Resource Files. However, publishers are free
to create their own files, either based on the IPTC's or containing entirely new sets of values.
Besides listing items like leagues, conferences, associations, and teams, Resource Files
also contain lists of controlled vocabularies used to describe other properties. For example,
the various states of health a player is in could be described as "injured" or "fine," or
could be described in much more detail.
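To make the lookup idea concrete, the following Python sketch loads team keys from a Resource File and reports any team-key used in a document that the file does not define. The Resource File layout shown here (a <resource-file> root with <team team-key="..." name="..."/> entries), the placeholder team names, and the placement of team-key on <team-metadata> are all assumptions made for illustration; they are not taken from the IPTC's actual files.

```python
import xml.etree.ElementTree as ET

# Hypothetical Resource File layout: the element names, attribute names, and
# team names below are invented for illustration, not copied from IPTC files.
RESOURCE_FILE = """
<resource-file>
  <team team-key="t.7" name="Example Team"/>
  <team team-key="t.8" name="Another Team"/>
</resource-file>
"""


def load_team_keys(xml_text):
    """Map each team-key in the Resource File to its display name."""
    root = ET.fromstring(xml_text.strip())
    return {team.get("team-key"): team.get("name") for team in root.findall("team")}


def undefined_team_keys(document, known):
    """Return team-key values used in the document but absent from the Resource File."""
    used = [el.get("team-key") for el in document.iter() if el.get("team-key")]
    return [key for key in used if key not in known]


known = load_team_keys(RESOURCE_FILE)

# Assumed placement of team-key on <team-metadata>; the check itself looks at
# every element, so it does not rely on that placement.
doc = ET.fromstring(
    '<sports-event>'
    '<team><team-metadata team-key="t.7"/></team>'
    '<team><team-metadata team-key="t.99"/></team>'
    '</sports-event>'
)

print(known["t.7"])
print(undefined_team_keys(doc, known))
```

A publisher using its own Resource File would swap in its own key list; the checking logic stays the same.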
A quick aside: In an ideal world, we might also have a central repository for all player-keys in major sports, regardless of which team they're on or country they're in. This is obviously a long-term goal, and comments on how various agencies could go about putting such a reference database together are welcome.
Sports Actions
Another notable characteristic is how SportsML files generally include only one "root reference" to a
<player> or <team>. To expand,
a <sports-event> may list two teams, and each team may list several
players. But lower down in the document, there may be a list of dozens of "actions" that occurred during the game.
Each action refers to its participants not by repeating the player-keys and team-keys, but by calling
out the idref of the "root reference." This example shows how to portray the fact that Bernie Williams
of the New York Yankees hit a grand-slam home run with two outs in the bottom of the ninth inning.
<sports-event>
<event-metadata
event-status="mid-event"
/>
<team>
<team-metadata>
<name
first="New York"
last="Yankees"
/>
</team-metadata>
<team-stats
score="4"
event-outcome="win"
/>
<player
id="p1"
>
<player-metadata
height="157"
weight="93"
date-of-birth="19680913"
>
<name
first="Bernie"
last="Williams"/>
</player-metadata>
</player>
</team>
<team>
...
</team>
<event-actions>
<event-actions-baseball>
<action-baseball-score
inning-value="9"
inning-half="bottom"
outs="2"
balls="3"
strikes="2"
batter-idref="p1"
hit-type="home-run"
rbi="4"
runs-scored="4"
/>
</event-actions-baseball>
</event-actions>
</sports-event>
You could also point to which player threw him the pitch he hit over the fences.
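A consumer of such a document has to follow each idref back to its "root reference." The Python sketch below shows one way this might be done with the standard library: index every element that has an id, then look up the batter for each scoring action. The fragment is a trimmed copy of the example above, and the function name describe_actions is an assumption for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the tutorial's example: one team, one player, one action.
EVENT = """
<sports-event>
  <team>
    <team-metadata><name first="New York" last="Yankees"/></team-metadata>
    <player id="p1">
      <player-metadata><name first="Bernie" last="Williams"/></player-metadata>
    </player>
  </team>
  <event-actions>
    <event-actions-baseball>
      <action-baseball-score inning-value="9" inning-half="bottom" outs="2"
        batter-idref="p1" hit-type="home-run" rbi="4" runs-scored="4"/>
    </event-actions-baseball>
  </event-actions>
</sports-event>
"""


def describe_actions(event):
    """Resolve each action's batter-idref to the player it points at."""
    # Index every element carrying an id so idrefs can be looked up directly.
    by_id = {el.get("id"): el for el in event.iter() if el.get("id")}
    lines = []
    for action in event.iter("action-baseball-score"):
        batter = by_id[action.get("batter-idref")]
        name = batter.find("player-metadata/name")
        lines.append(
            f"{name.get('first')} {name.get('last')}: {action.get('hit-type')}, "
            f"{action.get('inning-half')} of inning {action.get('inning-value')}, "
            f"{action.get('outs')} outs, {action.get('rbi')} RBI"
        )
    return lines


lines = describe_actions(ET.fromstring(EVENT.strip()))
print("\n".join(lines))
```

Because the player's name and other metadata appear only once, under the root reference, a document with dozens of actions stays compact and cannot disagree with itself about who a player is.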
The SportsML Document Type Definition
XML vocabularies generally use a Document Type Definition -- or DTD -- to define which elements and attributes are allowed where. XML vocabularies can also now be specified by a Schema, which is viewed by many as the successor specification format to the DTD. For more information on XML in general, visit our XML Resources page.
SportsML 1.0 is currently defined by a DTD, though we plan to supply a Schema definition later this year. The DTD consists of a Core DTD plus several sport-specific modules.
- The Core SportsML DTD
- One requirement of SportsML is that it provide a single, core set of properties that could be used to describe scores, schedules, standings, and statistics for a wide variety of sports. This Core DTD, while using U.S. English to express its constructs, has to support properties of sports in a way that is readily usable by publishers from any nation.
- The Control File
- The Core SportsML DTD refers to a separate, small DTD file known as the SportsML Control File. This file contains no particular sports properties in it, per se. Instead, the Control File activates any SportsML Plug-In DTDs that a publisher wants to use.
SportsML users who want to validate documents only covering particular sports are welcome to modify
the SportsML Control File so that only those modules desired will be loaded.
- Plug-In DTDs
- SportsML allows a publisher to express properties that are highly specific to particular sports. It does so by supporting individual, sports-specific DTDs that "plug in" to the core SportsML DTD.
As long as the SportsML Control File includes the plug-in for, say, Ice Hockey, a publisher is able to represent such Ice Hockey-specific constructs as shift changes, penalty shots, and power plays. The IPTC has decided to support seven sport-specific plug-ins at the launch of SportsML 1.0, including:
- American Football
- Baseball
- Basketball
- Golf
- Ice Hockey
- Soccer (a.k.a. Football, everywhere but the U.S.)
- Tennis
SportsML users who would like to contribute to the authoring process of other modules are welcome
to contact the SportsML Committee Chair.
More of this tutorial will be posted at SportsML.org in the near future.
Samples of SportsML with other sports are also available for browsing in the Examples section.
Please review our documentation and examples for a more complete picture
of all the properties of SportsML.