= The Toils of AIS: A Case Study in Application Protocol Design And Analysis =
Eric S. Raymond <esr@thyrsus.com>, Kurt Schwehr <schwehr@gmail.com>

// DRAFT: While this is not officially published anywhere, feel free to refer
// to the paper in any works.

// FIX: thinking we should have all numbered message (e.g. "type 5")
// call outs in an easy to distinguish form.  Bold, italic, or something. 
// Let's not worry about this till after we finish the text.

// FIX: 2 bits for an AIS version number and only available in some
// messages (e.g. type 5).  What does it mean if an entity broadcasts
// that it is compliant with ITU 1371-3?  And there is no number for  1371-4 version?

== Introduction ==

The authors have written several decoders and codecs for the rather complex AIS
protocol used by marine safety and navigation systems.  Based on this
experience, we put forward some lessons learned and design principles
that we think should be heeded in the extension of AIS and the design
of future application protocols.  We also offer pragmatic advice to
those implementing decoders for AIS and similar application
protocols.

== Overview of AIS ==

The Marine Automatic Identification System (AIS) is a software,
hardware, and radio coding protocol that allows ships to broadcast
navigational and other status information for use in collision
avoidance, navigation, and traffic management.  Early design work on
AIS was done primarily by Sweden; the International Association of
Lighthouse Authorities (IALA) proposed the concept to the
International Maritime Organization (IMO) in 1995 <<IALA2004>>.  The
core AIS standard is now maintained by the International
Telecommunication Union (ITU) in association with IMO <<ITU1371>>.

Both the design of AIS and the process by which it evolved are
representative of a large class of application protocols in the real
world, especially those concerned with geolocation and geographic
information systems. A marked feature of AIS, however, is that due to
the hazards of operation at sea, the information it handles can be
immediately life-critical.  This gives software defects in decoders a
special significance and raises the stakes in our design analysis.

// FIX: should we reference the Safety Of Life At Sea (SOLAS) in the
// above paragraph?

The physical layer of AIS uses a self-organizing (SO) variant of the Time
Division Multiple Access (TDMA) packet radio scheme to avoid collisions among
AIS transmitters serving either of two operating frequencies
<<LANS1996>>.  For the analysis in this paper, the physical layer
can be largely ignored and AIS viewed as a mechanism for passing
binary datagrams in either an addressed or broadcast mode.  For
AIS purposes an "address" is a Maritime Mobile Service Identity
(MMSI), a 9-digit numeric tag associated with a ship or shore
station.

The binary datagrams have a fixed header containing a dispatch field
(the type) followed by information whose formatting varies by type in
complex ways.  Payload information is encoded mainly as packed numeric
bitfields of lengths varying from 1 to 30 bits; these may be
interpreted as boolean flags, unsigned or signed two's-complement
big-endian integers, indices into controlled vocabularies, or
(implicitly, via scaling formulas) as fixed-point real numbers.  Some
messages have longer regions that are interpreted as character strings
or treated as opaque binary blobs to be passed to or from helper
applications.
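
As a rough illustration of the field kinds just listed, the following Python sketch pulls unsigned, signed, and scaled values out of a packed big-endian bit string.  The helper names `ubits` and `sbits` and the sample bytes are ours, chosen for illustration; they are not taken from any of the standards.

```python
def ubits(data: bytes, start: int, length: int) -> int:
    """Extract an unsigned big-endian bitfield; bit 0 is the MSB of data[0]."""
    total = 8 * len(data)
    if start < 0 or length < 1 or start + length > total:
        raise ValueError("bitfield lies outside the datagram")
    whole = int.from_bytes(data, "big")
    return (whole >> (total - start - length)) & ((1 << length) - 1)


def sbits(data: bytes, start: int, length: int) -> int:
    """Extract a signed two's-complement bitfield."""
    value = ubits(data, start, length)
    if value & (1 << (length - 1)):      # sign bit set
        value -= 1 << length
    return value


# Illustrative bytes: the first 40 bits of a type 1 datagram.
sample = bytes.fromhex("0471db85a1")
msg_type = ubits(sample, 0, 6)     # dispatch field in the fixed header
mmsi = ubits(sample, 8, 30)        # 30-bit unsigned integer

# A fixed-point real is a signed integer plus a scaling formula.
# AIS longitudes are carried in units of 1/10000 minute.
raw_longitude = -73407500
longitude_degrees = raw_longitude / 600000.0
```

For the sample bytes this yields message type 1 and MMSI 477553000; the scaled longitude comes out near -122.3458 degrees.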

Typical data items to be extracted from AIS bitfields include
latitudes, longitudes, vessel course and speed, capabilities
of the transmitting vessel or station, and weather or 
safety conditions. Some message fields are passed solely for the
AIS system's own housekeeping functions.

In practice, software engineers are unlikely to encounter AIS
datagrams in raw binary form. AIS receivers commonly make them
available over RS422, RS232 and USB ports in an ASCII-armored
byte-stream form called AIVDM/AIVDO, a profile or variant of the NMEA
format commonly used in shipboard navigation and control systems.
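
For orientation, here is a minimal Python sketch of the de-armoring step, assuming the usual AIVDM convention that each payload character carries six bits (subtract 48 from the ASCII code, then subtract a further 8 if the result exceeds 40).  The function name `dearmor` is ours, the sentence is a sample included for illustration, and checksum verification and multi-sentence reassembly are deliberately omitted.

```python
def dearmor(payload: str) -> str:
    """Turn an AIVDM payload into a string of "0"/"1" characters."""
    bits = []
    for ch in payload:
        # Only "0".."W" and "`".."w" are legal armoring characters.
        if not ("0" <= ch <= "W" or "`" <= ch <= "w"):
            raise ValueError(f"invalid payload character {ch!r}")
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(format(value, "06b"))
    return "".join(bits)


sentence = "!AIVDM,1,1,,B,177KQJ5000G?tO`K>RA1wUbN0TKH,0*5C"
fields = sentence.split(",")
payload = fields[5]                    # sixth comma-separated field
bitstring = dearmor(payload)

msg_type = int(bitstring[0:6], 2)      # bits 0-5
mmsi = int(bitstring[8:38], 2)         # bits 8-37
```

The 28 payload characters expand to a 168-bit datagram of type 1 from MMSI 477553000.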

The design of AIS is optimized to make efficient use of scarce radio
bandwidth. In this it resembles a great many application protocols
which tight-pack every bit - either out of rational economy or (more
commonly) because the designer's mental habits were formed in a time
when bandwidth was far more expensive than it is today.

Raymond maintains a programmer's-eye view of the details of
AIS/AIVDM/AIVDO in <<RAYMOND1>>.  Since that document is available
from any browser, we choose not to burden this paper with duplicative
details about datagram formats (but see the layout of the Common
Navigation Block in the next section for a representative example, the
payload of the three most common message types).  We will refer to
<<RAYMOND1>> frequently, and readers are assumed to be familiar with
it before proceeding.

== Complexity kills ==

// FIX: reference for scaling of bugs with ^2 of LOC

It is hardly news that software defect rates rise with the complexity
of the software.  Software bugs frequently arise from unplanned
interactions between different parts of a codebase.  It follows that
defect rates scale roughly as the square of codebase size. It is well
known <<MMM>> that bugs tend to cluster at the interfaces between code
written by different people; thus, defects also rise as the square of
the number of coders on a project.

In life-critical systems such as AIS, software defects can kill people. 
Software complexity should therefore be viewed as an actual danger, 
a cause of avoidable deaths.  Simplicity saves lives.

The benefits of software simplicity in non-life-critical systems,
while less dramatic, are no less real.  Defects incur downstream
problems and maintenance costs; thus, software complexity should be
viewed as a burden that does not end when the project ships, but
which continues to incur costs over the entire service life of the
software.

Complex application protocols require complex software to interpret
them.  Thus, application protocol complexity entails software complexity, 
which entails quadratically rising defect rates, which in life-critical
systems implies avoidable deaths.

Accordingly, a central problem with AIS - and many other application
protocols resembling it - is that it is exceedingly complex.

The problem is well indicated by the bulk of the base specification,
<<ITU1371>>.  In the current Version 4 there are 27 message types with
lengths varying from 72 to 1008 bits, described by 142 pages of dense
standardsese. It is supplemented by two International Maritime
Organization circulars (<<IMO236>> and <<IMO289>>) defining 24
additional message subtypes in 78 additional pages of standardsese.
The AIVDM/AIVDO ASCII packet format is described in yet another
standard, <<IEC-PAS>>.  When the submessages are taken into account, the
specifications allow for more than 262K message types.

// FIX: Kurt needs to calculate the total number of possible messages
// in the existing design.

There is a common paraphrase of a quote by Albert Einstein that
advises us to "Make everything as simple as possible, but not
simpler." Some part of AIS's complexity is essential to its function.
For example, any application protocol that conveys numeric data has
to have rules about how it is represented in the bits.  Other parts
of its complexity are accidental, simply a result of unfortunate
design decisions.

The most prevalent cause of bad protocol-design decisions is not
individual incompetence, but rather various sorts of cultural
blindness affecting the design process.  Experienced software
engineers learn to recognize - and dread - the signs of an application
protocol designed by domain specialists who do not understand how to
avoid accidental complexity, or gravely underweight its costs.

We will anatomize in detail the accidental complexity of AIS.  In so
doing, we intend to illustrate by inversion how application-protocol 
designers can avoid costly and dangerous mistakes.

== How complexity rises ==

Analyzing datagrams in a protocol like AIS is formally similar to
parsing a context-free grammar (CFG).  The goal, in both cases, is to
digest a serial stream of incoming bits from an external source into
an internal representation that uses the native data formats of the
interpreting computer. This internal representation may then be used
in any number of ways (e.g. visualization, report generation, and
interpretation as a command sequence).

Parsing techniques for CFGs found in textual formats such as computer
languages and document markups are a much-studied area of computer
science, well-described in classics such as <<ASU>>.  Though it is
less practiced, application of these same techniques to binary
datagrams and streams remains valid for practical purposes and for
estimations of algorithmic complexity.   Tools such as 
<<ASN.1>> describe binary byte-stream protocols in this style.
What specifically concerns us
in this paper is to understand how to hold the complexity of the
parsing stage to a minimum.

A CFG's complexity (and the complexity of associated parsers) is
proportional to the number of productions (parsing rules) required to
describe it.  The analog of a terminal symbol in the grammar is a
bitfield length and type, where 'type' in this context includes not
just numeric kind but associated interpretive data like offsets,
scaling constants, and controlled-vocabulary lists.

Further, complexity in CFGs tends to have a Pareto's-law-like
distribution: small irregularities and exception cases in some 20%
of the language produce large ripple effects in the difficulty of
parsing the other 80%.  Accordingly, another class of problem to look
out for is exception cases to general rules.

With this model in mind, let us tour the AIS message types in
sequence, as an implementer generally would, looking at the specific
ways they add complexity to the implied CFG.  For the moment, ignore
the many issues about the inventory of terminal symbols (bitfield
types and interpretations), just as we would normally ignore most
lexical issues in estimating the parse complexity of a conventional
textual CFG. The boundaries among terminal symbol types are
disputable, and the counts we give below should be considered
approximations for scale-estimation purposes.

=== Types 1 through 4: Fixed-length felicity ===


Begin by considering the first three AIS message types described in
Table 45 of <<ITU1371>>-4, grouped as
"Position Report Class A", but ignore the trailing communication
state. These differ only in type number and its interpretation, which
is outside the parser's scope.  They can therefore be regarded as one
production in the grammar; in <<RAYMOND1>> it is called the Common
Navigation Block (CNB).

.Common Navigation Block
[frame="topbot",options="header"]
|==============================================================================
|Field   |Len |Description             |Type         |Units and Meaning
|0-5     | 6  |Message Type            |unsigned int |Constant: 1-3
|6-7     | 2  |Repeat Indicator        |unsigned int |Message repeat count
|8-37    |30  |MMSI                    |unsigned int |9 decimal digits
|38-41   | 4  |Navigation Status       |enumerated   |Moored, under way, etc.
|42-49   | 8  |Rate of Turn (ROT)      |signed real  |Degrees per minute
|50-59   |10  |Speed Over Ground (SOG) |unsigned real|Knots, in 0.1-knot steps
|60-60   | 1  |Position Accuracy       |boolean      |High-accuracy flag
|61-88   |28  |Longitude               |signed real  |Minutes/10000
|89-115  |27  |Latitude                |signed real  |Minutes/10000
|116-127 |12  |Course Over Ground (COG)|unsigned real|Degrees from true north.
|128-136 | 9  |True Heading (HDG)      |unsigned int |Degrees from true north.
|137-142 | 6  |Time Stamp              |unsigned int |Second of UTC timestamp
|143-144 | 2  |Maneuver Indicator      |enumerated   |
|145-147 | 3  |Spare                   |             |Not used
|148-148 | 1  |RAIM flag               |boolean      |Receiver Autonomous Integrity Monitoring enabled and valid?
|149-167 |19  |Radio status            |variable structure |Status bits for cell radio
|==============================================================================

// Radio status / commstate contains a variable set of bools and ints

The CNB is fixed-length and has a fixed sequence of sixteen fields,
spanning sixteen types, in 168 bits.  It is thus dead-simple to parse.
In C, one could easily decode the bits with no moving parts (formally, no
control logic) by reading the entire datagram into memory (where it
would take up a grand total of 21 bytes) and casting it into a
structure full of bitfields (which, on a little-endian machine, would
alas have to be flipped end-to-end).  
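
The "no moving parts" property can be sketched in Python as a decoder driven entirely by a static table of offsets and widths copied from the layout above, whose offsets run from 0 to 167.  The names `CNB_FIELDS` and `decode_cnb` are ours, the input is assumed to be a string of "0" and "1" characters already recovered from the AIVDM armoring, and scaling to physical units is left out.

```python
# (name, first bit, width, signed?) -- offsets follow the table above.
CNB_FIELDS = [
    ("type", 0, 6, False),
    ("repeat", 6, 2, False),
    ("mmsi", 8, 30, False),
    ("status", 38, 4, False),
    ("turn", 42, 8, True),
    ("speed", 50, 10, False),
    ("accuracy", 60, 1, False),
    ("lon", 61, 28, True),
    ("lat", 89, 27, True),
    ("course", 116, 12, False),
    ("heading", 128, 9, False),
    ("second", 137, 6, False),
    ("maneuver", 143, 2, False),
    ("spare", 145, 3, False),
    ("raim", 148, 1, False),
    ("radio", 149, 19, False),
]


def decode_cnb(bits: str) -> dict:
    """Slice a 168-bit Common Navigation Block into raw integer fields."""
    if len(bits) != 168:
        raise ValueError("a Common Navigation Block is exactly 168 bits")
    out = {}
    for name, start, width, signed in CNB_FIELDS:
        value = int(bits[start:start + width], 2)
        if signed and value >= 1 << (width - 1):
            value -= 1 << width          # two's-complement sign extension
        out[name] = value
    return out
```

Every field goes through the same three lines; the only per-field variation lives in the table rather than in control flow, which is what makes this layout so easy to audit.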

From a reliability-engineering point of view, this is a best case
scenario and a very promising start.  However, see later discussion of
field-specification issues for some troubling problems with individual
fields in these three messages.

Message type 4, "Base Station Report", is also fixed-length and
fixed-sequence and can thus also be parsed with no moving parts.
While it introduces nine new field types (including our first controlled
vocabulary index) the increase in overall complexity is low.

=== Type 5: The six-bit blunder ===

// FIX: what are the 5 new types? -kurt

With message type 5, "Ship static and voyage related data", complexity
begins to rise.  This type is still fixed-length with a fixed sequence
of 21 fields, and introduces only five new types.  Three of the Type 5
fields (Call Sign, Vessel Name, and Destination) are character-string
data in a 64-character subset of ASCII packed as 6-bit nybbles.
Unfortunately, it is dramatically more complex to decode these oddly
packed strings than the fixed-length numerics and flag bits we've been
dealing with so far.

This is a paradigm for the sort of fixation on tightest-possible bit
packing that makes software engineers blanch.  Before this 6-bit data
can be used, it needs to be unpacked into conventional 8-bit bytes.
Not only have we lost the simplicity of being able to decode the
message just by statically slicing it into fixed-length bitfields, the
moving parts of the unpacking logic turn out to be easy to get wrong
in ways that are tricky to diagnose.  This is not a merely
theoretical criticism; Raymond found a longstanding bug in a
preexisting AIS decoder (<<GNU-AIS>>) exactly here.
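
To make the extra moving parts concrete, here is a Python sketch of six-bit text unpacking.  It assumes the customary AIS character mapping (codes 0 to 31 map to "@" through "_", codes 32 to 63 map to space through "?") and treats the first "@" as the end of the text, with trailing spaces as padding.  The function name `unpack_sixbit` is ours.

```python
def unpack_sixbit(bits: str, start: int, nchars: int) -> str:
    """Decode nchars packed six-bit characters beginning at bit offset start."""
    if start < 0 or start + 6 * nchars > len(bits):
        raise ValueError("text field runs past the end of the datagram")
    chars = []
    for i in range(nchars):
        offset = start + 6 * i
        code = int(bits[offset:offset + 6], 2)
        # Codes 0-31 sit 64 below their ASCII values; codes 32-63 are ASCII.
        chars.append(chr(code + 64 if code < 32 else code))
    text = "".join(chars)
    # "@" (code 0) marks unused characters; trailing spaces are padding too.
    return text.split("@", 1)[0].rstrip()
```

Note how many things have to be right at once: the starting offset, the stride, the two-range character mapping, and the padding policy.  An error in any one of them still produces plausible-looking text.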

// FIX: ref to this bug?  I see this one, but nothing from ESR.  -kurt
// http://sourceforge.net/tracker/?func=detail&aid=2231300&group_id=209878&atid=1011486

The fact that these strings aren't coded in byte-aligned 8-bit ASCII
or UTF-8 is the first serious blob of accidental complexity in AIS.
The 6-bit data occupies 282 bits of a 424-bit message format in 47
nybbles. The cost of 8-bit ASCII would have been a 22% increase in
message length, taking the message from 2 to 2.34 slots (and so
requiring 3 slots to transmit).  At the type 5 transmission interval and data rates
prescribed in <<ITU1371>> this implies an additional transmission cost
of 0.004% of the VHF data link available slots.  For 100 ships in an
AIS cell, the overall cost would be 0.37% of the available slots.
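
The arithmetic behind these figures can be reproduced in a few lines of Python.  The inputs are assumptions rather than measurements: 47 text characters in a 424-bit message, 2250 slots per channel per minute on two channels, one type 5 report per ship every 6 minutes, and 2 versus 3 slots per report.

```python
TEXT_CHARS = 47            # Call Sign (7) + Vessel Name (20) + Destination (20)
MESSAGE_BITS = 424         # type 5 length with six-bit text

sixbit_bits = TEXT_CHARS * 6
eightbit_bits = TEXT_CHARS * 8
growth = (eightbit_bits - sixbit_bits) / MESSAGE_BITS    # fractional increase

SLOTS_PER_CHANNEL_MINUTE = 2250
CHANNELS = 2
REPORT_INTERVAL_MINUTES = 6
slots_per_interval = SLOTS_PER_CHANNEL_MINUTE * CHANNELS * REPORT_INTERVAL_MINUTES


def extra_load_percent(ships: int, slots_now: int = 2, slots_8bit: int = 3) -> float:
    """Extra share of link slots, in percent, if text were sent as 8-bit."""
    return 100.0 * ships * (slots_8bit - slots_now) / slots_per_interval


print(f"text: {sixbit_bits} -> {eightbit_bits} bits, +{growth:.0%} of message")
print(f"1 ship:    +{extra_load_percent(1):.3f}% of slots")
print(f"100 ships: +{extra_load_percent(100):.2f}% of slots")
```

Under these assumptions the extra load is one slot in 27000 per ship per reporting interval.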

// FIX: Kurt, you had the next 'graph run on the previous one,
// but it appears to me to be a different topic.  Does it belong here?

<<ITU1371>>-4 alludes to an international process underway of allocating
two more AIS channels.  These channels may be at much higher frequencies
than VHF and afford much more than 9600bps bandwidth as they will be
optimized for satellite reception.

// FIX: Kurt needs to calculate out the bit rate as a double check
// ITU 1341-4 Page 3, 4.2.1 Reporting intervals:
// Every 6 min or, when data has been amended, on request.

// Slots avail = 2250 slots/channel/min * 2 channels * 6 minutes = 27000 slots 
// 2 slots per message vrs 3 slots per messages
// 1 ship:       0.007% -> 0.011% (percent of the total number of slots)
// 100 ships: 0.741% -> 1.111%
// 500 ships: 3.703% -> 5.555%

//FIX: Tweak first sentence to reflect the calculation.
Thus, the designers of AIS incurred a substantial increase in the
complexity of decoders in order to save the equivalent of two tenths
of one percent of a duty cycle.  This is our first "Never, *ever* do
this!"  It is a blunder that could only arise from being fixated on
hardware-layer economy while ignoring the defect-rate implications and
testing costs for the software layer.

Furthermore, over time bandwidth tends to become less expensive while defect-
and maintenance-related costs in software tend to remain steady or
rise.  Thus, the choice that was a bad tradeoff in 1998 when <<ITU1371>>
was first issued looks worse today, ten years later.  It will look still
worse ten years from now.

// They standards folks are even more worried about the bandwidth
// today compared to back then.
// The assumption is that there may be other data channels to ships,
// but critical AIS hardware on ships will be these same two pathetic
// data channels for the next 20-30 years. -kurt
// See http://vislab-ccom.unh.edu/~schwehr/ais/ais-references.html
// and search on "1371" for all the rev dates of the standard.

=== Type 6 and 8: Dynamic allocation is not your friend ===

With type 6, "Addressed binary message", two more forms of complexity
are introduced. One is minor and arguably essential; the other is
major and completely accidental. We'll consider types 6 and 8
together, as type 8 is structured in essentially the same way as 6
except for omitting destination information and housekeeping
information for point-to-point transmission; it is instead designed
for broadcast messages.

The minor increment is that message types 6 and 8 have a bit length
that is variable rather than fixed.  These messages are a wrapper
around a binary payload of up to 920 (type 6) or 952 (type 8) bits,
which is the last field.

In the specific context of AIS, this sort of variable-length payload
poses no special difficulty.  Because the message length tops out at
1008 bits = 126 bytes, it is feasible to unpack it in a static buffer 
of maximum size and ignore the tiny amount of waste space.

This is not in general true of application protocols with bulkier
payloads, for which handling variable-length datagrams often requires
dynamic memory allocation.  And, in languages like C or C++ without
automatic garbage collection, dynamic-memory allocation is a notorious
source of defects.  Indeed, many audits of large C/C++ codebases have
found it to be the single worst source of data corruption, security 
problems, and crash bugs.  On the other hand, languages which avoid this 
problem through support of variable-extent data structures and garbage
collection are often too bulky to fit in embedded deployments or too
subject to GC-related stalls to be suitable for real-time use.

// FIX: Perhaps cite something like Deep C Secrets (someone stole my
// copy) assuming Peter talks about memory alloc problems, which
// I am pretty sure he does.

Application-protocol designers should therefore think twice before
incurring this complexity as a side effect of their designs.  But at
least this is often essential complexity, a kind that is entailed
in the application requirements.

=== Type 6 and 8 continued: Overextension is your enemy ===

Unfortunately, interpretation of types 6 and 8 introduces another kind of
complexity that is completely accidental and (in the AIS context) far
worse.  Interpretation of the binary payload is both variable and
ambiguous.

Early versions of the base AIS specification reserved 16 bits before
the actual payload as an "application identifier". This was subdivided
into a 10-bit "designated area code" (DAC) and a six-bit "functional
identifier" (FID). The intention for types 6 and 8 at that time
appears to have been as a conduit for private and encrypted data.  The
world's Coast Guards and Navies are known to use AIS this way ("blue
force AIS" or E-AIS).  There was no complexity increment to general-purpose
decoders in this scheme; any extra burden fell on private applications
generating and interpreting those payloads. 

All this changed when the ITU delegated control of message 6 and
8 layouts to regional authorities by DAC. These messages then became
an extension mechanism that controlling authorities for a designated
area could use to define their own payload formats without extending
or revising the base standard.  The St. Lawrence Seaway (DACs 366 and
316) and the International Maritime Organization (DAC 1) promptly did so.

// FIX: The community suffers from no official central registry of
// numbers.  And no requirement to internationally publish message defs
// cite ref of Alexander and Schwehr proposing said registry

While this may have been bureaucratically convenient, the result was a
confusing proliferation of public message subtypes, some pushed through
without adequate review (for example, <<IMO236>> contains obvious
signs of drafting errors in the FID=12 "Number of persons on board"
specification). As of early 2011 there are 24 such subtypes in the
international DAC alone, requiring approximately as much additional
complexity as the entirety of <<ITU1371>>.

If the result had been to preserve fixed-field simplicity for each
message type while merely bloating the overall volume of decoders, the
implications for complexity and defect loads would have been worrying
but readily manageable.  But the damage did not end there.

Before, the only field of an AIS message that could act as a dispatch
to different bitfield sequences was the message-type field in the
header.  Under the new rules, the payload part in these so-called
"functional" messages of type 6 and 8 could have any number of
different bitfield sequences depending on the joint value of two
more dispatch fields, DAC and FID.
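
A Python sketch of the resulting three-level dispatch follows.  The bit offsets for DAC and FID (40 and 50 in type 8; 72 and 82 in type 6, where the destination MMSI comes first) are our reading of the layouts and should be checked against the standard.  The handler table is a hypothetical stand-in holding three subtype labels only.

```python
# Where the application identifier sits, keyed by message type:
# (DAC offset, FID offset, payload offset).  Assumed offsets; verify
# against the standard before relying on them.
APP_ID_OFFSETS = {6: (72, 82, 88), 8: (40, 50, 56)}

# Hypothetical registry: (type, DAC, FID) -> description.
HANDLERS = {
    (8, 1, 11): "Met/Hydro (deprecated)",
    (8, 1, 22): "Area Notice - broadcast",
    (8, 1, 26): "Environmental",
}


def classify(bits: str):
    """Return ((type, dac, fid), description, payload bits) for a type 6/8 message."""
    msg_type = int(bits[0:6], 2)
    if msg_type not in APP_ID_OFFSETS:
        raise ValueError("not a binary message of type 6 or 8")
    dac_at, fid_at, data_at = APP_ID_OFFSETS[msg_type]
    if len(bits) < data_at:
        raise ValueError("message too short to hold DAC and FID")
    dac = int(bits[dac_at:dac_at + 10], 2)
    fid = int(bits[fid_at:fid_at + 6], 2)
    key = (msg_type, dac, fid)
    return key, HANDLERS.get(key, "unknown or local subtype"), bits[data_at:]
```

Every entry added to such a registry is another bitfield sequence the decoder must carry, and the fallback branch is where undocumented local conventions end up.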

In one particularly egregious example of misdesign, the IMO289
"Weather Observation From Ship" (DAC = 1, FID = 21) sprouted two
variants and a *fourth* dispatch field, a flag at bit offset 56 ("Type
of weather report")!

The increment to the minimum complexity of decoders was quite large,
larger in fact than the jump from the packed-sixbit blunder.  But the damage
didn't even end there, as ships under the jurisdiction of some
Designated Areas began sending IMO extension messages with IMO236 or
IMO289 FIDs paired with local DACs other than 1.  It became impossible to
reliably distinguish IMO special messages from poorly-documented
local conventions without a growing exception list. 

The confusion had become complete. And this is before we even enter
into the additional complexity of the actual special-message payloads,
which we shall not attempt to fully describe here.  Sufficiently masochistic
readers can find the details in <<RAYMOND1>> and <<IMO289>>.

All this complexity appears, quite understandably, to have stalled out
adoption of the IMO special-message types.  In searches of data from
<<AISHUB>> and <<NAIS>>, which pool AIS reports from over 150 AIS
receivers scattered all over the world, we have yet to unambiguously
detect any of the 19 messages described in IMO289 aside from these
three: 8:1:11 (Met/Hydro deprecated), 8:1:26 (Environmental), 8:1:22
(Area Notice - broadcast) in two years of trying (the latter two are
from USCG test bed receivers in Tampa, FL and Provincetown, MA).  It
is highly doubtful that there is enough volume to motivate vendors of
AIS receivers to undertake the expensive task of upgrading their
firmware and display applications.

All of this complexity - and the increased defect rate it implies -
appears to have been self-defeating, loaded on to little or no purpose and
requiring decoder implementations to include essentially useless code
in order to check off standards-conformance boxes. 

Had the ITU/IMO stuck with its single original extension mechanism (the
type field in the header) and added only new message types with a 
fixed bitfield sequence, it seems far more likely that these messages
would have achieved useful deployment.

=== Types 6 and 8: Point defects ===

Attempting to describe all the design defects in the IMO extension
messages would certainly exhaust the reader's patience and possibly
endanger his or her sanity.  But some of the simpler ones deserve
examination because they can be used to illustrate sound design
principles by negative example.

Some of the type 6 and 8 extension messages feature trailing
variable-length arrays of record structures.  One of the simpler
examples is Route Information; the addressed and broadcast versions
have a fixed-length header followed by 1 to 16 latitude/longitude
pairs representing waypoints.  Another is Tidal Window, with 1 to 3
more complex trailing structures representing tidal-current
conditions.

In general, when AIS messages have variable-length trailing sections
the actual length of the trailer is implicit from the message's 
overall bit length.  Tidal Window, in particular, is organized this way.
But Route Information is an exception; the last field in the fixed-length
part is a waypoint count.

There are two problems with this.  First, the existence of that waypoint
count field violates the Single Point of Truth principle.  It is redundant 
with the message bit-length, and may even contradict it if the message
generator has a bug.  All the complexity and defects produced by having
to cope with this possibility are completely accidental and unnecessary.

The second problem is that this one exception exactly doubles the code
complexity of coping with trailing arrays. There now have to be two
mechanisms instead of one, and they have to be properly matched to
the message type.  An increase in expected defects will ensue.
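
The redundancy can be made concrete with a short Python sketch that derives the record count from the message length and then cross-checks a declared count against it.  The header and record sizes in the example are placeholders for illustration, not values taken from <<IMO289>>, and padding bits are ignored.

```python
def implied_count(total_bits: int, header_bits: int, record_bits: int) -> int:
    """Number of trailing records implied by the message length alone."""
    tail = total_bits - header_bits
    if tail < 0 or tail % record_bits != 0:
        raise ValueError("trailing section is not a whole number of records")
    return tail // record_bits


def check_declared_count(declared: int, total_bits: int,
                         header_bits: int, record_bits: int) -> int:
    """Second mechanism, needed only when a count field duplicates the length."""
    implied = implied_count(total_bits, header_bits, record_bits)
    if declared != implied:
        # The protocol gives no rule for which source of truth wins.
        raise ValueError(f"count field says {declared}, length implies {implied}")
    return implied


# Placeholder sizes: a 100-bit fixed part and 55-bit waypoint records.
HEADER_BITS, RECORD_BITS = 100, 55
waypoints = check_declared_count(3, HEADER_BITS + 3 * RECORD_BITS,
                                 HEADER_BITS, RECORD_BITS)
```

The first function is all that a length-implied layout needs.  The second exists only because of the count field, and its error branch has no correct answer: the decoder must guess which of two contradictory values the sender meant.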

Though not as serious as the overextension disaster we previously examined,
this certainly qualifies as a minor design defect.

=== Types 7-19: Keeping it simple again ===

After the spectacular efflorescence of misdesign we saw in types 6 and
8, the simplicity of type 7 "Binary acknowledge" is a relief.  It adds
no new field types.

More importantly: this type has a variable bitfield sequence, but
only because the payload is a list of MMSIs that may vary from one
MMSI to four in length.  This adds almost no complexity in practice as
the message is a maximum of 168 bits = 21 bytes long.  With a little
care in initializing empty MMSI slots with the out-of-band value of
zero, it can usually be treated as fixed-length.
// FIX: Kurt, check on how this is actually handled.  Set to zero or
// different sizes?   Mostly I see ACKs that look wrong without source
// MMSI or that have a source MMSI, but no desk MMSI.  Garbage.
// There shouldn't be any addressed 6 message traffic in the US
// Unless it is the USCG C2CEN/C3CEN in Virginia
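
A sketch of this treatment in Python, assuming the type 7 layout is a 40-bit header followed by one to four 32-bit entries, each a 30-bit MMSI plus a 2-bit sequence number (which is consistent with the 168-bit maximum quoted above).  The function name and the zero-filling policy are ours.

```python
def decode_type7(bits: str) -> list:
    """Return four (mmsi, sequence) slots; absent entries are (0, 0)."""
    HEADER, ENTRY, SLOTS = 40, 32, 4
    tail = len(bits) - HEADER
    if tail < ENTRY or tail > ENTRY * SLOTS or tail % ENTRY != 0:
        raise ValueError("type 7 must carry one to four whole entries")
    acks = []
    for i in range(SLOTS):
        at = HEADER + ENTRY * i
        if at < len(bits):
            mmsi = int(bits[at:at + 30], 2)
            seq = int(bits[at + 30:at + 32], 2)
            acks.append((mmsi, seq))
        else:
            acks.append((0, 0))          # out-of-band filler for unused slots
    return acks
```

The caller always receives four slots, so downstream code can be written as if the message were fixed-length.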

This sort of case - a fixed bitfield sequence that may be truncated
to omit unused fields at the end - will come up again.  As a shorthand
we will refer to these layouts as "tail-variable".

Type 9 "Standard SAR Aircraft Position Report" is also very simple;
fixed-length, fixed fields, and adding no new field types (with
the arguable exception of different scaling for speed over ground).

Type 10 "UTC/Date Inquiry" is a 72-bit fixed-field message that 
introduces no new field types.  Type 11 has a layout identical to
type 4 and introduces no new complexity.

Types 12 "Addressed safety-related message" and 14 "Broadcast
safety-related message" are basically the common header followed by a
variable-length text field, again of packed-sixbit nybbles.  Maximum
message length is 1008 bits = 126 bytes in both cases, so dealing with
the variable length doesn't require dynamic allocation.  Since any
implementation will have coped with the 6-bit blunder by this point
these messages add no significant additional complexity.

Type 13 "Safety-related acknowledgment" is an acknowledgment of an
addressed safety-related message, identical in layout to type 7 and adding no
additional complexity.

Types 15 "Interrogation" and 16 "Assigned mode command" are used for 
housekeeping functions of the AIS network and contain no navigational,
ship-status or station-status information. The payload consists of
MMSIs and three new field types, slot/offset/increment numbers describing
timing and radio parameters.  Both are tail-variable.

Type 17 "DGNSS Broadcast Binary Message" contains housekeeping information,
a latitude-longitude pair, and a trailing binary blob containing differential-GPS
corrections. It introduces no new field types or decoding issues.  We
refer the reader to <<ITU823>> for decoding the binary blob.

Type 18 "Standard Class B CS position report" and type 19 "Extended
Class B CS position report" are subsets of the Common Navigation Block
for vessels with less sophisticated navigation and sensor suites.  They
introduce no new field types or decoding issues.

Type 20 "Data link management message" is another housekeeping message
type similar to 15 and 16; like them, it is tail-variable and introduces
no new decoding issues.

=== Type 21: Aid to Navigation ===

Complexity begins to rise again with type 21, the "Aid-to-Navigation report". 
This is intended to be broadcast by navigation markers such as buoys,
lighthouses, and beacons.  

It is largely unexceptional, introducing no new field types other than
a controlled-vocabulary description of the broadcast and a boolean
off-position flag, except for one curious detail.  There is a
packed-sixbit field available for the name of the navigation mark -
but it is split in two.  The first 20 packed-sixbit characters begin
after bit 43; a trailing 14 characters, if required, end the message after
bit 272.

This is not of a magnitude with the packed-sixbit blunder or the even
worse bureaucratic snafu around the type 6 and 8 messages, but it is
careless and annoying.  Why not a single variable-length text field at
message end?  The bit of extra code required to glue those segments 
together is yet another avoidable failure point.

Alas, there is much worse to come.

=== Type 22: Channel Management ===

The "much worse" arrives with message type 22.  It is a fixed-length 
housekeeping message introducing five new field types which are neither
difficult to understand nor very interesting once understood.  It is
likely to be of interest only to those auditing or troubleshooting the
AIS network itself.

// FIX: can't find alternate format in 1371-4 that is addressed.
// Kurt: Look at fields 9-12 of the table on p128.

But the type 22 message layout has one rather shocking design bug.
The span from bit 69 to bit 138 has two different possible
interpretations depending on a one-bit broadcast/addressed flag, a
dispatch field which is located at bit 139 *after* the variant fields.

This means that if you are stream-parsing the datagram, you don't have
the information required to interpret the variant fields when you get
to them.  The message *must* be entirely buffered in memory to be
interpreted.  An entire class of elegant, lightweight analysis
techniques resembling stateless LL(1) parsing in a compiler hit this
wall and drop dead.  The code complexity of the entire decoder, and
its expected defect rate, are pushed upwards substantially.
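
The problem can be seen in a Python sketch.  The sub-layouts assumed here for the variant span (two corner coordinates of 18 and 17 bits each when broadcast; two 30-bit MMSIs each followed by 5 spare bits when addressed) are our reading of the format and are included only to show the order of operations: the flag at bit 139 has to be read before bits 69 to 138 can be given any meaning.

```python
def signed(bits: str) -> int:
    """Interpret a bit string as a two's-complement integer."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


def decode_type22_variant(bits: str) -> dict:
    """Interpret bits 69-138 of a type 22 message (assumed sub-layouts)."""
    if len(bits) < 140:
        raise ValueError("need at least 140 bits to reach the dispatch flag")
    addressed = bits[139] == "1"         # look ahead past the variant span
    if addressed:
        return {
            "addressed": True,
            "dest1": int(bits[69:99], 2),
            "dest2": int(bits[104:134], 2),
        }
    return {
        "addressed": False,
        "ne_lon": signed(bits[69:87]),
        "ne_lat": signed(bits[87:104]),
        "sw_lon": signed(bits[104:122]),
        "sw_lat": signed(bits[122:139]),
    }
```

The length check at the top is the buffering requirement in miniature: nothing can be decoded until at least 140 bits are in hand.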

When we discuss analyzer implementation in a later section, we will
show how this irregularity in the AIS design scuppered the smallest
and most lightweight AIS decoder either of us knows to exist.

=== Type 23: Group Assignment Command ===

Type 23 is similar in structure and function to type 22, a fixed-length
housekeeping message introducing a few new field types concerned with
AIS administrative functions.  Unlike message type 22, it poses no new
decoding issues.

=== Type 24: Static Data Report  ===

AIS has only one more serious botch awaiting us, at least before the
set of ITU1371 message types is extended again.  It's in message 24, which
not only has two internal dispatch fields and variant sections, but
has to be assembled from two datagrams.

The dispatch fields are "part number" and vessel MMSI.  Part number 0
identifies the A datagram of the pair, which will contain common
header information and up to 20 packed-sixbit characters of the ship's name.  
Part number 1 identifies the B datagram of the pair, which may contain
either ship type and dimensions or a parent vessel MMSI, depending
on the format of the MMSI in the common header.  (The second format 
is intended for use by small auxiliary craft.)

As usual, multiple dispatch fields and variant sections are defect
attractors.  But this design attracts whole new classes of bugs
because the message spans two datagrams.  What if the standard's bland
assurances that they are expected to be transmitted as an adjacent
pair go unfulfilled due to equipment failure, or reception as an
adjacent pair is foiled by interference?  What edge cases are
possible, and how can we verify that a decoder will cope with them
sanely?  What are the potential consequences of different decoders
choosing different ways to resolve edge cases?

More generally, the split-datagram design implies that a decoder 
must retain parsing state between datagrams and be prepared to merge 
Part B information into stored Part A information.  This introduces
a whole level of data-management complexity that the protocol
did not previously require.  Expected defect rates will rise as
a direct result.
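
To make the extra state concrete, here is a minimal sketch of the bookkeeping a decoder needs. The class name, the field dictionaries, and the 60-second expiry are all hypothetical choices of ours; the standard does not prescribe any of them, which is exactly the problem.

```python
import time

class Type24Assembler:
    """Pair Part A and Part B reports that carry the same MMSI."""

    def __init__(self, max_age_seconds=60.0, clock=time.monotonic):
        self.max_age = max_age_seconds  # assumed policy, not from the standard
        self.clock = clock
        self.pending = {}               # mmsi -> (arrival time, Part A fields)

    def feed(self, mmsi, part_number, fields):
        """Return a merged report when a pair completes, else None."""
        now = self.clock()
        # Expire stale Part A records so an orphan cannot pair with a
        # much later Part B.
        self.pending = {m: (t, a) for m, (t, a) in self.pending.items()
                        if now - t <= self.max_age}
        if part_number == 0:
            self.pending[mmsi] = (now, dict(fields))
            return None
        if part_number == 1:
            entry = self.pending.pop(mmsi, None)
            if entry is None:
                return None             # orphan Part B: discarded here
            merged = dict(entry[1])
            merged.update(fields)
            merged["mmsi"] = mmsi
            return merged
        raise ValueError("unexpected part number")
```

Every branch above encodes a policy decision (how long to wait, what to do with orphans, whether a second Part A replaces the first) that another implementer could reasonably make differently.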

There is a similar problem with the separation of RAIM and Position
Accuracy in the Common Navigation Block from the actual
navigation-system type information in type 5 (Static and Voyage
Related Data).  If there are two vessels within line-of-sight with the
same MMSI (yes, this happens all the time), interpretation software may
be unable to avoid mixing the position-system data for the two ships.

=== Types 25, 26, and 27: Anticlimax ===

Type 25, "Binary message, single slot", is a binary message wrapper
that combines the capabilities of types 6 and 8.  It differs from 6
and 8 in that an acknowledgement is not expected.  It is limited to
occupying a single packet-radio slot of at most 168 bits.

Type 25 introduces no new field types, but has two dispatch flags
immediately after the common header that indicate whether the message
is broadcast or has a destination MMSI preceding the payload, and
whether or not the payload is preceded by a DAC/FID pair.

Type 25 violates the rule implied in previous message layouts that if
a particular bitfield (such as DAC, FID, or destination MMSI) is
present, it is always present at the same bit offset.  This might be
considered a design defect, but the additional complexity imposed by
it on decoders is fairly small and it is thus at worst a minor one.

Message type 26, "Binary message, multiple slot", is an extension of
type 25 that can wrap longer binary messages. It introduces one minor
new parsing difficulty; the variable-length binary payload is followed
by a radio-status field.  This breaks the assumption of all previous
message layouts that if a message type has a variable-length payload
section, it is the last field and a stream parser reaching it can
consume the rest of the message.  This, too, imposes additional
complexity on the parsing machinery and may be considered a minor
design defect.
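
The extra work for type 26 amounts to computing the payload boundary from the far end of the datagram. A minimal sketch, assuming the payload is held as a string of `0` and `1` characters; the function name is hypothetical and the widths in the usage note are placeholders rather than values taken from the standard:

```python
def split_variable_payload(bits: str, header_width: int, trailer_width: int):
    """Split a buffered datagram into (header, payload, trailer) bit strings.

    The payload length is not stored anywhere; it is whatever remains
    once the fixed-width header and trailer have been set aside.
    """
    if len(bits) < header_width + trailer_width:
        raise ValueError("datagram shorter than its fixed fields")
    payload_end = len(bits) - trailer_width
    return bits[:header_width], bits[header_width:payload_end], bits[payload_end:]
```

A call such as `split_variable_payload(bits, 40, 20)` only works once the total length is known, so a stream parser can no longer treat a variable-length field as "everything that is left".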

Message 27 "Long range AIS broadcast message" is a simple subset of the
Common Navigation Block intended for satellite reception.  It introduces
no new field types nor any new parsing difficulties.

This completes the tour of message type layouts in version 4 of
the AIS specification.  Next, we'll examine issues with the field
inventories within those layouts.

== Field-specification issues ==

Data fields in the AIS protocol have at least two significant
attributes: length and type (signed int, unsigned int, boolean, float,
controlled-vocabulary index, string, binary data).  Most have two
more: a scaling formula and a not-available special value.  Some have
other attributes that must be considered in interpretation, such as a
controlled-vocabulary list or additional special values for various
measurement boundaries.

=== Overcomplexity ===

Field attributes occur in a number of different combinations so large
it is difficult to even estimate.  This is a major source of decoder
complexity and defects in itself.  

But there are worse and more specific problems ranging from
underspecification to poorly-chosen semantics, as we shall see next.

=== Underspecification ===

The AIS standard has serious problems with underspecification of field
types and semantics. For example, the interpretation of the "Special
Manoeuvre" value in Class A position messages is not specified adjacent
to the message definition. No definition is supplied elsewhere in
<<ITU1371>>-4.  In an echo of the problems with IMO message subtypes,
the interpretation of "specials" is delegated to unspecified regional
authorities.  Thus it is impossible to evaluate whether these two bits
are critical to safe operation or not, and they are very likely to
slip through the cracks in software design, specification, and
testing.  This is a major defect.

//FIX: Kurt, what do you mean? What other unspecified fields are there? 
This failure of omission is repeated a number of times throughout <<ITU1371>>.

=== Proliferating special cases ===

Another class of problem comes from poorly documented special cases in
field values. For example, the Search and Rescue Transmitter (SART)
devices introduced in revision 4 of <<ITU1371>> create a special case
that will introduce further errors.  A special range of MMSI values
denote devices that are for use in man overboard and vessel
emergencies.  If these devices accidentally have the wrong (non-SART)
prefix, or if the decoding or display software does not have the proper
range of SART prefixes marked, the SART device (which hopefully is
attached to a person or life raft) might show up as a normal vessel
thereby possibly confusing rescue personnel when seconds matter.

=== Complicating exceptions ===

In general, the formula required to extract a value in human-friendly units
from an AIS message field is a simple linear transformation using a scaling
factor and an offset.  But there is one exception.  Interpreting the
Rate of Turn field in the Common Navigation Block requires dividing by
a scaling constant and then taking a (sign-preserving) square, because
the field is encoded as a scaled square root.
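
The contrast with an ordinary field is easy to see side by side. The sketch below assumes that the field stores 4.733 times the square root of the sensor's rate of turn, with -128 meaning "not available" and +/-127 meaning "turning faster than the scale can express". That is our reading of ITU-R M.1371; the constant, the sentinels, and the degrees-per-minute unit should all be verified against the standard before use. The function names are hypothetical.

```python
import math

ROT_SCALE = 4.733          # assumed encoding constant
ROT_NOT_AVAILABLE = -128   # assumed sentinel

def decode_linear(raw, scale, offset=0.0):
    """The rule nearly every other numeric field follows."""
    return raw * scale + offset

def decode_rate_of_turn(raw):
    """Return the rate of turn (assumed degrees per minute), or None."""
    if raw == ROT_NOT_AVAILABLE:
        return None
    if abs(raw) == 127:
        # Direction is known but the magnitude is off-scale.
        return math.copysign(math.inf, raw)
    # Undo the square-root encoding, keeping the sign of the raw value.
    magnitude = (abs(raw) / ROT_SCALE) ** 2
    return math.copysign(magnitude, raw)
```

A declarative layout table can express `decode_linear` with two numbers per field. It has no natural slot for `decode_rate_of_turn`, so the table format either grows an escape hatch or the field is special-cased in code.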

Exceptions like this increase the defect rate of decoders out of all
proportion to the apparent ease with which they can be specified.
They make it more difficult to capture the protocol in an existing
specification language, and far more difficult to write a custom
minilanguage (such as those used by two of our example decoders) that
can do the job. Thus, decoder developers are reduced to fragile and
error-prone hand-coding.

To add insult to injury, there is no discernible reason for this extra
complexity. The field is supposed to scale to degrees per minute,
and could have been encoded with linear scaling like every other 
degree-valued field. 

From a whole-systems-engineering point of view, this is a major defect.

=== Semantic problems ===

Here's a representative semantic problem: the RAIM and Position
Accuracy fields occurring in several messages are essentially
meaningless with today's positioning technologies and with
requirements such as vessel Dynamic Positioning (DP).  Reporting that
positioning is better than 10m accuracy means horizontal uncertainty
could be as bad as 9m, leading to the possibility of vessel and platform
collisions. 

9m of imprecision is certainly not enough to safely handle equipment
in (for example) the oil and gas industry. A disaster comparable to
the Deepwater Horizon blowout could result from a tired or stressed
operator incautiously trusting an "accuracy" bit.  But even ignoring
disaster scenarios, meaningless fields are a design failure because
they cost complexity in decoders without repaying that cost in useful
information. This counts as at least a minor defect.

=== Proliferating field variants ===

We have found through practice in writing decoders that the most
severe problem with the AIS field inventory is the proliferation of
related field types with *almost*, but not quite, identical semantics.
These make it difficult to avoid bugs, and difficult to detect defects
after the fact.

For example, in the CNB, Course Over Ground and True Heading use
different scales for a measured bearing within the same message.  An
incorrect heading will likely be detected quickly for software run on
ships, but the possibility of updates introducing a regression at
inopportune times cannot be discounted.

There are similar problems across messages.  As an example, there are
three different pairs of bit lengths used to encode longitude/latitude
pairs: one in the Common Navigation Block (and elsewhere) with
precision of 10^-4 minutes of arc, another in the IMO 289 Clearance
Time To Enter Port message (and elsewhere) with precision of 10^-3
minutes, and a third in DGNSS Broadcast Binary Message (and elsewhere)
with 10^-1 precision.  Each of these types has its own, different
scaling constant and not-available value.
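
A small sketch shows how quietly these variants can be confused. The bit widths and the convention that "not available" is encoded as 181 degrees of longitude and 91 degrees of latitude in each variant's own units are our assumptions, as are all the names; check them against the relevant message definitions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LonLatVariant:
    name: str
    lon_bits: int          # assumed field widths
    lat_bits: int
    units_per_degree: int  # raw units in one degree

    def decode(self, raw_lon, raw_lat):
        """Return (lon, lat) in degrees; None marks a not-available value."""
        lon_na = 181 * self.units_per_degree
        lat_na = 91 * self.units_per_degree
        lon = None if raw_lon == lon_na else raw_lon / self.units_per_degree
        lat = None if raw_lat == lat_na else raw_lat / self.units_per_degree
        return lon, lat

VARIANTS = [
    LonLatVariant("1/10000 minute", 28, 27, 600000),
    LonLatVariant("1/1000 minute", 25, 24, 60000),
    LonLatVariant("1/10 minute", 18, 17, 600),
]
```

Feeding the raw pair `(6000, 3000)` to the coarsest variant yields 10 degrees by 5 degrees; feeding it to the finest yields 0.01 by 0.005. Both are plausible positions, and nothing in the data flags the mismatch.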

Minor variations like these create significant risks of defects that
are difficult to avoid and extremely difficult to detect if they're
committed - such as mismatching a scaling constant or not-available
value with a field encoded using a different minor variation of the
type.  Over the entire life-cycle of the protocol, that is a high price to
pay for saving 2 to 4 bits in a few rarely-used message types.

=== Defective timestamps ===

Scattered through the AIS specification and IMO extensions are no
fewer than 15 timestamp fields.  Of these, only two (in
message types 4 and 10) are complete date/time stamps.

This is a major design defect which creates significant problems for
testing of AIS, for logging of AIS events, and for maintaining
databases of AIS events.  In particular, it means that for logs to be
useful in replicable testing that reproduces the relative timing of
different logs, the logs have to embed information disambiguating
their timestamps.

Different choices of how to embed this out-of-band information will 
lead to pointless friction and difficulties in checking the interoperability
of decoder implementations.
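
One possible convention, shown purely as an illustration, is to prefix each logged sentence with its receive time as a full UTC stamp. The tab-separated line format and the function names are hypothetical; they are not an existing log format that we know of.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # ISO 8601, UTC, one-second resolution

def log_line(sentence: str, received: datetime) -> str:
    """Prefix a sentence with its receive time as an ISO 8601 UTC stamp."""
    if received.tzinfo is None:
        raise ValueError("receive time must be timezone-aware")
    stamp = received.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return f"{stamp}\t{sentence}"

def parse_log_line(line: str):
    """Recover (receive time, sentence) from a line written by log_line."""
    stamp, sentence = line.rstrip("\n").split("\t", 1)
    received = datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    return received, sentence
```

Because the stamp is fixed-width and most-significant-first, sorting the log lines as plain text also sorts them by time. Any other implementer is free to pick a different wrapper, of course, which is the interoperability friction described above.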

== Drawing lessons from the defects ==

We have so far identified a plethora of major and minor defects in the
design of the AIS protocol, each one of which forces higher code
complexity - and therefore higher expected defect rates - in
conforming decoders.  Each error implies a bit of normative advice
for application-protocol designers.

1. AIS's decision to use packed-sixbit characters indexing a
64-character subset of ASCII chased tiny gains in physical-layer
economy at the cost of significantly higher decoder complexity.  It is
worth mentioning that it also forecloses the use of UTF-8 and proper
internationalization of the protocol.

Lesson: Supposing you must design a packed-binary protocol that
contains embedded string data, don't try to get cute with character
encodings to save space. You can't squeeze out enough bits to be worth
the downstream pain you will cause.
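
For readers who have not met the encoding, a minimal decoding sketch follows. It assumes the usual AIS mapping, in which codes 0-31 stand for `@` through `_` and codes 32-63 stand for space through `?`, with `@` used as padding; the function name is hypothetical and the payload is modeled as a string of `0` and `1` characters.

```python
def decode_sixbit_text(bits: str) -> str:
    """Decode a packed six-bit text field into an ASCII string."""
    chars = []
    usable = len(bits) - len(bits) % 6   # ignore a ragged final group
    for i in range(0, usable, 6):
        code = int(bits[i:i + 6], 2)
        # Codes 0-31 map to '@'..'_' and codes 32-63 map to ' '..'?'.
        chars.append(chr(code + 64) if code < 32 else chr(code))
    # '@' pads unused positions; trailing blanks carry no information.
    return "".join(chars).rstrip("@ ")
```

None of this is hard, but every decoder must carry it, and the saving over plain 8-bit characters is two bits per character in fields that are at most a few dozen characters long.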

Each of the other design defects in the AIS protocol violates a
regularity that was implicitly present in all datagram layouts
previous to the one in which it first occurred, escalating parsing
complexity for *all* message types.

2. Message types 6 and 8 broke the rule that the protocol has exactly one
dispatch field and each type has a fixed (though possibly
tail-variable) bitfield sequence. In doing this, they added a 
second extension mechanism to the protocol, and handed off control
of various portions of the extension space.  The resulting mess teaches 
several important lessons:

Lesson: Dispatch fields (those which change the logic flow of the
parse) are complexity and defect attractors. The complexity cost of
having multiple dispatch fields rises not additively but
multiplicatively. Well-designed application protocols have just one,
period.

Lesson: One protocol-extension mechanism may be just right, but two is 
certainly too many.

Lesson: Beware especially of extension mechanisms that encourage local
options and do not enforce a common namespace in their dispatch
fields. Such mechanisms are likely to turn your protocol into an
unmanageable hairball in short order.

Lesson: Suffer not thy protocol design to fall into the hands of
bureaucrats, for they will smother it in features.

3. Message type 21 pointlessly split a string field in two. This is a
minor defect compared to the other infelicities.  

Lesson: Don't do this. It adds defect-attracting moving parts to
the code and complicates testing.

4. Message type 22 not only contained a second dispatch field, it
located that field *after* the payload section it controlled.

Lesson: Dispatch fields should always *precede* any payload for which
they modify the bitfield sequence. Otherwise you will foreclose entire
classes of simple and lightweight stream-parsing techniques. (This is
a perfect example of a point defect in design that imposes large
complexity costs not just for the processing of that individual
message type but for all types.)

5. Message type 24 has two internal dispatch fields controlling variant
sections, and is split across two datagrams.  In addition to the
normal complexity costs of handling variant sections, this requires
a conforming parser to maintain fragile state between datagrams and
introduces lots of tricky edge cases.

Lesson: Never do this! The amount of statefulness that the network
layers underneath your application protocol imposes on you will be a
severe enough source of bugs and edge cases; deliberately fragmenting
your application protocol's units of meaning to invite *more* is
asking for trouble.

//FIXME: Kurt should check this.
// Also, the frequency of this packet type is so low (CHECKME) that the number
// of slots saved by splitting isn't worth it.

6. Message type 25 contains optional fields that are not at fixed
offsets, adding complexity that might be considered a minor design
defect.

7. Message type 26 contains a variable-length payload field that is not
the last one in the message, complicating the handling of variable-length
fields in stream parsers. This might be considered a minor design
defect.

8. A huge proliferation of field types and variants creates excessive 
complexity in decoders.

Lesson: Minimize the number of field types. Good practice is for there
to be just one type per natural kind; e.g. in a geolocation protocol
all longitudes should be encoded with the same length, signedness, and
special values.  Ditto all latitudes, bearings, timestamp fields, etc.

9. The "Special Manoeuvre" field in the CNB has unspecified semantics.
The (lack of) specification does not include even a handoff to a
registry or document describing the use of the field, so even if
"regional authorities" were to define it, decoder implementors would
have no way to find the definition(s).

Lesson: Never do this! The effect of this design error is that even if some
regional authority attempts to put a meaning on the field, the effort
is effectively certain to fail.

10. Special-casing of MMSIs creates a bug vulnerability related to
mis-recognition of SART prefixes.

Lesson: Don't make an elaborate, multi-special-case decode of a 
field carry information that should be passed via multiple fields
with simpler decode specifications.

11. Rate of Turn in the CNB uses an unnecessarily exceptional scaling
formula.  And Route Information uses an exceptional mechanism for
specifying the element count of a trailing array.

Lesson: Keep exception cases to a dead minimum. Remember that one tiny
specification exception that's easy for a human being to process can
make the protocol intractable for parser generators. A good way to
think about the downstream impact is that each exception case doubles
the overall cost of coding and testing, and correspondingly doubles
the expected defect rate.

12. The RAIM and Position Accuracy fields occurring in several
messages have become essentially meaningless and possibly dangerously
misleading.  They cost code complexity without conveying useful
information.

Lesson: Beware of magic numbers in field definitions (like the "10
meters" in the definition of Position Accuracy) because technology
can change them and likely will. Make the field say what you mean
in a future-proof way (like, actually giving meters of error estimate
at 95% confidence) even if that costs a few bits of space.  

13. AIS timestamps are incomplete, requiring out-of-band information when 
keeping sentence logs.

Lesson: Never, ever define a timestamp field that isn't a complete
date/time stamp in UTC. No exceptions, because you *will* come to
regret every single one.

Further note: If you choose any timestamp format other than ISO8601,
that is probably a design defect in itself.  The advantages of a
format that is readable, readily usable as a database key, and for which 
lexicographic sort order corresponds with time order are large.

To these defects could be added another large one that in a sense
precedes all of them.  At the tiny data volumes of AIS, the style of
packed-bit binary protocols of which AIS is an example is a colossal
error of misdirected and premature optimization, leading to rigidity
and unnecessary overcomplexity.  See <<TEXTUALITY>> for a more
detailed argument of this position.

== A tale of four decoders ==

Having identified the failings in the original design of the AIS
protocol, and explained how these cause problems and how best to avoid
them, we shall next examine the ways in which AIS's design defects
hindered the development of actual AIS decoders, and how the authors
coped.

The authors have written four AIS decoders in three different languages.
We describe them here in historical order.

// Kurt: We might want to add Neil Arundale's to this list.

To put this list in perspective, it should be noted that AIS decoders
are a rare species of software.  Web searches turn up few hits, and
most of those only interpret a subset of the navigational message
types rather than covering the full ITU1371 standard and extensions.
It is likely that the implementations we list here represent a
substantial fraction of all the decoders in existence that are at or
near full conformance.

=== noaadata ===

//FIXME: Kurt, the early-2007 date is my guess, please correct if necessary

Noaadata was designed primarily as a research tool to explore methods
for working with AIS messages and the data contained within the many
message types. Development began in early 2007. Schwehr chose to drive
his decoder from a protocol-description minilanguage. This
architecture was aimed at easy generation of language-native parsers
and other tools, including SQL schema generators and translation tools
to container formats such as GeoJSON and YAML. A minilanguage-driven
parser would also greatly reduce the effort required to support new
ITU-standard messages and experimental messages.

This original architectural vision for noaadata disintegrated with the
horrible inevitability of a Greek tragedy under the weight of AIS's
specifications.  We will describe the process in some detail here
because it shows exactly what the downstream costs of intrinsic
protocol complexity look like in actual practice.

The first major design decision was what protocol-description language
to use.  Schwehr considered but rejected ASN.1 and bdec, finding them
cumbersome and a poor fit to AIS's extremely bit-oriented
layout. Schwehr chose to design a specialized minilanguage expressed
as a custom XML application. As the AIS message standard is written in
MS Word and Excel as free form English text, the first major task was
to hand-translate the content of the standard into a set of XML
message-layout definitions.

An initial barrier to actually doing anything with the XML message
definitions was that XML parser generators also proved cumbersome and
poorly suited to the task. Writing a complete XML schema in any of the
three major schema languages - <<RelaxNG>>, <<Schematron>>, XML Schema
Definition <<XSD>> - would have consumed more time than available to a
single researcher who had no experience in XML schemas.  Even
supposing the commitment to do so could have been sustained, DTDs offer so
little leverage as to be pointless for checking a document format as
heavily structured as the task required.

Early versions of noaadata therefore used lxml, an element-tree XML
API written in Python, to directly walk the XML tree of message
definitions as noaadata decoded each incoming AIS message. But while
walking the XML tree during the actual parsing is compact in terms of
number of lines of code, it is extremely slow. Later versions of
noaadata addressed the performance problem by splitting the parser
into two stages: a code generator read the XML and emitted static Python
code that could decode and encode AIS messages.

The code-generation process added its own layer of complexity. The
generator uses the FIX templating language, bringing the number of
formalisms being juggled to three (Python, XML, FIX) and causing the
process of modifying the Python output to be more annoying than it
should have been.

//FIX: what can I say about my crappy experience with buying Oxygen? 

At each stage in the process, the intrinsic complexity and
irregularity of AIS rendered inadequate the tools for high-level
description of the protocol and forced Schwehr into using
progressively more ad-hoc and complex implementation methods. Not only
did the XML message definitions fail to capture AIS's structure, it
gradually became apparent that no minilanguage with a description
complexity substantially less than that of a hand-coded parser could
do so.

As the prospect of an XML definition language with the full capability
to handle all the odd conditional cases receded, Schwehr started to
fork copies of the generated Python to add support for conditional
data based on values in fields and inspecting the lengths of messages
when necessary. This, of course, subverted the original architecture.

By the time Raymond wrote a critique of the code in 2009 for Schwehr
it was already clear that the noaadata codebase was out of
control. When time came to implement the IMO289 environment and zone
messages in late 2010, the XML definition system completely collapsed.

Near the beginning of the development effort, Schwehr and Alexander
<<IHO2007>> had proposed an initial XML definition language in an
attempt to engage the AIS standards community. While personal feedback
from individual members of that community was positive, the standards
community never formally engaged it, criticized it, or helped develop
it. That lack of interest put the seal on the collapse of the
architecture, removing the motivation for any rescue effort.

All of the problems with noaadata that we have described so far were
entailed by the design of AIS.  It is worth noting one other that is
perhaps more a function of implementation: despite a good deal of
effort put into time optimization, the code is too slow. As of early
2011 it cannot handle real time data from the NAIS network feed.

A large portion of this speed issue comes from using the pure Python
BitVector library <<BitVector>>.  While this library makes working
with bits very easy, it is not designed for speed.  In 2009 Raymond
reimplemented the BitVector interface using Python's array type and 
achieved substantial performance gains, but it still proved a 
bottleneck in his Python decoder, suggesting that we'll need a
few more cycles of Moore's Law before any interpreted language
is up to meeting the throughput requirements for this task.

This performance problem could be addressed by creating a cBitVector
implementation that presents the same interface, but knows how to
efficiently create sub-slices of bit vectors using copy-on-write
storage management.
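
The idea of slicing without copying can be illustrated in pure Python, although the real gain would come from doing it in C. The sketch below is not cBitVector and does not reproduce the BitVector interface; it is a hypothetical read-only view over one shared integer, for which copy-on-write reduces to never copying at all.

```python
class BitView:
    """Read-only window onto a shared integer; slicing copies no bit data."""

    def __init__(self, value, total_bits, start=0, length=None):
        self._value = value        # backing store shared by every view
        self._total = total_bits
        self._start = start
        self._length = total_bits - start if length is None else length

    def __len__(self):
        return self._length

    def slice(self, start, length):
        """Return a sub-view; only two small integers are recorded."""
        if start < 0 or length < 0 or start + length > self._length:
            raise IndexError("slice outside view")
        return BitView(self._value, self._total, self._start + start, length)

    def to_uint(self):
        """Extract the viewed bits as an unsigned integer via shift and mask."""
        shift = self._total - (self._start + self._length)
        return (self._value >> shift) & ((1 << self._length) - 1)
```

Nested slices compose by adding offsets, so carving a message into header, body, and individual fields touches the underlying bits only when a value is finally extracted.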

// FIX: cBitVector would be a great senior undergrad project

Despite all these systemic problems, noaadata has proven
successful as a research tool: generating test cases for AIS
libraries, creating initial SQL database definitions, demonstrating
the potential for XML definitions to produce more friendly HTML
documentation with an XSLT transformation, demonstrating how a
registry could enable all software vendors to handle an explosion of
AIS message types, exploring the concept of automatic testing of new
message proposals, and so forth.

The noaadata suite was also extremely useful in the process of
validating the next two decoders we shall examine.

//FIXME: Kurt, I think the next 'graph may be too far off into the weeds.

Noaadata also suggested strategies for optimization of AIS data processing
based on the task at hand.  For example, tracking ships in real time
might only require processing a small portion of the class A and B
messages.  If the external time stamp can be trusted (not always the
case in NAIS), then there may be no need to look at the timing data
for quick-look visualizations.  Deep statistical analysis of AIS data
requires looking at most or all of the data on the channel to
understand which data can be used and what problems must be addressed
in the analysis.

=== GPSD ===

GPSD is an open-source monitoring daemon for GPSes and other
kinematic/geodetic sensors. It turns the raw take from sensors into
JSON messages on a well-known TCP/IP port, insulating client
applications from the grubby details of vendor protocols and hardware
quirks. Besides being a stock piece of infrastructure on Linux and
*BSD laptops running location-aware applications, it is extremely
widely deployed in embedded systems including cellphones, scientific
telemetry packages, and autonomous robotic vehicles including a DARPA
Grand Challenge entry and the Woods Hole Institute's deep-diving
Nereus submarine.

One of us (Raymond) developed an interest in AIS through GPSD, for
which he is the project lead.  He began work on supporting sensors
reporting AIVDM/AIVDO in GPSD in 2009, and compiled <<RAYMOND1>> as
part of his effort to pull together enough information to do a proper
implementation. Schwehr contributed advice and test loads, and the
authors became acquainted through cooperating to improve this
software. 

The GPSD AIVDM driver is implemented in approximately a thousand lines
of C. At time of writing it has tested support for AIS message types
1-15, 18-21, and 24. It has untested support for types 16-17, 22-23,
25-27 and most of the IMO extension messages, many of which have not
yet been observed in the wild. Conformance was tested by comparison
with reports from Schwehr's noaadata driver operating on samples from
<<AISHUB>> and elsewhere.

Most of the driver is hand-coded, because at the time it was constructed
the author believed AIS was too irregular for an attempt at driver 
code generation based on a protocol specification to work. This belief
was based partly on experience with Schwehr's noaadata code.

However, in early 2011 Raymond built a small suite of code generators
that take any message-layout table from <<RAYMOND1>> as input and
generate snippets of C and Python code as output. These snippets
include a C structure declaration for the unpacked message data, and
driver code to extract the bits into the structure. While these code
generators cannot produce all of the high-level logic required for
the AIVDM driver, they do reliably automate the most fiddly and
defect-prone parts of datagram analysis.
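
The flavor of such a generator can be conveyed by a toy version. This is not the GPSD tooling: the table rows, the function name, and the `UBITS`/`SBITS` extraction macros named in the emitted C are all assumptions made for the sake of the example.

```python
# Hypothetical layout rows: (field name, width in bits, signed?)
LAYOUT = [
    ("msgtype", 6, False),
    ("repeat", 2, False),
    ("mmsi", 30, False),
    ("lon", 28, True),
]

def generate_c(struct_name, layout):
    """Return (struct declaration, extraction statements) as C source text."""
    decl = [f"struct {struct_name} {{"]
    body = []
    offset = 0
    for name, width, signed in layout:
        ctype = "int" if signed else "unsigned int"
        getter = "SBITS" if signed else "UBITS"   # assumed macro names
        decl.append(f"    {ctype} {name};")
        body.append(f"    out->{name} = {getter}({offset}, {width});")
        offset += width   # offsets are derived, never typed by hand
    decl.append("};")
    return "\n".join(decl), "\n".join(body)
```

The value of the approach is in the line marked "offsets are derived": running bit offsets are exactly the kind of detail that humans get wrong when transcribing a layout table.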

The design problems of AIS have not prevented this driver from parsing
the full range of message types, mainly because hand-coded C allows an
approach of using brute force to overpower those problems.  The
resulting code is hardly elegant, but it works for production uses.

The intrinsic complexity of AIS has taken its toll, however.  GPSD
reports AIS sentences as equivalent JSON objects.  GPSD ships a 
library for C client applications that unmarshals the JSON into C structures,
using a custom JSON parser that uses only static-extent storage.  The
JSON generated by certain IMO extension messages (notably Area Notice 
and Environmental) is so complex that the library JSON parser breaks on 
it and probably cannot be extended to work without using dynamic storage.

As of May 2011 the GPSD AIVDM driver is the most capable AIS decoder
available in open source. Despite the driver's problems and
limitations, field experience with other implementations and the known
difficulty of conforming with the murkier corners of the AIS standards
cause the authors to strongly suspect that the GPSD AIVDM driver is
the most capable AIS decoder available *anywhere*. The authors are not
pleased or reassured by this evaluation, considering it mainly a
reflection of the shambolic state of the AIS standards.

=== ais.py ===

ais.py is an undistributed developer tool in the repository of the
GPSD project.  Raymond wrote it as a check on the project's C decoder,
using as different an implementation strategy as he could imagine and
attempting to avoid noaadata's problems. It is not intended as a
production tool.

ais.py is approximately a thousand lines of Python code, which includes
extensive report-generation features as well as the datagram analysis.
Most of the code consists of lists of declarations in a tiny
domain-specific language adapted for describing the layout of AIS
datagrams; these are interpreted by 250 lines of a relatively simple
execution loop.
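
A drastically reduced sketch of this style of decoder is shown here. It is not the ais.py code; the declaration tuples and the function name are hypothetical, and the datagram is modeled as a string of `0` and `1` characters.

```python
# Hypothetical declarations: (name, width in bits, kind, scale or None)
FIELDS = [
    ("msgtype", 6, "unsigned", None),
    ("flag", 1, "bool", None),
    ("speed", 10, "unsigned", 0.1),
    ("offset", 8, "signed", None),
]

def run_layout(bits, fields):
    """Interpret the declarations left to right over the datagram."""
    out, pos = {}, 0
    for name, width, kind, scale in fields:
        chunk = bits[pos:pos + width]
        if len(chunk) < width:
            raise ValueError(f"datagram ends inside field {name!r}")
        value = int(chunk, 2)
        if kind == "signed" and chunk[0] == "1":
            value -= 1 << width       # two's-complement sign extension
        elif kind == "bool":
            value = bool(value)
        if scale is not None:
            value *= scale
        out[name] = value
        pos += width
    return out
```

The loop visits each declaration exactly once, in order, and keeps no state beyond the current bit position. That is what makes it small, and it is also why a field whose meaning depends on a flag appearing later in the datagram cannot be expressed in it.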

While this approach is far more elegant than ad-hoc C code and in many
ways more successful than noaadata, it runs headlong into AIS's design
errors.  The execution loop is essentially a recursive stream parser
and cannot handle the post-positioned dispatch field in message type
22.  It cannot merge the A and B parts of message type 24.  It cannot
handle the embedded variable-length string field in message type 26
(though this could be fixed relatively easily by extending the
minilanguage). It handles the split string field in message type 21
only via a rather embarrassing kluge.  And it does not handle the IMO
specials of type 6 and 8 at all.

As with noaadata, ais.py shows that in a collision between declarative
specification and the ugly realities of AIS, declarative specification
does not fare very well.

ais.py also has a performance problem unrelated to AIS complexity; on
PC hardware generally available at present (May 2011) it is not fast
enough to handle a real-time stream from AISHub. This is an issue with
the performance of bit-vector operations under Python, and would probably
be replicated in any other scripting language.

=== libais ===

Libais is a hand-coded C++ parser for AIS messages.  It was written to
replace noaadata for the ERMA / GeoPlatform response to Deepwater
Horizon <<Schwehr2011>> when noaadata was not able to keep up with the
real time volume of AIS traffic from 5000 vessels in the NAIS network.
The design is a hand-written, very simple object-oriented class tree
using inheritance.  A CPython interface is provided that returns a
dictionary for each AIS message.

// FIX: Kurt: keep writing

== Drawing lessons from the implementations ==

Lesson: Do a reference implementation *before* you publish an application 
protocol as a standard.

No application protocol with as many design problems as AIS has would
have made it out of committee if a reference decoder had been
in development in parallel with the specification; implementation
problems would have thrown them into sharp relief.

The IETF (Internet Engineering Task Force) has a custom of not
allowing a network protocol design to be published as a proposed
standard until it has at least two conforming implementations.  This
would be a good practice for everyone else to emulate.

We would go even further and say that as a best practice, the
reference implementation should be open source. In this way, third
parties can see exactly what degree of complexity is required to
implement the standard, exerting a valuable pressure towards simplicity.

Lesson: Design your protocol with a small and consistent set of idioms.

Both of the authors had plans to generate an AIS decoder with at least
conditionally provable correctness from a declarative
specification. Both efforts (the noaadata decoder and ais.py) failed
because AIS is just plain too irregular to be captured this way.

We recommend that application-protocol designers should, as a routine
part of their process, render the design as a specification in <<ASN.1>> or
<<bdec>>.  If this process is painful and complex, writing a decoder
will also be painful and complex, and the design should be considered
broken until that is fixed.
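
As a rough illustration of what a declarative rendering looks like in
practice, the sketch below drives a decoder from a table of field
names and widths.  It is a hypothetical minimal example, not the
minilanguage used by ais.py or noaadata; the names `HEADER_FIELDS` and
`decode_fields` are our inventions for this illustration, and only the
first three fields of a type 1 position report are tabulated.

```python
# Hypothetical field table: (name, width in bits, signed?).
# Only the leading fields of a type 1 message are listed here.
HEADER_FIELDS = [
    ("type", 6, False),
    ("repeat", 2, False),
    ("mmsi", 30, False),
]


def decode_fields(bits, table):
    """Walk a table of fixed-width fields over a string of '0'/'1' characters."""
    result = {}
    offset = 0
    for name, width, signed in table:
        chunk = bits[offset:offset + width]
        if len(chunk) < width:
            raise ValueError("datagram too short for field %r" % name)
        value = int(chunk, 2)
        # Two's-complement interpretation for signed fields.
        if signed and chunk[0] == "1":
            value -= 1 << width
        result[name] = value
        offset += width
    return result
```

A fully regular protocol would need nothing beyond tables of this
kind.  Each irregularity (a dispatch field, a split string, a field at
a variable offset) forces either a richer table language or a retreat
to hand-written code.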

Lesson: Ship your standard with test pairs.

Both authors find that, even as mis-designed as parts of AIS are, the
single most frustrating thing about writing AIS decoders is not any
one of those design problems or even all of them together.  The worst
problem is our inability to verify that the code we write is correct.

This problem could quite easily be fixed by attaching to the
standard a set of test pairs - that is, an example binary datagram
for each possible variation of message shape together
with a textual, human-readable decode of that datagram.
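
A test pair can be as simple as a payload string matched with the
field values a correct decoder must report.  The sketch below is a
minimal illustration, not an official conformance suite: the payload
is a sample type 1 payload in AIVDM six-bit armoring, the expected
values were worked out by hand for this example, and the names
`dearmor`, `decode_header`, and `run_test_pairs` are hypothetical.

```python
def dearmor(payload):
    """Turn AIVDM six-bit ASCII armoring into a string of '0'/'1' characters."""
    bits = []
    for ch in payload:
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(format(value, "06b"))
    return "".join(bits)


def decode_header(payload):
    """Decode only the leading fields of a type 1 position report."""
    bits = dearmor(payload)

    def ubits(start, width):
        return int(bits[start:start + width], 2)

    return {
        "type": ubits(0, 6),
        "repeat": ubits(6, 2),
        "mmsi": ubits(8, 30),
        "status": ubits(38, 4),
    }


# Each pair: (armored payload, expected human-readable decode).
TEST_PAIRS = [
    ("177KQJ5000G?tO`K>RA1wUbN0TKH",
     {"type": 1, "repeat": 0, "mmsi": 477553000, "status": 5}),
]


def run_test_pairs(decoder, pairs):
    """Return a list of (payload, expected, actual) for every mismatch."""
    failures = []
    for payload, expected in pairs:
        actual = decoder(payload)
        if actual != expected:
            failures.append((payload, expected, actual))
    return failures
```

A decoder passes when `run_test_pairs` returns an empty list.  A real
suite would carry one such pair per message shape and would cover
every field, not just the header.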

Shipping test pairs with a standard has particular importance in
application protocols for life-critical software, where mistakes can
easily kill. But it's good practice everywhere.  The combination
of official test pairs with an open-source reference implementation
is the most effective means we know of to ensure that a standard
is implemented correctly and completely.

Lesson: Closed, paywalled standards endanger lives.

While writing our decoders we had occasion to evaluate about a dozen others.
For some, such as <<GNU-AIS>>, we could read source code; for others, we could
only see sample results and read documentation.

The completeness and correctness of these implementations were generally poor.
Most could handle only a small subset of the AIS messages - typically types 1, 2,
3, and 5.  We found bugs on even the most superficial examination. It quickly
became evident that these limitations were a result of the actual standards
being relatively expensive and inaccessible; implementers had been reduced to
trying to reverse-engineer the standard from information that had leaked
out about it, and not uncommonly made errors in doing so.

While that situation was alleviated by the publication of <<RAYMOND1>>
and later by an ITU policy change that made <<ITU1371>> available as a
free download, those poor implementations are still with us.  Someday,
incautious use of one of them might well kill somebody.

The lesson is clear.  Making a standard expensive to get damages the
quality of attempted implementations. This is flat-out unacceptable
when the standard is life-critical. Even when it is not, we can expect 
the costs and damages imposed by paywalling to greatly exceed the maximum 
revenue anyone can ever expect to collect from it; the economist's
concept of "deadweight loss" applies with particular force here.

== Ways forward for AIS ==

The AIS protocol's design problems will not be easily
solved. Implementations are widely dispersed and often difficult to
access for firmware upgrades, so the deployment problem is
significant. Some major defects, such as the handling of string data,
are too deeply embedded to be removed. But there are steps that could
be taken to improve the quality of decoders and prevent further
proliferation of dangerous complexity.

A vital first step would be publication of a definitive set of test
pairs in conjunction with the standard and any future extensions. The
ITU should also mandate that jurisdiction authorities issuing extensions
publish complete sets of test pairs for them. With such a conformance suite
to test against, quality and breadth of conformance in AIS decoders
would improve dramatically and rapidly.

The single point defect with the largest ripple effect on complexity
of implementation is probably the post-positioned dispatch field in
message type 22.  Deprecating this message and replacing it with a
better-structured one would eventually enable a significant reduction
in decoder complexity.
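
To see why a post-positioned dispatch field is so costly, consider the
toy layout sketched below.  It is a made-up 17-bit message, not the
real type 22 layout: the first 16 bits are either one 16-bit
destination or two 8-bit coordinates, and only the final bit says
which.  The names `ubits` and `decode_toy` are hypothetical.

```python
TOTAL_BITS = 17  # size of the made-up message


def ubits(value, start, width):
    """Extract an unsigned field from an integer holding TOTAL_BITS bits,
    counting bit 0 as the most significant bit."""
    shift = TOTAL_BITS - start - width
    return (value >> shift) & ((1 << width) - 1)


def decode_toy(value):
    """Decode the toy message; the dispatch flag sits *after* the
    fields whose meaning it controls."""
    # The decoder must jump ahead to the last bit before it can
    # interpret anything that precedes it.
    addressed = ubits(value, 16, 1)
    if addressed:
        return {"addressed": True, "dest": ubits(value, 0, 16)}
    return {
        "addressed": False,
        "x": ubits(value, 0, 8),
        "y": ubits(value, 8, 8),
    }
```

A decoder that walks fields strictly in order cannot handle this
layout without look-ahead or backtracking.  Had the flag come first,
a simple front-to-back table walk would have sufficed.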

Most of the rest of our guidance for designers of future AIS messages
should be obvious from the design rules we've pointed out. Dispatch
fields are complexity poison, so keep them to a bare minimum; don't
split string fields; avoid fields with variable offsets; don't split
message layouts across multiple datagrams.

//FIX: Kurt, you should explain the necessity of a registry for extensions.

Again, life-critical systems concentrate the mind wonderfully.  We
cannot stress strongly enough that, in an
application domain like AIS's, bad protocol-design decisions are not
merely annoying but, because of the way they feed through into
increased code complexity and defect rates, actually dangerous.


== References ==

// FIX: get these references into the bibtex file and switch to doing an include.

[bibliography]

//include::toils.ref.txt[]

- [[[RAYMOND1]]] http://catb.org/~esr/gpsd/standards/NMEA.txt[AIVDM/AIVDO 
  Protocol Decoding]

- [[[ITU1371]]]
  http://www.itu.int/rec/R-REC-M.1371-4-201004-I/en[ITU 
  Recommendation on the Technical Characteristics for a Universal
  Shipborne Automatic Identification System (AIS) using Time Division
  Multiple Access in the Maritime Mobile Band]. 

- [[[IMO236]]] http://www.imo.org/includes/blastData.asp/doc_id=4503/236.pdf[IMO
   Circular 236: Guidance on the Application of AIS Binary Messages (May 2004)]

- [[[IMO289]]]
  http://vislab-ccom.unh.edu/~schwehr/papers/2010-IMO-SN.1-Circ.289.pdf[IMO
  SN.1/Circ.289 Guidance on the Use of AIS Application-Specific Messages
  (June 2010)]

- [[[CATB]]]
  http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ar01s05.html[The Cathedral and the Bazaar]

- [[[IEC-PAS]]] IEC-PAS 61162-100, "Maritime navigation and
  radiocommunication equipment and systems".  The ASCII armoring
  is described on page 26 of Annex C, Table C-1.

- [[[AISHUB]]] http://www.aishub.net/[AIS Hub, the AIS data sharing center]

- [[[TEXTUALITY]]]
  http://www.catb.org/esr/writings/taoup/html/ch05s01.html[The
  Importance of Being Textual].

- [[[ASN.1]]] http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One[ASN.1]

- [[[bdec]]] http://www.protocollogic.com/[bdec]

- [[[GNU-AIS]]] http://gnuais.sourceforge.net/[GNU AIS - Automatic 
  Identification System for Linux]

- [[[MMM]]] Brooks, Frederick P., Jr. (1995) "The Mythical Man-Month: Essays on Software
  Engineering", Addison-Wesley, ISBN 0-201-83595-9

- [[[ASU]]] Aho, A.V., Sethi, R. and Ullman, J.D. (1986) "Compilers:
  Principles, Techniques, and Tools." Addison-Wesley Longman, ISBN
  0-201-10088-6.

- [[[IALA2004]]] FIX

- [[[LANS1996]]] FIX