From one point of view, the embedded developer is the major obstacle that stands between the marketing guy’s vision of that cool new product and gushing streams of revenue flowing from grateful customers. From that perspective—and it holds some truth—the most important thing a developer can do is finish the job fast.

This imperative has driven a lot of the innovation in the embedded software world. Not long ago, the serious embedded developer hand-crafted assembler on raw iron using a text editor. How times have changed! Embedded operating systems, cross-compilers, simulators, debuggers, and more all ease the task of the embedded developer.

These tools and components are aimed at making it easier to build a single monolithic application for a device. But modern device environments are often much more complicated than that, and recently a new class of application middleware has emerged, offering rich functionality to support the development of multiple applications sharing data and other resources on a single device. Key among these is a database management system.

That’s right, a DBMS in an embedded device. But this is not your father’s database. The new breed of database manager designed for embedded devices is slender and streamlined. While an enterprise-class DBMS might only feel comfortable in 640 megabytes of RAM, its embedded cousins can squeeze into as little as a single megabyte, or even less. And some offer sophisticated extensions that uniquely suit them to the embedded environment.

Three Approaches to Data Management

There are three fundamentally different DBMS technologies available for devices: data management libraries, object databases and relational databases.

Data management libraries provide call-level APIs to create and maintain searchable data structures. They offer the smallest footprint but the lowest level of data management service. Think of them as device drivers for data. Just as a device driver lets you deal with the logical functioning of a device without worrying too much about which register bit to flip, so a data management library allows you to store and retrieve data without too much concern over the actual data structures.
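As a rough illustration of the call-level style, the sketch below uses Python's standard `dbm.dumb` module as a stand-in for a data management library; the article does not name a specific product, and the file name `settings` and the keys are made up for the example. The application stores and fetches values by key and never sees the on-disk layout, but it also gets no query language: anything beyond a key lookup has to be coded by hand.

```python
import dbm.dumb
import os
import tempfile


def store_and_fetch():
    """Write two key/value pairs, reopen the store, and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "settings")  # hypothetical store name

        # Create the store and write through the call-level API.
        with dbm.dumb.open(path, "c") as db:
            db[b"volume"] = b"7"
            db[b"language"] = b"en"

        # Reopen read-only: lookups are by exact key, nothing more.
        with dbm.dumb.open(path, "r") as db:
            return db[b"volume"], sorted(db.keys())


volume, keys = store_and_fetch()
```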

Object database management systems (ODBMS) are designed principally to provide persistent storage for data objects. They allow you to write an object-oriented application without too much concern for how objects are stored, and they tend to integrate very well with object-oriented programming languages such as Java to provide object persistence.

However, the great majority of the world's databases are managed by relational database management systems (RDBMS). This model won out largely because it rests on a mathematically sound foundation and comes with a standard query language, SQL (Structured Query Language). The set-oriented SQL language allows an application to issue high-level queries that retrieve data by content, not by address. This makes it an excellent choice for applications that must search for data.

Where’s the Meat?

So what would motivate a developer to embed a DBMS into an embedded application? It comes down to a short but important list of advantages that together add up to faster time to market for a device application:

Local search
High-level query language
Transactions and recovery

Local Search

As disk and Flash prices continue to drop, devices are storing more and more data. User interfaces, however, are not keeping up. Even in “cool” devices such as the iPod, the user interface is a simple hierarchical listing of file folders. But most people think of music in several ways at once: by artist, genre, album, mood, style, how much their friends like it, and so on. No single folder structure can support such a wide and flexible view of the content, but an RDBMS is designed to do just this, allowing multiple metadata references for a single item.
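To make the idea of multiple metadata references concrete, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. SQLite is only a convenient stand-in for an embedded RDBMS, and the `track` and `tag` tables, the sample titles, and the `tracks_with` helper are all assumptions made for the example. Each track carries any number of tags, so the same item can be found by genre, by mood, or by artist without being filed in a single folder.

```python
import sqlite3


def build_library():
    """Create an in-memory music library with free-form tags per track."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE track (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tag (
            track_id INTEGER NOT NULL REFERENCES track(id),
            kind     TEXT NOT NULL,   -- e.g. 'artist', 'genre', 'mood'
            value    TEXT NOT NULL
        );
    """)
    conn.executemany("INSERT INTO track VALUES (?, ?)",
                     [(1, "Blue Train"), (2, "So What"), (3, "Paranoid")])
    conn.executemany("INSERT INTO tag VALUES (?, ?, ?)", [
        (1, "artist", "John Coltrane"), (1, "genre", "jazz"), (1, "mood", "upbeat"),
        (2, "artist", "Miles Davis"),   (2, "genre", "jazz"), (2, "mood", "mellow"),
        (3, "artist", "Black Sabbath"), (3, "genre", "rock"), (3, "mood", "upbeat"),
    ])
    return conn


def tracks_with(conn, kind, value):
    """Return titles of all tracks carrying the given tag, sorted by title."""
    rows = conn.execute(
        "SELECT t.title FROM track t JOIN tag g ON g.track_id = t.id "
        "WHERE g.kind = ? AND g.value = ? ORDER BY t.title",
        (kind, value))
    return [title for (title,) in rows]


library = build_library()
jazz = tracks_with(library, "genre", "jazz")
upbeat = tracks_with(library, "mood", "upbeat")
```

The same track ("Blue Train" here) shows up in both result lists, which is exactly what a single folder hierarchy cannot express.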

Figure 1: Tables in an RDBMS

As devices become more general purpose, they are starting to contain data created by several different applications. Often the user wants to combine disparate data to create a new view. Imagine a location-aware mobile device that contains local data and also downloaded maps. Integrating the local data with the map and position data enables device users to run spatial queries on the device itself. This reduces the cost of operating the device (fewer calls to map servers) and supports disconnected operation as well.

High-Level Query Language

The high-level query language is the major feature driving acceptance of the relational data model. In an RDBMS, data is stored in tables made up of rows and columns. Each column has a name and a fixed datatype. Each row is an individual record, usually identified by a unique key, such as part number or invoice number. Tables often contain references to the unique keys of other tables; these are called foreign keys. Foreign keys enable tables to be cross-referenced to other tables by content, not by pointer. This insulates applications from knowledge of data storage and allows the RDBMS to optimize and reorganize storage without affecting running applications.
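The following sketch shows tables, unique keys, and a foreign key in practice. It again assumes SQLite via Python's `sqlite3` module as a stand-in for an embedded RDBMS; the `customer` and `invoice` tables and both helper functions are invented for illustration. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection; other products may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is opt-in
conn.executescript("""
    CREATE TABLE customer (
        customer_no INTEGER PRIMARY KEY,          -- unique key
        name        TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoice_no  INTEGER PRIMARY KEY,          -- unique key
        customer_no INTEGER NOT NULL
                    REFERENCES customer(customer_no),  -- foreign key
        amount      REAL NOT NULL
    );
    INSERT INTO customer VALUES (100, 'Acme Pumps'), (200, 'Baker Tools');
    INSERT INTO invoice VALUES (1, 100, 250.0), (2, 100, 75.5), (3, 200, 40.0);
""")


def invoices_for(conn, customer_name):
    """Cross-reference the two tables by key value, not by pointer."""
    return conn.execute(
        "SELECT i.invoice_no, i.amount FROM invoice i "
        "JOIN customer c ON c.customer_no = i.customer_no "
        "WHERE c.name = ? ORDER BY i.invoice_no",
        (customer_name,)).fetchall()


def orphan_invoice_rejected(conn):
    """Try to insert an invoice for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO invoice VALUES (4, 999, 10.0)")
    except sqlite3.IntegrityError:
        conn.rollback()
        return True
    conn.rollback()
    return False


acme_invoices = invoices_for(conn, "Acme Pumps")
rejected = orphan_invoice_rejected(conn)
```

The application names the customer it wants and never learns where or how the rows are stored, which is what leaves the RDBMS free to reorganize storage underneath it.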

Figure 2: B-Tree Indexing

Because SQL is intrinsically a set-oriented language, a single query may return zero, one, or many matching records. This is an excellent solution for applications where the search criteria are not very precise, or where the device user is looking for a number of suitable options from which to choose.

Transactions and Recovery

As devices manage more data and more kinds of data, the changes that are made to the data become more complex. A transaction is a way of gathering a number of data changes into a single action that either completes, or does not happen at all. A field worker may want to close out a field support engagement by writing an invoice record, a completion record and a time slip, and updating a customer record. If any of these is incomplete, data integrity is threatened.

Figure 3: RDBMS Cache Management

The RDBMS offers built-in transaction support that allows an application to gather any set of data operations into an atomic unit. The RDBMS guarantees that every transaction is treated as an indivisible unit, even through power failure. This means that even if the user drops the unit in the middle of a transaction, causing the battery to fall out, when the device reboots, the RDBMS will restore the data to a consistent state by backing out the partial transaction.
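The field-service example can be sketched as a single transaction. This is a minimal illustration using SQLite via Python's `sqlite3` module as a stand-in for an embedded RDBMS; the table layout and the `close_job` and `count` helpers are assumptions. A power failure cannot be reproduced in a few lines, so a constraint violation on the time slip stands in for the interruption: the effect on the data is the same, in that the partial work is backed out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer   (id INTEGER PRIMARY KEY, open_jobs INTEGER NOT NULL);
    CREATE TABLE invoice    (job_id INTEGER PRIMARY KEY, amount REAL NOT NULL);
    CREATE TABLE completion (job_id INTEGER PRIMARY KEY);
    CREATE TABLE time_slip  (job_id INTEGER PRIMARY KEY,
                             hours  REAL NOT NULL CHECK (hours > 0));
    INSERT INTO customer VALUES (1, 1);
""")


def close_job(conn, job_id, customer_id, amount, hours):
    """Apply all four changes as one unit; return True only if all succeed."""
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute("INSERT INTO invoice VALUES (?, ?)", (job_id, amount))
            conn.execute("INSERT INTO completion VALUES (?)", (job_id,))
            conn.execute("INSERT INTO time_slip VALUES (?, ?)", (job_id, hours))
            conn.execute(
                "UPDATE customer SET open_jobs = open_jobs - 1 WHERE id = ?",
                (customer_id,))
        return True
    except sqlite3.IntegrityError:
        return False


def count(conn, table):
    """Row count for a table (table name is trusted; this is only a sketch)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# First attempt fails on the time slip (hours must be positive) ...
failed_attempt = close_job(conn, 7, 1, 120.0, 0)
invoices_after_failure = count(conn, "invoice")

# ... and leaves no trace, so the retry with valid data succeeds cleanly.
second_attempt = close_job(conn, 7, 1, 120.0, 1.5)
```

After the failed attempt, the invoice and completion rows written earlier in the same transaction are gone again, so the retry does not collide with leftovers.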

Figure 4: SQL Processing Cycle

The RDBMS also provides transaction isolation, so that other applications running on the device cannot see intermediate states of the data. From their perspective, the transaction has either not begun, or it has completed. They are unable to get an inconsistent view of the data.
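Isolation can be observed with two connections to the same database file, standing in for two applications on one device. The sketch below assumes SQLite in its default journal mode, accessed through Python's `sqlite3` module; the `reading` table and the `demo_isolation` function are made up for the example, and other embedded RDBMS products may implement isolation differently.

```python
import os
import sqlite3
import tempfile


def demo_isolation():
    """Return the row counts a second connection sees during and after a transaction."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "device.db")  # hypothetical database file
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute(
                "CREATE TABLE reading (id INTEGER PRIMARY KEY, value REAL)")
            writer.commit()

            # The first INSERT implicitly opens a transaction on the writer.
            writer.execute("INSERT INTO reading VALUES (1, 20.5)")
            writer.execute("INSERT INTO reading VALUES (2, 21.0)")

            # The reader still sees the state from before the transaction began.
            during = reader.execute(
                "SELECT COUNT(*) FROM reading").fetchall()[0][0]

            writer.commit()

            # Once committed, both rows become visible together.
            after = reader.execute(
                "SELECT COUNT(*) FROM reading").fetchall()[0][0]
            return during, after
        finally:
            writer.close()
            reader.close()


seen_during, seen_after = demo_isolation()
```

The reader never observes the state with only one of the two rows present: it sees either none of the transaction or all of it.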

After All, It’s Just Code

Sure, a smart embedded application engineer can write a data storage module that is smaller and faster than any general purpose data manager. But is this the best use of scarce engineering resources? Shared, reliable data management is tricky, and single-purpose data management modules are notoriously hard to extend to satisfy unexpected requirements. The new breed of embedded DBMS allows engineers to focus on delivering differentiated customer value, not invisible infrastructure. The result: more robust and flexible applications that get to market faster and provide superior search capabilities to their users.

This article was written by Malcolm Colton, vice president of sales and marketing at the Hitachi Embedded Business Group (www.hitachi.us/entier), who can be reached at malcolm.colton@hal.hitachi.com or 408.970.1340.