feature article
Subscribe Now

Just Call Me an Indexing Fool

I love words. I wish I knew more of them. I like to believe that I have a reasonably extensive vocabulary at my disposal, but I also recognize that I know only a subset of all the words that are out there. The problem is exacerbated by the fact that new words are popping up like mushrooms, while the meaning of existing words can evolve over time.

There are many different aspects to words, including the way in which we write them down. As I mentioned in my recent column — Using USB4? Require ReTiming? Kandou Can-Do! I’m currently working on a book: Wroting Inglish: The Essential Guide to Writing English for Anyone Who Doesn’t Want to be Thought a Dingbat. One of the tidbits of trivia I share in this book is as follows:

When we come to written language, at one end of the spectrum we have pictographs/pictograms (a pictorial symbol representing a word or phrase) and logographs/logograms (a sign or character representing a word or phrase). Chinese characters provide a familiar example of logograms (a handful derive from pictograms). Imagine the problems associated with having tens of thousands of different characters to work with. Being encumbered with such a system has many ramifications, such as the fact that you are somewhat unlikely to go on to invent things like the Morse telegraph or conceive the concept of a typewriter. How would you set about creating a dictionary for such a language? Constructing a crossword puzzle would be problematic at best, and it would take at least six strong men to carry the box of tokens used in the Chinese equivalent of a game of Scrabble.

Toward the other end of the spectrum, we have alphabet-based languages in which a standard set of letters (symbols or graphemes) are used to capture the written word.  There are dozens of alphabets in use today, the most popular being the Latin alphabet, which was itself derived from the Greek. Many languages, including English, use modified forms of the Latin alphabet.

One interesting point to ponder is that we currently have only 26 letters in the English language (I say “currently because this was not always the case — also, the sounds associated with certain letters and letter combinations have changed over time). One issue here is that there are more sounds in the English language than there are letters to represent them. We call these distinctive sounds “phonemes,” and, depending on one’s regional accent, most people employ around forty-four.

Now, your knee-jerk reaction might be to simply add more letters such that we had one letter for each phoneme, but this wouldn’t actually solve the spelling problem because everyone would write things down phonetically (i.e., the way they said them using their own accent and/or dialect), which basically means everyone would have their own way of spelling things. In order to address this problem, we came up with a variety of techniques, like adding a “silent” ‘e’ to the end of a word to indicate that the preceding vowel should be pronounced in an elongated way; for example, “hop” is said with a short ‘o’ sound, while “hope” is pronounced with a long ‘o’ sound (similarly for the ‘a’ in “hat” and “hate”). An alternative approach is to double up on vowels, like ‘ee’ in “seek” and ‘oo’ in “boot.”

But we digress… A little earlier, while talking about Chinese characters in the form of logograms (and pictograms), we posed the question: “How would you set about creating a dictionary for such a language?” This task is vastly simplified when using an alphabet-based language, especially if we all agree to use the same ordering from ‘a’ to ‘z’, which means that we know that we’ll find the ‘g’ words between the ‘f’ words and the ‘h’ words, and so forth.

As an aside, I had no idea as to how complex it was to create the first “Full Monty” dictionaries until I read The Professor and the Madman: A Tale of Murder, Insanity, and the Making of the Oxford English Dictionary by Simon Winchester. This is one of the books that I would wholeheartedly recommend to anyone who is interested in learning more about “stuff” (see also my review: A Professor, a Madman, and a Love of Words).

In addition to enabling the creation of dictionaries, the use of an alphabet-based language in which we all agree to a defined ordering facilitates the creation of indexes at the back of books. As the Wikipedia tells us: “An index (plural: usually indexes, more rarely indices) is a list of words or phrases (‘headings’) and associated pointers (‘locators’) to where useful material relating to that heading can be found in a document or collection of documents.”

Indexes are obviously a jolly useful idea, so it’s unfortunate that they are so poorly executed in Oh so many books. It may be that authors are so exhausted by the time they finish their works that they cannot rouse the energy or enthusiasm to devote the time required to create a good index, or it may be that they assume this is a job for their publishers. In this latter case, I fear that the task is, oftentimes, either performed by a computer program that indexes everything exhaustively but ineffectively, or it’s handed over to a junior employee who lacks sufficient subject matter expertise to do the job justice.

Even in the case of a book with a relatively robust index — like A Crack in Creation: Gene Editing and the Unthinkable Power to Control Evolution by Jennifer A. Doudna and Samuel H. Sternberg, for example (see also my review: Life as We Knew It) — I find things that annoy me. For example, I recently used this index to track down the word “phages,” only to be instructed to, “see bacteriophages.” Arrggghhh (and I mean that most sincerely), so I bounced around as ordered to discover “bacteriophages, 46-50”.

Why couldn’t the index simply say, “phages (bacteriophages), 46-50” and also “bacteriophages (phages), 46-50”? Actually, there is a reason, in that this entry has a number of indented sub-entries, so I can see why the people creating the index and printing the book would like to keep things as concise as possible. From a user’s perspective, however, this is a pain in the nether regions. I don’t want to spend the rest of my life trapped in an index being redirected from one entry to another — all I want is to track down the information I’m looking for in the most efficient and timely manner possible.

As another aside, the worst example I’ve ever experienced of indexorial redirection (yes, I made “indexorial” up — sue me), is in Being and Nothingness by Jean-Paul Sartre. Although many regard this book to be a philosophical classic and major cornerstone of modern existentialism, I’m more tempted to describe it as the ravings of a deranged man, but perhaps this is because I once got trapped in its index and spent weeks trying to get out. The problem is that everything is defined in a circular manner. As you are reading, you are presented with a factoid like, “The ‘For Itself’ is inextricably intertwined with the ‘Is Itself’.” So, you go to the index looking up ‘Is Itself,’ which points you at another part of the book, where you find that ‘Is Itself’ is predicated by the ‘Of Itself.’ You return to the index to track down ‘Of Itself,’ which points to yet another part of the book where you read that ‘Of Itself’ is an integral aspect of the ‘For Itself.’ Sad to relate, your next trip to the index to locate ‘For Itself’ returns you to your original starting point (and people wonder why I drink).

The reason for my rambling musings here is that my chums Adam Taylor, Dan Binnun, and Saket Srivistava recently finished writing a book called A Hands-On Guide to Designing Embedded Systems. They just shared an early PDF copy with your humble narrator and asked me to read it and share my thoughts before it goes to press (I’ll be writing a full review here on EEJournal in the not-so-distant future).

First, I took a superfast skim through the PDF to get a high-level view of everything, which led me to a page titled “Index” with an accompanying note saying, “Index goes here.” Do you remember my saying that they asked me to share my thoughts? Well, my first thought was, “Who is going to create your index?”

I’ve spent an inordinate amount of time creating the indexes for my own books, so I scanned the first page of the index from Bebop Bytes Back: An Unconventional Guide to Computers (now sadly out of print), annotated it as shown below, and sent it to Adam for him to peruse and ponder.

The first page of the index from Bebop Bytes Back (Image source: Max Maxfield)

One thing I forgot to mention to Adam (I’ll point him to this column) is that when I’m reading a book myself, all sorts of weird and wonderful things will stick in my mind. Later, I may not be able to recall the exact name of the topic I’m looking for, but I may remember that it was associated with something tangential like a “sea cucumber,” so I go to look up “sea cucumber” in the index in the hope that it will place me in the ballpark of what I’m searching for.

It may not surprise you to learn that I’m usually disappointed. This is why, in my own books, even though they might be on subjects like electronics or math or computers, you’ll find unusual words in the index like “parrots,” “peas,” “pebbles,” and “pianos” (I just pulled these examples out of one of my books).

Looking at the image above, I see a prime example in the form of “accordion, invention of” (my index also tells me that this will be found in a footnote, which will save me time when I visit that page). Why would this be in a computer book? Well, the first British electric telegraph was created by the physicist and inventor Sir Charles Wheatstone, who also invented a form of accordion called the concertina. Six months after someone has read this book, they may remember “accordion” or “concertina,” but not recall the name “Sir Charles Wheatstone.” In the case of my index, they can come at things from multiple directions to more quickly track down what they are looking for.

I also index the same things in multiple ways, like “binary addition” and “addition, binary,” each pointing at the same page(s). In fact, I’m quite proud of everything illustrated in the image above, like the use of bold font to indicate the primary definition of an entry in a bunch of page numbers (the main definition may not be on the first reference). How about you? Do you have any thoughts you’d care to share on indexes in general and the format I’ve developed for my own indexes in particular?

One thought on “Just Call Me an Indexing Fool”

  1. my theory is that there is a general association of intelligent order with regimentation, fascism, tyranny and hence an aversion to it….the day-to-day world given over to otherly structuring is indeed reason enough to shuck some pointless burden….ah, the days when I could chug it back without much care…

Leave a Reply

featured blogs
Nov 30, 2022
By Joe Davis Sponsored by France's ElectroniqueS magazine, the Electrons d'Or Award program identifies the most innovative products of the… ...
Nov 29, 2022
Smart manufacturing '“ the use of nascent technology within the industrial Internet of things (IIoT) to address traditional manufacturing challenges '“ is leading a supply chain revolution, resulting in smart, connected, and intelligent environments, capable of self-operati...
Nov 22, 2022
Learn how analog and mixed-signal (AMS) verification technology, which we developed as part of DARPA's POSH and ERI programs, emulates analog designs. The post What's Driving the World's First Analog and Mixed-Signal Emulation Technology? appeared first on From Silicon To So...
Nov 18, 2022
This bodacious beauty is better equipped than my car, with 360-degree collision avoidance sensors, party lights, and a backup camera, to name but a few....

featured video

How to Harness the Massive Amounts of Design Data Generated with Every Project

Sponsored by Cadence Design Systems

Long gone are the days where engineers imported text-based reports into spreadsheets and sorted the columns to extract useful information. Introducing the Cadence Joint Enterprise Data and AI (JedAI) platform created from the ground up for EDA data such as waveforms, workflows, RTL netlists, and more. Using Cadence JedAI, engineering teams can visualize the data and trends and implement practical design strategies across the entire SoC design for improved productivity and quality of results.

Learn More

featured paper

How SHP in plastic packaging addresses 3 key space application design challenges

Sponsored by Texas Instruments

TI’s SHP space-qualification level provides higher thermal efficiency, a smaller footprint and increased bandwidth compared to traditional ceramic packaging. The common package and pinout between the industrial- and space-grade versions enable you to get the newest technologies into your space hardware designs as soon as the commercial-grade device is sampling, because all prototyping work on the commercial product translates directly to a drop-in space-qualified SHP product.

Click to read more

featured chalk talk

Enabling Digital Transformation in Electronic Design with Cadence Cloud

Sponsored by Cadence Design Systems

With increasing design sizes, complexity of advanced nodes, and faster time to market requirements - design teams are looking for scalability, simplicity, flexibility and agility. In today’s Chalk Talk, Amelia Dalton chats with Mahesh Turaga about the details of Cadence’s end to end cloud portfolio, how you can extend your on-prem environment with the push of a button with Cadence’s new hybrid cloud and Cadence’s Cloud solutions you can help you from design creation to systems design and more.

Click here for more information about Cadence Cloud Portfolio