Since time immemorial, humans have been drawn to the idea of creating artificial creatures and intelligences. In Jewish folklore, for example, a golem is an animated anthropomorphic being formed from inanimate clay or mud (as an aside, Helene Wecker’s break-out novel, The Golem and the Jinni, will have you on the edge of your seat at its conclusion).
More recently, mechanical automata entranced everyone who saw them. Perhaps the most astounding example of this genre is The Writer, which was completed around 245 years ago as I pen these words. Containing 6,000+ moving parts, The Writer is capable of writing any custom text up to 40 characters long. Furthermore, it can do this in cursive — a skill that sadly eludes most young humans today.
First AI Musings
Many people are surprised to hear that Ada Lovelace (1815-1852) mused about the possibility of using computers to perform tasks like creating music whilst working with Charles Babbage (1791-1871) on his Analytical Engine project. Babbage was focused on using his mechanical computer to perform mathematical and logical operations, but Ada realized that the data stored and manipulated inside computers was not obliged to represent only numerical quantities but could instead be used to represent more abstract concepts like musical notes. In her notes, she wrote:
“[The Analytical Engine] might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine…Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.”
I don’t know about you, but I find it astounding that someone was thinking about this as far back as the middle of the nineteenth century. I was also amazed to discover that Alan Turing (1912-1954) was thinking about the possibility of an artificial intelligence apocalypse — although he didn’t call it that — in the middle of the twentieth century. As he said during a lecture in 1951:
“It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers… They would be able to converse with each other to sharpen their wits. At some stage therefore, we should have to expect the machines to take control.”
Just five years later, the Dartmouth Workshop — more formally, the Dartmouth Summer Research Project on Artificial Intelligence — took place. This gathering is now considered to be the founding event of the field of artificial intelligence as we know and love it today.
The 1970s and 1980s saw the emergence of expert systems, which are considered to be among the first successful forms of primitive artificial intelligence. These systems were designed to solve complex problems by wading through large (for the time) amounts of data. One common approach was to codify the knowledge of multiple experts as a collection of if-then rules. The resulting facts and rules were stored in some form of knowledge database. The other part of the system was an inference engine, which applied the rules to known facts in order to deduce new facts.
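To make the rules-plus-inference-engine idea a little more concrete, here is a minimal sketch of one common strategy, forward chaining, in Python. The facts and rules are invented purely for illustration (they don't come from any real expert system), and a production system of the era would have been far more elaborate.

```python
# Hypothetical knowledge base: each rule is (set of premises, conclusion).
# These rules are made up for illustration only.
RULES = [
    ({"has_fever", "has_cough"}, "may_have_flu"),
    ({"may_have_flu", "short_of_breath"}, "refer_to_doctor"),
]


def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new facts can be deduced."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule "fires" when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"has_fever", "has_cough", "short_of_breath"}, RULES)
print(sorted(derived))
```

Note that the second rule can fire only after the first has added its conclusion to the set of known facts, which is exactly the "deduce new facts from known facts" behavior described above.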
Have you noticed that there’s a current trend to avoid gluten? As a result, while walking around a supermarket, it’s common to see product packaging proudly proclaiming, “Gluten Free!” for consumables that never, ever contained gluten in the first place.
Well, much the same thing happened with the terms “expert systems” and “artificial intelligence” in the late 1980s and early 1990s. Almost everywhere you looked, you saw boasts of, “powered by artificial intelligence.” The result was to set everyone’s teeth on edge to the extent that few people outside of academia could even bear to hear the term “artificial intelligence” for the following two decades.
Artificial intelligence was largely confined to academic research until a “perfect storm” of developments in algorithms and computational technologies thrust it onto center stage circa 2015.
Created by the American research, advisory, and information technology firm Gartner, the Hype Cycle is a graphical depiction used to represent the maturity, adoption, and social application of specific technologies.
According to the Hype Cycle, the five phases of a technology’s life cycle are the Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, and Plateau of Productivity. The speed of development in the artificial intelligence field is such that, in the 2014 incarnation of the Hype Cycle, technologies like artificial intelligence, artificial neural networks, machine learning, and deep learning weren’t even a blip on the horizon. By comparison, just one year later, machine learning had already crested the Peak of Inflated Expectations in the 2015 Hype Cycle.
Let’s start by considering the happy face of AI…
In the past couple of years, AI-powered speech recognition has come along in leaps and bounds. Hundreds of millions of voice-enabled assistants like the Amazon Echo are now resident in our homes and offices, and it won’t be long before voice control becomes ubiquitous.
One of the big issues with speech recognition is the “cocktail party” problem, in which multiple people (possibly accompanied by other noise sources like televisions, radios, air conditioning, etc.) are talking at the same time. We humans have an incredible ability to focus our auditory attention on whomever we are speaking to and to filter out other voices and stimuli.
In 2017, XMOS acquired Setem Technologies. The combination of multi-processor xCORE devices from XMOS with Setem’s signal separation technology allows users to disassemble a sound space into individual voices, and to subsequently focus on one or more selected voices within a crowded audio environment.
Initially, this XMOS-Setem technology worked well only with multi-microphone arrays, like the 7-microphone setup used on the majority of Alexa-enabled devices. Having multiple microphones facilitates noise filtering and echo cancellation, and the different arrival times of the same signal allow the system to determine the location of the sound source of interest.
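For anyone wondering how arrival times translate into a direction, the following Python sketch shows one textbook approach with just two microphones: slide one signal against the other to find the delay at which they match best (a brute-force cross-correlation), then convert that delay into an angle. I hasten to add that this is not the XMOS-Setem algorithm, whose details I don't know. The function names, the synthetic pulse, the 48 kHz sample rate, and the 5 cm microphone spacing are all assumptions made for the sake of the example.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 degrees C


def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) at which mic_b best matches mic_a.

    A positive result means the sound reached mic_b later than mic_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(mic_a):
            j = i + lag
            if 0 <= j < len(mic_b):
                score += a * mic_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_degrees(lag, sample_rate, mic_spacing):
    """Convert a delay in samples to an angle from broadside.

    Assumes a distant source (far-field), so the wavefront is roughly flat.
    """
    ratio = SPEED_OF_SOUND * (lag / sample_rate) / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


# Synthetic example: the same short pulse reaches mic_b 3 samples after mic_a.
pulse = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]
mic_a = pulse + [0.0] * 10
mic_b = [0.0] * 3 + pulse + [0.0] * 7

lag = estimate_delay(mic_a, mic_b, max_lag=5)
print(lag, bearing_degrees(lag, sample_rate=48_000, mic_spacing=0.05))
```

Real systems have to cope with noise, reverberation, and several people talking at once, so treat this as an illustration of the principle and nothing more.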
How many ears do you have on your head? I get by with two, but even struggling under this limitation I can tell in which direction a sound originates — including whether the source is in front of or behind me — with my eyes closed (that’s how good I am). Just a couple of months ago, the folks at XMOS announced that developments in their algorithmic technology now allow them to perform the same tasks with a 2-microphone array that would have required a 7-microphone array only a year ago.
In 2016, I attended the Embedded Vision Summit in Silicon Valley. This event was a real eye-opener (no pun intended). I saw embedded vision applications of a level of sophistication I didn’t realize existed in the real world. The exhibit hall was jam-packed with demonstrations and attendees. In fact, the in-conference joke was that you couldn’t swing a cat without a host of embedded vision applications saying, “Hey, there’s a human swinging a cat over there!”
A year later, I was scheduled to give a talk on the evolution of technology at the Embedded Systems Conference (ESC). As part of this, I asked the folks at CEVA if I could borrow one of their demo systems. The week before the conference, I set the system up in the bay outside my office. A small notepad computer displayed random images culled from the internet. A webcam on a tripod was set up to monitor this screen, with its output being fed into an object detection and recognition development board boasting CEVA’s machine vision IP running on a Xilinx FPGA. The output from this board fed a larger display, which showed the original image annotated with a caption saying what it was.
A new image appeared every second, accompanied by captions like “Steel Bridge,” “Human Baby,” “Airplane,” “Ship,” “African Elephant,” “Fluffy Toy,” “Bicycle,” and so forth. At that time, I was renting my office in an engineering building. I asked the supervisor downstairs if he wanted to let the lads on the production floor come up to see something cool, and he agreed.
It wasn’t long before we had a group of these young men saying, “Ooh” and “Aah” and generally having a good time. And then one (there’s always one) said, “Hang on, Max, how do we know that you haven’t trained this system to know what these images are?”
Actually, that was a fair enough question (we’ve all seen “demos”… I’ll say no more). So, between gritted teeth, I replied, “Why don’t you pick up the webcam and point it at things to see what the system says?” The lad in question, whom we’ll call JB (because that’s what everyone calls him), picked up the webcam and pointed it at a pen, and the system said, “Pen.” Then he pointed it at a screwdriver, and the system said, “Screwdriver.” And so it went, until JB unexpectedly turned and pointed the camera at one of his companions.
Admittedly, the guy in question was a bit scruffy that day, hair awry, unshaven, and wearing a T-shirt that had seen better days, but we were all taken aback when the system identified him as, “Plumber’s Helper.” (Unfortunately, I fear this will be his nickname for many years to come.)
The point is that this was a pretty interesting inference. It’s not that you would say, “plumber’s helper,” to yourself if you saw him strolling around Walmart. On the other hand, if you had a lineup of young lads and you were asked to guess which one was a plumber’s helper…
Sad to relate, we’ve covered only the “low hanging fruit” by talking about speech recognition and machine vision. In reality, artificial intelligence is turning up in all sorts of unexpected places.
Many of these applications are truly exciting, with the potential to make the world a better place for all of us. On the other hand, there’s… but no, I’m afraid you’ll have to wait until Part 2 of this mini-series before I expose you to the dark side of artificial intelligence and really give you a “bad hair” day.