Loading Software on the Fly: AppStreamer

Purdue University Solves First-World Problem

“Those who do not learn from history are doomed to repeat it.”  – George Santayana

I swear, people are trying their hardest to resurrect the 1970s. I thought the era of bell-bottom pants, 8-inch floppy disks, and bicycles with banana seats was nostalgia-proof. Yet here we are. First there was cloud computing, a throwback to the days of dumb terminals and remote time-sharing computers. Now we’ve got college-educated kids – good ones, at that – devising ways to revitalize the idea of broadcast TV and video arcades. 

Maybe that’s a bit harsh, but a research paper coming out of Purdue University looks to me like a solution in search of a problem. Or, at least, a solution to a different problem not addressed in the paper. Or maybe I’m just getting old. 

Here’s the short version: They’ve found a way to stream application code to a smartphone in real-time, as it executes. That’s a pretty slick trick. The goal of said streaming is not to improve performance or enhance security, but to reduce the storage requirements on the phone. In other words, when you’ve Instagrammed too many pictures of your food, you’ll have to start streaming your apps instead of installing them. 

Or… you could just delete some of the photos. Or, I dunno, maybe upload some of them to the cloud and free up local storage? Or – radical idea, I admit – not take so many photos of your Frappuccino? Posterity will forgive you. 

The proposed technique swaps storage for bandwidth: stream the code instead of installing it. The research behind it is impressive, I admit. It all just seems a bit… misguided. 

The paper starts out with the observation that processors don’t actually execute an entire program all at once. At any given moment, your CPU is working on only a small handful of operations, perhaps three or four, depending on the length of its pipeline. You could, theoretically, make the rest of the program disappear and just feed in instructions as needed. Sort of like laying railroad tracks directly in front of a speeding locomotive and pulling them up from behind.  

That’s effectively what the Purdue team is suggesting. To save space, just deliver the bits of the program (see what I did there?) that the processor needs right now, keep the rest in the cloud, and download it just in time. 

That’s a fun idea, and it would be easy to accomplish if computer programs ran in a straight line. For straight-line code, you’d just tee up the next instruction, and the next, and the next, until you got to the end. But real programs don’t work that way. They’re messy, they jump around, they loop unpredictably, and so on. How do you know what parts will be needed, and when? 

By watching and observing, that’s how. AppStreamer is essentially an elaborate cache-prediction algorithm. Just as your x86 processor constantly strives to predict what instructions and/or data it’ll need in the next few nanoseconds so it can preload them into the relevant caches, AppStreamer tries to guess what chunks of code you’ll be executing next. Microprocessors do this in hardware, with virtually no historical data to guide them (branch-prediction caches notwithstanding). AppStreamer doesn’t have that restriction. It’s implemented in software and has access to as much historical data as it needs. 

The first step in getting AppStreamer to work is training it, which the researchers do by running the app in question over and over, preferably with different people to give it some variety. Then, they map out the app’s execution path. Does it loop over this section of code several times before jumping over to this section here? That’s good to know; write that down. Over several runs, AppStreamer builds a model of typical execution for that app. 
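
To make that concrete, here is a minimal Python sketch of how a pile of recorded runs might be boiled down into a model. It is not the Purdue team's code. The chunk labels, the `traces` list, and both function names are invented for illustration, and the sketch assumes each run has already been reduced to an ordered list of chunk labels.

```python
from collections import Counter, defaultdict


def build_transition_counts(traces):
    """Count how often each chunk is immediately followed by each other chunk."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    return counts


def transition_probabilities(counts):
    """Turn raw follow-on counts into per-chunk probabilities."""
    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical training runs from three different players.
traces = [
    ["menu", "level1", "level1", "level2"],
    ["menu", "level1", "level2", "boss"],
    ["menu", "shop", "level1", "level2"],
]

counts = build_transition_counts(traces)
probs = transition_probabilities(counts)
print(probs["menu"])    # where players tend to go after the menu
print(probs["level1"])  # looping on level1 vs. moving on to level2
```

More runs from more players simply add to the counts, which is why variety in the training sessions matters.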

None of this training requires access to the source code of the app, or even any understanding of how it works. All profiling data is collected empirically, with no foreknowledge of the program’s structure, programming language, or its size. It works much like a debugger or code-coverage analyzer. 

Based on that execution profile, AppStreamer can make educated guesses about what chunks of code you’ll need when you first start the app, what chunks come next, and what you’re not likely to access at all. It also has a good idea about when you’ll need those sections. Most users might need Part B about 45 seconds after Part A, with Part C coming 12 minutes after that. 

To their credit, the researchers took on the toughest apps of all: games. Mobile games can be sensitive to very small delays, on the order of tens of milliseconds. Something like a spreadsheet or an email app (or even Instagram) would’ve been relatively insensitive to delays and a lot easier to stream. Kudos. 

AppStreamer is implemented in Android’s file-system layer. It doesn’t modify the apps at all. It intercepts file accesses, including requests for code from the smartphone’s flash file system. Since Android’s file system uses 4KB blocks, this gives AppStreamer quite a bit of granularity into the execution path. 
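
As a rough illustration of what intercepting at the block level means, the Python sketch below maps a read request onto the 4KB blocks it touches. The `on_read` hook, the `access_log` list, and the file name `game.obb` are all hypothetical stand-ins. The real system sits inside Android's file-system layer, and nothing here is taken from its implementation.

```python
BLOCK_SIZE = 4096  # 4KB file-system blocks, as described in the article


def blocks_touched(offset, length, block_size=BLOCK_SIZE):
    """Return the index of every block overlapped by a read at `offset`."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


access_log = []  # (path, block index) pairs, in the order they were requested


def on_read(path, offset, length):
    """Hypothetical hook: note which blocks a read asked for."""
    for block in blocks_touched(offset, length):
        access_log.append((path, block))


on_read("game.obb", 4000, 200)   # straddles the boundary between blocks 0 and 1
on_read("game.obb", 8192, 4096)  # covers block 2 exactly
print(access_log)
```

The app never sees any of this. It just asks for bytes, and the log of block numbers becomes the raw material for profiling.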

Too much granularity, as it happens. The Purdue team reports that most mobile games are somewhere between 1GB and 10GB in size, and tracing that much code with 4KB resolution results in an unworkable amount of data. Moreover, it isn’t necessary. They experimented with different granularities and found that, for the games they tested, something in the range of tens of megabytes gave the right balance between granularity, space efficiency, search depth, and response time. Different apps might have different chunking requirements, so the number isn’t hard-coded but is instead calculated during the training/profiling process. 
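
Some back-of-the-envelope arithmetic shows why 4KB tracing gets out of hand. The Python sketch below counts how many units a game breaks into at each granularity, then collapses a block-level trace into chunk labels. The 10MB chunk size is just one illustrative value from the "tens of megabytes" range, and the function names are my own, not the paper's.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def unit_count(app_size, unit_size):
    """Number of units needed to cover the app (integer ceiling division)."""
    return (app_size + unit_size - 1) // unit_size


def block_to_chunk(block_index, chunk_size, block_size=4 * KB):
    """Map a 4KB block index to the larger chunk that contains it."""
    return (block_index * block_size) // chunk_size


def coarsen(block_trace, chunk_size):
    """Collapse a block-level trace into chunk labels, dropping consecutive repeats."""
    chunk_trace = []
    for block in block_trace:
        chunk = block_to_chunk(block, chunk_size)
        if not chunk_trace or chunk_trace[-1] != chunk:
            chunk_trace.append(chunk)
    return chunk_trace


print(unit_count(1 * GB, 4 * KB))    # 262144 blocks to track
print(unit_count(1 * GB, 10 * MB))   # 103 chunks to track
print(coarsen([0, 1, 2, 2560, 2561, 5, 5120], 10 * MB))  # [0, 1, 0, 2]
```

Going from a quarter-million states to about a hundred is the difference between a model you can search in real time and one you can't.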

As the financial analysts say, past performance is no guarantee of future returns. Even detailed code traces are merely records of how someone else played the game, not necessarily how it’ll play out the next time. Armed with the captured code-execution profiles, AppStreamer applies a continuous-time Markov chain (CTMC) to weight the probability that any given code chunk will be needed. 
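
For the curious, here is a stripped-down Python sketch of what fitting and querying a continuous-time Markov chain can look like. It uses the textbook estimates (exit rate = departures divided by total time spent in a state) and only looks one jump ahead. How the paper actually fits and searches its model is not described here, so treat the function names, the sample data, and the one-jump simplification as my assumptions.

```python
import math
from collections import Counter, defaultdict


def fit_ctmc(timed_traces):
    """Estimate exit rates and jump probabilities from (chunk, dwell_seconds) traces.

    The dwell time of the last entry in each trace is ignored because
    no departure from it was observed.
    """
    jumps = defaultdict(Counter)
    time_in = Counter()
    for trace in timed_traces:
        for (state, dwell), (nxt, _) in zip(trace, trace[1:]):
            jumps[state][nxt] += 1
            time_in[state] += dwell
    rates = {s: sum(c.values()) / time_in[s] for s, c in jumps.items()}
    probs = {
        s: {n: k / sum(c.values()) for n, k in c.items()}
        for s, c in jumps.items()
    }
    return rates, probs


def prob_needed_within(rates, probs, current, target, horizon):
    """P(the next jump out of `current` lands on `target` within `horizon` seconds)."""
    rate = rates.get(current, 0.0)
    p_jump = probs.get(current, {}).get(target, 0.0)
    return p_jump * (1.0 - math.exp(-rate * horizon))


# Hypothetical recorded sessions: (chunk, seconds spent before moving on).
timed_traces = [
    [("A", 40), ("B", 700), ("C", 60)],
    [("A", 50), ("B", 740), ("C", 60)],
    [("A", 45), ("C", 60)],
]

rates, probs = fit_ctmc(timed_traces)
print(1 / rates["A"])  # mean seconds spent in A before leaving
print(prob_needed_within(rates, probs, "A", "B", horizon=30))
```

In this toy data set, players linger in chunk A for about 45 seconds on average and two-thirds of them head to B next, so the chance of needing B within the next 30 seconds comes out to roughly one in three.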

Finally, AppStreamer factors in player speed. Experienced players blaze through the game levels faster than n00bs, and AppStreamer takes that into account by fetching offline content sooner. (Beginners are also unlikely to need the boss level, ever.) 

Now that AppStreamer knows (more or less) what code you’ll want and when you’ll want it, it can start to preemptively download the next chunk. Given that most LTE connections deliver bandwidth in the 10–20 Mbit/sec range, and that code chunks are about 10 MB in size, AppStreamer needs to look far ahead – about 30 seconds ahead, in their experience. Otherwise, the code won’t arrive in time and the poor gamer will endure unwelcome delays. Game over for AppStreamer. 
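
The arithmetic behind that lookahead is easy to check. The short Python sketch below uses only the figures quoted above (10MB chunks, 10–20 Mbit/sec, a 30-second window) and assumes decimal megabytes and a link that actually delivers its rated speed. The function names are mine.

```python
def transfer_seconds(chunk_megabytes, link_mbit_per_sec):
    """Seconds to download one chunk: megabytes * 8 bits per byte / link speed."""
    return chunk_megabytes * 8 / link_mbit_per_sec


def chunks_fetchable(window_seconds, chunk_megabytes, link_mbit_per_sec):
    """Whole chunks that can be downloaded back to back within the window."""
    per_chunk = transfer_seconds(chunk_megabytes, link_mbit_per_sec)
    return int(window_seconds // per_chunk)


for speed in (10, 20):
    print(speed, "Mbit/sec:",
          transfer_seconds(10, speed), "seconds per chunk,",
          chunks_fetchable(30, 10, speed), "chunks per 30-second window")
```

A single chunk takes four to eight seconds to arrive, so a 30-second window has room for only three to seven of them. That doesn't leave much slack for wrong guesses or a flaky connection, which is presumably why the lookahead can't be much shorter.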

The results look good. In their testing of two mobile games, most test subjects reported little or no noticeable delay in their games. AppStreamer wasn’t entirely invisible, but it was close. Whether it delivered any real benefit to those testers is a different question. 

On one hand, AppStreamer seems like a natural evolution from locally stored content to streaming. Time was, we used to download MP3s and wait to play them after the download completed. Now, we can stream audio in real time without the wait. A step-function improvement came with the advent of video streaming. Rather than wait hours for a movie to download, we could start streaming it in seconds. Netflix, Hulu, Spotify, and countless other content providers have built their entire business on this underlying technology. 

So why not stream programs, too? One reason is that storage is cheap. And some storage is cheaper, and less time-sensitive, than others. We’ve always been able to fill up the hard drives, SSDs, RAM, or flash that our devices provide. No matter how much storage we get, we’ll find a way to exceed it. But when that overflow happens, there’s a natural triage that goes with it. Blurry and out-of-focus photos get deleted first, then old emails, then obsolete documents, and so on. Cloud storage is cheap. Google provides free photo storage with every Pixel smartphone (in return for seeing all your photos). There’s no reason to let your precious device storage overflow. 

Apps, on the other hand, benefit from local storage and local execution. They’re faster, safer, and ours. It’s bad enough when access to our data is mediated by a third-party ISP; now programs are being ransomed, too? “Gee, nice collection of applications you’ve got there. Be a shame if something happened to ’em.” Pay up or lose the apps you already bought. Or just lose them every time you’re out of wireless range, exceed your data quota, or when the power fails. No, thank you. 

The researchers’ efforts are laudable, and they may have even developed some novel methods for modeling and predicting programmatic behavior. But it doesn’t look that way to me. All I see are some well-known algorithms used for branch prediction, weighted probability, and hooking operating system calls, all applied to a dystopian usage scenario. It’s a nice party trick, but one that may not lead directly to widespread use of the applications described. 

One thought on “Loading Software on the Fly: AppStreamer”

  1. I’ve not yet read the paper, but I suspect it’s not code that’s being streamed here, but textures. The bulk of any game’s memory footprint is textures, and then perhaps geometry. This is why 4KB granularity is too small, and 1-10MB works better. Texture assets are around that size. Code just comes along for the ride.
