In the 1980s, a dramatic transformation came over the world of computing. Mainframes and minicomputers had long held the lion’s share of the world’s computing workload. Dumb terminals gave users access to the enormous-for-the-time processing and storage resources housed in massive data centers, where most of the world’s capacity lived.
But microcomputers finally evolved to the point that tremendous amounts of work could be done locally on desktop machines, without the need for massive mainframes. Tasks moved quickly out of the data center and onto distributed desktop machines. The total amount of computation and storage in the world was increasing exponentially, and almost all of that growth was distributed.
Billions of PCs later, the vast majority of the world’s computational work was being done on desktop machines, but there were still major challenges in communications and in access to shared resources – namely storage. Meanwhile, mainframes became zombie dinosaurs – still roaming the Earth, but looking for a purpose in life other than exotic high-performance-computing tasks. Soon, however, microcomputers took over that arena as well, with racks of microprocessor-powered machines replacing the heavy-iron CPUs of old.
As mobile computing came of age, the center of computation moved again. So much work was being done on distributed mobile devices that the relative load on desktop hardware decreased. Smartphone processors and storage quickly surpassed the capabilities of desktop machines, and even data center machines, from just a few years before. The majority of the world’s compute power was literally walking around in people’s pockets.
As we have moved into the data-centric era, however, the trend away from centralized computation has reversed itself. The value of massive troves of shared data is so great that the cloud has begun to subsume both storage and computing tasks. Even though pocket-sized edge devices have 64-bit multi-core processors and hundreds of gigabytes of local storage, modern applications largely demand access to the data and the analysis capabilities that are available only in data centers.
We are now in a boom for data center and cloud computing and storage. High-profile cloud providers such as Microsoft, Amazon, and Google are doing land-office business, and technology providers like Intel are pushing huge engineering resources into the evolution of data-centric technologies to accelerate the transformation. Listening to the frenzy of the cloud and data center crowd, you’d think we were moving into a new era of centralized computing.
But there are other trends afoot trying to push us all right back in the other direction again. With AI taking center stage, there is an entirely new crop of applications that require massive computing power. Machine vision, language recognition, and other neural-network-centric tasks often can’t take advantage of centralized processing and acceleration because of latency and reliability issues. That means there is a growing demand for accelerated computing at the edge, and numerous technology providers are stepping in to fill that gap.
While training tasks that use gargantuan data sets may still be performed in server racks, the inferencing tasks that use those trained models will be done largely at the edge, with millions of high-performance, low-power devices shouldering the burden. Architectures originally developed with the idea of cloud use are being re-imagined in mobile incarnations, and taking the associated compute cycles with them. Non-traditional processors, low-power FPGAs, and other mobile-class accelerators are poised to gain serious traction filling the growing demand for edge-based high-performance computing.
5G will provide yet another force vector in this equation, however. With latencies possibly diving to the 1 ms range, it’s possible to imagine applications that need to be edge-based today being able to leverage cloud resources in the near future. But at the same time, issues such as security are creating demand for data and computation to be localized again. Apple, for example, is reported to have kept its entire facial recognition engine local on the mobile device to help assure user privacy and to avoid transmitting sensitive data over the network. The problem of performing facial recognition entirely locally on a mobile device is sobering, but the company managed to get respectable performance and reliability on what is generally recognized as a server-class application.
With all these tectonic plates shifting under the surface, it becomes increasingly challenging for system engineers to find anchor points to bound their designs. How much computation should be done on the edge device, and how much pushed up to the cloud? How will user data and privacy be secured? How can the local-only tasks be managed within power, cost, and form-factor constraints? How long will architectural decisions made today continue to be valid in the face of rapidly evolving infrastructure technology?
The current data center hardware renewal cycle is about five years. The revolution in AI applications has gone exponential within the last two years. That puts us on the path we see today toward a massive build-out of cloud capacity. At the same time, the amount of data being created by IoT devices at the edge is going through the roof. The amount of computation required to take full advantage of this rich trove of data is almost incomprehensible, and it will take up all of the data center capacity, all of the distributed edge capacity, and more. In short, we have moved into yet another era in which widely useful applications can take advantage of more computation and storage than the sum of all current technology and production output can deliver.
With all these fast-changing trends pushing in different directions, the only thing that is certain is that the computing landscape will continue to change rapidly, and that the rate of change is likely to accelerate. The days of tick-tock Moore’s Law evolution of von Neumann microprocessors are over, and the future will be a wild, unpredictable ride into the unknown. It will be interesting to watch.
The bigger questions center on which applications are driving this, and why.
From my armchair and lab, it seems to be more of a shift toward applications being sold as a service, so that software vendors have a steady, locked-in revenue stream, rather than being sold once with maybe annual updates for a few years. And with that comes the 5-15 year life cycle nightmare for robust, mature applications deployed in the field, since it’s nearly impossible to freeze the development environment.
And with that comes the vendor’s insatiable demand to fully acquire all your business IP and design data, while you no longer have a viable path to stay in business should they fail or, worse yet, be acquired by a foreign competitor that ignores IP rights for the global market.