
Tracing Rays in Real Time

Caustic Tries to Change How Design is Done

It’s not every day that an embedded design project results in claims of fundamentally changing the way an industry works. To be clear, new products try to claim this all the time; it’s just less often that it’s actually true. Well, the Caustic division of Imagination Technologies is saying that they have made such a change available to the graphics and animation worlds.

To lay some background for this, let’s review some basics. There are a couple of fundamental ways to create (or render) a graphic image. One might be considered “quick and dirty” (although it can still be sophisticated): it’s the rasterized image. When processing such a scene, you start at the top left, scan right, munging each pixel, and, when you get to the end, you hit Carriage Return (OK, that would be Enter for anyone born after, oh, 1985? 1990?) and start on the next row.

At each pixel, you do stuff that involves surrounding pixels and a lighting model and who knows what. In fact, you’re probably working in a sliding window of pixels rather than simply one row at a time. But the point is, the fundamental work organization principle of this approach – line by line – has nothing to do with the image itself and everything to do with how it’s stored. And what’s convenient. And, frankly, what’s good enough.
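
To make the line-by-line organization concrete, here is a minimal sketch in Python. It assumes a grayscale image stored as a list of rows, and it uses a 3x3 averaging window purely as a stand-in for the "stuff that involves surrounding pixels"; a real renderer would apply a lighting model there instead. The function name `raster_pass` is made up for this illustration.

```python
def raster_pass(image):
    """Visit pixels in storage order; replace each with its 3x3 neighborhood mean."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):          # top row first, then down
        for x in range(width):       # left to right, then "carriage return"
            total, count = 0.0, 0
            # Sliding window: the pixel plus whichever neighbors exist.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out
```

Note that the loop order follows the memory layout, not anything about the objects or lights in the scene.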

But for some applications, it’s not good enough. The more life-like you want things to be, the less satisfying this process is. And we’re not talking just about generation upon generation of increasingly jaded media consumers who need their fix upped each time to stimulate a “Wow.” Product designers, for example, would really, really like to know how their products will look before they exist. Photo-realistic images, in this case, aren’t simply replacing real-life photographs, since there is no object yet to take a picture of.

If the Caustic images are to be believed, it’s likely that some Nike shoes you’re wearing were viewed virtually before anyone viewed them actually. And, sticking with just this use model, guesses and shortcuts and raster-based lighting models aren’t likely to be good enough. If you’re going for a certain level of luster in a metal or plastic finish, for example, you want to be able to incorporate that design decision into the image and see what it’s really going to look like. If you don’t like it, you’re going to change the design. If the only reason you didn’t like it was because the image rendering was poor, then you’ve changed something you didn’t need to change and you’re not going to be happy with the final result when built.

The way our eyes perceive any such real objects is completely determined by the light entering our pupils, and that light has taken a variety of paths, starting at some source and interacting with a variety of surfaces and materials along the way. At an atomic level, each interaction involves a photon hitting an atom and being absorbed, perhaps to be re-emitted or to create some other band-jumping that emits a different color. Maybe the photons “reflect” cleanly; perhaps they bounce all over the place, creating a diffuse light that’s the opposite of shiny.

When designing products that involve light intrinsically, there are specialized optical CAD tools that can do such modeling at a very low level. But that’s overkill, obviously, for figuring out how a tennis shoe will look. Modeling the light as rays and then tracing their progress as they bounce around the scene can give a good intermediate point between rasterization and quantum hell.

Such “ray tracing” is not new. It’s just very compute-intensive. And, frankly, it’s kind of like designing an FPGA with no incremental or ECO design capabilities. You make a change – or a set of changes, preferably, so that you can amortize the wait that is to come. And then you hit a button and go get coffee. Or lunch. Or enjoy your evening and see what the morning brings. Which makes for a very slow feedback loop.

There are two things that Caustic has done to address this. The first relates to how they process the scene. One of the nice things about raster graphics is that you can divide up the image and give portions to different processors to run in parallel since each deals with a different part of memory. Not so with classic ray tracing. A ray may traverse the entire image, and it may do so several times before either exiting the scene or being completely absorbed by something black.

That means that each ray calculation involves the entire scene, meaning the entire memory allotment, meaning it’s impossible to break it up to share the load. What Caustic did was to turn the problem into a database algorithm. They collect the rays and sort them, grouping the rays impinging on a particular part of the scene at a given time. They can then work that portion of the scene, sorting the outgoing rays afterwards for a repeat.
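
The collect, sort, and repeat loop can be sketched roughly as follows. This is a toy model under loud assumptions, not Caustic's actual algorithm, which the article does not detail: the scene is reduced to a one-dimensional row of regions, each with a hypothetical `reflectance` (the fraction of energy a ray keeps after hitting it), and the `Ray` class and `trace_sorted` function are invented for the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Ray:
    cell: int      # index of the scene region the ray is about to hit
    step: int      # +1 or -1: direction of travel across regions
    energy: float  # light still carried by the ray


def trace_sorted(rays, reflectance, min_energy=0.01):
    """Process rays in per-region batches until all have left or faded out.

    Returns the energy deposited in each region.
    """
    deposited = [0.0] * len(reflectance)
    active = list(rays)
    while active:
        # Sort step: group the rays by the region they are about to hit.
        batches = defaultdict(list)
        for ray in active:
            batches[ray.cell].append(ray)
        outgoing = []
        # Work one region at a time, touching only that region's data.
        for cell, batch in sorted(batches.items()):
            keep = reflectance[cell]
            for ray in batch:
                deposited[cell] += ray.energy * (1.0 - keep)
                ray.energy *= keep
                ray.cell += ray.step
                alive = ray.energy >= min_energy
                inside = 0 <= ray.cell < len(reflectance)
                if alive and inside:
                    outgoing.append(ray)
        active = outgoing  # outgoing rays are re-sorted on the next pass
    return deposited
```

The point of the grouping is that each batch needs only one region's worth of scene data in memory, which is what makes the work divisible again.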

And “repeat” is key here. In theory, as mentioned, the light remains active in the scene until it either leaves or is absorbed. Given all of the light rays in an image, that can take a long time. In fact, when watching a demo with the Caustic guys at CES, the image visibly sharpened up in a matter of seconds, and then we went back to talking. But the algorithm was still churning away behind us, gradually sharpening the image ever so much more over time. Typically, this stops, not when all rays have been resolved once and for all, but when you say, “Enough already!”
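
One way to picture that gradual sharpening is as a running average of noisy samples that the user can cut off at any point. The sketch below assumes each pass contributes one noisy estimate per pixel (uniform noise is an arbitrary choice here), and the names `refine` and `render_until` are hypothetical; how Caustic actually accumulates results is not described in the article.

```python
import random


def refine(true_pixels, seed=0):
    """Yield (passes_done, estimate) forever; the estimate gets less noisy."""
    rng = random.Random(seed)
    estimate = [0.0] * len(true_pixels)
    passes_done = 0
    while True:
        passes_done += 1
        for i, value in enumerate(true_pixels):
            sample = value + rng.uniform(-0.5, 0.5)  # one noisy ray result
            # Incremental mean: fold the new sample into the running average.
            estimate[i] += (sample - estimate[i]) / passes_done
        yield passes_done, list(estimate)


def render_until(true_pixels, max_passes, seed=0):
    """Stand-in for the user saying "Enough already!" after max_passes (>= 1)."""
    result = [0.0] * len(true_pixels)
    for passes_done, result in refine(true_pixels, seed):
        if passes_done >= max_passes:
            break
    return result
```

The generator never finishes on its own; as in the demo, the stopping condition comes from outside.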

So this algorithmic approach, available in their Visualizer product, allowed them to reduce rendering time, perhaps turning a lunch break into a coffee break. Useful perhaps, but also not necessarily a game-changer.

The second piece was recently announced, and it was the pièce de résistance at their CES demo: hardware accelerator boards that amp up (figuratively and literally) the rendering process. They’ve created a ray-tracing unit (RTU – yes, a new processing unit to go with CPU, GPU, GPGPU, and NPU) accelerator chip; they have then incorporated one RTU onto their R2100 accelerator board and two of them onto their R2500 board. They claim the RTUs can process “70 million incoherent rays per second.” And they’re scalable, so two of them can work at (roughly) twice the pace. (And, as far as I can tell, they’re not available to anyone else.) The boards themselves plug into PCI Express slots in the rendering hosts.
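
For a rough sense of what that rate means per frame, here is some back-of-the-envelope arithmetic. It assumes the quoted 70 million rays per second applies to a single RTU and that scaling across RTUs is perfectly linear (the article says only "roughly"); the resolution and rays-per-pixel figures in the usage note are made-up inputs, not Caustic numbers.

```python
RAYS_PER_SECOND_PER_RTU = 70_000_000  # vendor claim quoted in the article


def seconds_per_frame(width, height, rays_per_pixel, rtus=1):
    """Time to trace one frame, assuming ideal linear scaling across RTUs."""
    total_rays = width * height * rays_per_pixel
    return total_rays / (RAYS_PER_SECOND_PER_RTU * rtus)
```

Under these assumptions, a 1920 x 1080 image at 10 rays per pixel needs about 20.7 million rays, or roughly 0.3 seconds on one RTU and half that on two, which is the difference between waiting and interacting.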

And here’s the take-away on this: it allows rendering to happen fast enough to implement real-time, interactive, tight-loop usage. Tweak the scene; see the results. No time for coffee.

It is the combination of the ray-sorting algorithm with the accelerator chip and boards that underlies their claim that, in fact, this will change how graphic and product design are done.

More info:

Caustic Series2 Acceleration Boards
