AMD’s Surge, Intel’s Scramble and Custom Hardware

September 12, 2018

I often get asked what motivated me to get deep into technology, and one of the standout moments was Googling what RAM was in third grade. The way it clicked for me: low RAM is like having a single sticky note for the notes you reference while performing tasks, and high RAM is like having a full notepad. That made the concept simple enough to understand and to start building more knowledge upon. As an avid Lego builder, I found it absolutely enthralling that I could place a piece into a slot and have more power to do things.

RAM, along with the other components of a computer, is vital to software, yet it traditionally gets overlooked now that computing power is so readily available. However, hardware is starting to become more of a focal point with the advent of AI (artificial intelligence) and more specialized computing.

For example, imagine Henry Ford’s assembly line and how it revolutionized the industry. By sticking to an exact plan and optimizing it, Ford got cars delivered in record time. Think of every computer chip as an assembly line. Now imagine an SUV vs a performance car, say a Mustang. Yes, you could build both on the same line, but neither is built as efficiently as it could be. This is the CPU, or central processing unit: it’s capable of doing just about anything but may not be the best at anything.

Say the Mustang is AI, and it’s found to be more efficient to build the engine, body and transmission separately and then combine them at the end. This is essentially what “specialized” chips are doing. They’re hardware whose assembly line layout best suits what the software requires.

How shrinking CPU sizes can be perceived.

Now, to add another dynamic, let’s imagine a Hot Wheels-sized Mustang is just as valuable as a full-sized car (bear with me). You could make the tiny versions on the same large equipment, but now you’re wasting factory space and paying for heavy equipment that is costly to power and operate. The big equipment still gets the job done, though. Or you could shrink the equipment, which allows you to build, say, 50 cars at once vs 10. Your operating overhead also drops, as one person can now operate more in their space than before and the equipment has less area to cover, making it more efficient.

The above scenario is exactly what happens with Moore’s Law and chip fabrication. A chip built on a 14nm process can do everything a 7nm chip can, but the smaller process allows for increased efficiency even if the bigger chip is laid out more effectively. The smaller size simply has less “factory walking distance” to travel between steps, which allows it to work quicker and more efficiently.
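To put rough numbers on the factory analogy, here is a minimal sketch of idealized die-shrink arithmetic. It assumes the node name is a literal linear feature size, so area scales with its square. That is a simplification: modern node names are closer to marketing labels, and real density gains are smaller. The function names are made up for illustration.

```python
def ideal_area_scale(old_nm: float, new_nm: float) -> float:
    """Relative area of the same design after an idealized shrink.

    Assumes every dimension scales linearly with the node name,
    so area scales with the square of the ratio.
    """
    return (new_nm / old_nm) ** 2


def ideal_chips_per_wafer_gain(old_nm: float, new_nm: float) -> float:
    """How many more copies of the design fit in the same silicon area."""
    return 1 / ideal_area_scale(old_nm, new_nm)


if __name__ == "__main__":
    scale = ideal_area_scale(14, 7)
    gain = ideal_chips_per_wafer_gain(14, 7)
    print(f"Relative area after shrink: {scale:.2f}")
    print(f"Chips per wafer gain: {gain:.1f}x")
```

Under these idealized assumptions, halving the node size fits about four times as many chips in the same factory space, which is the "50 cars at once vs 10" idea from the analogy.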

Alright, so why are GPUs so hot right now?

Look at Nvidia and AMD’s stock, along with Intel’s announcement that it is jumping into the GPU (graphics processing unit) arena again. It’s big business due to a few areas.

Cryptocurrency also drove up prices but, again, cryptocurrency “mining” is simply software that needs to work in a very set way, the same as in AI and ML. This is what made GPUs so effective at it. However, as mining rates have slowed, even GPUs have become inefficient for mining, which is why even more optimized hardware has been created to make the “assembly line” more efficient.

Conveniently, I outlined the importance of GPUs back in my AP English class in 9th grade. It’s embedded below, and it really shows how I was able to predict where the market was heading and the GPU’s rise in importance. It also shows I’ve always been that nerdy.

I alluded to this in my previous post about Google’s rumored foray into creating a cloud gaming platform, but Google in particular is using GPUs in their cloud products much more heavily than AWS or Azure. This is mainly due, again, to GPUs being much better at processing data, but I have to wonder about downtime. If a GPU is sitting idle for any amount of time you will want to put it to use in some way, shape or form, and gaming is a very short putt for these companies with cloud expertise.

How GPUs are Manufactured and Sold

GPUs are made up of “blocks” of processing cores. A full-sized card like an Nvidia GTX 1080 Ti has the maximum number of “blocks” activated and in use. Think of this as the Shelby GT500 Mustang with the top performing parts. Then there are lesser GPUs like the GTX 1080 and 1070 that simply have fewer “blocks” available, or have some intentionally deactivated. They’d be the Mustang GT and the 4-cylinder Mustang equivalents. What makes this lucrative, though, is that Nvidia or AMD could set out to make, say, 100k maxed out cards and, due to manufacturing issues, only 20k come out with all the “blocks” operating effectively. Instead of taking a loss, they simply turn the other 80k into the lower performing siblings.
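The sorting step described above is usually called binning, and it can be sketched in a few lines. This is only an illustration: the block counts, tier names and thresholds below are made up and do not correspond to any real Nvidia or AMD product.

```python
from collections import Counter

# Hypothetical tiers: (minimum working blocks, product tier).
# Ordered from the most demanding tier to the least.
TIERS = [
    (20, "flagship"),
    (16, "mid-tier"),
    (12, "entry-tier"),
]


def bin_chip(working_blocks: int) -> str:
    """Return the best tier a chip qualifies for, or 'scrap'."""
    for min_blocks, tier in TIERS:
        if working_blocks >= min_blocks:
            return tier
    return "scrap"


def bin_batch(batch: list[int]) -> Counter:
    """Count how many chips in a batch land in each tier."""
    return Counter(bin_chip(blocks) for blocks in batch)


if __name__ == "__main__":
    # Each number is how many blocks tested as working on one chip.
    print(bin_batch([20, 19, 16, 13, 5]))
```

A chip with 19 of 20 working blocks is not thrown away here. It has the extra blocks switched off and ships as the next tier down, which is how a manufacturing defect turns into a cheaper product instead of a loss.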

When I mined Litecoin in 2013 I purchased three AMD Radeon 6950 cards because, thanks to software, you could re-enable the disabled cores, effectively turning them into Radeon 6970 cards, which improved my mining hash rate significantly. This case is rare, though: AMD was ahead of its manufacturing issues at the time and needed to disable working “blocks” in order to keep its pricing strategy intact.

Analyzing the CPU Hardware Landscape

I’ve always been an Intel and Nvidia fan for performance machines. My computer repair business always saw AMD chips lagging across benchmarks. However, times are very rapidly changing!

As mentioned in the Tweet above, I foresaw AMD’s giant leap in stock price, and this was mainly due to one man, Jim Keller. Jim can be credited with AMD’s early-2000s glory days, when its first 64-bit architectures left Intel playing catch-up. He was also the brains behind Apple’s absolute dominance in mobile/ARM chip design and the reason Apple devices were a good two years ahead of other mobile manufacturers in the move to mobile 64-bit processing. Apple’s watch processors are even making the switch next month.

What Jim did was completely redo AMD’s architecture. When Intel’s Core line of processors came out, AMD had no answer and had to settle for the budget sector, with poor performing, cheaper processors that yielded very little return. The focus of the Zen cores is high thread counts with very high bandwidth for supplemental chips like GPUs and high-speed solid-state drives. Although Intel may still beat AMD on raw core performance, these other features are much better suited for growth for the following reasons:

Intel’s saving grace has historically been their ability to stay ahead of the curve on shrinking manufacturing nodes. However, their 10nm node is still stuck after facing many delays. For the first time ever, other manufacturers are beating Intel to the punch.

Intel can’t meet shipments whereas AMD now can. But it gets worse. Intel’s raw core performance benefit is mainly due to a slightly faster architecture, or “assembly line layout” as mentioned earlier. This has given them a slight performance advantage for a long time. However, AMD is now poised to start producing their chips on a 7nm node vs Intel’s 10nm. The shrink should eke out the performance boost AMD has needed, making their chips not only a better value but the better chip overall, all available in well-stocked supply to eat away at Intel’s market share.

Let’s Bring Nvidia Back Into the Mix

Nvidia is also utilizing a 12nm manufacturing process for their latest ‘Turing’ GPUs. This is compared to their widely popular ‘Pascal’ based cards, which were mostly manufactured on a 16nm node (14nm for some lower-end parts). Even with such a small die shrink, the cards are proving to be much stronger performers than their predecessors.

AMD never had an answer to Pascal on the high end and was forced to outdo Nvidia in the mid-range consumer market instead. This is great for the consumer space, as it’s a sweet spot. However, enterprises and AI companies aren’t interested in mid-range performance. They need the best to future-proof themselves.

Enter AMD’s 7nm manufacturing for GPUs. Due for release later this year, AMD’s chips will be a full 42% smaller than Nvidia’s. This, coupled with the fact that they’re cheaper to produce, should narrow or even close the performance gap Nvidia has been enjoying.

As more evidence of this, I believe Nvidia retreated to adding a marketing gimmick to their latest RTX line of cards in the form of ray tracing. Now, don’t let that comment resonate too much, as ray tracing is an amazing step forward for real-time computer graphics. However, it’s simply an added co-processor of sorts on top of their previous architecture, and they’ve gotten select partners to ship software that takes advantage of it. If you dig deep into benchmarks, enabling the effect on high-end hardware nearly cripples performance to that of a low to mid-range card. Again, this is new technology and that is to be expected, but it’s not giving enterprises any performance gains beyond the slight improvement from the die shrink.

Hard to believe this is computer generated but thanks to Nvidia’s ray tracing – this is now possible in real time within games. However, it doesn’t add much to enterprise needs.

Now, the big kicker here that has always amazed me is this: AMD cards have been better than Nvidia cards for raw number crunching for many years. Not just that, but they’re better per watt than Nvidia’s. For large datacenters running AI and ML algorithms, the cost of operating will be lower with AMD hardware, and this will only become more pronounced with the upcoming die shrink.
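To see why performance per watt matters at datacenter scale, here is a back-of-the-envelope sketch of the electricity cost for a fixed amount of work. Every number in it is a hypothetical placeholder rather than a benchmark of any real AMD or Nvidia card, and the function name is made up for illustration.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def energy_cost_for_work(total_ops: float, ops_per_joule: float,
                         price_per_kwh: float) -> float:
    """Electricity cost of completing total_ops operations.

    ops_per_joule is the card's efficiency (performance per watt,
    since 1 watt = 1 joule per second).
    """
    joules = total_ops / ops_per_joule
    kwh = joules / JOULES_PER_KWH
    return kwh * price_per_kwh


if __name__ == "__main__":
    workload = 1e18  # hypothetical number of operations in a training job
    price = 0.10     # hypothetical price per kWh

    # Two hypothetical cards; card B does 25% more work per joule.
    cost_a = energy_cost_for_work(workload, 40e9, price)
    cost_b = energy_cost_for_work(workload, 50e9, price)
    print(f"Card A: ${cost_a:.2f}  Card B: ${cost_b:.2f}")
```

The cost scales inversely with efficiency, so a card that does 25% more work per joule cuts the power bill for the same job by 20%. Multiply that across thousands of cards running around the clock and the difference becomes a real line item.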

The Impending Market Shift is Enticing to Intel

Intel, given their declining CPU projections (until Jim Keller turns them around in 3-4 years), is eyeing the GPU market to offset their losses. They need GPUs to pair with their wide range of processors, and they’ve even partnered with AMD on integrating AMD’s GPUs out of necessity, driven by manufacturers. But they also see that even if they can only capture some market share, it’s a highly profitable business that benefits multiple areas of the company.

Additionally, Intel is great at making completely custom chips for various companies. An example is the Visual Core chip in Google’s Pixel 2 range of phones. Anytime the camera API is called within the system, the chip is turned on and engaged. This ensures other applications can reap the benefit of improved photos, and because the chip is specialized for what it does, it requires less power than the CPU. Intel is doing the same thing for ‘custom’ ML and AI chips, which should prove to be a great supplementary line item.

ARM and Mobile Chips – The Elephant in the Room

I highlighted this before, but Apple’s mobile architecture is years ahead of competitors such as Qualcomm and Samsung’s Exynos lineup. It’s for this reason alone that Google is now trying to bring custom chip development in-house, both to compete with their Pixel phones and for the entire Android ecosystem. Part of Apple’s software fluidity is due to heavy hardware integration at the most barebone levels, and bringing this closer to core Android development only helps everyone in the space.

More broadly, though, there are ARM’s performance gains compared to x86 (AMD & Intel) chips. Looking at benchmarks for the iPhone X against Intel’s power-sipping M lineup, Apple comes out ahead in many different tests. This gap is only going to grow as ARM chips become the first to be produced on smaller manufacturing nodes, because they’re mobile-first and need the efficiency gains.

They’re also inherently able to shut off when devices go to sleep (i.e. your phone is still ‘on’ when the screen is off) and wake up instantly. Yes, x86 chips can do this, but they’re not nearly as efficient at it, and when you do this 300 times a day on average, the savings add up. Paired with the advent of power-hungry 5G modems, these benefits only become more pronounced.

Cost is also a major boon for ARM. High performance ARM chips are cheaper than even mid-range x86 chips. However, there are even more benefits when you look at their bandwidth and support for other technologies. I mentioned this “straw” concept earlier, but ARM chips are paired with UFS 2.1 storage, in contrast to cheap x86 processors being stuck with eMMC storage. More acronyms, great. What this means to the user, though, is how fast the processor can access the data stored on the device. It’s the same thing as when computers moved from slow spinning hard drives to faster solid-state drives: when you open an application, it loads instantly. Lower cost devices will feel faster to the user with a lower performing CPU and higher performing storage than the other way around.

I’ll hit on ARM’s advantages more in future articles, but the architecture opens many more doors into the future, and both Intel and AMD are going to have to approach chip design differently to stay competitive in the space. Just like in my AP English article, I’m super excited for this space and love seeing all the improvements.

Lots of further reading material if you’re interested.