VMworld 2019 proved to be a very good event. Although new product announcements were limited, the company made a great show of technology previews that kept people’s attention and helped attendees better understand VMware’s strategy, product roadmaps, and upcoming features. One feature that caught my attention is the ability of future VMware vSphere versions to share remote GPUs (thanks to a technology coming from a recent acquisition). At the moment, many might consider this a minor feature, but it has great implications for how future data centers will be organized and how resources will be consumed.
Before going into deeper detail in this post, I warmly suggest you watch this video from Tech Field Day Extra, recorded at VMworld. In this video, Andy Banta of NetApp presented a great session on the evolution of data centers and computing models, giving a good idea of what is happening around us.
Not long ago I wrote a report on IT composability and hyperconvergence, delineating the differences between the two. In it, I explained that IT composability is more about quickly deploying and reconfiguring hardware resources in large data centers, while hyperconvergence is more focused on virtualizing these resources for better granularity. And in my latest report, “Key Criteria for Evaluating Enterprise HCI,” I predict that composability will soon become very relevant for HCI, and this technology from VMware confirms it.
New technologies like NVMe are already making storage composability a reality but, until now, it was quite difficult to share CPUs, memory or other resources like GPUs without proprietary hardware capable of extending the internal PCIe bus of a server and making it accessible to others.
Some startups have worked hard on the concept of PCIe bus extension and now have products that combine resources from different pools of hardware to configure machines of any size or type. A3Cube and Liqid are just a couple of successful examples in this field. The benefits of IT composability are usually associated with better resource utilization and faster provisioning.
Software is Eating the World (and IT Composability)
Except for storage, for which the concept of shared resources is already common, the limits of traditional server virtualization are in the size of the single server and the amount of the resources available in it. Soon, VMware vSphere will be able to access other servers in the cluster and borrow GPUs to build virtual machines that combine local CPU and memory with remote GPUs via standard network interfaces.
In this video, recorded at Tech Field Day, you can see how easy it will become to take advantage of GPU resources available in the network.
As far as I know, a similar approach has been taken for NVMe storage resources, allowing vSphere (and vSAN in particular) to use remote devices as if they were local.
Virtualizing remote GPUs on Ethernet networks is not the best for latency and throughput. But it is also true that GPUs are very expensive and often underutilized. At the end of the day this could be a good compromise for many applications that can benefit from GPUs but are not business-critical at the moment. Here’s another video that goes deeper into the technology details.
Use cases for this technology are plentiful (AI/ML, VDI, imaging, analytics, and so on), but the most important benefits will come from better resource utilization. For example, some servers with GPUs can be dedicated to VDI during the day and share their GPUs at night with other servers in the network to speed up different tasks. This could also be an opportunity for smaller service providers, which will be able to offer GPU-as-a-Service to their customers without needing specific, and expensive, servers to do that.
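To put the utilization argument in concrete terms, here is a back-of-the-envelope sketch in Python. The hours are purely hypothetical, chosen only to illustrate the day/night sharing idea; they are not VMware figures or measurements, and the function name is my own.

```python
# Illustrative only: every number here is a made-up assumption,
# not a VMware figure or a measured result.

def gpu_utilization(busy_hours: float, hours_in_day: float = 24.0) -> float:
    """Return the fraction of the day during which a GPU is busy."""
    if not 0 <= busy_hours <= hours_in_day:
        raise ValueError("busy_hours must be between 0 and hours_in_day")
    return busy_hours / hours_in_day


VDI_HOURS = 10.0    # assumed: GPUs serve VDI users during the working day
BATCH_HOURS = 8.0   # assumed: GPUs are lent to other servers overnight

dedicated = gpu_utilization(VDI_HOURS)
shared = gpu_utilization(VDI_HOURS + BATCH_HOURS)

print(f"VDI only: {dedicated:.0%} busy")
print(f"VDI plus overnight sharing: {shared:.0%} busy")
```

With these assumed hours, the same GPUs go from being busy for well under half of the day to three quarters of it, without buying any additional hardware. Real numbers will of course depend on the workloads and on the overhead of accessing GPUs over the network.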
IT composability startups still have an advantage over VMware. Their products are available today, while what we have seen at VMworld will come with a future version of vSphere. In my opinion, datacenter-scale composability solutions will have an edge over VMware for quite some time, and they are not focused on the same type of workloads and organizations.
On the other hand, for startups with rack-scale solutions, life will quickly become harder if they don’t find something that makes a substantial difference. Yes, PCIe is faster than Ethernet, but VMware showed some very compelling numbers, and I’m sure that most enterprises will choose an integrated, flexible, and simple solution based on commodity hardware over specialized, expensive solutions, even if the latter provide the best performance.