SLAs, Single Points of Failure, and Azure Active Directory
A lot of people are drawn to Windows Virtual Desktop because Microsoft is “taking care of the infrastructure for us now.” Unlike RDS, where you are responsible for architecting, setting up, and monitoring the infrastructure roles (often combined with load balancing if you are running a high availability RDS deployment), in WVD Microsoft is responsible for the infrastructure roles: brokering, web access, and gateway. You are responsible for the session hosts, how you manage user profiles (e.g. FSLogix with Azure Files), and how you deploy your apps (MSIX App Attach, golden images, etc.).
This creates a perception of increased reliability, but is that really the case? I would argue that it’s more a case of “out of sight, out of mind.” You breathe easier, because Microsoft has abstracted that infrastructure responsibility away from you. However, what you’re not thinking about can still bite you in the ass and bring down your environment.
For instance, did you know that Microsoft does not offer a financially backed service level agreement (SLA) for Windows Virtual Desktop? Let that sink in, and I’ll repeat it. Microsoft currently does not offer a financially backed service level agreement for WVD.
Don’t believe me? Read Microsoft’s own wording for yourself. They say they “strive for 99.9% availability to the Windows Virtual Desktop URLs,” by which they mean the aforementioned PaaS infrastructure services like the Broker, Web Access, and Gateway roles. Of course, if despite all their striving they don’t meet that 99.9% availability level, you have zero financial recourse and you are $h!+ out of luck.
Even if Microsoft hits that 99.9% availability target, the virtual machines running your desktop or application workloads, even if they themselves have an SLA of 99.95%, may very well be inaccessible for roughly 8.76 hours in a given calendar year (0.1% of 8,760 hours) due to outages with the WVD infrastructure. And again, you will have no financial recourse in any scenario if the actual availability is worse.
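To make that arithmetic concrete, here is a minimal sketch that converts an availability percentage into the hours of downtime it allows per year. The function name is purely illustrative, and the calculation assumes a 365-day year and that the percentage applies uniformly across it.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year permitted at a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


# Compare the WVD target with the VM SLA mentioned above and a 99.99% SLA.
for pct in (99.9, 99.95, 99.99):
    print(f"{pct}% -> {annual_downtime_hours(pct):.2f} hours/year")
```

At 99.9% the budget is about 8.76 hours a year, while 99.95% cuts that in half and 99.99% brings it under an hour.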
Now, to Microsoft’s credit, they have architected these service components to be fault tolerant and redundant across Azure regions. There are many different WVD brokers, gateways, and web access servers deployed throughout Azure regions across the world. Unfortunately, there is still a single point of failure: Azure Active Directory.
WVD is completely dependent on Azure Active Directory to work properly. Currently, you must connect your on-premises Active Directory to the Azure Active Directory handling your Azure tenant subscription, OR you must deploy a dedicated VM to run an instance of Active Directory inside your WVD tenant VNET and link that to Azure Active Directory. There is no way around it and no way to bypass it. When a user attempts to reach WVD resources, Azure AD is ultimately the gatekeeper.
Over the past 18 months, there have been several outages that affected Azure Active Directory and other Azure services, which impacted either the ability to authenticate against Azure Active Directory to reach WVD hosts, OR the ability to log in to the Azure Portal to manage said WVD hosts. By way of example, here’s a summary of problems that happened last September.
In fact, these outages and issues have become so common recently that the hashtag #Office364 often trends on Twitter, implying that Microsoft isn’t coming anywhere near that 99.9% uptime they’re striving for, and that cloud services are down at least a day of the calendar year, if not more.
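For a sense of scale, the joke behind that hashtag can be turned into a number. This is a rough sketch, assuming exactly one full day of downtime in a 365-day year; the helper name is hypothetical.

```python
DAYS_PER_YEAR = 365


def availability_pct(days_down: float) -> float:
    """Availability percentage for a year with the given number of full days down."""
    return 100 * (DAYS_PER_YEAR - days_down) / DAYS_PER_YEAR


one_day_down = availability_pct(1)
print(f"364 of 365 days up: {one_day_down:.2f}% availability")
```

One lost day works out to roughly 99.73% availability, noticeably short of a 99.9% target.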
If Well Architected, Running Classic RDS in Azure Can Yield a Higher SLA Than Windows Virtual Desktop
So, as we’ve established, Microsoft does not guarantee an SLA for Windows Virtual Desktop’s infrastructure components. But you can deploy Classic Remote Desktop Services in Azure and get a financially backed SLA. How so? Let’s create an example.
If I want to set up an RDS deployment in High Availability Mode, I will need, at a minimum, to provision two Windows Server 2019 virtual machines to handle the infrastructure components. Each of those virtual machines will run the gateway, connection broker, web access, and licensing roles. A better approach would be to create a set of four VMs, two VMs handling the connection broker plus licensing roles, and the other two handling the gateway and web access roles.
If I deploy these RDS infrastructure machines across availability zones in an Azure region, I can get a 99.99% SLA guarantee.
Next, I will need an instance of SQL Server that will maintain information about the RDS deployment and connected sessions for the connection brokers configured in HA mode. I can use Azure SQL for this purpose, and even at its standard, non-premium tiers, Microsoft also offers a guaranteed 99.99% SLA for Azure SQL.
I’ll also want to set up load balancing between my Gateway/Web Access VMs and Connection Broker VMs. I can use Azure Load Balancer to do that, and guess what its guaranteed SLA is? You got it – 99.99%.
Finally, I’ll either have this connected to my organization’s own Active Directory using various methods, or I can set up a separate Active Directory to serve this RDS deployment on two of my redundant infrastructure VMs (or separate VMs, just to make everything nice and clean).
And – presto – now I have a virtualization environment in the cloud where every infrastructure layer carries a financially backed 99.99% SLA! Of course, I could do the same thing using a colocation provider or my own datacenter, and probably save even more money as the compute costs will be even cheaper. And if I choose a colocation provider with experience in RDS, they can set up the highly available infrastructure roles for me.
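One caveat is worth working through: when every layer has to be up for a session to connect, the per-component SLAs multiply rather than carry over unchanged. The sketch below assumes the three layers named above (the zone-redundant VMs, Azure SQL, and Azure Load Balancer) are all in the connection path and fail independently. That is a simplifying assumption for illustration, not a composite figure published by Microsoft, and the names in the code are hypothetical.

```python
from math import prod

# SLA percentages quoted in the text for each layer of the RDS design.
COMPONENT_SLAS = {
    "infrastructure VMs across availability zones": 99.99,
    "Azure SQL": 99.99,
    "Azure Load Balancer": 99.99,
}


def composite_availability(sla_percentages) -> float:
    """Availability (as a fraction) of a chain where every component must be up.

    Assumes independent failures, which is a simplification.
    """
    return prod(pct / 100 for pct in sla_percentages)


composite = composite_availability(COMPONENT_SLAS.values())
downtime_hours = 8760 * (1 - composite)  # 8,760 hours in a non-leap year

print(f"Composite availability: {composite * 100:.3f}%")
print(f"Implied downtime budget: {downtime_hours:.2f} hours/year")
```

Under those assumptions the composite lands at about 99.97%, or a little over two and a half hours a year. That is still well ahead of the roughly 8.76 hours implied by a 99.9% target that carries no financial backing at all.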
Moreover, I can still use FSLogix for my user profile management, since it is a free entitlement for ANY organization with valid Remote Desktop Services CAL or SAL licenses. I will also have complete control over my update cadence, since I will be running the LTSC versions of Windows Server 2019. My RD Gateway servers will be using dual transport delivery with TCP and UDP to serve up desktops all across the world, even over higher latency and lossier networks. That is rock solid reliability, folks, with zero dependency on Azure Active Directory or the WVD infrastructure roles that offer no guaranteed SLA. Game, set, match.
My Conclusion – Wait On Migrating to WVD For Now. Continue to Utilize Remote Desktop Services In Your Own Datacenter, Private Cloud, Or Public Cloud.
Without a doubt, Microsoft will continue to iterate upon and improve its Windows Virtual Desktop offering in Azure. That being said, in my opinion, there are simply too many gotchas at the moment with the WVD service to contemplate a migration right now. Don’t be hoodwinked by the “free licensing” loss leader that Microsoft is pitching, namely that all of your Microsoft 365 users of a certain subscription level have a built-in WVD “seat” ready and waiting for them in Azure.
That free “seat” will be more than offset by increased computing costs, higher Microsoft 365 related costs, the need to deploy other ancillary or Premium Azure services to optimize WVD (e.g. Azure Files, Azure Monitor, etc.), the inability to have dual TCP/UDP transport of the RDP protocol without added costs, potential downtime related to update issues, and perhaps most importantly, the lack of a guaranteed SLA for the WVD infrastructure role services.
I am blessed to have a very loyal customer base I speak with on a weekly basis about their RDS deployments and their plans for WVD. Almost everyone whom I talk to indicates that WVD is simply too expensive an alternative to RDS to migrate to it right now. Moreover, I have clients who have migrated already to WVD, are having buyer’s remorse, and are contemplating either a move back to RDS or shifting to another virtualization platform altogether. Through my consulting work, I’ve also been privy to those painful conversations whereby the “expert Azure consultant firms” get a big “who farted?” look on their faces when their clients ask them why their WVD deployments are so over budget. Don’t strap your IT budget to a boat anchor and then throw it into a bottomless lake, please!
If you stay on classic Remote Desktop Services, you’re effectively guaranteed that this will be a valid highly available platform all the way until 2032. This is because Microsoft has included support for Remote Desktop Services in Windows Server 2022, the next Long Term Servicing Channel version of Windows Server. Extended support should be available for 10 years, meaning Classic RDS will be around and supported by Microsoft until at least 2032.
Now, Microsoft may try and play some games with Microsoft 365 Apps running on Windows Server 2019 when end of support happens in 2025, but my prediction is if they try and drop Microsoft 365 support on classic Windows Server deployments, the market will revolt in open mutiny. And even so, that’s still 4 more years away, during which time you can watch the evolution of Windows Virtual Desktop and see if it improves enough and becomes more cost competitive for you to reconsider it.
At RDPSoft, we’re committed to supporting classic RDS deployments well into the future. We continue to innovate and add more management and monitoring features that benefit RDS deployments, even while we continue to add more support for WVD, and we do it at a price point that can’t be beaten.
Similarly, if you need to optimize your existing RDS deployment, rebuild it, or stand up a new deployment in your own datacenter, private cloud, or public cloud, you should leverage the excellent consulting services of my friends at RDS Gurus. These experts are Microsoft MVPs like myself, they have published numerous books on Remote Desktop Services, and they will provide you the expert consulting you need and deserve in this area.