Greetings fellow readers, RDPHard channel subscribers, and valued customers. In my previous blog article and video in this series, I discussed how to optimize your connection brokers for larger RDS deployments. In today’s article and video, I’m going to discuss techniques for optimizing Remote Desktop Gateways for scalability as well.
Step 1 – Choose a Premium Load Balancer
Remote Desktop Gateways are different than connection brokers when it comes to scaling. Brokers show diminishing returns when you add more of them into a High Availability (HA) group, which is due to their architecture where only one of the brokers fields the majority of WMI requests from other RDS servers, even if there is proper load balancing among the brokers themselves. In the case of Remote Desktop Gateways, you can literally stand up as many of them as you want to support your RDS deployment, as long as you have a good load balancer in front of them that is rated for the volume of data transfer from all of the client connections.
As I’ve stated in an earlier video, I prefer Kemp load balancers for this sort of work, as they offer both physical hardware and virtual appliances you can run in Hyper-V, Azure, and AWS. Furthermore, you can configure Kemp Load Balancers to use direct server return techniques with Remote Desktop Gateways to stretch the capacity of the load balancer when you’re throwing tons of user connections at it. Also, if you’re a masochist with an unlimited budget to blow on public cloud infrastructure costs in Azure or AWS, Kemp’s virtual appliance offerings in those public clouds are far superior to both AWS and Azure’s native load balancer offerings in my opinion.
So – presuming you have a great load balancer that is powerful enough to handle your thousands upon thousands of users – then it’s simply a matter of standing up enough Remote Desktop Gateway VMs to handle that distributed user load.
Step 2 – Determine the Upper Limit of Concurrent RD Gateway Connections
How many connections can a single Remote Desktop Gateway properly handle, Andy? It depends on many factors. A good starting point is to plan on no single gateway handling more than 1000 concurrent connections. Notice I said connections, not users. In modern Remote Desktop Gateways, a single user RDP session can create multiple connections through the gateway if you have dual TCP/UDP transport set up properly, which you do want to have set up properly, because UDP delivers a better user experience than TCP alone. FYI, if you want to track gateway load over time and in a nice dashboard, you can utilize our Remote Desktop Commander Suite software which shows all of this to you, plus much more to boot.
Depending on the nuances of your RDS deployment, you may be able to stack more than 1000 connections or less than 1000 connections on any given gateway. While a Microsoft technical paper from long ago showed that gateways scale load pretty linearly when it comes to CPU and memory resources consumed per connection, there are other variables you need to take into account. The most important variable in my opinion is data transfer rates.
How much RDP data is each user pushing across the wire? Meaning, how data intensive are the apps you are serving over RDS? Are you hosting graphically intense apps like CAD/CAM software? If so, your gateways will support fewer connections all other things being equal. Conversely, if you have a line of business app that has lightweight data entry workers, you can probably scale to a higher number of concurrent connections. Again, you can use our Remote Desktop Commander Suite software to see how much data on average your users are pushing through the gateway to benchmark this.
Does your application use virtual channels and device redirection for things like scanning, printing, and access to local drives on the client? If so, that will increase data transfer and lower the number of connections each gateway can support.
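To make the connections-versus-users distinction concrete, here is a minimal sketch of the arithmetic. The figure of 2 connections per session is purely an assumption for illustration; measure the real number in your own environment, since it depends on how your transports are configured.

```python
def sessions_per_gateway(connection_cap: int, connections_per_session: int) -> int:
    """Return how many user sessions fit under a gateway's connection cap.

    connection_cap          -- planning limit for concurrent connections (e.g. 1000)
    connections_per_session -- ASSUMED average connections one RDP session opens
    """
    if connection_cap < 0 or connections_per_session < 1:
        raise ValueError("cap must be >= 0 and connections_per_session >= 1")
    # Integer division: a partial session does not count.
    return connection_cap // connections_per_session


# Hypothetical numbers: a 1000-connection cap and 2 connections per session.
print(sessions_per_gateway(1000, 2))
```

With those assumed inputs, a gateway capped at 1000 connections serves about 500 user sessions, not 1000, which is why planning in terms of users alone can leave you short.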
Step 3 – If Running RDS in a Public Cloud, Make Sure Your RD Gateway VMs Can Support the Required Network Throughput
Beyond just data transfer itself, if you’re running your RDS infrastructure in a public cloud like AWS or Azure, you need to consider VM size and throughput. Different classes of VMs are rated for different levels of network throughput, both sustained and burst. I’ve seen gateways start choking in Azure due to the IT department choosing undersized VMs (e.g. the B-Series). Remember, it’s not just CPU/memory; you’ve got to make sure your Gateways can keep up with the network operations and data throughput.
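A rough way to sanity-check a VM size is to multiply your expected connection count by the average data rate per connection and compare that against the VM’s rated network bandwidth. The sketch below does exactly that. Every number in it is hypothetical: the per-connection rate, the VM ratings, and the 70% utilization ceiling are assumptions you should replace with measured values and the figures your cloud provider publishes for your chosen SKU.

```python
def required_mbps(connections: int, avg_kbps_per_connection: float) -> float:
    """Estimate aggregate throughput in Mbps for a given connection count."""
    return connections * avg_kbps_per_connection / 1000.0


def vm_has_headroom(
    connections: int,
    avg_kbps_per_connection: float,
    vm_rated_mbps: float,
    max_utilization: float = 0.7,  # ASSUMED safety ceiling, not a vendor figure
) -> bool:
    """Return True if the estimated load fits under the VM's usable bandwidth."""
    usable_mbps = vm_rated_mbps * max_utilization
    return required_mbps(connections, avg_kbps_per_connection) <= usable_mbps


# Hypothetical workload: 1000 connections averaging 200 kbps each.
need = required_mbps(1000, 200.0)
print(f"Estimated load: {need} Mbps")
print("Fits a 500 Mbps VM:", vm_has_headroom(1000, 200.0, 500.0))
print("Fits a 250 Mbps VM:", vm_has_headroom(1000, 200.0, 250.0))
```

This only covers steady-state averages. Burst behavior, such as many users printing or scanning at once, deserves its own measurement before you settle on a VM size.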
Step 4 – Always Load Balance Your Gateways with Surplus Gateways Available
Taking into account all of the above, that leads me to a second question. How do you know when any single gateway is serving too many clients? The most common scenario is that the Remote Desktop Gateway service will crash, and the crash logs will indicate that the problem occurred in the AAEdge.dll library. Now, if you have your load balancing set up properly, a single gateway crash will not be catastrophic. The users connected through that gateway will be temporarily disconnected but should be able to reconnect when the load balancer detects the crashed gateway is not responding and redirects them to one of the other load balanced gateways. You should also have a recovery action in place for the Remote Desktop Gateway service so it restarts itself after the crash, or even better, restarts the entire gateway VM after the crash. On my gateways running Windows Server 2022, there is an auto-recovery sequence already defined for the gateway service.
However, the key phrase I mentioned above is PROPERLY load balanced. For example, if you’ve learned that the upper threshold of concurrent connections per gateway in your RDS environment is 1000, and you’ve scaled out to 4 gateways each handling almost 1000 connections, that’s no good. The moment one gateway crashes, its user connections will be shifted over to the surviving gateways, and if they’re already running at redline in terms of their own connections, you could get a cascading failure that brings all of the gateways down. Not to mention, in such a cascading failure, these reconnection sequences could end up creating a login storm for the brokers, which will suddenly field a bunch of reconnection attempts all at once, and that can bring down your brokers as well, adding insult to injury. So, make sure you add enough surplus gateways to the high availability set so that even after one gateway crashes during peak load, the surviving gateways can easily handle the rebalanced connections.
In the above example, if you replaced 4 gateways handling 1000 connections each with 6 gateways handling about 667 connections apiece, a single gateway failure would rebalance the load to roughly 800 connections on each of the surviving 5 gateways, which is still below the redline.
Again, you may be able to stack more than 1000 connections onto each gateway, but keep a close eye on things, and when you identify an upper limit make sure you have surplus gateways to prevent problems when one fails.
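The surplus math above can be sketched in a few lines. This is an illustration under stated assumptions, not a sizing tool: it assumes the load balancer spreads connections evenly, and the 80% post-failure target is a hypothetical headroom figure chosen to match the 6-gateway example.

```python
def load_after_failure(total_connections: int, gateways: int, failed: int = 1) -> float:
    """Connections per surviving gateway, assuming an even spread."""
    survivors = gateways - failed
    if survivors < 1:
        raise ValueError("at least one gateway must survive")
    return total_connections / survivors


def gateways_needed(
    total_connections: int,
    per_gateway_limit: int,
    headroom_pct: int = 80,   # ASSUMED: survivors should stay at or below 80% of the limit
    failures_tolerated: int = 1,
) -> int:
    """Smallest pool size that keeps survivors under the headroom target."""
    usable = per_gateway_limit * headroom_pct // 100
    if usable < 1:
        raise ValueError("usable capacity per gateway must be at least 1")
    survivors_required = -(-total_connections // usable)  # ceiling division
    return survivors_required + failures_tolerated


# The example from the text: roughly 4000 connections, redline of 1000 per gateway.
print(load_after_failure(4000, 4))   # survivors are pushed past the 1000 redline
print(load_after_failure(4000, 6))   # survivors land at 800 each
print(gateways_needed(4000, 1000))   # pool size for the assumed 80% target
```

Raising failures_tolerated or lowering headroom_pct gives you a more conservative pool size if you want to survive more than one gateway crash at peak.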
Step 5 – Understand Ahead of Time How Your Load Balancer Behaves When You Add or Remove RD Gateways to the Load Balanced Pool
One other point – make sure you understand how your load balancers handle things when you add or remove gateways from the load balancer pool. I had a client who was using an earlier version of Azure Load Balancer, and when they would add in new gateways, it would completely break all existing connections to other gateways, which caused massive problems. While I think this was an early engineering flaw with the Azure Load Balancer, test your load balancer’s behavior at an off-peak time by adding/removing gateways from a load balanced pool to see what it does when you do.
Step 6 – DON’T Accidentally Put Your Remote Desktop Gateways into a Legacy, Deprecated RD Gateway Farm
Finally, let me conclude with one common misconfiguration I see among admins who don’t know any better. Many admins will attempt to bind their gateways together into a “Gateway Server Farm” because they see the tab in the RD Gateway Manager that lets them do this. This is an old, deprecated feature from back in the Server 2008 days that provided a sort of load balancing when all gateway connections were made over legacy RPC-HTTP instead of separate TCP and UDP channels per client (i.e. the modern method). In modern RDS, by which I mean Server 2012 R2 and later operating systems, this is not required, and if you enroll your gateway servers into an RD Gateway Farm, it will significantly increase the overhead on your gateways and prevent them from scaling up as high.
So, if you have an RD Gateway Farm defined, and you already have a load balancer properly configured with Source IP affinity to distribute incoming TCP and UDP connections across all of your Gateways, you should remove the servers from the RD Gateway Farm altogether, which should improve scalability and maximum connection counts. If you have any questions about how to load balance your Remote Desktop Gateways, watch my Episode 1 video where I explain this in depth.
Remote Desktop Gateway Scaling Conclusions
So, in summary, here are the main takeaways you need to consider when optimizing your gateways for larger RDS deployments:
- Deploy a good load balancer designed to work with Remote Desktop Gateways, like Kemp.
- Don’t force too many user connections onto any single Gateway VM. Benchmark and observe performance, and stand up new Gateways in the load balancer pool as often as the load requires.
- Pay attention to network traffic and throughput. If operating your RDS in a public cloud, don’t skimp on VM size, and make sure you choose a VM SKU that can handle the network throughput required.
- Make sure you keep enough surplus gateways online so that any one gateway failure will not cause the surviving gateways to handle more concurrent connections than they can manage reliably.
- Understand how your load balancer behaves when you add or remove Gateways from the load balanced pool, so you don’t accidentally bring the RDS deployment down when making adjustments.
- Stop putting your Gateways into the legacy RD Gateway Farm area of the RD Gateway Manager. As long as you have a properly configured load balancer in place, remove your gateways from the RD Gateway Farm to increase scalability.