Troubleshoot Your Connection Broker Like a Boss

Howdy RDS fans! Today I’m going to show you how to troubleshoot your connection brokers like a boss. I’m sure you already know this, but in case you don’t, the connection broker is the brains of your RDS operation, provided you have more than one terminal server and you want to equally distribute users across all of your terminal servers in logical groupings of apps and desktops. We used to call this a “farm,” but Microsoft now calls it a “collection.”

The Importance of Microsoft SQL Server on Both High-Availability and Single Server Connection Broker Configurations

Needless to say, it’s very important that your RDS deployment’s brain stays working and accessible all the time, or your users will get errors when they try to connect or reconnect to a collection to access those apps or desktops. In larger environments, organizations often place their RDS deployments in high availability mode, which means there are two or more connection brokers deployed for fault tolerance. In addition, HA requires deploying a shared Microsoft SQL Server instance (or preferably a SQL cluster), which houses the central SQL database that all of the connection brokers use to handle connection requests, keep track of which users are assigned to which servers, and so on.

But in the vast majority of smaller RDS environments, the connection broker role will only be installed on one server, and it may even be running alongside other role services like RDWeb and RD Gateway. When deployed this way, how the connection broker works is relatively opaque to the server administrator responsible for the RDS environment. For example, in this default, single server connection broker deployment, did you know that the connection broker itself is also using a SQL database? You can look for Microsoft SQL amidst the list of installed programs on your connection broker, but it won’t be listed there.

Instead, when you first install the connection broker role on a single server, Microsoft installs a built-in instance of the Windows Internal Database. WID is in fact an instance of SQL Server, but it is designed to only be used by internal Microsoft applications, like RDS and Windows Server Update Services among others. Disappointingly, there are no management tools included with WID when it is installed as part of the connection broker role, so other than firing up the rather clunky, slow and largely useless RDS Manager that shows up in Server Manager on your Connection Broker, you’re pretty much flying blind when it comes to troubleshooting problems with it.

Installing the Connection Broker role also installs the Windows Internal Database

Troubleshooting RDS Connection Brokers – Starting With the Event Log

So, it’s now high time for me to show you how to debug what’s going on with your broker when it acts up. Typically, the first thing to do is to check both the TerminalServices-SessionBroker and TerminalServices-SessionBroker-Client event log areas in the Event Viewer. Each one has an Admin channel and an Operational channel. The Admin channel typically contains major errors, such as the inability of the broker to connect to its SQL database, and the Operational channel gives you information about how connections are being routed. While you can do this in the Event Viewer, if you use our Complete Monitoring and Management Bundle for RDS, the Remote Desktop Commander Client has a consolidated Event Viewer that will automatically show you recent errors/warnings from all 4 of these logs, alongside those from other RDS infrastructure roles and session hosts. That’s especially handy if you have a larger RDS deployment with multiple brokers and don’t want to visit each broker individually.

TerminalServices-SessionBroker Connection Broker Event Log: open the TerminalServices-SessionBroker and TerminalServices-SessionBroker-Client event logs and review both the Admin and Operational channels when you start your broker troubleshooting
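
If you’d rather pull those four logs from a command prompt than click through the Event Viewer, here is a minimal Python sketch that wraps the built-in wevtutil tool. Treat the full channel names as an assumption on my part: they are the two log names above with the usual "Microsoft-Windows-" prefix, so confirm them with `wevtutil el` on your own broker. The function names are just illustrative.

```python
import subprocess

# Assumed full channel names: the two logs named above, with the usual
# "Microsoft-Windows-" prefix. Confirm them with `wevtutil el`.
LOGS = [
    "Microsoft-Windows-TerminalServices-SessionBroker",
    "Microsoft-Windows-TerminalServices-SessionBroker-Client",
]
CHANNELS = ["Admin", "Operational"]


def build_queries(count=25):
    """Return one wevtutil command per log/channel pair (4 in total)."""
    commands = []
    for log in LOGS:
        for channel in CHANNELS:
            commands.append([
                "wevtutil", "qe", f"{log}/{channel}",
                f"/c:{count}",  # limit the number of events returned
                "/rd:true",     # read newest events first
                "/f:text",      # human-readable output instead of XML
            ])
    return commands


def dump_recent_events(count=25):
    """Run every query on the broker itself (Windows only, elevated prompt)."""
    for cmd in build_queries(count):
        print("=== " + cmd[2] + " ===")
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)
```

Calling `dump_recent_events()` on the broker prints the newest entries from each channel in turn. Start with the Admin output, since that is where the major errors land.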

Troubleshooting RDS Connection Brokers – Accessing and Troubleshooting the SQL Server Configuration

You may get lucky and find an error message that has an easy fix once you look it up online. But by and large, when things go wrong with connection brokers, they go wrong at the database layer. Examples of this include, but are not limited to:

1.) The database transaction log filling up or becoming inaccessible
2.) The database not coming back online after a reboot or after updates are applied
3.) Too many connections (caused by a login storm or a temporary gateway failure) needing to be handled at the same time, which causes timeouts when executing stored procedures
4.) And many more items…

Now, if you are running high availability mode which uses a dedicated SQL server or SQL cluster for your connection brokers, it’s easy to launch SQL Server Management Studio to poke around and check for problems. However, if you’ve got a single connection broker tied to a Windows Internal Database instance, you are flying blind, unless you take the following approach. By the way, this approach is also valid for high availability connection broker setups, and you can just skip ahead to the part in my video above after I’ve installed SSMS and after I’ve connected to the SQL instance in WID.

Installing SQL Server Management Studio (SSMS) On A Single Server Connection Broker Deployment To Access the Windows Internal Database

First, download a copy of SQL Server Management Studio from Microsoft. It’s free, and you should ALWAYS install a copy of it on a single server connection broker for troubleshooting needs that may arise in the future.

Use special syntax to connect to the Connection Broker Windows Internal Database: once SSMS is installed, connect to the Windows Internal Database using np:\\.\pipe\MICROSOFT##WID\tsql\query as the server name, and set the encryption level to Optional

Once it’s installed, the next trick is knowing how to connect to the Windows Internal Database so you can inspect the database, check for errors, etc. To do this, you have to make a direct named pipes connection to the WID instance. The current named pipes syntax to enter in the Server Name field when connecting is np:\\.\pipe\MICROSOFT##WID\tsql\query. As this may change in later editions of WID, you will need to get the instance portion of the pipe path from the internal service name of the WID. Basically, it’s the part of the internal service name that follows MSSQL$, so a service named MSSQL$MICROSOFT##WID gives you MICROSOFT##WID. You’ll also need to set the Encryption setting to Optional. And of course, you’ll need to be an admin on the broker server to make this connection in SSMS.
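
To make that service name trick concrete, here is a small Python sketch that turns the internal service name into the value you paste into the Server Name field. The function name is made up for illustration, and it assumes the current pipe layout (`\tsql\query` after the instance name), so it will not cover any edition of WID that lays out its pipe differently.

```python
SERVICE_PREFIX = "MSSQL$"


def wid_pipe_from_service(service_name):
    """Build the SSMS server name from a WID service name.

    Assumes the current pipe layout: np:\\\\.\\pipe\\<instance>\\tsql\\query
    """
    if not service_name.upper().startswith(SERVICE_PREFIX):
        raise ValueError("expected a service name starting with MSSQL$")
    # Everything after "MSSQL$" is the instance portion of the pipe path.
    instance = service_name[len(SERVICE_PREFIX):]
    if not instance:
        raise ValueError("service name has no instance part")
    return r"np:\\.\pipe" + "\\" + instance + r"\tsql\query"


print(wid_pipe_from_service("MSSQL$MICROSOFT##WID"))
# -> np:\\.\pipe\MICROSOFT##WID\tsql\query
```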

Inspecting the ErrorLog Table in the Connection Broker Database, and How to Access Database Properties for Other Optimizations

Voila! We’ve now logged into the Windows Internal Database, and we can see the Connection Broker database. We can list tables, views, and stored procedures. Of significant note is the ErrorLog table. This is where the Connection Broker logs internal errors that arise when invoking stored procedures during normal operations. Oftentimes, if you’re seeing things like “semaphore timeouts” in the Broker event log entries, this ErrorLog table will be chock full of errors related to stored procedure timeouts or locks. These tend to be caused by login storms in the morning or after lunch, or by a Gateway crash that leaves the broker fielding tons of reconnect attempts. As I’ve talked about at length in my other videos, the Connection Broker stored procedures are not particularly optimized and don’t scale very well. So if you have more than a few thousand users in your RDS deployment, you need to watch my videos on how to optimize your brokers and gateways for supersized RDS rollouts, and/or consider breaking your single deployment into two or more clustered RDS deployments, each with its own set of load balanced brokers and gateways.

Check entries in the ErrorLog table in the Connection Broker database (RDCMS) first
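
If you prefer a query window to clicking through Object Explorer, the sketch below builds two T-SQL statements you could paste into SSMS: one that finds which schema the ErrorLog table lives in, and one that pulls a sample of rows from it. I’m deliberately not guessing the schema or the column names here, which is why the first query looks the schema up and the second one uses `SELECT *` with no ordering. The helper names are illustrative only.

```python
def quote_ident(name):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def find_errorlog_sql(database="RDCMS"):
    """Query that reports the schema holding the ErrorLog table."""
    return (
        "SELECT TABLE_SCHEMA, TABLE_NAME "
        "FROM " + quote_ident(database) + ".INFORMATION_SCHEMA.TABLES "
        "WHERE TABLE_NAME = 'ErrorLog';"
    )


def errorlog_sample_sql(schema, rows=50, database="RDCMS"):
    """Query that returns up to `rows` rows from the ErrorLog table."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    table = ".".join(quote_ident(p) for p in (database, schema, "ErrorLog"))
    return "SELECT TOP (%d) * FROM %s;" % (rows, table)


print(find_errorlog_sql())
# "dbo" is only a placeholder: use the TABLE_SCHEMA value that the
# first query returns on your own broker.
print(errorlog_sample_sql("dbo", rows=25))
```

Once you can see the real column names, add an ORDER BY on whatever timestamp column the table has so the newest errors come first.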

As an aside, you can also use the SSMS console to adjust advanced settings on the Windows Internal Database, such as maximum memory used, processor affinity, and even Cost Threshold for Parallelism, which I’ve talked about in my previous Connection Broker optimization video. All that being said, before you make any changes here, make a complete backup and snapshot of the virtual machine hosting your connection broker so you can roll back your changes if problems occur.
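
Before changing any of those settings, it’s worth recording what they are set to right now. Here is a minimal, read-only sketch that builds a query against the standard `sys.configurations` catalog view for the three settings mentioned above. The option names are the ones SQL Server itself uses in that view; the function name is just for illustration, and nothing here modifies the instance.

```python
# Option names as they appear in sys.configurations on SQL Server.
OPTIONS = (
    "max server memory (MB)",
    "affinity mask",
    "cost threshold for parallelism",
)


def current_settings_sql(options=OPTIONS):
    """Build a read-only query showing configured and in-use values."""
    # Escape single quotes so each name is a safe N'...' literal.
    quoted = ", ".join("N'" + o.replace("'", "''") + "'" for o in options)
    return (
        "SELECT name, value, value_in_use FROM sys.configurations "
        "WHERE name IN (" + quoted + ") ORDER BY name;"
    )


print(current_settings_sql())
```

Save the output alongside your VM snapshot, so you know exactly which values to restore if a change doesn’t pan out.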

Inspecting the SQL Server Logs History

If the ErrorLog table doesn’t offer clear insight as to what may be going on with your Connection Broker database, it’s time to dig deeper into the SQL Server Logs history, which is buried under the Management folder of the WID instance in the SSMS Object Explorer. These logs are comprehensive, and tell you everything about what’s happening with a database, including things like login failures, the database going offline, the database being placed in suspect state, etc. These logs often tell you exactly what’s going wrong with the broker’s WID database, and by googling the errors found in these logs you may gain quick insight on how to resolve the issue.

Next, review the SQL Server Logs history to further debug problems with the Connection Broker database

For instance, I was recently helping a client troubleshoot a problem where their connection broker would go offline after some, but not all, Windows Updates. After examining the SQL Server Logs history, we realized that the database was going offline due to an Authenticode signing error after particular Windows Updates were applied. We then found a thread online describing the issue in depth and potential mitigations for it.

Connection Broker Troubleshooting is Reactive, Synthetic RDP Login Monitoring is Proactive

All this said, while understanding how to troubleshoot your Connection Broker can be helpful in diagnosing the root causes of failures, it cannot beat around-the-clock, proactive monitoring of your RDS infrastructure roles, which lets you respond to and remediate failures more quickly. This is where our Remote Desktop Canary tool can be a big help, as it routinely checks the health of those role servers AND your session hosts by performing full synthetic login tests into your RDS environment, all the way to a client desktop or a client app launch. If anything goes wrong, if login times get too long, or if black screens appear, it will alert you with an email message that contains relevant error and event log messages about the problem, as well as screenshots of the login sequence to provide even more context.

If you’d like to learn more about Remote Desktop Canary, you can visit its product page here, and you can start an affordable monthly subscription to it alone, or as a part of our Complete Monitoring and Management Bundle for RDS. Our Complete Bundle also comes with the Remote Desktop Commander Suite, among other tools, that provide dashboards showing the health of your broker, connection success rates, and much more.

Stay tuned for more conversations on Connection Broker scalability and reliability, including tips and tricks on promoting your brokers into High Availability Mode!
