Unusual errors with Distributed Replay
I’ve come to really like Distributed Replay in the last couple of years. I’ve used it to do a scale test, I’ve used it to test a workload for performance regressions before upgrading. It has a lot of possibilities.
One problem with it is there’s no GUI, so configuring it requires a fair bit of time spent playing around on the command line or in the depths of Component Services.
Another problem is that there aren’t a lot of people using it, so there isn’t a lot of detail on what to do if something goes wrong.
Make that ‘when something goes wrong’, DReplay is a little finicky.
The more common errors are firewall and COM related and they appear in several blog posts, a search for the error code usually turns up a resolution. However there was an error which I ran into recently which turned up exactly 0 results in google. So, to fix that problem, here’s a description of the error, the circumstances and what turned out to be the cause of the error.
To start, the scenario. The preprocess of the trace files had been done, the firewall configured, the COM setting changes made. The services were running, no errors showing in the logs. I’d used DReplay on the machine previously with the same processed trace file and it had worked fine. This time, however…
“Error DReplay Failed to get client information from controller.”
After turning up nothing in google, I spent half the afternoon checking logs, restarting the services, restarting the computer, checking and rechecking the firewall and the COM settings. I finally went and checked the details of the controller and client services.
Anyone spotted the problem yet? For those who haven’t, let me highlight pieces of those last two screenshots.
Distributed Replay is not instanced. If there are two versions of SQL Server installed on the machine, and this laptop has SQL 2012 and SQL 2014, and the replay controller and client were installed with both, then the service points to the executables from the most recent installation. The older version’s executables are still there however, and they still execute. They throw errors, but the errors do not, in any way, indicate that there’s a version problem.
The above error is what the Replay option of DReplay returns. If the preprocess is run from the incorrect directory, the error returned is “Error DReplay Object reference not set to an instance of an object.”
The fix is as simple as changing to the correct directory and running the correct version of DReplay, the one that matches the version which the services point to.