I’ve come to really like Distributed Replay in the last couple of years. I’ve used it to do a scale test, and I’ve used it to test a workload for performance regressions before upgrading. It has a lot of possibilities.
One problem with it is there’s no GUI, so configuring it requires a fair bit of time spent playing around on the command line or in the depths of Component Services.
Another problem is that there aren’t a lot of people using it, so there isn’t a lot of detail on what to do if something goes wrong.
Make that ‘when something goes wrong’; DReplay is a little finicky.
The more common errors are firewall and COM related, and they appear in several blog posts; a search for the error code usually turns up a resolution. However, I recently ran into an error that turned up exactly 0 results in Google. So, to fix that problem, here’s a description of the error, the circumstances, and what turned out to be the cause.
To start, the scenario. The preprocess of the trace files had been done, the firewall configured, the COM setting changes made. The services were running, no errors showing in the logs. I’d used DReplay on the machine previously with the same processed trace file and it had worked fine. This time, however…
“Error DReplay Failed to get client information from controller.”
After turning up nothing in Google, I spent half the afternoon checking logs, restarting the services, restarting the computer, and checking and rechecking the firewall and the COM settings. I finally went and checked the details of the controller and client services.
Anyone spotted the problem yet? For those who haven’t, let me highlight pieces of those last two screenshots.
Distributed Replay is not instanced. If there are two versions of SQL Server installed on the machine (this laptop has SQL 2012 and SQL 2014), and the replay controller and client were installed with both, then the services point to the executables from the most recent installation. The older version’s executables are still there, however, and they still execute. They throw errors, but the errors do not, in any way, indicate that there’s a version problem.
The above error is what the Replay option of DReplay returns. If the preprocess is run from the incorrect directory, the error returned is “Error DReplay Object reference not set to an instance of an object.”
The fix is as simple as changing to the correct directory and running the correct version of DReplay, the one that matches the version which the services point to.
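A quick way to catch this mismatch before losing an afternoon is to compare the version folder in the path the service runs from with the version folder of the DReplay executable being invoked. The following is only a minimal sketch: it assumes the default install layout, where each version lives under a numbered folder such as `...\Microsoft SQL Server\110\Tools\...` or `...\Microsoft SQL Server\120\Tools\...`, and the example paths and function names are illustrative rather than taken from the machine described here. The service's binary path would need to be looked up separately, for example from the service properties in the Services console.

```python
import re
from typing import Optional

# Assumption: default layout of ...\Microsoft SQL Server\<version folder>\Tools\...
# where the version folder is numeric (e.g. 110, 120).
_VERSION_DIR = re.compile(r"\\Microsoft SQL Server\\(\d+)\\", re.IGNORECASE)


def version_folder(path: str) -> Optional[str]:
    """Return the numeric version folder found in a SQL Server tools path, or None."""
    match = _VERSION_DIR.search(path)
    return match.group(1) if match else None


def same_version(service_binary: str, dreplay_exe: str) -> bool:
    """True only if both paths contain a version folder and the folders match."""
    service_version = version_folder(service_binary)
    tool_version = version_folder(dreplay_exe)
    return service_version is not None and service_version == tool_version


if __name__ == "__main__":
    # Hypothetical example paths; substitute the real ones from your machine.
    service = r'"C:\Program Files (x86)\Microsoft SQL Server\120\Tools\DReplayController\DReplayController.exe"'
    tool = r"C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\dreplay.exe"
    if not same_version(service, tool):
        print("Version mismatch: run dreplay.exe from the folder matching the service.")
```

If the check reports a mismatch, changing to the directory that matches the service's version folder and running DReplay from there is the same fix described above.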
Out of curiosity (because so few people use it and I don’t get to ask), how many clients were you using and… how did you get them? If I asked for a couple VMs to just run the client at my workplace I’d hear the laughter all the way from ground floor.
Also what were you doing with it and how were you monitoring the progress? I mean, were you running it while also running a second trace against the databases to capture stats from what was going on, or…?
There’s very little workflow information about any of this.
In this particular case, one client, as I was testing stuff on my laptop (hence the two instances of SQL). In an earlier project, 16 clients.
As for how I got them: I told the client that I needed 16 small VMs (1 core, 2GB memory) for a week to do the project they wanted me to do for them.
There’s an option on Distributed Replay that will write out a trace file of the results back to the controller; however, in the projects I’ve done, I had an extended events session running on the server to capture the stats.
Gail, thanks for this. I had an ‘object reference’ error while trying to preprocess. Turns out, my 2016 install was causing the same problem you reported. I changed the directory, and it worked. I would have been hunting a while for this one!