Latest Posts

Parameter sniffing, pt 3

Last, but not least, here’s one final look at parameter sniffing.

In part 1 of this mini-series I wrote about data skew and the problems it can cause with parameters. In part 2, I looked at what happens when the optimiser can’t sniff values. In this post, I’m going to look at a third thing that can cause parameter sniffing problems. It’s fairly obvious once you know about it, but the first time it was pointed out to me I was a little surprised.

So, to recap. When the query optimiser gets a stored procedure to compile, it knows the values of any parameters passed to that procedure. It will then compile the procedure and optimise the queries based upon the values of those parameters. The optimiser cannot sniff the values of variables, because variables have not been assigned values at the time that the procedure is optimised.

I’m going to use the same sample code and data as in the first article, to keep things consistent.

From the tests that were done before, I know that the query

select * from largetable where somestring = 'zzz'

executes optimally with an index seek and returns 9 rows. Likewise, I know that the query

select * from largetable where somestring = 'abc'

executes optimally with a clustered index scan and returns 1667 rows.

Now, let’s see if I can get the optimiser to make the wrong choice.
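The setup is simple enough to sketch here. The table and column come from the earlier articles; the procedure name is mine. Compile the procedure with one parameter value, then execute it with the other:

```sql
CREATE PROCEDURE TestSniffing (@StringVal VARCHAR(10))
AS
SELECT * FROM largetable WHERE somestring = @StringVal
GO

-- First execution compiles the procedure. The optimiser sniffs 'zzz'
-- and picks the index seek that suits 9 rows.
EXEC TestSniffing @StringVal = 'zzz'

-- Second execution reuses the cached seek plan, even though a
-- clustered index scan suits the 1667 rows that 'abc' returns.
EXEC TestSniffing @StringVal = 'abc'
```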


Common T-SQL mistakes

I have the pleasure at the moment of doing a code review on some vendor code. No names will be mentioned. I’ve seen better. I’ve seen a lot better. I’m seeing very common mistakes in the code, so, in the interests of my sanity, I’m going to go over a couple of common T-SQL mistakes in the hopes that the next batch of code I get to review doesn’t have these mistakes in…

1. Error Handling

Proper error handling is hard. SQL 2005 has made it a lot easier with TRY…CATCH blocks, but it still means that everything that can throw an error must be wrapped inside a TRY block, with an appropriate CATCH block to handle any errors.

It was a lot harder on SQL 2000, when all we had to work with was @@Error. What I think was not well understood was which statements set and reset @@Error, and how long a non-zero value persists, leading to code constructs like this:

Insert into SomeTable ...
Update SomeTable SET ...
Delete From SomeOtherTable ...

IF @@Error != 0
Print 'An error occurred'
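The problem with that construct is that @@Error is set or reset by every statement, so only an error from the final DELETE would ever be caught. A minimal sketch of the SQL 2000-style pattern that does work (the tables and columns are hypothetical): capture @@Error into a variable immediately after each statement.

```sql
DECLARE @err INT

INSERT INTO SomeTable (SomeColumn) VALUES (1)
SET @err = @@ERROR
IF @err != 0 GOTO ErrHandler

UPDATE SomeTable SET SomeColumn = 2
SET @err = @@ERROR
IF @err != 0 GOTO ErrHandler

DELETE FROM SomeOtherTable WHERE SomeColumn = 1
SET @err = @@ERROR
IF @err != 0 GOTO ErrHandler

RETURN 0

ErrHandler:
Print 'An error occurred'
```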


SQL Injection

This is a bit of a rant, so please ignore it if you’re looking for technical info.

There’s been a fair bit of news on SQL injection in the last week or so, mainly because some people figured out a way to automate the exploit.

What scares me is the widespread lack of knowledge of SQL injection. I’m fairly active on a couple of the SQL forums, and on Monday this week there were two posts by people who have had their databases hacked via a SQL injection exploit.

If this was a new exploit, I wouldn’t be so disappointed, but it’s not. SQL injection’s been around for years. I first read about it in 2001 when I started in web development.

So, why, 7 years later, are people still being hit with it? Why does a quick Google search turn up a number of web sites with entire queries in the URL? Sites including some government organisations, a couple of police departments, online stores and the coast guard. (No, I’m not posting links. Search for yourself if you’re curious.)

Completely preventing SQL injection is not hard. If web pages call the database using properly parameterised calls to stored procedures, then SQL injection cannot be done. Set minimum permissions in the database for the web user and it’s even more secure.
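To illustrate the difference, a sketch (the table name and input are made up; only the second form keeps user input out of the query text):

```sql
DECLARE @UserInput NVARCHAR(50)
SET @UserInput = N''' OR 1=1 --'  -- a classic injection string

-- Vulnerable: user input is concatenated into the query text,
-- so the WHERE clause becomes: UserName = '' OR 1=1 --
DECLARE @sql NVARCHAR(500)
SET @sql = N'SELECT * FROM Users WHERE UserName = ''' + @UserInput + N''''
EXEC (@sql)

-- Safe: the input is passed as a typed parameter and can never
-- become part of the query text
EXEC sp_executesql
    N'SELECT * FROM Users WHERE UserName = @name',
    N'@name NVARCHAR(50)',
    @name = @UserInput
```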

So, why is it that so many sites, new and old, are still vulnerable?

Edit: For some in-depth info on preventing SQL injection, see this blog post on Technet

Token and perm user store growth

I wrote about the TokenAndPermUserStore issue a while back. I revisited it recently on two production servers to see if it was still a problem on a fairly recent SQL build.

The two servers in question run the same application, but with a vast difference in usage.

Server 1 – Itanium, 12 processors, 48 GB memory, build 3042 (SP2) running on Windows Server 2003 DataCenter SP1
Server 2 – x64, 16 processors, 64 GB memory, build 3215 (SP2 CU5) running on Windows Server 2003 DataCenter SP2

On both servers there’s a job that checks the size of the token cache and flushes it if it’s above 750 MB. That threshold was chosen because, when I first set up the jobs, I noticed that CMEMTHREAD waits were getting noticeable once the cache got above 1 GB.

On both servers, I set up a job that would log the size of the cache to a table. On server 1, that ran every 15 min. On server 2 it ran every 5 min. The results were a little frightening.
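For reference, a sketch of the check and the flush those jobs use (the 750 MB threshold logic is left out for brevity):

```sql
-- Current size of the security token cache, in KB
SELECT SUM(single_pages_kb + multi_pages_kb) AS TokenCacheKB
FROM sys.dm_os_memory_clerks
WHERE name = 'TokenAndPermUserStore'

-- Flush just that cache, without touching the plan cache
DBCC FREESYSTEMCACHE ('TokenAndPermUserStore')
```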


Execution plans – important properties

It’s time I wrote another piece on execution plans (see the first post for the full list).

So, now a quick look at some of the important properties of exec plan operators.

  • Estimated number of rows – This is the number of rows that the query optimiser estimates this operator will process. The estimate is based on the statistics and on other data available at the time of compilation. It’s possible for this number to include a fraction, due to the way the optimiser does its estimates.
  • Actual number of rows – This is the actual number of rows that were processed by the operator. This value is calculated by the query processor at execution time. An estimated execution plan will not include this.
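A large gap between those two numbers is usually the first sign of a statistics or estimation problem. A quick way to see both side by side, using the sample table from the parameter sniffing posts:

```sql
SET STATISTICS PROFILE ON

SELECT * FROM largetable WHERE somestring = 'abc'

SET STATISTICS PROFILE OFF
-- In the output, the Rows column holds the actual row count per
-- operator and EstimateRows holds the optimiser's estimate.
```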

The exceptional DBA

I was listening to a podcast the other day, and it got me thinking. What is it that makes an exceptional DBA?

In no particular order, some of the things that I think make an exceptional DBA are


We all know that IT is a fast changing industry. There’s no doubt that one has to keep learning just to keep up. An exceptional DBA should be constantly learning, always looking for new challenges, new and better ways of doing things. They should also seek to broaden their knowledge. Specialisation is great, but having at least a basic understanding of development practices, architectural principles or system administration can be extremely useful.

Execution plan operations – misc

This is going to be the last post on the operators in an execution plan. I’m not going to cover all of the remaining operators; there are a lot of them, many quite specialised or only appearing in specific scenarios. Conor Cunningham’s writing a series covering the deep, dark details of some of the operations.

So onto the operators. I’m going to cover Sort, Concatenate and Scalar Function.


Sort has three logical operators:

  • Sort
  • Distinct Sort
  • Top-N Sort

Sort is used to satisfy any ORDER BY that is specified in a query, if the index used does not naturally sort the rows in the desired way. It may also appear in the execution plan before a merge join or a stream aggregate, since sometimes SQL may decide that a sort followed by a merge join or stream aggregate is cheaper than a hash join or hash aggregate.

Distinct sort is used for DISTINCT, sometimes for UNION and sometimes for GROUP BY where no aggregates are specified.
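A sketch of queries that would typically produce each of the three, again using the sample table from the earlier posts (the plan you actually get always depends on the indexes present):

```sql
-- Sort: ORDER BY on a column that no index sorts by
SELECT * FROM largetable ORDER BY somestring

-- Distinct Sort: DISTINCT with no aggregate
SELECT DISTINCT somestring FROM largetable

-- Top-N Sort: TOP combined with ORDER BY
SELECT TOP (10) * FROM largetable ORDER BY somestring
```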


Four minor new features of SQL 2008

There are a couple of new features of SQL Server 2008 that I’m pretty excited about, but that I haven’t seen get a great amount of attention. I thought I’d write a short article on them, so they don’t get overlooked.

  1. Variable initialisation
  2. Row constructors
  3. Multi-server queries
  4. Intellisense/syntax checking

Variable Initialisation

I’ve always been more partial to the C# way of declaring and initialising variables all at once. Now it’s possible in T-SQL too.

DECLARE @SomeVariable INT = 0, @SomeOtherVariable CHAR(5) = 'Hello'

It’s not revolutionary, but I do like it. Anything that reduces the amount of typing and reduces the chances of bugs is worthwhile, in my opinion.
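Row constructors, second on the list, are another small typing-saver: a single INSERT can now supply several rows at once. A quick sketch (the table and columns are made up):

```sql
INSERT INTO SomeTable (ID, Description)
VALUES  (1, 'First row'),
        (2, 'Second row'),
        (3, 'Third row')
```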


Dev Chat

I spent yesterday at the JHB SQL Server 2008 Dev Chat. Most definitely a day well spent. The two presenters (Peter Willmot and Eben de Wit) were both knowledgeable and entertaining and the material covered was interesting.

The sessions covered some of the larger changes in SQL Server 2008.

First up was a brief look at some of the changes on the administration side. The first topic was transparent database encryption. This is a fantastic feature for people concerned about the security of the database files themselves. The database is encrypted on disk and decrypted in memory, and any attempt to attach the encrypted files to another instance of SQL will fail unless the certificate that was used to encrypt the database is present on the second instance.

One major warning on that: lose the certificate, say goodbye to your database.
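So back that certificate up, and store the backup somewhere other than the server it protects. A sketch of the syntax (the certificate name, paths and password are all placeholders):

```sql
BACKUP CERTIFICATE TDECert
TO FILE = 'D:\CertBackups\TDECert.cer'
WITH PRIVATE KEY (
    FILE = 'D:\CertBackups\TDECert.pvk',
    ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere'
)
```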

Second of the admin-related changes discussed was the DMF (policy-based management). This is a great new feature and there are two nice ways of using it.


On recoverability

Had a lovely situation this morning.

I can’t go into details, but essentially a database had no off-site backups, no tape backups and the only full backups were on disk. Then the SAN glitched and both the data files and backups were corrupted.

I’m probably preaching to the converted, but I don’t think there are many more important things on a production server than ensuring good backups.

However, having the backups is not enough. Can they be restored? If there’s a complete drive failure of all drives connected to a server, can the databases be recovered?
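At a minimum, verify that the backup files are readable; better still, actually restore them somewhere on a schedule. A sketch (the database, file and path names are placeholders):

```sql
-- Checks that the backup is complete and readable; it does not
-- prove that the database inside it is usable
RESTORE VERIFYONLY FROM DISK = 'D:\Backups\Critical.bak'

-- The only real test: restore it, under a different name
RESTORE DATABASE Critical_Test
FROM DISK = 'D:\Backups\Critical.bak'
WITH MOVE 'Critical_Data' TO 'E:\RestoreTest\Critical_Test.mdf',
     MOVE 'Critical_Log'  TO 'E:\RestoreTest\Critical_Test.ldf'
```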

When the pawpaw hits the fan (as the local saying goes), that’s the only thing that matters.