Latest Posts

All about Execution Plans

Coming to the PASS Summit in October this year? Excellent!

I say excellent, because Grant (blog|twitter) and I are presenting a full-day seminar on the Monday, all about execution plans. Excited already? If not, let me give you a taste of what we’ll be discussing.

Grant will be kicking things off with a look at how the optimiser works. Not 500 level deep internals (that’d take all day by itself), but an overview of how the optimiser turns this

SELECT t.Title, u.UserName, u.UserLevel, t.LastModified, t.LastPoster
FROM Threads t
INNER JOIN Posts p on t.ThreadID = p.ThreadID
INNER JOIN Users u on p.Poster = u.UserID
WHERE t.ThreadID = 42

into this

[Image: the execution plan the optimiser produces for the query]

as well as some DMVs that give a view on what the optimiser is doing.
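
As a small taste of those DMVs (my guess at one that's likely to feature, so treat it as illustrative), sys.dm_exec_query_optimizer_info exposes cumulative counters describing the optimiser's work since the instance started:

-- How many optimisations have run, how many got a trivial plan, how many timed out
SELECT counter, occurrence, value
FROM sys.dm_exec_query_optimizer_info
WHERE counter IN ('optimizations', 'trivial plan', 'search 1', 'timeout');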

After that, I’ll be digging into what the plan cache is, how it’s structured and how plans are matched to incoming queries. I’ll also cover how you might go about monitoring SQL’s usage of the cache.
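
For a flavour of that monitoring (one possible query, not necessarily the exact demos from the seminar), the cache contents and their reuse counts are visible through sys.dm_exec_cached_plans:

-- Cached plans, how often each has been reused, and the query text each covers
SELECT cp.usecounts, cp.cacheobjtype, cp.objtype, cp.size_in_bytes, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;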

After that we’ll discuss the myriad ways of getting execution plans out of SQL Server, what the various options return, and what some of the terms involved really mean (estimated execution plans, anyone?).
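
To illustrate just one of those options (my example, not the full list we’ll work through), the plans of queries that have already run can be pulled straight out of the cache:

-- Cached plans matched to their query text
SELECT st.text, qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp;

By contrast, SET SHOWPLAN_XML ON returns an estimated plan without executing the query at all, which is one of the distinctions we’ll be unpacking.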

Once all that’s out there (which will probably take the entire morning) it’ll be onto the fun of reading the execution plans, what those various icons represent and what all their arcane properties are telling you. Maybe, just maybe we’ll also have a look at the raw XML form of the plan, just for fun.

And because theory’s not very useful without something to apply it to, we’ll be ending off the day by seeing how you can use the information from the execution plan to tune your queries. After all, everyone wants better performance, right?

Still not completely convinced? Then check out the 24 Hours of PASS coming in September. Grant and I are each presenting a pre-con preview: I’m covering extracting execution plans and Grant’s covering a portion of the reading execution plans section.

Right, now that everyone’s dying to attend this, I’m going to toss out a little bit of a challenge. If you’re coming to our pre-con and you have an execution plan that you don’t understand (maybe a strange operator, maybe a property with a value you’ve never seen), mail it to me (gail@<blog domain>) with a note as to what you want clarified and we might discuss it during the pre-con. Some rules, just to keep things sane:

  • No confidential stuff. I don’t want legal trouble, neither do you.
  • If the exec plan takes 5 minutes to open in SSMS, I’m deleting it.
  • If it’s multiple pages in SSMS, I’m deleting it.
  • I don’t promise to look at anyone’s plans; it depends on how we’re doing on time.

Goodbye IsNumeric hell

A long-overdue feature introduced in Denali CTP 3 is the pair of Try_Parse and Try_Convert functions. These are great for dealing with something that’s frustrated SQL developers for years – data type conversions.

Let’s imagine a rather nasty case, a file with some values in it that once imported into SQL (as character data) looks something like this:

[Image: the imported character values in the BadNumerics table]

Ewww… that’s a mess. Let’s see if we can identify which of the values can be converted into a numeric data type. The function prior to Denali for that was ISNUMERIC.

SELECT ANumber, ISNUMERIC(ANumber)
FROM BadNumerics   

[Image: ISNUMERIC returns 1 for all but one of the values]

Great, so other than the obvious one, they’re all numeric data. Time to get converting.

SELECT CAST(ANumber as Numeric(18,6))
  FROM BadNumerics
  WHERE ISNUMERIC(ANumber) = 1;

Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to numeric.

Err, so they’re numeric but can’t be converted to numeric. That’s fun. Maybe another of the numeric data types will work…
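
To foreshadow where this is heading (a minimal sketch of the new functions; the full exploration continues below), Try_Convert simply returns NULL for values it cannot convert, instead of raising an error:

-- NULL in the second column means 'not convertible'
SELECT ANumber, TRY_CONVERT(NUMERIC(18,6), ANumber) AS Converted
FROM BadNumerics;

-- Which makes it safe to filter on convertibility before casting
SELECT CAST(ANumber AS NUMERIC(18,6)) AS ANumber
FROM BadNumerics
WHERE TRY_CONVERT(NUMERIC(18,6), ANumber) IS NOT NULL;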

(more…)

SQLSkills Immersion training in London

Just short of the winter solstice, I bailed out of a freezing cold Johannesburg for warmer climates; well, actually for London where the weather could almost have been mistaken for a South African winter, except much wetter.

However I wasn’t going to London for the weather (fortunately), nor even for some sightseeing; I was going to London for some SQL training. Not just any SQL training, but some seriously deep training from among the best trainers in the world, one of Paul Randal and Kimberly Tripp’s five day immersion courses, and immersion it most certainly was.

It gives a good feel for the intensity and depth of the course when, after all the introductions and housekeeping on the Monday morning were done, Paul started off by diving straight into the structure of pages and rows. Start slowly, where’s the fun in that? By lunchtime I had on the order of 20 pages of notes. The reasoning behind starting with that is that the internal structures come up again and again and again in the later material, so a firm understanding of how things are put together at the lowest level makes it easier to understand the features that are built on top.

The material covered is detailed on the SQLSkills website (http://www.sqlskills.com/T_ImmersionInternalsDesign.asp), so I’m not going to waste space by listing it all again. The material was all covered in incredible detail and any point could be, and frequently was, expanded on if any of the students asked.

On the subject of questions, there was not a single one, over all five days, that Paul and Kimberly could not answer with at most a couple of minutes of research (mostly for things like KB article numbers). Questions and comments were encouraged and often the discussions were as valuable as the main course material.

By the Friday, all the students were looking a little worn out. Paul, of course, was still as vibrant as ever, to the point of heckling (in good fun naturally) someone who crawled in late on the Friday morning.

All in all, it was a fantastic experience, and that was just week one out of the four. I’d really like to thank Paul and Kimberly for coming across to London and giving people who can’t make it to the US classes an opportunity to take one of their Immersion Courses. I certainly hope that they plan to make a return visit soon.


Converting OR to Union

When I looked at indexing for queries containing predicates combined with OR, it became clear that there are some restrictive requirements that must be met before the optimiser will consider using indexes for seek operations.

  • Each predicate (or set of predicates) combined with an OR must have a separate index
  • All of those indexes must be covering, or the row count of the concatenated result set must be low enough to make key lookups an option, as the optimiser does not appear to consider doing key lookups for a subset of the predicates before concatenating the result sets.

So what can be done if it’s not possible to meet those requirements?

The standard trick is to convert the query with ORs into multiple queries combined with UNION. The idea is that since OR predicates are evaluated separately and the result sets concatenated, we can do that manually by writing the queries separately and concatenating them using UNION or UNION ALL. (UNION ALL can only be safely used if the predicates are known to be mutually exclusive)

CREATE TABLE Persons (
PersonID INT IDENTITY PRIMARY KEY,
FirstName    VARCHAR(30),
Surname VARCHAR(30),
Country CHAR(3),
RegistrationDate DATE
)

CREATE INDEX idx_Persons_FirstName ON dbo.Persons (FirstName) INCLUDE (Surname)
CREATE INDEX idx_Persons_Surname ON dbo.Persons (Surname) INCLUDE (FirstName)
GO

-- Data population using SQLDataGenerator

SELECT FirstName, Surname
FROM dbo.Persons
WHERE FirstName = 'Daniel' OR Surname = 'Barnes'

SELECT FirstName, Surname
FROM dbo.Persons
WHERE FirstName = 'Daniel'
UNION
SELECT FirstName, Surname
FROM dbo.Persons
WHERE Surname = 'Barnes'

In this case, the OR can be replaced with a UNION and the results are the same. The UNION form is slightly less efficient according to the execution plan’s costings (60% compared to the OR’s 40%), and the two queries have the same general form, with two index seeks and some form of concatenation and duplicate removal.

[Images: the query results and the execution plans for the two forms]

So in that case it worked fine, although the original form was a little more efficient.
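
As a footnote to that comparison (a variation of mine, not part of the original demo): if the branches are written to be explicitly mutually exclusive, UNION ALL can be used and the duplicate-removal step disappears:

SELECT FirstName, Surname
FROM dbo.Persons
WHERE FirstName = 'Daniel'
UNION ALL
-- Exclude rows the first branch already returned, so no distinct is needed
SELECT FirstName, Surname
FROM dbo.Persons
WHERE Surname = 'Barnes'
AND (FirstName <> 'Daniel' OR FirstName IS NULL)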
(more…)

Are all updates split into delete-insert?

This should be another quick one.

Earlier I saw a forum post where someone asserted that SQL always executes an update as a delete of the row followed by an insert of the row with the new values. I never need much of an excuse to play with undocumented features, so let’s see if that’s true (or at least true for all the common cases I try).

First up: a heap, no indexes at all, and an update that does not change the size of the row.

CREATE TABLE TestingUpdate1 (
ID INT IDENTITY,
SomeString CHAR(50)
)

INSERT INTO TestingUpdate1 (SomeString)
VALUES
('One'),('Two'),('Three'),('Four'),('Five'),('Six'),('Seven'),('Eight'),('Nine')

CHECKPOINT -- truncate the log, DB is in simple recovery.

UPDATE TestingUpdate1
SET SomeString = 'NotFour'
WHERE ID = 4 -- one row

SELECT Operation, Context, AllocUnitName, [Transaction Name], Description FROM fn_dblog(NULL, NULL) AS TranLog

DROP TABLE TestingUpdate1

[Image: fn_dblog output for the update]

The log operation here is Modify Row, so in this case the update was done as an in-place update.

Second test: a heap, no indexes at all, and an update that changes the row size (though with plenty of free space on the page, so no issues with forwarding pointers here).
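
A sketch of what that second test looks like, following the first test’s pattern (the table name and values are my reconstruction, not the original code):

CREATE TABLE TestingUpdate2 (
ID INT IDENTITY,
SomeString VARCHAR(50) -- variable length this time, so the row can grow
)

INSERT INTO TestingUpdate2 (SomeString)
VALUES
('One'),('Two'),('Three'),('Four'),('Five'),('Six'),('Seven'),('Eight'),('Nine')

CHECKPOINT -- truncate the log, DB is in simple recovery.

UPDATE TestingUpdate2
SET SomeString = 'NotFour, but rather longer than it was' -- changes the row size
WHERE ID = 4 -- one row

SELECT Operation, Context, AllocUnitName, [Transaction Name], Description FROM fn_dblog(NULL, NULL) AS TranLog

DROP TABLE TestingUpdate2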

(more…)

T-SQL Tuesday: Disasters don’t just come in huge sizes

So we’re supposed to talk about disasters we’ve had or discuss disaster recovery technologies. I’m going to take a slightly different approach…

<soapbox>

I’m in the fortunate position of living in a region of the world that’s relatively free of natural disasters. We’re reasonably geologically stable; the nearest fault lines are the Great Rift Valley and somewhere in the Antarctic Ocean. We don’t get tornadoes, and we’re a long way from the ocean (Madagascar also partially protects the coast from typhoons and tsunamis).

Given that, and looking at the recent events in Japan and Southern USA, local IT managers might be grateful that their critical infrastructure is here, not there. But that is no reason for complacency, no reason to ignore putting a disaster recovery plan in place.

Huge national disasters, while they attract a whole lot of attention (and rightly so), are probably not the main cause of IT disasters. IT disasters, for the most part, are likely to be caused by smaller events like these (1):

  • A drive in the RAID 5 array fails, and the SAN admin switches out the wrong drive.
  • A SAN controller fails and overwrites a LUN or 2 with binary garbage.
  • The server room floor collapses dumping the SAN 2 floors down into a garage, with the server on top of it.
  • A water leak a floor above the server room results in the UPS getting a shower, and the resultant power surge fries the servers’ main boards.
  • A developer with far too many permissions truncates an important table on the production server, thinking he was working on the development environment.
  • The mains power fails but the generators don’t come online because their fuel was drained a day earlier in preparation for maintenance.

Events like those (or probably even more mundane ones) are the ones that we need to plan for: relatively minor disasters that can leave a business without critical infrastructure or critical data for hours or days.

You need to plan for the small disasters as well as the large ones. Plan for the dropped table. Plan for two drives failing in the RAID 5 array. Plan for the server’s power supply failing. Plan for the big disasters too, just don’t think that they’re the only thing that endangers your data and your business.

(1) I’ve personally seen three of those events happen, I’ve heard from people who have seen two more, and I know of a company that’s at risk of one. They’re not just made-up improbable occurrences.

On the exorcism of Ghost Records

There was a question earlier on one of the SQL forums as to whether or not Ghost Cleanup overwrote the deleted rows when it ran so as to ensure no one could read that data again.

Now I could just reference Paul Randal’s blog post where he described Ghost Cleanup in depth and leave it there, but where would the fun be in that?

So, now that everyone’s read Paul’s blog post, time for some fun with undocumented commands to prove (again) what we already know (and what Paul’s already shown).

Setup code:

CREATE TABLE TestingCleanup (
ID INT IDENTITY PRIMARY KEY,
Description VARCHAR(20),
Filler CHAR(50) DEFAULT ''
);
GO

INSERT INTO TestingCleanup (Description)
VALUES ('One'), ('Two'), ('Three'), ('Four'), ('Five'), ('Six'), ('Seven'), ('Eight'), ('Nine'), ('Ten')

-- Find the page number that the table is on

SELECT OBJECT_ID('TestingCleanup') -- 1399012065
DBCC IND(11,1399012065,1)
-- The IAM is 1:309 and the data page is 1:308

Now I’m going to delete half the table and then go and look at the page. The ghost cleanup can’t run until the transaction commits, so running this within a transaction gives a chance to go and look at how things work.

BEGIN TRANSACTION
DELETE FROM TestingCleanup WHERE ID%2 = 0 -- delete the even rows
SELECT ID, Description FROM TestingCleanup -- 5 rows
DBCC TRACEON (3604)
DBCC PAGE(11,1,308,1) -- the data page identified by DBCC IND above
DBCC TRACEOFF (3604)
COMMIT TRANSACTION

The delete removes all the rows with even identity values, and the select returns only 5 rows, as expected.

Dump type 1 for DBCC Page gives the header and then each row separately in binary. I’m editing out uninteresting parts of the output to keep things manageable.
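
Incidentally, for anyone who’d rather not rummage through pages by hand, there’s a documented cross-check (my addition, not part of the original demo): sys.dm_db_index_physical_stats reports ghost record counts when run in DETAILED mode.

-- ghost_record_count shows rows deleted but not yet swept up by ghost cleanup
SELECT index_id, ghost_record_count, version_ghost_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('TestingCleanup'), NULL, NULL, 'DETAILED');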

(more…)

PASS Summit Session Votes

I blogged a week ago about the sessions that I submitted for the PASS Summit this year. At the point that I scheduled the post I hadn’t seen the announcement from PASS, hence a second post on the same topic.

This year, you can vote for sessions that you want to see. How much influence this will have over the Program Committee isn’t defined anywhere, but anything is better than nothing.

Voting closes on the 20th May (Friday), so if you have any interest in what gets presented at PASS Summit this year, go over to the speaker preference survey and vote.

Oh, and to make things easier, if you want to vote for any of my sessions, they’re here.

On Transactions, errors and rollbacks

Do errors encountered within a transaction result in a rollback?

It seems, at first, to be a simple question with an obvious answer. Transactions are supposed to be atomic, either the entire transaction completes or none of it completes.

Maybe too simple…

CREATE TABLE TestingTransactionRollbacks (
 ID INT NOT NULL PRIMARY KEY,
 SomeDate DATETIME DEFAULT GETDATE()
);
GO
BEGIN TRANSACTION
-- succeeds
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (1)
-- Fails. Cannot insert null into a non-null column
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (NULL)
-- succeeds
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (2)
-- fails. Duplicate key
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (2)
-- succeeds
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (3)
COMMIT TRANSACTION
GO
SELECT ID, SomeDate FROM TestingTransactionRollbacks
GO
DROP TABLE TestingTransactionRollbacks

If a transaction rolled back at the first failure, that final select would return no rows. But it doesn’t; it returns 3 rows. The failures of the individual statements were ignored and the transaction completed and committed. If that had been an important business process rather than a made-up example, that could have had some nasty consequences for the transactional consistency of the data.

What’s really going on here? Aren’t transactions supposed to be atomic? Isn’t SQL supposed to roll them back if they don’t complete successfully?

Well, kinda.
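
As a taste of one fix before the full details (a sketch; re-run the CREATE TABLE above first, since the example dropped it): with SET XACT_ABORT ON, a run-time error terminates the batch and rolls the entire transaction back.

SET XACT_ABORT ON;
GO
BEGIN TRANSACTION
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (1)
-- With XACT_ABORT ON, this failure aborts the batch and rolls everything back
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (NULL)
COMMIT TRANSACTION
GO
-- Returns no rows: the first insert was rolled back along with the rest
SELECT ID, SomeDate FROM TestingTransactionRollbacks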

(more…)