Author Archive: Gail

Recompiles

I’ve looked at cache hit and miss events and at the cache insert and remove events. The only cache-monitoring event (of current interest) left is the recompile event.

There are two recompile events available in Profiler in SQL 2008:

  • SP:Recompile under Stored Procedures
  • SQL:StmtRecompile under T-SQL

Which to use when?

Books Online has the following to say on the two events.

The SQL:StmtRecompile event class indicates statement-level recompilations caused by all types of batches: stored procedures, triggers, ad hoc batches, and queries. Starting in SQL Server 2005, the SQL:StmtRecompile event class should be used instead of the SP:Recompile event class.

and

The SP:Recompile event class indicates that a stored procedure, trigger, or user-defined function has been recompiled. In SQL Server 2005 and later, recompilations reported by this event class occur at the statement level, whereas those in SQL Server 2000 occurred at the batch level.

In SQL Server 2005 and later, the preferred way to trace statement-level recompilations is to use the SQL:StmtRecompile event class. Starting in SQL Server 2005, the SP:Recompile event class is deprecated.

So it appears that they show the same thing and that SP:Recompile is deprecated. That simplifies the entire situation: the only event that I’m going to look at, in that case, is SQL:StmtRecompile.

So what does the event look like?
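If you want to see the event fire on demand, a procedure that creates and uses a temp table is an easy trigger. A quick sketch (the procedure and table here are my own invention, not from the post):

-- Statements that reference a temp table created in the same procedure
-- are subject to deferred compile, which shows up as a statement-level
-- recompile the first time the procedure runs.
CREATE PROCEDURE TestRecompiles AS
CREATE TABLE #Temp (ID INT)
INSERT INTO #Temp (ID) VALUES (1)
SELECT ID FROM #Temp
GO

EXEC TestRecompiles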

[Screenshot: recompile events]

(more…)

First Impressions

SELECT @@Version

Microsoft SQL Server “Denali” (CTP1) – 11.0.1103.9 (X64)

I’ve been playing with the CTP for a few days (with absolutely no documentation) and there are a few fun and interesting things that I found. These may not be earth-shattering changes, but they are interesting, and I don’t doubt there’s much more fun to come.

Configuration manager

It’s now much easier to add and remove startup parameters for the service. No more SQL failing to start because you forgot the ; between the end of the location of the master log file and the trace flag that you added.
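For context, the startup parameters used to be edited as a single semicolon-separated string, something like this (paths made up for illustration):

-dC:\SQLData\master.mdf;-eC:\SQLLogs\ERRORLOG;-lC:\SQLData\mastlog.ldf;-T1222

Miss one semicolon and the service simply won’t start.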

[Screenshot: startup parameters]

What’s SQL HADR all about? Wish I knew. It requires a clustered installation and I don’t have a cluster handy to play with (yet). So no investigations of the Denali High availability/Disaster recovery features yet.

New SQL Server features

Sequences

People have been asking for sequences in SQL for ages (1) (2) and workarounds for creating replacements have been around as long (3). Now they’re here.

[Screenshot: sequences]

So what can we do with these things?
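To give a taste of the syntax before digging in, a minimal sketch (names invented):

-- Create a sequence and pull values from it
CREATE SEQUENCE TestSequence AS INT
    START WITH 1
    INCREMENT BY 1

-- Each call returns the next value in the sequence
SELECT NEXT VALUE FOR TestSequence
SELECT NEXT VALUE FOR TestSequence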

(more…)

A Trio of Table Variables

So, it’s the second Tuesday of the month, and that means it’s time for T-SQL Tuesday again. This month it’s hosted by Sankar Reddy and the topic is “Misconceptions in SQL Server”.

I thought I’d tackle a trio of table variable myths and partial truths.

Table Variables are memory-only

This one is pervasive and irritating. It typically goes like this:

You should use table variables rather than temp tables because table variables are memory only.

This myth can be broken down into two parts:

  1. That table variables are not part of TempDB
  2. That table variables are not written to disk

The first is easy to prove and has been done repeatedly. I’m not doing it again. I’m going to tackle the second portion only.

See, one could argue that, even though the table variable is created in the TempDB system tables and allocated pages within the TempDB data file, it is still kept entirely and only in memory. Let’s see if that’s true…

DECLARE @LargeTable TABLE (
    id INT IDENTITY PRIMARY KEY,
    LargeStringColumn1 CHAR(100),
    LargeStringColumn2 CHAR(100)
)

INSERT INTO @LargeTable (LargeStringColumn1, LargeStringColumn2)
SELECT TOP (100000) 'Table Variable Test', 'T-SQL Tuesday!'
FROM master.sys.columns a CROSS JOIN master.sys.columns b

WAITFOR DELAY '00:01:00' -- so that the table variable doesn't go out of scope and get deallocated too quickly

This is not a massively large table: 100000 rows at 204 bytes per row (excluding the header). A query of sys.dm_db_index_physical_stats (which does work on temp tables and table variables) reveals a total page count of 2632. That’s a grand total of 20.6 MB. Twenty megabytes. The SQL instance I’m running this on is allowed to use up to 2 GB of memory. There is no way on earth that this table variable is going to cause any form of memory pressure (and I promise there is nothing else running).
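For reference, the page count came from something along these lines, run from a second session while the WAITFOR was still executing (a sketch, not the exact query; the table variable appears in TempDB under a system-generated name):

-- Pages allocated per object in TempDB
SELECT t.name, SUM(ps.page_count) AS TotalPages,
       SUM(ps.page_count) * 8 / 1024.0 AS SizeMB
FROM sys.dm_db_index_physical_stats(DB_ID('tempdb'), NULL, NULL, NULL, 'LIMITED') ps
INNER JOIN tempdb.sys.tables t ON t.object_id = ps.object_id
GROUP BY t.name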

So, run that code and, while that WAITFOR is running, do something that should never be done to a SQL Server that you care anything about. (more…)

One wide index or multiple narrow indexes?

Or “If one index is good, surely many indexes (indexes? indices? indi?) will be better”

This is a question that comes up very often on the forums. Something along the lines of:

I have a query with multiple where clause conditions on a table. Should I create one index for each condition, or one index with all the columns in it?

The question basically boils down to this: which is more optimal and more likely for the optimiser to pick, a single seek operation against a wide index that seeks on all three conditions in one go, or three seek operations against three indexes followed by a join to get back the final set of rows?

One thing to keep in mind is that one of the jobs of an index is to reduce the number of rows in consideration for a query as early as possible in the query’s execution.

So let’s take a made-up example. Let’s say we have a table with a number of columns in it. A query is run against that table with three conditions in the where clause:

WHERE ColA = @A AND ColB = @B AND ColC = @C

Let’s further say that 1000 rows qualify for the condition ColA = @A, 15000 rows qualify for ColB = @B and 30000 rows qualify for ColC = @C. The total number of rows that qualify for all three conditions is 25.

Which sounds like it would be more efficient?

  • Seek on an index with all three columns and retrieve just 25 rows
  • Seek on an index on ColA and retrieve 1000 rows, seek on an index on ColB and retrieve 15000 rows, seek on an index on ColC and retrieve 30000 rows, then join the three result sets together to get the desired 25 rows (called an Index Intersection; both options are sketched below)
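In index terms, the two options would look something like this (table and index names invented for illustration):

-- Option 1: a single wide index on all three columns
CREATE INDEX idx_Wide ON SomeTable (ColA, ColB, ColC)

-- Option 2: three narrow single-column indexes
CREATE INDEX idx_ColA ON SomeTable (ColA)
CREATE INDEX idx_ColB ON SomeTable (ColB)
CREATE INDEX idx_ColC ON SomeTable (ColC)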

Time for some tests to find out.

(more…)

Come and gone

Or “Plan cache monitoring – insert and remove”

Previously I took a look at the CacheHit and CacheMiss events to see how they behave and how to identify what’s been searched for in the cache. In this follow-up, I want to take a similar look at the CacheInsert and CacheRemove events, to see when they fire and how to identify the objects that they relate to.

Again, a word of caution: these can be frequently occurring events on busy servers, so traces should be kept short and limited to a minimum of events and columns. That said, these should occur a lot less often than the CacheHit and CacheMiss events. If they are occurring very frequently, it may indicate that the SQL Server is not reusing plans efficiently.
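As an aside, if all you need is a rough idea of plan reuse, the plan cache DMVs are a cheaper first stop than a trace. Something like this (a sketch, not from the post) shows how often each cached plan has been used; lots of plans with a usecounts of 1 on a busy server suggests poor reuse:

SELECT cp.usecounts, cp.cacheobjtype, cp.objtype, st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
ORDER BY cp.usecounts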

CacheInsert

The CacheInsert event fires after a CacheMiss. The search for a matching plan in the cache failed, firing a CacheMiss event. Since there’s no plan, the optimiser is invoked to generate one and then that plan is inserted into the plan cache before the Query Execution engine begins execution.

The event is fairly simple, though of course there are a few surprises (what in SQL doesn’t have a few?).

EXEC FireCacheEvents
GO

SELECT ID, SomeDate, Status
FROM TestingCacheEvents
WHERE Status = 'C'

[Screenshot: CacheInsert]

(more…)

Hit and miss

Or “Monitoring plan cache usage”

For people interested in the details of how SQL is using and reusing execution plans, there are some useful events in Profiler for watching this in detail, under the Stored Procedures event group:

  • SP:CacheMiss
  • SP:CacheInsert
  • SP:CacheHit
  • SP:CacheRemove
  • SP:Recompile
  • SP:StmtRecompile

Additionally there’s the SQL:StmtRecompile event under the TSQL group.

For now, I just want to look briefly at the CacheMiss and CacheHit events.

One word of caution early on: these are frequently occurring events and it may not be a good idea to trace them on busy production servers. If you do need to, keep the duration of the trace short and the columns to a minimum.

[Screenshot: cache events]

CacheMiss

The CacheMiss event fires any time SQL looks for the execution plan for an object or an ad-hoc batch and does not find it in the plan cache.

For an object (scalar function, multi-statement table-valued function, stored procedure or trigger) the match is done on the object ID (along with some of the connection’s SET options and possibly the database user, and a couple of other factors). For an ad-hoc batch, the match is done on a hash of the text of the batch (along with some of the connection’s SET options and possibly the database user).
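One consequence of matching ad-hoc batches on a hash of the batch text is that any difference in the text, even just case or whitespace, produces a different hash. Each of these two batches (table name invented) gets its own cache entry, and its own CacheMiss the first time it runs:

SELECT ID FROM SomeTable WHERE ID = 1
GO
select id from SomeTable where id = 1
GO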

When testing stored procedures from Management Studio (or another SQL querying tool), two CacheMiss events will appear in the trace.

[Screenshot: CacheMiss]

What’s going on here?

(more…)

Does a missing data file send a database suspect?

Short answer: No

I keep seeing this come up on various forums when people ask what makes a database go suspect or what could have caused their database to be marked suspect. I can almost guarantee that when that question comes up, one or more people will answer ‘missing files’. That may have been true on earlier versions (I don’t have a SQL 2000 or earlier instance to play with), but it is not true in the current versions of SQL. A missing file may result in the database being inaccessible (depending on which file), but it will not result in a suspect database.

Let’s prove it.

I’m going to create a database with three files, two of which are in the primary filegroup, and with one user table on each filegroup (T-SQL code at the end). Before each test I’ll begin a transaction and modify data in both tables, shut SQL down so that there’s an uncommitted transaction in the log (meaning the database cannot be cleanly shut down), then rename a file before restarting SQL and see what happens.
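The layout is roughly this; a sketch only, with made-up file paths (the real script is at the end of the post):

CREATE DATABASE FileTest
ON PRIMARY
    (NAME = FileTest1, FILENAME = 'D:\Data\FileTest1.mdf'),
    (NAME = FileTest2, FILENAME = 'D:\Data\FileTest2.ndf'),
FILEGROUP Secondary
    (NAME = FileTest3, FILENAME = 'D:\Data\FileTest3.ndf')
LOG ON
    (NAME = FileTestLog, FILENAME = 'D:\Logs\FileTest_log.ldf')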

I’m not going to play with the transaction log. That I’ve done before. In SQL 2005/2008, if the transaction log is missing and the database was cleanly shut down, SQL will recreate it. If the transaction log is missing and the database was not cleanly shut down, the database goes into the RECOVERY_PENDING state, so no SUSPECT here.

Let’s try the file in the secondary filegroup first.

(more…)

DBA Hell

On the first day of DBA hell, the server gave to me
A database with damaged system tables and no good backups (1)

On the second day of DBA hell, the server gave to me
Two databases with widespread corruption, no backups (1, 2)

On the third day of DBA hell, the server gave to me
Three suspect databases, no backups (1,2,3)

On the fourth day of DBA hell, the server gave me nothing, cause I didn’t have a job any longer…

How does one end up with a critical production database that has no backups? I could kinda understand if the backups were damaged, or if the corruption went undetected for long enough that it was in the backups as well, but to have no backups at all? Of an important database?

The only excuse for having no backups is if the database can be trivially and completely recreated from another source with minimal impact to the users. This is not the normal scenario.

There’s an immense amount of information available on backup and restore strategies.

That’s just a quick list; there’s far more information available than that. Enough that there’s really no good excuse to not have backups when they’re needed.

As Steve Jones (blog|twitter) is fond of saying, “Good backup, good resume. You only need one.”

Redgate’s Exceptional DBA competition

Redgate’s Exceptional DBA competition is back for a third year! I was one of the judges for this last year and, while I’m not judging it this year, I do have some advice for anyone considering entering.

Be explicit and detailed

This is not a competition won by luck. There are no dice rolled, no coins tossed, no numbers drawn from a hat.

As an entrant, you need to convince the judges that you (or the person you are nominating) are the best of the best. The only things that you can use to do that are the answers on the entry form.

The more the better (within reason). To give an idea, last year the answers to one question (What are the hallmarks of an exceptional DBA?) ranged from one word to half a page. Which of those two do you think the judges rated higher?

If you can, get a colleague to read over your answers before submitting them. Ask them for their opinion, ask them if there are any pieces that they’d change or add to show you (or the person you are nominating) in their very best light.

Spell check

Please, please, please run a spell check and grammar check over your entries before submitting. This goes double if English (or American) is not your first language. Nothing makes an entry look worse than por speeling thet teh jugdes mast spand tyme desifering.

No, not all of us speak English fluently, but there are enough grammar and spell checkers available (hint: Firefox includes one if you download the dictionary) that not bothering shows a lack of interest and professionalism. Besides, if the judges can’t work out what you’re saying, they’re not going to rate your entry highly.

On this point, watch the l33t speak and SMS/Twitter-style word shortening. They’re harder to read than fully written-out words, and space is not at a premium in these entries. Again, you should be trying to show that you are a professional, much as you would when writing up a CV.

Watch the humour

What’s funny for one person may be annoying or offensive to another. A joke about ‘cleaning up after those incompetent developers’ may not be funny to a judge who is a developer or comes from a development background. Again, keep it professional, imagine that these answers are going to be seen by the CIO/owner/MD of the company you work for.

Along the same lines, funny answers aren’t. One entry last year, for the question “Why do you deserve to win?”, gave as an answer “No idea ;-)”

Well dude, if you don’t know why you should win, I sure as hell don’t.

In Conclusion

If you’re planning to enter this competition, you have one chance to make an impression with the judges – your answers. Make it the best impression that you can.

In, Exists and join – a roundup

Over the last several months I’ve had a look at IN, EXISTS, JOIN and their opposites to see how they perform and whether there’s any truth in the advice, often seen on forums and blogs, advocating replacing one with the other.

Previous parts of this series can be found:

In this roundup post, I’m going to run multiple tests on the six query forms, with different numbers of rows, with and without indexes, and, for the negative forms (NOT IN, NOT EXISTS), with nullable and non-nullable join columns.
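For reference, the three positive forms look like this (table names invented; the negative forms swap in NOT IN, NOT EXISTS and a left outer join with an IS NULL check):

-- IN
SELECT p.SomeColumn FROM PrimaryTable p
WHERE p.ID IN (SELECT s.ID FROM SecondaryTable s)

-- EXISTS
SELECT p.SomeColumn FROM PrimaryTable p
WHERE EXISTS (SELECT 1 FROM SecondaryTable s WHERE s.ID = p.ID)

-- JOIN (DISTINCT to compensate for row duplication on the join)
SELECT DISTINCT p.SomeColumn
FROM PrimaryTable p INNER JOIN SecondaryTable s ON s.ID = p.ID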

In the individual tests, I used 250000 rows in the first table and around 3000 rows in the secondary table. In this roundup, I’m going to use three different row counts: 1000000 rows, 50000 rows and 2500 rows. That should give a reasonable idea of performance at various table sizes. (There’s not much point in going smaller than 2500 rows; everything’s fast on 100 rows.)

Some notes on the tests:

  • The version of SQL is SQL Server 2008 SP1 x64 Developer Edition.
  • The tests were run on a laptop. Core-2 Duo, 3 GB memory. SQL limited to 1 processor, so no parallelism possible.
  • Each query will be run 10 times; reads, CPU and duration measured by Profiler and averaged.
  • Each query will be run once before the tests start to ensure that the data is in cache and the execution plans are generated and cached.
  • Reproduction scripts will be available for download.

(more…)