SQL Server

T-SQL Tuesday: Disasters don’t just come in huge

So we’re supposed to talk about disasters we’ve had or discuss disaster recovery technologies. I’m going to take a slightly different approach…

<soapbox>

I’m in the fortunate position of living in a region of the world that’s relatively free of natural disasters. We’re reasonably geologically stable; the nearest fault lines are the Great Rift Valley and somewhere in the Antarctic Ocean. We don’t get tornadoes, and we’re a long way from the ocean (and Madagascar partially protects the coast from typhoons and tsunamis).

Given that, and looking at the recent events in Japan and the Southern USA, local IT managers might be grateful that their critical infrastructure is here, not there. But that is no reason for complacency, and no reason to neglect putting a disaster recovery plan in place.

Major national disasters, while they attract a great deal of attention (and rightly so), are probably not the main cause of IT disasters. IT disasters, for the most part, are more likely to be caused by smaller events like these (1):

  • A drive in the RAID 5 array fails, and the SAN admin switches out the wrong drive.
  • A SAN controller fails and overwrites a LUN or two with binary garbage.
  • The server room floor collapses, dumping the SAN two floors down into a garage, with the server on top of it.
  • A water leak a floor above the server room gives the UPS a shower, and the resultant power surge fries the servers’ main boards.
  • A developer with far too many permissions truncates an important table on the production server, thinking he was working in the development environment.
  • The mains power fails but the generators don’t come online because their fuel was drained a day earlier in preparation for maintenance.

Events like those (or probably even more mundane ones) are the ones that we need to plan for: relatively minor disasters that can leave a business without critical infrastructure or critical data for hours or days.

You need to plan for the small disasters as well as the large ones. Plan for the dropped table. Plan for two drives failing in the RAID 5 array. Plan for the server’s power supply failing. Plan for the big disasters too, just don’t think that they’re the only thing that endangers your data and your business.

(1) I’ve personally seen three of those events happen, I’ve heard from people who have seen two more, and I know of a company that’s at risk of one. They’re not just made-up, improbable occurrences.

On the exorcism of Ghost Records

There was a question earlier on one of the SQL forums as to whether or not Ghost Cleanup overwrote the deleted rows when it ran so as to ensure no one could read that data again.

Now I could just reference Paul Randal’s blog post where he described Ghost Cleanup in depth and leave it there, but where would the fun be in that?

So, now that everyone’s read Paul’s blog post, time for some fun with undocumented commands to prove (again) what we already know (and what Paul’s already shown).

Setup code:

CREATE TABLE TestingCleanup (
ID INT IDENTITY PRIMARY KEY,
Description VARCHAR(20),
Filler CHAR(50) DEFAULT ''
);
GO

INSERT INTO TestingCleanup (Description)
VALUES ('One'), ('Two'), ('Three'), ('Four'), ('Five'), ('Six'), ('Seven'), ('Eight'), ('Nine'), ('Ten')

-- Find the page number that the table is on

SELECT OBJECT_ID('TestingCleanup') -- 1399012065
DBCC IND(11,1399012065,1)
-- The IAM is 1:309 and the data page is 1:308

Now I’m going to delete half the table and then go and look at the page. The ghost cleanup can’t run until the transaction commits, so running this within a transaction gives us a chance to see how things work.

BEGIN TRANSACTION
DELETE FROM TestingCleanup WHERE ID%2 = 0 -- delete the even rows
SELECT ID, Description FROM TestingCleanup -- 5 rows
DBCC TRACEON (3604)
DBCC PAGE(11,1,308,1) -- dump the data page identified by DBCC IND above
DBCC TRACEOFF (3604)
COMMIT TRANSACTION

The delete removes all the rows with even identity values and the select returns only 5 rows, as expected.
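As an aside (this check isn’t part of the original demo), the ghost rows can also be counted without dumping the page: sys.dm_db_index_physical_stats exposes a ghost_record_count column when run in SAMPLED or DETAILED mode. Run it in the same session before the COMMIT, and again a little while afterwards, to watch the ghost cleanup task do its work.

-- Count ghost records on the table's indexes (DETAILED mode is needed to populate the column)
SELECT index_id, ghost_record_count, version_ghost_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('TestingCleanup'), NULL, NULL, 'DETAILED');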

Dump type 1 for DBCC Page gives the header and then each row separately in binary. I’m editing out uninteresting parts of the output to keep things manageable.

(more…)

On Transactions, errors and rollbacks

Do errors encountered within a transaction result in a rollback?

It seems, at first, to be a simple question with an obvious answer. Transactions are supposed to be atomic, either the entire transaction completes or none of it completes.

Maybe too simple…

CREATE TABLE TestingTransactionRollbacks (
 ID INT NOT NULL PRIMARY KEY ,
 SomeDate DATETIME DEFAULT GETDATE()
 ) ;
GO
BEGIN TRANSACTION
-- succeeds
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (1)
-- Fails. Cannot insert null into a non-null column
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (NULL)
-- succeeds
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (2)
-- fails. Duplicate key
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (2)
-- succeeds
INSERT INTO TestingTransactionRollbacks (ID)
 VALUES (3)
COMMIT TRANSACTION
GO
SELECT ID, SomeDate FROM TestingTransactionRollbacks
GO
DROP TABLE TestingTransactionRollbacks

If a transaction rolled back at the first failure, that final select would return no rows. But it doesn’t; it returns 3 rows. The failure of the individual statements was ignored and the transaction completed and committed. If that had been an important business process, not a made-up example, that could have some nasty consequences for the transactional consistency of the data.

What’s really going on here? Aren’t transactions supposed to be atomic? Isn’t SQL supposed to roll them back if they don’t complete successfully?

Well, kinda.
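For completeness, here’s a minimal sketch of the standard workarounds (assuming the table above is still around; this isn’t what the rest of the post covers): SET XACT_ABORT ON, or an explicit TRY…CATCH, will give the batch genuine all-or-nothing behaviour.

-- Option 1: XACT_ABORT makes most runtime errors doom the transaction and abort the batch
SET XACT_ABORT ON;
BEGIN TRANSACTION
INSERT INTO TestingTransactionRollbacks (ID) VALUES (1);
INSERT INTO TestingTransactionRollbacks (ID) VALUES (NULL); -- error: transaction rolls back, batch aborts
INSERT INTO TestingTransactionRollbacks (ID) VALUES (2);    -- never runs
COMMIT TRANSACTION
GO
SET XACT_ABORT OFF;
GO
-- Option 2: TRY...CATCH with an explicit rollback in the CATCH block
BEGIN TRY
    BEGIN TRANSACTION
    INSERT INTO TestingTransactionRollbacks (ID) VALUES (1);
    INSERT INTO TestingTransactionRollbacks (ID) VALUES (NULL); -- error: control jumps to CATCH
    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;
END CATCH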

(more…)

Indexing for ORs

All of the indexing strategy posts I’ve written in the past have been concerned with predicates combined with ANDs. That’s only one half of the possibilities though. There’s the case of predicates combined with ORs, and the guidelines for indexing that work well with ANDs don’t work for ORs.

When dealing with predicates combined with AND, the predicates are cumulative, each one operates to further reduce the resultset.

For this reason, multi-column indexes support multiple predicates combined with AND operators.

If we look at a quick example, consider the following.

CREATE TABLE Customers (
  CustomerID INT IDENTITY PRIMARY KEY,
  Surname VARCHAR(30) NOT NULL,
  FirstName VARCHAR(30),
  Title VARCHAR(5),
  CustomerType CHAR(1) NOT NULL,
  IsActive BIT DEFAULT 1 NOT NULL,
  RegistrationDate DATETIME NOT NULL DEFAULT GETDATE()
);
CREATE INDEX idx_Customers_SurnameFirstName ON Customers (Surname, FirstName);

Again I’m going to be lazy and get SQLDataGenerator to generate a few rows.

With that two-column index on those columns and a query that looks for Surname = ‘Kelley’ AND FirstName = ‘Rick’, SQL can do a double-column seek to go directly to the start of the range and then just read down the index to the end of the range, basically until it finds the first row that it’s not interested in.
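For reference, that AND query (not shown in the original text, so this is my rendering of the query just described) would be:

SELECT CustomerID
  FROM Customers
  WHERE Surname = 'Kelley' AND FirstName = 'Rick';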

So how does that differ when the operator is an OR?

The main difference is that with an OR, the predicates are independent. The second doesn’t serve to reduce the resultset, but rather to expand it. It’s similar to evaluating two separate predicates and combining the results. Let’s have a look at that two-column index again when the two predicates are combined with an OR.

SELECT CustomerID
  FROM Customers
  WHERE Surname = 'Kelley' OR FirstName = 'Rick';

If we try to use that index to evaluate Surname = ‘Kelley’ OR FirstName = ‘Rick’, there’s a problem. While the first of those predicates can be evaluated with a seek (it’s a sargable predicate on the left-most column of the index), the second predicate cannot. It’s sargable, but it is on the second column of the index (and for the moment let’s assume there are no other indexes on the table). Seeks are only possible if the predicate filters on a left-based subset of the index key.

Hence, to evaluate that predicate, SQL will have to do an index scan. And since it has to do a scan to evaluate that predicate, it won’t bother also doing a seek to evaluate the first one, as it can evaluate that during the same scan.

Hence, in this case, the query will execute with a single index scan.

[Execution plan: the OR query running as a single index scan]

So how do we get this query to seek instead?
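As a rough sketch of the usual answer (the index name here is made up, and the full post covers the details properly): give the second predicate an index of its own, so that the optimizer can seek both indexes and combine the results, or rewrite the OR explicitly as a UNION of two seekable queries.

CREATE INDEX idx_Customers_FirstName ON Customers (FirstName);

-- Equivalent rewrite: each branch can now be satisfied by an index seek,
-- and the UNION removes any customers that match both predicates.
SELECT CustomerID FROM Customers WHERE Surname = 'Kelley'
UNION
SELECT CustomerID FROM Customers WHERE FirstName = 'Rick';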

(more…)

Meme Monday: I Got 99 SQL Problems And the Disk Ain’t One

From Tom LaRock’s latest idea

For a change I’m not going to mention performance (well, not more than once anyway) and I’m not going to preach backups or recoverability. Nor am I going to talk about technical issues like many others are doing.

Instead I’m going to briefly mention several (well, 9) problems that I’ve seen a lot in the last year or so. These certainly aren’t specific to databases, but I’ve been seeing them with regard to databases, so…

Poor/missing development environment

A development environment that’s months out of date with production, or that has different databases from production (e.g. databases that in the production environment sit on different servers), will cause problems. Either code will never work in dev (e.g. missing linked servers), and hence can’t be properly tested, or it works fine in dev and fails in production.

Either way, the appropriate response from the senior architect when told that the development environment is months out of date is not "Well, yes. I could have told you that. So what?"

Lax security

When everyone, from developers to help desk to business analysts, has full sysadmin access, things are going to go wrong sooner or later. Probably sooner. From new databases that no one knows anything about, to missing databases, altered settings that break backups or other maintenance, code changes that ‘no one’ made, missing data, etc.

This kind of setup is a bloody pain to fix; without strong support from management it’s near-impossible. Far better to start off correctly if possible.

Lack of change control

There must be some control over what changes are made, by whom, to what, and when. Developers should not just be deploying their changes whenever they feel like it, with no regard to what other people are doing, whether the system is in use, or who is inconvenienced.

Likewise there needs to be some record of what was done, when and why. If no one knows what is being done then when something breaks (and it’s when, not if) there will be no way to tell where to start looking.

 

(more…)

To TOP or not to TOP an EXISTS

Earlier this year I had a look at a query pattern that I often see on forums and in production code, that of the Distinct within an IN subquery. Today I’m going to look at a similar pattern, that being the use of TOP 1 within an EXISTS subquery.

Three tests. First, a straightforward exists with no correlation (no where clause linking it to an outer query). Second, an exists with a complex query (one with a non-sargable where clause, a group by and a having). Third, an exists subquery correlated to the outer query.

Table structures are nice and simple; in fact, for ease I’m going to use the same tables as I did back in the exists, in and inner join tests. Code to create and populate the tables is attached to the end of the post.

First up, a simple exists query, in an IF, just to be different.

IF EXISTS (SELECT 1 FROM PrimaryTable_Medium)
PRINT 'Exists'

IF EXISTS (SELECT TOP (1) 1 FROM PrimaryTable_Medium)
PRINT 'Exists too'

For a benchmark, a SELECT 1 FROM PrimaryTable_Medium has the following IO characteristics

Table ‘PrimaryTable_Medium’. Scan count 1, logical reads 89, physical reads 0.

SQL Server Execution Times:
CPU time = 15 ms,  elapsed time = 510 ms.

Ignore the elapsed time, that’s likely mostly from displaying the records. I’m going to focus mostly on the CPU and IO.
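For anyone wanting to reproduce these numbers: the original post doesn’t show the capture step, but output in this form comes from the STATISTICS IO and STATISTICS TIME session settings wrapped around the benchmark query.

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT 1 FROM PrimaryTable_Medium;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;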

Execution plans of the two exists variations are absolutely identical.

[Execution plan: identical plans for the two EXISTS variations]

The index operators are scans because there’s no way they could be anything else: there’s no predicate, so a seek is not possible. That said, it’s not a full index scan. The properties of the Index Scan show 1 row only (actual and estimated). So SQL did not read the entire index, just enough to evaluate the EXISTS, and that’s what it did in both cases. The IO stats confirm that.

(more…)

Statistics, row estimations and the ascending date column

SQL’s auto-updating statistics go a fair way towards making SQL Server a self-tuning database engine, and in many cases they do a reasonably good job.

However, there’s one place where the statistics auto-update often fails badly: the large table with an ascending column, where the common queries look for the latest rows.

Let’s have a look at a common scenario.

We have a large table (imaginatively called ‘Transactions’) with a datetime column (even more imaginatively called ‘TransactionDate’). This table gets about 80,000 new records a day and currently has around 8,000,000 records in it, so roughly another 1% is added to the table size daily. No records are updated and there’s a monthly purge of old data, so the total size remains about the same. A reasonably common real-life scenario.

CREATE TABLE Accounts (
AccountID INT IDENTITY PRIMARY KEY,
AccountNumber CHAR(8),
AccountType CHAR(2),
AccountHolder VARCHAR(50),
Filler CHAR(50) -- simulating other columns
)

CREATE TABLE Transactions (
TransactionID INT IDENTITY PRIMARY KEY NONCLUSTERED,
AccountID INT NOT NULL FOREIGN KEY REFERENCES Accounts (AccountID),
TransactionDate DATETIME NOT NULL DEFAULT GETDATE(),
TransactionType CHAR(2),
Amount NUMERIC(18,6),
Filler CHAR(150) -- Simulating other columns
)
GO
CREATE CLUSTERED INDEX idx_Transactions_TransactionDate
ON Transactions (TransactionDate)

CREATE NONCLUSTERED INDEX idx_Transactions_AccountID
ON Transactions (AccountID)

CREATE NONCLUSTERED INDEX idx_Accounts_AccountType
ON Accounts (AccountType)

-- Using RedGate's SQLDataGenerator to generate some data for this.

[The Accounts and Transactions tables, populated with SQL Data Generator]

Day 1 of the month, the indexes have just been rebuilt (after the data purge) and the statistics associated with those have been updated. The latest value in the TransactionDate column is ‘2011/01/31’ and the last value in the statistics histogram is ‘2011/01/31’. Life is good.

Day 2 of the month, there have been 80,000 new records added for the previous day. Only 1% of the table has been updated, so the automatic statistics update would not have triggered. The latest value in the TransactionDate column is ‘2011/02/01’ and the last value in the statistics histogram is ‘2011/01/31’. Doesn’t look like a problem.

Fast forward another couple of days, to day 5 of the month. By this point 300,000 rows have been added since the beginning of the month, which amounts to around 5% of the table, so the statistics auto-update (triggered at 20%) still would not have run. The latest value in the TransactionDate column is ‘2011/02/04’ and the last value in the statistics histogram is ‘2011/01/31’. Starting to look less than ideal.
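If you want to see the stale histogram for yourself (this check isn’t part of the original walkthrough), DBCC SHOW_STATISTICS will show it: the last RANGE_HI_KEY will still be the value captured at the last statistics update, no matter how many newer TransactionDate values are sitting in the table.

-- Show the histogram for the statistics on the clustered index
DBCC SHOW_STATISTICS ('Transactions', 'idx_Transactions_TransactionDate') WITH HISTOGRAM;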

So, what kind of effect does this have on the queries against that table? (more…)

Q & A from 24 Hours of PASS session

Earlier this week I took part in the 24 Hours of PASS live webcast event. There were far more questions at the end of the presentation than I could answer online, so the answers to the rest are given here. (I have edited some of the questions for spelling, grammar and readability.) The slide deck and the demos I was using are linked at the end. I’m not including a backup of the DB, as it’s 60MB; the schema is there, as is a SQL Data Generator project.

Q: Is the problem here the addition of checking whether the variable is null? (This was related to catch-all queries.)

No, not specifically. The problem is the query pattern of <column> = <parameter> OR <parameter> = <constant>. It’s not specific to the case where that constant is NULL, though that is the more common form that I’ve seen.
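For illustration (the column and parameter names here are hypothetical), the pattern looks something like this:

SELECT CustomerID, Surname, FirstName
FROM Customers
WHERE (Surname = @Surname OR @Surname IS NULL)
  AND (FirstName = @FirstName OR @FirstName IS NULL);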

Q: What are the good alternatives for catch-all queries?

If you are on SQL 2008 SP2 or later, you can add OPTION (RECOMPILE) to the catch-all query and get a plan that is optimal for each particular set of parameters.

On SQL 2005 and earlier, even with a recompile you would not get an optimal plan. This is because the optimizer was required to create a plan that was safe for reuse, even if it was never going to be reused.

On builds of SQL 2008 prior to SP1, the use of OPTION (RECOMPILE) worked, but there was a related bug that could produce inaccurate results, so use it with caution.

On builds of SQL 2008 between SP1 and SP2 (or SP1 CU5), the behaviour of a catch-all with recompile was the same as for SQL 2005.

If you’re using SQL 2005 (or, horrors, SQL 2000), or a build of SQL 2008 that didn’t have the recompile option working, you can use dynamic SQL (there are advantages to dynamic SQL even on the latest builds of SQL 2008). Use the parameters passed to build up a string that filters only on the columns for which parameters were passed, then use sp_executesql to run the parameterised dynamic SQL statement, as sketched below.
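A rough sketch of that approach, with made-up procedure, column and parameter names (the linked post below has the full treatment):

CREATE PROCEDURE SearchCustomers
    @Surname   VARCHAR(30) = NULL,
    @FirstName VARCHAR(30) = NULL
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX);
    SET @sql = N'SELECT CustomerID, Surname, FirstName FROM Customers WHERE 1 = 1';

    -- Append filters only for the parameters that were actually passed...
    IF @Surname IS NOT NULL
        SET @sql = @sql + N' AND Surname = @Surname';
    IF @FirstName IS NOT NULL
        SET @sql = @sql + N' AND FirstName = @FirstName';

    -- ...and keep the statement parameterised so the resulting plans can be safely reused.
    EXEC sp_executesql @sql,
        N'@Surname VARCHAR(30), @FirstName VARCHAR(30)',
        @Surname = @Surname, @FirstName = @FirstName;
END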

Details are available on my blog – https://www.sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

Q: If you don’t have access to run the execution plan, but have to have DBAs run it for you, is there a good way to have it captured?

Absolutely. In SQL 2005 and above, the execution plan is an XML document. You can save the plan as a .sqlplan file, which can then be copied, emailed, etc. You can open it in Management Studio and SSMS will display the graphical plan.

Or, you could download Plan Explorer and open the plan in that.

(more…)

Full backups, the log chain and the COPY_ONLY option.

There have been a large number of posts made regarding the interaction between full database backups and the log chain (like this one). However there still seems to be a lot of confusion around regarding how they interact. I doubt I can clear up the confusion with yet another blog post, but I’ll try.

One suggestion that seems to come up now and again is the COPY_ONLY option on backups. Typically the comment is something like ‘Use the COPY_ONLY option when taking ad-hoc full backups so that you don’t impact the log backups.’ Now, we know from the blog posts linked above that full backups never break the log chain (and I’m not going to run yet more tests to prove it), so what is the COPY_ONLY option there for?

Books Online states the following regarding the COPY_ONLY option for backups – “Specifies that the backup is a copy-only backup, which does not affect the normal sequence of backups. A copy-only backup is created independently of your regularly scheduled, conventional backups. A copy-only backup does not affect your overall backup and restore procedures for the database.”

Well, that doesn’t clear things up much. It does however go on to say this: “When used with BACKUP DATABASE, the COPY_ONLY option creates a full backup that cannot serve as a differential base. The differential bitmap is not updated, and differential backups behave as if the copy-only backup does not exist. Subsequent differential backups use the most recent conventional full backup as their base.”

So it’s not the log chain that COPY_ONLY is there to avoid affecting; it’s the differential base. Let’s test and see how it works.
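For reference, the command itself is nothing more than an extra option on the ad-hoc full backup (the database and file names here are made up):

BACKUP DATABASE SomeDB
TO DISK = 'D:\Backups\SomeDB_AdHoc.bak'
WITH COPY_ONLY;
-- The differential base is left untouched, so later differential backups
-- still use the most recent conventional full backup as their base.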

(more…)

And now for a completely inappropriate use of SQL Server

A while back I wrote up a short introductory overview of Genetic Algorithms. Just for the sheer, absolute fun of it, I thought I’d implement a basic genetic algorithm within SQL Server and use it to solve a form of the knapsack problem.

Now, first a few comments on this. As the title states, this is not really an appropriate use of SQL Server. Genetic algorithms are generally implemented in languages like Java, C++ or C#; languages that are good at complex mathematics and string manipulation, and that have complex data types. I’m not necessarily using efficient, well-performing methods here; UDFs abound. This is not an article on best practices and well-performing code. I’m also doing no error handling, which I would if this were a real system (in a more suitable language).

Still, doing it just for the sake of seeing whether it’s possible is all sorts of fun. So, without further ado: the knapsack problem, an approximate solution with genetic algorithms in SQL Server. (For the record, this is a multi-constrained, bounded knapsack problem.)

The scenario

There’s a bag that has a maximum volume and a maximum mass that it can hold (and we assume that we can pack perfectly with no wasted space). There are eight items, all with different masses, different volumes and different values. The goal here is to maximise the total value of all the items contained within the bag.

CREATE TABLE ObjectStatistics (
  ObjectNumber TINYINT NOT NULL,
  Mass NUMERIC(4,2) NOT NULL,
  Volume NUMERIC(4,2) NOT NULL,
  Value NUMERIC(4,2) NOT NULL,
  NumberAvailable TINYINT NOT NULL,
  CONSTRAINT pk_ObjectStats PRIMARY KEY CLUSTERED (ObjectNumber)
);

CREATE TABLE BagStatistics (
  MaxMass NUMERIC(5,2),
  MaxVolume NUMERIC(5,2)
);

INSERT INTO dbo.ObjectStatistics (ObjectNumber, Mass, Volume, Value, NumberAvailable)
VALUES
  (1,0.5,0.45,2,50),
  (2,1.5,0.25,20,10),
  (3,0.1,1.25,8,150),
  (4,0.25,0.1,1,250),
  (5,0.67,0.3,6,100),
  (6,0.34,0.75,18,5),
  (7,1.25,0.25,5,40),
  (8,1.1,0.8,10,25);

INSERT INTO dbo.BagStatistics (MaxMass, MaxVolume)
VALUES  (100, 75);

Those two tables set up the constraints for the scenario: the maximum mass and volume for the bag, and the mass, volume, value and maximum number available for each of the items.
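As a quick sanity check (not part of the original post), packing every available item would blow both limits, so the constraints genuinely bind and some selection of items is required:

-- Totals if every available item were packed, versus MaxMass = 100 and MaxVolume = 75
SELECT SUM(Mass * NumberAvailable)   AS TotalMass,
       SUM(Volume * NumberAvailable) AS TotalVolume,
       SUM(Value * NumberAvailable)  AS TotalValue
FROM dbo.ObjectStatistics;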

(more…)