Catch-all queries

13 March 2018: There is an updated post on this subject – https://www.sqlinthewild.co.za/index.php/2018/03/13/revisiting-catch-all-queries/

The query form that I refer to as ‘Catch-all’ typically results from search screens in the application, where the user may enter any one (or more) of a number of optional parameters. One of the more common ways for such a query to be written in SQL is with multiple predicates in the where clause of the form (WHERE SomeColumn = @SomeVariable OR @SomeVariable IS NULL).

Now this does work; the problem is that it works fairly inefficiently and, on large tables, can result in really poor query performance. I’m going to take a look at why that is and what the alternatives are.

Erland Sommarskog has written on this as well, and in a lot more detail than I’m going to. His article on dynamic search conditions is well worth reading, as are the rest of his articles.

A typical example of a ‘catch-all’ query would be this one, based off a table in the AdventureWorks database.

CREATE PROCEDURE SearchHistory
(@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
AS
SELECT ProductID, ReferenceOrderID, TransactionType, Quantity,
TransactionDate, ActualCost FROM Production.TransactionHistory
WHERE (ProductID = @Product OR @Product IS NULL)
AND (ReferenceOrderID = @OrderID OR @OrderID IS NULL)
AND (TransactionType = @TransactionType OR @TransactionType IS NULL)
AND (Quantity = @Qty OR @Qty IS NULL)
GO

Now, let’s say that I run that query and pass values for the ProductID and the Transaction type. Let’s further say that there’s a nonclustered index (called idx_TranHistory_TranTypeProductID) on those two columns.

EXEC SearchHistory @Product = 978, @TransactionType = 'W'

Now this returns 52 rows out of the 980,000 in the table, so we’d expect SQL to use an index seek on that index, followed by a bookmark lookup.

Nope. It’s using that index all right, but it’s doing a scan, not a seek. Ok, not great, but not bad. Let me try a different set of parameters:

EXEC SearchHistory @Qty = 100

The plan’s exactly the same. No surprise; it was cached the first time and then reused. There’s a problem here though: the index that’s used is completely inappropriate, and there’s a bookmark lookup that ran almost a million times. No wonder this execution took 3 seconds and 2,949,715 IOs to return 29 rows.
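
(If you want to reproduce those figures, one way, as a quick sketch, is to switch the statistics output on before running the procedure; the time and IO numbers then appear in the messages output.)

SET STATISTICS IO ON
SET STATISTICS TIME ON

EXEC SearchHistory @Qty = 100

SET STATISTICS IO OFF
SET STATISTICS TIME OFF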

Ok, so let me try a different form of the catch-all query:

CREATE PROCEDURE SearchHistory_Improved
(@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
AS
SELECT ProductID, ReferenceOrderID, TransactionType, Quantity, TransactionDate, ActualCost FROM Production.TransactionHistory
WHERE (ProductID = CASE WHEN @Product IS NULL THEN ProductID ELSE @Product END)
AND (ReferenceOrderID = CASE WHEN @OrderID IS NULL THEN ReferenceOrderID ELSE @OrderID END)
AND (TransactionType = CASE WHEN @TransactionType IS NULL THEN TransactionType ELSE @TransactionType END)
AND (Quantity = CASE WHEN @Qty IS NULL THEN Quantity ELSE @Qty END)
GO

Let’s see what that does for the first test:

EXEC SearchHistory_Improved @Product = 978, @TransactionType = 'W'

Well, that’s no better. Full-blown table scan.

The problem with these types of queries is that there is no stable plan. The optimal plan differs completely depending on which parameters are passed. The optimiser can tell that, and it plays safe by creating plans that will always work. That’s (one of the reasons) why in the first example it was an index scan, not an index seek.

The downside of the safe plan is that it’s highly unlikely to be a good plan and, even if it is, it won’t be good for all possible combinations of parameters.

So, how to handle this type of query? Well, there are typically two ways.

Recompile

This is only an option on SQL 2008. On 2008, if the query is specified with the OPTION (RECOMPILE) hint, the optimiser knows that the plan will never be reused, so it doesn’t have to play safe. In fact, if I add that hint to the query in the first example, I get the expected index seek.
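
As a sketch, the first procedure with the hint added would look something like this (the name SearchHistory_Recompile is mine, not part of the original example):

CREATE PROCEDURE SearchHistory_Recompile
(@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
AS
SELECT ProductID, ReferenceOrderID, TransactionType, Quantity,
TransactionDate, ActualCost FROM Production.TransactionHistory
WHERE (ProductID = @Product OR @Product IS NULL)
AND (ReferenceOrderID = @OrderID OR @OrderID IS NULL)
AND (TransactionType = @TransactionType OR @TransactionType IS NULL)
AND (Quantity = @Qty OR @Qty IS NULL)
OPTION (RECOMPILE)
GO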

Dynamic SQL

The other option is to build the query string up dynamically based on the parameters passed, and then to use sp_executesql to run it. There are the usual downsides to dynamic SQL but, in this case, the performance improvement may well be worth it.

CREATE PROCEDURE SearchHistory_Dynamic (@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
AS
DECLARE @sSQL NVARCHAR(2000), @Where NVARCHAR(1000) = ''
SET @sSQL = 'SELECT ProductID, ReferenceOrderID, TransactionType, Quantity, TransactionDate, ActualCost
FROM Production.TransactionHistory '

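-- Append a predicate for each parameter that was actually supplied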
IF @Product IS NOT NULL
SET @Where = @Where + 'AND ProductID = @_Product '
IF @OrderID IS NOT NULL
SET @Where = @Where + 'AND ReferenceOrderID = @_OrderID '
IF @TransactionType IS NOT NULL
SET @Where = @Where + 'AND TransactionType = @_TransactionType '
IF @Qty IS NOT NULL
SET @Where = @Where + 'AND Quantity = @_Qty '

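-- Strip the leading 'AND' (RIGHT drops the first three characters) before prefixing WHERE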
IF LEN(@Where) > 0
SET @sSQL = @sSQL + 'WHERE ' + RIGHT(@Where, LEN(@Where)-3)

EXEC sp_executesql @sSQL,
N'@_Product int, @_OrderID int, @_TransactionType char(1), @_Qty int',
@_Product = @Product, @_OrderID = @OrderID, @_TransactionType = @TransactionType, @_Qty = @Qty

GO

Note that there’s no SQL injection vulnerability in this. The parameters are never concatenated into the string and the execution is parametrised.
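
For contrast, a version that concatenated the values themselves, something like the hypothetical fragment below, is exactly what opens the injection hole once string parameters are involved:

-- Don't do this: the value is embedded in the string instead of being passed as a parameter
SET @Where = @Where + 'AND TransactionType = ''' + @TransactionType + ''' '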

Now each different set of parameters gets a different cached plan, optimal for that particular set of parameters.

EXEC SearchHistory_Dynamic @Product = 978, @TransactionType = 'W'

EXEC SearchHistory_Dynamic @Qty = 100

The first gets an index seek, the second a clustered index scan (because there’s no index on Quantity). Much better than the behaviour with the earlier non-dynamic versions.
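
If you want to confirm that each variation gets its own cached plan, a rough sketch using the plan cache DMVs (available from SQL 2005 onwards) would be something like this:

SELECT st.text, cp.objtype, cp.usecounts
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE '%Production.TransactionHistory%'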

77 Comments

  1. Bikerdad

    G’morning Gail,

    I just got referred to this post from somewhere or other, and since I use a variety of NULL conditions, I was wondering how my preferred structure performs, since I don’t see it above.

    I often use this structure:

    WHERE (CASE WHEN @ProductID IS NULL THEN 1
    WHEN @ProductID = ProductID THEN 1
    ELSE 0
    END) = 1

    More efficient, same, less efficient, or truly horrible?

    When using multi-value parameters, I try to always bust the passed in string down to a table (variable or temp, depending on my mood), then join to the result.

    One definite takeaway for me will be to use the OPTION(RECOMPILE) in any of my procs where I’m allowing a NULL parameter.

  2. Gail (Post author)

    Same (I tested it on a forum post, not here).

    It has the same performance effects as the WHERE (@ProductID IS NULL OR ProductID = @ProductID) form; however, I don’t know if OPTION(RECOMPILE) will help it, as that depends on whether the optimiser can recognise the construct or not (also make sure you’re on 2008 SP2 or above before trying RECOMPILE).

  3. Charlie Gildawie

    Yo Gail.

    Is there any reason that you do a check on the @where and then strip the first AND, instead of starting @where with a static ‘WHERE 1 = 1 ’ and then adding the clauses with ANDs?

    I generally find it’s a little easier to debug because you are doing less string manipulation.

    Cheers.
    Transact Charlie.

  4. Gail (Post author)

    Because I heartily dislike the WHERE 1=1 .. form in the where clause and I would never willingly use or promote code that contains it.

  5. Aaron

    Wonderful article. Thanks for posting. It was exactly the help I needed.

  6. Charlie

    This is very nice – BUT you cannot use it in a table-valued function (TVF), since it uses an EXEC!
    Any hint on how to use a dynamic WHERE in a TVF?

  7. rey

    Nice article 🙂

  8. Mark

    Gail – How might we modify this to use LIKE and IN? If I modify it as follows:

    IF @firstname is not null
    SET @Where = @Where + 'AND OrderID LIKE (''@_OrderID%'') '

    … PRINT @sSQL produces:
    WHERE firstname LIKE ('@_OrderID%')

    … which looks and executes correctly when run manually. However, I receive no results from the stored proc when run. Any ideas?

  9. Mark

    Apologies – I screwed up the code in my previous post by trying to adapt it to your example. Please ignore and consider this instead:

    Gail – How might we modify this to use LIKE and IN? If I modify it as follows:

    IF @OrderID is not null
    SET @Where = @Where + 'AND OrderID LIKE (''@_OrderID%'') '

    … PRINT @sSQL produces:
    WHERE OrderID LIKE ('@_OrderID%')

    … which looks correct as far as I can tell. However, I receive no results from the stored proc when run. Any ideas?

    1. Gail (Post author)

      LIKE @Order + ''%''
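
      In the dynamic SQL example above, that would look something like the sketch below (assuming the column and parameter are character types; with an int parameter the concatenation would fail):

      IF @OrderID IS NOT NULL
      SET @Where = @Where + 'AND OrderID LIKE @_OrderID + ''%'' '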

  10. Pingback: Case TSQL statement in WHERE clause causing scanning (ss2005)

  11. peterG

    Gail, in Erland Sommarskog’s blog that you reference above, he suggests that building the query including the parameterized WHERE clause in client-side code is perfectly acceptable. Any thoughts?

  12. Gail (Post author)

    Perfectly valid, just make sure that it is parameterised correctly.

    I showed it as a stored proc because I often encounter the ‘catch-all’ form in existing procs and have to tune it without making any changes to the client app.

  13. Jeff

    Thank you for the great, simple write-up. I chose to use dynamic SQL, which I have used before, but not in such a clean way. Another thing I like about doing the parameterised list is that it will often lead to shorter SQL statements, which is helpful when you have large queries that are approaching the 4000 NVARCHAR limit.

  14. George

    Thank you for a great article! It really helped me. I have a small query though. How much of a performance advantage, in terms of time, does the dynamic query have over the other methods?

  15. Eduardo

    I have a stored procedure in which I need to select records that match at least one parameter in the WHERE clause.

    example:

    CREATE PROCEDURE [dbo].[SPPeople]
    (
    @Name varchar(50) = null,
    @Status1 char(1) = '0',
    @Status2 char(1) = '0',
    @Status3 char(1) = '0'
    )
    as

    SELECT People.Name, People.Age

    FROM People

    WHERE
    (People.Name = CASE WHEN @Name is null THEN People.Name ELSE @Name END) AND
    ( (People.Status1 = CASE WHEN @Status1 = '0' THEN People.Status1 ELSE @Status1 END) OR
    (People.Status2 = CASE WHEN @Status2 = '0' THEN People.Status2 ELSE @Status2 END) OR
    (People.Status3 = CASE WHEN @Status3 = '0' THEN People.Status3 ELSE @Status3 END) )

    ——

    If parameters @Status1 and @Status2 both equal '1', the query should return records that have Status1 OR Status2 equal to '1', regardless of Status3.

    But the query above returns all records when I set @Status1 and @Status2 to '1'.

    1. Gail (Post author)

      Can you post this question on one of the SQL forums please?

  16. Jhogue

    Am I wrong in taking this as “parameter sniffing”, and would it help if a local variable holding a copy of the parameter value were used in the T-SQL statements?

    1. Gail (Post author)

      It’s related to parameter sniffing, but it’s not parameter sniffing and local variables won’t suddenly give you a good plan. At best, they’ll give you a stable, bad plan.

  17. Jared Karney

    If my catch-all only has 1 parameter, would it be better to create 2 additional procs and nest them in the first; i.e. only adding 1 extra condition based on a not null parameter? In the case I am referring to, they are either returning all rows in the set if the parameter is null, or filtered if the parameter is not null.

  18. Nilanaksha B

    Nice article. I’d like to see more articles like this.

  19. Charlie

    Thank you, thank you, thank you! Great solution, and enough detail to understand what you’re getting at without getting too into the weeds.

  20. Commenter Man

    Thank you Thank you Thank you

  21. cari

    I had a consultant insist on no dynamic SQL, and all new queries were constructed using common table expressions. I am comparing performance between parameterised dynamic SQL and CTEs, because the results are being filtered on multiple delimited value lists as well as a general string search. Have you ever used CTEs to perform dynamic filtering on delimited string lists? I’m wondering if this is sacrificing performance for principles. I reviewed the query plans, and it seems that having to do the splits and EXISTS checks, whether a filter value is included or not, is a bit more costly. Having said that, after the initial execution, subsequent calls are much faster. Would appreciate your feedback. Thanks!

    cte example:

    proc (@fkid1 bigint, @valueslist1…, @valueslist2…, @srchtext…)
    as…

    with cte_filter1(value)
    as (select value from dbo.split(@valueslist1, N',')),
    cte_filter2(value)
    as (select value from dbo.split(@valueslist2, N',')),
    cte_results1 (id, fkid1, value, filtercolumn1, filtercolumn2)
    as (select t.id, t.fkid1, t.value, t.filtercolumn1, t.filtercolumn2
    from dbo.table1 t
    where t.fkid1 = @fkid1

    and (
    @valueslist1 is null
    or exists(select f1.value from cte_filter1 f1
    where f1.value = t.filtercolumn1))

    and (@valueslist2 is null
    or exists(select f2.value from cte_filter2 f2
    where f2.value = t.filtercolumn2))
    )
    select id, fkid1, value, filtercolumn1, filtercolumn2
    from cte_results1
    where value like ('%' + @srchtext + '%')

    1. Gail (Post author)

      A CTE has no magic to do filtering like this. It’s just a named subquery. You’re not gaining or losing performance by using a CTE over a single query.

      Second and subsequent executions are always faster; the initial one has to cache both the data and the query plan. Generally, ignore the first execution and look at the performance of later ones.

  22. Pingback: Resolved: How to compare Entity framework Any() with lower comparing objects - Daily Developer Blog
