Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Jackie Broph » Sun, 10 Feb 2002 04:10:36



SQL 7 SP2, NT 4 SP6a

Hi all,

Attached is a file that contains two queries and their execution plans. The
only difference between them is that one has a hard-coded value in the
where clause, while the other uses a variable. The variable is, of course,
the real-life situation, because this was extracted from a stored
procedure. The hard-coded version is the better performer. I scanned the
KB for an answer and could not find one. If the answer is to move to SP3,
I cannot do that due to vendor restrictions. I also tried the trick of
setting the variable to another variable within the stored proc, but that
did not change the plan either. Any hints? Both plans use the right
indexes, but the "bad" plan does a Merge Join/Right Outer Join that does
not exist in the "good" plan. Thanks.

Jackie

 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Narayana Vyas Kondreddi » Sun, 10 Feb 2002 04:17:10


Jackie, it is because the optimizer does not know what value the variable
will contain while it is optimizing; the value is only determined at run
time.
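
To see the difference, compare the two forms below (a minimal sketch --
the table, column, and value are hypothetical, not taken from Jackie's
attachment):

     -- Hard-coded value: the optimizer can look up the literal 10 in
     -- the column's statistics and estimate the rowcount precisely.
     SELECT * FROM table1 WHERE c1 = 10

     -- Variable: the value is unknown when the plan is compiled, so
     -- the optimizer falls back on the column's average density, which
     -- can produce a very different (and here, worse) plan.
     DECLARE @v int
     SET @v = 10
     SELECT * FROM table1 WHERE c1 = @v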
--
HTH,
Vyas, MVP (SQL Server)

http://vyaskn.tripod.com/



 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Andrew J. Kelly » Sun, 10 Feb 2002 04:21:56


Jackie,

Here is the best explanation I have seen yet:

The reason for the performance difference stems from a feature called
"parameter sniffing".  Consider a stored proc defined as follows:

     CREATE PROC proc1 (@p1 int) AS
     SELECT * FROM table1 WHERE c1 = @p1
     GO

Keep in mind that the server has to compile a complete execution plan for
the proc before the proc begins to execute.  In 6.5, at compile time SQL
didn't know the value of @p1, so it had to make a lot of guesses
when compiling a plan.  Suppose all of the actual parameter values for
@p1 that the application passed in were unique
integers that were greater than 0, but suppose 40% of the [c1] values in
[table1] were, in fact, 0.  SQL would use the average density of the
column to estimate the number of rows that this predicate would return;
this would be an overestimate, and SQL might choose a table scan
over an index seek based on the rowcount estimates.  A table scan would
be the best plan if the parameter value was 0, but unfortunately 0 is
never or rarely the value actually passed in, so the performance of the
stored proc for more typical parameters suffers.

In SQL 7.0 or 2000, suppose you executed this proc for the first time
(when its plan was not yet in cache) with the command "EXEC proc1 @p1 =
10".  Parameter sniffing allows SQL to insert the known value of
@p1 into the query at compile time, before a plan for the query is
generated.  Because SQL knows that @p1 is 10, it can
compile a plan that is tailored to the class of parameters that is
actually passed into the proc, so for example it might select an index
seek instead of a table scan based on the smaller estimated rowcount --
this is a good thing if most of the time 0 is not the value passed as
@p1.  Generally, this feature allows SQL to generate more efficient
execution plans, but a key requirement for everything to work as expected
is that the parameter values used for compilation be "typical".

In your case, the problem is that you have default NULL values for your
parameters, but
the parameter values are changed inside the stored proc before they are
used -- as a result NULL will never actually be used to search the
column.  If the first execution of this stored proc doesn't pass in an
explicit value, parameter sniffing will assume that the parameter will
be NULL.  When SQL compiles the plan for this sp it substitutes NULL for
each occurrence of the parameter that is embedded within a query.
Unfortunately, after execution begins the first thing the stored proc
does is change the parameter to a non-NULL value, but
SQL doesn't know about this at compile time.  Because NULL
is a very atypical parameter value, the plan that SQL generates may not
be a good one for the new value of the parameter that is assigned at
execution time.
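
A minimal sketch of the pattern being described (the proc, table, and
column names are hypothetical):

     CREATE PROC proc2 (@p1 int = NULL) AS
     -- If the caller omits @p1, the plan for the query below is
     -- compiled for @p1 = NULL, even though NULL is replaced before
     -- the query ever runs.
     IF @p1 IS NULL
         SET @p1 = 10
     SELECT * FROM table1 WHERE c1 = @p1
     GO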

So, the bottom line is that if you assign defaults to your sp parameters
and later use those same parameters in a query, the defaults should be
"typical" because they will be used during plan generation.  If you must
use defaults and business logic dictates that they be atypical (as may be
the case here if app modifications are not an option), there are two
possible solutions if you determine that the substitution of atypical
parameter values is causing bad plans:

1. "Disable" parameter sniffing by using local DECLARE'd variables that
you SET equal to the parameters inside the stored proc, and use the local
variables instead of the offending parameters in the queries. This is the
solution that you found yourself.  SQL can't use parameter sniffing in
this case so it must make some guesses, but in this case the guess based
on average column density is better than the plan based on a specific but
"wrong" parameter value (NULL).

2. Nest the affected queries somehow so that they run within a different
context that will require a distinct execution plan.  There are several
possibilities here, for example:
    a. Put the affected queries in a different "child" stored proc.  If
you call the child proc only after the parameter value
has been changed to its final value, parameter sniffing will suddenly
become your friend, because the value SQL uses to compile the queries
inside the child stored proc is the actual value that will be used in the
query.
    b. Use sp_executesql to execute the affected queries.  The plan won't
be generated until the sp_executesql statement actually runs, which is of
course after the parameter values have been changed.
    c. Put the query in a child stored proc just like 2.a, but execute it
within the parent proc with EXEC ... WITH RECOMPILE, so that a fresh plan
is compiled on every execution.
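
To make option 1 concrete, here is a sketch applied to the hypothetical
proc2 above (names assumed, not from the original post):

     CREATE PROC proc2_fixed (@p1 int = NULL) AS
     -- Copy the parameter into a local variable.  A local variable
     -- cannot be sniffed, so the optimizer estimates from the column's
     -- average density instead of compiling for the atypical value NULL.
     DECLARE @local_p1 int
     SET @local_p1 = COALESCE(@p1, 10)
     SELECT * FROM table1 WHERE c1 = @local_p1
     GO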
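
And a sketch of options 2.a and 2.c, again with hypothetical names:

     -- Child proc: its plan is compiled on first execution, by which
     -- time @p1 already holds its final, typical value.
     CREATE PROC proc2_child (@p1 int) AS
     SELECT * FROM table1 WHERE c1 = @p1
     GO

     CREATE PROC proc2_parent (@p1 int = NULL) AS
     IF @p1 IS NULL
         SET @p1 = 10
     EXEC proc2_child @p1                    -- option 2.a
     -- EXEC proc2_child @p1 WITH RECOMPILE  -- option 2.c: new plan per call
     GO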

Option #1 seems to have worked well for you in this case, although
sometimes one of the options in #2 is a preferable choice.  Here are some
guidelines, although when you're dealing with something as complicated as
the query optimizer experimentation is often the best approach <g>:

   - If you have only one "class" (defined as values that have similar
density in the table) of actual parameter value that is used within a
query (even if there are other classes of data in the base table that are
never or rarely searched on), 2.a. or 2.b is probably the best option.
This is because these options permit the actual parameter values to be
used during compilation which should result in the most efficient query
plan for that class of parameter.
   - If you have multiple "classes" of parameter value (for example, for
the column being searched, half the table data is NULL, the other half
are unique integers, and you may do searches on either class), 2.c can be
effective.  The downside is that a new plan for the query must be
compiled on each execution, but the upside is that the plan will always
be tailored to the parameter value being used for that particular
execution.  This is best when there is no single execution plan that
provides acceptable execution time for all classes of parameters.

HTH -
Bart
------------
Bart Duncan
Microsoft SQL Server Support

Please reply to the newsgroup only - thanks.

This posting is provided "AS IS" with no warranties, and confers no
rights.

--
Andrew J. Kelly,  SQL Server MVP
TargitInteractive

 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Jackie Broph » Sun, 10 Feb 2002 04:34:30


Thanks. I did try option 1, to no avail... oh well, I am trying to trim a
few seconds off of this proc, but I guess I'll have to let it run as is.
Thanks to both of you.

Jackie



 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Steve Dassin » Sun, 10 Feb 2002 05:45:21




> Jackie, it is because the optimizer does not know what value the variable
> will contain while it is optimizing; the value is only determined at run
> time.

MS 'sniffing' could be anticipated from this section of BOL
on sp_executesql.

Reusing Execution Plans
In earlier versions of SQL Server, the only way to be able to reuse
execution plans was to define the Transact-SQL statements as a stored
procedure and have the application execute the stored procedure. This
generates extra administrative overhead for the applications. Using
sp_executesql can help reduce this overhead while still allowing SQL Server
to reuse execution plans. sp_executesql can be used instead of stored
procedures when executing a Transact-SQL statement a number of times, when
the only variation is in the parameter values supplied to the Transact-SQL
statement. Because the Transact-SQL statements themselves remain constant
and only the parameter values change, the SQL Server query optimizer is
likely to reuse the execution plan it generates for the first execution.
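
For example (a sketch -- the table and parameter names are assumed), the
parameterized form keeps the statement text constant so the plan can be
reused across calls:

     -- Only the value bound to @c1 changes from call to call, so the
     -- plan compiled for the first execution can be reused.
     EXEC sp_executesql
         N'SELECT * FROM table1 WHERE c1 = @c1',
         N'@c1 int',
         @c1 = 10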

This whole situation is nothing more than MS making
the case for the use of dynamic SQL as a hedge against
the *high* probability that one size doesn't fit all.
There isn't enough CROW to go around :).


------------
voodoo

 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Isaac Blan » Sun, 10 Feb 2002 06:09:58


Glad you like it...



> ------------
> voodoo

 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Steve Dassin » Sun, 10 Feb 2002 07:19:41



> Glad you like it...




> > ------------
> > voodoo

That's what I *really* get:)
 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Narayana Vyas Kondreddi » Sun, 10 Feb 2002 07:28:08


Steve and Isaac, you both are making me laugh now :-)

--
HTH,
Vyas, MVP (SQL Server)

http://vyaskn.tripod.com/



> Glad you like it...




> > ------------
> > voodoo

> That's what I *really* get:)
 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Steve Dassin » Sun, 10 Feb 2002 07:37:16




> Steve and Isaac, you both are making me laugh now :-)

That is a very good thing, oh great one :)
 
 
 

Why would using a variable in a where clause produce a different, less optimal query plan vs. hard coded value

Post by Erland Sommarskog » Tue, 12 Feb 2002 08:10:58



> Here is the best explanation I have seen yet:

> The reason for the performance difference stems from a feature called
> "parameter sniffing".  Consider a stored proc defined as follows:

>      CREATE PROC proc1 (@p1 int) AS
>      SELECT * FROM table1 WHERE c1 = @p1
>      GO

> Keep in mind that the server has to compile a complete execution plan for
> the proc before the proc begins to execute.  In 6.5, at compile time SQL
> didn't know the value of @p1, so it had to make a lot of guesses
> when compiling a plan.

This is not correct. SQL 6.5 (and 4.2) also knew of the parameter value,
and built the plan accordingly.

Otherwise the article, which I believe Andrew has taken from Bart Duncan,
is right on the money.

--
Erland Sommarskog, Abaris AB

SQL Server MVP

 
 
 
