This week I ran across a new-to-me option in some T-SQL code I was reviewing: OPTION (RECOMPILE). It turns out that this option improves performance by forcing SQL Server to recompile the query rather than using the plan it already generated.
Wait, what? Don’t we save time by saving a query plan and reusing it?
Generally, yes. When you first run a stored procedure, the SQL engine will figure out a plan for that procedure and cache the plan for reuse, thus saving the time of recomputing the it each time the procedure is run. The problem comes in when a plan is generated that makes sense for the current database state and parameters, but doesn’t make sense when those change. For example, suppose you have a table with ten million rows; different approaches are required if you want to retrieve ten of those rows vs seven million of them.
Generating the query plan
When the stored procedure is executed, SQL Server looks at the distribution statistics it has about the data in the tables the procedure references and uses this data to guess at the best way to run the query. Assuming that the statistics are up to date, it can generally make a fairly good guess; for example, it may choose an index seek if it expects to return a small number of rows, but a table scan if it expects that a significant percentage of the table will be returned.
Sometimes, the output of the procedure depends heavily on the value of a parameter; one value might result in returning one row, while another returns a million rows. When you run the procedure, SQL Server sniffs the parameters and uses whose to choose the query plan. If a query is called with literal values, you get one plan for each call, but if it is parameterized, the same plan is used the next time the procedure is called even if the value of the parameter is very different. If the procedure behaves similarly as the parameter varies, this works fine, and you save the cost of creating a new plan each time. If it doesn’t, however, you can end up with a plan that is orders of magnitude slower than optimal.
Telling SQL Server to recompile forces it to create a new query plan rather than reusing the old one. When the old plan was “good enough”, this is a bad thing – generating the plan is expensive. In some cases, though, the performance improvement from using a better plan easily overwhelms the cost of recomputing that plan.
One way to do this is to alter the stored procedure to include the WITH COMPILE option; this tells SQL Server not to cache a plan, so the query plan is regenerated each time. Alternatively, we can add OPTION (RECOMPILE) at the statement level, in which case only the plans for that particular statement, rather than the entire procedure, will be regenerated.
Being bossy with SQL Server
Forcing a recompile isn’t the only way to take control away from SQL Server. Another option is to use the OPTIMIZE FOR hint, when you know that the plan for a particular value will work well for a wide range of possible parameter values; in this case, you use domain knowledge to improve on what SQL Server could manage with statistics alone. You can also use OPTIMIZE FOR UNKNOWN to keep SQL Server from optimizing based on the particular value that was passed in.
Remember the disadvantages of taking control away: you lose the performance benefits of reusing a query plan, and if you make a bad choice as to what value to optimize for (or the data changes to make it a poor choice), you could see much worse performance than if you just leave SQL Server alone to do its own thing. As usual, you want to tell SQL Server how you want it to process something (rather than just what you want it to do) only when you have a good reason for doing so.