Vita_ArrayAsTable - notes about performance

Jun 7, 2016 at 7:13 AM
I have noticed that some queries that use "IN" clause along with "Vita_ArrayAsTable" take a long time (in seconds) and returns only a few lines. The table to which are referred has about a million rows. I found out ,investigating the query execution plan, that it is performed cluster index scan instead of an index seek.

I got a suspicion that the cause may be "sql_variant" in user defined table type (Vita_ArrayAsTable). I created a table type that has a "Value" defined as integer type - the same as referenced columln - and executed the same query.
All was OK.

I disabled the use of Vita_ArrayAsTable in queries (ForceArrayAsLiterals) and performance issues disappeared.

In fact it is now each query using the "IN" clause unique query that Sql Server engine have to compile and keeps in the memory execution plan for later use. However, there is little likelihood that such a query will be used again in the future. This puts pressure on the memory of Sql Server, especially for an Express edition.

Built-in support for using user defined table types is a great feature of VITA. There are many reasons to use it, but performance issue limits its use. Not in each case it is possible to re-write query to avoid use "IN" clause in query.

I understand why the use of "sql_variant" type, but , as you can see bellow, this confuses the query optimizer and causes use of the wrong execution plan.

Here is warning that you can see in SSMS Execution Plan tab:
(I noticed it only when I deals with the problem in detail :) )

Type conversion in expression (CONVERT_IMPLICIT(sql_variant,[MassFrontierTest].[dbo].[SpectrumPeak].[SpectrumId],0)) may affect "CardinalityEstimate" in query plan choice, Type conversion in expression (CONVERT_IMPLICIT(sql_variant,[MassFrontierTest].[dbo].[SpectrumPeak].[SpectrumId],0)=[Value]) may affect "SeekPlan" in query plan choice
Coordinator
Jun 7, 2016 at 7:58 AM
wow, that's a great catch, and great feedback, thanks! my bad, did not have a million-rows table on hand to test the performance.
I was prepared initially to create several table types, for each element types - int, Guid, string, etc.
But then I found out that Sql_Variant works, kinda, without testing performance or query plan.
I will definitely fix it. We have 2 ways to go:
  1. create table type for each element type
  2. Or try a trick:
    instead of
    WHERE Id IN (SELECT [Value] FROM @Values)
    use subquery in subquery:
    WHERE Id IN (SELECT lst.V FROM
             (SELECT CAST("Value" AS uniqueidentifier) as V 
                   FROM @Values) lst);
    so internal subquery selects casted GUID array, which is used then in IN clause in main query. The hope is that SQL Server would run nested subquery once, get list of casted values, and then use it in main query.
    Since you have already million rows table, can you please try this - with stored procedure? Just find an 'array' proc, modify it (after app starts), make it run and see the timing or query plan. If this trick works, it would be an easy fix.
    thanks!
Jun 7, 2016 at 10:34 AM
I was not sure whether the second way is right.

I made test and sub-sub query leads to performing clustered index seek but warning about "CardinalityEstimate" again appears in execution plan. During tests I did not encounter any performance issue.

I do not know how many work is with the first way, but I intercede for the first one.
It is a clean solution without any "warnings" that, I fear, may show up later and unexpected and cause problems.
Jun 7, 2016 at 1:32 PM
Addendum to the test:

Because queries with performance problems do not call directly stored procedure that accepts Vita_ArrayAsTable] parameter , I performed synthetic test only:
I made changes in corresponding procedure that refers to table with many rows and executed query that calls this procedure.

The same occurred as I described in previous message - warning appears in execution plan but cluster index seek was performed.
Coordinator
Jun 7, 2016 at 4:42 PM
All right, multiple table types it is. Will try to squeeze it into next push. thanks a lot for investigating
Roman
Coordinator
Jun 8, 2016 at 4:36 PM
Hi again.
Can I ask you again to do me a big favor? Before I go full speed into implementation, it would be nice to check 2 things:
  1. Using a single subquery with a CAST:
    WHERE ID IN (SELECT CAST("Value" AS uniqueidentifier)
           FROM @Values);
    • would it make cardinality warning go away?
  2. Using a table type with strongly typed column (GUID), does it actually solve a problem?
thanks in advance
Roman
Jun 9, 2016 at 9:42 PM
Hi Roman,

when you use form of "CAST " the warning appears, but query optimizer selects cluster index seek plan.

From my point of sight, there is no guarantee that changing the distribution or the amount of data does not cause usage of cluster index scan. I think this what the query optimizer wants to tell us.

Strongly typed column, in my case unfortunately only integer column, leads to cluster index seek without warning.

Best regards,

Milos
Coordinator
Jun 9, 2016 at 9:56 PM
thanks so much!
I will probably start with adding explicit CAST but keeping SQl_VAriant, it is really easy fix; will change to multiple types later