# 这个Any比包含的更好还是不更好?

This Any is better or not than this contains?
2020-11-21
•  译文(汉语)
•  原文(英语)

``````dbContext.MyTable
.Where(x => myIDS.Any(y=> y == x.MyID));
``````

``````SELECT
*
FROM [dbo].[MiTabla] AS [Extent1]
WHERE  EXISTS (SELECT
1 AS [C1]
FROM  (SELECT
[UnionAll2].[C1] AS [C1]
FROM  (SELECT
[UnionAll1].[C1] AS [C1]
FROM  (SELECT
cast(130 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
cast(139 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
UNION ALL
SELECT
cast(140 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2]
UNION ALL
SELECT
cast(141 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable4]) AS [UnionAll3]
WHERE [UnionAll3].[C1] = [Extent1].[MiID]
)
``````

``````dbContext.MyTable
.Where(x => myIDS.Contains(x.MiID));
``````

``````SELECT
*
FROM [dbo].[MiTabla] AS [Extent1]
WHERE [Extent1].[MiID] IN (cast(130 as bigint), cast(139 as bigint), cast(140 as bigint), cast(141 as bigint))
``````

``````System.Diagnostics.Stopwatch miswContains = new System.Diagnostics.Stopwatch();
miswContains.Start();
for (int i = 0; i < 100; i++)
{
IQueryable<MyTable> iq = dbContext.MyTable
.Where(x => myIDS.Contains(x.MyID));

iq.ToArrayAsync();
}
miswContains.Stop();

System.Diagnostics.Stopwatch miswAny = new System.Diagnostics.Stopwatch();
miswAny.Start();
for (int i = 0; i < 20; i++)
{
IQueryable<MyTable> iq = dbContext.Mytable
.Where(x => myIDS.Any(y => y == x.MyID));

iq.ToArrayAsync();
}
miswAny.Stop();
``````

`Entity Framework``select *`?生成了代码 ID的类型长吗?您如何知道结果是否未缓存,是否尝试交换查询顺序?

``````dbContext.MyTable
.Where(x => myIDS.Count(y=> y == x.MyID) > 0);
``````

``````DECLARE @p0 Decimal(9,0) = 12345
SELECT COUNT(*) AS [value]
FROM [ids] AS [t0]
WHERE [t0].[id] = @p0
``````

`Contains`在这方面要好得多.在性能受到严重影响之前,它可以处理成千上万的元素.

I am using EF6 and I would like to get the records in a table which are in a group of IDs.

In my test for example I am using 4 IDs.

I try two options, the first is with any.

``````dbContext.MyTable
.Where(x => myIDS.Any(y=> y == x.MyID));
``````

And the T-SQL that this linq exrepsion generates is:

``````SELECT
*
FROM [dbo].[MiTabla] AS [Extent1]
WHERE  EXISTS (SELECT
1 AS [C1]
FROM  (SELECT
[UnionAll2].[C1] AS [C1]
FROM  (SELECT
[UnionAll1].[C1] AS [C1]
FROM  (SELECT
cast(130 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
cast(139 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
UNION ALL
SELECT
cast(140 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2]
UNION ALL
SELECT
cast(141 as bigint) AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable4]) AS [UnionAll3]
WHERE [UnionAll3].[C1] = [Extent1].[MiID]
)
``````

How can is seen, the T-SQL is a "where exists" that use many subqueries and unions.

The second option is with contains.

``````dbContext.MyTable
.Where(x => myIDS.Contains(x.MiID));
``````

And the T-SQL:

``````SELECT
*
FROM [dbo].[MiTabla] AS [Extent1]
WHERE [Extent1].[MiID] IN (cast(130 as bigint), cast(139 as bigint), cast(140 as bigint), cast(141 as bigint))
``````

The contains is translated into "where in", but the query is much less complex.

I have read that any it use to be faster, so I have the doubt if the any is, although it is more complex at a first glance, is faster or not.

Thank so much.

EDIT: I have some test (I don't know if this is the best way to test this).

``````System.Diagnostics.Stopwatch miswContains = new System.Diagnostics.Stopwatch();
miswContains.Start();
for (int i = 0; i < 100; i++)
{
IQueryable<MyTable> iq = dbContext.MyTable
.Where(x => myIDS.Contains(x.MyID));

iq.ToArrayAsync();
}
miswContains.Stop();

System.Diagnostics.Stopwatch miswAny = new System.Diagnostics.Stopwatch();
miswAny.Start();
for (int i = 0; i < 20; i++)
{
IQueryable<MyTable> iq = dbContext.Mytable
.Where(x => myIDS.Any(y => y == x.MyID));

iq.ToArrayAsync();
}
miswAny.Stop();
``````

the results are that miswAny is about 850ms and the miswContains is about 4251ms.

So the second option, with contaions, is slower.

Talk1:
`Entity Framework` generated the code with `select *`? Are the IDs of type long? How do you know if the result are not cached, did you try to swap the queries order ?
Solutions1

Your second option is the fastest solution I can think of (at least for not very large arrays of ids) provided your `MiTabla.MiID` is in an index.

If you want to read more about `in` clause performance: Is SQL IN bad for performance?.

Talk1:
I have edit the original post to add a test. But I don't know if it is the best way to test this cases.
Talk2:
No, it isn't, because you are not taking into account some external factors, like SQL cache or net latency, although it may give you a rough estimation. Anyway, you are iterating 20 times to do the second test and 100 to do the first one (and the results show it's 5 times faster, probably due to that). Also, you should probably use await the instruction `iq.ToArrayAsync()`
Solutions2

If you know the ID, then using LINQ2SQL Count() method would create a much cleaner and faster SQL code (than both Any and Contains):

``````dbContext.MyTable
.Where(x => myIDS.Count(y=> y == x.MyID) > 0);
``````

The generated SQL for the count should look something like this:

``````DECLARE @p0 Decimal(9,0) = 12345
SELECT COUNT(*) AS [value]
FROM [ids] AS [t0]
WHERE [t0].[id] = @p0
``````
Talk1:
I know the IDs in this case. But when I have many IDs, how it would be the query?
Talk2:
You check the count of the matches - if the count is positive (at least 1, if ID is not unique), then the list of id's contains the specified id.
Solutions3

You can tell by the shape of the queries that `Any` is not scalable at all. It doesn't take many elements in `myIDS` (~50 probably) to get a SQL exception that the maximum nesting level has exceeded.

`Contains` is much better in this respect. It can handle a couple of thousands of elements before its performance gets severely affected.

So I would go for the scalable solution, even though `Any` may be faster with small numbers. It is possible to make `Contains` even better scalable.

I have read that any it use to be faster,

In LINQ-to-objects that's generally true, because the enumeration stops at the first hit. But with LINQ against a SQL backend, the generated SQL is what counts.