I have 2 table named FileList and ExtensionsCategory :
FileList
-------------------------------------------------------------
ID Path Extension
1 C:\7.pdf pdf
2 D:\3.png png
3 C:\1.mp3 mp3
4 D:\32.pdf pdf
and
ExtensionsCategory
------------------------------------------------------------
ID Extension Category
1 mp3 Multimedia
2 pdf Document
3 png Photo
there isn't any relation between tables.
now i want to know each category contain how many files ?
i wrote the following TSQL query in SSMS and work fine :
SELECT
Category
,SUM(FileCount) AS TotalFileCount
FROM
(SELECT
COUNT(*) AS FileCount
,(SELECT ExtensionsCategories.Category FROM ExtensionsCategories WHERE FileList.Extension = ExtensionsCategories.Extension) AS Category
FROM FileList GROUP BY Extension) AS T1 GROUP BY Category
i want know is there any improved way to do this without Joins ?
i try to write it using Linq to Entities, this is my code but i can't complete it :
var a = _dbContext.FileLists
.Select(fl => new
{
Extension = fl.Extension,
Count = _dbContext.FileLists.Count(),
Cat =
_dbContext.ExtensionsCategories.Where(w => w.Extension == fl.Extension).Select(s => s.Category)
}).GroupBy(g => g.Extension);
This should give you correct result:-
var result = from f in _dbContext.FileLists
group f by f.Extension into g
let firstCategory = _dbContext.ExtensionsCategories
.FirstOrDefault(x => x.Extension == g.Key)
select new
{
Category = firstCategory != null ? firstCategory.Category : "",
TotalFileCount = g.Count()
};
Or if you need with Method Syntax
:-
var result = _dbContext.FileLists.GroupBy(e => e.Extension)
.Select(x =>
{
var firstCategory = _dbContext.ExtensionsCategories
.FirstOrDefault(z => z.Extension == x.Key);
return new
{
Category = firstCategory != null ? firstCategory.Category : "",
TotalFileCount = x.Count()
};
});
Fiddle with Linq-to-objects.
Update:
If a category can contain multiple extension then you can use below query:-
var result = extensionCat.GroupBy(x => x.Category)
.Select(x =>
new
{
Category = x.Key,
TotalFileCount = fileList.Count(f => x
.Any(z => z.Extension == f.Extension))
}
);
There is another way to get the same results which might be faster.
First GroupBy
file extension and Count
them inside the GroupBy
element selector. There you can also get the category name for the keys which are in this case file extensions:
var result =
Files
.GroupBy(x => x.Extension, (ext, fs) => new
{
Extension = ext,
Category = Categories.Single(c => c.Extension == ext).Name,
FileCount = fs.Count()
});
This will produce the following query:
SELECT [t1].[Extension], (
SELECT [t2].[Name]
FROM [Category] AS [t2]
WHERE [t2].[Extension] = [t1].[Extension]
) AS [Category], [t1].[value] AS [FileCount]
FROM (
SELECT COUNT(*) AS [value], [t0].[Extension]
FROM [File] AS [t0]
GROUP BY [t0].[Extension]
) AS [t1]
@RahulSingh version on the other hand produces this:
-- Region Parameters
DECLARE @p0 NVarChar(1000) = ''
-- EndRegion
SELECT
(CASE
WHEN EXISTS(
SELECT TOP (1) NULL AS [EMPTY]
FROM [Category] AS [t2]
WHERE [t2].[Extension] = [t1].[Extension]
) THEN (
SELECT [t4].[Name]
FROM (
SELECT TOP (1) [t3].[Name]
FROM [Category] AS [t3]
WHERE [t3].[Extension] = [t1].[Extension]
) AS [t4]
)
ELSE CONVERT(NVarChar(50),@p0)
END) AS [Category], (
SELECT COUNT(*)
FROM [File] AS [t5]
WHERE (([t1].[Extension] IS NULL) AND ([t5].[Extension] IS NULL)) OR (([t1].[Extension] IS NOT NULL) AND ([t5].[Extension] IS NOT NULL) AND ([t1].[Extension] = [t5].[Extension]))
) AS [TotalFileCount]
FROM (
SELECT [t0].[Extension]
FROM [File] AS [t0]
GROUP BY [t0].[Extension]
) AS [t1]
I've tested both queries with 1.000.000 rows and the results are:
vs. the ToList
version:
In the last rows of the statistics you can see that short version is about 3 times faster.