I am trying to write a Linq to SQL statement which displays all customer records and only the matching max(InvoiceId) of the invoice table; basically the newest invoice for the customer. The left join is required because a customer may not have any invoices but need to be in result set.
Two basic tables with a foreign key of Customer.CustomerID = Invoice.CustomerId
CREATE TABLE [dbo].[Customer](
[CusomerId] [int] IDENTITY(1,1) NOT NULL,
[CustomerName] [int] NOT NULL
CONSTRAINT [PK_Customer] PRIMARY KEY CLUSTERED
(
[CustomerId] ASC
)
) ON [PRIMARY]
CREATE TABLE [dbo].[Invoice](
[InvoiceId] [int] IDENTITY(1,1) NOT NULL,
[CustomerId] [int] NOT NULL,
[InvoiceTotal] [float] NOT NULL
CONSTRAINT [PK_Invoice] PRIMARY KEY CLUSTERED
(
[InvoiceId] ASC
)
) ON [PRIMARY]
The SQL of the desired result set is as follows:
SELECT *
FROM Customers c
LEFT JOIN
(Invoice i
INNER JOIN (SELECT CustomerId, MAX(InvoiceId) as InvId FROM Invoice GROUP BY CustomerId) as InvList
ON i.InvoiceNo = InvList.InvoiceNo) ON c.CustomerId = i.CustomerId
From what I have discovered, I don't think this can be done in a single statement; that the MAX(InvoiceId) product needs to be created first and used in the main statement. Since I can't get it to work, perhaps I am wrong about that too.
I wasn't able to get Ben M's example to work, but I was able to work it into the following:
In the first statement I joined the invoice table back into the result and purposely did not use select new.
This actually simplifies the second statement a bit only having to do a simple left join to the first statements object. I am now getting a result set of all customers and latest invoices for customers who have them.
I am not sure why Ben M's solution does not work, but I only get a left join product when the following line is removed:
With this line included the product is an inner join.
You could write this particular query in LINQ as follows--although this will result in a correlated subquery:
If you want to do it the other way, the LINQ syntax is less readable, but you can split the query up a bit thanks to deferred execution:
The first query (
latestInvoicesPerCustomerQuery
) does not execute until you enumerate over it, or over the second query, which references the first. As far as the runtime is concerned, the final query is one expression tree--so you can think of the first query as having been absorbed into the second.If you really want it all in one query, you can do that too:
Either variant of the
customersAndLatestInvoicesQuery
should translate roughly into the SQL you list in your post.