Is @@ROWCOUNT after UPDATE reliably a measure of *

2019-04-03 12:29发布

站内文章 / SQL-Server

15 0

做自己的国王

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Does @@ROWCOUNT reliably tell you how many rows matched the WHERE clause in an UPDATE, as opposed to how many where actually changed by it?

In the documentation for @@ROWCOUNT it says:

Data manipulation language (DML) statements set the @@ROWCOUNT value to the number of rows affected by the query and return that value to the client.

(My emphasis.)

But if I have

CREATE TABLE [Foo] ([a] INT, [b] INT)
GO
INSERT INTO [Foo] ([a], [b]) VALUES (1, 1),(1, 2),(1, 3),(2, 2)
GO
UPDATE [Foo] SET [b] = 1 WHERE [a] = 1
SELECT @@ROWCOUNT
GO

...I see 3 (the number of rows matching [a] = 1), not 2 (the number of rows modified by the UPDATE — one of the three rows already had the value 1 for b). This seems like an odd definition of "affected" (not wrong, just at odds with how I'd normally use the word — it's actually quite handy for what I want to do, in fact).

(The similar MySQL ROW_COUNT function, for instance, would return 2 in this situation.)

Is this reliable behavior, ideally documented somewhere I just haven't found? Or are there odd edge cases...

To be clear: I'm not asking if 3 is the right answer. I'm asking if it's a reliable answer, or are there edge cases where SQL Server will leave out rows that matched but didn't require a change.

Update: A couple of people have asked (or hinted at) what kind of "reliability" issues I'm worried about. The fact is they're quite nebulous, but, dunno, replication? Transactions? Partitioning? Indexes it could use to avoid seeking to rows because it knows that b is already 1, and so it skips those? ...?

Update: I was hoping for someone with a more "insider" view of how SQL Server works to answer this question, but it looks like the triggers example (and others I've played with) by xacinay is as close as we're going to get. And it seems pretty darned solid; if it behaves that way in the normal case and it didn't behave that way despite partitioning or whatsit, as someone said, surely that would qualify as a bug. It's just empirical rather than academic.

回答1:

The documentation for @@ROWCOUNT is telling you the truth because 3 rows would be reliably affected as opposed to MySQL's ROW_COUNT().

not 2 (the number of rows modified by the UPDATE — one of the three rows already had the value 1 for b).

For UPDATE it's not important if the new and previous values are identical. It simply does what its told to: finds data source, filters rows according to provided condition, and applies 'set' changes to filtered rows.

That's the way SQL Server works without any reservations. MySQL may work different. A row counting procedure is not a part of the SQL standard. So, you have to look before you leap for those kinds of artefacts every time you switch from one RDBMS to another.

Some triggers to see actual update behaviour:

CREATE TRIGGER [dbo].[trgFooForUpd]
ON [dbo].[Foo]
FOR UPDATE 
AS begin declare @id int;
      select @id = [a] from INSERTED;
      select * from INSERTED; end;
GO
CREATE TRIGGER [dbo].[trgFooAfterUpd]
ON [dbo].[Foo]
AFTER UPDATE 
AS print 'update done for ' + cast(coalesce( @@ROWCOUNT, -1) as varchar )+'rows'

回答2:

To expand on xacinay's answer because he is correct.

You have 3 rows changed and therefore @@ROWCOUNT is accurate. The SQL Server changes all rows, it does not verify that a value is in fact different before changing it, as that would require a lot of overhead on update commands. Just imagining having to check a VARCHAR(MAX) for whether the value was actually changed or not.

The easiest way to illustrate this is to actually change yor UPDATE query to something like this:

UPDATE [Foo]  SET [b] = 1
OUTPUT INSERTED.b
WHERE [a] = 1

It will output 3 rows of INSERTED which is the 'pseudo' table that holds the new values for a given update/insert command. That the value in fact is already b = 1 in one instance does not matter.

If you want that to matter you'll need to include it in your WHERE clause:

UPDATE [Foo]  SET [b] = 1
WHERE [a] = 1 AND [b] <> 1
SELECT @@ROWCOUNT

Alternatively, and as a more general way of doing this check, you can make a trigger and compare the values/fields in the DELETED table with the values in the INSERTED table and use that as foundation for whether a row is actually 'changed'.

So - 3 is the accurate number as you have updated 3 rows because 3 rows were touched by [a] = 1

回答3:

I think the documentation is correct because no matter that one row in your example already had 1 as the value in [b], that row still satisfied the criteria in the WHERE clause so the value was 'updated'.

We can see a proof by extending your example a little bit and including a TIMESTAMP column as in this SQLFiddle. After the update the TIMESTAMP on all columns which were matched by the WHERE cluase has changed indicating that the row itself was changed and not merely assessed and discarded because the target value matched that which was already present.

回答4:

In summary you are asking if @@rowcount is deterministic. Really if it was non-deterministic you don't think you would see that in the documentation? It is reasonable to assume deterministic. @@VERSION and @@MAX_PRECISION are also not documented as deterministic based on that are you questioning edge cases where they are not deterministic. I seriously doubt there is an edge case that it fails but it it does then Microsoft will accept it as a bug. They are not going to come back with @@rowcount is nondeterministic - the documentation did not explicitly state deterministic.

Numerous examples that "affected" in MMSQL TSQL counts a row set to the same value (it only cares about the where)
And you have examples that MSSQL does in fact assign the same value (timestamp)
You assert but SQL could care so how do I do I know there is not an edge case that is does
Because that is not reasonable behavior
Any program should produce consistent output
The order without sort is not guaranteed but the rows are the same - that is specifically documented by Microsoft and SQL as nondeterministic
If @@rowcount was nondeterministic then I trust Microsoft would document that
It is reasonable to assume @@rowcount is deterministic
C# and Java do not always have the same behavior
That is not a reason for me to suspect C# and Java are not reliable

Look at the query plan in MSSQL
There is no predicate on [a]
There is a predicate on [b]

If you change the query to

UPDATE [Foo] SET [b] = 1 WHERE [a] <> 1 and [b] = 1;

Then you will see predicates on both [a] and [b]

The query optimizer will decide how most efficiently process the query it is not going to change the query.
Introducing a predicate on [b] in the first query is changing the query.
A proper database just does not do that.

I highly suspect that in MySQL if you look at the query plan there will be a predicate on [a] in the first query.

Rather than require proof that is reliable create an edge case that proves it is not reliable.
If you can create a situation it is not reliable I submit Microsoft would accept it as a bug.

What is the business case?
If you have business case that a value must be updated with same value then by definition you have something to test.
The only examples I can thing of are timestamp or trigger.
If you have true need to for an update to the same to take place then you have something to measure.
Do you have any evidence that a value is not updated to the same value?
If you still don't trust it then wrap it in a transaction.

If you don't need an update to the same value then why not just add [b] <> 1. That is more efficient.

SO is for specific programming question.
What is the programming question?
Prove MSSQL is reliable is not a programming question.