I need to load some test data into the Channel field on my Account table. The Channel can be one of 10 different values, so I thought I'd randomly assign the Channel one of the values using a CASE
expression along with ABS(CHECKSUM(NewId())) % 10
like so:
SELECT
id,
name,
Channel =
CASE ABS(CHECKSUM(NewId())) % 10
WHEN 0 THEN 'Baby Only'
WHEN 1 THEN 'Club'
WHEN 2 THEN 'Drug'
WHEN 3 THEN 'Food'
WHEN 4 THEN 'Internet'
WHEN 5 THEN 'Liquidators'
WHEN 6 THEN 'Mass'
WHEN 7 THEN 'Military'
WHEN 8 THEN 'Other'
WHEN 9 THEN 'Speciality'
ELSE '*NONE*' -- How is this ever getting reached?
END
FROM
retailshelf_nil...account A
Since I'm using modulo 10 I thought the only possible values should be 0-9. But when I run the above code, I'm finding that the ELSE
clause is indeed being reached and that my data is coming up with 'NONE' on some records as shown:
id name Channel
001L000000KpgFqIAJ Acct1 *NONE*
001L000000KpgFrIAJ Acct2 Mass
001L000000KpgFsIAJ Acct3 Club
001L000000KpgFtIAJ Acct4 *NONE*
001L000000KpgFuIAJ Acct5 Baby Only
001L000000KpgFvIAJ Acct6 *NONE*
001L000000KpgFwIAJ Acct7 Mass
Can someone please explain what logical error I've made that's allowing the ELSE clause to be reached?
When I run a simple test to just generate the random number like so:
SELECT
RadomNum = ABS(CHECKSUM(NewId())) % 10
FROM
retailshelf_nil...account A
ORDER BY
1
All the numbers generated are from 0-9 as expected, so what's different about this first SQL?
And is there a workaround to ensure that ELSE
is never reached?
The written form of the query is expanded to:
Channel =
CASE
WHEN ABS(CHECKSUM(NewId())) % 10 = 0 THEN 'Baby Only'
WHEN ABS(CHECKSUM(NewId())) % 10 = 1 THEN 'Club'
WHEN ABS(CHECKSUM(NewId())) % 10 = 2 THEN 'Drug'
WHEN ABS(CHECKSUM(NewId())) % 10 = 3 THEN 'Food'
WHEN ABS(CHECKSUM(NewId())) % 10 = 4 THEN 'Internet'
WHEN ABS(CHECKSUM(NewId())) % 10 = 5 THEN 'Liquidators'
WHEN ABS(CHECKSUM(NewId())) % 10 = 6 THEN 'Mass'
WHEN ABS(CHECKSUM(NewId())) % 10 = 7 THEN 'Military'
WHEN ABS(CHECKSUM(NewId())) % 10 = 8 THEN 'Other'
WHEN ABS(CHECKSUM(NewId())) % 10 = 9 THEN 'Speciality'
ELSE '*NONE*' -- How is this ever getting reached?
END
A new value for NEWID
is used in each test.
A new "random" number will be calculated for every WHEN clause - you can instead use a derived table:
SELECT ID, Name,
Channel =
CASE Rand
WHEN 0 THEN 'Baby Only'
WHEN 1 THEN 'Club'
WHEN 2 THEN 'Drug'
WHEN 3 THEN 'Food'
WHEN 4 THEN 'Internet'
WHEN 5 THEN 'Liquidators'
WHEN 6 THEN 'Mass'
WHEN 7 THEN 'Military'
WHEN 8 THEN 'Other'
WHEN 9 THEN 'Speciality'
ELSE '*NONE*' -- How is this ever getting reached?
END
FROM
( SELECT
id,
name,
ABS(CHECKSUM(NewId())) % 10 Rand
FROM
retailshelf_nil...account A
) zzz;
or a CROSS APPLY subquery:
SELECT A.ID, A.Name,
Channel =
CASE zzz.Rand
WHEN 0 THEN 'Baby Only'
WHEN 1 THEN 'Club'
WHEN 2 THEN 'Drug'
WHEN 3 THEN 'Food'
WHEN 4 THEN 'Internet'
WHEN 5 THEN 'Liquidators'
WHEN 6 THEN 'Mass'
WHEN 7 THEN 'Military'
WHEN 8 THEN 'Other'
WHEN 9 THEN 'Speciality'
ELSE '*NONE*' -- How is this ever getting reached?
END
FROM
retailshelf_nil...account A
CROSS APPLY
( SELECT
ABS(CHECKSUM(NewId())) % 10
) zzz (Rand);
That way NewID()
is called only once per record.
A similar scneario was resolved here.
The T-SQL documentation explains this phenomenon (granted it's for Sybase but apparently still applies to SQL Server):
Expressions that reference the rand
function, the getdate
function,
and so on, produce different values each time they are evaluated. This
can yield unexpected results when you use these expressions in certain
case expressions. For example, the SQL standard specifies that case
expressions with the form:
case expression
when value1 then result1
when value2 then result2
when value3 then result3
...
end
are equivalent to the following form of case expression:
case expression
when expression=value1 then result1
when expression=value2 then result2
when expression=value3 then result3
...
end
Related to second question,
CHECKSUM(NewId()) will return negative results sometimes, which is not match with any of the case
conditions. If a negative number is divided with any number the result will be negative. Execute the following query,
declare @v nvarchar(50) = newid()
select CHECKSUM(@v),@v,CHECKSUM(@v) % 10