SQL Server Query to find CHI-SQUARE Values (Not Wo

2019-02-16 02:20发布

问题:

I am trying to find the Chi-Square test from my following SQL Server Query on the sample data:

 SELECT sessionnumber, sessioncount, timespent, expected, dev, dev*dev/expected as    chi_square
 FROM (SELECT clusters.sessionnumber, clusters.sessioncount, clusters.timespent,
 (dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected,
 clusters.cnt-(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as dev
 FROM clusters JOIN
 (SELECT sessionnumber, SUM(cnt) as cnt FROM clusters
 GROUP BY sessionnumber) dim1 ON clusters.sessionnumber = dim1.sessionnumber JOIN
 (SELECT sessioncount, SUM(cnt) as cnt FROM clusters
 GROUP BY sessioncount) dim2 ON clusters.sessioncount = dim2.sessioncount JOIN
 (SELECT timespent, SUM(cnt) as cnt FROM clusters
 GROUP BY timespent) dim3 ON clusters.timespent = dim3.timespent CROSS JOIN
 (SELECT SUM(cnt) as cnt FROM clusters) dimall) a

My table has this sort of sample data:

sessionnumber   sessioncount    timespent       cnt
1                  17               28          NULL
2                  22               8           NULL
3                  1                1           NULL
4                  1                1           NULL
5                  8               111          NULL
6                  8                65          NULL
7                  11               5           NULL
8                  1                1           NULL
9                  62               64          NULL
10                 6                42          NULL

The problem is that this query works fine but it gives wrong output or you can say no output at all. The output it gives my is like:

sessionnumber   sessioncount    timespent       expected    dev     chi_square
1               17              28              NULL        NULL    NUL
2               22              8               NULL        NULL    NULL
3               1               1               NULL        NULL    NULL
4               1               1               NULL        NULL    NULL
5               8               111             NULL        NULL    NULL
6               8               65              NULL        NULL    NULL
7               11              5               NULL        NULL    NULL
8               1               1               NULL        NULL    NULL
9               62              64              NULL        NULL    NULL
10              6               42              NULL        NULL    NULL

How can I get rid of this problem because I tried my best at all! Thanks in advance telling me what I' doing wrong!

回答1:

In your sample data, cnt is NULL, so the results are also NULL. You can replace these NULL values with a default value (1 for example, I don't know what is the context) using ISNULL, like

SELECT sessionnumber, SUM(ISNULL(cnt, 1)) as cnt FROM clusters GROUP BY sessionnumber