SQL - columns for different categories

2019-01-17 23:02发布

问题:

I am new to SQL. I have a database with data for different exams, for example:

Student Test Grade
--------------------
St1    T1   A
St2    T1   B
St3    T1   B
St1    T2   B
St2    T2   B
St3    T2   A
St1    T3   A
St2    T3   C
St3    T3   B

Then, I would like to print a report using the Tests (T1, T2 and T3) as columns:

Student  T1   T2   T3
----------------------
St1      A    B    A
St2      B    B    C
St3      B    A    B

I have tried different things, but I got stuck on how to produce such a printout. Any help is appreciated!

回答1:

Use:

  SELECT t.student,
         MAX(CASE WHEN t.test = 'T1' THEN t.grade END) AS T1,
         MAX(CASE WHEN t.test = 'T2' THEN t.grade END) AS T2,
         MAX(CASE WHEN t.test = 'T3' THEN t.grade END) AS T3
    FROM TABLE t
GROUP BY t.student


回答2:

I asked a similar question a while back. You need something akin to a pivot table, but, that's not available in SQLite (as far as I know).



回答3:

There are a couple of ways to do it, both of which (in pure SQL and not code generating a SQL command) require the number of columns to be known and fixed. The most straightforward to implement would be:

SELECT eg.Student,
(SELECT Grade from ExamGrade eg1 WHERE eg1.Student = eg.Student AND Test = 'T1') AS T1
(SELECT Grade from ExamGrade eg2 WHERE eg2.Student = eg.Student AND Test = 'T2') AS T2
(SELECT Grade from ExamGrade eg3 WHERE eg3.Student = eg.Student AND Test = 'T3') AS T3
FROM ExamGrade eg

This will work in practically any environment including SQLite, and it could be made a bit more elegant with a scalar-valued function GetTest() that would take the student and test number and return the grade. However, in any case, this is neither performant nor closed to change; it will query the table N-squared times for N tests, and if you add a 4th test, this query will have to change to include it in the report.

If the combination of Student and Test is unique, and you're working in a database with Pivot functionality (which apparently SQLite doesn't have), you can use a Pivot query with just about any aggregator (the MAX/MIN/AVG/SUM of a set with a single value is that value). The following works in MSS2005:

SELECT Student, T1, T2, T3
FROM (Select Student, Test, Grade FROM ExamGrade) As SourceQuery
PIVOT (MAX(Grade) FOR Test IN (T1, T2, T3)) AS PivotTable

This will be more performant, and it's far more elegant. The column lists still cannot be dynamically determined AFAIK, but they're trivial to generate if you're making this query from application code, or using the sp_executesql built-in stored proc in MS SQL Server to generate a query from another stored proc or function.



回答4:

I believe that if you are going to expand this system to include more information then, you could benefit from reworking your database, I would build it like so:

Table name = Bold

Column name = Italicized

Students:

  • SID (Primary Key)
  • Other information about the student

Tests:

  • TID (Primary Key)
  • Other information about the test

Test Grades

  • GID (Primary Key)
  • TID (Foreign key)
  • SID (Foreign Key)
  • Grade

This structure is based on an idea called database normalizing (you'll get lots of information if you google it). I will give you a partial summary below, but if you're going to do lots of SQL you should read up on it yourself:

The first thing to know is that a primary key is just a unique identifier, it is not usually part of the information (however, since it is unique every datum must have different value for its primary key), and a foreign key is a way to reference a row in one table from a row in another table, using the referencee's primary key: for example here the foreign key SID in each grade references a single student, based on their primary key SID.

e.g. student one has SID 1 and all his tests have 1 in the SID column. same for student 2, 3, 4, and so on.

The basic idea of normalizing, is that all unique data is stored only once and then referenced in the other places that use it (If you take a look at how the keys are structured in the example, all student information is stored in a table and then referenced in their test grades, instead of being duplicated in each grade).

To retrieve what you want from these tables I would use this (its written in PHP):

$sql = 'SELECT * FROM Tests ORDER BY TID';
$tempresult = mysql_query($sql);
while($temprow = mysql_fetch_array($tempresult)){
    echo $temprow['TID'];
}

$sql = 'SELECT * FROM Students';
$result = mysql_query($sql);
while($row = mysql_fetch_array($result)){
   echo '\n'.$row['SID'];
   $sql = 'SELECT * FROM Grades WHERE SID='.$row['SID'].' ORDER BY TID';
   $result2 = mysql_query($sql);
   while($row2 = mysql_fetch_array($result2)){
      echo ' '.$rows['Grade'];
   }
}

You can add formatting to this in the echo statements and also print any of the extra information you choose to add. If you ave any questions please ask them.

EDIT: I have read the others, and agree that their method is in all likelihood superior, the only thing I'm not sure about is whether pivot tables can expand to handle a varying number of tests, if it can (or you won't need to), then I suggest their method, otherwise I feel this might have a place in your application.



回答5:

Try This

SELECT Student, MAX(CASE WHEN Test = 'T1' THEN Grade END) AS T1,
   MAX(CASE WHEN Test = 'T2' THEN Grade END) AS T2,
   MAX(CASE WHEN Test = 'T3' THEN Grade END) AS T3 FROM tablename  GROUP BY Student

Use your table name instead of "tablename".



回答6:

I would like to add some explanation to @OMG_Ponies answer because it could be useful for SQL no-super-users (like myself)

Let's create an example table and add the dummy data:

CREATE TABLE t (
   t_ID integer primary key autoincrement not null,
   student integer,
   test text,
   grade text
);

INSERT INTO t 
   (student, test, grade)
VALUES
   ('St1', 'T1', 'A'),
   ('St2', 'T1', 'B'),
   ('St3', 'T1', 'B'),
   ('St1', 'T2', 'B'),
   ('St2', 'T2', 'B'),
   ('St3', 'T2', 'A'),
   ('St1', 'T3', 'A'),
   ('St2', 'T3', 'C'),
   ('St3', 'T3', 'B');

So we have the following:

t_ID student test grade
-------------------------
 1   St1     T1     A
 2   St2     T1     B
 3   St3     T1     B
 4   St1     T2     B
 5   St2     T2     B
 6   St3     T2     A
 7   St1     T3     A
 8   St2     T3     C
 9   St3     T3     B

Using the statement case when ... then ... end it is possible to get the desired columns

SELECT 
   t_ID,
   student,
   (case when t.test = 'T1' then t.grade end) as T1,
   (case when t.test = 'T2' then t.grade end) as T2,
   (case when t.test = 'T3' then t.grade end) as T3
FROM t
   order by student

Result

t_ID student  T1    T2     T3
----------------------------------
1    St1      A    NULL  NULL       
4    St1     NULL   B    NULL   
7    St1     NULL  NULL   A
2    St2      B    NULL  NULL   
5    St2     NULL   B    NULL   
8    St2     NULL  NULL   C
3    St3      B    NULL  NULL   
6    St3     NULL   A    NULL   
9    St3     NULL  NULL   B 

However, we see that it is necessary to group the results by the field "student". When we group we have to specify an aggregate function to specify which value to keep in case of having more than one row with the same value of "student". In this case we use the "max" function, to discard the null's.

SELECT 
   t_ID,
   student,
   max(case when t.test = 'T1' then t.grade end) as T1,
   max(case when t.test = 'T2' then t.grade end) as T2,
   max(case when t.test = 'T3' then t.grade end) as T3
FROM t
GROUP BY student
ORDER BY student

Result

t_ID student  T1   T2   T3
-----------------------------
7    St1      A    B     A      
8    St2      B    B     C  
9    St3      B    A     B

Final note. Since we have not grouped also by t_ID, nor have we specified an aggregate function for it, you should assume that the value of t_ID of each row is a random one of each group. Be careful with that.