Aggregate Relational Algebra (Maximum)

2019-01-22 18:28发布

问题:

I am currently working on a homework assignment that requires a selection to occur that pulls out an element containing a specific attribute of maximum value compared to all other records. I've read a number of sources online that reference an "aggregate" relational algebra function called maximum, but they don't describe how it works using the basic operators. How does one select the attribute containing a maximum value?

回答1:

You can very well express aggregate functions with only basic operators. It's a pretty neat thing.

Suppose we have a table T, and we'd like to find the maximum of its "value" field. First, we should take the cartesian product of T with itself -- or rather with a copy of itself, T2. Then we select the rows where T.value is smaller than T2.value: this nets us all the unwanted rows, whose value is less than the value of some other row. To get the maximal values, we should subtract these unwanted rows from the set of all rows. And that's it. At least that's the basic idea, we also need to use projections to get the dimensions right.

Unfortunately I have no idea how to insert Latex here, but using relational algebra notation, it'd be something like this:

π(T.a1...Tan, T.value)(T)
    -
π(T.a1...Tan, T.value)(
    σ(T.value<T2.value)( ρ(T, T2) x T )
)

where π is the projection operator, - is the set difference, σ is the selection operator and ρ is the rename operator.

SQLishly:

SELECT T.* FROM T
    MINUS
SELECT T.* FROM T, T as T2 WHERE T.value<T2.value

And more practically:

SELECT T.* FROM T LEFT JOIN T as T2 ON T.value<T2.value WHERE T2.value IS NULL

Of course, all this is mostly only of academic interest, i.e. that it shows that the relational algebra works.



回答2:

Suppose we have the table T with attributes a1, a2, ..., an, v and we need to find the row where attribute v has the maximum value compared to all other rows.

First, we need a cross product of T and T' (a copy of T where v was renamed to v1) so we can compare the values of v:

T x ρ{a1, a2, ..., an, v1}T

Second, select the rows where v < v1, and we get all the rows whose v value is less than the v value in at least one other row. These are the rows that we need to exclude later:

σ{v < v1}(T x ρ{a1, a2, ..., an, v1}T)

Then project the columns with T's original attributes(column names) so we have a table with T's schema, containing all the unwanted rows, which are to be excluded from T in the next step:

π{a1, a2, ..., an, v}(σ{v < v1}(T x ρ{a1, a2, ..., an, v1}T))

Last, exclude the unwanted rows from T and we get the row with maximum v value:

T - π{a1, a2, ..., an, v}(σ{v < v1}(T x ρ{a1, a2, ..., an, v1}T))

(I worked this out based on SaT's answer and testing with Stanford's online RA course , since I didn't really understand SaT's notation, I put the solution in my notation here, in which the operator conditions are in {}. Hope it can help someone in the future)



回答3:

table1:= project field (table);
table2 (fieldrenamed):= project field (table);
Producted:=table1 mult table2;
minors:= select fild<fieldrenamed (producted);
result:=table1 difference (project field(minors)); 

Code correspond with winRDBI, field is the attribute that you want to compare to get the max value. Table is the original table where that field is.



回答4:

lets think we have a relation with an attribute A and values 1,2,3

A

1
2
3

so now..

project A values and rename with A1

A1
1
2
3

again project A values and rename with A2

A2
1
2
3

join this with A2<A1 i.e \join_{A2<A1}
so the - Output schema: (A2 integer, A1 integer)

A2<A1

1|2
1|3
2|3

hear always A2 values will be less than A1 because we join like that(a2<a1)

now project A2 the output is like below

A2
1
2

now diff with original attribute

A diff A2

A
1
2
3

 diff

A2
1
2

Output is 3 which is maximum value

Hi, i know some one have to help in editing, for better look

\project_{Att}Relation

\diff

\project_{A2}(\rename_{A2}(\project_{Att}Relation)
              \join_{A2<A1}
             \rename_{A1}(\project_{Att}Relation))


回答5:

max(columnname) will return the highest value in the column columnname.