Is there any way I can store the last iterated row result and use that for next row iteration?
For example, I have a table, say Time_Table:
Row  Key  type  timeStamp
1    1    B     2015-06-28 09:00:00
2    1    B     2015-06-28 10:00:00
3    1    C     2015-06-28 11:00:00
4    1    A     2015-06-28 12:00:00
5    1    B     2015-06-28 13:00:00
Now suppose I have an exceptionTime of 90 minutes, which is constant.
If I start checking my Time_Table, then:
For the first row, as there is no row before 09:00:00, the record goes directly into my target table. My reference point is now 09:00:00.
For the second row at 10:00:00, the last reference point was 09:00:00 and TIMESTAMPDIFF(mi, 09:00:00, 10:00:00) is 60 minutes, which is less than the required 90. I do not add this row to my target table.
For the third row, the last recorded exception was at 09:00:00 and TIMESTAMPDIFF(mi, 09:00:00, 11:00:00) is 120 minutes, which is greater than the required 90, so I choose this record and set the reference point to 11:00:00.
For the fourth row, TIMESTAMPDIFF(mi, 11:00:00, 12:00:00) is 60 minutes, so similarly it will not be saved.
For the fifth row, TIMESTAMPDIFF(mi, 11:00:00, 13:00:00) is 120 minutes, so this one is again saved.
Target table
Row  Key  type  timeStamp
1    1    B     2015-06-28 09:00:00
2    1    C     2015-06-28 11:00:00
3    1    B     2015-06-28 13:00:00
Is there any way that I can solve this problem purely in SQL?
My approach:
    SELECT * FROM Time_Table A WHERE NOT EXISTS (
        SELECT 1 FROM Time_Table B
        WHERE A.timeStamp > B.timeStamp
        AND abs(TIMESTAMPDIFF(s, B.timeStamp, A.timeStamp)) > 90
    )
But this does not actually work.
This is not possible using pure SQL in Vertica. To do this in pure SQL you need to be able to perform a recursive query, which Vertica does not support. In other database products you can do this with a recursive WITH clause. For Vertica you will have to do it in the application logic.
This is based on the following statement from the Vertica 7.1.x documentation: "Each WITH clause within a query block must have a unique name. Attempting to use same-name aliases for WITH clause query names within the same query block causes an error. WITH clauses do not support INSERT, DELETE, and UPDATE statements, and you cannot use them recursively."
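As a rough illustration of what that application logic might look like, here is a minimal Java sketch of the reference-point rule described in the question. It assumes the timestamps for a single Key have already been fetched from Time_Table in ascending order; the class name ReferencePointFilter and the method filter are hypothetical names chosen for this example, not part of Vertica or its SDK.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: keeps a row only when it lies more than `threshold`
// after the last row that was kept (the "reference point").
final class ReferencePointFilter {

    // Assumes `sorted` is in ascending timestamp order and holds one Key only.
    static List<LocalDateTime> filter(List<LocalDateTime> sorted, Duration threshold) {
        List<LocalDateTime> kept = new ArrayList<>();
        LocalDateTime reference = null; // no reference point before the first row
        for (LocalDateTime ts : sorted) {
            boolean farEnough = reference == null
                    || Duration.between(reference, ts).compareTo(threshold) > 0;
            if (farEnough) {
                kept.add(ts);
                reference = ts; // the kept row becomes the new reference point
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample timestamps taken from the question's Time_Table.
        List<LocalDateTime> rows = Arrays.asList(
                LocalDateTime.parse("2015-06-28T09:00:00"),
                LocalDateTime.parse("2015-06-28T10:00:00"),
                LocalDateTime.parse("2015-06-28T11:00:00"),
                LocalDateTime.parse("2015-06-28T12:00:00"),
                LocalDateTime.parse("2015-06-28T13:00:00"));
        System.out.println(filter(rows, Duration.ofMinutes(90)));
    }
}
```

With the sample data and a 90-minute threshold, the sketch keeps the 09:00, 11:00 and 13:00 rows, matching the target table in the question. For several keys, the same loop would be run once per Key.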
Definitely yes, though not in pure SQL. Depending on which version of Vertica you use, either use LAG (since 7.1.x) or create a custom UDx (User-Defined Extension).
Below is a UDx in Java that accesses the previous row; it acts like LAG with a single step (hashtag #performance).
(GitHub is full of UDx examples.)
    package com.vertica.JavaLibs;

    import java.util.ArrayList;
    import com.vertica.sdk.*;

    public class UdxTestFactory extends AnalyticFunctionFactory {
        @Override
        public AnalyticFunction createAnalyticFunction(ServerInterface srvInterface) {
            return new Test();
        }

        // The function takes two integer arguments and returns an integer.
        @Override
        public void getPrototype(ServerInterface srvInterface, ColumnTypes argTypes,
                                 ColumnTypes returnType) {
            argTypes.addInt();
            argTypes.addInt();
            returnType.addInt();
        }

        @Override
        public void getReturnType(ServerInterface srvInterface, SizedColumnTypes argTypes,
                                  SizedColumnTypes returnType) throws UdfException {
            returnType.addInt();
        }

        private class Test extends AnalyticFunction {
            @Override
            public void processPartition(ServerInterface srvInterface,
                                         AnalyticPartitionReader inputReader,
                                         AnalyticPartitionWriter outputWriter)
                    throws UdfException, DestroyInvocation {
                SizedColumnTypes inTypes = inputReader.getTypeMetaData();
                ArrayList<Integer> argCols = new ArrayList<Integer>();
                inTypes.getArgumentColumns(argCols);
                // The first row has no previous row, so its result is NULL.
                outputWriter.setLongNull(0);
                while (outputWriter.next()) {
                    long v1 = inputReader.getLong(argCols.get(0)); // previous row
                    inputReader.next();
                    long v2 = inputReader.getLong(argCols.get(0)); // current row
                    outputWriter.setLong(0, v2 - v1);
                }
            }
        }
    }
Compile and combine the compiled classes into a single jar, named TestLib.jar for simplicity:
$ javac -classpath /opt/vertica/bin/VerticaSDK.jar /opt/vertica/sdk/BuildInfo.java UdxTestFactory.java -d .
$ jar -cvf TestLib.jar com/vertica/sdk/BuildInfo.class com/vertica/JavaLibs/*.class
Load the library and create the function:
CREATE OR REPLACE LIBRARY TestFunctions AS '/home/dbadmin/TestLib.jar' LANGUAGE 'JAVA';
CREATE OR REPLACE ANALYTIC FUNCTION lag1 AS LANGUAGE 'java' NAME 'com.vertica.JavaLibs.UdxTestFactory' LIBRARY TestFunctions;
And use it:
SELECT
lag1(col1, null) OVER (ORDER BY col2) AS col1_minus_col2
FROM ...