How to save the last checkpoint in the sql to be u

2020-08-09 10:47发布

问题:

Is there any way I can store the last iterated row result and use that for next row iteration?

For example I have a table say(Time_Table).

__   Key   type   timeStamp             
1 )    1     B    2015-06-28 09:00:00  
2 )    1     B    2015-06-28 10:00:00  
3 )    1     C    2015-06-28 11:00:00  
4 )    1     A    2015-06-28 12:00:00  
5 )    1     B    2015-06-28 13:00:00  

Now suppose I have an exceptionTime of 90 minutes which is constant. If I start checking my Time_Table then:

  1. for the first row, as there is no row before 09:00:00, it will directly put this record into my target table. Now my reference point is at 9:00:00.

  2. For the second row at 10:00:00, the last reference point was 09:00:00 and TIMESTAMPDIFF(s,09:00:00,10:00:00) is 60 which is less than the required 90. I do not add this row to my target table.

  3. For the third row, the last recorded exception was at 09:00:00 and the TIMESTAMPDIFF(s,09:00:00,11:00:00) is 120 which is greater than the required 90 so I choose this record and set reference point to 11:00:00.

  4. For the fourth row the TIMESTAMPDIFF(s,11:00:00,12:00:00). Similarly it will not be saved.

  5. This one is again saved.

Target table

__   Key   type   timeStamp             
1 )    1     B    2015-06-28 09:00:00  
2 )    1     C    2015-06-28 11:00:00   
3 )    1     B    2015-06-28 13:00:00 

Is there any way that I can solve this problem purely in SQL?

My approach:

SELECT * FROM Time_Table A WHERE NOT EXISTS(
       SELECT 1 FROM Time_Table B
       WHERE  A.timeStamp > B.timeStamp
       AND    abs(TIMESTAMPDIFF(s,B.timeStamp,A.timeStamp)) > 90 
)

But this will not actually working.

回答1:

This is not possible using just pure SQL in Vertica. To do this in pure SQL you need to be able to perform a recursive query which is not supported in the Vertica product. In other database products you can do this using a WITH clause. For Vertica you are going to have to do it in the application logic. This is based on the statement "Each WITH clause within a query block must have a unique name. Attempting to use same-name aliases for WITH clause query names within the same query block causes an error. WITH clauses do not support INSERT, DELETE, and UPDATE statements, and you cannot use them recursively" from Vertica 7.1.x documentation



回答2:

Definitely YES, (Not in pure SQL) either use LAG (since 7.1.x) depend on which version of Vertica you use or create a custom UDx (User-Defined Extensions)

UDx in Java to access previous row which acts like LAG with only one step (hastag # performance) (github full of udx examples)

public class UdxTestFactory extends AnalyticFunctionFactory {

    @Override
    public AnalyticFunction createAnalyticFunction(ServerInterface srvInterface) {
        return new Test();
    }

    @Override
    public void getPrototype(ServerInterface srvInterface, ColumnTypes argTypes,
                             ColumnTypes returnType) {
        argTypes.addInt();
        argTypes.addInt();
        returnType.addInt();
    }

    @Override
    public void getReturnType(ServerInterface srvInterface, SizedColumnTypes argTypes,
                              SizedColumnTypes returnType) throws UdfException {
        returnType.addInt();
    }

    private class Test extends AnalyticFunction {

        @Override
        public void processPartition(ServerInterface srvInterface, AnalyticPartitionReader inputReader, AnalyticPartitionWriter outputWriter)
                throws UdfException, DestroyInvocation {

            SizedColumnTypes inTypes = inputReader.getTypeMetaData();
            ArrayList<Integer> argCols = new ArrayList<Integer>();

            inTypes.getArgumentColumns(argCols);

            outputWriter.setLongNull(0);

            while (outputWriter.next()) {
                long v1 = inputReader.getLong(argCols.get(0)); // previous row
                inputReader.next();
                long v2 = inputReader.getLong(argCols.get(0)); // curent row
                outputWriter.setLong(0, v2 - v1);
            }
        }

    }


}

compile & combine compiled classes into single jar, named it TestLib.jar for simplicity

$ javac -classpath /opt/vertica/bin/VerticaSDK.jar /opt/vertica/sdk/BuildInfo.java UdxTestFactory.java -d . 
$ jar -cvf TestLib.jar com/vertica/sdk/BuildInfo.class com/vertica/JavaLibs/*.class

Load library & function

CREATE OR REPLACE LIBRARY TestFunctions AS '/home/dbadmin/TestLib.jar' LANGUAGE 'JAVA';
CREATE OR REPLACE ANALYTIC FUNCTION lag1 AS LANGUAGE 'java' NAME 'com.vertica.JavaLibs.UdxTestFactory' LIBRARY TestFunctions;

And.. use it

SELECT 
    lag1(col1, null) OVER (ORDER BY col2) AS col1_minus_col2 
FROM ...


标签: sql vertica