Google Cloud: matrix multiplication with Bigquery

2020-03-06 03:02发布

问题:

I am using Google Analytics and processing the data with Bigquery, I need to do a matrix multiplication.

What is the most feasible way of implementing matrix multiplication in Google Cloud? Can it be done directly in Bigquery?

回答1:

Assuming MatrixA is a table with below columns:

i, k, value   

and MatrixB - has schema as

k, j, value    

and also assuming that range of k-values is the same in both tables:

This would mimic below matrices :

Matrix A
 2 -3  4
-1  0  2

Matrix B
-1  2  3
 0  1  7
 1  1 -2

Below code for multiplication is for BigQuery Standard SQL

#standardSQL
WITH MatrixA AS (
  SELECT 1 AS i, 1 AS k, 2 AS val UNION ALL
  SELECT 1, 2, -3 UNION ALL
  SELECT 1, 3, 4 UNION ALL
  SELECT 2, 1, -1 UNION ALL
  SELECT 2, 2, 0 UNION ALL
  SELECT 2, 3, 2 
), MatrixB AS (
  SELECT 1 AS k, 1 AS j, -1 AS val UNION ALL
  SELECT 1, 2, 2 UNION ALL
  SELECT 1, 3, 3 UNION ALL
  SELECT 2, 1, 0 UNION ALL
  SELECT 2, 2, 1 UNION ALL
  SELECT 2, 3, 7 UNION ALL
  SELECT 3, 1, 1 UNION ALL
  SELECT 3, 2, 1 UNION ALL
  SELECT 3, 3, -2 
)
SELECT i, j, SUM(a.val * b.val) val
FROM MatrixA AS a
CROSS JOIN MatrixB AS b
WHERE a.k = b.k
GROUP BY i, j
ORDER BY i, j   

result will be as below

Row i   j   val  
1   1   1   2    
2   1   2   5    
3   1   3   -23  
4   2   1   3    
5   2   2   0    
6   2   3   -7     

which represents MatrixA * MatrixB

2   5  -23
3   0   -7

as a note: you can use

FROM MatrixA AS a
JOIN MatrixB AS b
ON a.k = b.k  

instead of

FROM MatrixA AS a
CROSS JOIN MatrixB AS b
WHERE a.k = b.k   

just matter of your preferences



回答2:

Suppose a matrix with two columns, a path and a value column. You want to get the adjacency matrix P P^t. With 6 paths, you wanna get 36 values or 6x6 matrix that can be done with matrix multiplication.

I refactor Mikhail example for a path adjacency matrix given some tuple (Path, value). The demo contains 6 paths so the resulting adjacency matrix has dimensions 6x6. It is now in a straight format, but I would like to get it into a matrix formar or crosstab format with 6x6 form.

--standardSQL
WITH MatrixA AS (
  SELECT 1 AS p, 2 AS val UNION ALL
  SELECT 2, -3 UNION ALL
  SELECT 3, 4 UNION ALL
  SELECT 4, -1 UNION ALL
  SELECT 5, 0 UNION ALL
  SELECT 6, 2 
), MatrixB AS (
  SELECT 1 AS p, -1 AS val UNION ALL
  SELECT 2, 2 UNION ALL
  SELECT 3, 3 UNION ALL
  SELECT 4, 3 UNION ALL
  SELECT 5, 0 UNION ALL
  SELECT 6, 1
),
matrixMultiplication AS
(
SELECT a.p AS ap, b.p as bp, SUM(a.val * b.val) val
FROM MatrixA AS a
CROSS JOIN MatrixB AS b
GROUP BY a.p, b.p
ORDER BY a.p, b.p
)

--36 elements for the 6x6 PATHS Matrix
--TODO: how to shape it to 6x6 matrix?
SELECT * FROM matrixMultiplication

where the shaping question is now here about shaping the straight table into more traditional matrix multiplication format.