I have image data stored as bytea in a PostgreSQL table column, along with metadata for interpreting it; the relevant fields are the image dimensions and the class. Classes include int16 and uint16. I cannot find any information on how to interpret signed versus unsigned integers correctly in R.
I am using RPostgreSQL to pull the data into R and I want to view the image in R.
MWE:
# fakeDataQuery <- dbGetQuery(conn,
# 'select byteArray, ImageSize, ImageClass from table where id = 1')
# Example 1 (no negative numbers)
# the actual byte array shown in octal sequences in pgadmin (1.22.2) Query Output is:
# "\001\000\002\000\003\000\004\000\005\000\006\000\007\000\010\000\011\000"
# but RPostgreSQL returns the hex-encoded version:
byteArray <- "\\x010002000300040005000600070008000900"
ImageSize <- c(3, 3, 1)
ImageClass <- 'int16'
# expected result
array(c(1,2,3,4,5,6,7,8,9), dim=c(3,3,1))
# , , 1
#
# [,1] [,2] [,3]
#[1,] 1 4 7
#[2,] 2 5 8
#[3,] 3 6 9
# Example 2 (with negative numbers)
byteArray <- "\\xffff00000100020003000400050006000700080009000a00"
ImageSize <- c(3, 4, 1)
ImageClass <- 'int16'
# expectedResult
array(c(-1,0,1,2,3,4,5,6,7,8,9,10), dim=c(3,4,1))
#, , 1
#
# [,1] [,2] [,3] [,4]
#[1,] -1 2 5 8
#[2,] 0 3 6 9
#[3,] 1 4 7 10
What I've tried:
The bytea data from PostgreSQL arrives as a long character string of hexadecimal digits ("hex" format), which you can tell by the \\x prepended to it (I believe the extra \ is there to escape the existing one?): https://www.postgresql.org/docs/9.1/static/datatype-binary.html (see section 8.4.1, 'bytea Hex Format')
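A quick way to check the escaping guess, as a minimal sketch: the variable name hexString and the shortened value are made up for illustration. If the extra backslash is only R's escape syntax, the string should contain a single backslash and be six characters long.

```r
# Hypothetical shortened value, written the same way as byteArray above
hexString <- "\\x0100"

# Count the characters actually stored in the string
n <- nchar(hexString)            # 6: one backslash, 'x', and four hex digits

# The first two characters are the hex-format marker: backslash + x
prefix <- substr(hexString, 1, 2)

# cat() shows the raw contents without R's escaping
cat(hexString, "\n")
```

print() shows the doubled backslash because it displays the string in escaped form; cat() shows what is really stored.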
Decode 'hex' back to the original type ('int16' based on ImageClass)
Per the same url above, hex encoding uses '2 hexadecimal digits per byte'. So I need to split the encoded byteArray into the appropriate length substrings, see: this link
# remove the \\x hex encoding indicator(s) added by PostgreSQL
byteArray <- gsub("\\x", "", x = byteArray, fixed = TRUE)
l <- 2 # hex digits per byte (substring length)
byteArray <- strsplit(trimws(gsub(pattern = paste0("(.{",l,"})"),
replacement = "\\1 ",
x = byteArray)),
" ")[[1]]
# each 16-bit value is stored little-endian (least significant byte first),
# e.g., 1 is stored as '0100' rather than '0001',
# so swap each pair of bytes (int16 specific)
byteArray <- paste0(byteArray[c(FALSE, TRUE)], byteArray[c(TRUE, FALSE)])
# strtoi() converts a character vector of hex strings to integers (base 16)
byteArray <- strtoi(byteArray, 16L)
# now make it into an n x m x s array,
# e.g., 512 x 512 x (# slices)
V <- array(byteArray, dim = ImageSize)
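The byte swap in the code above is consistent with little-endian storage, where the low-order byte of each 16-bit value comes first. A small sketch of the same arithmetic done numerically rather than by re-pasting strings; the two-value input here is a made-up example, not data from the table:

```r
# Hypothetical example: the values 1 and 10 as little-endian 16-bit words
pairs <- c("01", "00", "0a", "00")

# One integer (0..255) per byte
bytes <- strtoi(pairs, 16L)

# In little-endian order the first byte of each pair is the low byte
low  <- bytes[c(TRUE, FALSE)]
high <- bytes[c(FALSE, TRUE)]

# Unsigned 16-bit value = low byte + 256 * high byte
values <- low + 256L * high
```

This gives the same unsigned result as pasting the high byte in front of the low byte and calling strtoi() on the four-digit string.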
There are two problems with this solution:
- It does not handle signed types: negative values are read as unsigned (e.g., 'ffff' is -1 as int16 but 65535 as uint16, and strtoi() always returns 65535).
- It is currently hard-coded for 16-bit types and would need extra code to work with other types (e.g., int32, int64).
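One direction that might address both points, as a sketch rather than a confirmed solution: convert the hex digits to a raw vector and let base R's readBin() handle the width, sign, and byte order. The helper name decode_bytea and its class-to-size table are assumptions made for illustration, and little-endian order is assumed from the examples above. uint32 and int64 are left out because R's integer type is 32-bit signed.

```r
# Hypothetical helper: decode a PostgreSQL hex-format bytea string into an array
decode_bytea <- function(hex_string, image_size, image_class) {
  # Drop the leading backslash-x marker
  hex <- sub("\\x", "", hex_string, fixed = TRUE)

  # Two hex digits per byte -> raw vector
  starts <- seq(1, nchar(hex), by = 2)
  bytes <- as.raw(strtoi(substring(hex, starts, starts + 1), 16L))

  # Assumed mapping from class name to byte width and signedness
  spec <- switch(image_class,
    int8   = list(size = 1, signed = TRUE),
    uint8  = list(size = 1, signed = FALSE),
    int16  = list(size = 2, signed = TRUE),
    uint16 = list(size = 2, signed = FALSE),
    int32  = list(size = 4, signed = TRUE),
    stop("unsupported class: ", image_class)
  )

  # readBin() interprets the raw bytes; endian = "little" is an assumption
  values <- readBin(bytes, what = "integer",
                    n = length(bytes) / spec$size,
                    size = spec$size, signed = spec$signed,
                    endian = "little")

  array(values, dim = image_size)
}

# Example 2 from above
V <- decode_bytea("\\xffff00000100020003000400050006000700080009000a00",
                  c(3, 4, 1), "int16")
```

With "uint16" instead of "int16", the same 'ffff' bytes would come back as 65535 rather than -1.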
Does anyone have a solution that works with signed types?