How big is an Oracle XMLType when stored as BINARY XML

Posted 2019-04-20 03:15

Question:

The Oracle documentation claims that XMLType is stored more compactly as BINARY XML than as CLOB. But how do I find out how much space the binary XML actually takes?

CREATE TABLE t (x XMLTYPE) XMLTYPE x STORE AS BINARY XML;

SELECT vsize(x), dbms_lob.getlength(XMLTYPE.getclobval(x)) FROM t;

94 135254
94  63848
94  60188

So, vsize seems to be the size of some sort of pointer or LOB locator, and getclobval unpacks the binary XML into text. But what about the storage size of the binary XML itself?

Please help, the table size is 340GB, so it's worth looking into storage options...

Answer 1:

Oracle's binary XML format is the "Compact Schema Aware XML Format", abbreviated as CSX. The encoded data is stored in a BLOB field. Details about the binary XML format are available in the Oracle documentation (here and here).

The real size of the data field depends on the LOB storage parameters of the XMLType column. For example, if the storage-in-row option is enabled, small documents are stored directly in the row with the other data, and vsize() returns appropriate values.
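
To check whether in-row storage applies to the column, a query along these lines against user_lobs may help. This is only a sketch: the table name T is taken from the question, and the LOB may be listed under a system-generated column name rather than X.

select table_name, column_name, segment_name, in_row, chunk
from user_lobs
where table_name = 'T'

in_row shows whether small values may be kept inside the table row, and chunk shows the LOB chunk size.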

In reality Oracle creates an underlying BLOB column with a system-generated name, which can be found by querying the user_tab_cols view:

select table_name, column_name, data_type 
from user_tab_cols 
where 
  table_name = 'T' and hidden_column = 'YES'
  and
  column_id = (
      select column_id 
      from user_tab_cols 
      where table_name = 'T' and column_name = 'X'
  ) 

This query returns the name of the hidden system column, which looks like SYS_NC00002$.

After that, the size of each field can be obtained with a regular dbms_lob.getlength() call against the hidden column:

select dbms_lob.getlength(SYS_NC00002$) from t
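
As a rough sketch, the same call can be aggregated to get a total for the whole table. It assumes the hidden column is named SYS_NC00002$ as in the example above; the name is system-generated, so substitute whatever the user_tab_cols query returns. The column aliases are made up for illustration.

select count(*) as doc_count,
       sum(dbms_lob.getlength(SYS_NC00002$)) as binary_xml_bytes,
       round(avg(dbms_lob.getlength(SYS_NC00002$))) as avg_binary_xml_bytes
from t

Note that this adds up the logical length of the encoded documents, which is not the same as the space allocated to the LOB segment on disk.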


Answer 2:

Actual storage consumption is reported in a view called user_segments. To find the LOB segment that corresponds to the column, you have to join user_segments with user_lobs:

CREATE TABLE clob_table (x XMLTYPE) XMLTYPE x store as CLOB;

CREATE TABLE binaryxml_table (x XMLTYPE) XMLTYPE x STORE AS BINARY XML;

INSERT INTO clob_table (x) SELECT
  XMLELEMENT("DatabaseObjects",
    XMLAGG(
      XMLELEMENT("Object", XMLATTRIBUTES(owner, object_type as type, created, status), object_name)
    )
  ) as x
FROM all_objects;

INSERT INTO binaryxml_table (x) select
  XMLELEMENT("DatabaseObjects",
    XMLAGG(
      XMLELEMENT("Object", XMLATTRIBUTES(owner, object_type as type, created, status), object_name)
    )
  ) as x
FROM all_objects;

SELECT lobs.table_name,
  (SELECT column_name
     FROM user_tab_cols
       WHERE table_name = lobs.table_name AND data_type = 'XMLTYPE'  AND column_id =
         (SELECT column_id
            FROM user_tab_cols
              WHERE table_name = lobs.table_name AND column_name = lobs.column_name
          )
    ) column_name,
  seg.segment_name, seg.bytes
    FROM user_lobs lobs, user_segments seg
      WHERE lobs.segment_name = seg.segment_name;

TABLE_NAME      COLUMN_NAME SEGMENT_NAME                 BYTES
--------------- ----------- ------------------------- --------
BINARYXML_TABLE X           SYS_LOB0000094730C00002$$  7536640 
CLOB_TABLE      X           SYS_LOB0000094727C00002$$ 19922944 
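
The figures above cover only the LOB segments. As a sketch, the table segment and the LOB index segment could be listed as well by also matching on the table name and on user_lobs.index_name. This assumes non-partitioned tables with a single LOB column each, as in the two test tables above, so that the table segment is named after the table and appears only once.

SELECT l.table_name, s.segment_name, s.segment_type, s.bytes
  FROM user_lobs l
  JOIN user_segments s
    ON s.segment_name IN (l.table_name, l.segment_name, l.index_name)
 WHERE l.table_name IN ('CLOB_TABLE', 'BINARYXML_TABLE')
 ORDER BY l.table_name, s.segment_type;

Adding up bytes per table_name then gives an approximate total footprint for each storage option.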


Answer 3:

[Rep issue, not allowed to post comments] You wanted to say "between questions", as I understood it. The only similarity is the storage space issue; I thought it might be helpful for a "guess" estimation. You didn't mention what type of data you are going to store as binary XML.

"unpacks the binary XML into text"

If it is pure XML, then the size depends on which compressor you are going to use. Usually LZMA or gzip is used for binary compression. Maybe I am stating the obvious, but that's all I know.