Parsing large XML file with PL/SQL

2019-02-19 12:53发布


I'm trying to parse quite large XML file. FIle is large enough to make impossible to use VARCHAR2, so I've decided to use CLOB. Code looks fine, but I still get an error.
Here is sample of xml file I'm parsing:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<SeznamOvmIndex>
    <Subjekt>
        <Zkratka>CUZK</Zkratka>
        <ICO>00025712</ICO>
        <Nazev>Český úřad zeměměřický a katastrální</Nazev>
        <AdresaUradu>
            <AdresniBod>25133616</AdresniBod>
            <UliceNazev>Pod sídlištěm</UliceNazev>
            <CisloDomovni>1800</CisloDomovni>
            <CisloOrientacni>9</CisloOrientacni>
            <ObecNazev>Praha</ObecNazev>
            <ObecKod>554782</ObecKod>
            <PSC>18200</PSC>
            <KrajNazev>Hlavní město Praha</KrajNazev>
        </AdresaUradu>
        <Email>
            <Polozka>
                <Typ text="podatelna">2</Typ>
                <Email>cuzk@cuzk.cz</Email>
                <Poznamka>Elektronická podatelna ČÚZK</Poznamka>
            </Polozka>
        </Email>
        <TypSubjektu id="11">Orgán státní správy</TypSubjektu>
        <PravniForma type="325">Organizační složka státu</PravniForma>
        <PrimarniOvm>Ano</PrimarniOvm>
        <IdDS>uuaaatg</IdDS>
        <TypDS>OVM</TypDS>
        <StavDS>1</StavDS>
        <StavSubjektu>1</StavSubjektu>
            <DetailSubjektu>http://seznam.gov.cz/ovm/datafile.do?format=xml&amp;service=seznamovm&amp;id=CUZK</DetailSubjektu>
    </Subjekt>
</SeznamOvmIndex>

And this is code:
SET SERVEROUTPUT ON

DECLARE
  xmlClob CLOB;
  xmlFile UTL_FILE.FILE_TYPE;
  x XMLType;
 BEGIN
   xmlFile := UTL_FILE.FOPEN('XMLPARSERADRESYCUZK', 'pokus.xml','R');
  LOOP
    BEGIN
      UTL_FILE.GET_LINE(xmlFile,xmlClob,NULL);
    EXCEPTION WHEN No_Data_Found THEN EXIT; END;
  END LOOP;
  UTL_FILE.FCLOSE(xmlFIle);
  x := XMLType.createXML(xmlClob);
  FOR r IN (
   SELECT ExtractValue(Value(p),'/Subjekt/Zkratka/text()') as kod
         ,ExtractValue(Value(p),'/Subjekt/AdresaUradu/UliceNazev/text()') as ulice
         ,ExtractValue(Value(p),'/Subjekt/AdresaUradu/CisloDomovni/text()') as  cislo_domovni
         ,ExtractValue(Value(p),'/Subjekt/AdresaUradu/CisloOrientacni/text()') as cislo_orientacni
    FROM   TABLE(XMLSequence(Extract(x,'/SeznamOvmIndex/Subjekt'))) p
   WHERE ExtractValue(Value(p),'/Subjekt/Zkratka/text()') = 'CUZK'
   ) LOOP
      dbms_output.put_line(r.kod||' '||r.ulice||' '||r.cislo_domovni||'/'||r.cislo_orientacni);
   END LOOP;
 END;

I think that it shoud run right, but when I got SQL Developer I got:

Error at line 1 ORA-06512: na "SYS.XMLTYPE", line 5 ORA-06512: na line 13 31011. 00000 - "XML parsing failed" *Cause: XML parser returned an error while trying to parse the document. *Action: Check if the document to be parsed is valid.

1条回答
该账号已被封号
2楼-- · 2019-02-19 13:45

You are reading the file line by line, but overwritting your xmlClob with each line, not appending. You could build up the CLOB by reading into a varchar2 buffer and appending, but you can also use the DBMS_LOB built-in procedures to do it for you:

DECLARE
  xmlClob CLOB;
  xmlFile BFILE;
  x XMLType;

  src_offset number := 1 ;
  dest_offset number := 1 ;
  lang_ctx number := DBMS_LOB.DEFAULT_LANG_CTX;
  warning integer;
BEGIN
  xmlFile := BFILENAME('XMLPARSERADRESYCUZK', 'pokus.xml');
  DBMS_LOB.CREATETEMPORARY(xmlClob, true);
  DBMS_LOB.FILEOPEN(xmlFile, DBMS_LOB.FILE_READONLY);
  DBMS_LOB.LOADCLOBFROMFILE(xmlClob, xmlFile, DBMS_LOB.LOBMAXSIZE, src_offset,
    dest_offset, DBMS_LOB.DEFAULT_CSID, lang_ctx, warning);
  x := XMLType.createXML(xmlClob);
  DBMS_LOB.FILECLOSEALL();
  DBMS_LOB.FREETEMPORARY(xmlClob);
  FOR r IN (
...

When I use that and load your file I get the output:

CUZK Pod smdli.t.m 1800/9

You probably want some error checkign around the DBMS_LOB calls, this is just a simple demo.

查看更多
登录 后发表回答