How to change the format of all dates in xml?

2019-07-13 06:59发布

问题:

I have an xml in a cell in SQL, like:

Table:

    <?xml version="1.0" encoding="utf-16"?>
    <Document xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Page W="2480" H="3516">
            <Word L="1871" R="2031" T="221" B="252" Text="INVOICE" Id="25509747671106" />
            <Word L="1988" R="2046" T="2232" B="2279" Text="tf.'l" Id="25886807122412" />
            <Word L="1872" R="1990" T="324" B="351" Text="26603333345"Id="24493329746300" />
            <Word L="1871" R="2015" T="373" B="401" Text="08-02-17" Id="25109308586898" />
            <Word L="1873" R="2007" T="422" B="448" Text="S-44404" Id="24914704754685" />
            <Word L="1874" R="1887" T="468" B="496" Text="1" Id="22024234663427" />
            <Word L="1068" R="1148" T="1278" B="1309" Text="DHL" Id="8152496756181" />
            <Word L="1692" R="1848" T="1279" B="1310" Text="08-02-17" Id="21119731019927" />
            <Word L="2096" R="2251" T="1278" B="1310" Text="10-01-17" Id="31333127836454" />
            <Word L="112" R="243" T="1352" B="1358" Text="_" Id="365589546232" />
            <Word L="252" R="411" T="1322" B="1350" Text="QUANTITY" Id="1050334834310" />
            <Word L="1415" R="1913" T="745" B="787" Text="______ShlpTo" Id="22635743273663" />
     </Page>
   </Document>

I need to update [FRData] and change the date from format mm-dd-yy to format yyyy-mm-dd in all xml.

I wrote a regular expression that validates the date format sought:

^(0?[1-9]|1[012])[\-](0?[1-9]|[12][0-9]|3[01])[\-]\d{2}$

I know how to change the date format mm-dd-yy to yyyy-mm-dd:

select left(convert(varchar, cast('08-02-17' as datetime), 120),10)

But I do not know how to change dates in the entire xml

回答1:

Try this please.

declare @I int
declare @X nvarchar(100)
declare @D date

select @I = max(FRData.value('count(/Document/Page//Word)', 'int'))
from #t

while @I > 0
begin

    set @X = (select top 1 FRData.value('(/Document/Page//Word/@Text)[sql:variable("@I")][1]', 'nvarchar(100)')
              from #t)

    if isdate(@X) = 1
    begin
        set @D = convert(date, @X)

        update #t
        set FRData.modify('replace value of ((/Document/Page//Word/@Text)[sql:variable("@I")])[1]
                           with sql:variable("@D")')
    end

    set @I = @I - 1
end


回答2:

Try xml linq

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            StreamReader reader = new StreamReader(FILENAME,Encoding.Unicode);
            reader.ReadLine();
            XDocument doc = XDocument.Load(reader);

            foreach (XElement word in doc.Descendants("Word"))
            {
                DateTime date;
                Boolean isDate = DateTime.TryParse((string)word.Attribute("Text"), out date);
                if (isDate)
                {
                    word.Attribute("Text").SetValue(date.ToString("yyyy-MM-dd"));
                }

            }

        }
    }
}


回答3:

You are probably looking for: SET DATEFORMAT

SET DATEFORMAT ymd;

Docs: https://docs.microsoft.com/en-us/sql/t-sql/statements/set-dateformat-transact-sql

But it will depend on your script to create xml, which is missing because it can be necessary to convert/cast your date format.



回答4:

You are dealing with a very bad XML:

  • You should store XML in a natively typed XML column. The leading <?xml version="1.0" encoding="utf-16"> shows clearly, that you are storing this as some sort of string. XML is not stored as the text you see, but as a hierarchy tree. This makes XML astonishingly fast. Your approach will have to parse the XML over and over...
  • You should never use culture dependant date/time formats
  • Even worse: never use a date-format with a 2-byte year (01-02-03 will be taken as completely different values depending on the system's settings!)
  • The Xml is not valid: Text="26603333345"Id="24493329746300" has missing a blank before Id

Your own approach is dangerous:

I know how to change the date format mm-dd-yy to yyyy-mm-dd: select left(convert(varchar, cast('08-02-17' as datetime), 120),10)

The first step is cast('08-02-17' as datetime). This will rely on your system's settings implicitly

Better try select left(convert(varchar, CONVERT(DATETIME,'08-02-17',1), 120),10)

The accepted answer uses if isdate(@X) = 1 which is dangerous due to the same reason.

But what you are trying to achieve is good: yyyy-m-dd is ISO8601 which should be a date's format within XML:

You might use XQuery with FLWOR for this:

DECLARE @dummy TABLE(ID INT IDENTITY,YourXML NVARCHAR(MAX));
INSERT INTO @dummy VALUES
(N'<Document xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Page W="2480" H="3516">
            <Word L="1871" R="2031" T="221" B="252" Text="INVOICE" Id="25509747671106" />
            <Word L="1988" R="2046" T="2232" B="2279" Text="tf.''l" Id="25886807122412" />
            <Word L="1872" R="1990" T="324" B="351" Text="26603333345" Id="24493329746300" />
            <Word L="1871" R="2015" T="373" B="401" Text="08-02-17" Id="25109308586898" />
            <Word L="1873" R="2007" T="422" B="448" Text="S-44404" Id="24914704754685" />
            <Word L="1874" R="1887" T="468" B="496" Text="1" Id="22024234663427" />
            <Word L="1068" R="1148" T="1278" B="1309" Text="DHL" Id="8152496756181" />
            <Word L="1692" R="1848" T="1279" B="1310" Text="08-02-17" Id="21119731019927" />
            <Word L="2096" R="2251" T="1278" B="1310" Text="10-01-17" Id="31333127836454" />
            <Word L="112" R="243" T="1352" B="1358" Text="_" Id="365589546232" />
            <Word L="252" R="411" T="1322" B="1350" Text="QUANTITY" Id="1050334834310" />
            <Word L="1415" R="1913" T="745" B="787" Text="______ShlpTo" Id="22635743273663" />
     </Page>
   </Document>');

WITH CastedToXML AS
(
    SELECT ID 
          ,CAST(YourXML AS XML) AS TheXml
    FROM @dummy
)
SELECT ID 
      ,TheXml.query
      (N'
        <Document><Page>
        {
            for $w in /Document/Page/Word
            return 
            if(string-length($w/@Text)=8 and substring($w/@Text,3,1)="-" and substring($w/@Text,6,1)="-") 
            then
                <Word L="{$w/@L}" R="{$w/@R}" T="{$w/@T}" B="{$w/@B}" Text="{concat("20",substring($w/@Text,7,2),"-",substring($w/@Text,1,5))}" Id="{$w/@Id}" />
            else
                $w
        }
        </Page></Document>
      ')
FROM CastedToXML;

The result:

<Document>
  <Page>
    <Word L="1871" R="2031" T="221" B="252" Text="INVOICE" Id="25509747671106" />
    <Word L="1988" R="2046" T="2232" B="2279" Text="tf.'l" Id="25886807122412" />
    <Word L="1872" R="1990" T="324" B="351" Text="26603333345" Id="24493329746300" />
    <Word L="1871" R="2015" T="373" B="401" Text="2017-08-02" Id="25109308586898" />
    <Word L="1873" R="2007" T="422" B="448" Text="S-44404" Id="24914704754685" />
    <Word L="1874" R="1887" T="468" B="496" Text="1" Id="22024234663427" />
    <Word L="1068" R="1148" T="1278" B="1309" Text="DHL" Id="8152496756181" />
    <Word L="1692" R="1848" T="1279" B="1310" Text="2017-08-02" Id="21119731019927" />
    <Word L="2096" R="2251" T="1278" B="1310" Text="2017-10-01" Id="31333127836454" />
    <Word L="112" R="243" T="1352" B="1358" Text="_" Id="365589546232" />
    <Word L="252" R="411" T="1322" B="1350" Text="QUANTITY" Id="1050334834310" />
    <Word L="1415" R="1913" T="745" B="787" Text="______ShlpTo" Id="22635743273663" />
  </Page>
</Document>

You might use the same approach for an immediate update with an updateable CTE:

WITH CastedToXML AS
(
    SELECT ID 
          ,YourXML 
          ,CAST(YourXML AS XML) AS TheXml
    FROM @dummy
)
UPDATE CastedToXML SET YourXml=CAST(TheXml.query
      (N'
        <Document><Page>
        {
            for $w in /Document/Page/Word
            return 
            if(string-length($w/@Text)=8 and substring($w/@Text,3,1)="-" and substring($w/@Text,6,1)="-") 
            then
                <Word L="{$w/@L}" R="{$w/@R}" T="{$w/@T}" B="{$w/@B}" Text="{concat("20",substring($w/@Text,7,2),"-",substring($w/@Text,1,5))}" Id="{$w/@Id}" />
            else
                $w
        }
        </Page></Document>
      ') AS NVARCHAR(MAX));

SELECT * FROM @dummy;

Final advise: Store this in a natively typed column!