SQL concatenate text fields from different records

2020-07-18 07:20发布

问题:

Were talking about data in a file on an iSeries here using DB2.

File structure is as such:

Item     Textline    Text

12755    1           this item
12755    2           is no longer
12755    3           for sale
abc123   1           please use
abc123   2           another code
xyz987   1           obsolete

The result im trying to get is

   Item     Text

   12755    this item is no longer for sale
   abc123   please use another code
   xyz987   obsolete

So it is grouped by Item code and the text lines are added up regardless of how many there are.

I tried it in query first by joining the file to itself 5 times (there can be as many as 5 text lines) but i could never get it to work.

I tried it in SQL but i cant get the commands im familair with to work correctly with text and im struggling on the criteria to add them together without duplicating the text lines up.

Anyone any idea's ?

My last resort is to extract this file 5 times over itself with each file just containing a different text line and then matching them all back together based on item and concatenating that way... but wow how messy that is :)

Please help :)

回答1:

DB2 for i can do this.

Perhaps the most concise way is to use an often overlooked feature, that the other DB2 platforms do not have -- a type of recursive query known as a hierarchical query. Like many things, it's fairly simple once you understand it, but it deserves a bit of explaining to understand how it works.

A recursive or hierarchical query helps you connect one row to one or more others. In this case we are making a chain of one to one connections.

The aggregate SYS_CONNECT_BY_PATH function will concatenate strings along the path from a starting (or "root") row to the end (or "leaf") row.

Let's start by putting your sample data into a table.

declare global temporary table snippets
( item    varchar(10)
, seq     smallint
, words   varchar(30)
);

insert into snippets  
  values 
   ('12755',  1, 'this item')
  ,('12755',  2, 'is no longer')
  ,('12755',  3, 'for sale')
  ,('abc123', 1, 'please use')
  ,('abc123', 2, 'another code')
  ,('xyz987', 1, 'obsolete')
;

Let's look at the hierarchical query we will use as a first phase building block.

 SELECT item
      , seq
      , CONNECT_BY_ISLEAF as flag
      , SYS_CONNECT_BY_PATH(words, ' ') as phrase
   FROM snippets
   START WITH seq = 1
   CONNECT BY PRIOR item = item and PRIOR (seq + 1) = seq

The CONNECT BY clause defines the join conditions from one row in the table to another.

The START WITH clause specifies which row(s) to begin joining from, the "root" row. So we are going to start with rows where seq=1, and join to the next seq value (seq+1). That row will join to the following seq value, and so on.

CONNECT_BY_ISLEAF will indicate when we are at the last row in the chain. You'll in a moment why I used it here.

Here are the results:

    ITEM   SEQ  FLAG  PHRASE

    12755   1    0    this item
    12755   2    0    this item is no longer
    12755   3    1    this item is no longer for sale
    abc123  1    0    please use
    abc123  2    1    please use another code
    xyz987  1    1    obsolete

Now all we need to do is put that in a common table expression, from which we can take only the rows we want. The "leaf" rows at the end of the chain have the completed string, so those are the ones to pick.

with q as
(
  SELECT item
       , seq
       , CONNECT_BY_ISLEAF as flag
       , SYS_CONNECT_BY_PATH(words, ' ') as phrase
    FROM snippets
    START WITH seq = 1
    CONNECT BY PRIOR item = item and PRIOR (seq + 1) = seq
)
SELECT item
     , phrase
  FROM q
  WHERE flag = 1
  ORDER BY item
;

Which gives us:

    ITEM     PHRASE

    abc123   please use another code    
    xyz987   obsolete   
    12755    this item is no longer for sale    

There you have it.

Folks on other DB2 platforms will have to find another solution, such as a regular recursive query in funkworm's comment above.



回答2:

If you want to create a User-defined function that can do this sort of thing, here's one I wrote to accomplish an essentially identical task:

create function GetTextBlock(inItemNo decimal(13,0))
returns varchar(1200)
begin atomic
    declare outstr varchar(1200) default '';
    for list_comments as
            select tltx60
            from msytxl
            where tlcono = 1 
              and tldivi = '' 
              and tltxid = inItemNo 
            order by tllino
    do
            set outstr = outstr || trim(tltx60) || ' ';
    end for;
    set outstr = rtrim(outstr);
    return outstr;
end
go

Then you can use it in a SELECT statement like so:

Select ItemNo, GetTextBlock(ItemNo)
from ItemMaster

(This code was written for, and works on, DB2 for i V6R1)



回答3:

V5R4

declare global temporary table qtemp.snippets
( item    varchar(10)
, seq     smallint
, words   varchar(30)
);

insert into qtemp.snippets  
  values 
   ('12755',  1, 'this item')
  ,('12755',  2, 'is no longer')
  ,('12755',  3, 'for sale')
  ,('abc123', 1, 'please use')
  ,('abc123', 2, 'another code')
  ,('xyz987', 1, 'obsolete')
;

with q(item, seq, words, phrase) as
(
    SELECT s.item
       , s.seq
       , s.words
       , s.words phrase 
    FROM qtemp.snippets s
    where s.seq = 1
    union all 
    SELECT s1.item
       , s1.seq
       , s1.words
       , q.Phrase || ' ' || s1.words phrase  
    FROM qtemp.snippets s1
    join q on s1.item = q.item
        and s1.seq = q.seq +1
)
select q.item, q.phrase from q 
join (
Select q.item, max(q.seq) maxseq from q
group by item
 ) q1 on q.item = q1.item and q.seq = q1.maxseq 


回答4:

DB2 for i does not have an aggregate function that could be used for this purpose nor does it support user defined aggregate functions.

The only options are a stored procedure or 'client' side aggregation.



回答5:

@WarrenT -- after some 16+ years on the i, I thought I was one of those cats... but you have disturbingly proved me wrong. :) Great response on the SQL. And yes, I see a few opportunities to use it right away!!!

Unfortunately, I am limited to V5R4... will that work as a stored procedure call from, say a .NET web app?