Started here, I was asked to create a new post in order to flesh it out which I'll try to do here but I'm afraid it's rather wordy. I created a test table with what seems to be a good cross-section of the data that needs to be parsed. Using this query from here to which I added some additional fields that are needed, this is the query now.
SELECT ID, PartNo,
GROUP_CONCAT(
REPLACE(t.part, '-', CONCAT(' ', LEFT(t.part, 2)))
ORDER BY t.ModelList
SEPARATOR ' '
) Models
FROM (
SELECT t.ID, t.PartNo, t.Models, p.ModelList,
REPLACE(REPLACE(
SUBSTRING_INDEX(t.Models, ';', p.ModelList),
SUBSTRING_INDEX(t.Models, ';', p.ModelList - 1),
''
), ';', '') part
FROM parts_listing t CROSS JOIN (
SELECT 1 ModelList UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) p
WHERE REPLACE(REPLACE(Models, '-', ''), ';', '') REGEXP '^[0-9]*$'
) t
WHERE t.part <> ''
GROUP BY t.ID, t.PartNo, t.Models
ORDER BY t.PartNo, t.Models
For the most basic parts, the Models data looks like this:
2206-13-26-33; 2302-06-13-32-33
. . . and this is the part that I am working on now. The Models column also contains ALL 23RD etc that will also need to be parsed somehow but for now I am concerned only with the data as shown above and I need to parse it to something like:
2206 2213 2226 2233 2302 2306 2313 2332 2333
The query above, however, seems to be repeating the first bit so it comes out as:
2206 2213 2226 2233 2202 2206 2213 2232 2233
It gives the result in one chunk but it would also be okay if each result was on it's own line as ultimately it will be dumped into another table and that is the way it will be by then.
It also gives too few results. The actual table contains over 15,000 rows but this is fetching only a small fraction and in the data sample below, it should fetch about 20 rows but actually fetches only 6 although the fiddle, which has a very limited data set, does indeed give the proper results. Running the the fiddle query against my own data fetches only 555 rows.
A cross-section sample of data can be created with this:
DROP TABLE IF EXISTS `parts_listing`;
CREATE TABLE IF NOT EXISTS `parts_listing` (
`ID` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`GroupNumber` varchar(20) DEFAULT NULL,
`PartNo` varchar(30) DEFAULT NULL,
`Models` varchar(255) DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=34 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
INSERT INTO `parts_listing` (`ID`, `GroupNumber`, `PartNo`, `Models`) VALUES
(1, '0.000', '391906', '2201-02-11-20-22-32-40; 2301'),
(2, '0.000', '391907', '2206-26-33'),
(3, '0.000', '399391', '2213'),
(4, '0.000', '415209', '2302-06-32-33'),
(5, '0.000', '415245', '2313'),
(6, '0.000', '437273', 'ALL 24TH; 25TH; 26TH'),
(7, '0.000', '436803', '2631 (BODY 2678); 5431 (BODY 5478)'),
(8, '0.000', '448611', 'ALL 54TH'),
(9, '0.0015', '372174', 'ALL 22ND; 23RD'),
(10, '0.0015', '399324', '2201-02-11-20-22-32-40; 2301'),
(11, '0.0015', '399324', '2206-13-26-33; 2302-06-13-32-33'),
(12, '0.0015', '372146', 'ALL 22ND; 23RD'),
(13, '0.0015', 'G179359', '2201-02-11-20-22-32-40; 2301'),
(14, '0.0015', 'G179559', '2206-13-26-33; 2302-06-13-32-33'),
(15, '0.0015', '422024', '2201-02-11-20-22-32-40; 2301'),
(16, '0.0015', '422024', '2206-13-26-33; 2302-06-13-32-33'),
(17, '0.0016', '378664', 'ALL 22ND; 23RD'),
(18, '0.0016', 'G178532', 'ALL 22ND; 23RD'),
(19, '0.0016', 'G123197', 'ALL 22ND; 23RD'),
(20, '30.793', '448421', 'ALL 54TH'),
(21, '30.7932', '422762', '2462-65-92-95-98; 2562-65-92-95; 2662-65-92-95'),
(22, '30.7932', '448902', '5462-65-67-82-85-92-95-97'),
(23, '30.7938', '399941', 'ALL 22ND; 23RD'),
(24, '30.7938', '425809', 'ALL 24TH; 25TH; 26TH'),
(25, '30.7938', '447799', 'ALL 54TH'),
(26, '30.79383', '399927', 'ALL 22ND; 23RD; 24TH; 25TH; 26TH'),
(27, '30.79396', 'PA437550', '2562-65-77-79; 2677-79'),
(28, '30.79398', '422470', '2452-72; 2552-72'),
(29, '30.79398', '422471', '2452-72; 2552-72'),
(30, '30.79398', '422553', 'ALL 24TH; 25TH'),
(31, '30.8007', '425517', '2413-52-62-65-67-72-92-95-98; 2552-62-65-72-77-92-95; 2650-51-52-62-65-72-77-92-95-97; 5450-51-52-62-65-67-72-77-82-85-92-95-97'),
(32, '30.8007', '425920', 'RHD, 2462-65-92-95; 2562-65-92-95; 2662-65-92-95-97; 5462-65-82-85-92-95-97'),
(33, '30.8044', 'G129662', '2293');
I think you need a more sophisticated way to verify that the Models
column values are correct due to spaces after semicolons. In addition, if you need to process the Models
column values that contain more than ten parts, you will need a longer sequence of numbers. I use the INFORMATION_SCHEMA COLUMNS
table for this. You can also create an auxiliary table for this.
SELECT
CONCAT(
IF(N.n = 0, '', LEFT(t.part, 2)),
SUBSTRING_INDEX(
SUBSTRING_INDEX(t.part, '-', N.n + 1), '-', -1
)
) AS part, t.isHD, t.ID, t.GroupNumber, t.PartNo, t.Models
FROM Numbers AS N CROSS JOIN (
SELECT
SUBSTRING_INDEX(
SUBSTRING_INDEX(t.part, ' ', N.n + 1), ' ', -1
) AS part, t.isHD, t.ID, t.GroupNumber, t.PartNo, t.Models
FROM Numbers AS N CROSS JOIN (
SELECT
SUBSTRING_INDEX(
SUBSTRING_INDEX(t.part, ';', N.n + 1), ';', -1
) AS part, t.isHD, t.ID, t.GroupNumber, t.PartNo, t.Models
FROM Numbers AS N CROSS JOIN (
SELECT
REPLACE(
SUBSTRING_INDEX(
SUBSTRING_INDEX(t.part, 'LHD', N.n + 1), 'LHD', -1
), ',', ''
) AS part, IF(N.n > 0, 1, t.isHD) AS isHD,
t.ID, t.GroupNumber, t.PartNo, t.Models
FROM Numbers AS N CROSS JOIN (
SELECT
SUBSTRING_INDEX(
SUBSTRING_INDEX(p.Models, 'RHD', N.n + 1), 'RHD', -1
) AS part, IF(N.n > 0, 2, 0) AS isHD,
p.ID, p.GroupNumber, p.PartNo, p.Models
FROM Numbers AS N CROSS JOIN parts_listing AS p
WHERE p.Models REGEXP '[[:<:]]([LR]HD,? +)?[0-9]+(-[0-9]+)*[[:>:]]' AND
N.n <= (LENGTH(p.Models) - LENGTH(REPLACE(p.Models, 'RHD', ''))) / 3
) AS t WHERE N.n <= (LENGTH(t.part) - LENGTH(REPLACE(t.part, 'LHD', ''))) / 3
) AS t WHERE N.n <= LENGTH(t.part) - LENGTH(REPLACE(t.part, ';', ''))
) AS t WHERE N.n <= LENGTH(t.part) - LENGTH(REPLACE(t.part, ' ', ''))
) AS t WHERE t.part REGEXP '^[0-9]+(-[0-9]+)*$' AND
N.n <= LENGTH(t.part) - LENGTH(REPLACE(t.part, '-', ''));
Partial output (29 rows):
+------+------+----+-------------+---------+--------------------------------------+
| part | isHD | ID | GroupNumber | PartNo | Models |
+------+------+----+-------------+---------+--------------------------------------+
| 2201 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2202 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2211 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2220 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2222 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2232 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2240 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2301 | 0 | 1 | 0.000 | 391906 | 2201-02-11-20-22-32-40; 2301 |
| 2213 | 1 | 3 | 0.000 | 399391 | LHD 2213 |
| 2313 | 2 | 5 | 0.000 | 415245 | RHD, 2313 |
| 2631 | 0 | 7 | 0.000 | 436803 | 2631 (BODY 2678); 5431 (BODY 5478) |
| 5431 | 0 | 7 | 0.000 | 436803 | 2631 (BODY 2678); 5431 (BODY 5478) |
| 2206 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2213 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2226 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2233 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2302 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2306 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2313 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2332 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2333 | 1 | 14 | 0.0015 | G179559 | LHD, 2206-13-26-33; 2302-06-13-32-33 |
| 2201 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2202 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2211 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2220 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2222 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2232 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2240 | 0 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
| 2301 | 2 | 15 | 0.0015 | 422024 | 2201-02-11-20-22-32-40; RHD, 2301 |
+------+------+----+-------------+---------+--------------------------------------+
Demo on DB Fiddle.
You can try this. I changed the parts where You split the Models column.
I split them into different rows, and let the code from forpas go over it and join them again together. rows with text in it aren't touched, because you didn't specify nothing.
select Id,GroupNumber,PartNo,
group_concat(
replace(t.part, '-', concat(' ', left(t.part, 2)))
order by t.partno, t.part ASC
separator ' '
) Models
from
(SELECT
Id,GroupNumber,PartNo,
TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(models, ';', n.digit+1), ';', -1)) part
FROM
( SElECT Id,GroupNumber,PartNo,
REPLACE(REPLACE(REPLACE(Models, ')', ''), 'RHD,', ''), '(BODY', ';') Models
FROM
parts_listing ) p_l
INNER JOIN
(SELECT 0 digit UNION ALL select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9) n
ON LENGTH(REPLACE(Models, ';' , '')) <= LENGTH(Models)-n.digit
Where REPLACE(REPLACE(REPLACE(Models, '-', ''), ' ', ''), ';', '') REGEXP '^[0-9]+$'
ORDER BY
Id,
n.digit,part) t
Group by Id,GroupNumber,PartNo
;
Gets you
Id GroupNumber PartNo Models
1 0.000 391906 2201 2202 2211 2220 2222 2232 2240 2301
2 0.000 391907 2206 2226 2233
3 0.000 399391 2213
4 0.000 415209 2302 2306 2332 2333
5 0.000 415245 2313
7 0.000 436803 2631 2678 5431 5478
10 0.0015 399324 2201 2202 2211 2220 2222 2232 2240 2301
11 0.0015 399324 2206 2213 2226 2233 2302 2306 2313 2332 2333
13 0.0015 G179359 2201 2202 2211 2220 2222 2232 2240 2301
14 0.0015 G179559 2206 2213 2226 2233 2302 2306 2313 2332 2333
15 0.0015 422024 2201 2202 2211 2220 2222 2232 2240 2301
16 0.0015 422024 2206 2213 2226 2233 2302 2306 2313 2332 2333
21 30.7932 422762 2462 2465 2492 2495 2498 2562 2565 2592 2595 2662 2665 2692 2695
22 30.7932 448902 5462 5465 5467 5482 5485 5492 5495 5497
27 30.79396 PA437550 2562 2565 2577 2579 2677 2679
28 30.79398 422470 2452 2472 2552 2572
29 30.79398 422471 2452 2472 2552 2572
31 30.8007 425517 2413 2452 2462 2465 2467 2472 2492 2495 2498 2552 2562 2565 2572 2577 2592 2595 2650 2651 2652 2662 2665 2672 2677 2692 2695 2697 5450 5451 5452 5462 5465 5467 5472 5477 5482 5485 5492 5495 5497
32 30.8007 425920 2462 2465 2492 2495 2562 2565 2592 2595 2662 2665 2692 2695 2697 5462 5465 5482 5485 5492 5495 5497
33 30.8044 G129662 2293
Dbfiddle example