Regex - Split string at every datetime

2020-04-14 03:37发布

I'm trying to split up a update string I get from a feed into an array each time there is a time stamp.

This is the regex I have so far, but it seems to only find the first datetime in the string.

^(\d{1,2}\/\d{1,2}\/\d{4})

Here is an example of my string.

$Comment = "8/13/2015 11:44:10 AMVN - Upon additional underwriting review. Account will be declined due to inconsistencies in personal and/or business information that can not be verified or validated 8/13/2015 8:32:52 AMFA Rcvd Change In Terms letter, will fwd to the Underwriter. 8/10/2015 1:21:17 PMVN - Please provide change in term letter capping monthly volume $20K, average ticket to $500 and high ticket to $1K. 8/10/2015 11:02:19 AMVN Declined as the financial condition do not support business type and requested limits. 8/10/2015 9:37:03 AMFA Rcvd Bank Statements, will fwd to the Underwriter. 8/4/2015 3:35:05 PMVN - Please provide 3 most recent bank statements and 3 most recent processing statements. 8/4/2015 9:52:04 AMBAI In Underwriting iEntry Application";

Using this example, I would like to have an array with seven values.

$Pattern = "^(\d{1,2}\/\d{1,2}\/\d{4})";
$Comments = preg_split($Pattern, $Comment);

标签: php regex
2条回答
爷的心禁止访问
2楼-- · 2020-04-14 04:26

When you need to split a long string with no line breaks at a date string, you may consider a regex split method with

\s+(?=<DATE_PATTERN HERE>)              # DATE is preceded with whitespace, anything can follow
\s+(?=<DATE_PATTERN HERE>\b)            # DATE is preceded with whitespace, date is not followed with letter/digit/_
\s*(?<!\d)(?=<DATE_PATTERN HERE>)       # Whitespace before date optional, no digit before
\s*(?<!\d)(?=<DATE_PATTERN HERE>)(?!\d) # Whitespace before date optional, no digit before and after
\s*(?<!\d)(?<!\d<DEL>)(?=<DATE_PATTERN HERE>)(?!<DEL>?\d) # Whitespace before date optional, no digit with delimiter before and after

Here, you may use a simple \s+(?=\d{1,2}/\d{1,2}/\d{4}) regex that matches 1+ whitespaces followed with (a (?=...) is a positive lookaround that does not consume any text, just checks if there is a match and returns true or false) one or two digits, /, one or two digits, / and four digits:

$records = preg_split('~\s+(?=\d{1,2}/\d{1,2}/\d{4})~', $Comment);

See the PHP demo:

$Comment = "8/13/2015 11:44:10 AMVN - Upon additional underwriting review. Account will be declined due to inconsistencies in personal and/or business information that can not be verified or validated 8/13/2015 8:32:52 AMFA Rcvd Change In Terms letter, will fwd to the Underwriter. 8/10/2015 1:21:17 PMVN - Please provide change in term letter capping monthly volume $20K, average ticket to $500 and high ticket to $1K. 8/10/2015 11:02:19 AMVN Declined as the financial condition do not support business type and requested limits. 8/10/2015 9:37:03 AMFA Rcvd Bank Statements, will fwd to the Underwriter. 8/4/2015 3:35:05 PMVN - Please provide 3 most recent bank statements and 3 most recent processing statements. 8/4/2015 9:52:04 AMBAI In Underwriting iEntry Application";
$records = preg_split('~\s+(?=\d{1,2}/\d{1,2}/\d{4})~', $Comment);
print_r($records);

Output:

Array
(
    [0] => 8/13/2015 11:44:10 AMVN - Upon additional underwriting review. Account will be declined due to inconsistencies in personal and/or business information that can not be verified or validated
    [1] => 8/13/2015 8:32:52 AMFA Rcvd Change In Terms letter, will fwd to the Underwriter.
    [2] => 8/10/2015 1:21:17 PMVN - Please provide change in term letter capping monthly volume $20K, average ticket to $500 and high ticket to $1K.
    [3] => 8/10/2015 11:02:19 AMVN Declined as the financial condition do not support business type and requested limits.
    [4] => 8/10/2015 9:37:03 AMFA Rcvd Bank Statements, will fwd to the Underwriter.
    [5] => 8/4/2015 3:35:05 PMVN - Please provide 3 most recent bank statements and 3 most recent processing statements.
    [6] => 8/4/2015 9:52:04 AMBAI In Underwriting iEntry Application
)
查看更多
Melony?
3楼-- · 2020-04-14 04:35

Dont anchor it to the start of the string, hence get rid of the ^

(\d{1,2}\/\d{1,2}\/\d{4})
查看更多
登录 后发表回答