I have a custom routing class, which allows me to do matches like this on requests:
'/[*:cat1]/[*:cat2]/?[*:cat3]/?[*:cat4]/?[p:page]/?'
Which will match the following links:
category-one/
category-one/cat-two/
category-one/cat-two/cat-three/
category-one/cat-two/cat-three/cat-four/
As you can see the ? after / means that parameter is optional.
My problem is with [p:page]/?, which is also optional and should allow these links to match as well:
category-one/page-2/
category-one/cat-two/page-2/
category-one/cat-two/cat-three/page-2/
category-one/cat-two/cat-three/cat-four/page-2/
The issue is that when I try to match this link
/category-one/cat-two/page-2/
it will give me these params:
cat1 => category-one
cat2 => cat-two
cat3 => page-2
Instead of
cat1 => category-one
cat2 => cat-two
page => page-2
I am using this generated regexp:
`^(?:/(?P<cat1>[^/\.]+))(?:/(?P<cat2>[^/\.]+/)?)(?:(?P<cat3>[^/\.]+/)?)(?:(?P<cat4>[^/\.]+/)?)(?:(?P<page>(a^)|(?:pag-)(\d+)/)?)$`u
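For reference, a stripped-down test (just the generated pattern run directly with `preg_match`, not my actual routing class) reproduces it:

```php
<?php
// Minimal check: run the generated pattern against the failing URL.
$pattern = '`^(?:/(?P<cat1>[^/\.]+))(?:/(?P<cat2>[^/\.]+/)?)(?:(?P<cat3>[^/\.]+/)?)(?:(?P<cat4>[^/\.]+/)?)(?:(?P<page>(a^)|(?:pag-)(\d+)/)?)$`u';

preg_match($pattern, '/category-one/cat-two/page-2/', $m);

// cat3 is optional and is tried before the page group, so it picks up
// the page segment first:
//   $m['cat1'] => 'category-one'
//   $m['cat2'] => 'cat-two/'
//   $m['cat3'] => 'page-2/'   <-- this should have gone to $m['page']
var_dump($m);
```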
Any help is appreciated. Thanks! Alex
I would use a token lexer/parser approach. I have a few examples on my GitHub page at:
https://github.com/ArtisticPhoenix/MISC/tree/master/Lexers
These are examples I have used to answer other questions on SO. One is a parser for JSON-like objects rather than JSON strings; the input is technically malformed JSON without the `"` quotes around the property names, which `json_decode` can't handle. The other is an HTML minifier (in an OOP style, but the same concept) that lets you exclude things like `<textarea>` tags, because white space matters inside them. So you can do pretty much any kind of text processing with this method.

I modified one of them, but I don't really know how you want the output or what you want to do with it, so it should just get you started. You will probably have to integrate it into your URL routing class, which I have no idea what that looks like. But this is a far better method than a simple `preg_match`, because it gives you a place to perform complex logic on each segment of the match.

You can see it in action here.
Output of the above code:
How it works: this basically uses `preg_match_all`, but it is wrapped up in a convenience layer that builds the regular expression for you and makes processing the output a bit easier. So instead of one monolithic regex, you wind up with smaller ones that are easier to deal with. It seems complicated at first, but once you understand what it does it makes things so much easier.
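To make this concrete, here is a heavily condensed sketch of that kind of wrapper. The `UrlLexer` class and the token names are placeholders I'm using for illustration; it is not the exact code from the linked repo:

```php
<?php
// Illustrative sketch only: a tiny token lexer for URL paths.
class UrlLexer
{
    // One small pattern per token instead of one monolithic regex.
    // Order matters: more specific tokens go first.
    protected $tokens = [
        'T_PAGE'     => 'page-\d+/',
        'T_CATEGORY' => '[^/.]+/',
        'T_SLASH'    => '/',
    ];

    public function parse($url)
    {
        // Build one alternation with a named capture group per token.
        $parts = [];
        foreach ($this->tokens as $name => $regex) {
            $parts[] = "(?P<{$name}>{$regex})";
        }
        // \Z lets the final, empty match anchor at the very end of the string.
        $pattern = '~' . implode('|', $parts) . '|\Z~';

        preg_match_all($pattern, $url, $matches, PREG_SET_ORDER);

        $result = [];
        foreach ($matches as $set) {
            foreach ($this->tokens as $name => $regex) {
                if (isset($set[$name]) && $set[$name] !== '') {
                    // Hand each recognized segment to parseTokens()
                    $data = $this->parseTokens($name, $set[$name]);
                    if ($data !== null) {
                        $result[] = $data;
                    }
                    break; // only one token matches per segment
                }
            }
        }
        return $result;
    }

    // The one place you normally edit: decide what each token means.
    protected function parseTokens($token, $value)
    {
        switch ($token) {
            case 'T_PAGE':
                // "page-2/" -> 2
                return ['page' => (int) rtrim(substr($value, 5), '/')];
            case 'T_CATEGORY':
                // "cat-two/" -> "cat-two"
                return ['category' => rtrim($value, '/')];
            default:
                return null; // ignore bare slashes
        }
    }
}
```

With this sketch, `(new UrlLexer())->parse('/category-one/cat-two/page-2/')` comes back as `[['category' => 'category-one'], ['category' => 'cat-two'], ['page' => 2]]`, with every segment already identified.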
You can even check the order of the segments if you want, by adding some logic to the `parseTokens` function. That should be the only place you have to edit anything, mainly in the token switch statement.

The regex it creates is like this:
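With the hypothetical token list from the sketch above, it comes out roughly in this shape (not the exact expression the linked class builds):

```
~(?P<T_PAGE>page-\d+/)|(?P<T_CATEGORY>[^/.]+/)|(?P<T_SLASH>/)|\Z~
```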
So you can't add sub-capture groups; note that when I added the "or" in this one, `cat-(?:one|two|three|four)`, it's a non-capture group. But you can just use `substr` to separate it out later, so it's no big deal.
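For example, assuming a captured value like `'cat-two/'`:

```php
$value = 'cat-two/';
// drop the trailing slash, then cut off the "cat-" prefix
$slug = substr(rtrim($value, '/'), 4); // "two"
```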
The `\Z` is a bit obscure, but it just matches the end of the string without capturing anything.

Also, the processing part is called like this (in `parse`):
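In the sketch above, that corresponds to something like:

```php
// inside parse(): hand each identified token to parseTokens()
// and keep whatever it returns
$data = $this->parseTokens($name, $set[$name]);
if ($data !== null) {
    $result[] = $data;
}
```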
So you can return data that will get returned through the `parse` function to where you called it (if you wish).

I don't have the time right now to go into a full explanation of what a lexer is or how it all works, so hopefully this is enough to get you started.
UPDATE
To counter this (don't get me wrong or take it the wrong way), I feel I need to explain it a bit further.
This is very generalized
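That is, a token map like the hypothetical one sketched above, where each rule is a small, independent piece:

```php
protected $tokens = [
    'T_PAGE'     => 'page-\d+/',
    'T_CATEGORY' => '[^/.]+/',
    'T_SLASH'    => '/',
];
```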
This is very specific
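That is, the single generated expression from the question:

```
`^(?:/(?P<cat1>[^/\.]+))(?:/(?P<cat2>[^/\.]+/)?)(?:(?P<cat3>[^/\.]+/)?)(?:(?P<cat4>[^/\.]+/)?)(?:(?P<page>(a^)|(?:pag-)(\d+)/)?)$`u
```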
If you have to edit that, it's going to be a huge problem. What if you want to route to books or something else? How are you going to expand on that? I don't even know where to begin.
With the array approach I have given you, you simply add it:
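For example, a hypothetical book token (keeping the placeholder names from the sketch):

```php
protected $tokens = [
    'T_BOOK'     => 'book-[^/.]+/',   // new entry, e.g. "book-some-title/"
    'T_PAGE'     => 'page-\d+/',
    'T_CATEGORY' => '[^/.]+/',
    'T_SLASH'    => '/',
];
```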
Then you modify the switch statement:
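Again sketched with the same placeholder names:

```php
protected function parseTokens($token, $value)
{
    switch ($token) {
        case 'T_BOOK':
            // whatever extra routing logic books need goes here
            return ['book' => rtrim(substr($value, 5), '/')];
        case 'T_PAGE':
            return ['page' => (int) rtrim(substr($value, 5), '/')];
        case 'T_CATEGORY':
            return ['category' => rtrim($value, '/')];
        default:
            return null;
    }
}
```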
And bam, you can do whatever you want in a clear and concise way. You can add whatever complex logic and whatever error checking you need, very easily.