Whole-word match only in Django query

Posted 2019-02-04 20:15

Question:

I am trying to write a Django query that will only match whole words. Based on the answer here, I've tried something like:

result = Model.objects.filter(text__iregex='\bsomeWord\b')

But this isn't returning the expected result. I also tried

result = Model.objects.filter(text__iregex=r'\bsomeWord\b')

to no avail. My end goal is to be able to pass in a string variable as well, something like:

result = Model.objects.filter(text__iregex=r'\b'+stringVariable+r'\b')

or

result = Model.objects.filter(text__iregex=r'\b %s \b'%stringVariable)

But right now I can't even get it to work with a raw string. I'm using PostgreSQL.

Answer 1:

Use “\y” instead of “\b” when you're using PostgreSQL. Django passes your regular expression straight down to PostgreSQL, so your regexes need to be compatible with it: in PostgreSQL's syntax, \y matches a word boundary, while \b means a backspace character. You should be able to execute the same patterns from psql without any problems.

result = Model.objects.filter(text__iregex=r"\y{0}\y".format(stringVariable))

See: http://bit.ly/ZtpojU
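
If the search term comes from user input, it may contain regex metacharacters. A minimal sketch of a helper that escapes the term before wrapping it in \y follows. The name pg_whole_word is hypothetical, and it assumes that the backslash escapes produced by Python's re.escape are also read as literal characters by PostgreSQL, which should hold for ordinary punctuation but is worth checking in psql for your own inputs.

```python
import re


def pg_whole_word(word):
    """Build a PostgreSQL regex that matches `word` only as a whole word.

    Hypothetical helper, not part of Django.
    """
    # \y is PostgreSQL's word-boundary escape; re.escape backslash-escapes
    # regex metacharacters so the term is matched literally.
    return r"\y" + re.escape(word) + r"\y"


print(pg_whole_word("someWord"))  # prints \ysomeWord\y

# With Django available, it could be used like this:
#   result = Model.objects.filter(text__iregex=pg_whole_word(stringVariable))
```

Note that \y only finds a boundary next to a word character, so a term that starts or ends with punctuation may not match the way you expect.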



Answer 2:

You might be able to get something by dropping the regex and combining a few Django lookups:

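from django.db.models import Q

# OR together the places a word can appear: mid-sentence, before a period,
# at the start of the text, or at the end of the text.
result = Model.objects.filter(Q(text__contains=' someword ') |
                              Q(text__contains=' someword.') |
                              Q(text__istartswith='someword.') |
                              Q(text__iendswith='someword'))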

See here for docs.

I realize that's not so elegant (but makes for easy maintenance if you're not a fan of regex).
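
To avoid typing the lookups out by hand for every word, the list could be generated. This is only a sketch: whole_word_lookups is a hypothetical helper, and the particular set of lookups (all case-insensitive, plus a leading space for the end-of-text case and an exact match) is my assumption about what "whole word" should cover, not something taken from the Django docs.

```python
def whole_word_lookups(field, word):
    """Return (lookup, value) pairs that approximate a whole-word match.

    Hypothetical helper; only spaces and periods count as separators here.
    """
    return [
        (f"{field}__icontains", f" {word} "),    # in the middle of the text
        (f"{field}__icontains", f" {word}."),    # at the end of a sentence
        (f"{field}__istartswith", f"{word} "),   # first word of the text
        (f"{field}__istartswith", f"{word}."),   # first word, then a period
        (f"{field}__iendswith", f" {word}"),     # last word of the text
        (f"{field}__iexact", word),              # the text is just the word
    ]


# With Django available, the pairs could be OR-ed together like this:
#   from functools import reduce
#   from operator import or_
#   from django.db.models import Q
#   pairs = whole_word_lookups("text", "someword")
#   condition = reduce(or_, (Q(**{k: v}) for k, v in pairs))
#   result = Model.objects.filter(condition)
```

Other separators such as commas or question marks would each need their own entries, which is where the regex approach starts to pay off.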



Answer 3:

I had the same problem trying to match word boundaries using the Perl-compatible escape sequence \b. My backend database is MySQL.

I solved the problem by using the character class expression [[:space:]] on both sides of the word, e.g.

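from django.db.models import Q

# Inside a view method: OR together one regex condition per search term.
q_sum = Q()
# split() with no argument also copes with repeated spaces between terms.
search_list = self.form.cleaned_data['search_all'].split()
for search_item in search_list:
    # Match the term only when it has whitespace on both sides.
    search_regex = r"[[:space:]]%s[[:space:]]" % search_item
    q_sum |= Q(message__iregex=search_regex)
queryset = BlogMessages.objects.filter(q_sum).distinct()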
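
One limitation of the pattern above is that a word at the very start or end of the field has no whitespace on one side, so it is missed. A possible variant, sketched here, also accepts the start (^) or end ($) of the string as a boundary. The function name mysql_whole_word_patterns is hypothetical, and using re.escape on the terms assumes that MySQL's regex engine treats the resulting backslash escapes as literal characters, which I have not verified for every input.

```python
import re


def mysql_whole_word_patterns(search_text):
    """Build one whitespace-delimited regex per term in `search_text`.

    Hypothetical helper; each term may be bounded by whitespace or by the
    start or end of the string.
    """
    patterns = []
    # split() with no argument ignores repeated spaces, so no empty terms.
    for term in search_text.split():
        patterns.append(
            r"(^|[[:space:]])" + re.escape(term) + r"([[:space:]]|$)"
        )
    return patterns


# With Django available, the patterns could replace search_regex above:
#   q_sum = Q()
#   for pattern in mysql_whole_word_patterns(search_all):
#       q_sum |= Q(message__iregex=pattern)
#   queryset = BlogMessages.objects.filter(q_sum).distinct()
```

Punctuation next to a word (for example "word," or "word.") still does not count as a boundary with this approach.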