Python

\b in Python regular expressions


14th of June 2005

Boy, did that shut me up! The \b special character in Python regular expressions is so useful. I've used it before but had forgotten about it. The following code:

 import re

 def createStandaloneWordRegex(word):
    """ return a regular expression that can find 'peter'
    only if it's written alone (next to space, start of
    string, end of string, comma, etc) but not if inside
    another word like peterbe """

    # re.L (locale) is left out: Python 3 rejects it for str patterns
    return re.compile(r"""
      (
      ^ %s
      (?=\W | $)
      |
      (?<=\W)
      %s
      (?=\W | $)
      )
      """ % (re.escape(word), re.escape(word)),
      re.I | re.M | re.X)

can, with the \b gadget, be simplified to this:

 def createStandaloneWordRegex(word):
    """ return a regular expression that can find 'peter'
    only if it's written alone (next to space, start of 
    string, end of string, comma, etc) but not if inside 
    another word like peterbe """

    # re.escape keeps any regex metacharacters in word literal
    return re.compile(r'\b%s\b' % re.escape(word), re.I)

Quite a lot simpler, isn't it? The simplified version passes all of the few unit tests I had.
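The unit tests themselves aren't shown here, so what follows is only a rough sketch of the kind of checks they might contain. The snake_case function name and the sample strings are made up for illustration, and the function is a restatement of the simplified version with re.escape applied to the word.

```python
import re

def create_standalone_word_regex(word):
    # Hypothetical restatement of the simplified function;
    # re.escape keeps regex metacharacters in `word` literal.
    return re.compile(r'\b%s\b' % re.escape(word), re.I)

regex = create_standalone_word_regex('peter')

# Standalone occurrences: start of string, next to a comma, end of string.
standalone = ['peter is here', 'hello, Peter, welcome', 'ask PETER']
# Occurrences buried inside a longer word should not match.
embedded = ['peterbe.com', 'speters']

standalone_hits = [bool(regex.search(s)) for s in standalone]
embedded_hits = [bool(regex.search(s)) for s in embedded]

print(standalone_hits)  # [True, True, True]
print(embedded_hits)    # [False, False]
```

One thing worth keeping in mind: \b only matches between a word character and a non-word character, so this approach assumes the word itself starts and ends with a letter, digit or underscore. For a word like "c++" the lookaround version would probably still be needed.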


