Python RegEx :

A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern. RegEx can be used to check if a string contains the specified search pattern.

RegEx Module :

Python has a built-in package called re, which can be used to work with Regular Expressions. Import the re module: import re

RegEx in Python :

When you have imported the re module, you can start using regular expressions:

Example :

  import re

  txt = "The rain in Amazon"
  # Check whether the string starts with "The" and ends with "Amazon"
  x = re.search(r"^The.*Amazon$", txt)
  if x:
      print("We found a match!")
  else:
      print("No match found")
  input()  # wait for Enter so the console window stays open

Output :

  We found a match!

RegEx Functions :

The re module offers a set of functions that allows us to search a string for a match :

Function    Description
findall     Returns a list containing all matches
search      Returns a Match object if there is a match anywhere in the string
split       Returns a list where the string has been split at each match
sub         Replaces one or many matches with a string


Metacharacters :

Metacharacters are characters with a special meaning :

Character  Description                                                   Example
[ ]        A set of characters                                           "[a-m]"
\          Signals a special sequence (also escapes special characters)  r"\d"
.          Any character (except the newline character)                  "he..o"
^          Starts with                                                   "^hello"
$          Ends with                                                     "world$"
*          Zero or more occurrences                                      "aix*"
+          One or more occurrences                                       "aix+"
{ }        Exactly the specified number of occurrences                   "al{2}"
|          Either or                                                     "falls|stays"
( )        Capture and group                                             "(falls|stays)"
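
The short sketch below tries a few of these metacharacters on sample strings. The strings and variable names are chosen only for illustration and are not part of the original notes.

```python
import re

txt = "hello world"

# "." matches any single character except a newline
dot_match = re.search(r"he..o", txt)

# "^" anchors the pattern to the start of the string, "$" to the end
starts_with_hello = re.search(r"^hello", txt) is not None
ends_with_world = re.search(r"world$", txt) is not None

# "+" needs at least one "l"; "{2}" needs exactly two
one_or_more = re.findall(r"l+", txt)
exactly_two = re.findall(r"l{2}", txt)

# "|" matches either alternative; "( )" captures the part that matched
either = re.search(r"(falls|stays)", "The rain stays mainly in the plain")

print(dot_match.group())   # hello
print(one_or_more)         # ['ll', 'l']
print(exactly_two)         # ['ll']
print(either.group(1))     # stays
```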


Special Sequences :

A special sequence is a \ followed by one of the characters in the list below, and has a special meaning :

Character  Description                                                                       Example
\A         Matches if the specified characters are at the beginning of the string            r"\AThe"
\b         Matches where the specified characters are at the beginning or end of a word      r"\bmazon", r"zon\b"
\B         Matches where the specified characters are NOT at the beginning or end of a word  r"\Bmazon", r"zon\B"
\d         Matches any digit (0-9)                                                           r"\d"
\D         Matches any character that is NOT a digit                                         r"\D"
\s         Matches any white-space character                                                 r"\s"
\S         Matches any character that is NOT white space                                     r"\S"
\w         Matches any word character (a letter, a digit 0-9, or the underscore _)           r"\w"
\W         Matches any character that is NOT a word character                                r"\W"
\Z         Matches if the specified characters are at the end of the string                  r"Amazon\Z"
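
As a rough illustration, the sketch below applies several special sequences to one sample sentence. The sentence and variable names are assumptions made for this example. The patterns are written as raw strings (r"...") so that Python passes the backslash through to the re module unchanged.

```python
import re

txt = "The rain in Amazon lasted 40 days"

# \A anchors at the start of the string, \Z at the very end
at_start = re.search(r"\AThe", txt) is not None
at_end = re.search(r"days\Z", txt) is not None

# \b marks a word boundary: "zon" followed by the end of a word
word_end = re.findall(r"zon\b", txt)
# \B is the opposite: "ai" that is NOT at the beginning of a word
inside_word = re.findall(r"\Bai", txt)

# \d matches a digit, \w a word character, \s a white-space character
digits = re.findall(r"\d", txt)
words = re.findall(r"\w+", txt)
spaces = re.findall(r"\s", txt)

print(word_end)      # ['zon']
print(inside_word)   # ['ai']
print(digits)        # ['4', '0']
print(words)         # ['The', 'rain', 'in', 'Amazon', 'lasted', '40', 'days']
print(len(spaces))   # 6
```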


Set :

A set is a set of characters inside a pair of square brackets [ ] with a special meaning:

Set         Description
[arn]       Returns a match where one of the specified characters (a, r, or n) is present
[a-n]       Returns a match for any lower case character, alphabetically between a and n
[^arn]      Returns a match for any character EXCEPT a, r, and n
[0123]      Returns a match where any of the specified digits (0, 1, 2, or 3) is present
[0-9]       Returns a match for any digit between 0 and 9
[0-5][0-9]  Returns a match for any two-digit number from 00 to 59
[a-zA-Z]    Returns a match for any character alphabetically between a and z, lower case OR upper case
[+]         In sets, +, *, ., |, ( ), $, { } have no special meaning, so [+] means: return a match for any + character in the string
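
A minimal sketch of a few sets in use follows. The sample strings and variable names are assumptions made for illustration.

```python
import re

txt = "The rain in Amazon at 11:45"

# Any one of the listed characters: a, r, or n
any_of_arn = re.findall(r"[arn]", txt)

# Two ranges side by side: two-digit numbers from 00 to 59
two_digit = re.findall(r"[0-5][0-9]", txt)

# "^" as the first character inside the brackets negates the set
not_letter_or_space = re.findall(r"[^a-zA-Z ]", txt)

# Inside a set, "+" is a literal plus sign, not a quantifier
plus_signs = re.findall(r"[+]", "2+2=4")

print(any_of_arn)            # ['r', 'a', 'n', 'n', 'a', 'n', 'a']
print(two_digit)             # ['11', '45']
print(not_letter_or_space)   # ['1', '1', ':', '4', '5']
print(plus_signs)            # ['+']
```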


The findall() Function :

The findall() function returns a list containing all matches.

Example :

  # Print a list of all matches
  import re

  txt = "The rain in Amazon"
  x = re.findall("in", txt)
  print(x)
  if x:
      print("We found a match!")
  else:
      print("No match found")
  input()  # wait for Enter so the console window stays open

Output :

  ['in', 'in']
  We found a match!
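
When nothing matches, findall() returns an empty list rather than None, which is why the result can be tested directly in an if statement. A small sketch, using "Portugal" as an arbitrary word that does not occur in the text:

```python
import re

txt = "The rain in Amazon"

# The list holds the matches in the order they are found
found = re.findall("in", txt)

# No match at all gives an empty list, which is falsy in an if statement
missing = re.findall("Portugal", txt)

print(found)          # ['in', 'in']
print(len(found))     # 2
print(missing)        # []
print(bool(missing))  # False
```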

The search() Function :

The search() function searches the string for a match, and returns a Match object if there is a match. If there is more than one match, only the first occurrence is returned.

Example :

  # Search for the first white-space character in the string
  import re

  txt = "The rain in Amazon"
  x = re.search(r"\s", txt)
  print("The first white-space character is located in position:", x.start())
  input()  # wait for Enter so the console window stays open

Output :

  The first white-space character is located in position: 3

Note : If no match is found, search() returns None, so calling a method such as x.start() on the result raises an AttributeError.
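
One way to guard against the None case is to test the result before using it. In the sketch below, first_position is a hypothetical helper written for this example, not a function of the re module.

```python
import re

txt = "The rain in Amazon"

def first_position(pattern, text):
    """Return the start index of the first match, or None if there is none."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.start()

print(first_position(r"\s", txt))   # 3
print(first_position(r"\d", txt))   # None

# A Match object also exposes the matched text and its (start, end) span
word = re.search(r"\bA\w+", txt)
if word is not None:
    print(word.group())   # Amazon
    print(word.span())    # (12, 18)
```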



The split() Function :

The split() function returns a list where the string has been split at each match.

Example :

  # Split at each white-space character
  import re

  txt = "The rain in Amazon"
  x = re.split(r"\s", txt)
  print(x)
  input()  # wait for Enter so the console window stays open

Output :

  ['The', 'rain', 'in', 'Amazon']
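
re.split() also accepts an optional maxsplit argument that limits the number of splits. This parameter is not covered in the notes above; the sketch below shows it on the same sample text.

```python
import re

txt = "The rain in Amazon"

# Split at every white-space character
all_parts = re.split(r"\s", txt)

# Split only at the first white-space character
first_only = re.split(r"\s", txt, maxsplit=1)

print(all_parts)    # ['The', 'rain', 'in', 'Amazon']
print(first_only)   # ['The', 'rain in Amazon']
```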

The sub() Function :

The sub() function replaces the matches with the text of your choice.

Example :

  # Replace every white-space character with the # symbol
  import re

  txt = "The rain in Amazon"
  x = re.sub(r"\s", "#", txt)
  print(x)
  input()  # wait for Enter so the console window stays open

Output :

  The#rain#in#Amazon
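
re.sub() likewise takes an optional count argument that limits how many matches are replaced. This parameter is not covered in the notes above; a brief sketch on the same sample text:

```python
import re

txt = "The rain in Amazon"

# Replace every white-space character
replaced_all = re.sub(r"\s", "#", txt)

# Replace only the first two white-space characters
replaced_two = re.sub(r"\s", "#", txt, count=2)

print(replaced_all)   # The#rain#in#Amazon
print(replaced_two)   # The#rain#in Amazon
```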
