Regular Expression-

  • – Regular expressions are used to find or filter a substring (a set of characters) within a given main string.
  • – “re” is the Python standard-library module that contains the predefined regular expression functions.
  • – These predefined functions can be used to filter the substring.
  • – Regular expression functions are used heavily for validating emails, phone numbers, dates, and similar data.
  • – Ex- import re

A Few Important Predefined Functions From The re Module-

1) match() –

– This is used to check whether the main string starts with a specified pattern/substring. It returns a match object on success and None otherwise. Ex-

import re
msg = "simpleAlgo has all Algo"
pattern = re.match(r'Algo', msg)
print("Fail", pattern)

pattern = re.match(r'simple', msg)
print("Success", pattern) 
Output-
	Fail None
	Success <re.Match object; span=(0, 6), match='simple'>
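
Because re.match only looks at the beginning of the string, it is easy to confuse with re.search. The following is a minimal sketch of that behaviour, reusing the msg string from above. The helper name starts_with is hypothetical and is introduced here only for illustration.

```python
import re

msg = "simpleAlgo has all Algo"

def starts_with(pattern, text):
    """Hypothetical helper: True if text begins with pattern."""
    return re.match(pattern, text) is not None

at_start = starts_with(r'simple', msg)    # 'simple' begins at index 0
not_at_start = starts_with(r'Algo', msg)  # 'Algo' occurs later, so no match

# A successful match exposes the matched text and its position.
m = re.match(r'simple', msg)
matched_text = m.group()
matched_span = m.span()

print(at_start, not_at_start, matched_text, matched_span)
```

Comparing the result with None, rather than printing the match object, is usually the more convenient way to use match() inside a condition.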

2) search() –

  • – It is used to search for a specified string/pattern anywhere in the main string.
  • – It checks whether that string/pattern is present anywhere in the entire line, not only at the start.
  • – If the pattern occurs more than once, only the first occurrence is returned; the 2nd occurrence is not reported.
  • – After the first match is found, it does not search the rest of the string.

Ex-

import re
msg = "simpleAlgo has all Algo"

pattern = re.search(r'Algo', msg)
print(pattern)
if pattern:
	print("Pattern found", pattern.group())
else:
	print("Pattern not found")

Output-
	<re.Match object; span=(6, 10), match='Algo'>
	Pattern found Algo
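
Since search() stops at the first occurrence, one possible way to see every occurrence together with its position is re.finditer, which yields one match object per occurrence. A short sketch, reusing the same msg:

```python
import re

msg = "simpleAlgo has all Algo"

# search() reports only the first occurrence.
first = re.search(r'Algo', msg)
first_span = first.span() if first else None

# finditer() yields one match object per occurrence, left to right.
all_spans = [m.span() for m in re.finditer(r'Algo', msg)]

print(first_span)
print(all_spans)
```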

3) findall() –

– Used to return a list of all non-overlapping substrings that match the pattern.
Ex-

import re
msg = "simpleAlgo has all Algo"

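data1 = re.findall(r'.', msg)   # '.' matches any character except a newline, so spaces are included
print(data1)

data2 = re.findall(r'\w', msg)  # '\w' matches only word characters (letters, digits, underscore), so spaces are excluded
print(data2)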
Output-
['s', 'i', 'm', 'p', 'l', 'e', 'A', 'l', 'g', 'o', ' ', 'h', 'a', 's', ' ', 'a', 'l', 'l', ' ', 'A', 'l', 'g', 'o']
['s', 'i', 'm', 'p', 'l', 'e', 'A', 'l', 'g', 'o', 'h', 'a', 's', 'a', 'l', 'l', 'A', 'l', 'g', 'o']
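
Single-character patterns are mainly useful for demonstration. As a small sketch with the same msg, adding a quantifier returns whole words, and a pattern that never matches returns an empty list rather than None:

```python
import re

msg = "simpleAlgo has all Algo"

words = re.findall(r'\w+', msg)       # one entry per run of word characters
algos = re.findall(r'Algo', msg)      # every literal occurrence of 'Algo'
missing = re.findall(r'Python', msg)  # no match gives an empty list

print(words)
print(algos)
print(missing)
```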

Ex- The match object returned by search() also gives the position of the match.

import re
msg = "simpleAlgo has all Algo"

pattern = re.search(r'Algo', msg)
print(pattern)

if pattern:
	# start() and end() can only be called when a match was found
	print("Pattern starting index", pattern.start())
	print("Pattern ending index", pattern.end())
	print("Pattern found", pattern.group())
else:
	print("Pattern not found")

Output-
	<re.Match object; span=(6, 10), match='Algo'>
	Pattern starting index 6
	Pattern ending index 10
	Pattern found Algo
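
The indexes reported by start() and end() are ordinary string offsets, with end() pointing one position past the last matched character. A small sketch, again on the same msg, of how they relate to slicing (the variable names are chosen here for illustration):

```python
import re

msg = "simpleAlgo has all Algo"
found = re.search(r'Algo', msg)

if found:
    # Slicing with start()/end() gives back the same text as group().
    sliced = msg[found.start():found.end()]
    before = msg[:found.start()]  # text before the match
    after = msg[found.end():]     # text after the match
    print(sliced, '|', before, '|', after)
```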
    

4) compile() –

  • – We can create a pattern object by calling the compile function.
  • – The pattern object provides the existing re functions (match, search, findall, and so on) as methods, so the same pattern can be reused without passing it each time.

Ex-

import re
pattern = re.compile("Algo")
msg = "simpleAlgo has all Algo"
r1 = pattern.match(msg)
print(r1)
r2 = pattern.search(msg)
print(r2)
r3 = pattern.findall(msg)
print(r3)

Output-
	None
	<re.Match object; span=(6, 10), match='Algo'>
	['Algo', 'Algo']
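
Compiling is mostly worthwhile when one pattern is applied to many strings, and compile() also accepts flags such as re.IGNORECASE. A minimal sketch, where the list of sample lines is made up for illustration:

```python
import re

# Compile once, with a flag, then reuse the object on several strings.
pattern = re.compile(r'algo', re.IGNORECASE)

# Sample input invented for this sketch.
lines = ["simpleAlgo has all Algo", "no match here", "ALGO in capitals"]

# Count the case-insensitive occurrences in each line.
counts = [len(pattern.findall(line)) for line in lines]
print(counts)
```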

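5) split() –

– Used to split the main string at every occurrence of the pattern and return the pieces as a list. Ex-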

import re

msg = "simpleAlgo has all Algo"
data = re.split(r' ', msg)
print(data)

Output-
	['simpleAlgo', 'has', 'all', 'Algo']
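
Splitting on a single space can also be done with the plain string split method. re.split becomes more useful when the delimiter varies. A short sketch, where the record string is made up for illustration, also showing the optional maxsplit argument:

```python
import re

# Sample input invented for this sketch: mixed delimiters and stray spaces.
record = "python, java;  scala ,kotlin"

# Split on a comma or semicolon, swallowing any spaces around it.
langs = re.split(r'\s*[,;]\s*', record)

# maxsplit limits the number of splits; the remainder stays in the last item.
msg = "simpleAlgo has all Algo"
head_tail = re.split(r' ', msg, maxsplit=1)

print(langs)
print(head_tail)
```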

Regular expressions can also be used-

  • – To check that data is in the correct format.
  • – For example: email validation, phone number validation, date of birth validation, zip code validation, etc.

Ex- Email Address Validation-

import re
line = 'python@gmail.com, java@hotmail.in'
data = re.findall(r'@\w+', line)         # '@' followed by the domain name
print(data)
print("******")

line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@(\w+)', line)       # the group returns only the domain name
print(data)

print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@\w+\.\w+', line)    # '\.' matches a literal dot; a bare '.' matches any character
print(data)

print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@\w+\.(\w+)', line)  # the group returns only the suffix after the dot
print(data)

print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@\w+(\.\w+)', line)  # the group returns the dot and the suffix
print(data)

Output-
	['@gmail', '@hotmail']
	******
	['gmail', 'hotmail']
	******
	['@gmail.com', '@hotmail.com']
	******
	['com', 'com']
	******
	['.com', '.com']
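
The calls above extract parts of an address rather than validate it. To accept or reject a whole string, one option is fullmatch, which succeeds only if the entire string fits the pattern. The sketch below uses a deliberately simplified pattern. It is an assumption made for illustration, not a complete check of every legal email address, and the names EMAIL_RE and is_valid_email are hypothetical.

```python
import re

# Simplified pattern (an assumption, not a full standard-compliant check):
# local part, '@', one or more domain labels, a dot, and a 2+ letter suffix.
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)*\.[A-Za-z]{2,}')

def is_valid_email(address):
    """Hypothetical helper: True if the whole string looks like an email."""
    return EMAIL_RE.fullmatch(address) is not None

samples = ["python@gmail.com", "java@hotmail.in", "no-at-sign.com", "user@domain"]
results = {addr: is_valid_email(addr) for addr in samples}
print(results)
```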

Ex- Phone No. Validation-

import re
# Expected phone format: 123-4567-780 (3 digits, 4 digits, 3 digits)
line = "456-999-9999, 123-9999-999, 9999-999-999"
res = re.findall(r'\d{3}-\d{4}-\d{3}', line)
print(res)

Output-
	['123-9999-999']
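
One caveat with the pattern above is that findall would also accept a correct-looking run of digits inside a longer number. A hedged sketch of two ways to tighten it follows: fullmatch for a single input value, and lookarounds that reject neighbouring digits when extracting from longer text. The names PHONE_RE and is_valid_phone are hypothetical, and the third number in the sample line is made up to show the difference.

```python
import re

# Format assumed from the example above: 3 digits, 4 digits, 3 digits.
PHONE_RE = re.compile(r'\d{3}-\d{4}-\d{3}')

def is_valid_phone(text):
    """Hypothetical helper: True only if the entire string is NNN-NNNN-NNN."""
    return PHONE_RE.fullmatch(text) is not None

# The last entry has an extra digit on each side of a valid-looking core.
line = "456-999-9999, 123-9999-999, 0123-9999-9990"

loose = re.findall(r'\d{3}-\d{4}-\d{3}', line)
# (?<!\d) and (?!\d) require that no digit touches the match on either side.
strict = re.findall(r'(?<!\d)\d{3}-\d{4}-\d{3}(?!\d)', line)

print(loose)
print(strict)
```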