Regular Expressions-
– Regular expressions are used to find or extract a substring (a set of characters) from a given main string.
– "re" is the Python module that contains the predefined regular expression functions.
– These predefined functions can be used to search for, extract, and split substrings.
– Regular expression functions are used heavily for validating emails, phone numbers, dates, etc.
– Ex- import re
A Few Of The Important Predefined Functions From The re Module-
1) match() –
– This is used to check whether the main string starts with the specified pattern/substring. It returns a match object on success and None otherwise. Ex-
import re
msg = "simpleAlgo has all Algo"
pattern = re.match(r'Algo', msg)
print("Fail", pattern)
pattern = re.match(r'simple', msg)
print("Success", pattern)
Output-
Fail None
Success <re.Match object; span=(0, 6), match='simple'>
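As a small sketch of the behaviour described above: match() anchors only at the start of the string, so it succeeds even when more text follows the pattern. The variable names are illustrative, and re.fullmatch() is shown only for contrast.

```python
import re

msg = "simpleAlgo has all Algo"

# match() anchors at the start of the string only
start_hit = re.match(r'simple', msg)
print(start_hit.group())   # simple
print(start_hit.span())    # (0, 6)

# fullmatch() requires the whole string to match the pattern
whole_hit = re.fullmatch(r'simple', msg)
print(whole_hit)           # None
```

If the entire string must fit the pattern (as in validation), fullmatch() is usually the safer choice.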
2) search() –
– It is used to search the entire string for a specified pattern.
– It checks whether the pattern is present anywhere in the string, not only at the start.
– It stops at the first match, so a second occurrence of the pattern is not reported.
Ex-
import re
msg = "simpleAlgo has all Algo"
pattern = re.search(r'Algo', msg)  # match object for the first 'Algo', or None
print(pattern)
if pattern:
    print("Pattern found", pattern.group())
else:
    print("Pattern not found")
Output-
<re.Match object; span=(6, 10), match='Algo'>
Pattern found Algo
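Because search() reports only the first occurrence, one way to see every occurrence with its position is re.finditer(). This is a minimal sketch on the same string; the variable names are illustrative.

```python
import re

msg = "simpleAlgo has all Algo"

# search() stops at the first occurrence
first = re.search(r'Algo', msg)
print(first.span())  # (6, 10)

# finditer() yields one match object per occurrence
spans = [m.span() for m in re.finditer(r'Algo', msg)]
print(spans)  # [(6, 10), (19, 23)]
```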
3) findall() –
– Used to return a list of all non-overlapping matches of the pattern in the string.
Ex-
import re
msg = "simpleAlgo has all Algo"
data1 = re.findall(r'.', msg) # every character, including spaces
print(data1)
data2 = re.findall(r'\w', msg) # word characters only, so spaces are skipped
print(data2)
Output-
['s', 'i', 'm', 'p', 'l', 'e', 'A', 'l', 'g', 'o', ' ', 'h', 'a', 's', ' ', 'a', 'l', 'l', ' ', 'A', 'l', 'g', 'o']
['s', 'i', 'm', 'p', 'l', 'e', 'A', 'l', 'g', 'o', 'h', 'a', 's', 'a', 'l', 'l', 'A', 'l', 'g', 'o']
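The single-character patterns above can be extended with a quantifier to collect whole words. The sketch below also shows that findall() returns tuples when the pattern has more than one group. The "x=1, y=22" string is a made-up input used only for illustration.

```python
import re

msg = "simpleAlgo has all Algo"

# \w+ matches runs of word characters, i.e. whole words
words = re.findall(r'\w+', msg)
print(words)  # ['simpleAlgo', 'has', 'all', 'Algo']

# With two groups, each result is a (group1, group2) tuple
pairs = re.findall(r'(\w+)=(\d+)', "x=1, y=22")
print(pairs)  # [('x', '1'), ('y', '22')]
```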
Ex- Match Position With search()-
import re
msg = "simpleAlgo has all Algo"
pattern = re.search(r'Algo', msg)
print(pattern)
if pattern:
    # start() and end() exist only on a match object, so check for None first
    print("Pattern starting index", pattern.start())
    print("Pattern ending index", pattern.end())
    print("Pattern found", pattern.group())
else:
    print("Pattern not found")
Output-
<re.Match object; span=(6, 10), match='Algo'>
Pattern starting index 6
Pattern ending index 10
Pattern found Algo
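The start and end indexes can be read as a slice of the original string: end() is exclusive, so slicing with them gives back the matched text. A short sketch, with illustrative variable names:

```python
import re

msg = "simpleAlgo has all Algo"
found = re.search(r'Algo', msg)

if found:
    # end() is exclusive, so this slice equals found.group()
    piece = msg[found.start():found.end()]
    print(piece)         # Algo
    print(found.span())  # (6, 10)
```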
4) compile()-
– We can create a pattern object by calling the compile() function.
– The pattern object provides the same re functions (match, search, findall, and so on) as methods, so one compiled pattern can be reused.
Ex-
import re
pattern = re.compile("Algo")
msg = "simpleAlgo has all Algo"
r1 = pattern.match(msg)
print(r1)
r2 = pattern.search(msg)
print(r2)
r3 = pattern.findall(msg)
print(r3)
Output-
None
<re.Match object; span=(6, 10), match='Algo'>
['Algo', 'Algo']
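A compiled pattern is most useful when the same expression is applied to many strings. The sketch below also passes the re.IGNORECASE flag at compile time. The list of lines is made up for illustration.

```python
import re

# Compile once with a flag, then reuse the pattern object
pattern = re.compile(r'algo', re.IGNORECASE)

lines = ["simpleAlgo has all Algo", "no match here", "ALGO in caps"]
counts = [len(pattern.findall(line)) for line in lines]
print(counts)  # [2, 0, 1]
```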
5) split()-
– Used to split the main string at each match of the pattern; it returns the pieces as a list.
Ex-
import re
msg = "simpleAlgo has all Algo"
data = re.split(r' ', msg)
print(data)
Output-
['simpleAlgo', 'has', 'all', 'Algo']
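Unlike str.split(), re.split() accepts a pattern, so it can split on several delimiters at once. The sketch below uses a slightly modified input string (extra spaces and a comma, added for illustration) and the maxsplit argument.

```python
import re

msg = "simpleAlgo has   all,Algo"

# Split on a comma or a run of whitespace
parts = re.split(r'[,\s]+', msg)
print(parts)  # ['simpleAlgo', 'has', 'all', 'Algo']

# maxsplit limits the number of splits performed
head_tail = re.split(r'\s+', "simpleAlgo has all Algo", maxsplit=1)
print(head_tail)  # ['simpleAlgo', 'has all Algo']
```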
Regular expressions can also be used-
– To check that data is in the correct format, such as email, phone number, date of birth, and zip code validation.
Ex- Email Address Validation-
import re
line = 'python@gmail.com, java@hotmail.in'
data = re.findall(r'@\w+', line)  # '@' followed by the domain name
print(data)
print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@(\w+)', line)  # the group keeps only the domain name
print(data)
print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@\w+\.\w+', line)  # \. matches a literal dot
print(data)
print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@\w+\.(\w+)', line)  # the group keeps only the suffix
print(data)
print("******")
line = 'python@gmail.com, java@hotmail.com'
data = re.findall(r'@\w+(\.\w+)', line)  # the group keeps the dot and the suffix
print(data)
Output-
['@gmail', '@hotmail']
******
['gmail', 'hotmail']
******
['@gmail.com', '@hotmail.com']
******
['com', 'com']
******
['.com', '.com']
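The findall() calls above extract parts of an address; to validate a single address, the whole string has to fit the pattern. Below is a minimal sketch using re.fullmatch(). The pattern is deliberately simplified and is an assumption for illustration, not a complete check of every legal email address. The names EMAIL_RE and is_valid_email are hypothetical.

```python
import re

# Simplified pattern (an assumption): local part, '@', domain, one or more dotted suffixes
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(\.[\w-]+)+')

def is_valid_email(address):
    """Return True if the whole string looks like an email address."""
    return EMAIL_RE.fullmatch(address) is not None

print(is_valid_email("python@gmail.com"))  # True
print(is_valid_email("java@hotmail"))      # False
print(is_valid_email("@gmail.com"))        # False
```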
Ex- Phone No. Validation-
import re
# Expected format: 3 digits-4 digits-3 digits, e.g. 123-4567-780
line = "456-999-9999, 123-9999-999, 9999-999-999"
res = re.findall(r'\d{3}-\d{4}-\d{3}', line)
print(res)
Output-
['123-9999-999']
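One caveat with the pattern above is that findall() can also match inside a longer run of digits. The sketch below adds word boundaries (\b) when scanning a line and uses fullmatch() when checking a single value. The third number in the input and the names PHONE_RE and is_valid_phone are made up for illustration.

```python
import re

PHONE_RE = re.compile(r'\d{3}-\d{4}-\d{3}')

def is_valid_phone(text):
    """Return True only if the whole string has the 3-4-3 digit format."""
    return PHONE_RE.fullmatch(text) is not None

line = "456-999-9999, 123-9999-999, 0123-9999-9990"

# Without boundaries, the pattern also matches inside the longer third number
loose = re.findall(r'\d{3}-\d{4}-\d{3}', line)
print(loose)   # ['123-9999-999', '123-9999-999']

# \b rejects matches that start or end in the middle of a digit run
strict = re.findall(r'\b\d{3}-\d{4}-\d{3}\b', line)
print(strict)  # ['123-9999-999']

print(is_valid_phone("123-9999-999"))  # True
print(is_valid_phone("456-999-9999"))  # False
```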