Python — Extracting Email addresses and domain names from strings
As Python developers, we often have to handle tasks such as cleansing data from a file before moving on to other business operations.
For example, you may have a raw text file from which you need to read specific data, such as email addresses and domain names, by performing regular expression matching.
What is a Regular Expression and which module is used in Python?
A regular expression is a sequence of characters, written in a specialized pattern syntax, that is mainly used to find and replace patterns in a string or file.
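As a small illustration of finding and replacing a pattern, here is a minimal sketch using Python's built-in re module. The sample sentence and the variable names are invented for this illustration.

```python
import re

# Hypothetical sample text, invented for this illustration
sentence = "Order 66 shipped on day 12 of month 5"

# \d+ matches one or more consecutive digits
numbers = re.findall(r"\d+", sentence)

# Replace every run of digits with a placeholder character
masked = re.sub(r"\d+", "#", sentence)

print(numbers)  # ['66', '12', '5']
print(masked)   # Order # shipped on day # of month #
```

re.findall returns every non-overlapping match as a list of strings, while re.sub returns a new string and leaves the original untouched.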
The Python module re provides full support for Perl-like regular expressions in Python. The re module raises the exception re.error if an error occurs while compiling or using a regular expression.
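To see re.error in practice, the sketch below tries to compile two patterns and catches the exception for the malformed one. The helper name is_valid_pattern is hypothetical and exists only for this example.

```python
import re

def is_valid_pattern(pattern):
    """Return True if the pattern compiles, False if re.error is raised."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised when the pattern is not a valid regular expression
        return False

print(is_valid_pattern(r"\w+@\w+"))  # True
print(is_valid_pattern(r"(\w+@"))    # False: the opening parenthesis is never closed
```

Catching re.error this way can be handy when a pattern comes from user input or a configuration file rather than from the source code itself.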
pandas is a fast, powerful, flexible, and easy-to-use open source Python package for data analysis and manipulation.
Example: a Python program to extract emails and domain names from a string using regular expressions.
# Importing module required for regular expressions
import re
import pandas as pd
# Example string
txt = "Ryan has sent an invoice email to john.d@yahoo.com by using his email id ryan.arjun@gmail.com and he also shared a copy to his boss rosy.gray@amazon.co.uk on the cc part."
# \S matches any non-whitespace character (\w matches only letters, digits, and underscore)
# @ for as in…