Regular Expressions

Symbols

Identifiers

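  • \d – any digit (0-9)
  • \D – anything but a digit
  • \s – any whitespace character (space, tab, new line)
  • \S – anything but whitespace
  • \w – any word character (letter, digit, or underscore)
  • \W – anything but a word character
  • . – any character, except for a new line
  • \b – word boundary (the position between a word character and a non-word character)
  • \. – a literal period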
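
A minimal sketch of how a few of these identifiers behave with re.findall. The sample string is made up for illustration and is not part of the original notebook.

```python
import re

# Hypothetical sample text, used only to illustrate the identifiers.
sample = "Room 42, floor 3. Call 555-0199!"

digits = re.findall(r'\d', sample)        # each single digit, 10 in total
chunks = re.findall(r'\S+', sample)       # runs of non-whitespace characters
words = re.findall(r'\b\w+\b', sample)    # whole words, delimited by word boundaries
periods = re.findall(r'\.', sample)       # literal periods only
no_newline = re.findall(r'.', "a\nb")     # '.' does not match the new line

print(words)       # ['Room', '42', 'floor', '3', 'Call', '555', '0199']
print(chunks)      # ['Room', '42,', 'floor', '3.', 'Call', '555-0199!']
print(no_newline)  # ['a', 'b']
```

Note that \b matches a position rather than a character, so it never appears in the results itself.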

Modifiers

  • {1,3} – match 1 to 3 repetitions of the preceding item
  • + – match 1 or more
  • ? – match 0 or 1
  • * (asterisk) – match 0 or more
  • $ – match the end of a string
  • ^ – match the beginning of a string
  • | – or
  • [ ] – character set or range (e.g. [A-Za-z])
  • {x} – match exactly x repetitions
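
The modifiers above can be tried one at a time on small inputs. This is only a sketch; every input string here is a hypothetical example chosen to show a single modifier.

```python
import re

# Each input below is made up purely to demonstrate one modifier.
one_to_three = re.findall(r'\d{1,3}', '7 42 12345')   # ['7', '42', '123', '45']
optional_u = re.findall(r'colou?r', 'color colour')   # ['color', 'colour']
zero_or_more = re.findall(r'ab*', 'a ab abbb')        # ['a', 'ab', 'abbb']
one_or_more = re.findall(r'ab+', 'a ab abbb')         # ['ab', 'abbb']
at_start = re.findall(r'^\w+', 'hello world')         # ['hello']
at_end = re.findall(r'\w+$', 'hello world')           # ['world']
either = re.findall(r'cat|dog', 'cat, dog, bird')     # ['cat', 'dog']
letters = re.findall(r'[A-Za-z]+', 'abc123DEF')       # ['abc', 'DEF']
exactly_four = re.findall(r'\d{4}', 'year 2024, code 12')  # ['2024']
```

In the first line, {1,3} is greedy: 12345 is split into 123 and 45 because the pattern takes up to three digits before starting a new match.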

White Space Characters

  • \n – new line
  • \s – any whitespace character
  • \t – tab
  • \e – escape (not recognized by Python's re module; use \x1b there)
  • \f – form feed
  • \r – return
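
As a small illustration of the whitespace characters, the sketch below picks tabs and line breaks out of a string and then splits on any whitespace. The raw string is a hypothetical example, not data from the original post.

```python
import re

# Hypothetical tab- and newline-separated text.
raw = "name\tage\r\nAda\t36\n"

tabs = re.findall(r'\t', raw)             # the two tab characters
line_breaks = re.findall(r'\r?\n', raw)   # ['\r\n', '\n']
fields = re.split(r'\s+', raw.strip())    # split on any run of whitespace

print(fields)  # ['name', 'age', 'Ada', '36']
```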

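Don't forget

These characters have a special meaning inside a pattern. To match one literally, escape it with a backslash:

. + * ? [ ] $ ^ ( ) { } | \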
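
A short sketch of why escaping matters, using a made-up price string. re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

# Hypothetical text; the goal is to match "3.50" and "$" literally.
text = "Total: $3.50 (was $3x50)"

unescaped = re.findall(r'3.50', text)   # '.' matches any character
escaped = re.findall(r'3\.50', text)    # '\.' matches only a period
dollars = re.findall(r'\$\d', text)     # '\$' is a literal dollar sign

# re.escape builds a pattern that matches the given string literally.
auto = re.findall(re.escape('$3.50'), text)

print(unescaped)  # ['3.50', '3x50']
print(escaped)    # ['3.50']
print(auto)       # ['$3.50']
```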

In [1]:
import re
In [2]:
exampleString = '''
Jessica is 15 years old, and Daniel is 27 years old.
Edward is 97, and his grandfather, Oscar, is 102.
'''
In [3]:
ages = re.findall(r'\d{1,3}',exampleString)
In [5]:
names = re.findall(r'[A-Z][a-z]*',exampleString)
In [6]:
ages
Out[6]:
['15', '27', '97', '102']
In [7]:
names
Out[7]:
['Jessica', 'Daniel', 'Edward', 'Oscar']
In [8]:
ageDict = {}
# Pair each name with the age found at the same position
for eachName, age in zip(names, ages):
    ageDict[eachName] = age
In [9]:
ageDict
Out[9]:
{'Daniel': '27', 'Edward': '97', 'Jessica': '15', 'Oscar': '102'}

Parsing websites with re and urllib

In [11]:
import urllib.request
import urllib.parse
In [12]:
url = 'http://pythonprogramming.net'
values = {'s':'basics',
         'submit':'search'}
In [13]:
data = urllib.parse.urlencode(values)
In [14]:
data = data.encode('utf-8')
In [15]:
req = urllib.request.Request(url,data)
In [16]:
resp = urllib.request.urlopen(req)
In [17]:
respData = resp.read()
In [18]:
respData
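
The request-building steps above can be checked without touching the network. The sketch below reuses the same url and values and only inspects the objects; it does not call urlopen.

```python
import urllib.parse
import urllib.request

url = 'http://pythonprogramming.net'
values = {'s': 'basics', 'submit': 'search'}

# urlencode turns the dict into a query string
query = urllib.parse.urlencode(values)
print(query)  # s=basics&submit=search

# Request expects bytes for its data argument, hence the encode step
data = query.encode('utf-8')
req = urllib.request.Request(url, data)

# Supplying data makes urllib send a POST instead of a GET
print(req.get_method())  # POST
```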

.*? – non-greedy match: capture any characters, but as few as possible, up to the next part of the pattern

In [21]:
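# str() on bytes gives the repr (b'...'), so \n, \t and \' show up as literal
# escape sequences in the output; respData.decode('utf-8') would give real text
paragraphs = re.findall(r'<p>(.*?)</p>',str(respData))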
In [22]:
for eachP in paragraphs:
    print(eachP)
Learn how to use Python with Pandas, Matplotlib, and other modules to gather insights from and about your data.
Control hardware with Python programming and the Raspberry Pi.
How to develop websites with either the Flask or Django frameworks for Python.
Create your own games with Python\'s PyGame library, or check out the multi-platform Kivy.
Learn the basic and intermediate Python fundamentals.
Just getting started?
Not a problem, learn the basics of programming with Python 3 here!
Create software with a user interface using Tkinter, PyQt, or Kivy.
Curious about more than just Python? While not covered in nearly as much depth, here are some tutorials in other languages:
Go is a programming language aimed at being simple, easy to work with, and capable of high performance.
\n\t\t\t\t\t\t<a href="#" class="btn btn-flat white modal-close">Cancel</a> &nbsp;\n\t\t\t\t\t\t<a href="#" class="waves-effect waves-blue blue btn btn-flat modal-action modal-close">Login</a>\n\t\t\t\t\t
\n\t\t\t\t\t\t\t\t<a href="#" class="btn btn-flat white modal-close">Cancel</a> &nbsp;\n\t\t\t\t\t\t\t\t<button class="btn" type=submit value=Register>Sign Up</button>\n\t\t\t\t\t\t\t
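
To see what the ? in .*? changes, compare it with the greedy .* on a tiny snippet. The HTML strings here are hypothetical stand-ins for a downloaded page.

```python
import re

# Hypothetical HTML snippet, not taken from the site above.
html = "<p>first</p><p>second</p>"

greedy = re.findall(r'<p>(.*)</p>', html)   # runs to the last </p>
lazy = re.findall(r'<p>(.*?)</p>', html)    # stops at the first </p>

print(greedy)  # ['first</p><p>second']
print(lazy)    # ['first', 'second']

# '.' does not match a new line, so a paragraph spanning lines is
# missed unless the re.DOTALL flag is passed.
multiline = "<p>line one\nline two</p>"
missed = re.findall(r'<p>(.*?)</p>', multiline)
found = re.findall(r'<p>(.*?)</p>', multiline, re.DOTALL)

print(missed)  # []
print(found)   # ['line one\nline two']
```

Regular expressions are fine for quick extraction like this, though a real HTML parser is usually the safer choice for nested or irregular markup.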
