Topics
♦ Persistent object shelves
♦ Storing class instances
♦ Pickling without shelves
♦ DBM-style files
♦ Shelve gotchas
♦ ZODB object-oriented database
♦
Python SQL database
♦ Odds and ends
● Shelve = dbm file + object pickling (serialization)
● Stores arbitrary Python objects by string key
● Shelves are processed with normal Python code
● Pickler handles nested and circular objects
Basic usage
►Shelves are dictionaries that must be opened
import shelve
dbase = shelve.open("mydbase")
►Assigning to a shelve key stores an object
dbase['key'] = object
►Indexing a shelve fetches a stored object
value = dbase['key']
►Most dictionary operations supported
len(dbase) # number items stored
dbase.keys() # stored item key index
■ Creating a new shelve
● Creates dbm file(s) to store objects
● Manual ‘close’ needed for some file systems
% python
>>> import shelve
>>> dbase = shelve.open("mydbase")
>>> object1 = ['The', 'bright', ('side', 'of'), ['life']]
>>> object2 = {'name': 'Brian', 'age': 33, 'motto': object1}
>>> dbase['brian'] = object2
>>> dbase['knight'] = {'name': 'Knight', 'motto': 'Ni!'}
>>> dbase.close()
■ Using an existing shelve
% python
>>> import shelve
>>> dbase = shelve.open("mydbase")
>>> len(dbase) # entries
2
>>> dbase.keys() # index
['knight', 'brian']
>>> dbase['knight'] # fetch
{'motto': 'Ni!', 'name': 'Knight'}
>>> for row in dbase.keys():
... print row, '=>'
... for field in dbase[row].keys():
... print ' ', field, '=', dbase[row][field]
...
knight =>
motto = Ni!
name = Knight
brian =>
motto = ['The', 'bright', ('side', 'of'), ['life']]
age = 33
name = Brian
● Classes defined at top-level of module
● No non-default constructor arguments (until 1.5.1!)
● Python stores instance __dict__, not class
● Changing class changes stored object behavior
file: person.py
# a person object: fields + behavior
class Person:
def __init__(self, name='', job='', pay=0):
self.name = name
self.job = job
self.pay = pay # real instance data
def tax(self):
return self.pay * 0.25 # computed on call
def info(self):
return self.name, self.job, self.pay, self.tax()
% python
>>> from person import Person
>>> bob = Person('bob', 'psychologist', 70000)
>>> emily = Person('emily', 'teacher', 40000)
>>>
>>> import shelve
>>> dbase = shelve.open('cast') # make new shelve
>>> for obj in (bob, emily): # store objects
>>> dbase[obj.name] = obj
>>> dbase.close() # need for bsddb
% python
>>> import shelve
>>> dbase = shelve.open('cast') # reopen shelve
>>> print dbase['bob'].tax() # call: bob's tax
17500.0
Changing classes changes behavior
● Classes = records + processing programs
● Example: changing the Person.tax method
file: person.py
# a person object: fields + behavior
# change: the tax method is now a virtual member
class Person:
def __init__(self, name='', job='', pay=0):
self.name = name
self.job = job
self.pay = pay # real instance data
def __getattr__(self, attr): # on person.attr
if attr == 'tax':
return self.pay * 0.30 # computed on access
else:
raise AttributeError # other unknown names
def info(self):
return self.name, self.job, self.pay, self.tax
% python
>>> import shelve
>>> dbase = shelve.open('cast') # reopen shelve
>>> print dbase['bob'].tax # no need to call tax()
21000.0
● Shelve = dbm file + object pickler
● Pickler serializes Python objects into text streams
● Streams may be sent to flat file, dbm file, socket, etc.
● Also see: ‘marshal’ module (object limitations)
file: testpickle.py
import pickle
def saveDbase(filename, table):
file = open(filename, 'w')
pickle.dump(table, file) # pickle to file
file.close()
def loadDbase(filename):
file = open(filename, 'r')
table = pickle.load(file) # unpickle from file
file.close()
return table
% python
>>> from testpickle import *
>>> L = [0]
>>> D = {'x':0, 'y':L}
>>> table = {'A':L, 'B':D} # L appears twice (D)
>>> saveDbase('myfile', table) # serialize to file
% python
>>> from testpickle import *
>>> table = loadDbase('myfile') # reload/unpickle
>>> print table
{'B': {'x': 0, 'y': [0]}, 'A': [0]}
>>> table['A'][0] = 1 # change shared list
>>> saveDbase('myfile', table)
% python
>>> from testpickle import *
>>> print loadDbase('myfile') # both L’s updated!
{'B': {'x': 0, 'y': [1]}, 'A': [1]}
● Shelve = dbm file + object pickler
● Dbm files can only store strings (not any object)
● ‘anydbm’ selects from dbhash, gdbm, dbm or dumbdbm
● Dictionary-like interface like shelve (shelve uses dbm)
% python
>>> import anydbm
>>> file = anydbm.open('languages', 'c') # create, in+out
>>> file['perl'] = "Text processing" # store
>>> file['tcl'] = "Simple glue"
>>> file['python'] = "OO scripting"
>>> file.close()
% python
>>> import anydbm
>>> file = anydbm.open('languages', 'c') # existing file
>>> file['python'] # fetch
'OO scripting'
>>> for lang in file.keys(): # index
... print lang + ':', file[lang]
...
perl: Text processing
tcl: Simple glue
python: OO scripting
>>> del file['tcl'] # delete
% python
>>> import anydbm
>>> anydbm.open('languages', 'c').keys() # sorry, tcl!
['python', 'perl']
►Keys must still be strings
dbase[42] = value # fails
►Shared objects are only noticed within a given slot
dbase[key] = object # store parts just once
dbase[key] = object # two copies of object!
►Updates must treat as fetch-modify-store mappings
dbase[key].attr = value # shelve unchanged!
object = dbase[key] # fetch it
object.attr = value # modify it
dbase[key] = object # store back
►Doesn't support simultaneous
updates by itself (e.g.
Plus class pickling rules
►At the top-level of a module (importable)
►No non-default constructor arguments (until 1.5.1!)
►Class changes must be backward-compatible
►Classes can use special methods to break constructor rule
● Full-blown OODB, 3rd party open source add-on
● ZODB – Zope’s advanced OODB, available by itself
● Adds object identifiers to Python pickling to do write-thru on change (via a mixed-in superclass)
● Supports concurrent update, transaction commit/rollback, write-thru on changes
● Like shelve, but extra boilerplate code
● Concurrent updates for multi-threaded
● ZEO distributed objects server for multi-process
● Can map ZODB db to flat files, relation dbase, other
● Legacy from the web: faster at reads than writes
→ See also: newer SQLObject and SQLAlchemy Python relational mappers (more ahead)
Creating a ZODB database
…\PP3E\Database\ZODBscripts\>python
>>>
from ZODB import FileStorage, DB
>>>
storage = FileStorage.FileStorage(r'C:\Mark\temp\mydb.fs')
>>>
db = DB(storage)
>>>
connection = db.open()
>>>
root = connection.root()
>>>
object1 = (1, 'spam', 4, 'YOU')
>>>
object2 = [[1, 2, 3], [4, 5, 6]]
>>>
object2.append([7, 8, 9])
>>>
object2
[[1,
2, 3], [4, 5, 6], [7, 8, 9]]
>>>
>>>
object3 = {'name': ['Bob', 'Doe'],
'age': 42,
'job': ('dev', 'mgr')}
>>>
root['mystr']
= 'spam' * 3
>>>
root['mytuple'] = object1
>>>
root['mylist']
= object2
>>>
root['mydict']
= object3
>>>
root['mylist']
[[1,
2, 3], [4, 5, 6], [7, 8, 9]]
>>>
import transaction
>>>
transaction.commit()
>>>
storage.close()
Using a ZODB database
…\PP3E\Database\ZODBscripts\>python
>>>
from ZODB import FileStorage, DB
>>>
storage = FileStorage.FileStorage(r'C:\Mark\temp\mydb.fs')
>>>
db = DB(storage)
>>>
connection = db.open()
>>>
root = connection.root() # connect
>>>
print len(root), root.keys() # size, ix
4
['mylist', 'mystr', 'mytuple', 'mydict']
>>>
>>>
print root['mylist'] # fetch obs
[[1,
2, 3], [4, 5, 6], [7, 8, 9]]
>>>
print root['mydict']
{'job':
('dev', 'mgr'), 'age': 42,
'name': ['Bob', 'Doe']}
>>>
root['mydict']['name'][-1] # last name
'Doe'
>>>
for key in root.keys():
print key.ljust(10),
'=>', root[key]
mylist => [[1, 2, 3], [4,
5, 6], [7, 8, 9]]
mystr => spamspamspam
mytuple => (1, 'spam', 4,
'YOU')
mydict => {'job': ('dev', 'mgr'), 'age': 42, 'name':
['Bob', 'Doe']}
Classes under ZODB
# zodb_class_make
import
time
mydbfile = 'data/class.fs' # dbase file
from
persistent import Persistent
class
BookMark(Persistent): # use zodb
def __init__(self, title, url):
self.hits = 0
self.updateBookmark(self,
url)
def updateBookmark(self, title, url):
self.title =
title # updates db
self.url = url
self.modtime
= time.asctime()
def
connectdb(dbfile):
from ZODB import FileStorage,
DB
storage = FileStorage.FileStorage(dbfile)
db =
DB(storage)
connection = db.open()
root = connection.root()
return root, storage
def
addobjects():
root, storage = connectdb(mydbfile)
root['ora'] = BookMark('Oreilly', 'http://www.oreilly.com')
root['pp3e'] = BookMark('PP3E',
'http://www.rmi.net/~lutz/about-pp.html')
import transaction
transaction.commit()
storage.close()
# zodb_class_read
mydbfile = 'data/class.fs'
from
zodb_class_make import connectdb
root,
storage = connectdb(mydbfile)
#
this updates db: attrs
changed in method
print
'pp3e url:', root['pp3e'].url
print
'pp3e mod:', root['pp3e'].modtime
root['pp3e'].updateBookmark('PP3E',
'www.rmi.net/~lutz/about-pp3e.html')
#
this updates too: attr changed here
ora
= root['ora']
print
'ora hits:', ora.hits
ora.hits += 1
#
commit changes made
import
transaction
transaction.commit()
storage.close()
● Support for Oracle, Sybase, Informix, ODBC, Postgres, mySql,…
● Portable
Connection objects
represent a connection to a database
are the interface to rollback and commit operations
generate cursor objects
Cursor objects
represent a single SQL statement submitted as a string
can be used to step through SQL statement results
can execute DDL (create), DML (insert), and DQL (select) statements
Query results
SQL select results returned to scripts as Python data
Tables of rows are Python lists of tuples
Field values within rows are normal Python objects
An example query result: [('bob',38), ('emily',37)]
########################################
#
example of portable database
########################################
from
cx_Oracle import Connect
connobj = Connect("user/password@system")
cursobj = connobj.cursor()
value1,value2
='developer', 40
query
= 'SELECT name, shoesize FROM empl
WHERE job = ?
cursobj.execute(query, (value1, value2))
results
= cursobj.fetchall()
for
(name, size) in results:
print name, size
Other DB
● Select results: fetchall(), fetchone(), fetchmany()
● Transactions: commit, rollback on connection object
● update, insert, delete statements: rowcount
● Stored procedure methods
● Blobs
● Thread Safety check
● Description field: column names, types
Other SQL DB concepts
● SQLObject 3rd party extension object relational mapper: table/row/column → class/instance/attr
● SQLAlchemy: similar mapper system
● Full
Sqlite – standard library in Python 2.5
● In-process C lib based SQL engine, no server
● Useful for internal program data, prototyping
import sqlite3
conn =
sqlite3.connect('/tmp/example') # or “:memory:”
c = conn.cursor()
# Create table
c.execute('''create table stocks
(date text, trans text, symbol text,
qty real, price real)''')
# Insert a row of data
c.execute("""insert into stocks
values
('2006-01-05','BUY','RHAT',100,35.14)""")
# run a query
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)
for t in (('2006-03-28',
'BUY', 'IBM', 1000, 45.00),
('2006-04-05', 'BUY', 'MSOFT', 1000,
72.00),
('2006-04-06', 'SELL', 'IBM', 500,
53.00),
):
c.execute('insert
into stocks values (?,?,?,?,?)', t)
MySQL
● Server-based database engine
● Install MySQL, msql-python
from MySQLdb
import Connect
conn = Connect(host='localhost', user='root', passwd='python')
curs = conn.cursor()
try:
curs.execute('drop
database testpeopledb')
except:
pass
# did not exist
curs.execute('create database testpeopledb')
curs.execute('use testpeopledb')
curs.execute('create table people (name char(30), job char(10),
pay int(4))')
curs.execute('insert people values (%s, %s, %s)',
('Bob', 'dev', 50000))
curs.execute('insert people values (%s, %s, %s)',
('Sue', 'dev', 60000))
curs.execute('select * from people')
for row in curs.fetchall():
print row
curs.execute('select * from people')
colnames = [desc[0] for desc in curs.description]
while True:
print '-' * 30
row = curs.fetchone()
if not row: break
for (name, value) in zip(colnames, row):
print '%s => %s' % (name, value)
conn.commit() # save
inserted records
Informix
>>> import informixdb
>>> conn = informixdb.connect('mydatabase')
>>> cursor = conn.cursor()
>>> cursor.execute("SELECT
* FROM names")
>>> cursor.fetchall()
[('donald', 'duck', 34),
('mickey', 'mouse', 23)]
>>> import informixdb
>>> conn = informixdb.connect('db@daniel', user='me', password='something')
>>> cursor.execute('INSERT
INTO names VALUES(:1,:2,:3)', ('some', 'body', 56))
1
>>> cursor.execute('INSERT
INTO names VALUES(?,?,?)', ('dagobert', 'duck', 77))
1
PostgreSQL: see interface, docs online
● Flat files
■ Objects and descriptors
● ‘marshal’ module
■ Serializes objects like pickler, but only simple types
● Performance
■ ‘cPickle’ module is 1 to 3 orders of magnitude faster than ‘pickle’ (auto selected by shelve)
● Database interfaces available
■ At python.org or search: Oracle, Sybase, PostgreSQL Informix, ODBC, MySQL, etc.
■
(See above) Portable database
► Alternatives: JSON and MongoDB (structures)
>>> import json
>>> rec = {'name': 'Bob', 'job': ['dev', 'mgr'], 'age': 40.5}
>>> text = json.dumps(rec)
>>> text
'{"age": 40.5, "name": "Bob", "job": ["dev", "mgr"]}'
>>> rec2 = json.loads(text)
>>> rec2
{'age': 40.5, 'name': 'Bob', 'job': ['dev', 'mgr']}
► Alternatives: AppEngine, AWS, Azure (cloud)
► Alternative: Pandas (big data)
Click here to go to
lab exercises
Click here to go to
exercise solutions
Click here to go to solution source files
Click here to go to
lecture example files