12. Databases and Persistence

 

 

 

 

 

 

Topics

 

 

     Persistent object shelves

     Storing class instances

     Pickling without shelves

     DBM-style files

     Shelve gotchas

     ZODB object-oriented database

     Python SQL database API

     Odds and ends

 


 

Object persistence: shelves

 

 

 

    Shelve = dbm file + object pickling (serialization)

    Stores arbitrary Python objects by string key

    Shelves are processed with normal Python code

    Pickler handles nested and circular objects

 

 

 

Basic usage

 

 

Shelves are dictionaries that must be opened

      import shelve

      dbase = shelve.open("mydbase")

 

Assigning to a shelve key stores an object

      dbase['key'] = object

 

Indexing a shelve fetches a stored object

      value = dbase['key']

 

Most dictionary operations supported

      len(dbase)      # number items stored

      dbase.keys()    # stored item key index


 

 

 

    Creating a new shelve

      Creates dbm file(s) to store objects

      Manual ‘close’ needed for some file systems 

 

% python

>>> import shelve

>>> dbase = shelve.open("mydbase")

>>> object1 = ['The', 'bright', ('side', 'of'), ['life']]

>>> object2 = {'name': 'Brian', 'age': 33, 'motto': object1}

>>> dbase['brian']  = object2

>>> dbase['knight'] = {'name': 'Knight', 'motto': 'Ni!'}

>>> dbase.close()

 

 

 

 

    Using an existing shelve

 

% python

>>> import shelve

>>> dbase = shelve.open("mydbase")

>>> len(dbase)                            # entries

2

 

>>> dbase.keys()                          # index

['knight', 'brian']

 

>>> dbase['knight']                       # fetch

{'motto': 'Ni!', 'name': 'Knight'}

 

>>> for row in dbase.keys():

...     print row, '=>'

...     for field in dbase[row].keys():

...         print '  ', field, '=', dbase[row][field]

...

knight =>

   motto = Ni!

   name = Knight

brian =>

   motto = ['The', 'bright', ('side', 'of'), ['life']]

   age = 33

   name = Brian


 

 

 

Storing class instances

 

 

    Classes defined at top-level of module

    No non-default constructor arguments (until 1.5.1!)

    Python stores instance __dict__, not class

    Changing class changes stored object behavior

 

 

file: person.py

# a person object: fields + behavior

 

class Person:

    def __init__(self, name='', job='', pay=0):

        self.name = name

        self.job  = job

        self.pay  = pay               # real instance data

    def tax(self):

        return self.pay * 0.25        # computed on call

    def info(self):

        return self.name, self.job, self.pay, self.tax()

 

 

% python

>>> from person import Person

>>> bob   = Person('bob', 'psychologist', 70000)

>>> emily = Person('emily', 'teacher', 40000)

>>> 

>>> import shelve

>>> dbase = shelve.open('cast')          # make new shelve

>>> for obj in (bob, emily):             # store objects

>>>     dbase[obj.name] = obj

>>> dbase.close()                        # need for bsddb

 

% python

>>> import shelve

>>> dbase = shelve.open('cast')          # reopen shelve

>>> print dbase['bob'].tax()             # call: bob's tax

17500.0


 

 

Changing classes changes behavior

 

 

    Classes = records + processing programs

    Example: changing the Person.tax method

 

 

 

file: person.py

# a person object: fields + behavior

# change: the tax method is now a virtual member

 

class Person:

    def __init__(self, name='', job='', pay=0):

        self.name = name

        self.job  = job

        self.pay  = pay               # real instance data

    def __getattr__(self, attr):      # on person.attr

        if attr == 'tax':

            return self.pay * 0.30    # computed on access

        else:

            raise AttributeError      # other unknown names

    def info(self):

        return self.name, self.job, self.pay, self.tax

 

 

 

 

% python

>>> import shelve

>>> dbase = shelve.open('cast')      # reopen shelve

>>> print dbase['bob'].tax           # no need to call tax()

21000.0

 

 


 

 

Pickling objects without shelves

 

 

      Shelve = dbm file + object pickler

      Pickler serializes Python objects into text streams

      Streams may be sent to flat file, dbm file, socket, etc.

      Also see: ‘marshal’ module (object limitations)

 

 

file: testpickle.py

import pickle

 

def saveDbase(filename, table):

    file = open(filename, 'w')

    pickle.dump(table, file)        # pickle to file

    file.close()

 

def loadDbase(filename):

    file = open(filename, 'r')

    table = pickle.load(file)       # unpickle from file

    file.close()

    return table

 

 

% python

>>> from testpickle import *

>>> L = [0]

>>> D = {'x':0, 'y':L}

>>> table = {'A':L, 'B':D}            # L appears twice (D)

>>> saveDbase('myfile', table)        # serialize to file

 

% python

>>> from testpickle import *

>>> table = loadDbase('myfile')       # reload/unpickle

>>> print table

{'B': {'x': 0, 'y': [0]}, 'A': [0]}

>>> table['A'][0] = 1                 # change shared list

>>> saveDbase('myfile', table)

 

% python

>>> from testpickle import *

>>> print loadDbase('myfile')         # both L’s updated!

{'B': {'x': 0, 'y': [1]}, 'A': [1]}


 

 

 

Using simple dbm files

 

 

      Shelve = dbm file + object pickler

      Dbm files can only store strings (not any object)

      anydbm’ selects from dbhash, gdbm, dbm or dumbdbm

      Dictionary-like interface like shelve (shelve uses dbm)

 

 

 

% python

>>> import anydbm

>>> file = anydbm.open('languages', 'c')    # create, in+out

>>> file['perl']   = "Text processing"      # store

>>> file['tcl']    = "Simple glue"

>>> file['python'] = "OO scripting"

>>> file.close()

 

 

 

% python

>>> import anydbm

>>> file = anydbm.open('languages', 'c')    # existing file

>>> file['python']                          # fetch

'OO scripting'

>>> for lang in file.keys():                # index

...     print lang + ':', file[lang]

...

perl: Text processing

tcl: Simple glue

python: OO scripting

>>> del file['tcl']                         # delete

 

 

 

% python

>>> import anydbm

>>> anydbm.open('languages', 'c').keys()    # sorry, tcl!

['python', 'perl']

 

 

 

 


 

Shelve gotchas

 

 

 

Keys must still be strings

      dbase[42] = value      # fails

 

 

Shared objects are only noticed within a given slot

      dbase[key] = object   # store parts just once

      dbase[key] = object   # two copies of object!

 

 

Updates must treat as fetch-modify-store mappings

      dbase[key].attr = value   # shelve unchanged!

     

      object = dbase[key]       # fetch it

      object.attr = value       # modify it

      dbase[key] = object       # store back

 

 

Doesn't support simultaneous updates by itself (e.g. CGI scripts)

 

 

Plus class pickling rules

 

At the top-level of a module (importable)

No non-default constructor arguments (until 1.5.1!)

Class changes must be backward-compatible

Classes can use special methods to break constructor rule

 

 

 

 

ZODB object-oriented database

 

 

      Full-blown OODB, 3rd party open source add-on

      ZODB – Zope’s advanced OODB, available by itself

      Adds object identifiers to Python pickling to do write-thru on change (via a mixed-in superclass)

      Supports concurrent update, transaction commit/rollback, write-thru on changes

      Like shelve, but extra boilerplate code

      Concurrent updates for multi-threaded

      ZEO distributed objects server for multi-process

      Can map ZODB db to flat files, relation dbase, other

      Legacy from the web: faster at reads than writes

 

See also: newer SQLObject and SQLAlchemy Python relational mappers (more ahead)

 

 

 

Creating a ZODB database

 

…\PP3E\Database\ZODBscripts\>python

>>> from ZODB import FileStorage, DB

>>> storage = FileStorage.FileStorage(r'C:\Mark\temp\mydb.fs')

>>> db = DB(storage)

>>> connection = db.open()

>>> root = connection.root()

 

 

>>> object1 = (1, 'spam', 4, 'YOU')

>>> object2 = [[1, 2, 3], [4, 5, 6]]

>>> object2.append([7, 8, 9])

>>> object2

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

>>> 

>>> object3 = {'name': ['Bob', 'Doe'],

               'age':  42,

               'job':  ('dev', 'mgr')}

 

>>> root['mystr']   = 'spam' * 3

>>> root['mytuple'] = object1

>>> root['mylist']  = object2

>>> root['mydict']  = object3

>>> root['mylist']

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

 

>>> import transaction

>>> transaction.commit()

>>> storage.close()

 

 

 

Using a ZODB database

 

…\PP3E\Database\ZODBscripts\>python

>>> from ZODB import FileStorage, DB

>>> storage = FileStorage.FileStorage(r'C:\Mark\temp\mydb.fs')

>>> db = DB(storage)

>>> connection = db.open()

>>> root = connection.root()                          # connect

 

>>> print len(root), root.keys()                      # size, ix

4 ['mylist', 'mystr', 'mytuple', 'mydict']

>>> 

>>> print root['mylist']                              # fetch obs

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

>>> print root['mydict']

{'job': ('dev', 'mgr'), 'age': 42, 'name': ['Bob', 'Doe']}

 

>>> root['mydict']['name'][-1]                        # last name

'Doe'

 

>>> for key in root.keys():

        print key.ljust(10), '=>', root[key]

 

mylist     => [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

mystr      => spamspamspam

mytuple    => (1, 'spam', 4, 'YOU')

mydict     => {'job': ('dev', 'mgr'), 'age': 42, 'name': ['Bob', 'Doe']}

 

 

 

Classes under ZODB

 

 

# zodb_class_make

 

import time

mydbfile = 'data/class.fs'                         # dbase file

from persistent import Persistent

 

class BookMark(Persistent):                        # use zodb

    def __init__(self, title, url):

        self.hits = 0

        self.updateBookmark(self, url)

    def updateBookmark(self, title, url):

        self.title = title                         # updates db

        self.url = url

        self.modtime = time.asctime()

 

def connectdb(dbfile):

    from ZODB import FileStorage, DB

    storage = FileStorage.FileStorage(dbfile) 

    db = DB(storage)                          

    connection = db.open()

    root = connection.root()

    return root, storage

 

def addobjects():

    root, storage = connectdb(mydbfile)

    root['ora']  = BookMark('Oreilly', 'http://www.oreilly.com')

    root['pp3e'] = BookMark('PP3E',  

 'http://www.rmi.net/~lutz/about-pp.html')

    import transaction

    transaction.commit()

    storage.close()

 

 

 

# zodb_class_read

 

mydbfile = 'data/class.fs'

from zodb_class_make import connectdb

root, storage = connectdb(mydbfile)

 

# this updates db: attrs changed in method

print 'pp3e url:', root['pp3e'].url

print 'pp3e mod:', root['pp3e'].modtime

root['pp3e'].updateBookmark('PP3E',

 'www.rmi.net/~lutz/about-pp3e.html')

 

# this updates too: attr changed here

ora = root['ora']

print 'ora hits:', ora.hits

ora.hits += 1

 

# commit changes made

import transaction

transaction.commit()

storage.close()

 

 

 


 

 

 

Python SQL database API

 

 

      Support for Oracle, Sybase, Informix, ODBC, Postgres, mySql,…

      Portable API Implemented by vendor-specific extension modules

 

 

 

Connection objects

represent a connection to a database

are the interface to rollback and commit operations

generate cursor objects

 

Cursor objects

represent a single SQL statement submitted as a string

can be used to step through SQL statement results

can execute DDL (create), DML (insert), and DQL (select) statements

 

Query results

SQL select results returned to scripts as Python data

Tables of rows are Python lists of tuples

Field values within rows are normal Python objects

An example query result: [('bob',38), ('emily',37)]

 

 

 

 

 

########################################

# example of portable database API usage

########################################

 

from cx_Oracle import Connect

connobj = Connect("user/password@system")

cursobj = connobj.cursor()

 

value1,value2 ='developer', 40

query = 'SELECT name, shoesize FROM empl WHERE job = ? AND age = ?'

 

cursobj.execute(query, (value1, value2))

results = cursobj.fetchall()

for (name, size) in results:

    print name, size

 

 

 

 

Other DB API concepts

 

      Select results: fetchall(), fetchone(), fetchmany()

      Transactions: commit, rollback on connection object

      update, insert, delete statements: rowcount

      Stored procedure methods

      Blobs

      Thread Safety check

      Description field: column names, types

 

 

Other SQL DB concepts

 

      SQLObject 3rd party extension object relational mapper:   table/row/column class/instance/attr

      SQLAlchemy: similar mapper system

      Full API at python.org: search Google for “Python Database API”  --  PEP, 2.0

 

 

 

 

Sqlite – standard library in Python 2.5

    In-process C lib based SQL engine, no server

    Useful for internal program data, prototyping

 

import sqlite3

conn = sqlite3.connect('/tmp/example')    # or “:memory:”

c = conn.cursor()

 

# Create table

c.execute('''create table stocks

(date text, trans text, symbol text,

           qty real, price real)''')

 

# Insert a row of data

c.execute("""insert into stocks

          values ('2006-01-05','BUY','RHAT',100,35.14)""")

 

# run a query

t = (symbol,)

c.execute('select * from stocks where symbol=?', t)

 

for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),

          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),

          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),

         ):

    c.execute('insert into stocks values (?,?,?,?,?)', t)

 

 

 

 

MySQL

    Server-based database engine

    Install MySQL, msql-python

 

 

from MySQLdb import Connect

conn = Connect(host='localhost', user='root', passwd='python')

curs = conn.cursor()

try:

    curs.execute('drop database testpeopledb')

except:

    pass  # did not exist

 

curs.execute('create database testpeopledb')

curs.execute('use testpeopledb')

curs.execute('create table people (name char(30), job char(10), pay int(4))')

 

curs.execute('insert people values (%s, %s, %s)',

('Bob', 'dev', 50000))

curs.execute('insert people values (%s, %s, %s)',

('Sue', 'dev', 60000))

 

curs.execute('select * from people')

for row in curs.fetchall():

     print row

 

curs.execute('select * from people')

colnames = [desc[0] for desc in curs.description]

while True:

    print '-' * 30

    row = curs.fetchone()

    if not row: break

    for (name, value) in zip(colnames, row):

        print '%s => %s' % (name, value)

 

conn.commit()   # save inserted records

 

 

 

Informix

 

>>> import informixdb

>>> conn = informixdb.connect('mydatabase')

>>> cursor = conn.cursor()

>>> cursor.execute("SELECT * FROM names")

>>> cursor.fetchall()

[('donald', 'duck', 34), ('mickey', 'mouse', 23)]

 

 

 

>>> import informixdb

>>> conn = informixdb.connect('db@daniel', user='me', password='something')

>>> cursor.execute('INSERT INTO names VALUES(:1,:2,:3)', ('some', 'body', 56))

1

>>> cursor.execute('INSERT INTO names VALUES(?,?,?)', ('dagobert', 'duck', 77))

1

 

 

 

 

PostgreSQL: see interface, docs online

 

 

 

 

 

 

Persistence odds and ends

 

 

 

    Flat files

      Objects and descriptors

 

 

    ‘marshal’ module

      Serializes objects like pickler, but only simple types

 

 

    Performance

      cPickle’ module is 1 to 3 orders of magnitude faster than ‘pickle’ (auto selected by shelve)

 

 

    Database interfaces available

      At python.org or search: Oracle, Sybase, PostgreSQL Informix, ODBC, MySQL, etc.

      (See above) Portable database API: works on many systems

 

 

 

         Alternatives: JSON and MongoDB (structures)

>>> import json

>>> rec = {'name': 'Bob', 'job': ['dev', 'mgr'], 'age': 40.5}

>>> text = json.dumps(rec)

>>> text

'{"age": 40.5, "name": "Bob", "job": ["dev", "mgr"]}'

>>> rec2 = json.loads(text)

>>> rec2

{'age': 40.5, 'name': 'Bob', 'job': ['dev', 'mgr']}

 

 

         Alternatives: AppEngine, AWS, Azure (cloud)

         Alternative: Pandas (big data)

 

 

 

 

Lab Session 9

 

Click here to go to lab exercises

Click here to go to exercise solutions

Click here to go to solution source files

 

Click here to go to lecture example files