To JSON, Pickle or Marshal in Python


8th of May 2009

I was reading David Cramer's tip to use JSONField in Django to be able to store arbitrary fields in a SQL database. Nice. But is it fast enough? Well, I can't answer that, but I did look into the difference in read/write performance between simplejson, cPickle and marshal.

Only reading:

 JSON 0.00593531370163
 PICKLE 0.0109532237053
 MARSHAL 0.00413788318634

Reading and writing:

 JSON 0.0434390544891
 PICKLE 0.0289686655998
 MARSHAL 0.00728442907333

Clearly marshal is the fastest of the three, but to quote the documentation:

"Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."

Clearly simplejson is a very fast reader and the JSON format has the delicious advantage that it's "human readable" (compared to the others).
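
To make the "human readable" point concrete, here's a minimal sketch. It assumes Python 3 and the standard library's json, pickle and marshal modules rather than the simplejson and cPickle used for the numbers above, and the `record` dict is made up purely for illustration.

```python
import json
import marshal
import pickle

# Hypothetical sample record, not taken from the benchmark data.
record = {"name": "Peter", "tags": ["python", "json"], "age": 30}

as_json = json.dumps(record)        # a str of plain text
as_pickle = pickle.dumps(record)    # bytes in pickle's binary protocol
as_marshal = marshal.dumps(record)  # bytes in marshal's internal format

print(as_json)           # readable and editable by hand
print(repr(as_pickle))   # opaque binary
print(repr(as_marshal))  # opaque binary

# All three round-trip this simple structure back to an equal dict.
assert json.loads(as_json) == record
assert pickle.loads(as_pickle) == record
assert marshal.loads(as_marshal) == record
```

Only the JSON output is something you could paste into an email or fix up in a text editor.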

NOTE! I spent about 5 minutes putting together the script and about 10 minutes writing this so feel free to doubt its scientific accuracy.
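
For anyone who wants to repeat the comparison, here's a rough sketch of how such a script might look. This is not the original script: it assumes Python 3, uses the standard library's json and pickle in place of simplejson and cPickle, and the `make_data` and `benchmark` helpers and the sample data are hypothetical. Expect different numbers on your machine.

```python
import json
import marshal
import pickle
from time import perf_counter


def make_data(n=1000):
    # Hypothetical sample data; the data set behind the numbers above is unknown.
    return [
        {"id": i, "name": "item %d" % i, "values": [i, i * 2.5, str(i)]}
        for i in range(n)
    ]


def benchmark(dumps, loads, data, repeats=10):
    """Return (read_seconds, read_write_seconds), each averaged over repeats."""
    blob = dumps(data)

    # Only reading: deserialize an already serialized blob.
    t0 = perf_counter()
    for _ in range(repeats):
        loads(blob)
    read = (perf_counter() - t0) / repeats

    # Reading and writing: serialize, then deserialize.
    t0 = perf_counter()
    for _ in range(repeats):
        loads(dumps(data))
    read_write = (perf_counter() - t0) / repeats

    return read, read_write


data = make_data()
results = {}
for name, module in (("JSON", json), ("PICKLE", pickle), ("MARSHAL", marshal)):
    results[name] = benchmark(module.dumps, module.loads, data)
    print(name, *results[name])
```

This works in memory with `dumps` and `loads` so that disk speed doesn't muddy the comparison. Whether the original numbers included file I/O isn't something I can say.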

Also, just because JSON wrote slowest here doesn't mean it's slow. Look at this code for example:

 >>> import simplejson
 >>> d=simplejson.load(open('classes.json'))
 >>> len(open('classes.json').read())
 114254
 >>> from time import time
 >>> def test():
 ...     t0=time(); simplejson.dump(d, open('/tmp/write.json','w')); t1=time()
 ...     return t1-t0
 ... 
 >>> test()
 0.06772303581237793
 >>> test()
 0.076719999313354492
 >>> test()
 0.081094026565551758

That's right! Less than a tenth of a second to write more than 100 KB of data.


