Secs sell! How I cache my entire pages (server-side)
10 May 2012
1 comment
Python, Django
http://www.peterbe.com/stats/
I've blogged before about how this site can easily push out over 2,000 requests/second using only 6 WSGI workers, excluding latency. That's possible because whole pages can be cached server-side: the entire rendered HTML blob is stored in the cache server (Redis in my case), so no database queries are needed at all.
I wanted my site to still "feel" dynamic in the sense that once you post a comment (and it's published), the page automatically invalidates the cache, so the user doesn't have to refresh the browser to see a change they know should be there. To accomplish this I used a hacked cache_page decorator that makes the cache key depend on the content the page depends on. Here's the code I actually use today for the home page:
def _home_key_prefixer(request):
    if request.method != 'GET':
        return None
    prefix = urllib.urlencode(request.GET)
    cache_key = 'latest_comment_add_date'
    latest_date = cache.get(cache_key)
    if latest_date is None:
        # when a blog comment is posted, the blog modify_date is incremented
        latest, = (BlogItem.objects
                   .order_by('-modify_date')
                   .values('modify_date')[:1])
        latest_date = latest['modify_date'].strftime('%f')
        cache.set(cache_key, latest_date, 60 * 60)
    prefix += str(latest_date)
    try:
        redis_increment('homepage:hits', request)
    except Exception:
        logging.error('Unable to redis.zincrby', exc_info=True)
    return prefix

@cache_page_with_prefix(60 * 60, _home_key_prefixer)
def home(request, oc=None):
    ...
    try:
        redis_increment('homepage:misses', request)
    except Exception:
        logging.error('Unable to redis.zincrby', exc_info=True)
    ...
And in the models I then have this:
@receiver(post_save, sender=BlogComment)
@receiver(post_save, sender=BlogItem)
def invalidate_latest_comment_add_dates(sender, instance, **kwargs):
    cache_key = 'latest_comment_add_date'
    cache.delete(cache_key)
So this means:
- whole pages are cached for a long time for fast access
- updates immediately invalidate the cache for the best user experience
- no need to mess with ANY SQL caching
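To make the idea concrete without any Django, here's a minimal sketch of a cache key that depends on a "latest change" stamp. Everything in it is an assumption made for illustration: `DictCache` stands in for Redis, `cache_with_prefix` is a simplified stand-in for my hacked decorator (it has no time-outs), and `post_comment` plays the role of the post_save signal.

```python
import functools


class DictCache:
    """Tiny in-memory stand-in for a cache server such as Redis."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


cache = DictCache()
latest_change = {"stamp": "000001"}  # pretend this lives in the database
renders = []  # records every time the "expensive" view really runs


def cache_with_prefix(key_prefixer):
    """Cache a view's output under a key that includes key_prefixer()'s result."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(path):
            prefix = key_prefixer(path)
            if prefix is None:
                return view(path)  # the prefixer opted out of caching
            key = "page:%s:%s" % (prefix, path)
            html = cache.get(key)
            if html is None:
                html = view(path)
                cache.set(key, html)
            return html
        return wrapper
    return decorator


def home_key_prefixer(path):
    stamp = cache.get("latest_comment_add_date")
    if stamp is None:
        stamp = latest_change["stamp"]  # the only "database" look-up
        cache.set("latest_comment_add_date", stamp)
    return stamp


@cache_with_prefix(home_key_prefixer)
def home(path):
    renders.append(path)
    return "<html>rendered at stamp %s</html>" % latest_change["stamp"]


def post_comment(new_stamp):
    """Stand-in for the post_save signal handler."""
    latest_change["stamp"] = new_stamp
    cache.delete("latest_comment_add_date")
```

Calling `home("/")` twice renders once. After `post_comment("000002")` the next call renders again under a new key, and the old entry is simply never asked for again (in the real setup it expires with its time-out).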
So, the next question is, if posting a comment means that the cache is invalidated and needs to be repopulated, what's the ratio of cache hits to hits where the page has to be regenerated? Glad you asked. That's why I made this page:
It allows me to monitor how often a new blog comment or a general time-out means poor Django needs to re-create the HTML using SQL.
At the time of writing, one in every 25 hits to the homepage requires the server to re-generate the page. And still the content is always fresh and relevant.
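The arithmetic behind that number is just the two counters from the code above divided by each other. A small sketch, assuming the 'homepage:hits' counter counts every request and 'homepage:misses' counts every real render; the counts below are made up for the example, not read from my Redis.

```python
def regeneration_ratio(hits, misses):
    """Fraction of requests that had to re-render the page."""
    if hits == 0:
        return 0.0
    return misses / hits


# Hypothetical counter values, for illustration only.
example_hits = 2500
example_misses = 100
ratio = regeneration_ratio(example_hits, example_misses)
one_in = round(1 / ratio)  # "one in every N" requests re-renders
```

With those made-up counts the ratio is 0.04, which is the "one in every 25" way of saying it.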
The next level of optimization would be to figure out whether a particular page update (e.g. a blog comment posting on a page that isn't featured on the home page) should or should not invalidate the home page.
Secs sell! How frickin' fast this site is! (server side)
05 April 2012
0 comments
Linux, Web development, Django
This is part 2 about how I managed to make this site fast. Part 1 is here.
The web framework powering this site is Django. In front of it sits Nginx, which serves all the static content (once, before the Amazon CloudFront CDN takes over) and passes all non-static traffic on to a uWSGI daemon running 6 worker processes. The database that stores the content is PostgreSQL and all caching is done in Redis. Another Redis database is used for other things, such as maintaining a quick look-up index of keywords to primary keys so that I can quickly mesh together blog posts by keywords.
However, as we all know, the deciding factor of a web site's server-side speed is effectively the speed of the database or any other disk-bound I/O device. To remedy this I've set up some practical caching strategies which I'm quite happy with.
So, how fast is it? Here's an ab stress test against the home page with 10,000 requests spread across 10 concurrent users:
Document Path:          /
Document Length:        73272 bytes

Concurrency Level:      10
Time taken for tests:   4.426 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      734250000 bytes
HTML transferred:       732720000 bytes
Requests per second:    2259.59 [#/sec] (mean)
Time per request:       4.426 [ms] (mean)
Time per request:       0.443 [ms] (mean, across all concurrent requests)
Transfer rate:          162022.11 [Kbytes/sec] received
I could probably push that 2,300 requests/second to 3,000 or 4,000 just by increasing the number of workers. However, that costs memory, and since I'm currently running 19 other uWSGI workers on this server, which together (all 25) take up a steady 1.4 GB, I don't feel like increasing that number much more. Besides, since this site doesn't really get any traffic, I'm less concerned about massive throughput on concurrent benchmarks and more about serving each and every page as fast as possible the few times it's called.
Every single page on this site is behind some sort of internal cache. The only time PostgreSQL is involved in rendering a page is when it's first requested after a comment has been entered or I've added (or edited) a post. Thing is, I don't want to be inconvenienced by a stupid cache that forces me to wait an hour every time I change something. No, instead lots of Django database model signals are put in place that fire off cache invalidation when certain pieces of data are changed. You can see the code for that here.
So, for the home page for example: for each request, a small piece of Python code checks Redis for the latest comment add-date and, based on that, tells the Django cache_page decorator to either render the page as normal or to serve the whole HTML payload from Redis. In other words, a successful cache "hit" actually needs two Redis look-ups. Even that could be improved by sparing these look-ups and blindly serving from the worker's allocated Python memory instead, but that would make things fragile and hard to unit test, and it would only make the benchmarks faster, which is not necessary.
People say the most important thing to optimize on a web site is the static content. Well, there's little point in serving the static content fast if it takes 3 seconds to say what static content to serve. Also, a fast website is likely to look more favorable to the Google bot, which effectively makes the site appear higher in Google searches.
In the next part, I'll try to share more in-depth technical bits and pieces of what I actually did. They're no secrets, but I think some of them are best practice and even senior web developers sometimes get them wrong.
How much faster is Nginx+gunicorn than Apache+mod_wsgi?
22 March 2012
8 comments
Linux, Django
Short answer: about 5%
I had a few minutes and wanted to see if changing from Apache + mod_wsgi to Nginx + gunicorn would make the otherwise slow site any faster. It's not this site but another Django site for work (which, by the way, doesn't have to be fast). It's slow because it doesn't cache any of the SQL queries.
# with Apache + mod_wsgi
$ ab -n 1000 -c 10 http://thelocaldomain/
...
Requests per second: 39 [#/sec] (mean)
...
# Uses about 110 Mb
That's after running it multiple times and roughly averaging the requests per second.
# with Nginx + gunicorn --workers=4
$ ab -n 1000 -c 10 http://thelocaldomain/
...
Requests per second: 41 [#/sec] (mean)
...
# Uses about 70 Mb
So, if you want to make a site fast, forget about how the code is being served until all the slow db I/O is taken care of properly.
Cryptic errors when using django-nose
07 December 2011
0 comments
Django
After about 3 days of debugging using pdb, print and writing to a log file, I've finally (almost) solved the bizarre errors I was getting when running a whole test suite. The symptom was that Django refused to re-register models with the admin, and the errors looked something like this:
...
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/urls.py", line 6, in <module>
    admin.autodiscover()
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/vendor/src/django/django/contrib/admin/__init__.py", line 26, in autodiscover
    import_module('%s.admin' % app)
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/vendor/src/django/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/apps/users/admin.py", line 30, in <module>
    admin.site.register(UserProfile, UserProfileAdmin)
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/vendor/src/django/django/contrib/admin/sites.py", line 85, in register
    raise AlreadyRegistered('The model %s is already registered' % model.__name__)
AlreadyRegistered: The model UserProfile is already registered
It turned out to be independent of which Django project I ran, and no one else was able to reproduce it on any machine with the exact same code.
After 2 days I found that the difference between a successful run and a failing run was how I specified (to nose) which module to load:
./manage.py test users # fails!
./manage.py test users.test # works!
In both cases it finds the same tests. So it would either fail 10 times or work 10 times. Hmmm...
The bridging between nose and Django is done by the awesome django-nose, developed here at Mozilla by Django extraordinaire Jeff Balogh. It's a non-trivial piece of code as it depends on some really smart importing tricks and stuff which I haven't even begun to understand.
However, after so much trial and error I finally discovered that the solution (for me) was to delete the ~/.noserc file. What's strange is that all it contained was:
[nosetests]
with-doctest=1
I might never actually find out what went wrong. Ultimately I think the reason things went wrong was that it incorrectly populated sys.modules with excess keys, causing double imports of urls.py, which in turn ran admin.autodiscover() twice.
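To show what I mean by a double import, here's a small sketch. It is not a reproduction of the django-nose problem, just the general mechanism: the same file loaded under two different module names runs its top-level code twice. The module names `demo_urls` and `demo_project.demo_urls` and the `REGISTRY` list are made up for the example; the list stands in for the admin site's registry.

```python
import importlib.util
import os
import sys
import tempfile


def load_file_as(module_name, path, registry):
    """Execute the file at `path` as a module called `module_name`."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    module.REGISTRY = registry  # shared "registry" visible to the module code
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    finally:
        # keep the demo from leaving entries behind in sys.modules
        sys.modules.pop(module_name, None)
    return module


def run_demo():
    registry = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_urls.py")
        with open(path, "w") as handle:
            # top-level side effect, like admin.autodiscover() in urls.py
            handle.write("REGISTRY.append(__name__)\n")
        # One file, two names: Python treats them as two different modules.
        load_file_as("demo_urls", path, registry)
        load_file_as("demo_project.demo_urls", path, registry)
    return registry
```

`run_demo()` returns a list with two entries, one per module name, because the "register" line ran once for each name. If that line were admin.site.register(...), the second run is where AlreadyRegistered would be raised.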
Sorry for the rambling. And sorry for not actually finding the real bug. I did spend 2-3 days debugging this non-stop and hopefully some other poor frustrated person is going to see this and also look into their ~/.noserc for ways to fix it.
EmailInput HTML5 friendly for Django
02 August 2011
6 comments
Django
Suppose you have a Django app with a login where people can only log in with their email address. Then use this widget on your login form:
import django.contrib.auth.forms
from django import forms

## The input widget class
class EmailInput(forms.widgets.Input):
    input_type = 'email'

    def render(self, name, value, attrs=None):
        if attrs is None:
            attrs = {}
        attrs.update(dict(autocorrect='off',
                          autocapitalize='off',
                          spellcheck='false'))
        return super(EmailInput, self).render(name, value, attrs=attrs)

## Example usage
class AuthenticationForm(django.contrib.auth.forms.AuthenticationForm):
    """override the authentication form because we use the email address as the
    key to authentication."""
    # allows for using email to log in
    username = forms.CharField(label="Username", max_length=75,
                               widget=EmailInput())
    rememberme = forms.BooleanField(label="Remember me", required=False)
This input field does some cool stuff in the browser, such as automatic validation, as seen in this screenshot here.
More importantly, it fixes a very annoying problem when surfing on a smartphone or a tablet like the iPad. As I'm about to type "someusername@mozilla.com", the keyboard first wants to start with a capital letter, which might fail the login. Also, if the email address contains a word that it wants to correct (like "mozilla" -> "Mozilla"), you have to click the little correction tooltip to tell it that the input is correct verbatim.
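For those who'd rather see the markup than the widget, here's a framework-free sketch of roughly what ends up in the page. The function `render_email_input` is made up for illustration and is not Django's rendering code; the exact attribute order and escaping Django produces may differ.

```python
from html import escape


def render_email_input(name, value=None, attrs=None):
    """Build an <input type="email"> tag with the mobile-friendly attributes."""
    final = {"type": "email", "name": name}
    if value is not None:
        final["value"] = value
    final.update(attrs or {})
    # Same three attributes the widget above forces on every render.
    final.update(autocorrect="off", autocapitalize="off", spellcheck="false")
    pairs = " ".join(
        '%s="%s"' % (key, escape(str(val), quote=True))
        for key, val in final.items()
    )
    return "<input %s>" % pairs


example_tag = render_email_input("username")
```

The `type="email"` part is what triggers the browser's validation and email keyboard, and the other three attributes switch off the "helpful" corrections.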
Note to Djangonauts who want to use this and have a dual authentication backend that takes both usernames and email addresses: this form will make it impossible to log in as something called "admin", for example, because that isn't a valid email address.
A taste of the Django on inside Mozilla, Sheriffs Duty
22 July 2011
0 comments
Django
http://github.com/mozilla/sheriffs
One of the many great things about working for Mozilla is that everything we do is Open Source. Even our wiki is open (however, we have an internal wiki for boring corporate stuff such as meeting rooms, HR etc.).
Last week I wrote an internal application for Mozilla's build engineers. Essentially it's a roster that lists one user per day and it's helped by being visualized as a calendar and as a vCal export. It's very unlikely that anybody outside Mozilla will find this particularly useful. But who knows, perhaps other companies have needs to take turns to sheriff build machines.
Anyway, the project was easy to write because we have something called Playdoh. It's a set of nifty and useful settings and a folder structure and it comes with a submodule called "playdoh-lib" which is stuffed with lots of useful packages that you'll most likely want to use. If you browse Playdoh on Github it might look like a lot of stuff but after a second look you'll see that there's actually almost no code. So don't you dare to play the "bloat card"! :)
What this app uses is TastyPie for the REST API which was awesome by the way.
For the authentication I used django-auth-ldap and some custom classes because at Mozilla we use email addresses instead of usernames.
To make the vCal export I use VObject, which was easy to work with but has some unusual syntax in places.
Jinja was used for the template rendering and it meant I had to do some tricks to use the django.contrib.auth.views.login view but with my templates. Might be worth looking into if people are interested.
The code has 98% test coverage but I had to upgrade to the latest nose to be able to run test coverage on app modules that have similar names to modules in the standard lib.
Test static resources in Django tests
02 June 2011
2 comments
Django
At Mozilla we use jingo-minify to bundle static resources such as .js and .css files. It's not a perfect solution but it's got some great benefits. One of them is that you need to know exactly which static resources you need in a template and because things are bundled you don't need to care too much about what files it originally consisted of. For example "jquery-1.6.2.js" + "common.js" + "jquery.cookies.js" can become "bundles/core.js"
A drawback is that if you forget to compress and prepare all assets (using the compress_assets management command in jingo-minify), you break your site with missing static resources. So how do you test for this?
Here's a simple solution I cooked up which appears to do the trick. It's just a quick start but it works. First:
# tests/mixins.py
import re
from nose.tools import eq_

SCRIPTS_REGEX = re.compile(r'<script\s*[^>]*src=["\']([^"\']+)["\'].*?</script>',
                           re.M|re.DOTALL)
STYLES_REGEX = re.compile(r'<link.*?href=["\']([^"\']+)["\'].*?>',
                          re.M|re.DOTALL)

class EmbedsTestCaseMixin:

    def assert_all_embeds(self, response):
        if hasattr(response, 'content'):
            response = response.content
        # Strip HTML comments. The flags must be passed by keyword because
        # the fourth positional argument of re.sub() is the count.
        response = re.sub(r'<!--.*?-->', '', response, flags=re.DOTALL)
        for found in SCRIPTS_REGEX.findall(response):
            if found.endswith('.js'):
                resp = self.client.get(found)
                eq_(resp.status_code, 200, found)
        for found in STYLES_REGEX.findall(response):
            if found.endswith('.css'):
                resp = self.client.get(found)
                eq_(resp.status_code, 200, found)
Then with this you can render a page and check that all resources are reachable:
# apps/foo/tests.py
from tests.mixins import EmbedsTestCaseMixin
from django.test import TestCase
from django.core.urlresolvers import reverse

# the mixin has to be among the base classes for assert_all_embeds to exist
class FooViewsTestCase(TestCase, EmbedsTestCaseMixin):

    def test_add_page_static_files(self):
        url = reverse('foo.views.add_page')
        response = self.client.get(url)
        assert response.status_code == 200
        self.assert_all_embeds(response)
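If you want to see what the two regular expressions actually pick up without spinning up Django, here's a small standalone sketch. It's written for Python 3 and plain strings, and `find_embeds`, `COMMENTS_REGEX` and the sample HTML are made up for the example; only the two patterns are taken from the mixin.

```python
import re

SCRIPTS_REGEX = re.compile(
    r'<script\s*[^>]*src=["\']([^"\']+)["\'].*?</script>', re.M | re.DOTALL)
STYLES_REGEX = re.compile(
    r'<link.*?href=["\']([^"\']+)["\'].*?>', re.M | re.DOTALL)
COMMENTS_REGEX = re.compile(r'<!--.*?-->', re.DOTALL)


def find_embeds(html):
    """Return (script URLs, stylesheet URLs) referenced outside HTML comments."""
    html = COMMENTS_REGEX.sub('', html)
    scripts = [url for url in SCRIPTS_REGEX.findall(html) if url.endswith('.js')]
    styles = [url for url in STYLES_REGEX.findall(html) if url.endswith('.css')]
    return scripts, styles


# Hypothetical page: one bundle of each kind, a favicon and a commented-out script.
SAMPLE = """
<html><head>
<link rel="stylesheet" href="/static/bundles/core.css">
<link rel="shortcut icon" href="/favicon.ico">
<!-- <script src="/static/old.js"></script> -->
<script src="/static/bundles/core.js"></script>
</head></html>
"""

scripts, styles = find_embeds(SAMPLE)
```

The commented-out script and the favicon are both skipped, so only the two bundle URLs would be requested by the test.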
I know it ain't django-rocket science but it's damn useful as it quickly catches me out if I accidentally get a typo in any of the bundles which in previous projects I've become accustomed to check by simply clicking around in Firefox with Firebug (Net tab) open.
Hope it helps!
Optimization of getting random rows out of a PostgreSQL in Django
23 February 2011
48 comments
Django
There was a really interesting discussion on the django-users mailing list about how to select random elements out of a SQL database in the most efficient way. I knew using a regular RANDOM() in SQL can be very slow on big tables but I didn't know by how much. Had to run a quick test!
Cal Leeming discussed a snippet of his for paginating huge tables which uses the MAX(id) aggregate function.
So, I did a little experiment on a table with 84,000 rows in it. Realistic enough to matter even though it's less than millions. So, how long would it take to select 10 random items, 10 times? Benchmark code looks like this:
from time import time

TIMES = 10

def using_normal_random(model):
    for i in range(TIMES):
        yield model.objects.all().order_by('?')[0].pk

t0 = time()
for i in range(TIMES):
    list(using_normal_random(SomeLargishModel))
t1 = time()
print("%s seconds" % (t1 - t0))
Result:
41.8955321312 seconds
Nasty!! Also running this you'll notice postgres spiking your CPU like crazy.
A much better approach is to use Python's random.randint(1, <max ID>). Looks like this:
from django.db.models import Max
from random import randint
from time import time

def using_max(model):
    max_ = model.objects.aggregate(Max('id'))['id__max']
    i = 0
    while i < TIMES:
        try:
            yield model.objects.get(pk=randint(1, max_)).pk
            i += 1
        except model.DoesNotExist:
            # that ID has been deleted (a gap), so just try another one
            pass

t0 = time()
for i in range(TIMES):
    list(using_max(SomeLargishModel))
t1 = time()
print("%s seconds" % (t1 - t0))
Result:
0.63835811615 seconds
Much more pleasant!
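The trick isn't tied to the Django ORM. Here's a sketch of the same idea with Python's built-in sqlite3 module so it can be run anywhere. SQLite stands in for PostgreSQL, and the `item` table with only even IDs is made up to simulate gaps from deleted rows.

```python
import random
import sqlite3


def make_demo_table():
    """In-memory table where only even IDs exist, i.e. half the IDs are gaps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO item (id) VALUES (?)",
        [(i,) for i in range(2, 1001, 2)],
    )
    return conn


def random_ids(conn, how_many, rng=random):
    """Pick IDs by guessing between 1 and MAX(id), retrying on gaps."""
    (max_id,) = conn.execute("SELECT MAX(id) FROM item").fetchone()
    if max_id is None:  # empty table, nothing to pick from
        return []
    found = []
    while len(found) < how_many:
        candidate = rng.randint(1, max_id)
        row = conn.execute(
            "SELECT id FROM item WHERE id = ?", (candidate,)
        ).fetchone()
        if row is not None:
            found.append(row[0])
    return found
```

Two things to keep in mind, and they apply to my benchmark code too: the same row can come back more than once, and if the gaps are clustered the selection isn't perfectly uniform.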
UPDATE
Commentator Ken Swift asked: what if your requirement is to select 100 random items instead of just 10? Won't those 101 database queries be more costly than just 1 query with a RANDOM()? The answer turns out to be no.
I changed the script to select 100 random items 1 time (instead of 10 items 10 times) and the times were the same:
using_normal_random() took 41.4467599392 seconds
using_max() took 0.6027739048 seconds
And what about 1000 items 1 time:
using_normal_random() took 204.685141802 seconds
using_max() took 2.49527382851 seconds
Nice testimonial about django-static
21 February 2011
0 comments
Django
My friend Chris is a Django newbie who has managed to build a whole e-shop site in Django. It will launch in a couple of days and when it launches I will blog about it here too. He sent me this today, which made me smile:
"I spent today setting up django_static for the site, and optimising it for performance. If there's one thing I've learned from you, it's optimisation.
So, my homepage is now under 100KB (was 330KB), and it loads in @5-6 seconds from hard refresh (was 13-14 seconds at its worst). And I just got a 92 score on Yslow. I do believe I have the fastest tea website around now, and I still haven't installed caching.
Wicked huh?"
He's talking about using django-static. Then I get another email shortly after with this:
"correction - I get 97 on YSlow if I use a VPN.
I just found that the Great Firewall tags extra HTTP requests onto every request I make from my browser, pinging a server in Shanghai with a PHP script which probably checks the page for its content or if its on some kind of blocked list. Cheeky buggers!"
It's that interesting! (Note: Chris is based in China but hosts the test site in the UK)
Fastest "boolean SQL queries" possible with Django
14 January 2011
5 comments
Django
Those familiar with the Django ORM know how easy it is to work with and that you can do lots of nifty things with the result (QuerySet in Django lingo).
So I was working on a report that basically just needed to figure out if a particular product has been invoiced. Not for how much or when, just if it's included in an invoice or not.
The code was initially this:
def is_invoiced(self):
    for __ in Invoice.objects.filter(products=self):
        return True
    return False # prefer 'False' to 'None' in this case
Since the Invoice model has an automatic ordering on its date field, I thought that doing a loop would put that ordering into play each time, which would suck. A quick way around that is to aggregate instead. First rewrite:
def is_invoiced(self):
    return bool(Invoice.objects.filter(products=self).count())
If you run "EXPLAIN ANALYZE ..." on that SQL in PostgreSQL you'll notice that the aggregate causes one extra operation before it goes into doing any other filtering. Hmm... Perhaps I can optimize it even further by doing a simple select without the ordering. And to optimize it further still, I make it return a specific field only, since this table has many fields and doing SELECT * FROM would be a waste of time. All I need is anything; id will do:
def is_invoiced(self):
    qs = Invoice.objects.filter(products=self)
    qs.query.clear_ordering(True)
    return bool(qs.only('id'))
Right, here's the kicker! I put all of these different "patterns" into simple files A.sql, B.sql, etc. Then I ran them like this, against a database where the relevant table contains about 1,500 rows:
$ time for i in $(seq 1 1000); do psql mydatabase < X.sql > /dev/null ; done
On its own this doesn't test the speed of the database or the SQL statement, because most of the time is spent reading from stdin and opening the database connection. So, in each file I repeated the statement over 200 lines.
Also, to better simulate my real application, instead of filtering on an indexed primary key I filtered with a simple operator on an integer column, which matches about 1% of the whole table.
After having run each of them a bunch of times and noted their times, I can conclude the following (relative times):
* Doing an aggregate on COUNT(*): 100%
* Doing a full select without ordering: 200%
* Doing a full select with ordering: 200%
* Doing a select only on the id without ordering: 100%
* Doing an aggregate on COUNT('id'): 100%
What this means is that doing an aggregate takes as long on * as it does on a primary key. Doing a minimal select without any ordering is as fast as doing the aggregate.
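For reference, here are the two fast query shapes written out as plain SQL from Python. This is only a sketch: sqlite3 stands in for PostgreSQL, the `invoice_products` table and its columns are made up, and the LIMIT 1 is my addition here (the ORM version above doesn't add one).

```python
import sqlite3


def make_demo_db():
    """Hypothetical join table: which products appear on an invoice."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice_products ("
        "id INTEGER PRIMARY KEY, product_id INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO invoice_products (product_id) VALUES (?)",
        [(1,), (1,), (2,)],
    )
    return conn


def is_invoiced_count(conn, product_id):
    """The aggregate pattern: count the matches and turn that into a bool."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM invoice_products WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    return bool(count)


def is_invoiced_select_id(conn, product_id):
    """The minimal select pattern: ask for one id, no ordering."""
    row = conn.execute(
        "SELECT id FROM invoice_products WHERE product_id = ? LIMIT 1",
        (product_id,),
    ).fetchone()
    return row is not None
```

Both functions give the same answer for any product; the only question the benchmark above tries to settle is which shape the database answers fastest.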
PostgreSQL is a smart cookie! This isn't news to me but it sure proves itself again. It knows to do the right things first and it knows what's needed in the end and runs the query in a very intelligent way.
Going back to the Django aspect of this. Surely, if you run this many, many times there's an overhead when the ORM first converts what's received back from the SQL from columns into fields, only for the __bool__ operator of the QuerySet to completely discard that conversion. Just having to get a simple integer back from the database would surely be faster. I would love to have time to dig further into that too one day.