I think we'd need more information on what kinds of sites you are deploying: the right approach differs based on the relationship between the sites, both programmatically and 'legally' (as in a business relationship):
- Having a system account per 'site' can be handy if the sites are 'owned' by different people - if you are a web designer or programmer with a few clients, you might benefit from the separation.
- If your sites are related, e.g. a forum site, a blog site etc., you might benefit from a single deployment system (like ours).
- For libraries, if they're hosted on reputable sources (PyPI, GitHub etc.), it's probably OK to leave them there and deploy from them - if they're on dodgy hosts that are intermittently up or down, we take a copy and put them in a /thirdparty folder in our git repo.
FABRIC
Fabric is amazing - if it's set up and configured right for you:
- We have a policy here which means nobody ever needs to log onto a server (which is mostly true - there are occasions where we want to look at the raw nginx log file, but it's a rarity).
- We've got Fabric configured so that there are individual functional blocks (restart_nginx, restart_uwsgi etc.), but also
- higher-level 'business' functions which run all the little blocks in the right order - to update all our servers we merely type 'fab -i secretkey live deploy'. The 'live' task sets the settings for the live servers, and 'deploy' does the deployment (the -i is optional if you have your .ssh keys set up right).
- We even have a control flag so that if the live setting is used, it will ask 'are you sure' before performing the deploy (a rough sketch of this is below).
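Stripped right down, and with placeholder host names and service commands rather than our actual code, a Fabric 1.x fabfile following that pattern looks something like this:

    # fabfile.py - a minimal sketch of the pattern, not our real code
    from fabric.api import abort, env, sudo, task
    from fabric.contrib.console import confirm

    @task
    def live():
        # 'fab live deploy' selects the live servers before deploying
        env.hosts = ['web1.example.com', 'web2.example.com']   # placeholder hosts
        env.require_confirmation = True

    # individual functional blocks
    @task
    def restart_nginx():
        sudo('service nginx restart')

    @task
    def restart_uwsgi():
        sudo('service uwsgi restart')

    # higher-level 'business' function that runs the blocks in the right order
    @task
    def deploy():
        if env.get('require_confirmation') and not confirm(
                'Deploying to LIVE servers - are you sure?', default=False):
            abort('Deploy cancelled.')
        restart_uwsgi()
        restart_nginx()

Because 'live' runs first on the command line, everything that follows targets the live hosts; without it, you can point env.hosts at staging boxes instead.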
Our code layout
So our code base layout looks a bit like this:
/ <-- folder containing readme file etc
/bin/ <-- folder containing nginx & uwsgi binaries (!)
/config/ <-- folder containing nginx config and pip list but also things like pep8 and pylint configs
/fabric/ <-- folder containing fabric deployment
/logs/ <-- holding folder that nginx logs get written into (but not committed)
/src/ <-- actual source is in here!
/thirdparty/ <-- third party libs that we didn't trust the hosting of for pip
Possibly controversial, because we keep our binaries in our repo, but it means that if I upgrade nginx on the boxes and want to roll back, I just do it by manipulating git. I know which build works against which.
How our deploy works:
All our source code is hosted in a private Bitbucket repo (we have a lot of repos and only a few users, which is why Bitbucket works out better for us than GitHub). We have a user account for the 'servers' with its own SSH key for Bitbucket.
The deploy task in Fabric performs the following on each server (sketched in code after the list):
- irc bot announce beginning into the irc channel
- git pull
- pip install (from a requirements list kept in our repo)
- syncdb
- south migrate
- uwsgi restart
- celery restart
- irc bot announce completion into the irc channel
- start availability testing
- announce results of availability testing (and post report into private pastebin)
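A fuller version of the deploy task ties those steps together roughly like this (the paths, service names and the IRC helper are assumptions for illustration, not our actual code; availability testing kicks off afterwards):

    from fabric.api import cd, run, sudo, task

    def announce_irc(message):
        # placeholder for the IRC bot hook - in reality this posts to our channel
        print('[irc] %s' % message)

    @task
    def deploy():
        announce_irc('deploy starting')
        with cd('/srv/project'):                              # assumed checkout location
            run('git pull')
            run('pip install -r config/requirements.txt')     # the pip list kept in the repo
            with cd('src'):
                run('python manage.py syncdb --noinput')
                run('python manage.py migrate')               # South migrations
        sudo('service uwsgi restart')
        sudo('service celeryd restart')
        announce_irc('deploy complete')
        # availability testing and the pastebin report follow from here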
The 'availability test' (think unit tests, but run against the live server) hits all the web pages and APIs using the 'test' account to make sure it gets back sane data without affecting live stats.
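As a toy version of the idea (the endpoints and test-account credentials here are invented, not our real suite):

    # availability_check.py - a sketch, not our real suite
    import requests

    BASE = 'https://www.example.com'                  # placeholder live host
    TEST_AUTH = ('test-account', 'not-a-real-password')

    ENDPOINTS = ['/', '/blog/', '/api/v1/posts/']     # pages and APIs to probe

    def check(path):
        resp = requests.get(BASE + path, auth=TEST_AUTH, timeout=10)
        resp.raise_for_status()                       # any non-2xx response counts as a failure
        return path, resp.status_code

    if __name__ == '__main__':
        for path, status in [check(p) for p in ENDPOINTS]:
            print('%s -> %s' % (path, status))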
We also have a backup git service, so if Bitbucket is down the deploy falls over to that gracefully, and we even have Jenkins integration so that a commit to the 'deploy' branch triggers the deployment automatically.
The scary bit
Because we use cloud computing and expect high throughput, our boxes auto-spawn. There's a default image which contains a copy of the git repo etc., but it will invariably be out of date, so there's a startup script which does a deployment to itself, meaning new boxes added to the cluster are automatically up to date.
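The startup script itself doesn't need to be clever - something along these lines, assuming the image already has the repo and fabfile baked in (the path is invented):

    # run once at first boot: the baked image has a (possibly stale) checkout,
    # so the new box just deploys to itself before joining the cluster
    import subprocess

    subprocess.check_call(
        ['fab', '-f', '/srv/project/fabric/fabfile.py', '-H', 'localhost', 'deploy'])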