I'm hoping to get some help choosing a database and layout well suited to a web application I have to write (outlined below). I'm a bit stumped given the large number of records and the fact that they need to be queryable in any manner.
The web app will basically allow querying of a large number of records using any combination of the fields that make up a record; date is the only mandatory criterion. A record consists of only eight items (below), but there will be about three million new records a day, with very few duplicate records. Data for the current day will be inserted into the database constantly, in real time.
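To make the "any combination of criteria, date mandatory" requirement concrete, here is a rough sketch of how such a query could be assembled on the app side. The table name `records`, the column names (`rec_date`, `f2` to `f8`) and the `%s` placeholder style (as used by psycopg) are all assumptions for illustration; the post only numbers the fields.

```python
from datetime import date

# Hypothetical column names: the post only numbers the fields 1-8.
OPTIONAL_COLUMNS = ("f2", "f3", "f4", "f5", "f6", "f7", "f8")


def build_query(day_from, day_to, **criteria):
    """Return (sql, params): a mandatory date range plus optional equality filters."""
    clauses = ["rec_date >= %s", "rec_date < %s"]
    params = [day_from, day_to]
    # Walk the columns in a fixed order so the generated SQL is deterministic.
    for column in OPTIONAL_COLUMNS:
        if column in criteria:
            clauses.append(f"{column} = %s")
            params.append(criteria.pop(column))
    # Anything left over is not a known column; refuse it rather than
    # interpolating an arbitrary name into the SQL text.
    if criteria:
        raise ValueError(f"unknown columns: {sorted(criteria)}")
    sql = "SELECT * FROM records WHERE " + " AND ".join(clauses)
    return sql, params


if __name__ == "__main__":
    sql, params = build_query(date(2024, 1, 1), date(2024, 2, 1), f3=7, f7="abc")
    print(sql)
    print(params)
```

Values are always passed as parameters, and column names come only from a fixed whitelist, so user input never ends up in the SQL string itself.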
I know the biggest interest will be in the last 6 months to 1 year's worth of data, but the rest will still need to be available for the same type of queries.
I'm not sure which database is best suited for this, nor how to structure it. The database will be on a reasonably powerful server. I basically want to start with a good db design and see how the queries perform. I can then judge whether I'd rather do optimizations or throw more powerful hardware at it. I just don't want to have to redo the base db design, and it's fine if we initially do a lot of optimizations: we have time but not $$$.
We need to use something open source, not something like Oracle. Right now I'm leaning towards Postgres.
A record consists of:
1 Date
2 unsigned integer
3 unsigned integer
4 unsigned integer
5 unsigned integer
6 unsigned integer
7 Text 16 chars
8 Text 255 chars
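As a back-of-the-envelope illustration of the volume this record layout implies, the sketch below multiplies out the three million rows a day. The per-row byte count is an assumption, not a measurement: it takes a 4-byte date, 8 bytes per integer (Postgres has no native unsigned integer types, so full-range unsigned 32-bit values would need `bigint`), single-byte characters, both text fields at their maximum length, and no per-row or index overhead.

```python
ROWS_PER_DAY = 3_000_000  # figure given in the post

# Assumed raw payload per row (see the caveats above):
# 4-byte date + five 8-byte integers + 16-char text + 255-char text.
ASSUMED_ROW_BYTES = 4 + 5 * 8 + 16 + 255


def rows_for(days):
    """Number of rows accumulated over the given number of days."""
    return ROWS_PER_DAY * days


def raw_bytes_for(days):
    """Raw payload size in bytes, ignoring storage overhead and indexes."""
    return rows_for(days) * ASSUMED_ROW_BYTES


if __name__ == "__main__":
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        gigabytes = raw_bytes_for(days) / 1e9
        print(f"{label}: {rows_for(days):,} rows, about {gigabytes:.1f} GB raw")
```

Moving field 8 into a lookup table would shrink the per-row figure considerably, which is one reason the estimate is best read as an upper bound on payload rather than a disk-size prediction.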
I'm planning on creating yearly schemas, monthly tables, and indexing the record tables on date for sure.
I'll probably be able to add another index or two after I analyze usage patterns to see what the most popular queries are. I can do lots of tricks on the app side as far as caching popular queries and whatnot; it's really the db side I need assistance with. Field 8 will have some duplicate values, so I'm planning on having that column be an id into a lookup table to join on. Beyond that, I guess the remaining fields will all be in one monthly table.
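The lookup-table idea for field 8 could look roughly like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database purely as a runnable stand-in, not as a recommendation over Postgres; the table and column names (`field8_lookup`, `records`, `f2` to `f7`, `f8_id`) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE field8_lookup (
    id    INTEGER PRIMARY KEY,
    value TEXT NOT NULL UNIQUE          -- each distinct field 8 string stored once
);
CREATE TABLE records (
    rec_date TEXT NOT NULL,             -- ISO date string in this SQLite stand-in
    f2 INTEGER, f3 INTEGER, f4 INTEGER, f5 INTEGER, f6 INTEGER,
    f7 TEXT,
    f8_id INTEGER NOT NULL REFERENCES field8_lookup(id)
);
CREATE INDEX records_date_idx ON records (rec_date);
""")


def lookup_id(conn, value):
    """Return the id for a field 8 value, inserting it if it is new."""
    conn.execute("INSERT OR IGNORE INTO field8_lookup (value) VALUES (?)", (value,))
    row = conn.execute(
        "SELECT id FROM field8_lookup WHERE value = ?", (value,)
    ).fetchone()
    return row[0]


def insert_record(conn, rec_date, ints, f7, f8):
    """Insert one record; `ints` holds fields 2-6 as a 5-tuple."""
    conn.execute(
        "INSERT INTO records VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (rec_date, *ints, f7, lookup_id(conn, f8)),
    )


if __name__ == "__main__":
    insert_record(conn, "2024-01-01", (1, 2, 3, 4, 5), "abc", "a long repeated string")
    insert_record(conn, "2024-01-02", (6, 7, 8, 9, 10), "def", "a long repeated string")
    query = (
        "SELECT r.rec_date, l.value FROM records r "
        "JOIN field8_lookup l ON l.id = r.f8_id ORDER BY r.rec_date"
    )
    for row in conn.execute(query):
        print(row)
```

In Postgres the `INSERT OR IGNORE` step would be written differently (for example with `INSERT ... ON CONFLICT DO NOTHING`), but the shape of the two tables and the join would be the same.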
I suppose I could break it into weekly tables as well, and use a view for queries so the app doesn't have to deal with assembling a complex query.
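A minimal sketch of the "one table per period, one view over all of them" idea follows, again using in-memory `sqlite3` only so the example runs anywhere. The naming scheme (`records_YYYY_MM`, `all_records`) and the trimmed-down two-column tables are assumptions for illustration. Postgres also has built-in table partitioning, which may make a hand-maintained view like this unnecessary.

```python
import sqlite3


def month_table(year, month):
    """Hypothetical naming scheme for the per-month tables."""
    return f"records_{year:04d}_{month:02d}"


def create_month_tables(conn, months):
    """Create one date-indexed table per (year, month) pair."""
    for year, month in months:
        name = month_table(year, month)
        conn.execute(f"CREATE TABLE {name} (rec_date TEXT NOT NULL, f2 INTEGER)")
        conn.execute(f"CREATE INDEX {name}_date_idx ON {name} (rec_date)")


def create_union_view(conn, months):
    """(Re)create a view that stitches all the monthly tables together."""
    selects = " UNION ALL ".join(
        f"SELECT * FROM {month_table(year, month)}" for year, month in months
    )
    conn.execute("DROP VIEW IF EXISTS all_records")
    conn.execute(f"CREATE VIEW all_records AS {selects}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    months = [(2024, 1), (2024, 2)]
    create_month_tables(conn, months)
    create_union_view(conn, months)
    conn.execute("INSERT INTO records_2024_01 VALUES ('2024-01-15', 1)")
    conn.execute("INSERT INTO records_2024_02 VALUES ('2024-02-03', 2)")
    # The app queries the view and never needs to know which table holds a row.
    query = "SELECT * FROM all_records WHERE rec_date >= ? ORDER BY rec_date"
    for row in conn.execute(query, ("2024-01-01",)):
        print(row)
```

The view has to be recreated whenever a new period table is added, which is the main maintenance cost of this approach.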
Anyway, thanks very much for any feedback or assistance!