We're developing an application, one function of which is managing payments to people. Each payment will be written as a row in a table with the following fields:
PersonId (INT)
TransactionDate (DATETIME)
Amount (MONEY)
PaymentTypeId (INT)
...
...
...
We send payments to around 8,000 people, and a new transaction per person is added daily (around 8,000 inserts per day). This means that after 7 years (the time we need to store the data for), we will have over 20,000,000 rows.
We get around 10% more people per year, so this number will rise over time.
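As a rough check on those figures, here is a small Python sketch of the projection. It only uses the estimates above; the assumptions that every person gets a payment on every calendar day and that headcount steps up once per year are mine, as are the function and constant names.

```python
# Rough projection of table size; all inputs are the estimates from the post.
PEOPLE_START = 8_000      # people paid today
GROWTH_PER_YEAR = 0.10    # roughly 10% more people each year
RETENTION_YEARS = 7       # how long the data must be kept
DAYS_PER_YEAR = 365       # assumption: one payment per person every calendar day


def projected_rows(people, growth, years, days_per_year=DAYS_PER_YEAR):
    """Total rows after `years`, with headcount stepping up once per year."""
    total = 0
    for _ in range(years):
        total += round(people) * days_per_year
        people *= 1 + growth
    return total


flat_rows = projected_rows(PEOPLE_START, 0.0, RETENTION_YEARS)
growing_rows = projected_rows(PEOPLE_START, GROWTH_PER_YEAR, RETENTION_YEARS)
print(f"no growth:  {flat_rows:,} rows")
print(f"10% growth: {growing_rows:,} rows")
```

With no growth this gives about 20.4 million rows, and with 10% yearly growth roughly 27.7 million, so the table stays in the tens of millions either way.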
The most common query would be to get SUM(Amount) per person, for a given payment type, where TransactionDate is between a start date and an end date:
SELECT PersonId, SUM(Amount) AS TotalAmount
FROM [Table]
WHERE PaymentTypeId = @PaymentTypeId
AND TransactionDate BETWEEN @StartDate AND @EndDate
GROUP BY PersonId
My question is, is this going to be a performance problem for SQL Server 2012? Or is 20,000,000 rows not too bad?
I'd have assumed a clustered index on PersonId (to group them), but would this cause very slow inserts/updates?
Or an index on TransactionDate?