I'm working with electronic equipment that digitizes waveforms in real time. Each device generates around 1000 arrays of 512 bytes per second, and we have 12 devices, so that's roughly 12 × 1000 × 512 bytes ≈ 6 MB/s of raw data in aggregate. I've written a client for these devices in C# that for the most part works fine and has no performance issues.
However, one of the requirements for the application is archival, and Microsoft SQL Server was mandated as the storage mechanism (outside of my control). The database layout is very simple: there is one table per device per day ("Archive_Dev02_20131015" etc.). Each table has an Id column, a timestamp column, a Data column (varbinary) and 20 more integer columns with some metadata. There's a clustered primary key on Id and timestamp, and another separate index on timestamp. My naive approach was to queue all data in the client application and then insert everything into the database in 5-second intervals using SqlCommand.
The basic mechanism looks like this:
using (SqlTransaction transaction = connection.BeginTransaction())
{
    //Beginning of the insert sql statement...
    string sql = "USE [DatabaseName]\r\n" +
                 "INSERT INTO [dbo].[Archive_Dev02_20131015]\r\n" +
                 "(\r\n" +
                 "    [Timestamp], \r\n" +
                 "    [Data], \r\n" +
                 "    [IntField1], \r\n" +
                 "    [...] \r\n" +
                 ") \r\n" +
                 "VALUES \r\n" +
                 "(\r\n" +
                 "    @timestamp, \r\n" +
                 "    @data, \r\n" +
                 "    @int1, \r\n" +
                 "    @... \r\n" +
                 ")";

    using (SqlCommand cmd = new SqlCommand(sql))
    {
        cmd.Connection = connection;
        cmd.Transaction = transaction;

        cmd.Parameters.Add("@timestamp", System.Data.SqlDbType.DateTime);
        cmd.Parameters.Add("@data", System.Data.SqlDbType.Binary);
        cmd.Parameters.Add("@int1", System.Data.SqlDbType.Int);

        foreach (var sample in samples)
        {
            cmd.Parameters[0].Value = sample.ReceiveDate;
            cmd.Parameters[1].Value = sample.Data; //Data is a byte array
            cmd.Parameters[1].Size = sample.Data.Length;
            cmd.Parameters[2].Value = sample.IntValue1;
            ...

            int affected = cmd.ExecuteNonQuery();
            if (affected != 1)
            {
                throw new Exception("Could not insert sample into the database!");
            }
        }
    }

    transaction.Commit();
}
To summarize: one transaction wrapping a loop that binds the parameters and executes the insert statement once per sample.
This method turned out to be very, very slow. On my machine (i5-2400 @ 3.1 GHz, 8 GB RAM, using .NET 4.0 and SQL Server 2008, 2 internal HDs in a mirror, everything running locally), it takes about 2.5 seconds to save the data from 2 devices, so saving 12 devices every 5 seconds is impossible.
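As an aside, one small variation I've considered (I'm not sure it makes a difference) is preparing the command once before the loop so the statement is only compiled once. A condensed sketch with just two of the columns; the Sample class name is my placeholder for whatever type the samples actually have:

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static void InsertBatch(SqlConnection connection, SqlTransaction transaction,
                        IEnumerable<Sample> samples)
{
    string sql = "INSERT INTO [dbo].[Archive_Dev02_20131015] ([Timestamp], [Data]) " +
                 "VALUES (@timestamp, @data)";

    using (var cmd = new SqlCommand(sql, connection, transaction))
    {
        cmd.Parameters.Add("@timestamp", SqlDbType.DateTime);
        // Prepare() requires an explicit size for variable-length types;
        // 512 matches the array size the devices produce.
        cmd.Parameters.Add("@data", SqlDbType.VarBinary, 512);
        cmd.Prepare(); // compile the statement once, reuse the plan per execution

        foreach (var sample in samples)
        {
            cmd.Parameters["@timestamp"].Value = sample.ReceiveDate;
            cmd.Parameters["@data"].Value = sample.Data;
            cmd.ExecuteNonQuery();
        }
    }
}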
To compare, I've written a small SQL script that does the same thing directly on the server (still running on my own machine); actually, I extracted the statements the C# client runs using SQL Server Profiler:
set statistics io on
go
begin transaction
go
declare @i int = 0;
while @i < 24500 begin
    SET @i = @i + 1
    exec sp_executesql N'USE [DatabaseName]
    INSERT INTO [dbo].[Archive_Dev02_20131015]
    (
        [Timestamp],
        [Data],
        [int1],
        ...
        [int20]
    )
    VALUES
    (
        @timestamp,
        @data,
        @int1,
        ...
        @int20
    )',N'@timestamp datetime,@data binary(118),@int1 int,...,@int20 int',
    @timestamp='2013-10-14 14:31:12.023',
    @data=0xECBD07601C499625262F6DCA7B7F4AF54AD7E074A10880601324D8904010ECC188CDE692EC1D69472329AB2A81CA6556655D661640CCED9DBCF7DE7BEFBDF7DE7BEFBDF7BA3B9D4E27F7DFFF3F5C6664016CF6CE4ADAC99E2180AAC81F3F7E7C1F3F22FEEF5FE347FFFDBFF5BF1FC6F3FF040000FFFF,
    @int1=0,
    ...
    @int20=0
end
commit transaction
This does (IMO, but I'm probably wrong ;)) the same thing, only this time using 24,500 iterations to simulate the 12 devices at once. The query takes about 2 seconds. If I use the same number of iterations as the C# version, the query runs in less than a second.
So my first question is: why does this run so much faster on SQL Server than from C#? Does it have anything to do with the connection (local TCP)?
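If the transport is a plausible suspect, I assume I could force a specific protocol through the Data Source prefix in the connection string and compare timings; the server and database names below are placeholders for my local setup:

using System.Data.SqlClient;

// Force shared memory (lpc:) vs TCP (tcp:) to see whether the transport
// explains the gap; "localhost" and "DatabaseName" are placeholders.
string viaSharedMemory = "Data Source=lpc:localhost;Initial Catalog=DatabaseName;Integrated Security=True";
string viaTcp = "Data Source=tcp:localhost;Initial Catalog=DatabaseName;Integrated Security=True";

using (var connection = new SqlConnection(viaTcp))
{
    connection.Open();
    // ...run the same insert loop against each connection and compare timings
}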
To make matters more confusing (to me), this code runs twice as slow on the production server (an IBM BladeCenter, 32 GB RAM, fiber connection to a SAN; filesystem operations are really fast). I've tried watching the SQL Server Activity Monitor and write performance never goes above 2 MB/s, but that might well be normal. I'm a complete newbie to SQL Server (about the polar opposite of a competent DBA, in fact).
Any ideas on how I can make the C# code more performant?
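For what it's worth, one direction I've been looking at but haven't benchmarked yet is SqlBulkCopy, which would hand a whole batch to the server at once instead of executing one INSERT per sample. A minimal sketch, assuming the same hypothetical Sample type as above and that the Id column is generated server-side:

using System;
using System.Data;
using System.Data.SqlClient;

// Buffer a batch of samples into a DataTable, then hand the whole batch
// to SqlBulkCopy instead of executing one INSERT per sample.
var table = new DataTable();
table.Columns.Add("Timestamp", typeof(DateTime));
table.Columns.Add("Data", typeof(byte[]));
table.Columns.Add("IntField1", typeof(int));
// ...remaining integer columns

foreach (var sample in samples)
{
    table.Rows.Add(sample.ReceiveDate, sample.Data, sample.IntValue1);
}

using (var bulk = new SqlBulkCopy(connection))
{
    bulk.DestinationTableName = "[dbo].[Archive_Dev02_20131015]";
    // Map source columns to destination columns by name; Id is omitted
    // on the assumption the server fills it in.
    bulk.ColumnMappings.Add("Timestamp", "Timestamp");
    bulk.ColumnMappings.Add("Data", "Data");
    bulk.ColumnMappings.Add("IntField1", "IntField1");
    bulk.WriteToServer(table);
}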