I have a SQL Federation with 97 members (physical shards). Each member holds 1-16 virtual shards (atomic units). This data tier powers a search lookup web service, hosted on an Azure web role, which needs every atomic unit it queries to respond before it has enough information to answer the user's request.
Given the search parameters, the web server can determine the IDs of the atomic units it needs to query, but not the federation members that hold them (I rely on USE FEDERATION for this translation). The goal is to have the web server query all of those atomic units, wherever they are, as quickly as possible.
Currently, the best solution I have for this works as follows:
- Generate the list of needed atomic units.
- Generate a USE FEDERATION statement and a SQL statement for each atomic unit. So far, the best performance I have found comes from specifying FILTERING = OFF in the USE FEDERATION statement and manually adding the predicate for the atomic unit to the SQL statement (rather than relying on FILTERING = ON to add these predicates for me).
- For each atomic unit, open a SqlConnection to the Federation Root, then execute the USE FEDERATION statement followed by the SQL query, both asynchronously. I use the TPL Dataflow library and async/await to wait for all of the atomic unit queries to finish, after which I apply business logic to the results (optional, depending on the web request) and send back the response.
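As a rough illustration of the second and third steps, the two statements sent for each atomic unit might be built along these lines. This is only a sketch: the federation name SearchFederation, the distribution name AtomicUnitId, and the assumption that atomic unit IDs are numeric are all hypothetical, and it assumes the constraint SQL already ends in a WHERE clause. The statements are returned separately because they are executed as two separate commands on the same connection.

```csharp
using System;
using System.Globalization;

public static class AtomicUnitSql
{
    // Hypothetical names: substitute the real federation and distribution key.
    private const string FederationName = "SearchFederation";
    private const string DistributionName = "AtomicUnitId";

    // First command: route the connection to the member holding this atomic unit.
    // FILTERING = OFF leaves the connection scoped to the whole member.
    public static string BuildUseFederation(long atomicUnitId)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "USE FEDERATION {0} ({1} = {2}) WITH RESET, FILTERING = OFF",
            FederationName, DistributionName, atomicUnitId);
    }

    // Second command: the search query with the atomic unit predicate added by hand.
    // Assumes constraintSql already ends with a WHERE clause; the caller binds @atomicUnitId.
    public static string BuildQuery(string constraintSql)
    {
        return constraintSql + " AND " + DistributionName + " = @atomicUnitId";
    }
}
```

Typing the ID as a long keeps arbitrary text out of the USE FEDERATION statement, while the query itself takes the ID as an ordinary parameter.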
Each atomic unit in these queries will return between 100 and 600 records, never more than 2000. The design objective is to query at most 200 atomic units at once, so 400,000 records is the maximum that the web server will ever need to apply business logic to. That logic is never more than "get one object of each distinct numerical ID in the result," for which I have implemented IEqualityComparer.
This approach does not seem to scale that well. Even when querying 30-40 atomic units, there is a marked increase in response time, even though I can test and see that individual atomic unit responses each take less than 1 second.
I think the likely sources of my issues are:
- Using a separate SqlConnection for each atomic unit query => am I leveraging connection pooling effectively?
- Business logic and object serialization => should I be approaching the distinct operation across atomic units differently? What if I want to do sum operations on other fields as well (not the default use case, but a common one)?
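To make the second point concrete, one alternative I could imagine is merging the per-unit results in a single pass over a dictionary keyed by ID, which would cover both the distinct case and the sum case. This is an unmeasured sketch: the Item class here is a minimal stand-in for my real type, and FanOutMerge and MergeById are hypothetical names.

```csharp
using System.Collections.Generic;

// Minimal stand-in for the real Item type; only the fields used here are modeled.
public sealed class Item
{
    public Item(int id, string name, int categoryCount, int itemCount)
    {
        Id = id; Name = name; CategoryCount = categoryCount; ItemCount = itemCount;
    }
    public int Id { get; private set; }
    public string Name { get; private set; }
    public int CategoryCount { get; private set; }
    public int ItemCount { get; private set; }
}

public static class FanOutMerge
{
    // Keeps one Item per Id. When sumCounts is true, counts for the same Id
    // are added together across atomic units; otherwise the first one seen wins.
    public static List<Item> MergeById(IEnumerable<IEnumerable<Item>> perUnitResults, bool sumCounts)
    {
        var merged = new Dictionary<int, Item>();
        foreach (IEnumerable<Item> unit in perUnitResults)
        {
            foreach (Item item in unit)
            {
                Item seen;
                if (!merged.TryGetValue(item.Id, out seen))
                {
                    merged[item.Id] = item;
                }
                else if (sumCounts)
                {
                    merged[item.Id] = new Item(seen.Id, seen.Name,
                        seen.CategoryCount + item.CategoryCount,
                        seen.ItemCount + item.ItemCount);
                }
            }
        }
        return new List<Item>(merged.Values);
    }
}
```

Compared with SelectMany followed by Distinct or GroupBy, this avoids building the intermediate flattened list, though whether that matters at 400,000 records is something I have not verified.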
If anyone has a favorite way of handling this, I would love to hear about it. Thanks.
Current solution:
    #region Identify atomic units of search query from search parameters (static methods, no i/o, very fast)
    List<string> AtomicUnits = AtomicUnitsOfSearch(data);
    #endregion

    #region Atomic unit-targeted subqueries setup
    var atomicUnitQueries = AtomicUnits.Select(au =>
    {
        return GetItemsFromAtomicUnit(
            CreateConstraintSQLQuery(data),
            au,
            verbosity);
    });
    #endregion

    #region Execute async query across relevant shards; Distinct result item summary query business logic
    if (Verbosity.basic.Equals(verbosity)) // return one item per conceptid
    {
        return (await TaskEx.WhenAll(atomicUnitQueries)).SelectMany(a => a)
            .Distinct(new SimpleItemComparer()).ToList(); // Distinct() the fan-out results based on ID
    }
    else // assume _count verbosity => pick one item per Id, sum Categories and Item counts across atomic units (do not double-count for synonyms)
    {
        // create the ending schema before executing the results lookup
        List<Item> atomicResults = (await TaskEx.WhenAll(atomicUnitQueries)).SelectMany(a => a).ToList();
        List<Item> distinctResults = new List<Item>();
        foreach (IGrouping<int, Item> g in atomicResults.GroupBy(a => a.Id)) // gets the lists of equivalent items across all atomic units queried
        {
            Item f = g.First();
            distinctResults.Add(new Item(f.Id, f.Name, g.Sum(a => a.CategoryCount), g.Sum(a => a.ItemCount)));
        }
        return distinctResults;
    }
    #endregion