Updated with XML example, below
Your current design violates 1st normal form.
That, in itself, is okay. Over some years, I've inherited and had to maintain several systems that did so. I don't know why they were built that way. It doesn't really matter. They had to be maintained and the schedule wasn't always such that there was time for refactoring, testing and validation, not to mention doing so for the stack of apps that were built upon them.
Looking back now, though, I can easily spot the one attribute that they all shared. It was the absolute biggest barrier to optimizing and extending these systems: the underlying "relational" database violated 1st normal form. It was the root cause of virtually every technical "gotcha" and virtually every performance problem I encountered. Splitting strings. Creating a faux datatype system to validate them. Creating further delimited attributes to describe them. Creating special rules for each delimited "location" and having to implement an EVAL function in many systems to enforce them. Using dynamic SQL or worse to search it all. It took more "clever" programming to implement what seemed like conceptually simple features than I care to recollect.
Maybe your system is different. Maybe 40+ years of relational database research does not apply to your situation. For your sake, I truly hope so. The only problem is that you're using a relational database in a non-relational way. Just like you can pound screws with a hammer, and you can pull a boat with a motorcycle (don't hit the brakes if you actually get it going), you can create an index (full-text or b-tree) on text that represents integers.
But why would you do any of these things? Why wouldn't you actually store the integers as integers and enjoy type-safety? Why wouldn't you normalize this into two related tables to take advantage of smaller transactions and more indexing options? If you've inherited a system that you can't change, then please say so and people might be able to help with alternatives (TVPs and XML have been rightfully mentioned). But I can't see coming into the situation saying that your hammer and motorcycle are broken because they don't drive screws and pull boats very well.
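For comparison, here is a minimal sketch of what the normalized, two-table version might look like. The table and column names (Item, ItemInt, ItemID, Value) are hypothetical, since I don't know your actual schema, and the demo rows simply mirror the delimited strings used in the LIKE demo in this answer. Treat it as an illustration under those assumptions rather than a drop-in design.

```sql
-- Hypothetical parent table (stands in for whatever row currently owns the delimited string)
create table Item (
    ID int not null primary key
)

-- One row per integer, stored as an actual int
create table ItemInt (
    ItemID int not null references Item (ID)
    , Value int not null
    , primary key (ItemID, Value)
)

-- Supports searching by value with an ordinary b-tree index
create index IX_ItemInt_Value on ItemInt (Value)
go

-- Demo data mirroring the delimited strings '0,1,2', '1,2,3,4' and '5,10'
insert into Item (ID) values (1), (2), (3)
insert into ItemInt (ItemID, Value) values (1, 0), (1, 1), (1, 2)
insert into ItemInt (ItemID, Value) values (2, 1), (2, 2), (2, 3), (2, 4)
insert into ItemInt (ItemID, Value) values (3, 5), (3, 10)
go

declare @searchTerms table (
    searchTerm int not null primary key
)
insert into @searchTerms values (2), (3), (4)

-- Items that contain ANY of the search terms (Items 1 and 2 for this data)
select distinct v.ItemID
from ItemInt v
join @searchTerms t on v.Value = t.searchTerm

-- Items that contain ALL of the search terms (Item 2 for this data)
select v.ItemID
from ItemInt v
join @searchTerms t on v.Value = t.searchTerm
group by v.ItemID
having count(*) = (select count(*) from @searchTerms)
```

The searches become plain integer equality joins, so there is no string concatenation, no casting and no leading-wildcard LIKE for the optimizer to work around.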
All that said (maybe somebody, somewhere is rethinking an ill-advised design), I've put LIKE to good use when searching delimited strings:
-- Setup demo data
declare @delimitedInts table (
    data varchar(max) not null
)
insert into @delimitedInts select '0,1,2'
insert into @delimitedInts select '1,2,3,4'
insert into @delimitedInts select '5,10'

-- Create a search term
declare @searchTerm int = 2

-- Get all rows that contain the searchTerm
select data
from @delimitedInts
where ',' + data + ',' like '%,' + cast(@searchTerm as varchar(11)) + ',%'

-- Create many search terms
declare @searchTerms table (
    searchTerm int not null primary key
)
insert into @searchTerms select 2
insert into @searchTerms select 3
insert into @searchTerms select 4

-- Get all rows that contain ANY of the searchTerms
select distinct a.data
from @delimitedInts a
join @searchTerms b on ',' + a.data + ',' like '%,' + cast(b.searchTerm as varchar(11)) + ',%'

-- Get all rows that contain ALL of the searchTerms
select a.data
from @delimitedInts a
join @searchTerms b on ',' + a.data + ',' like '%,' + cast(b.searchTerm as varchar(11)) + ',%'
group by a.data
having count(*) = (select count(*) from @searchTerms)
Is this too slow for you? Maybe. Have you actually measured it? At least you could get an implementation in place and prove that it works before you optimize it.
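If you want numbers rather than a hunch, one low-effort way to measure is SQL Server's SET STATISTICS IO and SET STATISTICS TIME options, which report reads and CPU/elapsed time in the Messages output. The sketch below re-declares the small demo table so it runs on its own; the figures will only mean something against a realistic volume of your own data, so I'm not quoting any here.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session
set statistics io on
set statistics time on

-- Same demo data as the LIKE example (swap in your real table to get meaningful numbers)
declare @delimitedInts table (
    data varchar(max) not null
)
insert into @delimitedInts select '0,1,2'
insert into @delimitedInts select '1,2,3,4'
insert into @delimitedInts select '5,10'

declare @searchTerm int = 2

-- The statement being measured
select data
from @delimitedInts
where ',' + data + ',' like '%,' + cast(@searchTerm as varchar(11)) + ',%'

set statistics io off
set statistics time off
```

Comparing the actual execution plans alongside these figures is usually worth the extra minute, since the plan shows whether the time is going into a scan, a sort or something else entirely.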
Update: XML
I've done a little testing on converting your space-delimited column to an XML column and querying it, including doing so with XML indexes. Unfortunately, you can't put an XML index on a computed column, so I'm using a trigger to keep an XML column automatically updated. Here are some interesting results (note the SQL comments):
-- Create a demo table
create table MyTable (
    ID int not null primary key identity
    , SpaceSeparatedInts varchar(max) not null
    --, ComputedIntsXml as cast('<ints><i>' + replace(SpaceSeparatedInts, ' ', '</i><i>') + '</i></ints>' as xml) persisted -- Can't use XML index
    , IntsXml xml null
)
go

-- Create trigger to update IntsXml
create trigger MyTable_Trigger on MyTable after insert, update as begin
    update m
    set m.IntsXml = cast('<ints><i>' + replace(m.SpaceSeparatedInts, ' ', '</i><i>') + '</i></ints>' as xml)
    from MyTable m
    join inserted i on m.ID = i.ID
end
go

-- Add some demo data
insert into MyTable (SpaceSeparatedInts) select '1'
insert into MyTable (SpaceSeparatedInts) select '1 2'
insert into MyTable (SpaceSeparatedInts) select '2 3 4'
insert into MyTable (SpaceSeparatedInts) select '5 6 7 10'
insert into MyTable (SpaceSeparatedInts) select '100 10 1000'
go

-- Search for the number 10 (and use this same query in subsequent testing, below)
select *
from MyTable
where IntsXml.exist('/ints/i[. = "10"]') = 1
-- This query spends virtually all of its time running an XML Reader and an XPath filter

-- Add a primary xml index
create primary xml index IX_MyTable_IntsXml on MyTable (IntsXml)
-- The query now uses a clustered index scan and clustered index seek on PrimaryXML

-- Add secondary xml index for value
create xml index IX_MyTable_IntsXml_Value on MyTable (IntsXml) using xml index IX_MyTable_IntsXml for value
-- No change

-- Add secondary xml index for path
create xml index IX_MyTable_IntsXml_Path on MyTable (IntsXml) using xml index IX_MyTable_IntsXml for path
-- No change

-- Add secondary xml index for property
create xml index IX_MyTable_IntsXml_Property on MyTable (IntsXml) using xml index IX_MyTable_IntsXml for property
-- The query now replaces the clustered index scan on PrimaryXML with an index seek on SecondaryXML
While it is clearly a different method, is this faster than LIKE? You have to test in your environment. Hopefully this will give you some ideas of how to do so. Please let me know how this works out for you, if it's doable in your shop.
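As a starting point for that test, here is a rough sketch that runs the XML search and an equivalent LIKE search side by side against the MyTable demo above, with timing output switched on. It assumes MyTable and its trigger already exist as created earlier, and that the column really is delimited by single spaces; both queries should return the same rows, so any difference you see is down to how each one gets there.

```sql
-- Assumes the MyTable demo table from this answer has been created and populated
set statistics io on
set statistics time on

-- XML approach: same query as in the demo
select *
from MyTable
where IntsXml.exist('/ints/i[. = "10"]') = 1

-- LIKE approach: pad both sides with the delimiter (a space here) so that
-- 10 does not match 100 or 1000
select *
from MyTable
where ' ' + SpaceSeparatedInts + ' ' like '% 10 %'

set statistics io off
set statistics time off
```

With only five rows the timings will be noise, so load a representative number of rows before drawing any conclusions.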