We have an application that uses a SQL Server 2008 database and full-text search. I'm trying to understand why the following searches behave differently:
First, a phrase containing a hyphenated word, like this:
contains(column_name, '"one two-three-four five"')
And second, an identical phrase, where the hyphens are replaced by spaces:
contains(column_name, '"one two three four five"')
The full-text index uses the ENGLISH (1033) locale, and the default system stoplist.
From my observations of other full-text searches containing hyphenated words, the first one should allow for matches on either "one two three four five" or "one twothreefour five". Instead, it only matches "one twothreefour five" (and not "one two-three-four five").
Test Case
Setup:
create table ftTest
(
Id int identity(1,1) not null,
Value nvarchar(100) not null,
constraint PK_ftTest primary key (Id)
);
insert ftTest (Value) values ('one two-three-four five');
insert ftTest (Value) values ('one twothreefour five');
create fulltext catalog ftTest_catalog;
create fulltext index on ftTest (Value language 1033)
key index PK_ftTest on ftTest_catalog;
GO
Queries:
--returns one match
select * from ftTest where contains(Value, '"one two-three-four five"')
--each of the following returns two matches
select * from ftTest where contains(Value, '"one two three four five"')
select * from ftTest where contains(Value, 'one and "two-three-four five"')
select * from ftTest where contains(Value, '"one two-three-four" and five')
GO
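Diagnostics:
In case it helps narrow this down, here is a sketch of how the word breaker output and the indexed keywords can be inspected. It assumes the ftTest table above still exists and has finished populating, that stoplist id 0 selects the system stoplist, and that accent sensitivity is off (the 0 in the last argument). sys.dm_fts_parser requires sysadmin membership. I make no claim about what the output should look like; the point is to compare the terms produced for the two phrases.

```sql
-- How the English (1033) word breaker splits each search phrase.
-- Arguments: query string, LCID, stoplist id (0 = system stoplist, assumed),
-- accent sensitivity (0 = insensitive, assumed).
select occurrence, special_term, display_term, expansion_type, source_term
from sys.dm_fts_parser('"one two-three-four five"', 1033, 0, 0);

select occurrence, special_term, display_term, expansion_type, source_term
from sys.dm_fts_parser('"one two three four five"', 1033, 0, 0);

-- Which keywords were actually stored in the full-text index for ftTest.
select display_term, column_id, document_count
from sys.dm_fts_index_keywords(db_id(), object_id('ftTest'))
order by display_term;
GO
```

Comparing the occurrence values of the parsed query terms against the keywords stored for each row should show whether the hyphenated phrase is being matched on the concatenated form, the individual parts, or both.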
Cleanup:
drop fulltext index on ftTest;
drop fulltext catalog ftTest_catalog;
drop table ftTest;