Typically, a single catalog will handle all the full-text searches for a
database, although dedicating a
single catalog to a very large table (one
with over one million rows) will improve
performance.
全文索引创建完之后,需要进行填充(如果在创建的时候没有选择创建后进行填充),创建后在表的右键盘菜单里选择全文索引-启用全文索引,然后再选启动完全填充就行了。
全文索引的更新和普通的sql索引不一样,它不是在事务过程中进行的,它是要等到sqlserver把数据发给索引才进行更新,这样可以不影响大文本更新的速度,但是缺点是它不是实时的。全文索引创建之初都是空的,如果表中已经有数据要进行完全填充,空的就不需要了。
全文索引的填充有两种方式:Incremental
populations和Change tracking and background population (default),
incremental
population表需要有一个rowversion列,
Incremental populations present two problems.
First, a built-in delay occurs between the time the data is entered and the time
the user can find the data using full-text search.
Second,incremental
populations consolidate all the changes into a single process that consumes
a
significant amount of CPU time during the incremental change.
Change
tracking and background population
(default):每当一行发生变更的时候,sqlserver就会发送一行的增量修改,而且它不在事务范围内,并不会影响更新的执行,它对系统的消耗也不明显,所以这种方式在实时性跟系统消耗间取得了很好的平衡。这个一般来说是最好的方案。
Change
tracking can also be configured to require manual pushes of only the changed
data.
Best Practice:If the database project incorporates searching for
words within columns, use full-text search with change tracking and background
population. It’s the best overall way to balance search performance with
update
performance.
使用全文索引搜索词用contains和containstable函数。
contains函数用来确定一个词是否在某列中出现
containstable函数用来根据The
frequency/uniqueness of the word in the table和The frequency/uniqueness of the
word in the column来给一个词评级,返回一个两列的表,一个是创建全文索引的时候用的标识列一个是评级结果。a rare word will be
ranked as statistically more important than a common
word.
全文索引是不会大小写的。即使sqlserver设置成区分大小写,对它也无效。
在多列中查多个词的解决方案:
/*可以实现在多列中查多个词,但是这个有性能问题*/
SELECT
Title
FROM Fable
WHERE CONTAINS (*,‘Thrifty’)
AND
CONTAINS(*,‘supperless’)
The other solution to the multiple-column search
problem consists of adding an additional column
to hold all the text to be
searched and duplicating the data from the original columns to
a
FullTextSearch column within an after trigger or using a persisted computed
column. This solution
is not smooth either. It duplicates data and costs
performance time during inserts and updates. The crux
of the decision
regarding how to solve the multiple-column search is the conflict between fast
reads and
fast writes — OLAP versus OLTP.
Fuzzy
Searches:
Internally, the free-form text is broken down into multiple words
and phrases, and the full-text search with inflections and weighting is then
performed on the result.
FREETEXT and FREETEXTTABLE essentially turn on
every advanced feature of iFTS and perform a fuzzy word
search.
Performance:
在全文索引性能调整方面有以下两点:
iFTS benefits from a
very fast subsystem. Place your catalog on its own controller, preferably
its
own RAID 10 array. A sweet spot exists for SQL iFTS on eight-way servers. After
a full or
incremental population, force a master merge, which will
consolidate all the shadow indexes
into a single master index, by issuing the
following command:
ALTER FULLTEXT CATALOG catalog_name REORGANIZE;
You
can also increase the maximum number of ranges that the gathering process can
use. To
do so, issue the following command:
EXEC sp_configure ‘max
full-text crawl range’, 32;