Q:
Is it possible to turn off directory locking with BDB? How is the performance
compared to regular FSDirectory for queries?
A:
If you're thinking of using Berkeley DB as the store behind the Lucene index
via the DbDirectory Directory implementation, here are a few things to keep in
mind:
- Always call setUseCompoundFile(false).
  Don't use compound Lucene index files on top of Berkeley DB:
  . there is a bug that prevents this from working correctly
  . it makes no sense anyway, since it duplicates what DbDirectory is
    already doing (all index files are stored in the same Berkeley DB file)
  . it slows things down
- If you wrap all the index updates in a single transaction, consider
  doing the updates in a RAMDirectory first and then adding the
  RAMDirectory wholesale to the DbDirectory inside that transaction.
  This makes indexing considerably faster (about three times faster in my
  case) and causes far less churn in Berkeley DB, which can otherwise
  produce a large number of transaction log files that rapidly fill up
  your hard drive.
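
To make the reasoning behind the RAMDirectory tip concrete, here is a toy model in plain Java. It is only a sketch: CountingStore, MemoryStage, and the two index methods are hypothetical stand-ins invented for illustration, not Lucene or Berkeley DB classes, and the "rewrite the segment on every update" behaviour is a simplifying assumption rather than a description of what DbDirectory really does. It only shows why staging updates in memory and copying them once means fewer writes inside the transaction.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Lucene or Berkeley DB APIs.
class StagedIndexDemo {

    /** Hypothetical stand-in for a transactional store; counts write operations. */
    static class CountingStore {
        final Map<String, String> files = new LinkedHashMap<>();
        int writes = 0;   // individual write operations seen by the store
        int commits = 0;  // number of committed transactions

        void write(String name, String data) {
            files.put(name, data);
            writes++;
        }

        void commit() {
            commits++;
        }
    }

    /** Hypothetical stand-in for an in-memory directory that buffers updates. */
    static class MemoryStage {
        final Map<String, StringBuilder> pending = new LinkedHashMap<>();

        void append(String name, String data) {
            pending.computeIfAbsent(name, k -> new StringBuilder()).append(data);
        }

        /** Copies every staged file to the store, one write per file. */
        void flushTo(CountingStore store) {
            for (Map.Entry<String, StringBuilder> e : pending.entrySet()) {
                store.write(e.getKey(), e.getValue().toString());
            }
            pending.clear();
        }
    }

    /** Every update goes straight to the store (assumed: one rewrite per document). */
    static CountingStore indexDirectly(List<String> docs) {
        CountingStore store = new CountingStore();
        for (String doc : docs) {
            String old = store.files.getOrDefault("segment", "");
            store.write("segment", old + doc + "\n");
        }
        store.commit();
        return store;
    }

    /** Updates are collected in memory, then copied to the store once before commit. */
    static CountingStore indexViaStage(List<String> docs) {
        CountingStore store = new CountingStore();
        MemoryStage stage = new MemoryStage();
        for (String doc : docs) {
            stage.append("segment", doc + "\n");
        }
        stage.flushTo(store);
        store.commit();
        return store;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc v1", "doc v2", "doc v3");
        System.out.println("direct writes: " + indexDirectly(docs).writes);
        System.out.println("staged writes: " + indexViaStage(docs).writes);
    }
}
```

Both paths end with the same stored content and a single commit; under the model's assumptions, the staged path simply touches the store fewer times along the way.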
I'm not really sure whether, or how, index merging works in this setup. For
my use, having no merging is good enough, since I never update existing
documents but instead always add a new version of them. The concept of a
version is tied to my application, and each transaction corresponds to a new
version.