Creating a NoSQL database: what is the best source code to look at?

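This post discusses the design ideas and technical requirements for building a NoSQL database that stores large volumes of nested comments, including automatic/manual sharding and full-text search, and asks for existing code and algorithms to learn from.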

I have always wanted a NoSQL database that was purpose-built for storing large volumes of nested/threaded comments. The implementation would probably be done in Java, because that is what I am best at. I really like how dead simple it is to set up an Elasticsearch cluster and throw data into it, and I want my product to share those same qualities. Here are the features I have in mind:

1) auto/manual sharding across clusters
2) auto/manual indexing across clusters
3) full-text search (probably via Lucene or Elasticsearch)
4) REST/JSON API
5) retrieve any comment by ID
6) comments can be retrieved with or without child nodes
7) comment trees can be retrieved with a specified depth
8) comment trees can be filtered by time or rank when retrieved
9) entire comment trees can be re-parented
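
To make features 5 through 9 concrete, here is a minimal in-memory sketch of the tree operations, assuming a simple adjacency model (each comment stores its parent ID, and a separate map lists child IDs). Every name in it (`CommentStore`, `subtree`, `reparent`, the `rank` field) is hypothetical and made up for illustration; it ignores sharding, persistence, and time-based filtering, and is not taken from any existing database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory comment store; all names are illustrative assumptions. */
class CommentStore {
    /** One comment; parentId is null for a top-level comment. */
    static final class Comment {
        final long id;
        Long parentId;
        final String text;
        final int rank;

        Comment(long id, Long parentId, String text, int rank) {
            this.id = id;
            this.parentId = parentId;
            this.text = text;
            this.rank = rank;
        }
    }

    private final Map<Long, Comment> byId = new HashMap<>();
    private final Map<Long, List<Long>> childIds = new HashMap<>();

    /** Adds a comment under an existing parent, or as a root when parentId is null. */
    void add(long id, Long parentId, String text, int rank) {
        if (byId.containsKey(id)) {
            throw new IllegalArgumentException("duplicate id " + id);
        }
        if (parentId != null && !byId.containsKey(parentId)) {
            throw new IllegalArgumentException("unknown parent " + parentId);
        }
        byId.put(id, new Comment(id, parentId, text, rank));
        childIds.put(id, new ArrayList<>());
        if (parentId != null) {
            childIds.get(parentId).add(id);
        }
    }

    /** Feature 5: look up a single comment by ID (null if absent). */
    Comment get(long id) {
        return byId.get(id);
    }

    /**
     * Features 6 to 8: depth-first IDs starting at rootId.
     * maxDepth 0 returns only the root; a comment with rank below minRank
     * is skipped together with all of its replies.
     */
    List<Long> subtree(long rootId, int maxDepth, int minRank) {
        List<Long> out = new ArrayList<>();
        if (byId.containsKey(rootId)) {
            collect(rootId, maxDepth, minRank, out);
        }
        return out;
    }

    private void collect(long id, int depthLeft, int minRank, List<Long> out) {
        if (byId.get(id).rank < minRank) {
            return;
        }
        out.add(id);
        if (depthLeft == 0) {
            return;
        }
        for (long child : childIds.get(id)) {
            collect(child, depthLeft - 1, minRank, out);
        }
    }

    /** Feature 9: move a comment and its whole subtree under a new parent. */
    void reparent(long id, Long newParentId) {
        Comment moved = byId.get(id);
        if (moved == null) {
            throw new IllegalArgumentException("unknown id " + id);
        }
        if (newParentId != null && !byId.containsKey(newParentId)) {
            throw new IllegalArgumentException("unknown parent " + newParentId);
        }
        // Walk up from the new parent; hitting the moved node would create a cycle.
        for (Long cur = newParentId; cur != null; cur = byId.get(cur).parentId) {
            if (cur == id) {
                throw new IllegalArgumentException("cannot move a comment under itself");
            }
        }
        if (moved.parentId != null) {
            childIds.get(moved.parentId).remove(Long.valueOf(id));
        }
        moved.parentId = newParentId;
        if (newParentId != null) {
            childIds.get(newParentId).add(id);
        }
    }
}
```

With this layout, re-parenting touches only the moved node and two child lists, whereas schemes that store a full ancestor path on every comment would have to rewrite the whole subtree. That trade-off is one reason to compare tree encodings before settling on a storage format.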

What I'm looking for are exceptional pieces of code or specific algorithms that I can study before digging into this project. Can anyone suggest a few places to get started?

asked Aug 8 '12 at 3:24

bostonBob

Much of this will be a feature of the app you build rather than the database you use. The rest (possibly excluding full-text search), any existing NoSQL database should be able to handle. Why exactly can't you use an already existing DB? – cHao Aug 8 '12 at 3:30

Do you want to write your own, or do you want to use one that you like and that is written in Java? – Edmon Aug 8 '12 at 3:30

About 80% of the reason for wanting to write my own is for fun; the other 20% is because I have never really been fully satisfied with the traditional solutions for storing nested comments. I think it would be cool to be able to fire up a cluster to store/search Reddit-scale volumes of comments. – bostonBob Aug 8 '12 at 3:52
