SOLR Cloud(5)Some Notes

License: CC 4.0 BY-SA

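This article covers how SolrCloud works, the steps for setting up a cluster, and management tips, including deploying with Docker, operating on collections and replicas through the API, and querying data with SQL. It also shows how to monitor the cluster status and parse the returned JSON.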

How SOLR Cloud Works
https://lucene.apache.org/solr/guide/7_0/how-solrcloud-works.html

Logical Concepts
A cluster can host multiple collections of Solr documents -> a collection can be partitioned into multiple shards
The number of shards determines -> the theoretical limit on the number of documents the collection can contain -> the amount of parallelization possible for an individual search request

Physical Concepts
Cluster -> one or more Solr nodes
Each node -> multiple cores
Each core in a cluster is a physical replica of a logical shard.
The number of replicas determines -> the level of redundancy -> the theoretical limit on the number of concurrent search requests that can be processed

Search all shards
http://localhost:8983/solr/gettingstarted/select?q=*:*

Search one shard
http://localhost:8983/solr/gettingstarted/select?q=*:*&shards=shard1

Search group of shards
http://localhost:8983/solr/gettingstarted/select?q=*:*&shards=shard1,shard2

Search specific replicas (by address)
http://localhost:8983/solr/gettingstarted/select?q=*:*&shards=localhost:7574/solr/gettingstarted,localhost:8983/solr/gettingstarted
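
The four URLs above differ only in the shards parameter, so they can be generated from one helper. This is a minimal sketch, assuming the same local gettingstarted collection; the function name search_url is made up for illustration and is not part of Solr.

```python
from urllib.parse import urlencode

# Assumed endpoint, copied from the example URLs above.
BASE = "http://localhost:8983/solr/gettingstarted/select"


def search_url(query="*:*", shards=None):
    """Build a select URL; `shards` is an optional list of shard names
    (shard1, shard2) or replica addresses (host:port/solr/collection)."""
    params = {"q": query}
    if shards:
        # Solr expects a comma-separated list in the shards parameter.
        params["shards"] = ",".join(shards)
    # Keep *, :, comma and / unescaped so the result matches the URLs above.
    return BASE + "?" + urlencode(params, safe="*:,/")


print(search_url())                              # all shards
print(search_url(shards=["shard1"]))             # one shard
print(search_url(shards=["shard1", "shard2"]))   # group of shards
print(search_url(shards=[
    "localhost:7574/solr/gettingstarted",
    "localhost:8983/solr/gettingstarted",
]))                                              # specific replicas
```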

Start from a SOLR Docker
https://github.com/docker-solr/docker-solr/tree/master/7.1/scripts
https://github.com/docker-solr/docker-solr/blob/master/docs/docker-networking.md

A ZooKeeper UI tool (zkui)
https://github.com/DeemOpen/zkui
http://localhost:9090/home?zkPath=/solr

Exception:
org.apache.solr.common.SolrException: No coreNodeName for CoreDescriptor[name=alljobs;instanceDir=/opt/solr/server/solr/mycores/alljobs]
Caused by: org.apache.solr.common.SolrException: Unable to create core [alljobs]
at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1045)
at org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:642)

ConfigSets API
https://lucene.apache.org/solr/guide/6_6/configsets-api.html

Collections API
https://lucene.apache.org/solr/guide/6_6/collections-api.html

Delete the Replica
http://172.23.7.212:8983/solr/admin/collections?action=DELETEREPLICA&collection=allJobs&shard=shard1&count=1&onlyIfDown=true

Add the Replica Back
http://172.23.7.212:8983/solr/admin/collections?action=ADDREPLICA&collection=allJobs&shard=shard1
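
When these calls are scripted, the URLs can be assembled from the action name and its parameters. The following is a rough sketch; the helper names collections_api_url and succeeded are hypothetical, and the only assumption about the response is the responseHeader.status field, where 0 means success.

```python
from urllib.parse import urlencode

# Node address taken from the URLs above.
SOLR = "http://172.23.7.212:8983/solr"


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL: <base>/admin/collections?action=...&..."""
    return f"{base_url}/admin/collections?" + urlencode({"action": action, **params})


def succeeded(response):
    """Check a parsed JSON response: Solr reports success as status 0."""
    return response.get("responseHeader", {}).get("status") == 0


delete_url = collections_api_url(
    SOLR, "DELETEREPLICA",
    collection="allJobs", shard="shard1", count=1, onlyIfDown="true",
)
add_url = collections_api_url(SOLR, "ADDREPLICA", collection="allJobs", shard="shard1")

print(delete_url)
print(add_url)
```

Since onlyIfDown=true is set, the delete request should only remove a replica that is not active, which makes it a safer choice for cleanup scripts.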

Replica Information
https://lucene.apache.org/solr/guide/7_0/collections-api.html#create-parameters
http://172.23.7.212:8983/solr/admin/collections?action=CLUSTERSTATUS
http://alljobs.us-east-1.elasticbeanstalk.com:8983/solr/admin/collections?action=CLUSTERSTATUS
Response format:
{
  "responseHeader":{
    "status":0,
    "QTime":13},
  "cluster":{
    "collections":{
      "allJobs":{
        "pullReplicas":"0",
        "replicationFactor":"2",
        "shards":{"shard1":{
            "range":"80000000-7fffffff",
            "state":"active",
            "replicas":{
              "core_node4":{
                "core":"allJobs_shard1_replica_n2",
                "base_url":"http://172.23.7.212:8983/solr",
                "node_name":"172.23.7.212:8983_solr",
                "state":"active",
                "type":"NRT",
                "leader":"true"},
              "core_node12":{
                "core":"allJobs_shard1_replica_n11",
                "base_url":"http://172.23.2.179:8983/solr",
                "node_name":"172.23.2.179:8983_solr",
                "state":"active",
                "type":"NRT"},
              "core_node14":{
                "core":"allJobs_shard1_replica_n13",
                "base_url":"http://172.23.7.229:8983/solr",
                "node_name":"172.23.7.229:8983_solr",
                "state":"down",
                "type":"NRT"}}}},
        "router":{"name":"compositeId"},
        "maxShardsPerNode":"1",
        "autoAddReplicas":"false",
        "nrtReplicas":"1",
        "tlogReplicas":"0",
        "znodeVersion":37,
        "configName":"allJobs"}},
    "live_nodes":["172.23.7.212:8983_solr",
      "172.23.2.179:8983_solr"]}}

Parse the JSON in Shell
https://stackoverflow.com/questions/20488315/read-the-json-data-in-shell-script

This command returns the core node name (the key under replicas) of the replica hosted on a given node
curl 'http://172.23.7.212:8983/solr/admin/collections?action=CLUSTERSTATUS' | jq -r '.cluster.collections.allJobs.shards.shard1.replicas | to_entries[] | select(.value.node_name=="172.23.2.179:8983_solr") | .key'
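
Where jq is not available, the same lookup can be done in Python. This is a minimal sketch that mirrors the jq filter on a trimmed copy of the CLUSTERSTATUS response shown earlier; the function name replica_keys_on_node is invented for illustration.

```python
import json

# Trimmed copy of the CLUSTERSTATUS response from this article.
SAMPLE = json.loads("""
{"cluster": {"collections": {"allJobs": {"shards": {"shard1": {"replicas": {
  "core_node4":  {"node_name": "172.23.7.212:8983_solr", "state": "active"},
  "core_node12": {"node_name": "172.23.2.179:8983_solr", "state": "active"},
  "core_node14": {"node_name": "172.23.7.229:8983_solr", "state": "down"}
}}}}}}}
""")


def replica_keys_on_node(status, collection, shard, node_name):
    """Same idea as: to_entries[] | select(.value.node_name == X) | .key"""
    replicas = status["cluster"]["collections"][collection]["shards"][shard]["replicas"]
    return [key for key, value in replicas.items() if value["node_name"] == node_name]


print(replica_keys_on_node(SAMPLE, "allJobs", "shard1", "172.23.2.179:8983_solr"))
# prints ['core_node12']
```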

Install jq on macOS
>brew install jq

The same command works inside the Docker image, using localhost
>curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" | jq -r '.cluster.collections.allJobs.shards.shard1.replicas | to_entries[] | select(.value.node_name=="172.23.2.179:8983_solr") | .key'

These links explain why a replica can be gone while ZooKeeper still lists it as active in state.json
http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/common/cloud/Replica.State.html#ACTIVE
http://grokbase.com/t/lucene/solr-user/1598s058v0/solrcloud-admin-ui-shows-node-is-down-but-state-json-says-its-active-up
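
As the Replica.State documentation linked above notes, when a node crashes the replica state may remain ACTIVE in ZooKeeper, so a replica should only be treated as active if its node also appears in live_nodes. The sketch below applies that check to a CLUSTERSTATUS result. The sample data is hypothetical (core_node14 is deliberately left active on a node that is not live), and the function names are made up for illustration.

```python
import json

# Hypothetical excerpt: core_node14 is still "active" in state.json
# although its node is missing from live_nodes.
SAMPLE = json.loads("""
{"cluster": {
  "collections": {"allJobs": {"shards": {"shard1": {"replicas": {
    "core_node4":  {"node_name": "172.23.7.212:8983_solr", "state": "active"},
    "core_node14": {"node_name": "172.23.7.229:8983_solr", "state": "active"}
  }}}}},
  "live_nodes": ["172.23.7.212:8983_solr"]}}
""")


def is_truly_active(replica, live_nodes):
    """Active in state.json AND hosted on a node that is currently live."""
    return replica["state"] == "active" and replica["node_name"] in live_nodes


def stale_replicas(status, collection):
    """Return (shard, replica) pairs recorded as active on a node that is not live."""
    live = set(status["cluster"]["live_nodes"])
    shards = status["cluster"]["collections"][collection]["shards"]
    stale = []
    for shard_name, shard in shards.items():
        for replica_name, replica in shard["replicas"].items():
            if replica["state"] == "active" and not is_truly_active(replica, live):
                stale.append((shard_name, replica_name))
    return stale


print(stale_replicas(SAMPLE, "allJobs"))
# prints [('shard1', 'core_node14')]
```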

The full list of Collections API calls is in the Solr documentation
https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-DeleteaShard

Finally, I got this UI tool (DbVisualizer over Solr JDBC) working on my machine
https://lucene.apache.org/solr/guide/6_6/solr-jdbc-dbvisualizer.html

We can also run SQL directly through the /sql endpoint URL
http://alljobs.us-east-1.elasticbeanstalk.com:8983/solr/allJobs/sql?stmt=select+id+from+allJobs+limit+10&includeMetadata=true&user=&password=&aggregationMode=facet
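
Writing the stmt value by hand means replacing every space with +, which is easy to get wrong for longer statements. A small sketch that lets urlencode do the escaping; sql_url is a hypothetical helper name, and the empty user and password parameters from the URL above are left out here on the assumption that they are not needed without authentication.

```python
from urllib.parse import urlencode

# Base URL taken from the example above.
SOLR = "http://alljobs.us-east-1.elasticbeanstalk.com:8983/solr"


def sql_url(base_url, collection, stmt, aggregation_mode="facet"):
    """Build a /sql request URL; urlencode turns spaces in stmt into '+'."""
    params = {
        "stmt": stmt,
        "includeMetadata": "true",
        "aggregationMode": aggregation_mode,
    }
    return f"{base_url}/{collection}/sql?" + urlencode(params)


print(sql_url(SOLR, "allJobs", "select id from allJobs limit 10"))
```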

Steps to set up the connection in DbVisualizer
In Driver Manager, create a new driver named solr7
URL Format: jdbc:solr://zookeeper1.us-east-1.elasticbeanstalk.com,zookeeper2.us-east-1.elasticbeanstalk.com,zookeeper3.us-east-1.elasticbeanstalk.com/solr/allJobs?collection=allJobs
Driver Class: org.apache.solr.client.solrj.io.sql.DriverImpl

Add the libraries SOLR_HOME/dist/solr-solrj-7.1.0.jar and
SOLR_HOME/dist/solrj-lib/*

New Connection
Driver (JDBC) solr7
Database URL: jdbc:solr://zookeeper1.us-east-1.elasticbeanstalk.com,zookeeper2.us-east-1.elasticbeanstalk.com,zookeeper3.us-east-1.elasticbeanstalk.com/solr/allJobs?collection=allJobs

Then we can run SQL queries such as
select id, title from allJobs limit 10;

Elastic Beanstalk logs to CloudWatch
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.managing.cw.html
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html

References:
http://www.francelabs.com/blog/tutorial-deploying-solrcloud-6-on-amazon-ec2/
http://www.francelabs.com/blog/tutorial-solrcloud-5-amazon-ec2/
https://medium.com/@sarkaramrit2/setting-up-solr-cloud-6-3-0-with-zookeeper-3-4-6-867b96ec4272

Official SolrCloud guide
https://lucene.apache.org/solr/guide/7_0/getting-started-with-solrcloud.html

http://www.cnblogs.com/fengjian2016/p/5858320.html
http://blog.javachen.com/2014/03/10/how-to-install-solrcloud.html
https://my.oschina.net/yugm/blog/183311
https://blog.liyang.io/258.html
http://blog.youkuaiyun.com/zhu_tianwei/article/details/46731887
http://blog.cheyo.net/130.html
https://segmentfault.com/a/1190000002444956