让ElasticSearch连接mysql数据库

https://github.com/jprante/elasticsearch-river-jdbc/wiki/Quickstart

Quickstart

A short guide for the impatient!

Prerequisites:

A running MySQL database test, with user test and password test, a table orders

A terminal / console with commands curl and unzip

Internet access (of course)

Let's go!

  1. Download elasticsearch (current version is 0.19.8, check for the most recent version on http://elasticsearch.org) from

    curl -OL https://github.com/downloads/elasticsearch/elasticsearch/elasticsearch-0.19.10.zip

  2. Unpack zip file into you favorite elasticsearch directory, we call it $ES_HOME

    cd $ES_HOME

    unzip path/to/elasticsearch-0.19.10.zip

  3. Install JDBC river plugin (current version is 1.3.2, check for the most recent version, and if you have the file permissions to write into the plugins folder). If you have installed a JDBC river plugin before, it is recommended to remove the folder plugins/river-jdbc before installing a new version. (**Note**: If you're using a version of elasticsearch >= 0.90.0, you need to use at least version 2.2.0 of the river-jdbc plugin).

    ./bin/plugin --url http://bit.ly/10FJhEd -install river-jdbc

  4. Download MySQL JDBC driver (current version is 5.1.21, check for the most recent version)

    curl -OL http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.21.zip/from/http://cdn.mysql.com/

  5. Add MySQL JDBC driver jar to JDBC river plugin directory

    cp mysql-connector-java-5.1.21-bin.jar $ES_HOME/plugins/river-jdbc/

  6. Start elasticsearch in terminal window with logging on the console

    ./bin/elasticsearch -f

  7. Start another terminal window, and create a new JDBC river with name my_jdbc_river with this curl command

    curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
        "type" : "jdbc",
        "jdbc" : {
            "driver" : "com.mysql.jdbc.Driver",
            "url" : "jdbc:mysql://localhost:3306/test",
            "user" : "test",
            "password" : "test",
            "sql" : "select * from orders"
        },
        "index" : {
            "index" : "jdbc",
            "type" : "jdbc"
        }
    }'
    
  8. The river runs! Watch the log on the elasticsearch console for the river activity. When the river fetched the data and is going to wait for the next cycle (which is by default 60 minutes ahead), you can query your elasticsearch node for the data you just indexed with the following curl command

    curl -XGET 'localhost:9200/jdbc/jdbc/_search?pretty&q=*'

  9. Enjoy the result!

  10. If you want to stop the my_jdbc_river river fetching data from the orders table after the quick demonstration, use this curl command:

    curl -XDELETE 'localhost:9200/_river/my_jdbc_river'

What next?

Now, if you want more fine-tuning, you can drop the index jdbc, change the mapping, change the index name or type name, change the river creation settings, play with the data fetch interval - whatever you like. 


### 将 MySQL 数据库的数据导入 Elasticsearch 并建立索引 #### 使用 Logstash 导入并处理时间格式 为了确保 `birth_date` 字段和其他日期字段能够正确映射到 Elasticsearch 的索引中,可以采用如下 SQL 查询来调整日期格式: ```sql SELECT id, nick_name AS nickName, sex, DATE_FORMAT(birth_date, '%Y-%m-%d') AS birthDate, DATE_FORMAT(create_time, '%Y-%m-%d %H:%i:%S') AS createTime, DATE_FORMAT(update_time, '%Y-%m-%d %H:%i:%S') AS updateTime FROM sys_user; ``` 此查询会将日期转换成适合 Elasticsearch 存储的字符串格式[^5]。 #### 配置 Logstash 输入插件连接 MySQL 配置文件 (`testlogstash.conf`) 中定义输入部分以读取来自 MySQL 的数据。这涉及指定 JDBC 连接参数以及上述经过优化的时间戳格式化后的 SQL 语句: ```ruby input { jdbc { jdbc_driver_library => "/path/to/mysql-connector-java.jar" jdbc_driver_class => "com.mysql.cj.jdbc.Driver" jdbc_connection_string => "jdbc:mysql://localhost:3306/es_db?useSSL=false&serverTimezone=UTC" jdbc_user => "root" jdbc_password => "password" statement_filepath => "./queries/sys_user.sql" # 上述SQL保存路径 } } ``` #### 定义输出至 Elasticsearch 接着,在同一配置文件内设置输出目标为本地运行的单节点集群上的 Elasticsearch 实例: ```ruby output { elasticsearch { hosts => ["http://localhost:9200"] index => "sys_users_index" document_id => "%{id}" } } ``` 以上配置使得每条记录都将被发送给名为 `sys_users_index` 的新索引,并且文档 ID 设置为原始表中的主键值。 #### 创建索引模板 (可选) 如果希望提前定义好要存储哪些字段及其类型,则可以在导入前先创建一个匹配模式相同的索引模板。这样做的好处是可以更好地控制分片分配策略、副本数量以及其他高级选项。 ```json PUT _template/template_sys_users { "index_patterns": ["sys_users_*"], "settings": { "number_of_shards": 1, "number_of_replicas": 0 }, "mappings": { "_source": {"enabled": true}, "properties": { "nickName": { "type": "text"}, "sex": { "type": "keyword"}, "birthDate":{"type":"date","format":"yyyy-MM-dd"}, "createTime":{"type":"date","format":"strict_date_optional_time||epoch_millis"}, "updateTime":{"type":"date","format":"strict_date_optional_time||epoch_millis"} } } } ``` 执行该 API 请求后即可应用这些设定于后续所有名称符合模式的新建索引上[^4]。
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值