Mycat Configuration (3): rule.xml Rule Configuration

License: CC 4.0 BY-SA

Original link: https://my.oschina.net/u/3242075/blog/2960700

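Mycat's rule.xml configuration file defines table-splitting rules through two tags, tableRule and function. Commonly used rules include enumeration, modulo, date column partitioning, range convention, pattern-based modulo, application-specified partitioning, and consistent hashing. These rules offer flexible sharding strategies for different horizontal-split requirements. Consistent hashing has an advantage when scaling out data.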

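rule.xml defines the rules used to split tables, so different tables can be given different sharding algorithms. The file mainly contains two tags, tableRule and function, and you can add further tableRule and function entries as your needs require.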

tableRule tag

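<!-- name: unique name of the rule, used to tell table rules apart -->
<tableRule name="rule1">
  <rule>
    <!-- columns: the column of the table to shard on -->
    <columns>id</columns>

    <!-- algorithm: the name attribute of a function tag; it links the table rule
         to a concrete routing algorithm. Several table rules may point to the
         same routing algorithm. The rule itself is referenced from a table tag
         so that the logical table is sharded by it. -->
    <algorithm>func1</algorithm>
  </rule>
</tableRule>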

function tag

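<!-- name: name of the algorithm, referenced by the algorithm tag -->
<!-- class: fully qualified class name of the routing algorithm -->
<function name="hash-int" class="org.opencloudb.route.function.PartitionByFileMap">
  <!-- property: attributes required by the specific algorithm -->
  <property name="mapFile">partition-hash-int.txt</property>
</function>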

The following are commonly used rules for horizontally splitting tables in Mycat:

Enumeration

<tableRule name="sharding-by-intfile">
  <rule>
    <columns>name</columns>
    <algorithm>hash-int</algorithm>
  </rule>
</tableRule>
<function name="hash-int" class="io.mycat.route.function.PartitionByFileMap">
  <property name="mapFile">partition-hash-int.txt</property>

  <!-- type defaults to 0. 0 means the enum keys in mapFile are Integer; non-zero means they are String. -->
  <property name="type">1</property>

  <!-- defaultNode: a value below 0 means no default node; a value of 0 or above sets the default node to that index (node indexes start at 0, so 0 is the first node). When an unrecognized enum value is encountered it is routed to the default node; without a default node, an unrecognized value raises an error. -->
  <property name="defaultNode">0</property>
</function>

partition-hash-int.txt configuration:

张飞=0
刘备=1
关羽=2
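
The lookup described by this rule can be sketched as follows. This is a minimal illustration, not Mycat's own PartitionByFileMap code: the class name EnumRouteSketch and its methods are hypothetical, and only the behavior stated above (map lookup, default node, error when no default is set) is modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of enumeration routing; not the Mycat implementation.
final class EnumRouteSketch {
    private final Map<String, Integer> nodeByValue = new HashMap<>();
    private final int defaultNode; // a negative value means "no default node"

    EnumRouteSketch(int defaultNode) {
        this.defaultNode = defaultNode;
    }

    /** Registers one "value=nodeIndex" line, as found in partition-hash-int.txt. */
    void addLine(String line) {
        int eq = line.indexOf('=');
        String value = line.substring(0, eq).trim();
        int node = Integer.parseInt(line.substring(eq + 1).trim());
        nodeByValue.put(value, node);
    }

    /** Returns the node index for a column value, falling back to the default node. */
    int route(String columnValue) {
        Integer node = nodeByValue.get(columnValue);
        if (node != null) {
            return node;
        }
        if (defaultNode >= 0) {
            return defaultNode;
        }
        throw new IllegalArgumentException("unrecognized enum value: " + columnValue);
    }
}
```

With defaultNode set to 0 as in the configuration above, a value missing from the map file is routed to the first node instead of raising an error.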

Modulo

<tableRule name="mod-long">
  <rule>
    <columns>user_id</columns>
    <algorithm>mod-long</algorithm>
  </rule>
</tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
  <!-- how many data nodes -->
  <!-- The node is chosen by taking id modulo count (your number of nodes). Compared with approach 1, batch inserts have to switch between data sources, and the ids stored on any one node are not contiguous. -->
  <property name="count">3</property>
</function>
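
As a rough sketch of the arithmetic, assuming a numeric id and the count of 3 configured above (the class ModRouteSketch is hypothetical and is not Mycat's PartitionByMod):

```java
// Hypothetical sketch of modulo routing; not the Mycat implementation.
final class ModRouteSketch {
    /** Returns id mod count as the node index; floorMod keeps the result non-negative. */
    static int route(long id, int count) {
        return (int) Math.floorMod(id, (long) count);
    }
}
```

Consecutive ids 10, 11 and 12 go to nodes 1, 2 and 0, which is why a batch insert of sequential ids touches every node.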

Date column partitioning

<tableRule name="sharding-by-date">
  <rule>
    <columns>create_time</columns>
    <algorithm>sharding-by-date</algorithm>
  </rule>
</tableRule>
<function name="sharding-by-date" class="io.mycat.route.function.PartitionByDate">
  <property name="dateFormat">yyyy-MM-dd</property>
  <!-- The configuration sets a start date and the number of days per partition: counting from the start date, every 10 days form one partition. -->
  <property name="sBeginDate">2014-01-01</property>
  <property name="sPartionDay">10</property>
</function>

Assert.assertEquals(true, 0 == partition.calculate("2014-01-01"));
Assert.assertEquals(true, 0 == partition.calculate("2014-01-10"));
Assert.assertEquals(true, 1 == partition.calculate("2014-01-11"));
Assert.assertEquals(true, 12 == partition.calculate("2014-05-01"));
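
The four assertions follow from integer division of the day offset by the partition length. A minimal sketch that reproduces them, assuming the configuration above (DateRouteSketch is a hypothetical class built on java.time, not Mycat's PartitionByDate):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch of date-based routing; not the Mycat implementation.
final class DateRouteSketch {
    private final DateTimeFormatter format;
    private final LocalDate beginDate;
    private final long daysPerPartition;

    DateRouteSketch(String dateFormat, String sBeginDate, long sPartionDay) {
        this.format = DateTimeFormatter.ofPattern(dateFormat);
        this.beginDate = LocalDate.parse(sBeginDate, format);
        this.daysPerPartition = sPartionDay;
    }

    /** Partition index = whole days since the start date, divided by days per partition. */
    int calculate(String columnValue) {
        LocalDate date = LocalDate.parse(columnValue, format);
        long days = ChronoUnit.DAYS.between(beginDate, date);
        return (int) (days / daysPerPartition);
    }
}
```

For example, 2014-05-01 is 120 days after 2014-01-01, and 120 / 10 gives partition 12. Dates before the start date are not handled in this sketch.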

Range convention

<tableRule name="auto-sharding-long">
  <rule>
    <columns>user_id</columns>
    <algorithm>rang-long</algorithm>
  </rule>
</tableRule>
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
  <property name="mapFile">autopartition-long.txt</property>
</function>

autopartition-long.txt

# range start-end ,data node index
# K=1000,M=10000.
0-500M=0
500M-1000M=1
1000M-1500M=2
or
0-10000000=0
10000001-20000000=1
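
One way to read such a file is sketched below. It assumes the K=1000 and M=10000 units stated in the file header, treats both range bounds as inclusive, and returns the first matching range; those last two points are assumptions, since the file format above does not spell them out. RangeRouteSketch is a hypothetical class, not Mycat's AutoPartitionByLong.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of range routing; not the Mycat implementation.
final class RangeRouteSketch {
    private static final class Range {
        final long start;
        final long end;
        final int node;

        Range(long start, long end, int node) {
            this.start = start;
            this.end = end;
            this.node = node;
        }
    }

    private final List<Range> ranges = new ArrayList<>();

    /** Parses a bound such as "500M" or "10000000", using K=1000 and M=10000. */
    static long parseBound(String text) {
        String t = text.trim().toUpperCase();
        long unit = 1L;
        if (t.endsWith("K")) {
            unit = 1_000L;
            t = t.substring(0, t.length() - 1);
        } else if (t.endsWith("M")) {
            unit = 10_000L;
            t = t.substring(0, t.length() - 1);
        }
        return Long.parseLong(t) * unit;
    }

    /** Registers one "start-end=node" line; blank lines and # comments are skipped. */
    void addLine(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) {
            return;
        }
        int dash = s.indexOf('-');
        int eq = s.indexOf('=');
        long start = parseBound(s.substring(0, dash));
        long end = parseBound(s.substring(dash + 1, eq));
        ranges.add(new Range(start, end, Integer.parseInt(s.substring(eq + 1).trim())));
    }

    /** Returns the node of the first range that contains the value (bounds inclusive). */
    int route(long value) {
        for (Range r : ranges) {
            if (value >= r.start && value <= r.end) {
                return r.node;
            }
        }
        throw new IllegalArgumentException("no range covers value " + value);
    }
}
```

With M=10000, the line 0-500M=0 covers ids 0 through 5,000,000, so an id of 12,000,000 falls in 1000M-1500M and is routed to node 2.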

Pattern-based modulo

<tableRule name="sharding-by-pattern">
  <rule>
    <columns>user_id</columns>
    <algorithm>sharding-by-pattern</algorithm>
  </rule>
</tableRule>
<function name="sharding-by-pattern" class="io.mycat.route.function.PartitionByPattern">
  <!-- patternValue: the modulus base -->
  <property name="patternValue">256</property>
  <!-- defaultNode: the default node; if it is not configured, it defaults to 0, the first node -->
  <property name="defaultNode">2</property>
  <property name="mapFile">partition-pattern.txt</property>
</function>

partition-pattern.txt

# id partition range start-end ,data node index
###### first host configuration
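# 1-32 is the range that id % 256 falls into: a remainder of 1-32 goes to the first partition (index 0), and so on for the other ranges. If id is not numeric, the row goes to the default node, defaultNode.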
1-32=0
33-64=1
65-96=2
97-128=3
######## second host configuration
129-160=4
161-192=5
193-224=6
225-256=7
0-0=7
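
The routing described in the file comment can be sketched like this. PatternRouteSketch is a hypothetical class, not Mycat's PartitionByPattern; sending a remainder that matches no range to defaultNode is an assumption of the sketch.

```java
// Hypothetical sketch of pattern-based modulo routing; not the Mycat implementation.
final class PatternRouteSketch {
    /** Each row of ranges is {start, end, nodeIndex}, mirroring one "start-end=node" line. */
    static int route(String columnValue, int patternValue, int defaultNode, int[][] ranges) {
        // Non-numeric values go straight to the default node.
        if (columnValue == null || !columnValue.matches("\\d+")) {
            return defaultNode;
        }
        long remainder = Long.parseLong(columnValue) % patternValue;
        for (int[] r : ranges) {
            if (remainder >= r[0] && remainder <= r[1]) {
                return r[2];
            }
        }
        return defaultNode; // assumption: an unmatched remainder also uses the default node
    }
}
```

With patternValue 256, an id of 300 leaves remainder 44 and lands in the 33-64 range (node 1), while an id of 256 leaves remainder 0 and is covered by the 0-0=7 line.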

Application-specified partition

<tableRule name="sharding-by-substring">
  <rule>
    <columns>user_id</columns>
    <algorithm>sharding-by-substring</algorithm>
  </rule>
</tableRule>
<function name="sharding-by-substring" class="io.mycat.route.function.PartitionDirectBySubString">
  <property name="startIndex">0</property> <!-- zero-based -->
  <property name="size">2</property>
  <property name="partitionCount">8</property>
  <property name="defaultPartition">0</property>
</function>

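This method computes the partition number directly from a substring of the column value, which must be numeric; the application passes the value and thereby explicitly chooses the partition. For example, with id=05-100000002 and this configuration, size=2 digits are taken starting at startIndex=0, giving 05, so 05 is the partition. If no partition is passed, the row goes to defaultPartition.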
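
A minimal sketch of that extraction follows. SubstringRouteSketch is a hypothetical class, not Mycat's PartitionDirectBySubString. The text above does not say how partitionCount is applied, so wrapping the parsed number with modulo partitionCount is an assumption here, as is falling back to defaultPartition for values that are too short or not numeric.

```java
// Hypothetical sketch of substring-based routing; not the Mycat implementation.
final class SubstringRouteSketch {
    static int route(String columnValue, int startIndex, int size,
                     int partitionCount, int defaultPartition) {
        // Too-short or missing values cannot carry a partition prefix.
        if (columnValue == null || columnValue.length() < startIndex + size) {
            return defaultPartition;
        }
        try {
            int partition = Integer.parseInt(
                    columnValue.substring(startIndex, startIndex + size));
            // Assumption: keep the result inside [0, partitionCount) by wrapping.
            return partitionCount > 0 ? Math.floorMod(partition, partitionCount) : partition;
        } catch (NumberFormatException e) {
            return defaultPartition;
        }
    }
}
```

For id=05-100000002 with startIndex=0 and size=2, the substring is 05 and the sketch returns partition 5.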

String substring hash

<tableRule name="sharding-by-stringhash">
  <rule>
    <columns>user_id</columns>
    <algorithm>sharding-by-stringhash</algorithm>
  </rule>
</tableRule>
<function name="sharding-by-stringhash" class="io.mycat.route.function.PartitionByString">
  <!-- length: modulus base of the string hash; length*count=1024 -->
  <property name="length">512</property>
  <!-- count: number of partitions; length*count=1024 -->
  <property name="count">2</property>
  <!-- hashSlice: the slice to hash, i.e. the hash is computed over this substring of the value -->
  <property name="hashSlice">0:2</property>
</function>


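In hashSlice, 0 stands for str.length() and -1 for str.length()-1; a value greater than 0 stands for itself.
It can be read as substring(start, end); a start of 0 simply means 0.
Example 1: for the value "45abc" with hashSlice 0:2, the hash is computed over 45.
Example 2: for the value "aaaabbb2345" with hashSlice -4:0, the hash is computed over 2345.
/**
* "2" -> (0,2)
* "1:2" -> (1,2)
* "1:" -> (1,0)
* "-1:" -> (-1,0)
* ":-1" -> (0,-1)
* ":" -> (0,0)
*/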
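
The slice notation can be sketched as a small parser plus a substring step. This only covers how the slice selects characters and does not reproduce the hash function itself. HashSliceSketch is a hypothetical class, and its index rules (a negative start counts from the end of the string, an end of 0 or below is added to the string length) are one reading of the notes and examples above.

```java
// Hypothetical sketch of hashSlice parsing and slicing; not the Mycat implementation.
final class HashSliceSketch {
    /** Parses "start:end" into {start, end}; a missing side is 0, a bare number is the end. */
    static int[] parse(String spec) {
        int colon = spec.indexOf(':');
        if (colon < 0) {
            return new int[] {0, Integer.parseInt(spec.trim())};
        }
        String a = spec.substring(0, colon).trim();
        String b = spec.substring(colon + 1).trim();
        return new int[] {
            a.isEmpty() ? 0 : Integer.parseInt(a),
            b.isEmpty() ? 0 : Integer.parseInt(b)
        };
    }

    /** Returns the part of value selected by the slice. */
    static String slice(String value, int start, int end) {
        int from = start >= 0 ? start : value.length() + start; // negative start counts from the end
        int to = end > 0 ? end : value.length() + end;          // 0 means length, -1 means length-1
        return value.substring(from, to);
    }
}
```

Under this reading, slice("aaaabbb2345", -4, 0) resolves to substring(7, 11) and yields 2345, matching Example 2.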

Consistent hashing

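<tableRule name="sharding-by-murmur">
  <rule>
    <columns>user_id</columns>
    <algorithm>murmur</algorithm>
  </rule>
</tableRule>
<function name="murmur" class="io.mycat.route.function.PartitionByMurmurHash">
  <property name="seed">0</property><!-- defaults to 0 -->
  <property name="count">2</property><!-- number of database nodes to shard across; required, otherwise sharding is impossible -->
  <property name="virtualBucketTimes">160</property><!-- each physical database node is mapped to this many virtual nodes; the default is 160, i.e. there are 160 times as many virtual nodes as physical nodes -->
  <!--
  <property name="weightMapFile">weightMapFile</property>
  Node weights; a node without a specified weight defaults to 1. Written in properties-file format, with the node index (an integer from 0 to count-1) as the key and the node weight as the value. Every weight must be a positive integer, otherwise 1 is used instead. -->
  <!--
  <property name="bucketMapPath">/etc/mycat/bucketMapPath</property>
  Used in testing to observe how virtual nodes are distributed over physical nodes. If this property is set, the mapping from each virtual node's murmur hash value to its physical node is written to this file line by line. There is no default; if it is not set, nothing is written. -->
</function>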

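Consistent hashing effectively addresses the scale-out problem for distributed data. The other rules above all make adding nodes difficult to some degree, while the consistent hashing rule resolves that difficulty.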
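
Why scaling out is easier can be seen in a small hash ring sketch. HashRingSketch is a hypothetical class, not Mycat's PartitionByMurmurHash: it uses CRC32 from the JDK as a stand-in for murmur hash, and the virtual node naming scheme is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch of a consistent hash ring; CRC32 stands in for murmur hash.
final class HashRingSketch {
    private final TreeMap<Long, Integer> ring = new TreeMap<>();

    /** Places virtualBucketTimes virtual nodes on the ring for each of count physical nodes. */
    HashRingSketch(int count, int virtualBucketTimes) {
        for (int node = 0; node < count; node++) {
            for (int v = 0; v < virtualBucketTimes; v++) {
                ring.put(hash("node-" + node + "-vn-" + v), node);
            }
        }
    }

    static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Routes a key to the first virtual node at or after its hash, wrapping around the ring. */
    int route(String key) {
        Long slot = ring.ceilingKey(hash(key));
        return ring.get(slot != null ? slot : ring.firstKey());
    }
}
```

Because the virtual nodes of existing physical nodes keep their positions when count grows from 2 to 3, a key either stays on its current node or moves to the new node. With modulo sharding, by contrast, changing count can reassign a key between any two nodes.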
