RHEL5 MOSIX Cluster

This article walks through building a two-node MOSIX cluster out of virtual machines running Red Hat Enterprise Linux 5, then verifies automatic process migration within the cluster with a simple multi-process load test.

MOSIX Cluster (1) -- Installation

Goal: processes should migrate automatically between cluster nodes according to load

Install one RHEL5 guest (192.168.100.5) under VMware

# Download the MOSIX and kernel sources in preparation for building

# Unpack them into the target directory

[root@rhel5 ~]# tar xjvf MOSIX-2.24.2.2.tbz -C /usr/src/

[root@rhel5 ~]# tar xzvf linux-2.6.26.tar.gz -C /usr/src/

#Change into the directory holding the sources

[root@rhel5 ~]# cd /usr/src/

#The target path inside other/patch-2.6.26 is linux-2.6.26.1, so create a symlink (MOSIX apparently never shipped a separate patch for 2.6.26, but that kernel is still supported)

[root@rhel5 src]# ln -s linux-2.6.26/ ./linux-2.6.26.1
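#Optional: before applying the patch for real, a dry run confirms it applies cleanly against this tree (--dry-run makes no changes)

[root@rhel5 src]# patch -p0 --dry-run < /usr/src/mosix-2.24.2.2/other/patch-2.6.26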

#Apply the MOSIX patch to the kernel

[root@rhel5 src]# patch -p0 < /usr/src/mosix-2.24.2.2/other/patch-2.6.26

#Enter the kernel source tree and start building

[root@rhel5 src]# cd linux-2.6.26

#Generate the kernel configuration

[root@rhel5 linux-2.6.26]# make menuconfig
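If you would rather not answer every configuration question from scratch, a common shortcut is to seed .config from the running kernel's configuration first (optional; this assumes RHEL5's stock config lives under /boot, as is standard):

[root@rhel5 linux-2.6.26]# cp /boot/config-$(uname -r) .config

[root@rhel5 linux-2.6.26]# make oldconfig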

#Generate dependencies (note: `make dep` is a 2.4-era step; on a 2.6 kernel it is a no-op and can be skipped)

[root@rhel5 linux-2.6.26]# make dep

#Build the kernel image

[root@rhel5 linux-2.6.26]# make bzImage

#Build the kernel modules

[root@rhel5 linux-2.6.26]# make modules

#Install the kernel modules

[root@rhel5 linux-2.6.26]# make modules_install

#Install the kernel

[root@rhel5 linux-2.6.26]# make install
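#Optional sanity check before rebooting: confirm the new kernel, initrd, and GRUB entry are in place (the file names below assume a default RHEL5 `make install`)

[root@rhel5 linux-2.6.26]# ls /boot/vmlinuz-2.6.26 /boot/initrd-2.6.26.img

[root@rhel5 linux-2.6.26]# grep -B1 -A2 "2.6.26" /boot/grub/grub.conf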

#Enter the MOSIX directory

[root@rhel5 linux-2.6.26]# cd ../mosix-2.24.2.2

#Install MOSIX: press Enter through the prompts. Installation only for now; just make sure the MOSIX service is enabled for the runlevels you normally use. Configuration comes later.

[root@rhel5 mosix-2.24.2.2]# ./mosix.install
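The installer registers an init script named mosix (used below via `service mosix ...`); you can optionally confirm which runlevels it is enabled for:

[root@rhel5 mosix-2.24.2.2]# chkconfig --list mosix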

After shutting down, clone rhel5 (192.168.100.5) to create slave (192.168.100.6)

Installation complete

MOSIX-2.24.2.2/linux-2.6.26 Cluster (2) -- Configuration

Boot rhel5 and slave; at the GRUB menu, press a key and select the 2.6.26 kernel
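Optional: rather than choosing the kernel by hand on every boot, you can make 2.6.26 the default GRUB entry. A sketch, assuming the standard RHEL5 /boot/grub/grub.conf layout:

#list the menu entries; note the position of the 2.6.26 entry, counting from 0

[root@rhel5 ~]# grep -n "^title" /boot/grub/grub.conf

#then edit grub.conf and set "default=" to that index

[root@rhel5 ~]# vi /boot/grub/grub.conf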


Once slave is up, fix its IP address and hostname (being a clone of rhel5, it still carries rhel5's settings); a sketch follows
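A typical RHEL5 way to do this (the interface name eth0 is an assumption; a cloned VM may also need the HWADDR line in ifcfg-eth0 updated to the clone's MAC address):

#set IPADDR=192.168.100.6 in the interface config

[root@slave ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth0

#set HOSTNAME=slave

[root@slave ~]# vi /etc/sysconfig/network

[root@slave ~]# hostname slave

[root@slave ~]# service network restart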

[rhel5]

#Configure MOSIX

[root@rhel5 ~]# mosconf

MOSIX CONFIGURATION

===================

If this is your cluster's file-server and you want to configure MOSIX

for a set of nodes with a common root, please type their common root

directory. Otherwise, if you want to configure the node that you are

running on, just press <ENTER> :-

What would you like to configure?

=================================

1. Which nodes are in this cluster (ESSENTIAL)

2. Authentication (ESSENTIAL)

3. Logical node numbering (recommended)

4. Queueing policies (recommended)

5. Freezing policies

6. Miscellaneous policies

7. Become part of a multi-cluster organizational Grid

Configure what :- 1

There are no nodes in your cluster yet:

=======================================

To add a new set of nodes to your cluster, type 'n'.

To turn on advanced options, type '+'.

For help, type 'h'.

To save and exit, type 'q'. (to abandon all changes and exit, type 'Q')

Option :- n <== add nodes

Adding new node(s) to the cluster:

First host-name or IP address :- 192.168.100.5 <== node IP

Number of nodes :- 1 <== number of nodes

Nodes in your cluster:

======================

1. 192.168.100.5

To add a new set of nodes to your cluster, type 'n'.

To modify an entry, type its number.

To delete an entry, type 'd' followed by that entry-number (eg. d1).

To turn on advanced options, type '+'.

For help, type 'h'.

To save and exit, type 'q'. (to abandon all changes and exit, type 'Q')

Option :- n <== add nodes

Adding new node(s) to the cluster:

First host-name or IP address :- 192.168.100.6 <== node IP

Number of nodes :- 1 <== number of nodes

Nodes in your cluster:

======================

1. 192.168.100.5

2. 192.168.100.6

To add a new set of nodes to your cluster, type 'n'.

To modify an entry, type its number.

To delete an entry, type 'd' followed by that entry-number (eg. d2).

To turn on advanced options, type '+'.

For help, type 'h'.

To save and exit, type 'q'. (to abandon all changes and exit, type 'Q')

Option :- q <== save and exit

Cluster configuration was saved.

OK to also update the logical node numbers [Y/n]? y

Suggesting to assign '192.168.100.5'

as the central queue manager for the cluster

(but be cautious if you mix 32-bit and 64-bit nodes in the same cluster)

OK to update it now [Y/n]?

What would you like to configure next?

======================================

1. Which nodes are in this cluster

2. Authentication (ESSENTIAL)

3. Logical node numbering

4. Queueing policies

5. Freezing policies

6. Miscellaneous policies

7. Become part of a multi-cluster organizational Grid

q. Exit

Configure what :- 2 <== set the authentication keys

MOSIX Authentication:

=====================

To protect your MOSIX cluster from abuse, preventing unauthorized

persons from gaining control over your computers, you need to set

up a secret cluster-protection key. This key can include any

characters, but must be identical throughout your cluster.

Your secret cluster-protection key: xxxx <== enter a key

Your key is 5 characters long.

(in the future, please consider a longer one)

To allow your users to send batch-jobs to other nodes in the cluster,

you must set up a secret batch-client key. This key can include any

characters, but must match the 'batch-server' key on the node(s) that

can receive batch-jobs from this node.

Your secret batch-client key: xxxx <== enter a key

Your key is 5 characters long.

(in the future, please consider a longer one)

For this node to accept batch jobs,

you must set up a secret batch-server key. This key can include any

characters, but must match the 'batch-client' key on the sending nodes.

To make your batch-server key the same as your batch-client key, type '+'.

Your secret batch-server key: xxxx <== enter a key

Your key is 5 characters long.

(in the future, please consider a longer one)

#Save and exit

[root@rhel5 ~]# service mosix restart

[root@slave ~]# mosconf

....

#Same steps as on rhel5

#Restart the service

[root@slave ~]# service mosix restart

#Check the status

[root@slave ~]# service mosix status

This MOSIX node is: 192.168.100.6 (no features)

Nodes in cluster:

=================

192.168.100.5: proximate

192.168.100.6: proximate

Status: Running Normally (32-bits)

Load: 0.01 (equivalent to about 0.0066 CPU processes)

Speed: 6650 units

CPUS: 1

Frozen: 0

Util: 100%

Avail: YES

Procs: Running 0 MOSIX processes

Accept: Yes, will welcome processes from here

Memory: Available 461MB/503MB

Swap: Available 0.9GB/0.9GB

Daemons:

Master Daemon: Up

MOSIX Daemon : Up

Queue Manager: Up

Remote Daemon: Up

Postal Daemon: Up

Guest processes from other clusters in the grid: 0/8

#I like to double-check that the ports are actually listening

#TCP/IP ports 249-253 and UDP/IP ports 249-250 must be available for MOSIX

[root@slave ~]# netstat -antu | grep -E "24|25"

tcp 0 0 0.0.0.0:2401 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:249 0.0.0.0:* LISTEN

tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:250 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:251 0.0.0.0:* LISTEN

tcp 0 0 0.0.0.0:252 0.0.0.0:* LISTEN

udp 0 0 0.0.0.0:249 0.0.0.0:*

udp 0 0 0.0.0.0:250 0.0.0.0:*
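#the pattern above is loose and also catches unrelated ports (2401, 25);

#a tighter check that matches only the MOSIX ports, if you prefer:

[root@slave ~]# netstat -antu | grep -E ':(249|25[0-3])[[:space:]]'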

#Done, setup complete

MOSIX-2.24.2.2/linux-2.6.26 Cluster (3) -- Application Test

#First open a terminal on both rhel5 and slave and run mon to watch the cluster

[root@rhel5 ~]# mon


#Both nodes should be idle at this point

#To get a visible effect we need a CPU-hungry script, and it must be multi-process:

#the smallest unit MOSIX can migrate is a process, not an instruction or a function,

#so a single process proves nothing no matter how high its load

[root@rhel5 ~]# cat > a.sh << EOF

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

awk 'BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}' &

EOF

[root@rhel5 ~]# chmod +x a.sh

#Run a.sh on rhel5, which spawns six awk processes

[root@rhel5 ~]# mosrun -e ./a.sh

#Watch the mon displays on both nodes: at first rhel5's load is high, then slave's load visibly rises too
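#Besides mon, standard tools give the same picture from a plain shell;

#a sketch (no MOSIX-specific commands assumed):

[root@rhel5 ~]# watch -n1 'ps -eo pid,stat,pcpu,comm | grep awk'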


#On rhel5 all six awk processes are still listed, but only three are running; the other three are in state T (stopped), so they have presumably been migrated

[root@rhel5 ~]# ps -aux | grep awk

Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ

root 25648 0.6 0.0 0 0 pts/0 T 16:16 0:00 [awk]

root 25650 0.4 0.0 0 0 pts/0 T 16:16 0:00 [awk]

root 25652 32.0 0.7 4168 3812 pts/0 R 16:16 0:37 awk BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}

root 25654 32.0 0.7 4168 3816 pts/0 R 16:16 0:37 awk BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}

root 25656 32.0 0.7 4168 3816 pts/0 R 16:16 0:37 awk BEGIN {for(i=0;i<100000;i++)for(j=0;j<100000;j++);}

root 25658 1.4 0.0 0 0 pts/0 T 16:16 0:01 [awk]

root 25665 0.0 0.1 3860 624 pts/0 R+ 16:18 0:00 grep awk

#Now run top on slave: three processes named remoted are clearly consuming CPU; these are the migrated processes running here as guests

top - 16:19:19 up 3:10, 3 users, load average: 2.78, 1.18, 0.44

Tasks: 99 total, 5 running, 94 sleeping, 0 stopped, 0 zombie

Cpu(s): 99.3%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si

Mem: 515376k total, 423576k used, 91800k free, 107980k buffers

Swap: 1048568k total, 0k used, 1048568k free, 234028k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

16929 root 20 0 4168 3936 0 R 33.2 0.8 0:48.13 remoted

16925 root 20 0 4168 3932 0 R 32.9 0.8 0:50.57 remoted

16927 root 20 0 4168 3932 0 R 32.9 0.8 0:50.13 remoted

1 root 20 0 2036 664 572 S 0.0 0.1 0:01.36 init

2 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 kthreadd

3 root RT -5 0 0 0 S 0.0 0.0 0:00.00 migratio

4 root 15 -5 0 0 0 S 0.0 0.0 0:02.00 ksoftirq

############## End of test ##############

MOSIX adds cluster-computing capability to the Linux kernel. It supports BSD/OS and Linux, and it lets any number of x86/Pentium-based servers and workstations work together. In a MOSIX cluster, users need not modify their applications, link them against special libraries, or assign them to particular nodes: MOSIX transparently hands the work to other nodes on its own.

At the heart of MOSIX are adaptive resource-management algorithms that monitor the load on each node and respond to it, improving the overall performance of all processes. MOSIX uses preemptive process migration to distribute and redistribute processes among the nodes, so that all resources are put to use. Concretely, the adaptive algorithms comprise adaptive load balancing, memory ushering, and optimized file I/O, each reacting to changes in how the cluster's resources are used, such as an uneven load distribution across nodes or heavy swapping caused by a memory shortage. In those cases MOSIX migrates processes from one node to another, balancing the load or moving a process to a node with enough free memory.

Because MOSIX is implemented inside the Linux kernel, its operation is completely transparent to applications. It can be used to build clusters of different kinds, whose machines may be identical or heterogeneous.

Unlike cluster systems such as TurboCluster, Linux Virtual Server, or LSF, a MOSIX cluster has no dedicated master node: every node is at once a home node and a serving node. For processes created locally, the node acts as their home node; for processes migrated in from remote nodes, it acts as a serving node. Nodes can therefore be added to or removed from the cluster at any time without harming running processes. Another MOSIX feature is its monitoring algorithm, which tracks each node's speed, load, available memory, IPC, and I/O rate; the system uses this information to decide which node a process should be sent to. After a process is created on a node, it runs there until that node's load crosses a threshold, at which point the process is transparently migrated to another node and continues executing there.