
Cluster
amaowolf
Configuring NFS
Configuring NFS (as root)
(1) On the master, check whether the nfs packages are already installed
[root@hadoop01 ~]# rpm -qa | grep nfs
nfs-utils-1.0.6-46
system-config-nfs-1.2.8-1
[root@hadoop01 ~]# rpm -qa | grep portmap
portmap-4.
Original · 2011-12-21 11:38:57 · 680 views · 0 comments
munge installation
MUNGE Installation Guide. Updated Mar 9, 2012 by chris.m.dunlap. Installing the Software: MUNGE requires either the Libgcrypt or OpenSSL cryptographic library. Libgc
Reposted · 2012-11-27 18:36:35 · 1482 views · 0 comments
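Installing SLURM on RedHat
The process is much the same as on Ubuntu (see "Configuring Slurm on Ubuntu"). There is no simple apt-get route, so build from source. Reference: http://www.linuxidc.com/Linux/2012-10/71552.htm The munge paths seem to differ somewhat: they are /usr/local/xxxx rather than /xxxx. More on this later. SLURM reports the error plugin_load_from_fi
Reposted · 2012-11-29 15:13:17 · 2000 views · 0 comments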
Installing openssl and configuring munge
1. Install openssl (as user caoj7)
sudo yum search openssl
sudo yum install openssl openssl-devel
2. Install munge (as root)
a) First set up ssh access between all nodes
b) ./configure --prefix=/usr/local --sysconfdir=/etc --localstatedi
Original · 2012-11-29 18:47:27 · 3179 views · 1 comment
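Installing and configuring SLURM on VMware + Ubuntu
Setting up ssh on Ubuntu: the ssh client is installed by default, but the server may have to be installed manually with sudo apt-get install openssh-server. Afterwards, run ps -e | grep ssh and check whether ssh-agent and sshd appear. If they do, the service started successfully and other nodes can reach this Ubuntu machine. Installing MUNGE on Ubuntu: SLURM needs a plugin for security management
Reposted · 2012-11-29 15:12:14 · 2770 views · 0 comments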
slurm(1): sinfo squeue scancel
1. sinfo
[caoj7@vm1 soft]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
compute*  up    infinite  4     idle  vm[2-5]
2. squeue
[caoj7@vm1 soft]$ squeue
JOBID PARTITION NAME
Original · 2012-11-30 14:54:24 · 2830 views · 0 comments
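The NODELIST value vm[2-5] in the sinfo excerpt is a compact host range. On a node with Slurm installed, `scontrol show hostnames 'vm[2-5]'` expands such an expression. Where Slurm is not available, a minimal sketch in POSIX sh might look like the following. The function name expand_hostlist is hypothetical, and the sketch assumes a single numeric range with no comma lists and no zero padding.

```shell
# expand_hostlist: expand one bracketed numeric range, e.g. "vm[2-5]",
# into space-separated host names. Hypothetical helper for illustration;
# it handles only a single "[lo-hi]" range without zero padding.
expand_hostlist() {
    case $1 in
        *\[*-*\]*)
            prefix=${1%%\[*}      # text before "["
            rest=${1#*\[}         # text after "["
            range=${rest%%\]*}    # "lo-hi"
            suffix=${rest#*\]}    # text after "]"
            lo=${range%%-*}
            hi=${range#*-}
            out=
            i=$lo
            while [ "$i" -le "$hi" ]; do
                out="${out:+$out }${prefix}${i}${suffix}"
                i=$((i + 1))
            done
            printf '%s\n' "$out"
            ;;
        *)
            # No bracket range: return the name unchanged.
            printf '%s\n' "$1"
            ;;
    esac
}
```

For example, `for h in $(expand_hostlist 'vm[2-5]'); do ssh "$h" hostname; done` would visit each node in turn.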
srun
4. srun
[caoj7@vm1 soft]$ srun -n3 -l hostname
1: vm2
0: vm2
2: vm2
[caoj7@vm1 soft]$ srun -n4 -l /bin/hostname
0: vm2
2: vm2
1: vm2
3: vm2
[caoj7@vm1 soft]$ srun -N4 -n16 --ntasks-per-core=4 -l
Original · 2012-11-30 14:55:23 · 1751 views · 0 comments
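With -l, srun prefixes each output line with the task number, and as the excerpt shows, lines arrive in no particular order (1, 0, 2). A small sketch for post-processing that output is given below. Both function names are hypothetical, and the sketch assumes each line has the form "rank: hostname".

```shell
# sort_by_rank: order "rank: text" lines numerically by task rank.
sort_by_rank() {
    sort -t: -k1,1n
}

# tasks_per_host: count how many tasks reported each host name.
# Prints "host count" pairs, one per line, sorted by host name.
tasks_per_host() {
    awk -F': ' '{ n[$2]++ } END { for (h in n) print h, n[h] }' | sort
}
```

A typical use would be `srun -n4 -l /bin/hostname | sort_by_rank`, or piping the same output into tasks_per_host to see how tasks were spread across nodes.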
salloc
1. salloc + srun
[caoj7@vm1 mpi]$ salloc -N4 srun -n16 hello
2. salloc + mpirun
[caoj7@vm1 mpi]$ salloc -N4 mpirun -np 16 hello
Original · 2012-11-30 15:03:21 · 2312 views · 0 comments
slurm.conf
#slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=node01
#ControlAddr=
#
#MailPro
Original · 2012-09-21 09:28:56 · 1019 views · 0 comments
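The excerpt mixes comment lines (starting with #) and active settings such as ControlMachine=node01. A minimal sketch for reading one active setting out of such a file is shown below. The helper name conf_get is hypothetical. It assumes one key=value pair per line and matches the key case-sensitively, which is a simplification of how Slurm itself parses the file (for instance, NodeName lines carry several pairs).

```shell
# conf_get KEY FILE: print the value of the first uncommented KEY=value line.
# Hypothetical helper; assumes one key=value pair per line.
conf_get() {
    awk -F= -v key="$1" '
        /^[[:space:]]*#/ { next }                        # skip comment lines
        $1 == key { sub(/^[^=]*=/, ""); print; exit }    # strip "KEY=" and print
    ' "$2"
}
```

For example, `conf_get ControlMachine /usr/local/etc/slurm.conf` would print node01 for a file like the one in the excerpt, while a commented-out key such as ControlAddr yields nothing.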
pdsh+pdcp
1. pdsh-2.28.tar.bz2
tar -jxvf pdsh-2.28.tar.bz2
cd pdsh-2.28
./configure --prefix=/usr/local --with-ssh --with-slurm
make
sudo make install
2. pdsh
pdsh -R ssh -w vm2,vm3,vm4,vm5 hostname
p
Original · 2012-11-30 15:54:40 · 2559 views · 0 comments
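Installing and configuring openssl
Installing openssl
# tar -zxvf openssl
# cd openssl
# ./config --prefix=/usr/local/openssl
# make
# make install
Encryption and decryption
Traditional (symmetric) encryption
openssl enc -ciphername (cipher) -k password (passphrase) -in file (file to encrypt) -out (
Reposted · 2012-09-20 13:21:47 · 2287 views · 0 comments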
Installing OpenMPI
1. Prerequisite: ssh access between nodes is set up
2. Make
./configure --with-devel-headers --with-slurm
make && make install
Config (~/.bashrc)
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib/openmpi/:$LD_LIBRARY_P
Original · 2012-11-30 10:55:30 · 1939 views · 0 comments
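One detail worth noting about the export line in the excerpt: if the variable is empty beforehand, appending it leaves a trailing colon, and the dynamic linker treats an empty entry as the current directory. Sourcing ~/.bashrc repeatedly also stacks duplicate entries. A minimal sketch that avoids both is shown below. The helper name path_prepend is hypothetical and not part of OpenMPI or Slurm.

```shell
# path_prepend DIR LIST: print LIST with DIR added at the front,
# unless DIR is already one of its colon-separated entries.
# Hypothetical helper for illustration.
path_prepend() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;              # already present
        *)        printf '%s\n' "$1${2:+:$2}" ;;     # add, no stray colon
    esac
}
```

In ~/.bashrc this could be used as `export LD_LIBRARY_PATH=$(path_prepend /usr/local/lib "${LD_LIBRARY_PATH:-}")`, repeated for each directory that is needed.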
hostname config
NETWORKING=yes
#HOSTNAME=rhel61-1
HOSTNAME=vm1
GATEWAY=192.168.2.1
GATEWAYDEV=eth0
PEERDNS=no
192.168.2.200 vm1 vm1 #rhel61-1 # NIC
192.168.2.201 vm2 vm2 #rhel61-2 # NIC
19
Original · 2012-12-11 17:30:53 · 988 views · 0 comments
Installing and restarting slurm
1. Install openssl and munge first
2. Install (as user caoj7)
./configure --prefix=/usr/local --sysconfdir=/usr/local/etc --enable-debug
make
sudo make install
2. Slurm.conf (If revised, slurmctld and slu
Original · 2012-11-30 10:51:28 · 8124 views · 0 comments
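Lustre I/O performance characteristics
1 Lustre overview. Lustre is a cluster-oriented storage architecture: an open-source cluster (parallel) file system for Linux that provides a POSIX-compatible file system interface. Its two defining features are high scalability and high performance. It can support tens of thousands of client systems, petabytes of storage capacity, and hundreds of GB/s of aggregate I/O throughput. Lustre is a scale-out storage architecture: thanks to its strong horizontal scaling, total storage capacity and performance can be expanded simply by adding servers. Lustre's clustered and parallel archi
Reposted · 2012-10-31 08:06:41 · 1881 views · 0 comments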
SLURM and OpenMPI
1) The MpiDefault configuration parameter in slurm.conf establishes the system default MPI to be supported. The srun option --mpi= (or the equivalent environment variable SLURM_MPI_TYPE) can be used
Original · 2012-10-30 10:12:08 · 1886 views · 0 comments
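The excerpt names three places where the MPI type can come from. A rough sketch of how they are usually layered is given below, assuming the general Slurm convention that a command-line option overrides the environment variable, which in turn overrides the slurm.conf default. The function effective_mpi_type is purely illustrative and not a Slurm command; the fallback value none is likewise an assumption about the default.

```shell
# effective_mpi_type CLI ENV CONF: print the MPI type that would win.
#   CLI  = value passed to srun --mpi= (empty if not given)
#   ENV  = value of SLURM_MPI_TYPE (empty if unset)
#   CONF = MpiDefault from slurm.conf (empty if not set)
# Illustrative only; assumes option > environment > config precedence.
effective_mpi_type() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "${3:-none}"
    fi
}
```

For example, `effective_mpi_type "" "${SLURM_MPI_TYPE:-}" openmpi` prints the environment value when it is set and openmpi otherwise.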
A first Condor installation
1. Preparation
[root@node1 /]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhos
Original · 2012-06-02 12:41:35 · 3480 views · 3 comments
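Checkpointing in Condor
Condor can run jobs in several modes, each with its own capabilities. In standard mode, Condor provides checkpointing and remote system calls. These features make job execution more reliable and let a job access resources in the same way from anywhere in the cluster. To set a program up as a standard-mode job, it must be relinked with condor_compile. Most programs can be set up as standard-mode jobs. A checkpoint image is essentially a snapshot of the job's current running state. If a job must be migrated from one machi
Original · 2012-06-02 12:46:15 · 1234 views · 0 comments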
Nodes, Sockets, Cores and FLOPS
Recently, a fellow blogger here at HPCatDell, Dr. Jeff Layton, has been running a series on PetaFLOPS for the Common Man. In that series, he writes that in the November 2009 Top500 list there are a
Reposted · 2012-06-04 15:38:22 · 671 views · 0 comments
A simple slurm.conf
[root@node002 ~]# cat /usr/local/etc/slurm.conf
#slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more informati
Original · 2012-09-03 14:32:57 · 1496 views · 0 comments
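Managing clusters with pdsh, ClusterSSH, and mussh
When managing a cluster of dozens of machines or more, we often need to run the same command or copy the same file on every machine. Three small tools help here: pdsh, ClusterSSH, and mussh. On Fedora, all three can be installed directly with yum:
yum install clusterssh pdsh pdsh-rcmd-ssh pdsh-rcmd-rsh mussh
On other Linux
Reposted · 2012-09-04 16:52:50 · 708 views · 0 comments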
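Running commands across a cluster with pdsh
This small tool runs a command on many cluster nodes at once. For example, to run hostname on node_1 through node_9:
/usr/bin/pdsh -R ssh -w node_[1-9] hostname
-R: selects the remote command module. The default is rsh; this example uses ssh. Using ssh requires installing pdsh-rcmd-ssh separately, and passwordless ssh access must be configured beforehand if you want it.
-w:
Reposted · 2012-09-04 16:28:55 · 2342 views · 0 comments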
SSH login without password
For the root user
[root@A ~]# vi /etc/hosts
#[IP address] [hostname]
192.168.1.X A
192.168.1.Y B
Generate authentication keys and distribute them
[root@A ~]# ssh-keygen -t rsa
[root@A ~]# cp ~
Reposted · 2011-12-05 09:58:52 · 1008 views · 0 comments
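A common reason key-based login still prompts for a password after the keys have been copied is file permissions: with sshd's StrictModes setting enabled, keys in a directory that other users can write to are ignored. The sketch below tightens the conventional modes. The helper name fix_ssh_perms is hypothetical, and the excerpt itself is cut off before it reaches this step, so this is an assumption about what usually follows rather than the article's own procedure.

```shell
# fix_ssh_perms [DIR]: restrict DIR (default ~/.ssh) to mode 700 and
# DIR/authorized_keys, if present, to mode 600.
# Hypothetical helper; returns non-zero if DIR is missing or chmod fails.
fix_ssh_perms() {
    dir=${1:-$HOME/.ssh}
    [ -d "$dir" ] || return 1
    chmod 700 "$dir" || return 1
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}
```

It would be run on the target host (B in the excerpt) for the account being logged into, for example `fix_ssh_perms /root/.ssh`.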
Fixing "Host key verification failed." in ssh
When connecting to a Linux host over ssh, the message "Host key verification failed." may appear and the connection fails. The output may look like this:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ WARNING: REMOTE HOST IDENTIFICATION HAS CHAN
Reposted · 2012-09-05 11:17:53 · 992 views · 0 comments
lustre-01: Introduction
2. Components
a) MDS: MDS target (MDT).
b) OSS (Object Storage Server): OSS targets (OSTs)
c) client: Metadata Client (MDC), Object Storage Client (OSC), an
Original · 2012-10-16 14:47:33 · 717 views · 0 comments
lustre02: Installation
1. Required Tools and Utilities
a) e2fsprogs (see the server and client installation below)
b) which perl
c) ssh / pdsh (recommended, optional)
./configure --prefix=/usr/local
Original · 2012-10-16 14:53:43 · 1271 views · 0 comments
Lustre vs. HDFS
1. Challenges of Hadoop + HDFS
Hadoop cannot guarantee that every task is data-local.
Saving large MapTask outputs in the local Linux file system creates an OS/disk I/O bottleneck.
Reduce nodes need to use HTTP to s
Reposted · 2012-10-18 16:21:38 · 2994 views · 0 comments
What is the difference between a cluster and an MPP supercomputer architecture?
The difference is huge. I'll spare you the LMGTFY answer, although you really should try that sometime. In a cluster, each machine is largely independent of the others in terms of memory, disk, etc
Reposted · 2012-10-19 14:02:55 · 1083 views · 0 comments
SLURM Scheduler
1. src/plugins/sched/
built-in: initiates jobs strictly in priority order, typically first-in-first-out
backfill: initiates a lower-priority job if doing so does not delay the expecte
Original · 2012-10-30 10:04:22 · 1075 views · 0 comments
Installing munge
Installing the Software: MUNGE requires either the Libgcrypt or OpenSSL cryptographic library. Libgcrypt is licensed under the LGPL, whereas OpenSSL is licensed under dual original-BSD-style licen
Reposted · 2013-01-10 09:11:07 · 5049 views · 0 comments