Some storage concepts and how they relate to VMware

This article explores how ALUA lets a mid-range array behave more like an enterprise active-active array, and discusses path selection policies such as Fixed (ALUA) and Round Robin, including where each applies and their advantages and drawbacks.

Active-Active vs. Active-Passive

The default owner refers to which storage processor owns a LUN and is responsible for sending IOs to the backend disk.  This is a critical differentiator between an enterprise and a mid-range array.  Simply put, on an enterprise active-active array any storage processor can send IO to any backend disk, whereas on a mid-range active-passive array only one SP can send IO to a given disk at a time.  The two SPs can, however, run active-passive for some LUNs and passive-active for others, which allows you to balance IO across the different SPs.

Here is where it gets a bit more complicated: something called ALUA (Asymmetric Logical Unit Access) now lets a mid-range array work similarly to an enterprise active-active array.  Note that ALUA is enabled by default when registering ESX hosts to vSphere 4.1 with EMC CX/NS FLARE 29+.  It allows ESX/i to send IO down any path to a storage processor; that processor will accept the IO and either send it to the backend disk directly or internally redirect it to its partner SP, which then sends it to the backend disk.  Without ALUA, receiving IO down different paths would cause a ping-pong effect where the LUN changed owners constantly in order to service IO (very bad).  With ALUA, this does not happen.  What will happen, however, is that after a certain threshold the SP will recognize that it would be more efficient for its partner SP to own the LUN (and thus the backend disk for the LUN) and will pass its “current owner” status to its partner.
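
To make that behavior concrete, here is a minimal Python sketch of a two-SP mid-range array under ALUA. The redirect counter and threshold value are purely illustrative assumptions; they do not reflect FLARE's actual internal logic.

```python
# Illustrative sketch of ALUA on a two-SP mid-range array.
# REDIRECT_THRESHOLD is a made-up number, not FLARE's real heuristic.
REDIRECT_THRESHOLD = 1_000

class Lun:
    def __init__(self, lun_id, default_owner):
        self.lun_id = lun_id
        self.default_owner = default_owner   # SP set as owner at array bootup
        self.current_owner = default_owner   # SP currently servicing backend IO
        self.redirects = 0

    def receive_io(self, receiving_sp):
        """IO arrives at receiving_sp down whatever path the host chose."""
        if receiving_sp == self.current_owner:
            return "optimized"               # direct path to backend disk
        # ALUA: the non-owning SP accepts the IO and forwards it to its
        # partner over the internal link (extra hop, extra CPU, less cache).
        self.redirects += 1
        if self.redirects >= REDIRECT_THRESHOLD:
            # Cheaper to move ownership than to keep redirecting forever.
            self.current_owner = receiving_sp
            self.redirects = 0
            return "trespassed"
        return "redirected"

lun = Lun(5, default_owner="SPA")
print(lun.receive_io("SPA"))                 # optimized: SPA owns the LUN
for _ in range(REDIRECT_THRESHOLD):
    result = lun.receive_io("SPB")           # non-optimized path, redirected
print(result, lun.current_owner)             # trespassed SPB
```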

 

So, in summary, ALUA enables a mid-range array to behave more like an active-active array from ESX/i's perspective.  But let's look at some more details in the stack.

 

Pathing choices

The goal, as I see it, for running an efficient storage stack over block protocols is to maximize IO capability while minimizing management overhead and risk.  In simple terms, EMC has software called PowerPath/VE that, in my opinion, alleviates most of what I will describe next.

 

Trespasses

We've described a few things, and the most critical one we are going to move forward with is LUN ownership, whether current or default.  Current refers to which SP can send IO to a LUN (and to the LUN's backend disk); default refers to which SP is set as the owner at array bootup, i.e., in a non-trespassed situation.  A LUN being in a trespassed state simply means the LUN is currently owned by the SP that is not its default owner.
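
The distinction reduces to a one-line check, sketched here in Python for illustration:

```python
def is_trespassed(current_owner: str, default_owner: str) -> bool:
    """A LUN is trespassed when the SP servicing it is not its default owner."""
    return current_owner != default_owner

print(is_trespassed("SPB", "SPA"))  # True: trespassed
print(is_trespassed("SPA", "SPA"))  # False: owned by its default SP
```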

 

The question at this point is: why would a LUN enter a trespassed state?  There is a list of reasons why this may happen; let's start from a general point and move into more cluster- and VMware-specific reasons.  Trespasses can happen manually, can be initiated by the array, or can be caused by ESX/i operations.  From a manual perspective, at any point you can force a trespass (move the LUN to the opposite SP) from the GUI/CLI of the storage array.  What's important to understand here is that even manual trespassing decisions can be overridden right away by the array or the hypervisor.  I might force a manual trespass from the GUI/CLI to un-trespass a LUN, or because I know I will be removing paths and want to force IO down a certain SP.  On the more automatic side, a LUN can be trespassed by an SP under many conditions.  An NDU ("non-disruptive upgrade") of an array will cause LUNs to be trespassed from one SP to the other to ensure LUN access throughout the upgrade.  After the upgrade, the LUNs may sit solely on one SP, and those LUNs whose current owner does not equal their default owner are considered trespassed.  There are many other array-side reasons as well.

 

Moving into the VMware world, we are talking about a cluster of hosts, each making its own decision as to which path should be used.  These hosts use predetermined decision trees to set these paths, but the decision trees run on each host at different times.  For example, with FIXED ALUA pathing, when an ESX/i server boots it assigns the active path based on the current owner of the LUN.  The LUN may then be trespassed for some reason, and another ESX/i host later applies the same decision tree and decides its active path will go down the new current-owner SP.  At this point you have two ESX/i servers that made pathing choices down different SPs.  Is that a problem?  Not a huge one, since ALUA allows both SPs to receive IO.  The problem comes in when the SPs trespass the LUN in the backend once thresholds are met, deciding it is more efficient to service the LUN from the partner SP, and your predetermined balance of LUNs per SP is thrown off.  As well, there is extra overhead in sending IO through more channels; the internal link between the SPs is an extra hop (more CPU cycles, less use of cache).
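
A hedged sketch of that timing problem, reusing the illustrative model from above: under FIXED, each host snapshots the LUN's current owner at boot and never revisits it, so a trespass between two boots leaves the cluster split.

```python
# Sketch of FIXED ALUA path selection at host boot (illustrative only).
# Each host records the LUN's current owner at the moment it boots and
# keeps that as its active path until its next reboot.

class FixedPathHost:
    def __init__(self, name):
        self.name = name
        self.active_path = {}            # lun_id -> SP chosen at boot

    def boot(self, luns):
        for lun_id, current_owner in luns.items():
            self.active_path[lun_id] = current_owner

array_view = {5: "SPA"}                  # LUN 5 currently owned by SPA

esx1 = FixedPathHost("esx1")
esx1.boot(array_view)                    # esx1 picks SPA for LUN 5

array_view[5] = "SPB"                    # LUN 5 is trespassed to SPB

esx2 = FixedPathHost("esx2")
esx2.boot(array_view)                    # esx2 picks SPB for the same LUN

print(esx1.active_path[5], esx2.active_path[5])  # SPA SPB -> split pathing
```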

 

Other situations you may run into where this would happen: an array just completed an NDU, and all LUNs are currently owned (though not default-owned) by the secondary SP (it is the first to upgrade, so it ends up owning the LUNs at the end of the NDU).  If you then boot up your ESX/i hosts, they will set their active paths toward that one SP, so some of your LUNs will be trespassed and, with FIXED pathing, stuck that way.  One more example, and this one may actually be the most relevant: if even one host in a cluster of any size lacks established pathing to an SP, all of its IO will attempt to traverse the one path it has for that LUN.  Sounds bad?  It is; in reality, all it takes is one misconfigured host to throw a cluster out of balance.  This is why we suggest reviewing best practices and ensuring you have four paths to a mid-range array, two to each SP (a quick check is sketched below).  So there we have it: a few situations where systems making pathing decisions independently of an authoritative source cause inconsistent pathing to mid-range arrays.
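
A quick, hypothetical sanity check in the spirit of that best practice: every host should see four paths per LUN, two to each SP. The data shape here is invented for illustration; in practice you would pull this information from vCenter or the host CLI.

```python
# Hypothetical check for the "4 paths per LUN, 2 per SP" best practice.
# paths_by_host maps host -> lun_id -> the SP behind each discovered path.
paths_by_host = {
    "esx1": {5: ["SPA", "SPA", "SPB", "SPB"]},   # correct: 2 + 2
    "esx2": {5: ["SPA", "SPA", "SPA", "SPB"]},   # misconfigured: 3 + 1
}

for host, luns in paths_by_host.items():
    for lun_id, sps in luns.items():
        per_sp = {sp: sps.count(sp) for sp in ("SPA", "SPB")}
        if per_sp["SPA"] != 2 or per_sp["SPB"] != 2:
            print(f"{host} LUN {lun_id}: unbalanced paths {per_sp}")
```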

 

Ok, so I have a lot of trespassing going on, how do I fix it?

 

It depends =).. The larger the ESX/i environment, the more challenging this can be to fix.  If I have a cluster of four hosts, I can fairly easily go through and adjust active paths to match via the vCenter GUI.  But the larger the cluster, the more difficult it gets: four hosts times four datastores yields 16 checks.  Scale that up to 30 hosts and 30 datastores and that's 900 checks.. Ouch!
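
The check count grows multiplicatively with cluster size, as a one-liner makes plain:

```python
def manual_checks(hosts: int, datastores: int) -> int:
    """Every host/datastore pair needs its active path verified by hand."""
    return hosts * datastores

print(manual_checks(4, 4))    # 16
print(manual_checks(30, 30))  # 900
```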

 

The Easy Solution

PowerPath/VE is the slam dunk for this.  If we are set on using block protocols for datastore access and we don't want to think about path management, then PP/VE is your software.  We describe it as path-management software that adaptively manages paths.  Simply put, VMware's NMP (Native Multipathing) does not make decisions based on authoritative information, and I believe it's critical in larger environments to do so.  PP/VE uses array-side information to make pathing decisions dynamically for you.  In essence, paths that are uncongested and available are used at all times, and path choices adapt as conditions change on the fan-in/target array ports.  In my opinion, this is far and away the most comprehensive way to ensure optimal block access to a storage array for VMware.
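
To illustrate the general idea (not PowerPath/VE's actual algorithm, which is proprietary), an adaptive policy might weigh each available path by an observed congestion measure and send the next IO down the least-loaded live path:

```python
# Illustrative adaptive path choice; not PowerPath/VE's real algorithm.
# Each path carries a live congestion measure (e.g., outstanding IOs).
paths = [
    {"name": "vmhba1:C0:T0:L5", "alive": True,  "outstanding_ios": 12},
    {"name": "vmhba1:C0:T1:L5", "alive": True,  "outstanding_ios": 3},
    {"name": "vmhba2:C0:T0:L5", "alive": False, "outstanding_ios": 0},
    {"name": "vmhba2:C0:T1:L5", "alive": True,  "outstanding_ios": 7},
]

def pick_path(paths):
    """Choose the least-congested live path for the next IO."""
    live = [p for p in paths if p["alive"]]
    return min(live, key=lambda p: p["outstanding_ios"])

print(pick_path(paths)["name"])  # vmhba1:C0:T1:L5
```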

 

The reboot it method

Since the decision tree runs when an ESX/i server first boots, one option to fix your active pathing without using a GUI/CLI is to place the host into maintenance mode and reboot it.  Not a bad method, since there is no effect on VMs: entering maintenance mode causes ESX/i to migrate VMs online to another host.  When rebooting, however, you need to ensure each LUN is owned by its default owner so the pathing decision comes out correct.  In other words, before booting any ESX/i server back up, make sure your LUNs are currently owned by the right SP!  A manual trespass of one LUN, or of all LUNs, back to their default-owner SP can be done via the array's CLI/GUI.
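
The ordering matters, so here is a hedged sketch of the procedure. The helper functions are hypothetical stand-ins for array CLI/GUI and vCenter operations, not real APIs:

```python
# Hypothetical orchestration of the "reboot it" method; the helpers are
# stand-ins for array CLI/GUI and vCenter operations.

def trespass_to_default(lun):
    print(f"trespassing LUN {lun['id']} back to {lun['default_owner']}")
    lun["current_owner"] = lun["default_owner"]

def enter_maintenance_mode(host):
    print(f"{host}: entering maintenance mode (VMs migrate online)")

def reboot(host):
    print(f"{host}: rebooting; boot-time path selection re-runs")

def exit_maintenance_mode(host):
    print(f"{host}: exiting maintenance mode")

def fix_cluster_pathing(hosts, luns):
    # Step 1: put every LUN back on its default-owner SP first, so each
    # host's boot-time decision tree picks the intended active path.
    for lun in luns:
        if lun["current_owner"] != lun["default_owner"]:
            trespass_to_default(lun)
    # Step 2: one host at a time, evacuate and reboot.
    for host in hosts:
        enter_maintenance_mode(host)
        reboot(host)
        exit_maintenance_mode(host)

fix_cluster_pathing(
    hosts=["esx1", "esx2"],
    luns=[{"id": 5, "current_owner": "SPB", "default_owner": "SPA"}],
)
```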

 

The scripting method

Not for the faint of heart, and I really can't support it.  See the following link for a script I wrote that chooses active paths across an ESX/i cluster based on authoritative information from the array (the default owner): https://community.emc.com/thread/113885?tstart=0.

 

Round robin

It is possible to fix trespassing conditions by putting a host in maintenance mode, switching to Round Robin pathing, and then trespassing LUNs to their appropriate owners.  When Round Robin runs its decision tree for pathing choices, it chooses paths to the current owner as its active paths, so it is susceptible to imbalance much like FIXED.  It acts a bit differently under certain conditions, however.  Under FIXED, the active path never changes on its own.  Under RR, the active paths will change when a LUN is trespassed via user intervention.  So if you're digging into ALUA and RR, you might assume that after I trespass a LUN from the array, the ESX/i server will keep sending traffic down whatever was set as the active paths.  This is not the case: ALUA keeps active paths static only during array-initiated trespass conditions.  So all of the manual/scripting work to balance paths can be achieved by simply enabling RR and ensuring there are no trespasses on the array.  All in all, a pretty easy solution if you're willing to go to RR!
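
A hedged side-by-side of the two policies' reaction to a user-initiated trespass, using the same illustrative model as the earlier sketches:

```python
# Illustrative contrast of FIXED vs Round Robin after a manual trespass.

class Policy:
    def __init__(self, name, follows_manual_trespass):
        self.name = name
        self.follows_manual_trespass = follows_manual_trespass
        self.active_sp = None

    def boot(self, current_owner):
        self.active_sp = current_owner          # both policies start here

    def on_manual_trespass(self, new_owner):
        # RR re-points its active paths at the new current owner;
        # FIXED keeps whatever it chose at boot.
        if self.follows_manual_trespass:
            self.active_sp = new_owner

fixed = Policy("FIXED", follows_manual_trespass=False)
rr = Policy("RR", follows_manual_trespass=True)

for policy in (fixed, rr):
    policy.boot(current_owner="SPA")
    policy.on_manual_trespass(new_owner="SPB")
    print(policy.name, policy.active_sp)        # FIXED SPA / RR SPB
```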

 

So what’s EMC’s official pathing stance for the mid-range?

The discussion above mostly focused on the storage stack, framing a problem and solution around FIXED ALUA and PP/VE.  It is important to note that FIXED ALUA is what ESX/i 4/4.1 will currently choose automatically when a hypervisor first boots against EMC CX/NS arrays.  EMC's best practice is to use ROUND ROBIN (the only caveat is for arrays running multiple iSCSI initiators on pre-FLARE 30 code).  RR is a common best practice among array vendors, and EMC is no different.  We do, however, highly suggest using PP/VE instead of Round Robin to attain adaptive load balancing.

 

 

Original source:

http://blog.sina.com.cn/s/blog_86ca10130100x4s7.html
