Overcoming the I/O Bottleneck with General Parallel File System
By Andrew Naiberg
It used to be that I/O was faster than computation. In fact, not too long ago a supercomputer could be loosely defined as any machine that turned a compute-bound problem into an I/O-bound problem. However, dramatic increases in CPU, memory and bus speeds have turned this relationship on its head—now disk I/O is usually the critical factor limiting application performance and the ability to share data across a computing cluster. First seen in scientific supercomputers, this I/O bottleneck is now common in many data-intensive business applications such as digital media, financial analysis, business intelligence, engineering design, medical imaging and geographic/geological analysis. And with data volumes, CPU speeds and interconnect speeds all still increasing, the I/O bottleneck will likely only get worse.
 
Parallel file systems (also known as cluster file systems) have emerged as a powerful solution to the I/O bottleneck, and IBM’s General Parallel File System (GPFS) is among the best. Originally developed for digital-media applications, GPFS now powers many of the world’s most powerful scientific supercomputers, holds the world’s terabyte sort record and has won several other performance awards. Parallel file systems offer three primary advantages over traditional distributed and SAN file systems:
 
• High bandwidth—Parallel file systems are very effective when distributed or SAN file systems can’t deliver the aggregate bandwidth required for the environment. Where network file systems typically deliver less than 125 MB/second and SAN file systems top out around 500 MB/second, GPFS has delivered 15 GB/second on a single node. Moreover, GPFS can scale this performance as more nodes are attached, delivering enormous aggregate bandwidth. In addition to its world terabyte sort record, GPFS won awards for both the highest bandwidth and the most I/Os per second (running a real application) at the 2004 Supercomputing Conference.
 
• Data sharing—Another key advantage of parallel file systems is that all of the attached nodes have equal access to all of the data on the underlying disks, making parallel file systems ideal for cluster environments where many users or applications work with the same data (e.g., many engineers can share a single set of design files). And GPFS recently added unique “multi-cluster” support, enabling data sharing and collaboration across interconnected clusters; this capability is currently being used to share data and results across a consortium of European research centers.
 
• High reliability without bottlenecks—Unlike distributed file systems, which transfer all of the data through a single server and path, parallel file systems aren’t client/server designs and employ redundant paths, allowing configurations that eliminate all single points of failure; if one path fails, data can flow via another. Even SAN file systems, which don’t transfer all of the data through a single data server, typically must access a single metadata server to initiate a transfer, again impacting performance. And despite efforts to alleviate these bottlenecks with simple mechanisms to split the load among multiple data or metadata servers, there’s simply no good way to prevent overload or failure with these designs. By contrast, GPFS stores both data and metadata across any or all of the disks in the cluster, so there’s no single data server, metadata server or data path to act as a bottleneck or single point of failure.
 
These capabilities bring several others with them, including high scalability and the ability for multiple users or applications to access different parts of a single file simultaneously. GPFS currently supports production clusters of more than 2,200 nodes and file systems comprising more than 1,000 disks and hundreds of terabytes of storage.
 
How Does GPFS Do It?
The key to GPFS’s bandwidth is that it divides each file into multiple blocks and “stripes” (stores) these blocks across all of the disks in a file system (see Figure 1, below). To read or write a file, GPFS initiates multiple I/Os in parallel, so the transfer completes quickly. In addition, each block is much larger than in a traditional file system—typically 256 KB and as large as 1,024 KB. This enables GPFS to transfer large amounts of data with each operation and to reduce the effect of seek latency.
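To make the striping arithmetic concrete, here is a minimal sketch in Python (not GPFS code). It assumes a 256 KB block size, eight disks and a simple round-robin placement policy; the point is only to show how one large request fans out into many independent disk I/Os.

# Illustrative round-robin striping, not GPFS's actual allocation logic.
BLOCK_SIZE = 256 * 1024   # assumed 256 KB block size
NUM_DISKS = 8             # assumed number of disks in the file system

def locate(offset):
    """Map a byte offset in a file to (disk index, block number on that disk)."""
    file_block = offset // BLOCK_SIZE         # which logical block of the file
    disk = file_block % NUM_DISKS             # round-robin across the disks
    block_on_disk = file_block // NUM_DISKS   # position within that disk
    return disk, block_on_disk

def plan_read(offset, length):
    """List the (disk, block) pairs a read touches; each pair can be issued
    as an independent I/O, so large requests proceed in parallel."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return [locate(b * BLOCK_SIZE) for b in range(first, last + 1)]

# A 2 MB read touches all eight disks once, so it runs at roughly the
# aggregate bandwidth of the disks rather than the speed of any one disk.
print(plan_read(0, 2 * 1024 * 1024))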
 
To keep track of all of these blocks and ensure data integrity, GPFS implements a distributed byte-range locking mechanism. The “distributed” part synchronizes file system operations across compute nodes so that, although file system management is spread across many machines for optimal performance, the entire file system looks like a single file server to every node in the cluster. The “byte-range” part means that rather than locking an entire file whenever it’s accessed (which blocks all other access, as traditional file systems do), GPFS locks individual parts of a file separately. This enables multiple users, applications or parallel jobs to work on different parts of a single file simultaneously, which offers many benefits. For example, multiple engineers or applications can access and update a single design file at the same time, eliminating the extra storage and overhead of keeping multiple copies, not to mention the effort required to merge all of those copies into a finished product. In a broadcast news environment, video editors can work on a live video feed as it streams in from the field, accelerating the time to air.
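Because GPFS presents ordinary POSIX file semantics, the concurrent-update pattern described above can be sketched with standard byte-range locks. The Python example below is only an illustration: the file path and region size are assumptions, and GPFS’s own lock manager coordinates these ranges across every node in the cluster rather than within a single machine.

# Two updates to disjoint regions of one file, each protected by a POSIX
# byte-range lock (fcntl.lockf) rather than a whole-file lock. On a GPFS
# mount the same pattern works from different nodes at the same time.
import fcntl
import os

PATH = "/gpfs/shared/design.dat"   # hypothetical file on a GPFS mount
REGION_SIZE = 256 * 1024           # hypothetical region each writer owns

def update_region(region, payload):
    fd = os.open(PATH, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        start = region * REGION_SIZE
        # Lock only this writer's byte range, not the entire file.
        fcntl.lockf(fd, fcntl.LOCK_EX, REGION_SIZE, start, os.SEEK_SET)
        try:
            os.pwrite(fd, payload, start)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN, REGION_SIZE, start, os.SEEK_SET)
    finally:
        os.close(fd)

# These calls could run in separate processes or on separate nodes;
# because the locked ranges never overlap, neither blocks the other.
update_region(0, b"engineer A's edits")
update_region(1, b"engineer B's edits")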
 
Designed for High Reliability
In addition to eliminating the single points of failure associated with individual servers or paths (as previously discussed), GPFS is also designed to accommodate hardware failures. In fact, several commercial customers use GPFS primarily for its reliability rather than its performance. To protect against failure of a compute node, each node logs all of its updates and stores them on shared disks just like all of the other data and metadata. If a node fails, another node can access the failed node’s log to determine what updates were in progress and restore the affected file(s). These files can be accessed normally once they’re consistent again. To protect against disk failures, GPFS can stripe its data across RAID disks and be configured to store copies of data and metadata on different disks. Finally, GPFS doesn’t require the file system to be taken down to make configuration changes such as adding, removing or replacing disks in an existing file system or adding nodes to an existing cluster.
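The per-node journaling described above can be pictured with a deliberately simplified sketch: a node appends an intent record to its own log on shared storage before applying each update, and any surviving node can replay that log after a failure. This is a conceptual illustration under assumed paths and record formats, not GPFS’s recovery implementation.

# Simplified write-ahead logging and replay, for illustration only.
import json
import os

SHARED = "/gpfs/shared"            # hypothetical shared GPFS mount

def journal_path(node):
    return os.path.join(SHARED, node + ".journal")

def apply_update(path, offset, data):
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.pwrite(fd, data, offset)
    finally:
        os.close(fd)

def logged_write(node, path, offset, data):
    """Make the intent durable on shared disk, then apply the update."""
    record = {"path": path, "offset": offset, "data": data.decode()}
    with open(journal_path(node), "a") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())     # the intent survives a node crash
    apply_update(path, offset, data)

def recover(failed_node):
    """Run on a surviving node: replay the failed node's in-flight updates
    so the affected files are consistent and can be accessed normally."""
    journal = journal_path(failed_node)
    if not os.path.exists(journal):
        return
    with open(journal) as log:
        for line in log:
            record = json.loads(line)
            apply_update(record["path"], record["offset"], record["data"].encode())
    os.remove(journal)             # recovery complete; discard the log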
 
Highly Available Access
GPFS supports a variety of disk hardware and offers three configuration options ranging from a full-access SAN implementation for ultimate performance to a shared-disk server model that’s less expensive as the cluster gets very large. GPFS is a proven solution for virtually any environment requiring extremely reliable, high-bandwidth shared data access.
------------------------------------------------------------------------------------------------------------------ 
Andrew Naiberg is the product-marketing manager for pSeries software. He’s also been a software engineer and service delivery specialist since joining IBM in 1997. Andrew can be reached at anaiberg@us.ibm.com.
 
 
Figure 1. A file’s blocks are striped across all of the disks in a GPFS file system.