How does YARN compare to Mesos?

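This article compares YARN and Mesos, two cluster resource scheduling systems that both aim to share the resources of a large cluster among different frameworks. It covers implementation language, features, security, locality, maturity, and community support, to help readers choose the cluster manager that fits their needs.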

Reposted from: http://www.quora.com/How-does-YARN-compare-to-Mesos

Both systems have the same goal: allowing you to share a large cluster of machines between different frameworks. 

For those who don't know, NextGen MapReduce is a project to factor the existing MapReduce into a generic layer that handles distributed process execution and resource scheduling (this system is called YARN) and then implement MapReduce as an "application" on top of this.

Mesos was originally an academic research project with a very similar goal. They created a system which could run a patched version of Hadoop, MPI and other things. This has grown into an Apache Incubator project in its own right.

I have been looking into these two a bit because we would love something like this at LinkedIn, and the nature of these things is that you really only want one (since you want to run everything on it). So at the moment we don't have any real experience running stuff on top of either of these, but here is what I have pieced together (may be wrong in places):

  1. NextGen MapReduce (aka YARN) is primarily written in Java with bits of native code. Mesos is primarily written in C++.
  2. YARN only handles memory scheduling (e.g. you request x containers of y MB each), but with plans to extend it to other resources. I believe Mesos handles both memory and CPU scheduling, but I don't know the details. In practice I think the OS handles CPU scheduling pretty well, so I am not sure that would help our use cases. Supporting some kind of disk space and disk I/O scheduling and enforcement would be super cool, but I don't think either does that (yet).
  3. Mesos uses Linux containers (LXC, http://lxc.sourceforge.net), and YARN uses simple Unix processes. Linux containers give stronger isolation but may have some additional overhead.
  4. The resource request model is weirdly backwards in Mesos. In YARN you (the framework) request containers with a given specification and give locality preferences. In Mesos you get resource "offers" and choose to accept or reject those based on your own scheduling policy. The Mesos model is arguably more flexible, but seemingly more work for the person implementing the framework.
  5. YARN is a pretty epic chunk of code, including all kinds of things right down to its own web framework. It is about 3x as much code as Mesos.
  6. YARN integrates something similar to the pluggable schedulers everyone knows and loves/hates in Hadoop. So if you are used to the capacity scheduler, hierarchical queues, and all that, you can get something similar. I don't think the Mesos scheduling capabilities are quite as robust (they list hierarchical scheduling on their roadmap).
  7. YARN integrates with Kerberos and essentially inherits the Hadoop security architecture. I don't think Mesos attempts to deal with security.
  8. YARN directly handles rack and machine locality in your requests, which is convenient. In Mesos you can implement this, but it is less out of the box.
  9. Mesos is much more mature as a project at this point. It is a standalone thing, with great documentation and good starter examples. YARN exists only on Hadoop trunk (and some feature branches) in the mapreduce directory, and the docs are super sparse. That said, the Hadoop guys have been really awesome at helping us get started with YARN (thanks Arun!) and they seem really committed to making sure it works as a general purpose framework, not just for Hadoop. There seems to be a lot of momentum, it is just early.
  10. YARN is going to be the basis for Hadoop MapReduce going forward, so if you have a big Hadoop cluster and want to be able to run other stuff on it, that is likely appealing and will probably work more transparently than Mesos.
  11. YARN was written by the Yahoo/HortonWorks Hadoop team, which should know a thing or two about multi-tenancy and very large-scale cluster computing. YARN is not yet in a stable Hadoop release, so I am not sure how much actual testing it has had or the extent of deployment internally at Yahoo. Regardless, if/when the YARN team is able to get the majority of the world's Hadoop clusters successfully running on top of YARN, that will likely get the project to a level of hardening that will be hard to compete with.
  12. Mesos ships with a number of out-of-the-box frameworks ported to it. This somewhat helps to validate the generality of their framework, but I don't know how much of a hack the various ports of things to it are.
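
To make points 2, 4 and 8 a bit more concrete, here is a toy Python sketch of the two allocation styles. It is not the YARN or Mesos API: `Node`, `allocate_request`, `run_offers` and `rack_r1_policy` are all hypothetical names made up for illustration, only memory is modeled, and the real systems are far more involved. The only point is who makes the placement decision: the central scheduler in the request model, the framework in the offer model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """A hypothetical cluster machine; only free memory is tracked."""
    name: str
    rack: str
    free_mb: int


def allocate_request(nodes: List[Node], count: int, mb: int,
                     preferred_node: Optional[str] = None,
                     preferred_rack: Optional[str] = None) -> List[str]:
    """Request-style (YARN-like) sketch: the framework asks for `count`
    containers of `mb` MB with locality preferences, and the central
    scheduler picks the nodes (preferred node, then rack, then anywhere)."""
    def rank(node: Node) -> int:
        if node.name == preferred_node:
            return 0
        if node.rack == preferred_rack:
            return 1
        return 2

    placed: List[str] = []
    # sorted() is stable, so nodes with equal rank keep their input order.
    for node in sorted(nodes, key=rank):
        while len(placed) < count and node.free_mb >= mb:
            node.free_mb -= mb
            placed.append(node.name)
    return placed


def run_offers(nodes: List[Node],
               accept: Callable[[str, str, int], int]) -> List[Tuple[str, int]]:
    """Offer-style (Mesos-like) sketch: the master offers each node's free
    memory, and the framework's `accept` callback returns how many MB it
    wants from that offer (0 means reject)."""
    launched: List[Tuple[str, int]] = []
    for node in nodes:
        take = min(accept(node.name, node.rack, node.free_mb), node.free_mb)
        if take > 0:
            node.free_mb -= take
            launched.append((node.name, take))
    return launched


def rack_r1_policy(name: str, rack: str, free_mb: int) -> int:
    """Example framework policy: one 1024 MB task per offer, rack r1 only.
    Locality lives here, in framework code, not in the scheduler."""
    return 1024 if rack == "r1" and free_mb >= 1024 else 0
```

With nodes `a` (rack r1, 2048 MB), `b` (rack r1, 1024 MB) and `c` (rack r2, 4096 MB), calling `allocate_request(nodes, 3, 1024, preferred_node="b", preferred_rack="r1")` places the containers on `b`, `a`, `a`, while `run_offers(nodes, rack_r1_policy)` on a fresh copy of the same nodes launches one task each on `a` and `b` and declines the offer from `c`. In the second case the locality logic had to be written by the framework author, which is the extra work point 4 refers to.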

Here are a few pointers for folks trying to find out more about Mesos:
  1. Docs: http://www.mesosproject.org/docu...
  2. Papers: http://www.mesosproject.org/rese...
  3. Sample framework implementations: https://github.com/mesos/mesos/t...

Here are some pointers on YARN:
  1. Master JIRA: https://issues.apache.org/jira/b...
  2. Article on the new resource scheduler: http://developer.yahoo.com/blogs...
  3. Design document for YARN. This is really essential for understanding their terminology of application masters, resource manager, etc. Before we found this, just looking at code, we were lost. https://issues.apache.org/jira/s...
  4. Spark, an iterative machine learning framework, has been ported to YARN, and serves as a great example of how to do this: https://github.com/mesos/spark-yarn

There is a thread on the Mesos mailing list that discusses differences further:  http://mail-archives.apache.org/...
