Twitter & Performance: An update

Summary: On Monday, a fault in the database that stores Twitter user records caused the users table to hang on a running query, locking most of the table and affecting sign-ups, sign-ins, profile updates, and API responses. The fault was addressed by force-restarting the database server, and ultimately the partially locked users database was replaced with a fully available copy, fixing the database and the related issues. To prevent similar problems, steps have been taken to detect and respond more quickly, and longer-term work is planned, including a move to a dedicated data center, to improve Twitter's reliability and stability.

On Monday, a fault in the database that stores Twitter user records caused problems on both Twitter.com and our API. The short, non-technical explanation is that a mistake led to some problems that we were able to fix without losing any data.

While we were able to resolve these issues by Tuesday morning, we want to talk about what happened and use this as an opportunity to discuss the recent progress we’ve made in improving Twitter’s performance and availability. We recently covered these topics in a pair of June posts here and on our company blog.

Riding a rocket
Making sure Twitter is a stable platform and a reliable service is our number one priority. The bulk of our engineering work is currently focused here, and we have moved resources from other important projects to address the issue.

As we said last month, keeping pace with record growth in Twitter’s user base and activity presents some unique and complex engineering challenges. We frequently compare the tasks of scaling, maintaining, and tweaking Twitter to building a rocket in mid-flight.

During the World Cup, Twitter set records for usage. While the event was happening, our operations and infrastructure engineers worked to improve the performance and stability of the service. We have made more than 50 optimizations and improvements to the platform, including:

  • Doubling the capacity of our internal network;
  • Improving the monitoring of our internal network;
  • Rebalancing the traffic on our internal network to redistribute the load;
  • Doubling the throughput to the database that stores tweets;
  • Making a number of improvements to the way we use memcache, improving the speed of Twitter while reducing internal network traffic; and,
  • Improving page caching of the front and profile pages, reducing page load time by 80 percent for some of our most popular pages.

So what happened Monday?
While we’re continuously improving the performance, stability and scalability of our infrastructure and core services, there are still times when we run into problems unrelated to Twitter’s capacity. That’s what happened this week.

On Monday, our users database, where we store millions of user records, got hung up running a long-running query; as a result, most of the table became locked. The locked users table manifested itself in many ways: users were unable to sign up, sign in, or update their profile or background images, and responses from the API were malformed, rendering them unusable to many API clients. In the end, this affected most of the Twitter ecosystem: our mobile, desktop, and web-based clients, the Twitter support and help system, and Twitter.com.

To remedy the locked table, we force-restarted the database server in recovery mode, a process that took more than 12 hours (the database covers records for more than 125 million users -- that’s a lot of records). During the recovery, the users table and related tables remained unavailable. Unfortunately, even after the recovery process completed, the table remained in an unusable state. Finally, yesterday morning we replaced the partially-locked user db with a copy that was fully available (in the parlance of database admins everywhere, we promoted a slave to master), fixing the database and all of the related issues.
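For readers curious what a promotion like that involves at the database layer, here is a minimal sketch, assuming a conventional MySQL master/slave setup; the host name, credentials, and the final repointing step are illustrative assumptions, not a description of the exact commands we ran.

```python
# Hedged sketch: promoting a MySQL slave to serve as the new master.
# Assumes classic MySQL replication commands (STOP SLAVE, SHOW SLAVE STATUS);
# the host name and credentials are placeholders, not our actual setup.
import pymysql

REPLICA_HOST = "users-db-replica.example.internal"  # hypothetical host

conn = pymysql.connect(host=REPLICA_HOST, user="admin", password="secret",
                       cursorclass=pymysql.cursors.DictCursor)

with conn.cursor() as cur:
    # 1. Confirm the replica has applied everything it received from the old master.
    cur.execute("SHOW SLAVE STATUS")
    status = cur.fetchone()
    assert status["Seconds_Behind_Master"] == 0, "replica still catching up"

    # 2. Stop replication so the replica no longer follows the failed master.
    cur.execute("STOP SLAVE")

    # 3. Allow writes on the promoted box.
    cur.execute("SET GLOBAL read_only = 0")

conn.close()
# 4. Repoint the application's database configuration at REPLICA_HOST
#    (handled in the application / service-discovery layer, not shown here).
```

Most of the operational difficulty is in step 4, which is why being prepared to do it quickly matters so much.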

We have taken steps to ensure we can more quickly detect and respond to similar issues in the future. For example, we are prepared to more quickly promote a slave db to a master db, and we put additional monitoring in place to catch errant queries like the one that caused Monday’s incidents.
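As an illustration of what that kind of monitoring can look like, here is a minimal sketch that polls MySQL’s process list and flags long-running queries; the 60-second threshold, host name, and alerting hook are assumptions for the example, not a description of our actual tooling.

```python
# Hedged sketch: flag queries that have been running longer than a threshold.
# Threshold, host, and alert mechanism are illustrative placeholders.
import time
import pymysql

SLOW_QUERY_SECONDS = 60  # hypothetical threshold

def check_for_errant_queries(conn):
    with conn.cursor(pymysql.cursors.DictCursor) as cur:
        cur.execute("SHOW FULL PROCESSLIST")
        for row in cur.fetchall():
            # Skip idle connections and replication threads; look only at live queries.
            if row["Command"] == "Query" and row["Time"] > SLOW_QUERY_SECONDS:
                alert(f"Long-running query ({row['Time']}s) on {row['db']}: {row['Info']}")

def alert(message):
    # Placeholder: in practice this would page an on-call engineer or open an incident.
    print("ALERT:", message)

if __name__ == "__main__":
    conn = pymysql.connect(host="users-db.example.internal",  # hypothetical host
                           user="monitor", password="secret")
    while True:
        check_for_errant_queries(conn)
        time.sleep(30)  # poll every 30 seconds
```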

Long-term solutions
As we said last month, we are working on long-term solutions to make Twitter more reliable (news that we are moving into our own data center this fall, which we announced this afternoon, is just one example). This will take time, and while there has been short-term pain, our capacity has improved over the past month.

Finally, despite the rapid growth of our company, we’re still a relatively small crew maintaining a comparatively large (rocket) ship. We’re actively looking for engineering talent, with more than 20 openings currently. If you’re interested in learning more about the problems we’re solving or “joining the flock,” check out our jobs page.
