Programmer’s dilemma

本文探讨了资深程序员在长期服务于稳定技术栈的大公司后,可能会遭遇的职业发展瓶颈——即所谓的“专家陷阱”。作者通过面试经历揭示了许多资深程序员在面对基本技术问题时表现出的能力缺失,并分析了造成这一现象的原因。此外,还提出了个人与团队层面的解决方案。

Recently I interviewed tens of candidates for a kernel programmer’s position. These candidates are from big, good companies, which are famous for chips or embedded OS/systems. Many of them claimed they have at least 10 years on-job experience on kernel. Their resumes look fairly shiny — all kinds of related projects, buzz words and awards…

But most of them cannot answer a really basic question: When we call the standard malloc function, what happens in kernel?

Don’t be astonished. When I ask one of the candidate to write a simple LRU cache framework based on glib hash functions, he firstly claimed he had never used glib — that’s what I expected — I showed the glib hash api page and explained the APIs to him in detail, then after almost an hour he wrote only a few lines of messy code.

I don’t know if the situation is similar in other countries, but in China, or more specifically, in Beijing, this is reality. “Senior” programmers who worked for big, famous foreign companies for years cannot justify themselves in simple, fundamental problems.


Why did this happen?

The more I think about it, the more I believe it is caused not only by themselves but also by the companies they worked for. These companies usually provide stable stack of code, which has no significant changes for years. The technologies around the code wraps up people’s skills, so that they just need to follow the existing path, rather than to be creative. If you happened to work for such kind of code for a long period and did not reach to the outer world a lot, one day you will find yourself to be in a pathetic position — they called you “EXPERT” inside the team or company, yet you cannot find an equally good job in the market unfortunately.

This is so called “Expert Trap”. From day to day, we programmers dreamed of being an expert inside the team/company; however, when that day really comes we trapped ourselves. The more we dig into existing code, the deeper we trapped into it. We gradually lose our ability to write complete projects from scratch, because the existing code is so stable (so big/so profitable). What’s the worse, if our major work is just to maintain the existing code with little feature development, after a while, no matter how much code we’ve read and studies, we will find we cannot write code — even if the problem is as simple as a graduate school assignment. This is the programmer’s dilemma: we make our living by coding, but the big companies who fed us tend to destroy our ability to make a living.


How to get away from this dilemma?

For personal —

First of all, Do your own personal projects. You need to “sharpen your saw” continuously. If the job itself cannot help you do so, pick up the problems you want to concur and conquer it in your personal time. By doing so, most likely you will learn new things. If you publish your personal projects, say in github, you may get chances to know people who may pull you away from your existing position.

Do not stay in a same team for more than two years. Force yourself to move around, even if in the same organization, same company, you will face new challenges and new technologies. Try to do job interviews every 18 months. You don’t need to change your job, but you can see what does the market require and how you fit into it.

For team/company —

Give pressures and challenges to the employees. Rotate the jobs, let the “experts” have chance to broaden their skills. Start new projects, feed the warriors with battles.

Hold hackathon periodically. This will help to build a culture that embrace innovation and creation. People will be motivated by their peers — “gee, that bustard can write such a beautiful framework for 24 hours, I gotta work hard”.

本文旨在系统阐述利用MATLAB平台执行多模态语音分离任务的方法,重点围绕LRS3数据集的数据生成流程展开。LRS3(长时RGB+音频语音数据集)作为一个规模庞大的视频与音频集合,整合了丰富的视觉与听觉信息,适用于语音识别、语音分离及情感分析等多种研究场景。MATLAB凭借其高效的数值计算能力与完备的编程环境,成为处理此类多模态任务的适宜工具。 多模态语音分离的核心在于综合利用视觉与听觉等多种输入信息来解析语音信号。具体而言,该任务的目标是从混合音频中分离出不同说话人的声音,并借助视频中的唇部运动信息作为辅助线索。LRS3数据集包含大量同步的视频与音频片段,提供RGB视频、单声道音频及对应的文本转录,为多模态语音处理算法的开发与评估提供了重要平台。其高质量与大容量使其成为该领域的关键资源。 在相关资源包中,主要包含以下两部分内容: 1. 说明文档:该文件详细阐述了项目的整体结构、代码运行方式、预期结果以及可能遇到的问题与解决方案。在进行数据处理或模型训练前,仔细阅读此文档对正确理解与操作代码至关重要。 2. 专用于语音分离任务的LRS3数据集版本:解压后可获得原始的视频、音频及转录文件,这些数据将由MATLAB脚本读取并用于生成后续训练与测试所需的数据。 基于MATLAB的多模态语音分离通常遵循以下步骤: 1. 数据预处理:从LRS3数据集中提取每段视频的音频特征与视觉特征。音频特征可包括梅尔频率倒谱系数、感知线性预测系数等;视觉特征则涉及唇部运动的检测与关键点定位。 2. 特征融合:将提取的音频特征与视觉特征相结合,构建多模态表示。融合方式可采用简单拼接、加权融合或基于深度学习模型的复杂方法。 3. 模型构建:设计并实现用于语音分离的模型。传统方法可采用自适应滤波器或矩阵分解,而深度学习方法如U-Net、Transformer等在多模态学习中表现优异。 4. 训练与优化:使用预处理后的数据对模型进行训练,并通过交叉验证与超参数调整来优化模型性能。 5. 评估与应用:采用信号失真比、信号干扰比及信号伪影比等标准指标评估模型性能。若结果满足要求,该模型可进一步应用于实际语音分离任务。 借助MATLAB强大的矩阵运算功能与信号处理工具箱,上述步骤得以有效实施。需注意的是,多模态任务常需大量计算资源,处理大规模数据集时可能需要对代码进行优化或借助GPU加速。所提供的MATLAB脚本为多模态语音分离研究奠定了基础,通过深入理解与运用这些脚本,研究者可更扎实地掌握语音分离的原理,从而提升其在实用场景中的性能表现。 资源来源于网络分享,仅用于学习交流使用,请勿用于商业,如有侵权请联系我删除!
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值