The Story of the GnuTLS Bug

http://blog.existentialize.com/the-story-of-the-gnutls-bug.html


You might have heard about the critical GnuTLS bug that was recently fixed recently. What's the deal with it? Why is it a big deal? What happened?

Here's the bug, in essence, in lib/x509/verify.c:

/* Checks if the issuer of a certificate is a
 * Certificate Authority, or if the certificate is the same
 * as the issuer (and therefore it doesn't need to be a CA).
 *
 * Returns true or false, if the issuer is a CA,
 * or not.
 */
static int
check_if_ca (gnutls_x509_crt_t cert, gnutls_x509_crt_t issuer,
            unsigned int flags)
{
    int result;
    result =
        _gnutls_x509_get_signed_data (issuer->cert, "tbsCertificate",
                                    &issuer_signed_data);
    if (result < 0)
        {
        gnutls_assert ();
        goto cleanup;
        }

    result =
        _gnutls_x509_get_signed_data (cert->cert, "tbsCertificate",
                                    &cert_signed_data);
    if (result < 0)
        {
        gnutls_assert ();
        goto cleanup;
        }

    // snip

    result = 0;

cleanup:
    // cleanup type stuff
    return result;
}

Can you spot the bug?

Here's the (abridged) fixed version:


    int result;
    result =
        _gnutls_x509_get_signed_data (issuer->cert, "tbsCertificate",
                                    &issuer_signed_data);
    if (result < 0)
        {
        gnutls_assert ();
        goto fail; // CHANGED
        }

    // snip

fail:  // ADDED
    result = 0;

cleanup:
    // cleanup type stuff
    return result;
}


The bug was a disagreement between return value meanings. The function check_if_ca returns "true" or rather 1, when the certificate is a CA, and zero otherwise. However the other functions used return negative when they fail. In C, any integer value other than zero is regarded as a true value. So if the certificate is invalid, it's actually marked as a CA certificate.

So what's the implication of this? This function is used by gnutls_x509_crt_verify, which verifies x509 certificates. Invalid certificates can be passed off as genuine, even though they're invalid.

Based on this and the previous Apple bug, I don't think we've seen the last serious TLS stack bug. Testing TLS is notoriously difficult, as it's pretty complicated. Of course, GnuTLS doesn't have a great track record for correctness in general, so this specifically isn't that surprising.

Return values in C

The bug is the disagreement about return values and true and false. In C, the situation about what to return for success verses failure is sort of complicated.

Let's take socket(2) and connect(2) as examples. To get an IPv4 socket in C, you need to do this:

#include <sys/types.h>
#include <sys/socket.h>

// in some function
int s = socket(AF_INET, SOCK_STREAM, 0);


How do you know if this socket is valid? That is, you successfully got a valid socket that you can call connect with?

You test s. The man page says that on error, -1 is returned. So a common way to test this would be:

if (s < 0) {
    perror("Couldn't get socket");
    exit(EXIT_FAILURE); // or whatever you want to do
}

Easy enough. Let's connect with our socket now:
struct sockaddr_in addr; // set this up somehow
if (connect(s, &addr, sizeof(addr)) {
    perror("Couldn't connect"); 
    exit(EXIT_FAILURE);
}

Because connect returns zero on success, the error check for connect looks backwards. If we've connected, connect will return 0, which evaluates to false. So to avoid this confusion, most C programmers would add an explicit check for less than zero.

Due to this historical standard, you have essentially two options:

  1. Follow the C tradition and return zero for success and non-zero (or less than zero, it depends) for failure.
  2. Return explicit error codes that should be checked.

GnuTLS used a third option, which is the opposite of the first one, returning 1 for success and 0 for failure, and then mixed that with code that used the C traditional method.

If you think this is all just nonsense and that you should use exceptions instead, it's not really clear that that's better in all cases. Martin Sústrik, the author of ZeroMQ, wishes he wrote ZeroMQ in C rather than C++ with exceptions.

git blame

Let's find out how this bug got in here in the first place.

$ git clone https://git.gitorious.org/gnutls/gnutls.git
$ git checkout 6aa26f78150ccbdf^

Now we're at the commit before the bug fix. Let's run git blame and see who edited what when:
$ git blame lib/x509/verify.c

The problem lines around 141-145 were edited by Simon Josefsson, one of the two maintainers of GnuTLS, in 2005. Wow this bug is old!

If you keep going, to a5891d7^, then to 802e1ed^ you'll finally arrive at 0fba2d9^ the first version without the bug.

So it was 0fba2d9 that caused the issue. Why?

This commit was a large refactor of several parts of the certificate code. Many new functions were written which follow the old C-style error handling (return less than zero for failure, zero for success), such as _gnutls_x509_get_signature.

In the same commit, Nikos refactors check_if_ca and it looks remarkably similar to the traditional C-style error handling the other methods he was adding and refactoring. It looks like he just forgot that this wasn't a C-style error handling method at all, but a true-means-true one.

Lessons

What can we learn from this?

If you're writing in C, you need to pick one of the two C error handling model and stick with it. Aggressively refactor any code that doesn't match your error handling model choice.

Use smaller commits. It's more likely that Nikos or someone reviewing his commit would have seen the error in his diff if it was smaller. He might even have not made the mistake in the first place, as the refactor would have been different for check_if_ca.

Perhaps most important of all, test your crypto code. It's the biggest win for quality. The public facing method, gnutls_x509_crt_verify actually still doesn't have any unit tests for it. This should change.

C can be hard to get right, so it's important to be strict in how you write it and it's important to test it well. We haven't seen the last critical TLS stack bug, so don't be surprised when the next one comes. Maybe we should pool our money and pay some security auditors to audit common TLS implementations that we all use daily.

Sean is a software engineer who is passionate about doing things right. He is currently working on Squadron: an awesome configuration and release management tool for SaaS applications.

related posts


### 解决方案 当遇到 `gnutls_handshake()` 出现的 `Error in the pull function` 时,通常是因为网络连接问题或者 SSL/TLS 配置不兼容引起的。以下是几种可能的解决方案: #### 方法一:禁用 HTTPS 的 SSL 验证 可以通过设置 Git 不验证 SSL 来绕过此错误。这种方法适用于临时解决问题的情况。 ```bash git config --global http.sslVerify false ``` 需要注意的是,关闭 SSL 验证可能会带来安全风险,因此仅建议在受信任的环境中使用[^1]。 #### 方法二:切换到 HTTP 协议 如果 HTTPS 连接存在问题,可以尝试将远程仓库地址从 HTTPS 切换为 HTTP(如果有提供)。例如: ```bash git remote set-url origin http://github.com/username/repository.git ``` 这种方式避免了 TLS 握手过程中的潜在问题[^2]。 #### 方法三:更新或替换加密库 有时该问题是由于 GnuTLS 库版本较旧或存在 bug 导致的。可以考虑升级 GnuTLS 或者让 Git 使用 OpenSSL 替代 GnuTLS。具体操作如下: - **安装或启用 OpenSSL 支持** 如果操作系统支持 OpenSSL,则可以让 Git 使用它来替代 GnuTLS。例如,在某些 Linux 发行版中,Git 可能默认使用 GnuTLS;通过重新编译或更换包管理器中的依赖项可以选择 OpenSSL。 - **强制指定协议栈** 对于 Windows 用户,可以直接下载最新版本的 Git 客户端并确保其配置文件启用了最新的 TLS 版本。运行以下命令以确认当前使用的 TLS 设置: ```bash git -c http.postBuffer=524288000 clone https://github.com/username/repo.git ``` #### 方法四:调整代理设置 即使已经尝试重置代理,仍需进一步检查是否存在隐式的代理干扰通信链路。执行以下指令清除所有已定义的代理变量: ```bash git config --global --unset http.proxy git config --global --unset https.proxy ``` 随后再次测试克隆功能是否恢复正常。 --- ### 总结 上述四种方式分别针对不同层面的原因提供了修复途径——从简单的全局参数修改到深入探讨底层依赖关系及其相互作用机制。实际应用过程中可根据具体情况逐一排查直至找到最合适的解决办法。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值