15

How would you extract the Server Name Indication (SNI) from a TLS Client Hello message? I'm currently struggling to understand the very cryptic RFC 3546 on TLS Extensions, in which the SNI is defined.

Things I've understood so far:

  • The host name is UTF-8 encoded and readable when you decode the buffer as UTF-8.
  • There's one byte before the host name that determines its length.

If I could find out the exact position of that length byte, extracting the SNI would be pretty simple. But how do I get to that byte in the first place?

4

4 Answers

39

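I did this in sniproxy. Examining a TLS Client Hello packet in Wireshark while reading the RFC is a good way to go. It's not too hard; you just have to skip past a lot of variable-length fields and check that you have the correct element type.

I'm working on my tests right now and have this annotated sample packet, which may help: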

const unsigned char good_data_2[] = {
    // TLS record
    0x16, // Content Type: Handshake
    0x03, 0x01, // Version: TLS 1.0
    0x00, 0x6c, // Length (use for bounds checking)
        // Handshake
        0x01, // Handshake Type: Client Hello
        0x00, 0x00, 0x68, // Length (use for bounds checking)
        0x03, 0x03, // Version: TLS 1.2
        // Random (32 bytes fixed length)
        0xb6, 0xb2, 0x6a, 0xfb, 0x55, 0x5e, 0x03, 0xd5,
        0x65, 0xa3, 0x6a, 0xf0, 0x5e, 0xa5, 0x43, 0x02,
        0x93, 0xb9, 0x59, 0xa7, 0x54, 0xc3, 0xdd, 0x78,
        0x57, 0x58, 0x34, 0xc5, 0x82, 0xfd, 0x53, 0xd1,
        0x00, // Session ID Length (skip past this much)
        0x00, 0x04, // Cipher Suites Length (skip past this much)
            0x00, 0x01, // NULL-MD5
            0x00, 0xff, // RENEGOTIATION INFO SCSV
        0x01, // Compression Methods Length (skip past this much)
            0x00, // NULL
        0x00, 0x3b, // Extensions Length (use for bounds checking)
            // Extension
            0x00, 0x00, // Extension Type: Server Name (check extension type)
            0x00, 0x0e, // Length (use for bounds checking)
            0x00, 0x0c, // Server Name List Length
                0x00, // Server Name Type: host_name (check server name type)
                0x00, 0x09, // Length (length of your data)
                // "localhost" (data you're after)
                0x6c, 0x6f, 0x63, 0x61, 0x6c, 0x68, 0x6f, 0x73, 0x74,
            // Extension
            0x00, 0x0d, // Extension Type: Signature Algorithms (check extension type)
            0x00, 0x20, // Length (skip past since this is the wrong extension)
            // Data
            0x00, 0x1e, 0x06, 0x01, 0x06, 0x02, 0x06, 0x03,
            0x05, 0x01, 0x05, 0x02, 0x05, 0x03, 0x04, 0x01,
            0x04, 0x02, 0x04, 0x03, 0x03, 0x01, 0x03, 0x02,
            0x03, 0x03, 0x02, 0x01, 0x02, 0x02, 0x02, 0x03,
            // Extension
            0x00, 0x0f, // Extension Type: Heartbeat (check extension type)
            0x00, 0x01, // Length (skip past since this is the wrong extension)
            0x01 // Mode: Peer allows to send requests
};
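
The walk through the annotated packet above can be sketched in Node.js roughly as follows. This is a minimal sketch, not sniproxy's actual code: the function name `extractSNI` is made up for illustration, and it assumes that `buf` is a Node.js `Buffer` holding a complete Client Hello in a single TLS record.

```javascript
// Hypothetical helper (not from sniproxy): walk a TLS record that is assumed
// to contain a complete ClientHello and return the host_name, or null.
function extractSNI(buf) {
  let pos = 0;
  // Make sure `n` more bytes are available before they are read or skipped.
  const need = (n) => {
    if (pos + n > buf.length) throw new RangeError("truncated ClientHello");
  };
  const u8 = () => { need(1); return buf[pos++]; };
  const u16 = () => {
    need(2);
    const v = (buf[pos] << 8) | buf[pos + 1]; // big-endian
    pos += 2;
    return v;
  };
  const skip = (n) => { need(n); pos += n; };

  if (u8() !== 0x16) return null; // content type must be Handshake
  skip(2);                        // record version
  skip(2);                        // record length
  if (u8() !== 0x01) return null; // handshake type must be Client Hello
  skip(3);                        // handshake length
  skip(2);                        // client version
  skip(32);                       // random
  skip(u8());                     // session ID
  skip(u16());                    // cipher suites
  skip(u8());                     // compression methods
  if (pos === buf.length) return null; // no extensions block at all

  const extLen = u16();
  const extEnd = pos + extLen;
  if (extEnd > buf.length) throw new RangeError("truncated extensions");

  while (pos + 4 <= extEnd) {
    const type = u16();
    const len = u16();
    if (type !== 0x0000) { skip(len); continue; } // wrong extension, skip it
    skip(2);                        // server name list length
    if (u8() !== 0x00) return null; // name type must be host_name
    const nameLen = u16();
    need(nameLen);
    return buf.toString("utf8", pos, pos + nameLen);
  }
  return null; // no server_name extension present
}
```

Called on the sample packet above, wrapped with `Buffer.from(...)`, this should return "localhost". The record and handshake lengths are skipped here for brevity; a stricter parser would use them for bounds checking as the comments in the sample suggest.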
answered 2014-02-21T06:32:14.780
7

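Use Wireshark and capture only TLS (SSL) packets by adding the filter tcp port 443. Then find a "Client Hello" message. You can see its raw data below.

Expand Secure Socket Layer->TLSv1.2 Record Layer: Handshake Protocol: Client Hello->... and you will see Extension: server_name->Server Name Indication extension. The server name in the handshake packet is not encrypted.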

http://i.stack.imgur.com/qt0gu.png
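
Outside Wireshark, the same "find the Client Hello" step can be approximated on raw bytes. The following is a minimal Node.js sketch; the function name `isClientHello` is made up for illustration, and it assumes `buf` is a `Buffer` holding the first bytes the client sent on the connection.

```javascript
// Hypothetical helper: true if `buf` starts with a TLS record whose first
// handshake message is a Client Hello.
function isClientHello(buf) {
  return buf.length >= 6 &&
    buf[0] === 0x16 && // record content type: Handshake
    buf[1] === 0x03 && // record version, major byte (SSL 3.0 / TLS 1.x)
    buf[5] === 0x01;   // handshake type: Client Hello
}
```

This only inspects the 5-byte record header and the first handshake byte, so it says nothing about whether the rest of the message is complete or well-formed.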

answered 2015-03-28T15:07:53.860
3

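For anyone interested, this is a tentative version of C/C++ code. It has worked so far. The function returns the position of the server name within the byte array containing the Client Hello, and the length of the name in the len parameter.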

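#include <cstddef>
#include <stdexcept>

// Read a 16-bit big-endian value byte by byte (no alignment or aliasing issues).
static unsigned short read_u16(const unsigned char *p)
{
    return (unsigned short)((p[0] << 8) | p[1]);
}

// Returns a pointer to the server name (not NUL-terminated) and stores its
// length in *len, or returns NULL if no SNI is present.
// The buffer size is not passed in, so the caller must make sure that bytes
// holds a complete Client Hello record that includes an extensions block.
char *get_TLS_SNI(unsigned char *bytes, int* len)
{
    unsigned char *curr;
    unsigned char sidlen = bytes[43];       // session ID length
    curr = bytes + 1 + 43 + sidlen;         // skip headers, version, random, session ID
    unsigned short cslen = read_u16(curr);  // cipher suites length
    curr += 2 + cslen;
    unsigned char cmplen = *curr;           // compression methods length
    curr += 1 + cmplen;
    unsigned char *maxchar = curr + 2 + read_u16(curr); // end of extensions
    curr += 2;
    while(curr + 4 <= maxchar)
    {
        unsigned short ext_type = read_u16(curr);
        curr += 2;
        unsigned short ext_len = read_u16(curr);
        curr += 2;
        if(ext_type == 0) // server_name
        {
            curr += 3; // skip server name list length (2) and name type (1)
            unsigned short namelen = read_u16(curr);
            curr += 2;
            *len = namelen;
            return (char*)curr;
        }
        else curr += ext_len;
    }
    if (curr != maxchar) throw std::runtime_error("incomplete SSL Client Hello");
    return NULL; //SNI was not present
}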
answered 2016-12-27T03:09:14.487
2

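I noticed that the domain is always prefixed by two zero bytes and one length byte. Maybe it's an unsigned 24-bit integer, but I can't test that because my DNS server doesn't allow domain names longer than 77 characters.

Based on this knowledge, I came up with this (Node.js) code.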

function getSNI(buf) {
  var sni = null
    , regex = /^(?:[a-z0-9-]+\.)+[a-z]+$/i;
  for(var b = 0, prev, start, end, str; b < buf.length; b++) {
    // Heuristic: two consecutive zero bytes followed by a one-byte length.
    if(prev === 0 && buf[b] === 0) {
      start = b + 2;
      end   = start + buf[b + 1];
      // Also accept a candidate that ends exactly at the end of the buffer.
      if(start < end && end <= buf.length) {
        str = buf.toString("utf8", start, end);
        if(regex.test(str)) {
          sni = str;
          continue;
        }
      }
    }
    prev = buf[b];
  }
  return sni;
}

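This code looks for a sequence of two zero bytes. If it finds one, it assumes that the following byte is a length parameter. It checks whether the length is still within the bounds of the buffer and, if so, reads the byte sequence as UTF-8. Later, one can run a regular expression over the array and extract the domain.

It works amazingly well! Still, I noticed something odd.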

'�\n�\u0014\u0000�\u0000�\u00009\u00008�\u000f�\u0005\u0000�\u00005�\u0007�\t�\u0011�\u0013\u0000E\u0000D\u0000f\u00003\u00002�\f�\u000e�\u0002�\u0004\u0000�\u0000A\u0000\u0005\u0000\u0004\u0000/�\b�\u0012\u0000\u0016\u0000\u0013�\r�\u0003��\u0000\n'
'\u0000\u0015\u0000\u0000\u0012test.cubixcraft.de'
'test.cubixcraft.de'
'\u0000\b\u0000\u0006\u0000\u0017\u0000\u0018\u0000\u0019'
'\u0000\u0005\u0001\u0000\u0000'

Consistently, no matter which subdomain I choose, the domain is located twice. It seems the SNI field is nested inside another field.
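
The nesting can be made visible by decoding each length field separately. The following is a minimal sketch under stated assumptions: the helper name `decodeServerNameExtension` is made up, and the leading bytes 00 00 00 17 (extension type and extension length) are inferred from the standard server_name layout rather than taken from the output above, which only shows 00 15 00 00 12 followed by the name.

```javascript
// Hypothetical helper: decode one server_name extension, starting at its
// 2-byte extension type, and expose every nested length field.
function decodeServerNameExtension(ext) {
  const u16 = (off) => ext.readUInt16BE(off);
  if (u16(0) !== 0x0000) throw new Error("not a server_name extension");
  const extensionLength = u16(2); // bytes that follow in this extension
  const listLength = u16(4);      // bytes in the server name list
  const nameType = ext[6];        // 0 = host_name
  const nameLength = u16(7);      // bytes in the host name itself
  const hostName = ext.toString("utf8", 9, 9 + nameLength);
  return { extensionLength, listLength, nameType, nameLength, hostName };
}

// Assumed bytes: type 0x0000, extension length 0x0017, list length 0x0015,
// name type 0x00, name length 0x0012, then the 18-byte host name.
const ext = Buffer.concat([
  Buffer.from([0x00, 0x00, 0x00, 0x17, 0x00, 0x15, 0x00, 0x00, 0x12]),
  Buffer.from("test.cubixcraft.de", "utf8"),
]);
console.log(decodeServerNameExtension(ext));
```

Under these assumptions the host name sits inside a server name list, which in turn sits inside the extension body. A scan for "two zero bytes plus a length byte" can therefore match twice: once on the extension body (length 23) and once on the host name itself (length 18).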

I'm open to suggestions and improvements! :)

I turned it into a Node module for anyone who cares: sni

answered 2013-07-24T11:52:37.710