security - How do you ensure secure coding with test-driven development?

Question

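I've been getting up to speed on the latest trend, Test-Driven Development (TDD). Most of the development I do is in C or C++. It strikes me that there is a very obvious conflict between common TDD practices and common secure coding practices. At its heart, TDD tells you that you should not write new code for something you don't have a failing test for. To me, that means I shouldn't write secure code unless I have unit tests that check whether my code is secure.

This raises two questions:

How can I effectively write unit tests for buffer overflows, stack overruns, heap overruns, array index errors, format string bugs, ANSI vs. Unicode vs. MBCS string size mismatches, and safe string handling (from Howard and LeBlanc's "Writing Secure Code")?
At what point in the standard TDD practice should these tests be included, given that much of security is non-functional?

Surprisingly, I have found very little research discussing TDD and security. Most of what I come across are TDD papers that mention, at a very high level, that TDD will "make your code more secure."

I'm looking for any direct answers to the questions above, any research that pertains to this (I have looked already and didn't find much), or any places where TDD gurus live so that I can go knock on their door (virtually) and see if they have any good answers.

Thanks!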

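EDIT:

The topic of Fuzzing has come up, which I think is a great approach to this problem (in general). This raises the questions: Does Fuzzing fit into TDD? Where in the TDD process does fuzzing fit?

Parameterized unit testing (possibly automated) has also crossed my mind. This could be a way to get fuzzing-like results earlier in the testing process. I'm not sure exactly where that fits into TDD either.

EDIT 2:

Thanks to all of you for your answers so far. At this point, I am extremely interested in how we can leverage parameterized tests to serve as pseudo-fuzzers for our functions. But how do we determine which tests to write to test security? And how can we be sure that we adequately cover the attack space?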

It is a well known problem in software security that if you protect against 5 attack scenarios, the attacker will just look for, and use, a 6th attack. It is a very difficult cat-and-mouse game. Does TDD give us any advantage against this?

score 10 · Accepted Answer

Yes, TDD is a tool/technique that can help to ensure secure coding.

But as with all things in this industry: assume it's a silver bullet, and you'll shoot yourself in the foot.

Unknown Threats

As you indicated in Edit 2: "you protect against 5 attack scenarios, the attacker will just look for, and use, a 6th attack". TDD is not going to protect you from unknown threats. By its very nature, you have to know what you want to test in order to write the test in the first place.

So suppose threat number 6 is discovered (hopefully not through a breach, but internally, through another tool/technique that looks for potential attack vectors).

TDD will help as follows:

Tests can be written to verify the threat.
A solution can be implemented to block the threat, and quickly be confirmed to be working.
More importantly, provided all other tests still pass, you can quickly verify that:
- All other security measures still behave correctly.
- All other functionality still behaves correctly.
Basically TDD assists in allowing a quick turnaround time from when a threat is discovered to when a solution becomes available.
TDD also provides a high degree of confidence that the new version behaves correctly.

Testable Code

I have read that TDD is often misinterpreted as a Testing Methodology, when in fact it is more of a Design Methodology. TDD improves the design of your code, making it more testable.

Specialised Testing

An important feature of test cases is their ability to run without side-effects. This means you can run the tests in any order, any number of times, and they should never fail because of it. As a result, a number of other aspects of a system become easier to test purely because of this testability, for example performance and memory utilisation.

This testing is usually implemented by way of running special checks of an entire test suite - without directly impacting the suite itself.

A similar security testing module could overlay a test suite and look for known security concerns such as secure data left in memory, buffer overruns or any new attack vector that becomes known. Such an overlay would have a degree of confidence, because it has been checked for all known functionality of the system.

Improved Design

One of the key design improvements arising as a side-effect of TDD is explicit dependencies. Many systems suffer under the weight of implicit or derived dependencies, which make testing virtually impossible. As a result, TDD designs tend to be more modular in the right places. From a security perspective this allows you to do things like:

Test components that receive network data without having to actually send it over the network.
One can easily mock-out objects to behave in unexpected / 'unrealistic' ways as might occur in attack scenarios.
Test components in isolation.
Or with any desired mix of production components.
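
As a rough illustration of the first two points, here is a minimal sketch of a parser that takes its input through an explicit dependency, so a test can feed it hostile "network" data without any socket. Every name in it (ByteSource, FakeSource, readMessage, kMaxPayload) and the 1-byte length-prefix wire format are assumptions made up for this example, not something from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical abstraction over "wherever the bytes come from".
struct ByteSource {
    virtual ~ByteSource() {}
    // Copies up to `cap` bytes into `out`; returns how many were copied.
    virtual std::size_t read(unsigned char* out, std::size_t cap) = 0;
};

// Test double that replays a fixed byte sequence, hostile or not.
class FakeSource : public ByteSource {
public:
    explicit FakeSource(const std::vector<unsigned char>& bytes)
        : bytes_(bytes), pos_(0) {}
    std::size_t read(unsigned char* out, std::size_t cap) override {
        std::size_t n = bytes_.size() - pos_;
        if (n > cap) n = cap;
        if (n > 0) std::memcpy(out, &bytes_[pos_], n);
        pos_ += n;
        return n;
    }
private:
    std::vector<unsigned char> bytes_;
    std::size_t pos_;
};

const std::size_t kMaxPayload = 16;

// Hypothetical component under test: reads a 1-byte length, then the payload.
// Returns false if the declared length is too large or the data runs out.
bool readMessage(ByteSource& src, std::string& payload) {
    unsigned char len = 0;
    if (src.read(&len, 1) != 1) return false;
    if (len > kMaxPayload) return false;  // never trust the declared length
    unsigned char buf[kMaxPayload];
    std::size_t got = 0;
    while (got < len) {
        std::size_t n = src.read(buf + got, len - got);
        if (n == 0) return false;         // truncated message
        got += n;
    }
    payload.assign(reinterpret_cast<const char*>(buf), len);
    return true;
}

// Small helper so each test reads as "wire bytes in, verdict out".
bool runReadMessage(const std::vector<unsigned char>& wire, std::string& payload) {
    FakeSource src(wire);
    return readMessage(src, payload);
}

bool acceptsWellFormedMessage() {
    std::string payload;
    return runReadMessage({2, 'h', 'i'}, payload) && payload == "hi";
}

bool rejectsOversizedLength() {
    std::string payload;
    return !runReadMessage({200, 'x'}, payload);
}

bool rejectsTruncatedPayload() {
    std::string payload;
    return !runReadMessage({5, 'a', 'b'}, payload);
}
```

In production the same readMessage would presumably be handed a socket-backed ByteSource; the tests only swap the dependency, not the code under test.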

Unit Testing

One thing that should be noted is that TDD favours highly localised testing (unit testing). As a result you could easily test that:

SecureZeroMemory() would correctly erase a password from RAM.
Or that GetSafeSQLParam() would correctly guard against SQL injection.

However, it becomes more difficult to verify that all developers have used the correct method in every place where it is required.
A test to verify a new SQL-related feature would confirm that the feature works, but it would pass just as well with either the 'safe' or the 'unsafe' version of GetSQLParam.
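
To make the "easy" half concrete, here is a minimal sketch of such localised tests. The bodies of GetSafeSQLParam and WipeBuffer below are stand-ins invented for this example: the answer only names the functions, so the quote-doubling rule and the volatile-write wipe are assumptions. Quote doubling in particular is dialect-dependent, and bound parameters are generally the safer choice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for GetSafeSQLParam(): wraps the value in single
// quotes and doubles every embedded single quote.
std::string GetSafeSQLParam(const std::string& value) {
    std::string out = "'";
    for (char c : value) {
        if (c == '\'') out += "''";
        else out += c;
    }
    out += "'";
    return out;
}

// Hypothetical portable stand-in for SecureZeroMemory(): the volatile
// pointer is meant to discourage the compiler from eliding the writes.
void WipeBuffer(void* p, std::size_t n) {
    volatile unsigned char* b = static_cast<volatile unsigned char*>(p);
    while (n--) *b++ = 0;
}

bool plainValueIsQuoted() {
    return GetSafeSQLParam("alice") == "'alice'";
}

bool embeddedQuotesAreDoubled() {
    // Classic injection attempt: every quote must come back doubled.
    return GetSafeSQLParam("x' OR '1'='1") == "'x'' OR ''1''=''1'";
}

bool wipeClearsEveryByte() {
    char password[16] = "hunter2";
    WipeBuffer(password, sizeof password);
    for (char c : password) {
        if (c != 0) return false;
    }
    return true;
}
```

These tests say nothing about whether callers actually use the helpers, which is exactly the gap described above.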

It is for this reason that you should not neglect other tools/techniques that can be used to "ensure secure coding":

Coding Standards
Code Reviews
Testing

score 5 · Accepted Answer

I'll take your second question first. Yes, TDD can be used for non-functional requirements. In fact, it is often used as such. The most common benefit is an improved modular design, which is non-functional but seen by everyone who practices TDD. Other examples that I've used TDD to verify: cross-platform, cross-database, and performance.

For all your tests, you may need to restructure the code so that it is testable. This is one of the biggest effects of TDD-- it really changes how you structure your code. At first it seems like this is perturbing the design, but you soon come to realize that the testable design is better. Anyway...

String interpretation bugs (Unicode vs. ANSI) are particularly nice to test with TDD. It's usually straightforward to enumerate the bad and good inputs, and assert about their interpretation. You may find that you need to restructure your code a bit to "make it testable"; by this I mean extract methods that isolate the string-specific code.
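
A minimal sketch of what "enumerate the bad and good inputs" could look like for one string-interpretation concern, UTF-8 well-formedness. The function isValidUtf8 and the case table are assumptions written for this example; in a real code base the isolated routine would be whatever string-specific code you extracted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical extracted routine: true if [s, s+n) is well-formed UTF-8.
// Rejects overlong forms, surrogates, and code points above U+10FFFF.
bool isValidUtf8(const unsigned char* s, std::size_t n) {
    std::size_t i = 0;
    while (i < n) {
        unsigned char b = s[i];
        std::size_t extra = 0;
        unsigned long cp = 0, minCp = 0;
        if (b < 0x80) { ++i; continue; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; minCp = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; minCp = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; minCp = 0x10000; }
        else return false;                      // stray continuation or bad lead byte
        if (n - i - 1 < extra) return false;    // sequence truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < minCp) return false;           // overlong encoding
        if (cp > 0x10FFFF) return false;        // outside Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        i += extra + 1;
    }
    return true;
}

struct Utf8Case {
    const char* bytes;   // NUL-terminated, so no embedded NULs in this table
    bool expectValid;
};

const Utf8Case kUtf8Cases[] = {
    {"hello", true},
    {"\xC3\xA9", true},            // U+00E9
    {"\xE2\x82\xAC", true},        // U+20AC
    {"\xF0\x9F\x98\x80", true},    // U+1F600
    {"\xC0\xAF", false},           // overlong encoding of '/'
    {"\xE2\x82", false},           // truncated three-byte sequence
    {"\x80", false},               // lone continuation byte
    {"\xED\xA0\x80", false},       // surrogate U+D800
    {"\xF4\x90\x80\x80", false},   // above U+10FFFF
};

// Returns how many table rows disagree with the validator.
int countUtf8Mismatches() {
    int mismatches = 0;
    for (const Utf8Case& c : kUtf8Cases) {
        bool valid = isValidUtf8(
            reinterpret_cast<const unsigned char*>(c.bytes), std::strlen(c.bytes));
        if (valid != c.expectValid) ++mismatches;
    }
    return mismatches;
}
```

Adding a newly discovered bad input is then a one-line change to the table.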

For buffer overruns, making sure routines respond properly when given too much data is pretty straightforward to test as well. Just write a test and send them too much data. Assert that they did what you expected. But some buffer overflows and stack overflows are a bit trickier. You need to be able to cause these to happen, but you also need to figure out how to detect whether they happened. This may be as simple as allocating buffers with extra bytes in them and verifying that those bytes don't change during tests... Or it may take some other creative techniques.
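
The "extra bytes" idea can be sketched as a guard-byte (canary) test. The routine copyField and the constants below are hypothetical, chosen only for this example. A canary only catches writes that land inside the guard region, so a tool such as AddressSanitizer (-fsanitize=address in GCC and Clang) is a useful complement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical routine under test: copies src into dst, truncating to fit
// and always NUL-terminating. Returns false if the input was truncated.
bool copyField(char* dst, std::size_t dstSize, const char* src) {
    if (dstSize == 0) return false;
    std::size_t len = std::strlen(src);
    std::size_t n = len < dstSize - 1 ? len : dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n == len;
}

const std::size_t kGuard = 8;          // guard bytes on each side of the field
const std::size_t kField = 16;         // size of the buffer under test
const unsigned char kCanary = 0xA5;    // pattern that must survive the call

// True if the guard bytes before and after the field are untouched.
bool guardsIntact(const unsigned char* arena) {
    for (std::size_t i = 0; i < kGuard; ++i) {
        if (arena[i] != kCanary) return false;
        if (arena[kGuard + kField + i] != kCanary) return false;
    }
    return true;
}

bool oversizedInputLeavesGuardsIntact() {
    unsigned char arena[kGuard + kField + kGuard];
    std::memset(arena, kCanary, sizeof arena);
    char* field = reinterpret_cast<char*>(arena + kGuard);

    std::string tooMuch(1000, 'A');    // far more data than the field can hold
    bool fit = copyField(field, kField, tooMuch.c_str());

    return !fit
        && guardsIntact(arena)
        && field[kField - 1] == '\0'
        && std::strlen(field) == kField - 1;
}

bool smallInputFitsAndLeavesGuardsIntact() {
    unsigned char arena[kGuard + kField + kGuard];
    std::memset(arena, kCanary, sizeof arena);
    char* field = reinterpret_cast<char*>(arena + kGuard);
    return copyField(field, kField, "ok")
        && guardsIntact(arena)
        && std::strcmp(field, "ok") == 0;
}
```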

I'm not sure there's a simple answer, though. Testing takes creativity, discipline, and commitment, but is usually worth it.

isolate the behavior you need to test
make sure you can detect the problem
know what you want to happen for the error case
write the test and see it fail

Hope this helps

score 4 · Accepted Answer

TDD is the best way to build a secure system. All software developed by Microsoft is fuzzed, and this is arguably the number one reason for the dramatic reduction in vulnerabilities found. I highly recommend using the Peach Framework for this purpose. I have personally used Peach with great success in finding buffer overflows.

Peach pit files provide a way of describing the data used by your application. You can choose which interface you want to test. Does your application read files? Does it have an open port? After you tell Peach what the input looks like and how to communicate with your application, you can turn it loose, and it knows all of the nasty input needed to make your application puke all over itself.

To make everything run, Peach has a great testing harness. If your application crashes, Peach will know because it has a debugger attached. When your application crashes, Peach will restart it and keep testing. Peach can categorize all of the crashes and match up the core dumps with the input it used to crash the application.

score 0 · Accepted Answer

Parameterized Tests

While we aren't doing buffer overrun tests at my work, we do have the notion of template tests. These tests are parameterized to require the specific data for the case we want to test. We then use metaprogramming to dynamically create the real tests by applying the parameters for each case to the template. This has the benefit of being deterministic, and it runs as part of our automated test suite.
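
In C++, one rough approximation of such template tests is a generic runner applied to a table of cases. The sketch below is an assumption about how that could look, not the setup described above; parsePort, PortCase and countFailures are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical function under test: accepts only decimal ports 1..65535.
bool parsePort(const std::string& text, int& port) {
    if (text.empty() || text.size() > 5) return false;  // also bounds the loop
    long value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    if (value < 1 || value > 65535) return false;
    port = static_cast<int>(value);
    return true;
}

// The "template": applies one check to every row of a case table and
// returns how many rows failed.
template <typename Case, std::size_t N, typename Check>
int countFailures(const Case (&cases)[N], Check check) {
    int failures = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (!check(cases[i])) ++failures;
    }
    return failures;
}

struct PortCase {
    const char* input;
    bool expectAccepted;
};

// The parameters: boundary values plus inputs an attacker might try.
const PortCase kPortCases[] = {
    {"1", true},
    {"80", true},
    {"65535", true},
    {"", false},
    {"0", false},
    {"65536", false},
    {"-1", false},
    {" 80", false},
    {"80abc", false},
    {"99999999999999999999", false},
};

int failingPortCases() {
    return countFailures(kPortCases, [](const PortCase& c) -> bool {
        int port = 0;
        return parsePort(c.input, port) == c.expectAccepted;
    });
}
```

Because the table is fixed, the run stays deterministic, which matches the benefit mentioned above.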

My TDD Practice

We do Acceptance Test Driven Development at my work. Most of our tests happen to be close to full stack functional tests. The reason is we found it was more valuable to test and assure the behavior of user driven actions. We use techniques like dynamic test generation from parameterized tests to provide us more coverage with a minimum of work. We do this for ASCII vs UTF8, API conventions, and well known variant tests.

score 0 · Accepted Answer

The topic of Fuzzing has come up, which I think is a great approach to this problem (in general). This raises the questions: Does Fuzzing fit into TDD? Where in the TDD process does fuzzing fit?

I believe that it might fit quite well! There are fuzzers like american fuzzy lop that can be scripted and adapt themselves to modifications in the I/O format on their own. In this particular case, you could integrate it with Travis CI, store the input test cases you used and run regression testing against those.
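
One way this could look in code is sketched below, assuming a libFuzzer-style entry point (LLVMFuzzerTestOneInput); american fuzzy lop traditionally feeds input through stdin or a file instead, so treat the harness shape as an assumption. The function percentDecode and the hard-coded saved inputs are hypothetical. In a real project the inputs would presumably be read from the stored corpus files rather than listed in the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Returns 0..15 for a hex digit, -1 otherwise.
int hexValue(std::uint8_t c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical code under test: decodes "%41"-style escapes.
// Returns false on a truncated or malformed escape.
bool percentDecode(const std::uint8_t* data, std::size_t size, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < size; ++i) {
        if (data[i] != '%') {
            out.push_back(static_cast<char>(data[i]));
            continue;
        }
        if (size - i < 3) return false;   // do not read past the end
        int hi = hexValue(data[i + 1]);
        int lo = hexValue(data[i + 2]);
        if (hi < 0 || lo < 0) return false;
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
    }
    return true;
}

// Fuzz target: the fuzzer calls this with arbitrary bytes. It aborts if a
// property that should always hold is violated.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    std::string out;
    bool ok = percentDecode(data, size, out);
    if (ok && out.size() > size) std::abort();  // decoding never grows the input
    return 0;
}

// Regression replay: run inputs saved from earlier fuzzing sessions through
// the same target as part of the ordinary test suite.
bool replaySavedInputs() {
    const std::vector<std::string> saved = {
        "", "%", "%4", "%41", "%zz", "abc%20def", "%%%",
    };
    for (const std::string& s : saved) {
        LLVMFuzzerTestOneInput(
            reinterpret_cast<const std::uint8_t*>(s.data()), s.size());
    }
    return true;  // reaching this point means no saved input crashed
}
```

The fuzzing itself stays outside the red-green cycle; what enters the TDD loop is each crashing input, saved as a deterministic regression case.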

I might extend this answer if you come up with any questions for details in the comments.
