intel - Intel SGX threading and TCS

Question

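I am trying to understand the difference between the SGX threads enabled by the TCS and the untrusted threads provided by the SDK.

If I understand correctly, the TCS allows multiple logical processors to enter the same enclave. Each logical processor has its own TCS, and therefore its own entry point (the OENTRY field in the TCS). Each thread runs until an AEX occurs or until it reaches the end of the thread. However, these TCS-enabled threads have no way to synchronize with each other. At least, there is no SGX instruction for synchronization.

On the other hand, the SGX SDK provides a set of thread synchronization primitives, mainly mutexes and condition variables. These primitives are untrusted, because they are ultimately serviced by the OS.

My question is: are these thread synchronization primitives meant to be used by TCS threads? If so, doesn't that weaken security? The OS can manipulate the scheduling however it likes.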

score 7 · Accepted Answer

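First, let us deal with your somewhat unclear terminology:

SGX threads enabled by TCS and untrusted threads provided by SDK.

Inside an enclave, only "trusted" threads can execute. There are no "untrusted" threads inside an enclave. Possibly the following sentence in the SDK Guide [1] misled you:

Creating threads inside the enclave is not supported. Threads that run inside the enclave are created within the (untrusted) application.

The untrusted application has to set up the TCS pages (for more background on the TCS, see [2]). So how can a TCS set up by the untrusted application be the basis for trusted threads inside the enclave? [2] gives the answer:

EENTER is only guaranteed to perform controlled jumps inside an enclave's code if the contents of all the TCS pages are measured.

Because the TCS pages are measured, the integrity of the threads (the TCS defines the allowed entry points) can be verified through enclave attestation. So only known-good execution paths can be executed within the enclave.

Second, let us look at the synchronization primitives.

The SDK does offer synchronization primitives, which you say are untrusted because they are ultimately serviced by the OS. Let us look at the description of these primitives in [1]: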

sgx_spin_lock() and unlock operate solely within the enclave (using atomic operations), with no need for OS interaction (no OCALL). Using a spinlock, you could yourself implement higher-level primitives.
sgx_thread_mutex_init() also does not make an OCALL. The mutex data structure is initialized within the enclave.
sgx_thread_mutex_lock() and unlock potentially perform OCALLs. However, since the mutex data is within the enclave, they can always enforce correctness of locking within the secure enclave.
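To make the first point concrete, here is a minimal sketch of a spinlock built only from atomic operations, written in plain C11 so it compiles outside an enclave. The names (demo_spinlock_t, demo_spin_lock, and so on) are made up for illustration; this is not the SDK's sgx_spin_lock() implementation, only the same general idea: the lock word lives in the caller's memory and no call to the OS is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for illustration; not the SDK's sgx_spinlock_t. */
typedef struct {
    atomic_bool held; /* in a real enclave this word would sit in enclave memory */
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { false }

/* Try to take the lock once; returns true if we acquired it. */
static bool demo_spin_trylock(demo_spinlock_t *l) {
    /* exchange returns the previous value: false means the lock was free */
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. No OS interaction is involved. */
static void demo_spin_lock(demo_spinlock_t *l) {
    while (!demo_spin_trylock(l)) {
        /* spin */
    }
}

/* Release the lock; releasing a lock that is not held is a caller bug. */
static void demo_spin_unlock(demo_spinlock_t *l) {
    assert(atomic_load_explicit(&l->held, memory_order_relaxed));
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

The OS can still stop a spinning thread from making progress by never scheduling the lock holder, but it cannot make two threads hold the lock at once.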

Looking at the descriptions of the mutex functions, my guess is that the OCALLs serve to implement non-busy waiting outside the enclave. This is indeed handled by the OS, and susceptible to attacks. The OS may choose not to wake a thread waiting outside the enclave. But it can also choose to interrupt a thread running inside an enclave. SGX does not protect against DoS attacks (Denial of Service) from the (potentially compromised) OS.
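The guess above can be sketched as follows. This is a single-threaded toy model in plain C11, not the SDK's mutex code, and every name in it is hypothetical. The lock state is the only thing the lock function trusts; the "wait outside" callback stands in for an OCALL and is treated purely as a hint, so the lock state is re-checked after every return. For the sake of a runnable demo, the simulated wait also plays the role of the other thread that eventually releases the mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mutex: the state would live in enclave memory. */
typedef struct {
    atomic_bool locked;
} demo_mutex_t;

/* Stand-in for an OCALL that parks the caller outside the enclave.
   Untrusted: it may return early, late, or for no reason at all. */
typedef void (*demo_wait_fn)(demo_mutex_t *m);

static bool demo_mutex_trylock(demo_mutex_t *m) {
    return !atomic_exchange_explicit(&m->locked, true, memory_order_acquire);
}

static void demo_mutex_unlock(demo_mutex_t *m) {
    assert(atomic_load_explicit(&m->locked, memory_order_relaxed));
    atomic_store_explicit(&m->locked, false, memory_order_release);
}

/* Acquire the mutex; returns how many times we had to wait outside.
   Correctness rests only on the trusted state, never on the wakeup. */
static int demo_mutex_lock(demo_mutex_t *m, demo_wait_fn wait_outside) {
    int waits = 0;
    while (!demo_mutex_trylock(m)) {
        wait_outside(m); /* hint only: re-check the state afterwards */
        waits++;
    }
    return waits;
}

/* Simulated unreliable OS: the first two returns are spurious wakeups;
   on the third call we also simulate the owner releasing the mutex. */
static int demo_wakeups = 0;
static void demo_unreliable_wait(demo_mutex_t *m) {
    if (++demo_wakeups == 3) {
        demo_mutex_unlock(m);
    }
}
```

With this shape, a spurious or malicious wakeup only costs an extra loop iteration, and a withheld wakeup only blocks the waiter, which is the denial-of-service case SGX does not defend against anyway.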

To summarize, spin-locks (and by extension any higher-level synchronization) can be implemented securely inside an enclave. However, SGX does not protect against DoS attacks, and therefore you cannot assume that a thread will run. This also applies to locking primitives: a thread waiting on a mutex might not be awakened when the mutex is freed. Realizing this inherent limitation, the SDK designers chose to use (untrusted) OCALLs to efficiently implement some synchronization primitives (i.e. non-busy waiting).

[1] SGX SDK Guide

[2] SGX Explained

score 1 · Accepted Answer

qweruiop, regarding your question in the comment (my answer is too long for a comment):

I would still count that as a DoS attack: the OS, which manages the resources of enclaves, denies T access to the resource CPU processing time. But I agree, you do have to design the other threads running in that enclave with the awareness that T might never run. The semantics are different from running threads on a platform you control. If you want to be absolutely sure that the condition variable is checked, you have to do so on a platform you control.

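The sgx_status_t returned by every proxy function (for example, when making an ECALL into the enclave) can be SGX_ERROR_OUT_OF_TCS. So the SDK should handle all the threading for you: just make ECALLs from two different ("untrusted") threads A and B outside the enclave, and the execution flow should continue in two ("trusted") threads inside the enclave, each bound to a separate TCS (assuming 2 unused TCSs are available).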
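As a rough mental model of that behaviour (not the SDK's actual runtime code), the binding of entering threads to TCS slots can be pictured as a small pool: each entry claims a free slot, and an entry that finds none fails with an out-of-TCS status. All names below are hypothetical, and the pool size of 2 is an assumption chosen to match the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_TCS_COUNT 2 /* assumed number of TCS slots in this toy enclave */

/* Hypothetical status codes, loosely mirroring the idea of sgx_status_t. */
typedef enum {
    DEMO_SUCCESS = 0,
    DEMO_ERROR_OUT_OF_TCS = 1
} demo_status_t;

/* One busy flag per TCS slot; static storage starts out all false. */
static atomic_bool demo_tcs_busy[DEMO_TCS_COUNT];

/* Model of entering the enclave: claim the first free slot, if any. */
static demo_status_t demo_enter(int *slot) {
    for (int i = 0; i < DEMO_TCS_COUNT; i++) {
        bool expected = false;
        if (atomic_compare_exchange_strong(&demo_tcs_busy[i], &expected, true)) {
            *slot = i;
            return DEMO_SUCCESS;
        }
    }
    return DEMO_ERROR_OUT_OF_TCS; /* every slot is in use */
}

/* Model of leaving the enclave: the slot becomes available again. */
static void demo_exit(int slot) {
    assert(slot >= 0 && slot < DEMO_TCS_COUNT);
    atomic_store(&demo_tcs_busy[slot], false);
}
```

In this model, threads A and B each get their own slot, and a third concurrent entry is rejected until one of them leaves.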
