[Note: This article was written around 8/9/2018 and published on 10/26/2018. I had to take a break from it to focus on something else. In the meantime, the design issue described below, which I had previously been able to reproduce consistently in practical ways, disappeared completely when I resumed my work a couple of months later. Its disappearance alone did not convince me that the design issue does not exist, so I decided to publish this anyway, in the hope that it might at some point start a healthy brainstorming on the topic at hand.]
Intel® SGX SDK - Edger8r tool code generation and its potential constraint on trusted library design
I recently stumbled on a problem while refactoring. I was attempting to consolidate commonly used features into a trusted static library. The intent was to make that trusted library available to, and shared by, various SGX projects. I am not sure whether the problem I ended up with was the result of a constraint imposed by the Intel® SGX SDK's Edger8r tool or an oversight on my part. Either way, it would be good to know.
Following is a description of the problem:
The Intel® SGX SDK introduces the concept of EDL (Enclave Definition Language) files. EDL files declare the edge routines used for interfacing between the trusted and untrusted environments. At build time, the Edger8r tool parses the EDL file and generates trusted or untrusted proxy/bridge code, as appropriate. When the --header-only option is specified, it generates just the header file corresponding to the EDL file. Otherwise, it also generates the proxy/bridge routine definitions, which eventually get compiled into the project. Along with the proxy/bridge definitions, it generates the relevant bookkeeping data/code, the g_ecall_table and g_dyn_entry_table objects being a couple of examples. The former holds ECALL information and the latter could possibly be a placeholder for OCALL data.
The problem is that both are global data with fixed names, which means there cannot be multiple instances of them in one enclave. This constrains which trusted library, among the many, can compile in the generated proxy/bridge code: only one library can compile the generated definitions into the project if duplicate symbols are to be avoided. That, in turn, constrains the design of the rest of the trusted static libraries specific to a project. The following example illustrates the problem.
Let's assume we want to create a trusted static library "X" that holds common features and is shared by multiple SGX projects or ports. We would prefer this shared library to be self-sufficient, in that the proxy code generated for the EDL files in this library is built into the library itself. Then let's assume a project "P1" creates static libraries "Y1", "Y2" and "Y3", each with its own EDL file but set to generate only headers via the --header-only option. Also, library "Y1" links to "X". The enclave library for project P1, a DLL "Z", links to "Y1", "Y2" and "Y3". Like "X", "Z" also compiles in both the proxy/bridge code and the pertinent bookkeeping data/code generated when an EDL file is processed without the --header-only option. As a result, both "X" and "Z" define symbols with the same names (e.g. g_ecall_table, g_dyn_entry_table), and the linker complains about multiple symbol definitions. This link-time error can be suppressed with the "/FORCE:MULTIPLE" linker option, but that would simply shift the problem to runtime.
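To make the clash more concrete, here is a heavily simplified C sketch of the kind of fixed-name global table involved. The bridge signature, the field layout, the ecall_add bridge and the dispatch_ecall helper are all illustrative assumptions of mine, not the exact output of Edger8r or the actual trusted runtime. Only the symbol name g_ecall_table is taken from the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified bridge signature (assumption; real generated bridges differ). */
typedef int (*bridge_fn)(void *args);

/* Hypothetical bridge for a hypothetical ECALL that adds two integers. */
static int bridge_ecall_add(void *args)
{
    int *v = (int *)args;   /* v[0], v[1]: inputs; v[2]: result */
    assert(v != NULL);
    v[2] = v[0] + v[1];
    return 0;
}

/* One global table per generated file, always under the same fixed name.
 * If library X and enclave Z each compile in their own generated file,
 * each one contributes a definition of this same external symbol. */
const struct {
    size_t nr_ecall;
    struct { bridge_fn ecall_addr; unsigned char is_priv; } ecall_table[1];
} g_ecall_table = {
    1,
    { { bridge_ecall_add, 0 } }
};

/* Stand-in for a runtime that dispatches an ECALL by its table index. */
int dispatch_ecall(size_t index, void *args)
{
    if (index >= g_ecall_table.nr_ecall)
        return -1;
    return g_ecall_table.ecall_table[index].ecall_addr(args);
}
```

With a single copy of this file in the link, dispatching by index works. A second object file that also defines g_ecall_table, as would happen when both "X" and "Z" compile in generated code, gives the linker two external definitions of the same name, and forcing the link leaves only one of the two tables reachable.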
To further confirm that I am not missing something obvious, I combed through the existing SGX libraries shipped with the Intel® SGX SDK using Dumpbin, and scrutinized the EDL files published for those libraries. They do not carry the symbols that would eventually result in a conflict. This would mean that they were either compiled with --header-only or that their EDL content did not require generating the conflicting symbols. In other words, the libraries Intel distributes would not, by design, have hit the issue I encountered. The model I described above, by contrast, involves cascading dependencies in which one or more libraries must be self-contained, symbol-wise, to be useful across multiple independent projects, and so it appears to run into this design constraint.
Intel's initial vision for SGX might have confined the use of enclaves to too narrow a scope to warrant considering the design and maintenance of extensive trusted shared libraries, the resulting dependencies, and their overall impact. Intel® SGX stands to gain from being in tune with the constraints faced by adopters of its technology when it is used beyond that initial narrow scope. This can only encourage further adoption of the technology, especially by classic applications hoping to capitalize on the security it provides.
As a solution to the above-mentioned problem, the Edger8r tool could generate more specific symbol names per trusted library and collate proxy/bridge information from multiple libraries at runtime, or use an equivalent approach, so that the constraint is not shifted to the design level.
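A minimal C sketch of what such an approach could look like follows. Everything in it is hypothetical: the per-library table names (g_ecall_table_X, g_ecall_table_Z), the ecall_entry and ecall_table types, and the register_ecall_table and find_ecall functions are names I made up for illustration. Neither the Intel® SGX SDK nor Edger8r provides them, and a real design would also have to cover the OCALL side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*bridge_fn)(void *args);

typedef struct {
    const char *name;   /* ECALL name, used here as the lookup key */
    bridge_fn   fn;     /* bridge routine for that ECALL */
} ecall_entry;

typedef struct {
    const char        *library;   /* owning trusted library */
    size_t             nr_ecall;
    const ecall_entry *entries;
} ecall_table;

/* What could be generated for library X: a table with a library-specific name. */
static int x_bridge_double(void *args) { int *v = (int *)args; *v *= 2; return 0; }
static const ecall_entry x_entries[] = { { "x_double", x_bridge_double } };
const ecall_table g_ecall_table_X = { "X", 1, x_entries };

/* What could be generated for enclave Z: a different symbol, so no clash. */
static int z_bridge_negate(void *args) { int *v = (int *)args; *v = -*v; return 0; }
static const ecall_entry z_entries[] = { { "z_negate", z_bridge_negate } };
const ecall_table g_ecall_table_Z = { "Z", 1, z_entries };

/* Hypothetical runtime registry that collates the per-library tables. */
#define MAX_TABLES 8
static const ecall_table *g_tables[MAX_TABLES];
static size_t g_nr_tables;

int register_ecall_table(const ecall_table *table)
{
    assert(table != NULL);
    if (g_nr_tables == MAX_TABLES)
        return -1;               /* registry full */
    g_tables[g_nr_tables++] = table;
    return 0;
}

/* Search every registered table; returns NULL if no ECALL has that name. */
bridge_fn find_ecall(const char *name)
{
    for (size_t t = 0; t < g_nr_tables; ++t)
        for (size_t i = 0; i < g_tables[t]->nr_ecall; ++i)
            if (strcmp(g_tables[t]->entries[i].name, name) == 0)
                return g_tables[t]->entries[i].fn;
    return NULL;
}
```

In this sketch each library would register its own table once during enclave initialization, and the dispatcher would then resolve ECALLs across all registered tables. Lookup by name is used only to keep the example short; an index-based scheme with per-library offsets would be closer to a table-driven dispatch.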