The SecreC programming language
ℹ️ This tutorial is adapted from the dissertation "Programming Languages for Secure Multi-party Computation Application Development" by Jaak Randmets (2017).
SecreC 2 is a programming language designed for implementing data analysis algorithms on the Sharemind platform. The main goal of the language is to make the process of implementing secure computation algorithms easy for programmers who are not familiar with secure computation or cryptography in general. We achieve this by
- providing an imperative C-like syntax which is ubiquitous in data analysis literature and is generally far more familiar to programmers than other computing paradigms;
- hiding that the program will possibly be run on multiple computers;
- syntactically not differentiating secure computations from regular public computations, when it is reasonable to do so;
- offering generic programming facilities that enable code reuse across public computations and different security schemes; and
- providing a standard library that builds complex functionality on top of simple primitives, thereby allowing security schemes to only implement a small number of basic operations.
However, the language remains domain-specific. We found that general-purpose languages are not a good fit for secure computation. Firstly, in MPC the programmer has to indicate whether data is public or private, and data can be implicitly converted only from the public to the private domain, not the other way around. Treating all data as private can incur too great a performance cost. Secondly, we provide rich array processing facilities to make code vectorization easier, which, in turn, enables common performance optimizations. Performance has dictated the language design in more than one way: we also do not support implicit type conversions, because they can be extremely costly in non-public domains. There are many more subtle differences motivated by security: array accesses are always dynamically checked, global state is shunned (but still supported), pointers and references are not supported, and so on.
In the Language overview section we describe SecreC to readers who are interested in learning the language or who only want a high-level overview. The Writing efficient SecreC section discusses how to write code that performs well. Due to very large constant factors and latencies, performance is an extremely important consideration in secure computing. While we talk about the SecreC language, the techniques are generally applicable to the secure computation setting.
Hello world
The following listing is the classical "Hello, world!" program written in SecreC.
void main() {
    print("Hello, World!");
}
If it is saved as a file named hello-world.sc then it can be compiled using the SecreC compiler with the following command:
scc --input hello-world.sc --output hello-world.sb
The output file hello-world.sb is compiled bytecode which is interpreted by the virtual machine inside the Sharemind Application Server. The bytecode file has to be copied to the scripts folder of the server installation. The default path is /var/lib/sharemind/programs/.
The program can then be run using:
sharemind-runscript --run hello-world.sb
Language overview
Familiar syntax
A major goal of SecreC is to offer familiar syntax. It is unrealistic to expect users to be fluent in functional languages or in complex type systems that provide strong security guarantees. For this reason SecreC has a C-like syntax, which is ubiquitous amongst popular programming languages. In fact, just by looking at a complete "Hello, World!" program written in SecreC it is impossible to tell that the language is somehow unique or related to MPC.
// Simple "Hello, World!" program.
void main() {
    print("Hello, World!");
}
SecreC supports regular public computations the syntax of which is very
close to other C-like languages. Arithmetic, bitwise, logical and
relational operations are all supported. All integer data types are
strictly sized from 8 to 64 bits and include signed and unsigned
variants. We also support Booleans, strings, floating-point values and
arrays of primitive data types. Basic support for user-defined
structures is also available. The following listing demonstrates simple arithmetic on signed 64-bit and unsigned 32-bit integers. Note that without the explicit type conversion when initializing the variable z, the code will not compile and a type error will be raised.
void main() {
    int64 x = 25;
    int64 y = -10;
    print("x + y = ", x + y);
    print("x * y = ", x * y);
    uint32 z = (uint32) (x - (x / y));
    print("z = ", 1 + 8 * z);
}
A familiar C-like syntax for branching, looping and user-defined functions is available. Function parameters are always passed by value. This includes structures, arrays and non-public values. See the following example on defining and calling functions. Note that the 27 :: uint32 syntax is used to specify that the constant 27 is of 32-bit unsigned integer type. We could have also used the C-like cast (uint32) 27, but the former guarantees that no type conversion happens.
uint32 nextCollatz(uint32 n) {
    if (n % 2 == 0)
        return n / 2;
    else
        return 3 * n + 1;
}

void printCollatz(uint32 n) {
    while (n != 1) {
        print(n);
        n = nextCollatz(n);
    }
}

void main() {
    printCollatz(27 :: uint32);
}
Performing secure computations
SecreC distinguishes between public and private data at the type system level. Every type in SecreC has two components: the data type and the security type. When not specified, the security type is implicitly assumed to be public. To declare private data it is not sufficient to state that the data is simply "private". The programmer has to be explicit about what kind of security scheme is used and has to specify the name of the concrete deployment of this scheme. This is because Sharemind MPC facilitates the use of many different security schemes and even the concurrent use of multiple instances of a given scheme.
Having an omitted security domain default to public is not a security issue. The type system guarantees that private data is never implicitly converted to public data. Even if the programmer accidentally leaves out a non-public domain, a type error will be raised. It is possible to construct an artificial example where this is not true, but it would have to involve (indirect) declassifications. Another reason for having the default domain be public is that SecreC programs contain a fair share of public code. For example, loop counters and array indices are usually public.
To see this in action, consider the code in the following listing where we first declare a security scheme (or kind) called shared3p and then declare a concrete security domain of that kind. In the main function we declare two variables that are in the shared3p domain, compare them, reveal the result, and print it. This example only involves computing parties. The program does not receive input from input parties or produce results for output parties. The computing parties only log the result of the declassification.
kind shared3p {
    type bool;
    type uint;
}

domain pd_shared3p shared3p;

void main () {
    pd_shared3p uint x = 3;
    pd_shared3p uint y = 5;
    public bool z = declassify(x < y);
    print("x < y = ", z);
}
Private variables are defined similarly to public ones, except the security type is written before the data type. Explicitly writing the public domain is allowed. There are a few things of note in this example: public values can be implicitly converted to private, the regular syntax for comparison can be used to invoke secure comparison, and conversion to the public domain requires an explicit call to the declassify function. In order for the Sharemind Application Server to be able to execute the code above, it has to be configured to supply the mentioned security scheme and a concrete instance of it. The security schemes are implemented via loadable modules and concrete instances (so-called domains) have to be explicitly configured. To configure a protection domain a user has to specify the domain's name, which concrete servers are involved, and a loadable module that associates operation names with executable code. The details of server configuration are not included in this tutorial.
Input and output
In real applications private variables are rarely initialized directly from public ones like we just saw. A more realistic program would be one that, for example, gets the value of x from one client and y from another. As a result of executing the example, both input providers would be notified whether x is less than y. This can be considered an implementation of the Millionaires' problem[1]. SecreC 2 code for such an application is presented in the following listing. Instead of declaring a protection domain kind we use the appropriate standard library module that already declares that kind and defines various operators and functions for it. Inputs are no longer hard-coded and instead are given as arguments to the program. The result is published to output parties.
import shared3p;

domain pd_shared3p shared3p;

void main() {
    pd_shared3p uint x = argument("IncomeAlice");
    pd_shared3p uint y = argument("IncomeBob");
    publish("result", x < y);
}
Notice that the example still only directly involves computing parties. It does not state where the "IncomeAlice" and "IncomeBob" arguments come from or who receives the published result. This logic is implemented outside of SecreC 2. The very same code could be used in many different applications. In one case it could be used for an application that involves three parties, two of which provide input and all of which receive the output. Or it could also involve a web interface for supplying data. In that case input, computing and result parties could all be distinct. For a real-world example of how MPC can be deployed for web applications see the dissertation[2] of Riivo Talviste.
Private conditionals
Private values can be used exactly like public ones in most places, such as expressions, function arguments, variable declarations, and return values. However, they cannot be branched over. For example, if we want to conditionally assign a secret value y to a secret value x depending on a private Boolean b, we are not allowed to write if (b) { x = y; }. The same result can be achieved by evaluating x = (uint) b * (y - x) + x, which assigns y to x if b is true and leaves x unchanged otherwise. This pattern can be abstracted into a function.
Many MPC programming languages, for example ObliVM[3] and PICCO[4], allow branching over secure Booleans in some restricted cases. We have decided not to adopt this approach, mainly because we have not had the need for it. While private conditionals do help developers a little, in our experience the development effort does not increase significantly in their absence. There are some other, less important reasons for not having private conditionals.
- When branching over secure Booleans, it is not possible to perform any public side effects such as printing output or performing public assignments. The type system or program analysis would need to detect such conditions. Encoding such information in types makes the language more complicated. On the other hand, achieving the same results via program analysis hides from the programmer whether a function can be used under a private conditional or not.
- Secure branches are significantly slower than public ones because both branches have to be evaluated and the correct answer then selected based on the private Boolean. This argument would be weak if private conditionals were indistinguishable from public ones, but they are not.
- Supporting secure conditionals raises the question of why not also provide support for secure looping with an upper bound on the number of iterations.
In summary, supporting conditionals over secure Booleans adds value by helping the programmer, but it also complicates the language, its type system, and the compiler implementation. Having all control flow be public leads to a more approachable language and a simpler compiler. Given that private conditionals are not overly common, we opted not to offer convenient syntax for them. This limitation is conservative in the sense that it can be relaxed in a backwards-compatible manner if it turns out that private conditionals are frequently needed.
Protection domains and kinds
We provide two definitions that explain our motivation behind the choice of keywords for declaring a security scheme and its instances.
A protection domain kind (PDK) is a set of data representations, algorithms and protocols for computing on and storing protected data.
A protection domain (PD) is a set of data that is protected with the same resources. There is a well-defined set of algorithms and protocols for computing on that data while keeping the protection.
Each protection domain belongs to one protection domain kind and each kind may have several protection domains. A classical example of a PDK is secret sharing with an associated data representation and protocols for construction, reconstruction and arithmetic operations on shared values. A PD in this PDK would specify the actual parties doing the secret sharing. Another example of a PDK is fully homomorphic encryption with operations for encryption, decryption and algorithms for performing arithmetic operations such as addition and multiplication on encrypted values. For the FHE PDK, each PD is associated with a secret key. It is also possible to consider non-cryptographic methods for implementing PDK-s using trusted hardware or virtualization. Intel® SGX offers an example of a practical PDK with a single physical computer evaluating a program.
In general a PDK has to provide:
- A list of data types supported by the PDK.
- For each data type in the PDK:
  - a classification function for converting public data to the protected representation and a declassification function for revealing private data,
  - protocols or functions for operating on protected values.
- The protocols performing secure computation should be universally composable so that they can be safely combined into programs without losing security guarantees.
Here we state that any PDK needs to allow its data to be converted to and from the public domain. This is not strictly necessary and a PDK may instead allow converting data to and from some other PDK. However, we have not found a practical case where restricting public conversions is useful.
Array processing
Rarely do applications of secure computation deal with single values. More often the goal is to learn some aggregate information from a large set of data without revealing the individual values. Naturally, to process many records a programming language has to support arrays or some other form of collection. As we will later see, solid support for data-parallel processing is also an attractive feature. SecreC supports arbitrary-dimensional arrays with publicly known sizes. Elements are accessed and modified via public indices.
A simple aggregation procedure is to count the number of occurrences of a value in a set of data. This can easily be achieved, as shown in the following listing, by iterating over the input array, comparing with the value, and incrementing the counter whenever a match is found. Counter increments are achieved by casting the result of the equality check to an integer and adding that to the counter. When the comparison results in true the counter is incremented by 1 and otherwise by 0. Double square brackets, such as [[1]], denote the array's dimensionality (the number of dimensions). For instance, public integer matrices are represented with int[[2]]. We use double square brackets to be distinct from C where int[4] denotes a one-dimensional array of length four. Stacking multiple square brackets as int[][][][] would mean that special syntax would be needed for dimensionality-generic code.
pd_shared3p uint count(pd_shared3p uint32[[1]] data,
                       pd_shared3p uint32 value) {
    pd_shared3p uint matchCounter = 0;
    for (uint i = 0; i < size(data); ++i)
        matchCounter += (uint) (data[i] == value);
    return matchCounter;
}
SecreC puts a lot of emphasis on array processing and tries to make it convenient. For example, many language constructs are lifted to operate pointwise on arrays. One can add two arrays using the exact same syntax as for adding two scalars. Binary arithmetic also allows one operand to be an array and the other a scalar. Other than being less verbose, vectorized operations are in many situations more efficient than loops of scalar operations. It is possible to rewrite the counting function in a more compact style by comparing the value to the data pointwise and summing the resulting Booleans as sum(data == value), where sum is a function defined in the standard library. We will later see that, in addition to conciseness, this version also performs significantly better than the one in the last listing.
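As an illustration, the vectorized counting function can be written as the following sketch, relying only on the pointwise comparison and the standard library sum mentioned above.
pd_shared3p uint count(pd_shared3p uint32[[1]] data,
                       pd_shared3p uint32 value) {
    // Compare pointwise and sum the resulting Boolean vector in one pass.
    return sum(data == value);
}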
Reshaping and moving data between arrays is very common, and to avoid writing verbose loops for such a typical task we support indexing arrays with ranges. Given a one-dimensional array arr, the expression arr[b:e] results in an array of length e-b where the elements are taken from the range of indices from b to e-1. If omitted, the lower bound is implicitly assumed to be 0 and the upper bound is assumed to be the length of the array. This syntax extends naturally to arbitrary-dimensional arrays and can be mixed with regular indexing. For example, the expression mat[1, :] results in the second row of the matrix mat.
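The following sketch illustrates range indexing on small public arrays; the concrete values are only for illustration.
int[[1]] arr = {1, 2, 3, 4, 5};
int[[1]] middle = arr[1:4];     // elements at indices 1, 2 and 3: {2, 3, 4}
int[[1]] prefix = arr[:2];      // {1, 2}
int[[1]] suffix = arr[3:];      // {4, 5}
int[[2]] mat(2, 3);             // a 2x3 matrix, initially filled with zeroes
int[[1]] secondRow = mat[1, :]; // the second row, of length 3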
It is possible to change the number of dimensions and the dimension sizes of an array. The only requirement is that the number of elements of the resulting array be the same as in the input array. For example, a matrix mat can be flattened into a one-dimensional array with the reshape(mat, size(mat)) expression. More generally, reshape takes an arbitrary number of arguments after the first one that indicate the shape of the resulting array. SecreC guarantees that arrays are stored in generalized row-major order. When we flatten a matrix, its first row occurs first in the resulting array, then the second row, and so on. Scalars are allowed to be reshaped into any shape or size.
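A small sketch of reshape in action on public data (the values are again only illustrative):
int[[1]] values = {1, 2, 3, 4, 5, 6};
int[[2]] mat = reshape(values, 2, 3);    // a 2x3 matrix with rows {1, 2, 3} and {4, 5, 6}
int[[1]] flat = reshape(mat, size(mat)); // back to {1, 2, 3, 4, 5, 6}, in row-major order
int[[2]] tall = reshape(mat, 3, 2);      // the same elements arranged as a 3x2 matrix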
Protection domain polymorphism
Sharemind is not limited to using a single protection domain kind and can even support the use of multiple protection domains concurrently. This feature makes it possible to reuse code between different security schemes. For example, the function for counting occurrences in an array can be implemented for any scheme that supports comparison, addition, and conversion from Booleans to integers. However, the earlier count implementation is not usable across different security schemes but only on one particular instance of the additive three-party scheme.
To enable reuse across security schemes, SecreC supports protection domain polymorphism using syntax similar to C++ templates. A generic version of the counting function is presented in the following listing where the function has been made polymorphic over any protection domain D but the rest of the code remains unchanged compared to the previous implementation. Note that this includes the public security domain, making the function also usable on public data. In many cases algorithms have a single implementation across multiple protection domains.
template <domain D>
D uint count(D uint32[[1]] data, D uint32 value) {
    D uint matchCounter = 0;
    for (uint i = 0; i < size(data); ++i)
        matchCounter += (uint) (data[i] == value);
    return matchCounter;
}
Template functions are type checked similarly to C++ templates. Type correctness of a template function body is verified only when the function is instantiated for some concrete protection domain. The definition itself is only syntactically verified. Unfortunately this means that when the counting function is called on a protection domain that does not support equality, a type error is raised at the location of the comparison and not where the function is called from. Some other languages overcome a similar problem by restricting type parameters to only types that are known to offer some necessary functionality (such as equality). The restriction can be statically verified at the call site and an error is raised when the constraints are not satisfied. For example, Haskell has type classes, Rust has traits, and C++ has concepts. All these solutions add considerable complexity to the language. Thus, we find it an acceptable tradeoff to not have any form of bounded quantification. SecreC programs are at most medium-sized and typically span less than a few thousand lines of code.
Often it is not sufficient to provide a single implementation of a particular function that works on all protection domains. Sometimes we want a faster implementation for a certain protection domain, or the default implementation uses operations that a domain does not support. Function overloading can typically be used to solve those cases.
Consider the function choice for choosing a value based on a Boolean. It takes a Boolean value and two arguments, and returns the first argument if the Boolean is true and the second argument if the Boolean is false. SecreC does not support regular if-expressions over secure conditionals; thus, this kind of function is often useful for filling the role of branching. In the following listing the first definition is polymorphic over any protection domain but the second one is restricted to a domain based on Yao's garbled circuits scheme. The first definition can be considered a more generic default implementation and it relies on multiplication, addition and subtraction. However, for Yao's scheme multiplication can be a rather costly operation that we want to avoid. Instead, in the case of the yao2p kind, the Boolean is converted to a suitable bitmask and the result is computed via bitwise conjunction and XOR. Note that the second implementation is also not a good default as it is not efficient for schemes based on additive secret sharing. The polymorphic overload that is specialized for the yao2p protection domain kind works for any concrete domain of that kind.
template <domain D>
D uint32 choice(D bool b, D uint32 x, D uint32 y) {
    return (uint32) b * (x - y) + y;
}

template <domain D : yao2p>
D uint32 choice(D bool b, D uint32 x, D uint32 y) {
    D uint32 mask = -(uint32) b;
    return mask & (x ^ y) ^ y;
}
When multiple matching definitions are found for the function being called, the most restricted match will be selected. In the current example, the second definition has a more restrictive signature than the first. When multiple equally good matches are found, a type error is raised.
We have built a standard library using polymorphic functions, overloading and modules. The code is structured into a generic stdlib module and specialized modules for each protection domain. The stdlib module defines functions that are polymorphic, with some overloaded on the public domain. Each protection domain kind module can overload the standard library functions and provide the user with more efficient implementations. For example, we supply the shared3p module and some accompanying modules that provide more efficient operations.
In addition to protection domain polymorphism, we also allow functions to be generic over data types and array dimensionalities. For example, the function defined in the following listing reshapes an arbitrary-dimensional array of any type in any protection domain into a one-dimensional array.
template <domain D, type T, dim N>
D T[[1]] flatten(D T[[N]] arr) {
    return reshape(arr, size(arr));
}
Writing efficient SecreC
Efficiency is often a concern in secure computation. Secure algorithms that are implemented directly based on some public version can be infeasibly slow due to high round counts or impractical network bandwidth requirements. This is a problem because most secure computation protocols require network communication. Some security schemes require a constant number of communication rounds but high network bandwidth, and some schemes have low bandwidth requirements but need to perform many communication rounds. In either case, a programmer must take care when implementing secure algorithms.
In this section we provide some generic techniques that help improve the performance of secure programs. We have the following suggestions, which we elaborate on below and show how they are applied in SecreC.
- Prefer parallel execution to sequential execution. Even for reasonably large inputs, applications should take a sublinear number of communication rounds.
- In many cases the SIMD style of parallelism is sufficient, but sometimes more involved techniques need to be applied to lower the number of communication rounds. For example, associative operations can be aggregated in a logarithmic number of rounds.
- Prefer operations that are not costly in the chosen protection domain. For example, in additive schemes addition does not require network communication.
Parallel execution and SIMD
SecreC operates on the assumption that the underlying computation protocols are universally composable. This property allows secure operations to be executed either in parallel or sequentially without loss in security guarantees.
When we execute two consecutive multiplications, we invoke the underlying secure multiplication protocol twice in sequence. Regardless of how many communication rounds the underlying multiplication protocol performs, two sequential multiplications take about twice as much time as one. Compared to regular computers where operations take nanoseconds, in secure computation operations have latencies of several milliseconds because of network communication. This is a difference on the order of \(10^6\).
Such a drastic overhead might make secure computing sound completely infeasible. However, instead of comparing only instruction latencies, we should also look at parallel execution. Namely, in a networked setting it is possible to perform thousands of operations in parallel in the same amount of time that it takes to execute a single operation. This is because the network is optimised for high-bandwidth usage. Thus, for reasonably efficient applications sequential execution should be avoided and parallel execution should be preferred.
SecreC has opted for data parallelism (performing one task on many pieces of data at the same time) over task parallelism (performing many tasks at the same time). Most operations in the language can be applied to arrays in which case the operation is executed pointwise on the data. This is also known as the single instruction, multiple data or SIMD approach.
To demonstrate the issue with high latency let us look at an implementation of dot product as usually written in a regular imperative programming language. The common solution is to iterate over both arrays adding the product of the corresponding elements to the accumulator. In most security schemes the multiplication operation requires the execution of a protocol that takes at least a single communication round. The iterative solution would then take a linear number of communication rounds with respect to input length. Due to network latencies this makes the algorithm infeasible to use in real applications with non-trivial input sizes.
template <domain D>
D uint32 dotProduct(D uint32[[1]] x, D uint32[[1]] y) {
    assert(size(x) == size(y));
    D uint32 result = 0;
    for (uint i = 0; i < size(x); ++i)
        result += x[i] * y[i];
    return result;
}
To optimize the previous example we notice that we can multiply the input arrays pointwise. This is a data parallel operation and takes the same number of rounds as multiplication of scalar values. The result of the product is stored in a temporary array that is then summed. In the additive scheme, integer addition is a local operation; thus, the following implementation is efficient enough because iterative addition does not increase the round count of the procedure.
template <domain D>
D uint32 sum(D uint32[[1]] arr) {
    D uint32 result = 0;
    for (uint i = 0; i < size(arr); ++i)
        result += arr[i];
    return result;
}

template <domain D>
D uint32 dotProduct(D uint32[[1]] x, D uint32[[1]] y) {
    return sum(x * y);
}
Round-efficiency
In some schemes, such as Boolean circuit based ones, addition is not a local operation, meaning that sequential addition still incurs a linear round count. This shortcoming can be overcome by iteratively performing data-parallel additions between the lower and upper half of the array until a singleton array remains. For an input of length \(n\) this procedure takes \(O(\log n)\) communication rounds. A possible implementation that also takes care of non-power-of-two inputs is provided in the following listing. We could use round-efficient summation as the default implementation but it turns out to be slower in the case of public data and security schemes where addition is a local operation.
template <domain D : yao2p>
D uint32 sum(D uint32[[1]] arr) {
    uint n = size(arr);
    if (n == 0) return 0;
    while (n > 1) {
        uint smallHalf = n / 2; // Rounds down
        uint bigHalf = n - smallHalf;
        arr[:smallHalf] += arr[bigHalf:n];
        n = bigHalf;
    }
    return arr[0];
}
This technique is generic and is applicable to any associative binary operation. For example, secure floating-point addition as implemented by Kamm and Willemson[5] performs multiple communication rounds. Summing a sequence of floating-point values is a common operation that is often performance-critical. While floating-point addition is not strictly associative, it is nevertheless very useful to apply this technique for performance reasons.
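As a sketch, the same halving pattern could be applied to floating-point summation in a shared3p domain. This is only an illustration of the technique; the standard library provides its own summation routines.
template <domain D : shared3p>
D float64 sumFloat(D float64[[1]] arr) {
    uint n = size(arr);
    assert(n > 0);
    while (n > 1) {
        uint smallHalf = n / 2; // Rounds down
        uint bigHalf = n - smallHalf;
        // One data-parallel addition per iteration, O(log n) iterations in total.
        arr[:smallHalf] += arr[bigHalf:n];
        n = bigHalf;
    }
    return arr[0];
}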
Security scheme specific techniques
Parallelism and round-reduction techniques are useful tools in general, but they are especially important in a secure setting because of the very high constant factors involved. In many cases it is also possible to exploit the properties of the security scheme to achieve even better performance. We shall see a few of these techniques here and take a look at how they can be implemented in SecreC.
Consider the task of checking whether at least one value in a Boolean array is true. An implementation that uses a parallel algorithm to compute the disjunction of all elements takes \(O(\log n)\) rounds for an \(n\)-element input array. In practice this is a perfectly reasonable implementation but it is possible to do better. Asymptotically better round complexity is achieved by converting the Boolean array to additively shared integers, adding up the integers, and checking whether the sum is non-zero. This exploits the fact that addition of additively shared integers is a local operation. In order not to overflow the sum we must use at least \(\lceil \log n \rceil\)-bit integers. Because conversion from Boolean to integer takes a constant number of rounds and integer comparison has logarithmic round complexity, this yields an algorithm with \(O(\log \log n)\) round complexity. Our implementation in the following listing uses 64-bit integers, which is sufficient for all practical purposes.
template <domain D : shared3p>
D bool any(D bool[[1]] arr) {
    return sum((uint64) arr) != 0;
}
The Performance of shared3p protocols section in the SecreC reference contains performance measurements of shared3p protocols. This information can be useful for optimising functions for shared3p inputs.
It is difficult to adapt many common algorithms to the secure computation setting. For example, most popular sorting algorithms branch their control flow depending on the results of comparisons between the elements of the input. These sorting algorithms cannot be adapted to the secure computation setting in a straightforward manner because such control flow dependencies leak information, namely the ranks of the elements and, thus, the relative ordering of the secure inputs. However, if the original ordering of the data is hidden by randomly shuffling (permuting) the input then the control flow does not reveal as much information. In fact, if all input values are unique then no information is revealed[6][7].
A useful and efficient technique available for many different secret sharing schemes is to randomly shuffle the input to hide data dependencies. The random shuffling scheme proposed by Laur, Willemson and Zhang[8] is constant in round complexity and \(O(n\log n)\) in communication complexity. Thus, this primitive facilitates the implementation of \(O(n \log n)\) sorting algorithms. Many practical data-oblivious sorting algorithms exist — bitonic sorter[9], Batcher odd–even mergesort[9], pairwise sorting network[10] — but they all have \(O(n \log^2 n)\) time complexity, take \(O(\log^2 n)\) rounds, and are not stable. Numerous applications are built on top of oblivious sorting[11]. The shuffling primitive has also found use in oblivious database linking[12].
Trade-off between efficiency and privacy
When writing MPC code there is almost always a trade-off between efficiency and security. An implementation that does not reveal any information can be impractically slow. By revealing some sensitive information that is deemed acceptable to leak, we can speed up the program significantly. One such example is sorting, where the best practical oblivious algorithms take \(O(n \log^2 n)\) time. When it is acceptable to reveal the results of comparisons it is straightforward to adapt any \(O(n \log n)\) sorting algorithm to the secure computation setting[7][6]. We shall see how SecreC gives the programmer the flexibility to make trade-offs between efficiency and privacy.
As a small case study we will implement quickselect[13] (also known as Hoare’s selection algorithm) for finding the \(k\)-th smallest element of an unsorted sequence. On average the public algorithm runs in linear time. In this section we show how a naive secure implementation of the algorithm has \(O(n^2)\) running time but by revealing a little information about the input sequence we can achieve \(O(n \log n)\) average-case complexity, and by revealing even more we get an implementation that works in linear time.
Public quickselect operates similarly to quicksort. The algorithm operates iteratively on a working list. In every iteration we pick a pivot from the list and split the remaining list into two parts: elements smaller than the pivot, and the rest of the elements. Let the list with smaller elements have \(l\) elements. If \(k = l\) then we are done as the pivot is the \(k\)-th smallest element (we start counting from \(0\)). If \(k < l\) then the answer can be found in the list with smaller entries, and we just drop the larger elements and continue. If \(k > l\) then the element can be found in the list with larger entries. In this case we drop the smaller elements and continue to look for the \((k - l - 1)\)-th smallest element in the remaining sequence.
Usually, the development of a privacy-preserving algorithm starts from a public implementation or description. In the following listing we have implemented quickselect in SecreC so that it only operates on public data. It mostly follows the algorithm description from earlier but we have already given some forethought to the secure implementation. Namely, we perform all comparisons between the pivot and the rest of the list in parallel to keep the round count minimal. The results of the comparison are stored in the mask array and the variable \(l\) is computed by summing the bit vector. If \(k > l\) we flip the mask. After the checks we filter the list according to the mask. To keep the presentation simple, we have not optimized it: the algorithm does not operate in-place and traverses the sequence more often than strictly needed.
int quickselect(int[[1]] list, uint k) {
    assert(k < size(list));
    int pivot;
    while (true) {
        pivot = list[0]; // Random selection is better
        list = list[1:];
        bool[[1]] mask = list < pivot;
        uint l = sum(mask);
        if (k == l) break;
        if (k > l) mask = ~mask;
        if (k > l) k = k - l - 1;
        list = filter(list, mask);
    }
    return pivot;
}
Given an \(n\)-element input list, in the worst case our implementation may take \(n\) iterations, leading to an \(O(n^2)\) algorithm. Consider the case where the list is already sorted and we are looking for the \((n-1)\)-th smallest element. In such a case, during every iteration, the pivot is chosen to be the first element of the running sequence. Because it is the smallest one, no elements other than the pivot are eliminated from consideration. We can improve the algorithm by selecting the pivot randomly, but this does not improve the worst case as every time the smallest element could be selected. A perfectly secure version has to account for the worst-case possibility. Thus, if we do not want to leak any information we have to perform all \(n\) iterations of the algorithm. But such an implementation is unacceptably slow and better ones clearly exist. For example, we could simply sort the input sequence using some oblivious sorting network and pick the \(k\)-th element of the result. That yields a much better \(O(n \log^2 n)\)-time algorithm.
If we are willing to accept leaking the number of iterations that the algorithm performs, we can do much better. Unfortunately, this may actually give away a lot about the original data. For example, when we select the first element as the pivot during each iteration and the algorithm performs \(n\) iterations then we learn that the input list is sorted. Randomized pivot selection makes such a leak unlikely but does not eliminate it.
template <domain D>
D int quickselectPrivate(D int[[1]] list, uint k) {
    assert(k < size(list));
    D int pivot;
    while (true) {
        pivot = list[0];
        list = list[1:];
        D bool[[1]] mask = list < pivot;
        D uint l = sum(mask);
        if (declassify(k == l)) break;
        mask = choose(k > l, ~mask, mask);
        k = choose(k > l, k - l - 1, k);
        list = partition(list, mask);
    }
    return pivot;
}
The secure implementation follows quite straightforwardly from the public one. To stop iterating, we need to declassify the result of the comparison \(k = l\). We have to update the variables mask and k obliviously. Updating the working list, however, is tricky. Namely, we cannot simply remove the unnecessary elements, as this would give away more information than we would like. One possible solution is to use an additional bit vector indicating whether an element in the list is actually there or not, but this would complicate the implementation. A simpler solution is to replace the elements that are not needed with some reserved value that is guaranteed to be larger than all others, and move them to the end of the sequence.
Secure partitioning is performed in the following way. First we replace the elements of the sequence corresponding to false values in the mask with maximal integer values, and then we sort the sequence by the Boolean mask, moving true values to the front. This can be done similarly to the Boolean counting sort step of the radix sort in [2]. Sorting by a Boolean vector can be done in effectively linear time; hence, the partitioning is also linear-time and takes a constant number of communication rounds. Given that, we can say that on average the algorithm takes \(O(n \log n)\) time and \(O(\log n)\) communication rounds. This is already as good as sorting-based solutions.
To go even further, if we are willing to leak the number of elements smaller than the pivot on every iteration, we can achieve a linear-time solution. When declassifying \(l\) we can, instead of partitioning, remove the unneeded elements, thus reducing the size of the working list. To improve security we can pick the pivot securely. This can be done by securely shuffling the working list at the start of every iteration and continuing as before.
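A possible sketch of this leakier variant is given below. It assumes a shared3p domain with the standard library shuffle function; the removeFalse helper is purely illustrative and filters a private vector by a public mask.
template <domain D : shared3p>
D int[[1]] removeFalse(D int[[1]] arr, bool[[1]] mask) {
    // Count the surviving elements to size the output vector.
    uint n = 0;
    for (uint i = 0; i < size(mask); ++i) {
        if (mask[i]) ++n;
    }
    // Copy over only the elements whose (public) mask value is true.
    D int[[1]] out(n);
    uint j = 0;
    for (uint i = 0; i < size(arr); ++i) {
        if (mask[i]) {
            out[j] = arr[i];
            ++j;
        }
    }
    return out;
}

template <domain D : shared3p>
D int quickselectLeaky(D int[[1]] list, uint k) {
    assert(k < size(list));
    D int pivot;
    while (true) {
        list = shuffle(list);           // pick the pivot securely
        pivot = list[0];
        list = list[1:];
        D bool[[1]] mask = list < pivot;
        uint l = declassify(sum(mask)); // leak the number of smaller elements
        if (k == l) break;
        if (k > l) {
            mask = ~mask;
            k = k - l - 1;
        }
        list = removeFalse(list, declassify(mask)); // shrink the working list
    }
    return pivot;
}
Every declassify call in this sketch is a deliberate leak: revealing the comparison counts and the declassified mask is exactly the trade that buys the linear running time discussed above.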
Remove redundant computations
Any arithmetic operation on private values (except integer addition) takes a significant amount of time. You may be used to being able to copy and paste subexpressions in standard programming languages because the amount of computation time spent on the expression is insignificant. This is often not the case in SecreC. Store the result of the expression in a variable if you wish to reuse it.
For example, a value is often squared like this:
pd_shared3p float64 a = ...;
pd_shared3p float64 b = ...;
pd_shared3p float64 c = (a + b) * (a + b);
Instead you should create a temporary variable for the intermediate value:
pd_shared3p float64 a = ...;
pd_shared3p float64 b = ...;
pd_shared3p float64 sum = a + b;
pd_shared3p float64 c = sum * sum;
If you use a loop in your program, move every computation that does not depend on the loop variable outside of the loop. Evaluating an expression 100 times when it could be evaluated once is very time-consuming.
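For example, the following sketch (with illustrative variable names) shows the same update written three ways, from worst to best.
pd_shared3p float64[[1]] data = ...;
pd_shared3p float64 a = ...;
pd_shared3p float64 b = ...;
// Slow: (a + b) * (a + b) is recomputed on every iteration.
for (uint i = 0; i < size(data); ++i) {
    data[i] = data[i] + (a + b) * (a + b);
}
// Better: hoist the loop-invariant expression out of the loop.
pd_shared3p float64 shift = (a + b) * (a + b);
for (uint i = 0; i < size(data); ++i) {
    data[i] = data[i] + shift;
}
// Best: vectorize the whole update and avoid per-element syscalls.
data = data + shift;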
Avoid floating point computations
Let us take a look at the performance table of Sharemind MPC operations. We can notice that if we multiply 10000-element int64 vectors we can do 11700 operations per second. If we use the float64 datatype we can only do 121 operations per second. This is a difference of two orders of magnitude. If the inputs to your program are integers, keep them as integers for as long as possible.
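As a sketch (assuming the shared3p casts from integers to floating point), an average can be computed with integer arithmetic throughout, converting to floating point only for the final division.
pd_shared3p int64[[1]] measurements = ...;
pd_shared3p int64 total = sum(measurements);
// Only the final division needs floating-point arithmetic.
pd_shared3p float64 mean = (float64) total / (float64) size(measurements);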
Use datatypes with smaller bit width
Let us take a look at the operations performance table again. Multiplying 10000-element uint64 vectors is over two times slower than multiplying uint8 vectors. Prefer smaller types when possible but keep in mind that arithmetic with small types can lead to overflow.
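A small sketch; the data and types are only illustrative.
// Ages fit comfortably into 8 bits, so a smaller type suffices.
pd_shared3p uint8[[1]] ages = ...;
pd_shared3p uint8[[1]] doubled = ages + ages; // beware: overflows for values above 127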
Use xor-shared integers for comparison operations
Let us take a look at the operations performance table again. Comparing 10000-element uint64 vectors is almost three times slower than comparing xor_uint64 vectors. If you have an algorithm which does a lot of comparisons it might be beneficial to convert to a xor-shared type and then back to an additively shared type after the comparisons. This strategy is used by the sorting procedures in the SecreC standard library.
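A possible sketch, assuming the shared3p module's reshare function for converting between the additively shared and xor-shared representations (the same conversion used by the standard library sorting routines).
pd_shared3p uint64[[1]] a = ...;
pd_shared3p uint64[[1]] b = ...;
// Convert once, then compare in the cheaper xor-shared representation.
pd_shared3p xor_uint64[[1]] ax = reshare(a);
pd_shared3p xor_uint64[[1]] bx = reshare(b);
pd_shared3p bool[[1]] less = ax < bx;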
Reduce the number of syscalls
SecreC programs use procedures implemented in C modules. Calling such a C procedure is called a syscall or system call in Sharemind parlance. Any operation on a private value causes a system call. For example, these operations cause syscalls:
- initialising a private vector;
- reading an element from a private vector;
- writing an element to a private vector;
- any arithmetic operation on private vectors.
The syscall interface is not very efficient so it can be beneficial to avoid performing syscalls. For example, if you copy elements from one vector to another in a loop you might perform a lot of syscalls.
To optimise this particular scenario we have gather and scatter syscalls which take a private vector and a public index vector and perform multiple reads/stores in parallel. Constructing the public index vector does not require syscalls.
The following is an example of how to use the gather syscall.
__syscall("shared3p::gather_uint8_vec",
__domainid(pd_shared3p), source, target, __cref idx);
In this case we read from a uint8 vector. The first argument specifies the protection domain, source is the vector we are reading from, target is the vector we are storing the read values into and idx contains the (uint64) indices of the positions we want to read.
The following is an example of how to use the scatter syscall.
__syscall("shared3p::scatter_uint8_vec",
__domainid(pd_shared3p), source, target, __cref idx);
Do not use slice syntax with arithmetic assignment on private values
The following statement combines slice syntax with arithmetic assignment:
a[0:10] *= 3;
Semantically it multiplies the first ten values of a by three. With gather and scatter syscalls it would be possible to evaluate this statement using O(1) syscalls and one multiplication, but the compiler generates code which loops over the slice indices and thus causes O(n) syscalls and ten multiplications.
The following snippet does only one multiplication:
a[0:10] = a[0:10] * 3;
It still performs O(n) syscalls. Let us assume that a is a pd_shared3p int64 vector. We can optimise it further like this:
uint[[1]] idx(10);
for (uint i = 0; i < 10; ++i) {
    idx[i] = i;
}
pd_shared3p int64[[1]] tmp(10);
__syscall("shared3p::gather_int64_vec",
          __domainid(pd_shared3p), a, tmp, __cref idx);
tmp = tmp * 3;
__syscall("shared3p::scatter_int64_vec",
          __domainid(pd_shared3p), tmp, a, __cref idx);
This kind of statement is not used very often but we point it out because it can be difficult to understand why your program is slow if you have written such a statement.
Implementation and integration with Sharemind MPC
In this section we give a short overview of the SecreC compiler and its integration with the Sharemind framework. We will also briefly discuss the bytecode to which the language compiles.
Custom bytecode
Sharemind MPC is able to execute custom bytecode. We will not go over the details of the bytecode but we will elaborate on why a custom bytecode language is used. The bytecode specification is available online.
We use custom bytecode for a few reasons. The most important one is security. By sacrificing some performance, we can have a more secure implementation where invalid operations, like division by zero, can gracefully stop the execution of the code instead of crashing the entire platform. Many bytecode languages have access to foreign functions and raw memory but this is something we wish to avoid for the sake of security. Custom bytecode allows for fine-grained control of memory. A program can be stopped by the VM when it goes over some memory allocation limit. Because we avoid garbage collection, the programmer can better predict memory usage.
The secondary reason for having custom bytecode is portability. Namely, different architectures implement floating-point operations in a different manner. For example, machine instructions for computing sine might give one result on one machine but a slightly different result on another. In that case the control flow of a program executing on these machines could differ. Many existing bytecode languages rely on floating-point operations and are therefore not suitable for the distributed execution necessary for MPC. Sharemind's bytecode relies on a software implementation of floating-point operations. Currently, we use the SoftFloat library (Release 2c).
Integration with Sharemind MPC
The SecreC language is not tightly coupled to Sharemind MPC. The compiler is just a standalone tool that takes a SecreC program text and produces a bytecode executable. The bytecode, in turn, can be executed using Sharemind. Conceptually, the bytecode can be generated from some other language or can even be manually written.
The SecreC compiler does not have to be deployed together with Sharemind but it usually is. This is because executing untrusted code on Sharemind is a risk and, therefore, any code must be reviewed before it is executed. Bytecode is much more difficult to audit than high-level SecreC programs. Hence, system administrators usually compile the SecreC file themselves after auditing the code, and then manually deploy the compiled bytecode file. The SecreC compiler is open source and can be verified by anyone.
A standard Sharemind deployment scheme is given in the following figure. The figure depicts how a SecreC script app.sc is compiled and run on a Sharemind Application Server instance. For each of the three parties, the script is compiled to a bytecode file app.sb. When instructed, the Sharemind instance can then execute the deployed script. Before execution, the Sharemind Application Server checks that the bytecode files are identical.
The bytecode contains the list of protection domains that it expects to use, and also the list of system calls that it may perform. Before execution, Sharemind verifies if such protection domains have been configured and checks that all system calls are supported. After verification, Sharemind instantiates the required protection domains. This can include setting up various facilities such as random number generators, local overlay networks, database engines, and logging back-ends. If everything goes well, the program is executed after the setup phase.
The individual bytecode instructions do not include network communication, random number generation or any other security primitives. All of these tasks are performed via system calls to Sharemind. When the bytecode invokes a system call, control is given to the protection domain that implements it. Generally, a system call first verifies that its parameters (e.g., number of arguments, input lengths, input types) are correct, and then executes the actual logic. For example, the system call could perform the additive three-party multiplication protocol. To execute that protocol, protection domains need to generate random numbers and perform network communication between each other.
Using the table database interface
SecreC supports a basic file-based table database system intended for storing and organizing large amounts of data to be privately processed in SecreC.
Before creating a table, a connection must be opened to the data source with the function tdbOpenConnection. The name of the data source is set in Sharemind's configuration. Before opening a new connection to a different data source make sure the old connection has been closed with tdbCloseConnection. A simple table with a uniform data type can be created with the function tdbTableCreate.
Creating an empty table:
// creating an empty table with 3 columns of type int32 and protection domain shared3p
import shared3p;
import shared3p_table_database;
import stdlib;
import table_database;

domain pd_shared3p shared3p;

void main () {
    string datasource = "DS1";
    string table_name = "SimpleTable";
    tdbOpenConnection (datasource);
    pd_shared3p int32 data_type;
    tdbTableCreate (datasource, table_name, data_type);
}
Data is added to the table one row at a time with the function tdbInsertRow. Each row is a vector with every element corresponding to a column in the table. Data is read from the table one column at a time with the function tdbReadColumn. The identifier of the column can be the column's index or the name of the column.
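A minimal sketch, continuing the SimpleTable example above (the inserted values are only illustrative):
pd_shared3p int32[[1]] row = {10, 20, 30};
tdbInsertRow(datasource, table_name, row);
// Read back the first column, identified here by its index.
pd_shared3p int32[[1]] firstColumn = tdbReadColumn(datasource, table_name, 0 :: uint);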
Vector map
A vector map (also referred to as a value map or a vmap) is a data structure for making more complicated tables with column identifiers and multiple data types. A vmap can contain four different types of information: types, strings, values and indexes. A vmap is similar to a Python dictionary as the vmap contains parameters and data associated with those parameters.
Values added to the vmap can be either fixed length or variable length. Variable length means that the size of the values does not have to be uniform in one column, e.g. strings or vectors of variable length. Values are stored in the vmap as batches. Each batch covers all columns of one row. Data can be added one entry at a time with tdbVmapAdd{Type/String/Value/Index} for fixed-length data or with tdbVmapAddVlen{Type/Value} for variable-length data.
Creating a table from a vector map
When creating a table, instead of specifying a data type and the number of columns, a vmap can be used. The vmap must contain a type and a string for every column under the parameters "types" and "names" respectively. Data can be inserted into the table with tdbInsertRow, but instead of a vector a vmap can be used. The vmap must have values under the parameter "values" that are of the same type as their respective column in the table. Every batch in the vmap corresponds to a single row in the table.
Creating a table with a vector map:
import shared3p;
import shared3p_table_database;
import stdlib;
import table_database;
domain pd_shared3p shared3p;
void main() {
    string ds = "DS1";    // Data source name
    string tbl = "table"; // Table name

    // Open database before running operations on it
    tdbOpenConnection(ds);

    // Check if a table exists
    if (tdbTableExists(ds, tbl)) {
        // Delete existing table
        tdbTableDelete(ds, tbl);
    }

    // We want to create a simple table with three columns:
    //
    // -----------------------------------------------------------------
    // Name: | "index"       | "measurement"      | "have_measurement" |
    // Type: | public uint64 | pd_shared3p uint64 | pd_shared3p bool   |
    // -----------------------------------------------------------------
    //       | 0             | 0                  | true               |
    //       | 1             | 10                 | true               |
    //       | 2             | 20                 | true               |
    //       | 3             | 30                 | true               |
    //       | 4             | 40                 | true               |
    // -----------------------------------------------------------------

    // Create a new "vector map/value map" for storing arguments to the table
    // creation call.
    uint params = tdbVmapNew();

    // Column 0, name "index", type public uint64
    {
        uint64 vtype;
        tdbVmapAddType(params, "types", vtype);
        tdbVmapAddString(params, "names", "index");
    }

    // Column 1, name "measurement", type pd_shared3p uint64
    {
        pd_shared3p uint64 vtype;
        tdbVmapAddType(params, "types", vtype);
        tdbVmapAddString(params, "names", "measurement");
    }

    // Column 2, name "have_measurement", type pd_shared3p bool
    {
        pd_shared3p bool vtype;
        tdbVmapAddType(params, "types", vtype);
        tdbVmapAddString(params, "names", "have_measurement");
    }

    // Create the table
    tdbTableCreate(ds, tbl, params);

    // Free the parameter map
    tdbVmapDelete(params);

    // Insert some data
    uint nrows = 5;
    params = tdbVmapNew();
    for (uint i = 0; i < nrows; ++i) {
        uint64 index = i;
        pd_shared3p uint64 measurement = i * 10;
        pd_shared3p bool have_measurement = true;

        if (i != 0) {
            // This has to be called in-between rows
            tdbVmapAddBatch(params);
        }

        tdbVmapAddValue(params, "values", index);
        tdbVmapAddValue(params, "values", measurement);
        tdbVmapAddValue(params, "values", have_measurement);
    }
    tdbInsertRow(ds, tbl, params);
}
Erasing tables
Before creating a new table it is recommended to check whether a table with the same name already exists and to delete it if it does. A table can be deleted with tdbTableDelete. After the program finishes, SecreC automatically deletes all existing vector maps, but for safety reasons it is advised to do this manually.
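For example, a program could tidy up like this, using the functions shown above.
// Before creating a new table, remove a previous table with the same name.
if (tdbTableExists(ds, tbl)) {
    tdbTableDelete(ds, tbl);
}
// Vector maps are freed automatically when the program finishes, but it is
// safer to delete them explicitly once they are no longer needed.
tdbVmapDelete(params);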