# Workshop on Encryption for Secure Search and Other Algorithms

^{30}/Jun 2015

I just got back from the Workshop on Encryption for Secure Search and other Algorithms (ESSA) which was held in Bertinoro, Italy, and was organized by Sasha Boldyreva and Bogdan Warinschi. It was a great event and I'd like to thank the organizers for putting this together and doing such a great job. It was really nice to see all the excitement and enthusiasm behind this topic; both from the research community and from industry.

Since a few people have already asked me for details about the event I figured I would just write brief summaries of the talks. I think the slides will be posted soon so if you are interested you should be able to get more details on the workshop page ESSA.

The first talk was by Christopher Bosch who gave a survey of encrypted search. The talk was based on a paper Christopher published last year. This a really extensive and thorough survey and a great contribution to the field. The authors go over a large number of papers and try to organize and categorize them; drawing conclusions and research directions from the broad perspective they gained from writing the survey. It is a great reference for anyone interested in this field.

*Kaoru Kurosawa* gave a talk on two of his papers. In the first paper, the
authors describe a universally composable (UC) variant of adaptive semantic
security (i.e., CKA2-security) for SSE. The main difference with the standard
definition is that the UC variant requires correctness and the simulation is
strengthened by requiring it to be black-box (i.e., there exists a simulator
for all adversaries). Kaoru then described a construction that achieves this
strong notion of security. In the second part of the talk Kaoru discussed a
more recent paper of his that describes an SSE scheme that handles very
expressive queries but without revealing the expression (not just the keywords
in the query but the form/structure of the query). This is accomplished using a
new variant of garbled circuits which is very interesting in its own right.

*Emily Shen* talked about her work on substring matching on encrypted data. This
is done using an encrypted suffix tree, i.e., using a (interactive) structured
encryption scheme for suffix trees. In this work, however, she was concerned
with a stronger model of security where the server can be malicious. This last
constraint required her to strengthen the standard definition of security for
structured encryption. The construction was very nice but a bit too involved to
describe here so I recommend reading the paper for more details.

*Nathan Chenette* gave a nice overview of the state of the art of fuzzy
searching on encrypted data in the property-preserving model. After describing
the different approaches he gave stronger definitions for this primitive.
Unfortunately, to achieve this notion there is what seems to be an inherent
ciphertext expansion so he described a weaker notion that allows for more
space-efficient constructions.

*Kevin Lewi* talked about his work on order revealing encryption (ORE). ORE is
similar to order-preserving encryption (OPE) in that it allows for comparisons
over ciphertexts. But there is one important distinction: unlike OPE, ORE does
not require the comparison operation over ciphertexts to be "less than". In
other words, to compare two OPE ciphertexts one can simply execute a
"greater/lesser than" operation whereas with ORE one might have to execute some
other arbitrary operation. This is an important relaxation and allows the
authors to overcome impossibility results for OPE which say that OPE schemes
have to leak more than just the order. The construction Kevin presented is
based on obfuscation techniques but does not require the full power of
obfuscation. In particular it avoids the use of Barrington's theorem (though it
makes use of multi-linear maps) which as Kevin said makes the scheme at least
"implementable" but not practical.

*Murat Kantarcioglu* described a few of his works including a paper that
initiated research on concrete inference attacks; that is, attacks on encrypted
search solutions that use statistical and optimization techniques to exploit
leakage. He described the attack he and his co-authors use to try to recover
the queries a user makes by exploiting the access pattern leakage of SSE. This
attack is known as the IKK attack and is currently the best inference attack we
have against access pattern leakage. The second part of the talk covered ways
to mitigate these kinds of attacks and Murat described a clever way of using
differential privacy for this.

*David Cash* also discussed inference attacks. In addition to standard inference
attacks, however, he also described attacks where the adversary exploits more
than leakage and, in particular, knows or chooses some of documents. The
findings were very interesting. One thing that came out of this study was that
the IKK attack-while very interesting in theory-is not really practical. There
are several technical reasons for this but I' ll leave you to read David's paper
when it appears if you are interested. This study also looked at a new class of
schemes (not SSE/structured schemes) that have appeared in the literature
recently and showed that they were vulnerable to adversaries who know and/or
can choose documents (though to be fair they were not designed with that
adversarial model in mind).

Unfortunately, *Hugo Krawczyk* couldn' t make it to the workshop at the last
second so *Stas Jarecki* gave his talk. This was a nice overview of the work on
SSE done by the IBM/Rutgers/UC Irvine team (from now on referred to as IRI) for
the IARPA SPAR project. It covered a series of papers including their paper
from CRYPTO' 13 that shows how to achieve conjunctive queries in sub-linear
time. The talk then continued with more recent papers that focused on schemes
with good I/O complexity and even more expressive queries. The talk had a nice
blend of theory and systems, in particular illustrating how systems constraints
like I/O complexity can sometime force you to find new and interesting
solutions.

*Vlad Kolesnikov* talked about the system that Columbia and Bell Labs designed
for the IARPA competition. This system-called Blind Seer-even had a cool logo
which we learned was designed by Vlad himself! At a high level, this system
makes use of garbled circuits and bloom filters and is designed to work in a
3/4-party model that includes a data owner, a policy server and an index
server. Vlad described several bottlenecks they encountered and all the clever
optimizations they had to design to make the system perform. There was some
discussion about how Blind Seer compared to the IRI system. In the end, it
seemed that the two were incomparable and achieved different tradeoffs between
leakage and efficiency.

*Adam O'Neill* presented his recent work on modular OPE (MOPE). MOPE is a variant
of OPE where a random shift and modular operation is applied to the plaintext
before an OPE encryption is done. Turns out this can improve the security of
OPE but not when OPE is used to do range queries. Adam described a few
techniques to address this that didn' t seem to affect the efficiency of the
schemes. He also showed experimental results to back this up.

*Radu Sion* talked about the new cloud security startup he' s doing. He couldn' t
say much about the technical aspects of what they are doing but he went over
some of the services they are providing and showed demoes, some of which
included searching on encrypted data. Since this was a "sensitive" talk and
Radu himself had to be careful not to reveal too much I' ll stop here at the
risk of revealing things he may not want made public on a larger scale.

*Paul Grubbs* gave a talk that went over what he' s been working on at SkyHigh
networks. He talked about ongoing projects SkyHigh was doing with OPE,
deterministic encryption and format-preserving encryption. In addition he
discussed future projects the company was planning on doing with SSE. This talk
was nice in that it provided a different perspective on crypto than what you
typically get in academic settings. In particular, Paul described how the
solutions they considered and worked on had to fit various business and
legal/regulatory constraints. This is something I' ve been exposed to at MSR and
I definitely think that seeing how technology gets (or doesn' t get) deployed in
the real world is very useful in developing and sharpening your intuition about
what research areas are more or less promising in terms of impact.

*Mayank Varia* gave a great talk on the testing framework Lincoln Labs built to
evaluate the encrypted search systems for the IARPA competition. I have to say
this was one of my favorite talks. The scale of what they built was truly
impressive. The system is composed of various frameworks. One part of the
system is just for generating realistic data and queries and they do this using
machine learning techniques on real data. The query generation is very
flexible however, and you can use it to generate data and queries with specific
characteristics for your tests. The second component is a measurement
framework. The third component was an automated system for generating graphs
and visualizations of the experimental results in LaTex! Overall what they
built sounded very impressive and I think that we should try to adopt it as a
standard way of testing/evaluating encrypted search solutions. I think the
encrypted search community is lucky to have such a framework so we should take
advantage of it. Mayank said that they are working on getting the code up on
GitHub so I' ll update this post as soon as it' s up.

*David Wu* talked about a new protocol for privacy-preserving location
services. Suppose you want to find out how to get from point A to point B but
don' t want to disclose your location to the server that stores the maps and the
server doesn' t want to reveal its own data. Without privacy, one can solve this
problem by representing the map as a graph and computing the shortest path so
the problem David was interested in was can you design a practical two-party
protocol for shortest paths. David showed how to do this by first proposing a
very nice way to compress the representation of the graph in a way that doesn' t
affect the shortest paths and then computing the shortest paths on the new
representation via oblivious transfer. David then presented benchmarks of their
protocol for the city of Los Angeles.

*Florian Bourse* presented new constructions of functional encryption schemes for
inner products. Unlike previous general-purpose FE schemes the goal of this
work was to provide simple and efficient constructions. Florian discussed two
constructions, one based on DDH and another based on LWE. Note that the
functionality considered by Florian is slightly different than "inner product
encryption" of Katz, Sahai, Waters and Shen, Shi and Waters. In the latter
works, the decryption returns one bit of information: whether the inner product
is equal to 0 or not. Here, the decryption returns the actual inner product.

*Tarik Moataz* talked about ORAM with constant bandwidth. What is meant in the
literature by constant-bandwidth ORAM is a bit technical but, roughly speaking,
one can think of it as the requirement that the metadata exchanged with the
server is smaller than the data blocks. Previous work on constant-bandwidth
ORAM had two limitations. The first is that they achieved only amortized
constant-bandwidth. The second is that they only work with very large blocks
and as such only make sense for limited kinds of applications (using standard
parameters, they would have 4MB blocks). Tarik showed to get around these two
limitations, giving a worst-case constant-bandwidth ORAM with much smaller
block size. In addition, the scheme also improves the computational cost at the
server.

*Stas Jarecki* talked about RAM-based MPC (i.e., MPC protocols that work in the
RAM model as opposed to over circuits). The standard way to do this is to use
two-party computation (2PC) to securely compute the client algorithm of an ORAM
scheme. Roughly speaking, this requires the ORAM client algorithm to be
MPC-friendly so that the resulting solution is efficient. While most schemes
consider only the two-party setting, Stas argued that it is interesting to look
at three parties as well since better efficiency could be achieved in that
setting. In fact, Stas described a protocol for this setting which was a lot
more efficient than protocols for the two-party setting.

*Leo Reyzin* gave a survey of entropy notions in cryptography. Leo went over
Shannon entropy, min-entropy and average conditional min-entropy in each case
giving a very nice and intuitive explanation of why and when these notions
should be applied. He also discussed computational variants of entropy
including HILL entropy and what is known and not known about it. Entropy
notions in crypto are a bit subtle and can be hard to work with and
unfortunately there isn't much material to learn from so Leo' s survey was
extremely useful.