Research
A list of my academic research publications.
My academic research interests broadly include cryptography, data structures, randomized algorithms, and probabilistic computing. I expect these interests to shift as I settle into my role as a quantitative researcher at Proof Trading.
My PhD work focused on the intersection of cryptography and data structures. I explored how probabilistic data structures behave in adversarial environments and, in response, proposed new, robust versions of these structures that are provably secure. Lately, I have been interested in privacy-preserving verifiable data structures and their applications, namely key transparency.
A list of my academic publications can be found below; note that various author-ordering conventions are used. A PDF of each paper can be obtained by clicking its title. I also keep my Google Scholar profile up to date. Additionally, here is my ORCID.
Peer-reviewed Publications
Probabilistic Data Structures in the Wild: A Security Analysis of Redis
Mia Filić, Jonas Hofmann, Sam A. Markelon, Kenneth G. Paterson, and Anupama Unnikrishnan
CODASPY '25 (Best Paper Award)
Abstract
Redis (Remote Dictionary Server) is a general purpose, in-memory database that supports a rich array of functionality, including various Probabilistic Data Structures (PDS), such as Bloom filters, Cuckoo filters, as well as cardinality and frequency estimators. These PDS typically perform well in the average case. However, given that Redis is intended to be used across a diverse array of applications, it is crucial to evaluate how these PDS perform under worst-case scenarios, i.e., when faced with adversarial inputs. We offer a comprehensive analysis to address this question. We begin by carefully documenting the different PDS implementations in Redis, explaining how they deviate from those PDS as described in the literature. Then we show that these deviations enable a total of 10 novel attacks that are more severe than the corresponding attacks for generic versions of the PDS. We highlight the critical role of Redis' decision to use non-cryptographic hash functions in the severity of these attacks. We conclude by discussing countermeasures to the attacks, or explaining why, in some cases, countermeasures are not possible.
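As a rough illustration of one of the PDS families analyzed above (a sketch only, not the Redis implementation), here is a minimal Bloom filter in Python; the parameters and the SHA-256-based double hashing are assumptions made for this example:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch; illustrative, not the Redis implementation."""

    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k            # m bits, k hash functions
        self.bits = [False] * m

    def _indexes(self, item):
        # Derive k indexes from one digest via double hashing.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx] = True

    def query(self, item):
        # May return a false positive, but never a false negative.
        return all(self.bits[idx] for idx in self._indexes(item))

bf = BloomFilter()
bf.add("alice")
assert bf.query("alice")
```

The relevance to the paper: if the hash is fast, non-cryptographic, and its seed is predictable, an adversary can compute the indexes offline and search for fresh inputs whose bits are already set, manufacturing false positives at will. The attacks in the paper are in this spirit but target Redis's concrete implementations.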
Citation
@inproceedings{fhmpu24,
title={Probabilistic Data Structures in the Wild: A Security Analysis of {Redis}},
author={Filić, Mia and Hofmann, Jonas and Markelon, Sam A and Paterson, Kenneth G and Unnikrishnan, Anupama},
booktitle={Proceedings of the Fifteenth ACM Conference on Data and Application Security and Privacy},
pages={167--178},
year={2025}
}
Leveraging Generative Models for Covert Messaging: Challenges and Tradeoffs for "Dead-Drop" Deployments
Luke A. Bauer, James K. Howes, Sam A. Markelon, Vincent Bindschaedler, and Thomas Shrimpton
CODASPY '24
Abstract
State of the art generative models of human-produced content, in particular generative models of natural language text, are the focus of many recent papers that explore their use for steganographic communication. Loosely, these works (invertibly) encode message-carrying bits into a sequence of samples from the model, ultimately yielding a plausible natural language covertext. By focusing on this narrow steganographic piece, prior work has largely ignored the significant algorithmic challenges, and performance-security tradeoffs, that arise when one actually tries to build a messaging pipeline around it. We make these challenges concrete by considering the natural application of such a pipeline: namely, "dead-drop" covert messaging over large, public internet platforms (e.g., social media sites). We explicate the challenges and describe approaches to overcome them, surfacing in the process important performance and security tradeoffs that must be carefully tuned. We implement a system around this model-based format-transforming encryption pipeline, and give an empirical analysis of its performance and (heuristic) security.
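The "invertibly encode bits into model samples" idea can be illustrated with a toy rank-based scheme in Python; the two-candidate "model" below is an illustrative stand-in, not the paper's model-based format-transforming encryption pipeline:

```python
import random

# Toy next-token "model": for any context it offers two candidate words.
# A real system samples a large language model; this deterministic
# stand-in just makes the encode/decode inverse relationship visible.
def candidates(context):
    rng = random.Random(" ".join(context))   # deterministic per context
    vocab = ["sun", "rain", "wind", "snow", "fog", "heat"]
    return rng.sample(vocab, 2)

def encode(bits):
    """Each message bit selects one of the model's two candidates."""
    context, cover = ["the", "weather"], []
    for b in bits:
        word = candidates(context)[b]
        cover.append(word)
        context.append(word)
    return cover

def decode(cover):
    """Replay the model to recover which candidate (bit) was chosen."""
    context, bits = ["the", "weather"], []
    for word in cover:
        bits.append(candidates(context).index(word))
        context.append(word)
    return bits

message = [1, 0, 1, 1, 0]
assert decode(encode(message)) == message
```

The challenges the paper addresses arise around exactly this loop: in a real deployment the receiver must locate candidate covertexts among ordinary platform posts, and the encoding must remain both efficient and plausible, which drives the performance-security tradeoffs the paper tunes.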
Citation
@inproceedings{bhmbs24,
title={Leveraging Generative Models for Covert Messaging: Challenges and Tradeoffs for "Dead-Drop" Deployments},
author={Bauer, Luke A and Howes, James K and Markelon, Sam A and Bindschaedler, Vincent and Shrimpton, Thomas},
booktitle={Proceedings of the Fourteenth ACM Conference on Data and Application Security and Privacy},
pages={67--78},
year={2024}
}
Compact Frequency Estimators in Adversarial Environments
Sam A. Markelon, Mia Filić, and Thomas Shrimpton
CCS '23
Abstract
Count-Min Sketch (CMS) and HeavyKeeper (HK) are two realizations of a compact frequency estimator (CFE). These are a class of probabilistic data structures that maintain a compact summary of (typically) high-volume streaming data, and provide approximately correct estimates of the number of times any particular element has appeared. CFEs are often the base structure in systems looking for the highest-frequency elements (i.e., top-K elements, heavy hitters, elephant flows). Traditionally, probabilistic guarantees on the accuracy of frequency estimates are proved under the implicit assumption that stream elements do not depend upon the internal randomness of the structure. Said another way, they are proved in the presence of data streams that are created by non-adaptive adversaries. Yet in many practical use-cases, this assumption is not well-matched with reality; especially in applications where malicious actors are incentivized to manipulate the data stream. We show that the CMS and HK structures can be forced to make significant estimation errors, by concrete attacks that exploit adaptivity. We study these attacks analytically and experimentally, with tight agreement between the two. Sadly, these negative results seem unavoidable for (at least) sketch-based CFEs with parameters that are reasonable in practice. On the positive side, we give a new CFE (Count-Keeper) that can be seen as a composition of the CMS and HK structures. Count-Keeper estimates are typically more accurate (by at least a factor of two) than CMS for "honest" streams; our attacks against CMS and HK are less effective (and more resource intensive) when used against Count-Keeper; and Count-Keeper has a native ability to flag estimates that are suspicious, which neither CMS nor HK (or any other CFE, to our knowledge) admits.
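To make the object of study concrete, here is a minimal Count-Min Sketch in Python; the dimensions and the hash are assumptions for this sketch, and it is not the paper's Count-Keeper:

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min Sketch; illustrative, not the paper's Count-Keeper."""

    def __init__(self, width=272, depth=5):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash per row; real deployments would use a keyed hash per row.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Each row can only overestimate (collisions add counts), so the
        # minimum across rows is the least-contaminated estimate.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))

cms = CountMinSketch()
for _ in range(3):
    cms.update("heavy")
assert cms.estimate("heavy") >= 3
```

An adaptive adversary in the paper's setting inflates a target's estimate by finding cover elements that collide with it in every row; when the hash outputs are predictable or observable, that search becomes cheap, which is what the attacks exploit.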
Citation
@inproceedings{mfs23,
title={Compact Frequency Estimators in Adversarial Environments},
author={Markelon, Sam A and Filić, Mia and Shrimpton, Thomas},
booktitle={Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security},
pages={3254--3268},
year={2023}
}
The DecCert PKI: A Solution to Decentralized Identity Attestation and Zooko’s Triangle
Sam A. Markelon and John True
DAPPS '22 (Best Paper Award)
Abstract
We propose DecCert, a decentralized public key infrastructure designed as a smart contract that solves the problem of identity attestation on public blockchains. Our system allows an individual to bind an identity to a public blockchain address. Once a claim of identity is made by an individual, other users can choose to verify the attested identity, based on the evidence presented by the claim maker, by staking cryptocurrency in the DecCert smart contract. Increasing levels of trust are naturally built based upon the amount staked and the duration for which the collateral is staked. This mechanism replaces the usual use of digital signatures in a traditional hierarchical certificate authority model or the web of trust model, forming a publicly verifiable, decentralized stake-of-trust model. We also present a novel solution to the certificate revocation problem and implement our solution on the Ethereum blockchain. Further, we show that our design solves Zooko’s triangle as defined for public key infrastructure deployments.
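A toy sketch of the stake-based trust idea in Python; the data layout and the scoring rule (stake amount times blocks elapsed) are illustrative assumptions for this example, not the DecCert contract's actual logic:

```python
from dataclasses import dataclass, field

@dataclass
class Attestation:
    verifier: str        # address of the staking user (hypothetical field)
    amount: float        # cryptocurrency staked as collateral
    start_block: int     # block at which the stake was locked

@dataclass
class IdentityClaim:
    address: str
    identity: str
    attestations: list = field(default_factory=list)

    def trust_score(self, current_block):
        # Hypothetical rule: trust grows with both the size of each
        # stake and how long its collateral has been locked.
        return sum(a.amount * (current_block - a.start_block)
                   for a in self.attestations)

claim = IdentityClaim(address="0xclaimer", identity="alice")
claim.attestations.append(Attestation("0xverifier", amount=2.0, start_block=100))
assert claim.trust_score(current_block=150) == 100.0
```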
Citation
@inproceedings{mt22,
title={The DecCert PKI: A Solution to Decentralized Identity Attestation and Zooko’s Triangle},
author={Markelon, Sam A and True, John},
booktitle={2022 IEEE International Conference on Decentralized Applications and Infrastructures (DAPPS)},
pages={74--82},
year={2022},
organization={IEEE}
}
A semi-quantum extended B92 protocol and its analysis
Walter O. Krawec and Sam A. Markelon
Quantum Information Science, Sensing, and Computation XII
Abstract
Unlike purely classical communication, unconditionally secure key distribution is possible if Alice and Bob are both equipped with quantum hardware. The degree to which a protocol needs to be quantum is not only an interesting theoretical question, but also important for practical implementations. Indeed, one may wish to construct cheaper devices, or compensate for device malfunction. In this sense, studying limited-resource QKD protocols is an important problem. One direction is the semi-quantum model introduced by Boyer et al. in 2007 (PRL 99 140501). Several provably secure semi-quantum protocols have been put forth. However, most of these protocols were proven secure in the perfect qubit scenario and not necessarily against practical attacks. Only recently, starting with the seminal work of Boyer, Katz, Liss, and Mor (PRA 96 062335), has research in the field of semi-quantum cryptography considered practical devices and imperfections, such as multi-photon sources and imperfect detectors. In this work, we present a new SQKD protocol based on an Extended B92 protocol which is able to counter certain practical attacks. Furthermore, the techniques we use may see broad application to other limited-resource (S)QKD protocols.
Citation
@inproceedings{km20,
title={A semi-quantum extended B92 protocol and its analysis},
author={Krawec, Walter O and Markelon, Sam A},
booktitle={Quantum Information Science, Sensing, and Computation XII},
volume={11391},
pages={47--55},
year={2020},
organization={SPIE}
}
Genetic algorithm to study practical quantum adversaries
Walter O. Krawec and Sam A. Markelon
GECCO '18
Abstract
In this paper we show how genetic algorithms can be effectively applied to study the security of arbitrary quantum key distribution (QKD) protocols when faced with adversaries limited to current-day technology. We compare two approaches, both of which take into account practical limitations on the quantum power of an adversary (which can be specified by the user). Our system can be used to determine upper-bounds on noise tolerances of novel QKD protocols in this scenario, thus making it a useful tool for researchers. We compare our algorithm's results with current known numerical results, and also evaluate it on newer, more complex, protocols where no results are currently known.
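The overall search loop can be sketched generically in Python; the fitness function below is a hypothetical stand-in, since the paper's actual fitness involves simulating a technologically limited quantum adversary against a QKD protocol:

```python
import random

def fitness(genome):
    # Hypothetical objective standing in for "adversary effectiveness":
    # closeness to a fixed target vector of attack parameters.
    target = [0.2, 0.8, 0.5]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(pop_size=30, genes=3, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genes)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:              # clamped Gaussian mutation
                i = rng.randrange(genes)
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()   # best attack parameters found under this toy objective
```

Because the top half of each generation survives unchanged, the best fitness found never decreases; the paper uses this kind of search to estimate upper bounds on a protocol's noise tolerance against practically limited adversaries.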
Citation
@inproceedings{km18,
title={Genetic algorithm to study practical quantum adversaries},
author={Krawec, Walter O and Markelon, Sam A},
booktitle={Proceedings of the Genetic and Evolutionary Computation Conference},
pages={1270--1277},
year={2018}
}
Pre-prints
SoK: On the Security Goals of Key Transparency Systems
Nicholas Brandt, Mia Filić, and Sam A. Markelon
Abstract
Key Transparency (KT) systems have emerged as a critical technology for adding verifiability to the distribution of public keys used in end-to-end encrypted messaging services. Despite substantial academic interest, increased industry adoption, and IETF standardization efforts, KT systems lack a holistic and formalized security model, limiting their resilience to practical threats and constraining future development. In this paper, we survey the existing KT literature and present the first cryptographically sound formalization of KT as an ideal functionality. Our work clarifies the underlying assumptions, defines core security properties, and highlights potential vulnerabilities in deployed KT systems. We prove in the Universal Composability framework that our concrete protocol achieves KT security as defined by our formalism. Our KT protocol builds on the latest trends in KT design, guided by the formalization.
Citation
@misc{bfm24,
author = {Brandt, Nicholas and Filić, Mia and Markelon, Sam A},
title = {{SoK}: On the Security Goals of Key Transparency Systems},
howpublished = {Cryptology {ePrint} Archive, Paper 2024/1938},
year = {2024},
url = {https://eprint.iacr.org/2024/1938}
}