Select Language

An Oblivious Password Cracking Server: Combining Hellman Tables and PIR Protocols

Analysis of a privacy-preserving password cracking server using Hellman tables and Private Information Retrieval (PIR) protocols to protect query confidentiality.
strongpassword.org | PDF Size: 0.2 MB
Rating: 4.5/5
Your Rating
You have already rated this document
PDF Document Cover - An Oblivious Password Cracking Server: Combining Hellman Tables and PIR Protocols

Table of Contents

1. Introduction

This paper addresses a critical privacy challenge in offensive security operations: performing password cracking via a third-party server without revealing the target hashes. The scenario involves a pentester with limited local resources (e.g., a smartphone) who needs to query large precomputed hash tables (like Rainbow or Hellman tables) hosted remotely. The core problem is constructing an oblivious password cracking server where the server cannot learn which password-hash pairs the client is attempting to crack, thus preserving operational secrecy.

2. Preliminaries

2.1 Hash Reversing Tables

Password systems often store cryptographic hashes. Attackers use precomputed tables to reverse these hashes. Key methods include:

  • Hellman Tables (1980): A time-memory trade-off using chains of hashes and passwords, storing only chain start and end points.
  • Rivest's Distinguished Points (1982): An optimization using special "distinguished" hash values as chain endpoints to reduce lookups.
  • Rainbow Tables (Oeschlin, 2003): Uses a different reduction function at each chain step, generally faster but less suitable for the proposed PIR-based query model.

The paper argues that Hellman tables (or variants with distinguished points) are more compatible with PIR protocols than Rainbow tables for this specific application.

2.2 Private Information Retrieval

Private Information Retrieval (PIR) allows a client to retrieve an item from a database without the server learning which item was accessed. For a single database holding an n-bit string, a PIR scheme involves:

  1. Query Generation (Client)
  2. Query Transmission
  3. Query Processing (Server)
  4. Response Transmission
  5. Response Decoding (Client)

Complexity is measured as $O_C$ (client processing), $O_S$ (server processing), and $O_T$ (transfer). A fundamental lower bound is that $O_S$ must be at least $O(n)$ to ensure privacy, meaning the server must perform work proportional to the database size.

3. Solution Overview

The proposed solution ingeniously combines Hellman tables (or distinguished point tables) with a single-database PIR protocol. The server stores the precomputed cracking tables. When a client wants to crack a hash, it uses a PIR query to privately retrieve the necessary information from the specific locations in the Hellman tables that correspond to potential chain matches, without revealing the lookup indices.

4. Technical Details & Algorithm Construction

The construction focuses on adapting Hellman tables for PIR. A Hellman table is defined by a cryptographic hash function $H$ and a reduction function $R$. A chain starts with a plaintext $SP_i$, and iteratively computes: $X_0 = SP_i$, $X_{j+1} = H(R(X_j))$, storing only $(SP_i, EP_i)$ where $EP_i$ is the final value after $t$ steps. To check a hash $h$, the client computes a chain of length $t$ starting from $h$, checking at each step if the intermediate value matches a stored endpoint. The PIR protocol is used to privately fetch these endpoint comparisons from the server's table.

5. PIR Protocols & Complexity Analysis

The paper analyzes the computational and communication overhead. Using a standard computational PIR protocol (e.g., based on the quadratic residuosity assumption), the server-side processing cost $O_S$ scales with the table size. The client cost $O_C$ and communication $O_T$ are significantly lower but non-trivial. The analysis shows that while PIR introduces overhead compared to a plaintext query, it is the necessary price for strong query privacy. The choice of Hellman over Rainbow tables is justified here because Rainbow tables require checking multiple columns with different reduction functions, leading to more PIR queries and higher overall cost.

6. Experimental Results

A prototype was implemented in Python. The experiments demonstrated the feasibility of the approach. Key metrics included:

  • Query Time: The end-to-end time for an oblivious crack attempt, factoring in PIR computation and communication latency.
  • Server Load: The computational load on the server per query, confirming the $O(n)$ theoretical bound.
  • Success Rate: The probability of successfully cracking a hash given the table coverage, which aligns with standard Hellman table success probabilities.

The results validated that the system works but highlighted the performance trade-off: privacy comes at the cost of increased server computation per query compared to a non-private service.

7. Conclusion & Future Work

The paper successfully demonstrates a novel architecture for an oblivious password cracking server. Future work directions include:

  • Exploring more efficient PIR protocols to reduce $O_S$ and $O_T$.
  • Investigating the use of Trusted Execution Environments (TEEs) like Intel SGX as an alternative or complement to cryptographic PIR.
  • Extending the model to distributed or multi-server PIR settings to potentially improve performance.

8. Original Analysis & Expert Commentary

Core Insight: This paper isn't about cracking passwords faster; it's about cracking them quieter. The authors have identified a glaring operational gap in offensive security: the digital footprint. When a red teamer queries a cloud-based cracking service, that query metadata is a liability. This work proposes to encrypt the intent itself using PIR, making the server "oblivious." It's a classic example of applying advanced cryptographic theory (PIR) to a gritty, real-world infosec problem. The significance is akin to the privacy considerations in model inversion attacks or membership inference attacks against machine learning APIs, where the query itself can leak sensitive information.

Logical Flow: The argument is logically sound. 1) Define the threat model: an untrusted third-party server. 2) Select the appropriate cryptographic primitive: single-database computational PIR, which is the only viable option for this non-colluding, single-server scenario. 3) Adapt the cracking primitive: Choose Hellman tables over Rainbow tables because their structure requires fewer, more deterministic PIR queries per crack attempt. This is a crucial engineering decision that shows deep domain knowledge. The flow from problem to cryptographic tool to systems integration is coherent.

Strengths & Flaws: The major strength is novelty and direct applicability. The prototype proves concept viability. However, the flaws are practical. The performance overhead of PIR is massive. As the authors note, server work is $O(n)$. For large tables (terabytes), this is prohibitive for a commercial service. It's a solution that prioritizes perfect privacy over any semblance of efficiency. Furthermore, it only protects the query. The server still knows the client is performing a cracking operation, which may be enough to raise alarms in some jurisdictions. Compared to fully homomorphic encryption (FHE) approaches that are emerging, this PIR-based method is simpler but far less flexible.

Actionable Insights: For security practitioners, this is a blueprint for high-assurance, privacy-preserving offensive tooling. For researchers, it opens a vein of work on efficient, usable PIR. The immediate next step is to benchmark this against a TEE-based approach (e.g., running the cracking logic inside an SGX enclave). The TEE would handle the computation privately with potentially less overhead than cryptographic PIR, though it introduces trust in the hardware vendor. The long-term vision should be a hybrid model: use PIR for the initial, most sensitive index lookup, and perhaps trusted hardware for subsequent steps, balancing trust assumptions and performance. This work, like the foundational CycleGAN paper which creatively combined networks for unpaired image translation, shows how combining two mature technologies (Hellman tables and PIR) can create a novel solution for a niche but important problem.

9. Technical Details & Mathematical Formulation

The core of a Hellman chain for a hash function $H$ and reduction function $R$ is defined iteratively. Given a starting plaintext $P_0$: $$X_0 = P_0$$ $$X_{k+1} = H(R(X_k)) \quad \text{for } k = 0, 1, ..., t-1$$ The chain endpoint is $X_t$. The table stores pairs $(P_0, X_t)$. To invert a hash $h$, one computes a trial chain from $h$: $Y_0 = h$, $Y_1 = H(R(Y_0))$, ..., $Y_t = H(R(Y_{t-1}))$. At each step $j$, one checks if $R(Y_j)$ is a distinguished point or if $Y_j$ matches a stored endpoint $X_t$, triggering a chain regeneration to find the preimage. In the PIR-adapted version, the check against the server's stored endpoint list is performed via a PIR query on the index corresponding to $R(Y_j)$ or $Y_j$.

10. Analysis Framework & Case Study

Case Study: Secure Penetration Testing as a Service (PTaaS)
Imagine a cloud-based PTaaS platform offering password auditing. A client company uploads a list of password hashes from their own system for a security audit. Using a standard service, the cloud provider learns which specific hashes the company's passwords correspond to, a potential breach. Using the oblivious server framework:

  1. The client's auditing tool pre-processes each target hash $h$.
  2. For each $h$, it locally computes the necessary indices $i_1, i_2, ... i_k$ into the provider's Hellman tables.
  3. It uses a PIR protocol to generate encrypted queries for these indices and sends them to the PTaaS server.
  4. The server processes all queries (performing work on its entire database) and returns encrypted blocks of data.
  5. The client decrypts the responses and processes them locally to see if any chain matches were found, recovering plaintext passwords.

Throughout this process, the PTaaS provider sees only encrypted, random-looking queries and learns nothing about which hashes the client was testing, thus preserving the confidentiality of the client's internal password set.

11. Future Applications & Directions

The principles of this research extend beyond password cracking:

  • Privacy-Preserving Threat Intelligence Lookups: Querying indicators of compromise (IOCs) from a shared database without revealing your own assets.
  • Confidential DNA Sequence Matching: A hospital could query a genomic database for disease markers without exposing the patient's full genome.
  • Private Filtering in Log Analysis: Searching through shared security logs for attack patterns without revealing the specific patterns your organization is vulnerable to.
  • The convergence with Fully Homomorphic Encryption (FHE) is a key direction. While PIR hides the access pattern, FHE could allow the server to perform the entire cracking computation on encrypted data, returning only an encrypted result. Projects like Microsoft's SEAL and OpenFHE are making this more practical.
  • Integration with differential privacy could add a layer of statistical privacy, ensuring that even the success or failure of a query doesn't leak too much information.

12. References

  1. Calvo, A., Futoransky, A., & Sarraute, C. (2013). An Oblivious Password Cracking Server. arXiv preprint arXiv:1307.8186.
  2. Hellman, M. (1980). A cryptanalytic time-memory trade-off. IEEE Transactions on Information Theory, 26(4), 401-406.
  3. Rivest, R. L. (1982). How to reuse a "write-once" memory. (MIT Laboratory for Computer Science Technical Report).
  4. Oechslin, P. (2003). Making a faster cryptanalytic time-memory trade-off. Advances in Cryptology - CRYPTO 2003 (pp. 617-630). Springer.
  5. Chor, B., Goldreich, O., Kushilevitz, E., & Sudan, M. (1995). Private information retrieval. Proceedings of IEEE 36th Annual Symposium on Foundations of Computer Science (pp. 41-50). IEEE.
  6. Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE international conference on computer vision (pp. 2223-2232). (Cited as an analogous example of creative technology combination).
  7. Microsoft Research. (n.d.). Microsoft SEAL (Simple Encrypted Arithmetic Library). Retrieved from https://www.microsoft.com/en-us/research/project/microsoft-seal/