FF1 format-preserving encryption for ID obfuscation
Replaced sequential integer IDs across a multi-tenant platform with FF1-encrypted equivalents to prevent enumeration attacks — without breaking schemas or queries.
Impact
Platform-wide ID obfuscation
Context
Digiicampus is a multi-tenant platform serving educational institutions. Every resource — students, attendance records, courses, fee receipts — is identified by an auto-increment integer primary key. These IDs show up in URLs, API responses, and exported reports.
Sequential IDs leak information. A tenant admin seeing /students/4821
knows roughly how many students exist. A malicious actor with one valid
ID can enumerate nearby ones. Worse, the growth rate of IDs over time
reveals business metrics no customer should be able to infer.
Problem
The goal: replace every outward-facing integer ID with an obfuscated form that is:
- Non-enumerable. 4821 and 4822 should look unrelated.
- Referentially stable. The same input must always map to the same output.
- Schema-compatible. No migration of primary key columns, no foreign key rewrites.
- Query-compatible. The encrypted form must round-trip cleanly at the service boundary.
- Tenant-scoped. Two tenants with the same internal ID must see different external IDs.
UUIDs were ruled out — they would require touching every table, every foreign key, and every integration. Base64 of an HMAC was ruled out — not reversible. Simple XOR with a static key was ruled out — trivially breakable given a few known plaintext/ciphertext pairs.
Approach
I chose FF1, a format-preserving encryption (FPE) mode specified in NIST SP 800-38G. FF1 is built on AES and encrypts a value while preserving its format: a 10-digit number encrypts to another 10-digit number, and a hex string to another hex string of equal length. This meant:
- No schema changes. IDs stay BIGINT in the database.
- No migration. Encryption happens at the controller boundary.
- Deterministic. Same input + same key = same output, so caching and idempotency work unchanged.
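To make the format-preserving property concrete, here is a minimal sketch of the FF1 Feistel construction from SP 800-38G, restricted to decimal strings. It is an illustration, not the code used on the platform: the class name `Ff1Sketch` and its method names are hypothetical, and a production system should rely on a vetted FPE library. It also assumes IDs are zero-padded to a fixed width before encryption, since FF1 works on fixed-length numeral strings and very small domains are not considered safe.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative FF1 (NIST SP 800-38G) for decimal strings only. Not thread-safe, not audited. */
final class Ff1Sketch {
    private final Cipher aes;

    Ff1Sketch(byte[] key) {
        try {
            // Raw AES block calls; the CBC-MAC chaining is done by hand in prf().
            aes = Cipher.getInstance("AES/ECB/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES unavailable or invalid key", e);
        }
    }

    String encrypt(String digits, byte[] tweak) { return feistel(digits, tweak, true); }
    String decrypt(String digits, byte[] tweak) { return feistel(digits, tweak, false); }

    private String feistel(String x, byte[] tweak, boolean enc) {
        if (!x.matches("[0-9]{2,}")) throw new IllegalArgumentException("need at least 2 decimal digits");
        int n = x.length(), u = n / 2, v = n - u;
        // b: bytes needed to hold the larger half as a number; d: bytes taken from the PRF.
        int b = (BigInteger.TEN.pow(v).subtract(BigInteger.ONE).bitLength() + 7) / 8;
        int d = 4 * ((b + 3) / 4) + 4;
        // Fixed first block P: version/method bytes, radix 10 (3 bytes), rounds, u, n, tweak length.
        byte[] p = ByteBuffer.allocate(16)
                .put(new byte[] {1, 2, 1, 0, 0, 10, 10, (byte) u})
                .putInt(n).putInt(tweak.length).array();
        String a = x.substring(0, u), bHalf = x.substring(u);
        for (int step = 0; step < 10; step++) {
            int i = enc ? step : 9 - step;            // decryption runs the rounds backwards
            int m = (i % 2 == 0) ? u : v;
            BigInteger mod = BigInteger.TEN.pow(m);
            if (enc) {
                BigInteger c = new BigInteger(a).add(prf(p, tweak, i, bHalf, b, d)).mod(mod);
                a = bHalf;
                bHalf = pad(c, m);
            } else {
                BigInteger c = new BigInteger(bHalf).subtract(prf(p, tweak, i, a, b, d)).mod(mod);
                bHalf = a;
                a = pad(c, m);
            }
        }
        return a + bHalf;
    }

    /** Round function: CBC-MAC over P || tweak || zero pad || round || NUM(half), expanded to d bytes. */
    private BigInteger prf(byte[] p, byte[] tweak, int round, String half, int b, int d) {
        int zeros = Math.floorMod(-tweak.length - b - 1, 16);
        byte[] q = ByteBuffer.allocate(16 + tweak.length + zeros + 1 + b)
                .put(p).put(tweak).put(new byte[zeros]).put((byte) round)
                .put(fixed(new BigInteger(half), b)).array();
        byte[] r = new byte[16];
        for (int off = 0; off < q.length; off += 16) {
            for (int k = 0; k < 16; k++) r[k] ^= q[off + k];
            r = block(r);
        }
        ByteBuffer s = ByteBuffer.allocate((d + 15) / 16 * 16).put(r);
        for (int j = 1; s.hasRemaining(); j++) {      // only reached for long inputs (d > 16)
            byte[] blk = r.clone();
            for (int k = 0; k < 4; k++) blk[15 - k] ^= (byte) (j >>> (8 * k));
            s.put(block(blk));
        }
        return new BigInteger(1, Arrays.copyOf(s.array(), d));
    }

    private byte[] block(byte[] in) {
        try {
            return aes.doFinal(in);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Big-endian encoding of a non-negative value in exactly len bytes. */
    private static byte[] fixed(BigInteger value, int len) {
        byte[] raw = value.toByteArray();             // may carry a sign byte or be too short
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }

    private static String pad(BigInteger value, int width) {
        return String.format("%0" + width + "d", value);
    }
}
```

Under these assumptions, `new Ff1Sketch(key).encrypt("000000004821", tweak)` returns another 12-digit string, and `decrypt` with the same key and tweak recovers the original. Any implementation like this should be checked against the example vectors NIST publishes for FF1 before it is trusted.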
Key architecture:
- Master key stored in AWS SSM Parameter Store (SecureString).
- Loaded once at application startup, cached in memory.
- Tenant-specific tweaks derived from tenant ID — the same numeric ID encrypts differently for each tenant. This is FF1’s native mechanism for domain separation; no key-per-tenant required.
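The key handling above can be sketched roughly as follows. Everything named here is an assumption for illustration: `ObfuscationKeys` is a hypothetical class, the `loader` stands in for whatever call reads the SecureString parameter, and encoding the tenant ID as 8 big-endian bytes is just one plausible way to derive a tweak from it.

```java
import java.nio.ByteBuffer;
import java.util.function.Supplier;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical holder: one master key for the process, one tweak per tenant. */
final class ObfuscationKeys {
    private final SecretKeySpec masterKey;

    /** The loader (for example, a parameter-store read) is invoked exactly once, at construction. */
    ObfuscationKeys(Supplier<byte[]> loader) {
        byte[] raw = loader.get();
        if (raw.length != 16 && raw.length != 24 && raw.length != 32) {
            throw new IllegalArgumentException("AES key must be 16, 24, or 32 bytes");
        }
        this.masterKey = new SecretKeySpec(raw, "AES"); // SecretKeySpec keeps its own copy
    }

    /** Cached key; no further remote calls after startup. */
    SecretKeySpec masterKey() {
        return masterKey;
    }

    /** Assumed derivation: the tweak is the tenant ID as 8 big-endian bytes. */
    static byte[] tweakFor(long tenantId) {
        return ByteBuffer.allocate(Long.BYTES).putLong(tenantId).array();
    }
}
```

Because the tweak is an input to every FF1 round, two tenants holding the same internal ID get different external IDs even though they share one master key.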
Integration points:
- A centralized obfuscation utility, adopted org-wide.
- Jackson serializers/deserializers applied at the controller layer, so request/response DTOs automatically encrypt on the way out and decrypt on the way in.
- Internal service-to-service calls use raw IDs; encryption is a boundary concern, not a persistence concern.
Implementation
The trickiest parts weren’t the crypto — they were the edges:
- Legacy endpoints that accepted raw integer IDs had to keep working during the rollout. I supported both encrypted and raw forms at ingress for a grace period, with a flag to enforce encrypted-only later.
- Bulk export flows (CSV reports, Athena queries) needed care — some reports are consumed by internal systems that expect raw IDs, others by end users who should see encrypted forms. The serializer lives at the API boundary, so internal paths are unaffected.
- Foreign key joins in reporting queries stayed untouched. The database never sees an encrypted ID.
- URL encoding. FF1 output is numeric for numeric inputs, so URLs stayed clean — no base64 padding, no special characters.
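The grace-period behavior for legacy endpoints might look something like the sketch below. The write-up does not say how raw and encrypted forms were told apart, so this assumes one possible convention: encrypted IDs are always exactly 12 digits (zero-padded before encryption) and raw IDs are always shorter. `IngressIdParser`, `EXTERNAL_WIDTH`, and the `decrypt` hook are all hypothetical names.

```java
import java.util.function.ToLongFunction;

/** Hypothetical ingress parser: accepts raw IDs only while the grace period is on. */
final class IngressIdParser {
    /** Assumed fixed width of encrypted IDs; raw IDs are assumed to be shorter. */
    static final int EXTERNAL_WIDTH = 12;

    private final ToLongFunction<String> decrypt;
    private final boolean enforceEncrypted;

    IngressIdParser(ToLongFunction<String> decrypt, boolean enforceEncrypted) {
        this.decrypt = decrypt;
        this.enforceEncrypted = enforceEncrypted;
    }

    long toInternalId(String token) {
        if (token == null || !token.matches("[0-9]{1," + EXTERNAL_WIDTH + "}")) {
            throw new IllegalArgumentException("malformed ID");
        }
        if (token.length() == EXTERNAL_WIDTH) {
            return decrypt.applyAsLong(token);        // encrypted form
        }
        if (enforceEncrypted) {
            throw new IllegalArgumentException("raw IDs are no longer accepted");
        }
        return Long.parseLong(token);                 // legacy raw form, grace period only
    }
}
```

Flipping `enforceEncrypted` to `true` ends the grace period without touching any controller code. While it is `false`, enumeration is still possible through the raw form, so the window is worth keeping short.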
The rollout was incremental, one module at a time. Each module’s controllers got the serializers, the API contract version bumped, and the frontend clients updated in the same release.
Impact
- Every outward-facing ID on the platform is now non-enumerable.
- Zero database schema changes, zero primary key migrations.
- The utility became a reusable building block — new modules get obfuscation “for free” by using the standard DTO serializers.
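- Auditable: the master key lives in SSM with IAM-controlled access, so rotation is a parameter update, not a code change. Because encryption is deterministic, rotating the key also changes every external ID, so previously shared links need a transition plan.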
What I’d do differently: version the tweak derivation from day one. I derived tweaks from tenant ID directly, which means if the derivation ever needs to change, I’d need a migration path. A version byte prefix would have made future rotation cheaper.
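One reading of the version-byte idea, sketched under assumptions: the tweak becomes a version byte followed by the tenant ID, so a later derivation scheme can coexist with the first. `VersionedTweak` and its constants are hypothetical, and the sketch leaves open how a decoder learns which version produced a given external ID.

```java
import java.nio.ByteBuffer;

/** Hypothetical versioned tweak: 1 version byte followed by the 8-byte tenant ID. */
final class VersionedTweak {
    static final byte V1 = 1;

    private VersionedTweak() {}

    static byte[] derive(byte version, long tenantId) {
        return ByteBuffer.allocate(1 + Long.BYTES)
                .put(version)
                .putLong(tenantId)
                .array();
    }

    /** Reads back the version a tweak was derived under. */
    static byte versionOf(byte[] tweak) {
        return tweak[0];
    }
}
```

With this layout, moving to a second derivation means introducing a new version constant; tweaks derived under `V1` stay reproducible for as long as old external IDs need to resolve.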