Alexandria internals refactor
As part of the NLnet grant to work on Irdest I allocated a milestone to get Alexandria to the
1.0 finish line. It's an ambitious target, seeing as the database is not yet 5 years old (
irdest-core/store module obsolete!
Following is an outline of work that needs to happen. This issue will track development of these features.
As part of qaul.net, Alexandria needed to handle many types of payloads. This is no longer the case! As such, blob support needs to be removed. Furthermore, we will need to support large records, and possibly records where a key-value store approach is not ideal. Following is an outline of the new Alexandria scope:
- Encrypted data & metadata
- Identity concealing. Hide user IDs. Is this zero knowledge?
- Records based. Path based. Two types of records:
  - Key-value stores
  - Table records with shared schemas
- Search tags. Encrypted tag cache. Per-user.
- Encrypted schemas.
- Insert data easily without manual diff creation
- Use diffs for transactions internally.
- Generational (epoch based) garbage collection
- Sync data dynamically.
- Break large records up into chunks that can be streamed from disk.
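To make the last scope item more concrete, here is a minimal sketch of how a large record could be read from disk in chunks instead of being loaded into memory at once. `Chunk` and `ChunkReader` are hypothetical names that do not exist in alexandria today, the chunk size is left to the caller, and encryption of each chunk is deliberately left out.

```rust
use std::io::{self, Read};

/// One piece of a larger record (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    /// Position of this chunk within the record.
    pub index: u64,
    /// Raw (here: unencrypted) payload of the chunk.
    pub data: Vec<u8>,
}

/// Lazily yields fixed-size chunks from any `Read` source, such as a file.
pub struct ChunkReader<R: Read> {
    source: R,
    chunk_size: usize,
    next_index: u64,
    done: bool,
}

impl<R: Read> ChunkReader<R> {
    pub fn new(source: R, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be non-zero");
        Self { source, chunk_size, next_index: 0, done: false }
    }
}

impl<R: Read> Iterator for ChunkReader<R> {
    type Item = io::Result<Chunk>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let mut buf = vec![0u8; self.chunk_size];
        let mut filled = 0;
        // `read` may return fewer bytes than requested, so keep reading
        // until the chunk is full or the source is exhausted.
        while filled < self.chunk_size {
            match self.source.read(&mut buf[filled..]) {
                Ok(0) => {
                    self.done = true;
                    break;
                }
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => {
                    self.done = true;
                    return Some(Err(e));
                }
            }
        }
        if filled == 0 {
            return None;
        }
        buf.truncate(filled);
        let chunk = Chunk { index: self.next_index, data: buf };
        self.next_index += 1;
        Some(Ok(chunk))
    }
}
```

Because the reader is an iterator, only one chunk is held in memory at a time. Passing a `std::fs::File` as the source would stream a record straight from disk; the final chunk may be shorter than `chunk_size`.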
Things that are stored
- Records! Chunk based.
- User sessions, with hidden user IDs. How?
- Per-user tag-cache
- Per-user private key store
- Per-user public key store (not encrypted - only user ID is unknown)
The threat model
Hiding information from an adversarial user is impossible. Alexandria cannot hide its own memory from the root user. Data MUST exist in unencrypted form at some point or another. Watching Alexandria in memory WILL reveal information about data present in the database.
What Alexandria can do is protect data synced to disk. This covers confiscated devices, broken disk encryption, and stolen files. Don't reveal user identity or information payloads to on-disk attacks. If the cops steal your phone, can they indict you?
Following is a graph of data available via APIs, internal components and their on-disk counterparts.
Following is an implementation outline
Investigate whether the Encrypted abstraction introduced in alexandria v0.1 is still relevant
Take a closer look at every component that exists in the code currently and map its inputs and outputs
- Where are modules being used?
- Are there unused modules in the library?
Build a solid user session management module
- Allow user registration, login, logout, destruction
- Sync user sessions & keys to disk
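As a starting point for discussion, the session lifecycle listed above (registration, login, logout, destruction) could be modelled as a small state machine. Everything in this sketch is an assumption: `SessionManager`, `SessionState` and `SessionError` are hypothetical names, users are identified by plain strings (which the real module must avoid to keep user IDs hidden), and credentials, key handling and syncing to disk are left out entirely.

```rust
use std::collections::HashMap;

/// Whether a registered user currently has an open session (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    LoggedOut,
    LoggedIn,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    AlreadyRegistered,
    UnknownUser,
    AlreadyLoggedIn,
    NotLoggedIn,
}

/// In-memory session bookkeeping only; nothing is persisted or encrypted.
#[derive(Debug, Default)]
pub struct SessionManager {
    users: HashMap<String, SessionState>,
}

impl SessionManager {
    /// Registers a new user in the logged-out state.
    pub fn register(&mut self, user: &str) -> Result<(), SessionError> {
        if self.users.contains_key(user) {
            return Err(SessionError::AlreadyRegistered);
        }
        self.users.insert(user.to_string(), SessionState::LoggedOut);
        Ok(())
    }

    /// Opens a session for a registered, logged-out user.
    pub fn login(&mut self, user: &str) -> Result<(), SessionError> {
        match self.users.get_mut(user) {
            None => Err(SessionError::UnknownUser),
            Some(SessionState::LoggedIn) => Err(SessionError::AlreadyLoggedIn),
            Some(state) => {
                *state = SessionState::LoggedIn;
                Ok(())
            }
        }
    }

    /// Closes the session of a logged-in user.
    pub fn logout(&mut self, user: &str) -> Result<(), SessionError> {
        match self.users.get_mut(user) {
            None => Err(SessionError::UnknownUser),
            Some(SessionState::LoggedOut) => Err(SessionError::NotLoggedIn),
            Some(state) => {
                *state = SessionState::LoggedOut;
                Ok(())
            }
        }
    }

    /// Removes the user and any open session.
    pub fn destroy(&mut self, user: &str) -> Result<(), SessionError> {
        self.users
            .remove(user)
            .map(|_| ())
            .ok_or(SessionError::UnknownUser)
    }

    /// Returns the current state, or `None` for unknown users.
    pub fn state(&self, user: &str) -> Option<SessionState> {
        self.users.get(user).copied()
    }
}
```

Returning a distinct error for each invalid transition keeps misuse visible to callers. Whether errors such as `UnknownUser` leak too much about which identities exist is an open question that ties into the threat model.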
Tag & schema cache
- Encrypt caches with root user key
- Provide an API for other modules to query the caches
- Load information from disk that is not present in the cache (how?)
- Investigate whether to split the store implementation between KV and tables or keep them in the same store
- Create an API for manipulating tables & diffing KV records
- Create transactional deltas for the store
- Apply deltas atomically
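One possible shape for transactional deltas on a key-value record is sketched below. `Delta`, `DeltaError` and `KvRecord` are hypothetical types made up for this example, not the existing alexandria API. The sketch only shows all-or-nothing application in memory: changes are staged on a copy and swapped in once every delta has succeeded. Making the same guarantee hold on disk is a separate problem that this does not address.

```rust
use std::collections::BTreeMap;

/// A single change to a key-value record (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Delta {
    Insert { key: String, value: Vec<u8> },
    Update { key: String, value: Vec<u8> },
    Delete { key: String },
}

#[derive(Debug, PartialEq, Eq)]
pub enum DeltaError {
    KeyExists(String),
    NoSuchKey(String),
}

/// A minimal in-memory key-value record.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct KvRecord {
    fields: BTreeMap<String, Vec<u8>>,
}

impl KvRecord {
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        self.fields.get(key).map(|v| v.as_slice())
    }

    /// Applies all deltas or none of them.
    ///
    /// Changes are staged on a copy of the record; the copy replaces the
    /// live data only if every delta was applied without error.
    pub fn apply(&mut self, deltas: &[Delta]) -> Result<(), DeltaError> {
        let mut staged = self.fields.clone();
        for delta in deltas {
            match delta {
                Delta::Insert { key, value } => {
                    if staged.contains_key(key) {
                        return Err(DeltaError::KeyExists(key.clone()));
                    }
                    staged.insert(key.clone(), value.clone());
                }
                Delta::Update { key, value } => match staged.get_mut(key) {
                    Some(slot) => *slot = value.clone(),
                    None => return Err(DeltaError::NoSuchKey(key.clone())),
                },
                Delta::Delete { key } => {
                    if staged.remove(key).is_none() {
                        return Err(DeltaError::NoSuchKey(key.clone()));
                    }
                }
            }
        }
        // Every delta succeeded, so commit the staged copy.
        self.fields = staged;
        Ok(())
    }
}
```

Cloning the whole record is the simplest way to get rollback, but it would be expensive for large records. An undo log or per-chunk staging might be a better fit once chunked records exist.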
Add and remove items from the above outline as needed.