Per-Replica Encryption Keys #371

Open
ryneeverett opened this issue Oct 6, 2023 · 12 comments

Comments

@ryneeverett
Collaborator

Spinning off discussion from GothenburgBitFactory/taskchampion-sync-server#3.

@djmitche:

Also note that encryption keys are per-client, not per-replica. That is, my laptop, desktop, and phone all share the same encryption key. We should have a process to handle a compromised key, but I think it would involve migrating all replicas to a new client_id (and new encryption key).

@ryneeverett:

This doesn't strike me as a great security story. Maybe it's daunting but would it be worth considering per-replica keys? I would argue that most people don't roll keys but they do roll devices.

I recall at least one person reporting that their team uses bugwarrior to aggregate their issues into a single synchronized taskwarrior database. I believe the entire team would be one "client" under the new model, so when somebody leaves the company everybody on the team would need to roll the key.

@djmitche:

Let's split off the issue of key rotation. I don't know of a simple mechanism for each replica to have a different key but still be able to exchange information with other replicas.

@ryneeverett
Collaborator Author

I think we'd basically need PKI.

  • Each replica would need an asymmetric key-pair.
  • The task database would need to track each replica's public key. (It seems like replica-tracking will be required anyway for intelligently dropping old versions in taskchampion-sync-server#3, "[taskchampion] Drop old versions on the sync server".)
  • New versions/snapshots would need to be encrypted with a random symmetric key. That key would need to be encrypted with each of the public keys of the replicas and tracked in the task database somewhere. (I would think these keys might be automatically attached/detached from the encrypted blob by the tooling.)
  • New replicas are introduced by an existing replica adding the public key and making a snapshot so that the new replica can immediately "catch up".
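
The bullets above describe what is usually called envelope (hybrid) encryption. A minimal structural sketch follows, not taskchampion code: `Envelope`, `seal`, and `open_envelope` are hypothetical names, and the cryptographic primitives are passed in as callables (`encrypt`/`decrypt` for the symmetric cipher, `wrap`/`unwrap` for public-key encryption of the content key) on the assumption that they would come from a vetted library.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical aliases; real primitives would come from a vetted crypto library.
SymEncrypt = Callable[[bytes, bytes], bytes]  # (content_key, plaintext) -> ciphertext
SymDecrypt = Callable[[bytes, bytes], bytes]  # (content_key, ciphertext) -> plaintext
Wrap = Callable[[bytes, bytes], bytes]        # (public_key, content_key) -> wrapped key
Unwrap = Callable[[bytes, bytes], bytes]      # (private_key, wrapped key) -> content_key


@dataclass(frozen=True)
class Envelope:
    """An encrypted version/snapshot plus one wrapped content key per replica."""
    ciphertext: bytes
    wrapped_keys: Dict[str, bytes]  # replica id -> content key encrypted to that replica


def seal(plaintext: bytes, recipients: Dict[str, bytes],
         encrypt: SymEncrypt, wrap: Wrap) -> Envelope:
    """Encrypt once with a fresh random key, then wrap that key for every replica."""
    content_key = secrets.token_bytes(32)
    wrapped = {rid: wrap(pub, content_key) for rid, pub in recipients.items()}
    return Envelope(encrypt(content_key, plaintext), wrapped)


def open_envelope(env: Envelope, replica_id: str, private_key: bytes,
                  decrypt: SymDecrypt, unwrap: Unwrap) -> bytes:
    """Recover the content key with this replica's private key, then decrypt."""
    if replica_id not in env.wrapped_keys:
        raise PermissionError(f"{replica_id} is not a recipient of this envelope")
    content_key = unwrap(private_key, env.wrapped_keys[replica_id])
    return decrypt(content_key, env.ciphertext)
```

In this layout, removing a replica amounts to sealing later versions without its key in `recipients`; versions it could already read stay readable to it.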

@djmitche
Collaborator

djmitche commented Oct 6, 2023

For context for those reading along, the current design is just a single, shared symmetric key for all replicas of the same task database. This is a substantial improvement over the Taskwarrior-2.x sync model, which stores all data in cleartext -- a server operator can easily read all users' tasks.

The use-case I was targeting was a service like freecinc.com or inthe.am that provides free or low-cost hosting for Taskwarrior tasks for individuals who are managing their day-to-day tasks across multiple devices. For such a service, the hosting provider should not be able to read users' tasks -- that's the adversarial model. Since it's the same individual using all of those devices, there is no need to distinguish them or prevent one device from reading tasks from another device.

I think this issue is addressing the following cases, but please let me know where I've gotten it wrong:

  • Multi-user task databases, e.g., team members, where people are added and removed, and where removed people must not be able to read subsequent updates to the task database.
  • Key rotation, such as where a key is disclosed and must be invalidated so that it cannot be used to read subsequent updates.

My concern with the proposal in the previous comment is that a malicious server operator could add a new public key. It's not clear how the "existing replica" would be validated. Would that replica sign the new public key with its private key? That would create a tree of keys which would need to be validated back to the root (where that root would presumably be a secret shared by all replicas).

Another alternative is to add keys to the sync'd data (so the "task database" contains both a set of tasks and a set of keys). This would need some care to ensure that other replicas could not push versions not encrypted for that new key before downloading and applying the version that adds the new key. This sort of thing -- changing the set of participants -- is really hard to get right in other distributed systems like Raft or Paxos.

Overall, I want to consider the complexity of what we implement:

  • From a user perspective, this needs to be as simple as possible. As a counter-example, the difficulty of setting up client certificates has historically been a challenge for people deploying their own taskd.
  • "Don't roll your own crypto" is good advice -- as this gets more complex, the likelihood of missing some critical vulnerability grows
  • Intended use-cases. If we can support multi-user task databases, great -- but if that carries a cost to people trying to track their homework assignments across their laptop and phone, then maybe not.

@dathanb
Contributor

dathanb commented Oct 6, 2023

It feels like leaning on GPG for tracking keys and handling encryption + decryption might be a good path forward. Especially if Taskwarrior included tooling to a) share public keys via the sync server, and b) easily "rotate" by re-encrypting existing tasks with a new set of recipients.

@ryneeverett
Collaborator Author

I think this issue is addressing the following cases, but please let me know where I've gotten it wrong:

  • Multi-user task databases, e.g., team members, where people are added and removed, and where removed people must not be able to read subsequent updates to the task database.
  • Key rotation, such as where a key is disclosed and must be invalidated so that it cannot be used to read subsequent updates.

Yes, though I would clarify that key rotation is good even in the (more common) case where you don't know a key is compromised. With my proposal, it would be fairly easy to automatically roll keys on a frequent basis. The easiest way would be for a replica to roll its own key when making a snapshot. However, because the lowest-resource devices are probably the most likely to be compromised, a multi-step process may make more sense: a replica adds its new key to the database and marks its old key for deletion the next time a higher-resourced replica makes a snapshot.

My concern with the proposal in the previous comment is that a malicious server operator could add a new public key. It's not clear how the "existing replica" would be validated. Would that replica sign the new public key with its private key?

Yes, I should have specified that the public keys would need to be signed with the private key of the "sponsoring replica".

That would create a tree of keys which would need to be validated back to the root (where that root would presumably be a secret shared by all replicas).

Yes, that's not wrong. But the validation wouldn't have to go back to the root every time, it would only have to go back as far as the last validated key. A new replica is valid if its key is signed by an already-trusted replica.
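
The incremental validation described above could look roughly like this. It is a sketch under stated assumptions: `KeyRecord` and `extend_trust` are hypothetical names, and `verify` is a stand-in for a real signature check that a vetted library would supply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# (sponsor_public_key, signed_public_key, signature) -> bool; hypothetical stand-in
Verify = Callable[[bytes, bytes, bytes], bool]


@dataclass(frozen=True)
class KeyRecord:
    replica_id: str
    public_key: bytes
    sponsor_id: str   # replica that vouched for this key
    signature: bytes  # sponsor's signature over public_key


def extend_trust(trusted: Dict[str, bytes], candidates: Iterable[KeyRecord],
                 verify: Verify) -> Dict[str, bytes]:
    """Return `trusted` plus every candidate reachable through valid signatures.

    Validation stops at keys that are already trusted, so the chain is never
    re-walked back to the root. Candidates that reuse an already-trusted
    replica id are ignored; key replacement is out of scope for this sketch.
    """
    result = dict(trusted)
    pending = [c for c in candidates if c.replica_id not in result]
    progress = True
    while pending and progress:
        progress = False
        for rec in list(pending):
            sponsor_key = result.get(rec.sponsor_id)
            if sponsor_key is not None and verify(sponsor_key, rec.public_key, rec.signature):
                result[rec.replica_id] = rec.public_key
                pending.remove(rec)
                progress = True
    return result
```

A key whose sponsor is unknown, or whose signature fails, is simply left out of the result, which is how a key injected by a malicious server operator would be rejected in this model.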

  • From a user perspective, this needs to be as simple as possible. As a counter-example, the difficulty of setting up client certificates has historically been a challenge for people deploying their own taskd.

I agree completely. I think all this can be implemented under the hood.

"Don't roll your own crypto" is good advice -- as this gets more complex, the likelihood of missing some critical vulnerability grows

It's a valid concern that we might not have the skill set to implement this securely. However, I don't think admonitions against rolling your own crypto are applicable. I've always seen such advice in the context of implementing -- or, worse, inventing -- your own cryptography algorithms. I've never seen it used as an argument against implementing a cryptosystem using cryptography libraries and primitives constructed by experts.

In fact, I'm not even proposing a novel cryptosystem. I didn't look at this page too closely so I could be mistaken, but I believe this is called a hybrid cryptosystem. Regardless, I'm pretty sure this pattern has been around a long time.

And to @dathanb's point, we could potentially use gpg or some other tool or library that already implements this pattern. I described the low-level details of the implementation because I was responding to the suggestion that such a system might not be feasible and I wanted to describe the mechanics rather than say "find some library that implements the cryptography..."

@djmitche
Collaborator

djmitche commented Oct 7, 2023

Having filed GothenburgBitFactory/taskwarrior#3185, I remembered that the server backends are pluggable and we can in fact support lots of options, so we don't have to trade off ease-of-use for more sophisticated security models. We could decide, for example, that the cloud-provider-backed implementation just uses a shared key (and shared cloud credentials), while the sync-server model implements a hybrid cryptosystem.

That actually provides a nice solution to one of the places Taskwarrior is currently "stuck": the taskchampion-sync-server is at a "proof of concept" level of quality right now, and I don't have the resources or experience to take it beyond that. If we can ship Taskwarrior 3.0 with just a cloud backend for sync, then probably most users can simply migrate to that, and the remaining few may provide more focused feedback on what they need from a sync server.

So, let's keep working on this idea. Maybe a mock-up is a good next step?

@ryneeverett
Collaborator Author

So, let's keep working on this idea. Maybe a mock-up is a good next step?

That's one possible route, but I was thinking a good next step would be to explore how tooling could help implement this. If there happened to be the perfect tooling out there, a real implementation might be easier than a fake. On the other hand, if there's utterly no tooling support available I might be inclined to close this as impractical.

@ryneeverett
Collaborator Author

The use-case I was targeting was a service like freecinc.com or inthe.am that provides free or low-cost hosting for Taskwarrior tasks for individuals who are managing their day-to-day tasks across multiple devices. For such a service, the hosting provider should not be able to read users' tasks -- that's the adversarial model. Since it's the same individual using all of those devices, there is no need to distinguish them or prevent one device from reading tasks from another device.

Having looked at GothenburgBitFactory/taskwarrior#3185, I wonder how important the "hosted taskwarrior" use case is. Is there still a need for such a service if there are cloud options? Why would users choose such a service over generic cloud storage? If we dropped or de-prioritized this use case for the sync server, that would change the threat model and make the current shared key a more questionable approach. (Yeah, defense in depth is good, but promoting the sync server as "encrypted" while not offering a security model that meets the standard of modern applications might be bordering on security theater.)

@djmitche
Collaborator

That you have a different use case in mind does not make the existing solution "security theater". Hyperbole is not persuasive.

I suspect that a hosted solution will always be easier than setting up a cloud account, so the hosted case remains useful for the less-technical end of the user base.

One of the ways I expect a hosted solution would support itself is to offer a web interface for tasks, which would mean that the hosting provider needs the encryption key. I suspect users would very much want to be able to leave that hosting provider when it inevitably goes public and changes its TOS to sell their data. Per-replica keys might be useful there as defense-in-depth (the first layer being, stop sending your task data to the company).

At any rate, carry on looking at tooling!

@ryneeverett
Collaborator Author

That you have a different use case in mind does not make the existing solution "security theater". Hyperbole is not persuasive.

...

One of the ways I expect a hosted solution would support itself is to offer a web interface for tasks, which would mean that the hosting provider needs the encryption key.

I'd hate to escalate the tone of this discussion but I'm really at a loss for how to respond to this. I'm questioning whether there's really a use case in which the current encryption scheme provides any real security value. If not, it's textbook security theater. You proceed to give an example in which the server owner has custody of the encryption key. What value does the current encryption scheme provide in this case?

@djmitche
Collaborator

Lots (hundreds? thousands? I don't actually know) of people used inthe.am and freecinc.com. Both of those services could read everyone's tasks in cleartext (like, with grep). The model implemented by TC today is strictly better than that, with no theater at all.

It could be even better, sure. And that's what this issue is about. Please, go on and implement it!

@ryneeverett
Collaborator Author

Lots (hundreds? thousands? I don't actually know) of people used inthe.am and freecinc.com. Both of those services could read everyone's tasks in cleartext (like, with grep). The model implemented by TC today is strictly better than that, with no theater at all.

The threat model of the TC encryption scheme is to defend against services such as those. Those services shut down essentially because they weren't economically viable and, as you suggested, in an effort to add value future services are likely to undermine the TC encryption scheme by requiring the key. It's hard to understand how you think it's "strictly better" if you don't expect the threat model to be applicable. Or is your point that the TC encryption scheme helps server operators appropriately implement encryption at rest? I could accept that.

@ryneeverett
Collaborator Author

Preliminary Tooling Research

Messaging Layer Security (MLS)

  • "a protocol based on tree structures that enables asynchronous group keying with forward secrecy and post-compromise security"
  • "The core functionality of MLS is continuous group authenticated key exchange (AKE)."
  • Designed for end-to-end security against a malicious server operator.

Tools

Terminology Mapping

  • taskchampion -> MLS
  • Client -> Group
  • Replica -> Member/Client
  • Sync Server -> Delivery Service (DS)
  • Version/Snapshot -> PrivateMessage

Pros

  • Same high-level goal and threat model: a mutable group of devices asynchronously communicating with separate rotatable keys without exposing data to the server operator.
  • Optimized for efficiently propagating small messages by reusing shared keys over an "epoch". Other solutions are probably a lot more wasteful of resources because we're generating, storing, and propagating a new key for every message (version). I wouldn't be surprised if, in a typical case, the key generation used more compute than taskwarrior and the key used more storage space than the operations.
  • High level of abstraction, which should make it more difficult to implement incorrectly in a security-compromising way. Unlike other options, MLS takes care of more security-sensitive operations under the hood:
    • Key management: adding and removing replica keys (and making sure these operations are client-side validated!), propagating the key store, etc.
    • Cryptography: high-level encryption/decryption and signatures/validation

Cons

  • The stability, production-readiness, and long-term health of OpenMLS is unknown.
  • Seems to be designed solely with messaging apps in mind.
    • There could be breaking changes on a shorter timescale than we're comfortable with since modern messaging app development moves a lot faster than taskwarrior development.
    • OpenMLS limits message size to 4.3 GB, which might not be acceptable for snapshots.
  • Specific "services" are intentionally left outside the RFC scope, which we'd have to implement:
    • "Delivery Service (DS)": We'd have to provide a way for replicas to store and retrieve KeyPackages and to send each other messages in order to propagate keys through the "Ratchet Tree". This would likely mean adding one table to store keys and another to queue messages bound for each replica when they sync.
    • "Authentication Service (AS)": MLS provides no guidance on verifying the identity of a replica when a new key is presented (for new replicas or key rotation). This could be as simple as trust-on-first-use or as complex as OAuth.

GPG

  • Needs no introduction.

Tools

Implementation Notes

  • While gpg provides a keyserver implementation (in Rust!), I'm not sure it provides any benefit to us over implementing our own storage table and endpoints.
    • There is no sharing of replica keys between clients and every replica of a given client needs the entire set of keys.
    • Revocation/rotation is as simple as deleting/replacing the key from the table rather than inserting a revoked key to the keyserver.
  • Use gpg's --recipient or --group argument to encrypt for all devices.
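
For illustration, encrypting one version for every replica with the gpg CLI might be driven as below. This is only a sketch: `gpg_encrypt_argv` and the key IDs in the usage note are hypothetical, and the function only builds the command line (using gpg's `--encrypt`, `--recipient`, `--output`, `--batch`, and `--yes` options) rather than running it.

```python
from typing import List, Sequence


def gpg_encrypt_argv(recipients: Sequence[str], infile: str, outfile: str) -> List[str]:
    """Build a gpg command that encrypts `infile` to every listed recipient key."""
    if not recipients:
        raise ValueError("at least one recipient key is required")
    # --batch/--yes keep gpg non-interactive; options precede the command.
    argv = ["gpg", "--batch", "--yes", "--output", outfile]
    for key_id in recipients:
        # One --recipient per replica key, so each replica can decrypt.
        argv += ["--recipient", key_id]
    argv += ["--encrypt", infile]
    return argv
```

The result could be handed to `subprocess.run(gpg_encrypt_argv(["LAPTOPKEY", "PHONEKEY"], "version.bin", "version.bin.gpg"), check=True)`, assuming those public keys are already in the local keyring.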

Pros

  • Designed with the open-ended flexibility to support a wide variety of applications such as ours.

Cons

  • Historically, security experts have criticized gpg for being easy to use incorrectly, having bad defaults, using inferior cryptography, having high-severity CVEs, etc.

Age

  • "The GPG killer." [citation needed]

Tools

Pros

  • The API does seem a bit simpler and more stupid-proof than gpg's.

Cons

  • No signatures/validation support.
    • If we really need signatures we'd need a separate tool for that such as signatory or minisign.
    • Or avoid signatures altogether by pushing key management to the client side rather than the server and relying on the fact that age is already authenticated. My (shallow) reading of the article:
      • As long as we keep public keys secret (i.e., don't expose them to the server) age is effectively authenticated.
      • The next section of the article goes into an issue with authentication especially affecting multiple recipients but it's only a problem if recipients are untrusted.
  • No key management support whatsoever, but that's arguably not much of a loss as gpg's support is not a complete solution for our use case.

@djmitche djmitche transferred this issue from GothenburgBitFactory/taskwarrior Apr 21, 2024