Easy private certificate management for VMs on AWS, GCP, and Azure

87 points by alanctkc 6 years ago · 19 comments

The idea and functionality look good. Some quick friendly feedback:

For production I would want to run this in Docker in some sort of a portable fashion.

Looking at the documentation it seems that you have to manually enter the password when you start up step-ca. That's not really going to work for automated setups. You need to be able to inject secrets from environment variables, or these days, Kubernetes secrets.

There's also the issue of backing up your CA secrets, e.g. if your step-ca process dies and you want to restart it somewhere else. That may be out of scope for step-ca though and handled through some other process, which is fine.

Might be good to add some documentation on how to set this up in a high availability fashion so it is not a single point of failure.

I do like the relative simplicity of this compared to all the other CA solutions out there. Good luck and thanks for the work.

mmalone 6 years ago

> For production I would want to run this in Docker in some sort of a portable fashion.
Do https://hub.helm.sh/charts/smallstep/step-certificates & https://hub.docker.com/r/smallstep/step-ca tick this box for you?
> Looking at the documentation it seems that you have to manually enter the password when you start up step-ca... environment... kubernetes secrets
There are a couple options here. You're only prompted for a password if your intermediate signing key (at `$(step path)/secrets/intermediate_ca_key` by default) is encrypted with a password. You can create your own signing key (e.g., with `step certificate create`) or remove the password (e.g., with `step crypto change-pass --no-password`) and you won't be prompted for one.
The `step-ca` binary also takes a `--password-file` flag. We chose to take this from file vs. environment or directly as a flag because files are the easiest thing to secure in general... and we're all about misuse resistance. If you really want to use a string or an environment variable you can use bash process substitution (e.g., `--password-file <(echo "pass")` or `--password-file <(echo $PASS)`).
We're also working on PKCS#11 HSM support and will be supporting cloud HSM/KMSs on all the popular clouds, which is probably the best and most secure option for most people.
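
As a rough illustration of the `--password-file` option described above, here is a minimal Python sketch of a launcher that moves a secret into an owner-only file and builds the `step-ca` argument list. The function names and the `ca.json` config path are hypothetical, chosen for this sketch only, and nothing is executed.

```python
import os
import tempfile
from typing import List


def write_password_file(secret: str) -> str:
    """Write the secret to a new owner-only file and return its path."""
    # On POSIX systems mkstemp creates the file readable and writable
    # only by the current user.
    fd, path = tempfile.mkstemp(prefix="ca-password-")
    with os.fdopen(fd, "w") as handle:
        handle.write(secret)
    return path


def build_step_ca_command(config_path: str, password_path: str) -> List[str]:
    """Build the argument list for step-ca; nothing is executed here."""
    # config_path is whatever CA config file you use (hypothetical here).
    return ["step-ca", config_path, "--password-file", password_path]
```

A caller would read the secret from wherever its platform injects it, call `write_password_file`, pass the result to `build_step_ca_command("ca.json", path)`, and delete the file once the CA has started.
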
> There's also the issue of backing up your CA secrets, e.g. if your step-ca process dies and you want to restart it somewhere else.
Yea this is super tricky. I think HSMs are probably the right answer. As it stands, the password protection on the signing keys means it's somewhat ok (if not best practice) to back up the keys. Really though, the whole system is designed to make intermediate and even root rotation fairly easy so there's that option, too.
This is an area where we're still improving and definitely appreciate feedback from anyone who has thoughts.
> Documentation on how to set this up in a high availability fashion
Yea good call. We need that.
For anyone who's reading this you should be aware that the CA just needs a root and intermediate cert in `$(step path)/certs` and an intermediate key in `$(step path)/secrets`. You don't have to create these keys with `step ca init`. You can use the lower level `step certificate create` command group to create a second intermediate that hangs off an existing root, for example.
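
The layout described above can be checked with a small pre-flight script. This is only a sketch: the certificate file names `root_ca.crt` and `intermediate_ca.crt` are assumptions made for illustration (only the `certs` and `secrets` directories and the `intermediate_ca_key` name appear in this comment), and `missing_ca_files` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

# Certificate file names are assumptions for this sketch; adjust them to
# match your own setup.
REQUIRED_FILES = (
    "certs/root_ca.crt",
    "certs/intermediate_ca.crt",
    "secrets/intermediate_ca_key",
)


def missing_ca_files(step_path: str) -> List[str]:
    """Return the required files that do not exist under step_path."""
    base = Path(step_path)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Running this against the output of `step path` before starting the CA would flag an incomplete directory, whichever tool created the keys.
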
> I do like the relative simplicity of this compared to all the other CA solutions out there. Good luck and thanks for the work.
Thanks so much! <3.
Edit:
If you're doing k8s stuff also check out:
- https://github.com/smallstep/autocert
- https://github.com/smallstep/step-issuer
If you're using Envoy/Istio we're also iterating on this:
- https://github.com/smallstep/step-sds
- programd 6 years ago
  
  Excellent response, thank you. Answers all my questions.

zokier 6 years ago

I feel like the lack of an "audience" field (or equivalent) in AWS IID makes them a bit less attractive for authentication than the GCP/Azure ones. For example here step-ca could impersonate (if compromised) the client instance to any other services that were to use IID for auth (or vice versa).

mmalone 6 years ago

Yep. AWS's instance identity implementation is crap. I want to write a follow-up blog post about this. They also don't rotate their keys and their tokens don't expire. To top it off, their implementation is buggy and terribly documented. Honestly it's pretty shameful. They have the resources to fix it, and they should fix it. GCP's implementation is the best. It's JWT-based and heavily inspired by OAuth OIDC identity tokens and uses a lot of the same infrastructure. Azure's is a close second. None are perfect.
That said, even AWS's crappy implementation is super useful, and really the only good way to do this (that I know of?). We've tried to mitigate this risk somewhat by making tokens single use. I'd like to also add a way to send a token to `step-ca` to say "this server doesn't need a certificate" that basically marks the instance as "used" without issuing anything. If everything that uses IIDs did this, and you ran through all of your services at startup and either authenticated or said "I never need to use this service", it would provide some protection.
Still, to your point, AWS should fix their shit.
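
The single-use mitigation described above, including the proposed "this server doesn't need a certificate" call, could look roughly like the following. It is a sketch only: `SingleUseRegistry`, `redeem`, and `decline` are hypothetical names, the state lives in memory, and it is not how `step-ca` stores anything.

```python
from typing import Set


class SingleUseRegistry:
    """Tracks which instance identities have already been consumed."""

    def __init__(self) -> None:
        self._used: Set[str] = set()

    def redeem(self, instance_id: str) -> bool:
        """Return True the first time an instance is seen, False afterwards."""
        if instance_id in self._used:
            return False
        self._used.add(instance_id)
        return True

    def decline(self, instance_id: str) -> None:
        """Mark an instance as used without issuing anything."""
        self._used.add(instance_id)
```

If every service that accepts instance identity documents kept a registry like this, a document replayed after startup would be rejected everywhere, whether the instance originally redeemed it or declined.
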
- zokier 6 years ago
  
  Btw can step create client certs? That would reduce the need to use IID for anything else, even if it doesn't really resolve the underlying issues with IID
  - mmalone 6 years ago
    
    Yep. That's my preferred solution, obviously ;)
    Certs have the "TLS Client Authentication" extended key usage set by default.

jively 6 years ago

This is very similar to [CFSSL](https://cfssl.org/), any specific reasons to use this over Cloudflare's PKI?

alanctkcOP 6 years ago

I'm a smallstep developer.
From a mile high view, the thing that makes step somewhat different is the heavy emphasis on usability and reducing overall complexity of managing your own PKI holistically.
So, even if smallstep is more complete in a feature-for-feature comparison to alternatives, the primary focus on ergonomics and filling the "humanized" tooling gap is why you might pick it over another tool... depending on your needs.
In a sense step is to cfssl/openssl as httpie is to curl. You can accomplish a lot of the same things, but they're at different levels as far as mental tax and overall approachability.

mmalone 6 years ago

Hey, author here. There are some pros and cons, but here's my (obviously biased) position. I'm not a CFSSL expert so I invite people to correct me if I'm wrong anywhere here.
- CFSSL by default runs open -- it's an unauthenticated API that'll sign anything. Figuring out how to securely authenticate to the CA is one of the hardest parts of automated enrollment. `step` & `step-ca` solve this for you.
- Building on the last point, there are multiple ways to authenticate to `step-ca` for various scenarios: one-time tokens, OAuth OIDC (SSO)[1], and now instance identity documents for VMs (more coming soon)
- The `step` command-line tool integrates with `step-ca` and makes the entire enrollment process super easy -- we focused a lot on usability (and misuse resistance)
- `step` also helps with other certificate management workflows[2] like root certificate distribution, root federation, root certificate (un)installation[3], certificate renewal, and (passive) revocation
- `step` is also useful as a generic swiss army knife for security tech like JWTs, JWKs, NaCl, OAuth, and more... not necessarily relevant for this comparison, but useful[4]
This might be unfair... but I think philosophically the projects serve different purposes. CFSSL was created to serve Cloudflare's specific internal PKI needs, and it does that well, and it was awesome of them to open source it. `step` & `step-ca` were created because we believe everyone deserves great internal public key infrastructure and there's a tooling gap. So I think we're more interested in addressing a broader variety of use cases than CFSSL is.
There are some definite advantages of CFSSL though. Someone can probably extend this list and I'd love to discuss, but some obvious ones from my perspective are:
- They have a bigger community (at least for now :)
- They've been around longer
- There's more documentation out there about how to do things with CFSSL (see point #1)
Functionally, I think the only thing that CFSSL has that we don't at the moment is "active" certificate revocation -- CRLs & OCSP. We think short-lived certificates[5] are a better approach, and design for that primarily, but we're planning to fill this gap soon so at that point I think we'll be parity+ with CFSSL.
[1] https://smallstep.com/blog/easily-curl-services-secured-by-h...
[2] https://smallstep.com/docs/cli/ca/#commands
[3] https://smallstep.com/docs/cli/certificate/install/
[4] https://github.com/smallstep/cli/blob/master/README.md#Featu...
[5] https://smallstep.com/blog/passive-revocation/
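
To make the short-lived certificate approach mentioned above concrete, here is a minimal sketch of a renewal decision. The two-thirds threshold is an assumption picked for illustration, and `renewal_time` and `needs_renewal` are hypothetical helpers. The point of passive revocation is that a certificate which is not renewed simply expires.

```python
from datetime import datetime


def renewal_time(not_before: datetime, not_after: datetime,
                 fraction: float = 2 / 3) -> datetime:
    """Return the moment at which `fraction` of the validity period has elapsed."""
    return not_before + (not_after - not_before) * fraction


def needs_renewal(now: datetime, not_before: datetime, not_after: datetime,
                  fraction: float = 2 / 3) -> bool:
    """Return True once the certificate has passed its renewal point."""
    return now >= renewal_time(not_before, not_after, fraction)
```

With a 24-hour certificate, refusing renewal bounds the exposure from a compromised key to at most one day, without distributing CRLs or running an OCSP responder.
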

heleninboodler 6 years ago

This is very neat stuff and I'd actually be interested in talking to you more about where you see it going in the future, because it's extremely closely related to some stuff I've worked on. One clarification:

Am I understanding it correctly that step-ca can be configured to either 1) hand out certs for any CN or 2) only hand out certs for the machine's FQDN according to the instance metadata? In essence, the "any CN" mode is only useful for knowing that this instance is one of your own (but exactly which one is totally on the honor system), and the "FQDN only" mode is useful if you use your cloud provider's FQDNs for your hosts. Do I have that correct?
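
The two modes described in this question can be sketched as a single policy check. The function name and the `restrict_to_instance` flag are hypothetical and do not correspond to actual `step-ca` configuration options.

```python
def is_name_allowed(requested_name: str, instance_fqdn: str,
                    restrict_to_instance: bool) -> bool:
    """Decide whether a verified instance may request a cert for requested_name."""
    if not restrict_to_instance:
        # "Any CN" mode: a verified instance may ask for any name.
        return True
    # "FQDN only" mode: the name must match the instance metadata.
    # DNS names are compared case-insensitively.
    return requested_name.lower() == instance_fqdn.lower()
```

In the first mode the certificate only proves that the holder is one of your instances; in the second it also ties the name to the provider-assigned FQDN.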

mmalone 6 years ago

Yes, your understanding is correct. I think it's slightly better than you suggest... since instance identity authentication only works once per instance (by default) you'd probably have some other monitoring stuff in your stack that would notice if a VM went rogue. If a `foo` instance got a cert for `bar` your CD stack would presumably still consider it a `foo` instance and, for example, add it to DNS as `foo`. Then connections to that instance would fail (since it can't authenticate as `foo`) and, hopefully, you'd notice.
Still, this is hand-wavey and complicated and not ideal from a security perspective. It's a lot better than not having certificates at all, but it'd be even better if this gap were closed.
To close this gap we need some sort of enrollment process. The reason we didn't add this for MVP is it's kind of complicated. I think we'd need some policy at the CA that maps VM identities to the workload identities the VM is authorized to run. We need to figure out what would run this enrollment step to add the mappings (probably different for different stacks) and how that thing would authenticate to the CA.
We've also been a bit reluctant to add ad-hoc policy stuff to the CA because we have a generic policy solution that we've been working on. Once that's released it'll give us a much better foundation for this sort of stuff.
Finally, there are other ways to build a stronger enrollment mechanism today. We have a JWT-based one-time-token authentication mechanism[1] that you can use, where a "provisioner" (e.g., something in your CD pipeline like Puppet or Kubernetes) issues a one-time token for a workload to get a certificate from `step-ca`. In this flow the JWT contains the workload's identity, so whatever issues the JWT controls certificate enrollment. This flow has pretty much the same characteristics as an IID+enrollment flow.
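
A rough sketch of the one-time-token flow described above: a provisioner signs a short-lived token naming the workload, and the CA accepts it once. To stay within the Python standard library this sketch swaps in an HMAC shared secret (HS256); that is an assumption of the example, not how `step-ca` signs or verifies tokens, and all function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid
from typing import Optional, Set


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(key: bytes, subject: str, audience: str,
                ttl_seconds: int = 300) -> str:
    """Provisioner side: sign a short-lived token naming the workload."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "sub": subject,                       # workload identity
        "aud": audience,                      # who may accept the token
        "exp": int(time.time()) + ttl_seconds,
        "jti": uuid.uuid4().hex,              # unique ID for single use
    }
    signing_input = (_b64url(json.dumps(header).encode("utf-8")) + "." +
                     _b64url(json.dumps(claims).encode("utf-8")))
    signature = hmac.new(key, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def redeem_token(key: bytes, token: str, audience: str,
                 used_ids: Set[str]) -> Optional[str]:
    """CA side: return the subject if the token is valid and unused, else None."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        signing_input = (header_b64 + "." + claims_b64).encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(claims_b64))
    except ValueError:
        # Covers malformed structure, bad base64, and bad JSON.
        return None
    if claims.get("aud") != audience or claims.get("exp", 0) < time.time():
        return None
    token_id = claims.get("jti")
    if not token_id or token_id in used_ids:
        return None
    used_ids.add(token_id)
    return claims.get("sub")
```

Because the token carries the workload's identity in `sub`, whatever issues it controls enrollment. The `aud` check also illustrates the audience binding discussed elsewhere in this thread: a token minted for one service is rejected by another.
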
We also have ACME support coming soon (next week, actually). So that'll be another option if you want a stronger binding to an instance's FQDN.
Hope this makes sense. Happy to answer any additional questions!
[1] https://smallstep.com/docs/design-document/#jwk-provisioner

urda 6 years ago

I've used XCA [1] before for managing my personal CA and PKI certs for things. I simply then share my root CA out to my necessary end points and handle things from there.

[1] https://hohnstaedt.de/xca/

mmalone 6 years ago

I've never used XCA but I've heard of it. Does it have an actual "online CA" with an API for signing certificates or is it more of a desktop app that works with local signing certificates - like a graphical version of OpenSSL?
If you ever have a reason to check out the `step` / `step-ca` toolchain I'd love to chat about the differences you see. Message me here or shoot me an email (mike at smallstep).
- urda 6 years ago
  
  Since it's my own CA, I have a few personal scripts that handle it. Everything else (like the root cert) is handled offline with a different physical device. It's nothing more than some glorified bash stuff and pulling public CAs from my own sites.
  XCA is a GUI for making certs. Even as a technical user, I prefer it to the CLI.
  - mmalone 6 years ago
    
    Cool that's good feedback. We've been working on a web interface that we could maybe turn into an electron app for this sort of stuff.
    I'm probably pressing my luck promoting here but if you do a bunch of cert related stuff check out our `step certificate` command group at https://smallstep.com/docs/cli/certificate/#commands -- it does a bunch of cool stuff like dumping x509 as JSON and extracting public keys and linting certs and it's way easier to use than openssl. Might be useful in your scripts.
    
    urda 6 years ago
    
    I will be taking a look at it for sure! Like I said I'm pretty small time, but I love the power of having my own CA.

ilaksh 6 years ago

What's the advantage of this over some scripts like https://github.com/tomberek/easy-ca?

mmalone 6 years ago

See the comparison to cfssl above. Step has lots more features, is easier to use, and harder to misuse. Relative to easy-rsa, the most important difference is that step-ca is a service that can issue certs via an API. Combined with step it can also help with lots of cert management tasks like root distribution and automated renewal.
