What's new in Docker 1.13: prune, secrets, checkpoints and more
I'm really looking forward to seeing the scientific community adopt docker as a way to distribute reproducible research and coursework.
MIT 6.S094 has a Dockerfile[^1] that contains all the software required for taking part in the class. This is a huge boon for getting stuck into the class and its coursework.
Most of the excitement that I've seen in the HPC scientific world has been around Singularity [1] containers. In particular, the main advantage seems to be keeping processes running as non-privileged users. This lets these containers integrate with existing HPC clusters much more easily.
How is publishing a Dockerfile even remotely reproducible? Almost every Dockerfile is a series of apt-get install, or yum install or pip install commands. How do I know what versions of packages I am downloading or whether they will even be available to download if I build from this Dockerfile, say two months from now?
IMHO, every Dockerfile has left-pad written all over it.
Good question.
Reproducibility is all about the starting point. If your computation requires high entropy from some random source and on the next run there isn't enough entropy, your experiment may fail - but that's really a corner case. A Docker image keeps the state of the starting point (packages, bashrc, shell history, etc.) version controlled. It is as if someone gave you a copy of a VirtualBox image.
So how do we lock down?
1) When you write a Dockerfile, pin the versions of the packages you are installing.
2) When you want to reproduce, you can rebuild an image from that Dockerfile.
3) But most people are just going to use your image, which is the same now or next year. Building an image != launching a container from an image.
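For example, a pinned Dockerfile might look like this (the packages and version strings here are made up, just to show the shape - `apt-cache policy <pkg>` shows what's actually available):

```dockerfile
# Pin the base image to an exact tag instead of a floating "latest"
FROM ubuntu:16.04

# Pin exact apt package versions (hypothetical version strings)
RUN apt-get update && apt-get install -y \
    curl=7.47.0-1ubuntu2 \
    python-pip

# Pin exact Python package versions too
RUN pip install numpy==1.12.0 scipy==0.18.1
```

Even this isn't airtight - apt mirrors drop old versions - but it at least makes version drift visible instead of silent.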
Currently a lot of research computation is run using Condor to schedule jobs and yeah, they span multiple machines, like how Jenkins master/slaves work. It's been a go-to for a lot of HPC research.
There's been some effort individually to integrate Docker with Condor (after all, both are just processes running on some host machine).
I'm just a guy that wants to deploy web apps. Is docker overkill for me? Basically, I want to be able to test something on my local machine under the same conditions it will be running on my server. Containerisation seems like the only way to do this that doesn't involve keeping packages and system configurations in sync in two or more systems.
Docker may be overkill to start but it's relatively low cost to implement and it will definitely pay dividends over time:
* You can be sure that what you're running locally is exactly what you'll be running on the server
* Your deployment experience will be the same regardless of which tech stack you're using for the web application
* There are many places you can deploy docker containers (Google GCE, Amazon ECS, Amazon EB, etc.)
* A web application is often composed of several services (e.g. the web app, a database, redis, etc.) and docker-compose makes it easy to fire all of those up in development, e.g. if a new developer joins, they only need to install docker rather than web app framework + database + redis
* Docker sets you up quite well to grow into a more complex deployment (e.g. using Kubernetes)

> it's relatively low cost to implement

Running Docker in production takes a huge amount of effort to get right and is not easily done.
I don't believe that's an accurate assessment. If the grandparent wants to run a one-off container with reproducible results, something like docker-compose is perfect. If he wants to run a multi-node microservices architecture, then the story gets more complicated.
I run a lot of small projects with docker-compose on a single host and it makes deploying my changes very easy. Maybe there is some cost to setting it up, but I think even with a small project it pays dividends pretty fast.
I could have been clearer. I meant that setting up docker for his use case i.e. a single 'standard' web application, is relatively easy. Especially if you're using something like Amazon Elastic Beanstalk. At least, that's been my experience.
You're right that docker can become very complex e.g. dockerizing and orchestrating mariadb with galera for high availability was not pleasant.
I agree that it's actually OK for that use case. But then you don't have a big initial pain to solve anyway - people using Docker in production usually have few other choices due to the scale they are operating at, and Nomad+Ansible doesn't cut it because there are complex dependencies.
Docker is very well suited for local development and testing, particularly since the launch of Docker for Mac and Windows. It makes utilities like MAMP less necessary.
But apart from local development, I'd say that depends on your needs. If you want more ease-of-use, and you run a single-server hosting environment with multiple projects, it may be easier to keep doing that without adding Docker. But if you want increased security and better isolation between your projects, Docker is likely a better solution.
In any case, I would strongly recommend that you familiarize yourself with Docker, at least locally. After a while, you can decide if you want to take the leap and use it on your server as well.
Docker for local development has been a pain in the butt for us:
- We've hit performance problems with the filesystem
- Problems with caching things like yarn and npm installs
- The need to constantly rebuild the images for changes to be picked up
- Difficulty dealing with a single Dockerfile for prod and testing, making us want to maintain 2 Dockerfiles
Probably some bad setup on our part, but we've been using it in production with Kubernetes and have hit none of those problems.
We're still using compose to bootstrap the database, caching, etc.
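For what it's worth, that bootstrap is only a few lines of compose file - something along these lines (service names and images below are placeholders):

```yaml
version: "2"
services:
  web:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - db
      - cache
  db:
    image: postgres:9.6
  cache:
    image: redis:3.2
```

A new developer runs `docker-compose up` and gets the whole stack without installing Postgres or Redis locally.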
I don't understand how kubernetes solved your base image issue. That's a clustering system, so by default it can't help.
It sounds as though your setup doesn't work with the immutable filesystems introduced by docker. That's not an issue with docker at all - just something to learn.
I can't imagine dev or deployment without docker any more - all of my tests, yarn installs, dev workflow and prod runs through it.
My experience has been that it's great for local development if your app is reasonably complex (i.e. Docker doesn't make sense if you only have an app worker and a SQLite database), but I don't love it for production. In order for Docker to work well in production, you need something like Kubernetes, and that's a huge hassle for a small app.
I don't think that Kubernetes is the most important thing on prod. Some colleagues from another team at $WORK use plain Docker and "orchestrate" their containers with simple systemd units that run `docker stop|start`. If the app is only a single container, that should do it. (Actually, in that case, I think that `rkt run` would be better since the process runs below the same cgroup, and systemd can detect crashes and restart the container.)
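A minimal sketch of such a unit, assuming an image called `myorg/myapp` that's already pulled (all names here are made up):

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=myapp container
After=docker.service
Requires=docker.service

[Service]
# remove any stale container, then run in the foreground so systemd
# can watch the process and restart on failure
ExecStartPre=-/usr/bin/docker rm -f myapp
ExecStart=/usr/bin/docker run --name myapp myorg/myapp
ExecStop=/usr/bin/docker stop myapp
Restart=always

[Install]
WantedBy=multi-user.target
```

(The caveat about rkt stands: the process systemd watches here is the docker client, not the containerized process itself.)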
Anyway, Kubernetes is not so important for small deployments, but what I've found really helpful is CoreOS: an auto-updating base OS that gets out of the way and (more importantly) ships a combination of Linux kernel + Docker that usually works really well.
Recent versions of systemd-nspawn can directly download a docker image and run it in a service unit.
What about docker-compose? We've recently started using it, and we don't see any problems; did your colleagues evaluate it?
docker-compose is really straightforward to get running, even more so with docker-machine, and it gives you dev/prod parity, but the downside is that there's no built-in way to do zero-downtime deploys.
Actually, with the new docker-compose version 3 you can do rolling updates[1].

1. https://docs.docker.com/compose/compose-file/#/deploy
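Roughly like this in a v3 compose file (service name and image are placeholders):

```yaml
version: "3"
services:
  web:
    image: myorg/web:latest
    deploy:
      replicas: 4
      update_config:
        parallelism: 1   # replace one replica at a time
        delay: 10s       # pause between replacements
```

Note the `deploy` section only takes effect when deployed to a swarm with `docker stack deploy`, not with plain `docker-compose up`.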
That doesn't suggest zero downtime though, no? Still needs an LB to know to stop routing to that host for a moment.
That's how I do deployments, but they take a while to start/stop. Whereas, with uwsgi, for example, deployments are zero-downtime, since uwsgi loads a new interpreter and uses that for new connections from that point on, without interrupting any old connections.
For your use case, containers aren't overkill, but a full orchestration system probably is. Letting some simple outside process handle starting them up is fine, and Swarm seems to have improved enough that you can use it for single-computer "keep my app running with x instances" stuff with no overhead.
I'll go against the grain here, and say that Vagrant + Ansible (or your favorite config management tool) will be easier to handle. It's well understood, simple, and you can try out any config changes in your local Vagrant environment before running the same changes on production.
At least in my mind, it's much more simple to say "OK, I installed these packages, let me add that to Ansible" than it is to get a production-ready Docker setup going.
You can try rkt[1] instead. It's a container runtime from CoreOS which makes a lot of things easier than docker.
Running it in a simple production setup is simply writing a systemd/initd job which starts the container. No container management daemon or orchestration framework involved.
> Containerisation seems like the only way to do this that doesn't involve keeping packages and system configurations in sync in two or more systems.
In a nutshell, this is why I'm now hooked on docker. I can reproducibly build things on my macbook without tearing up the system packages, and I can deploy them to my small datacenter without thinking twice.
I'd suggest you at least try it out.
For single-node applications, I develop on a LXC setup with a base template of the distribution that will run a production VM. This combination provides maximum dev/prod parity, the benefits of lightweight virtualization for development, and a boring, battle-tested production environment. The setup and deployment is written once for the choice distribution.
> Containerisation seems like the only way to do this that doesn't involve keeping packages and system configurations in sync in two or more systems.
Virtual machines will also work.
Docker serves as a lightweight virtualization that will provide the same experience, assuming you are willing to keep to the kernel and Docker version "in sync" between prod and local.
Lately I've been sending a bunch of patches upstream to the runv project (https://github.com/hyperhq/runv). Turns out that wrapping the docker interface with full VM isolation is a model I very much like.
If you only care about controlling the software configuration and versions, nix (nixos.org/nix/) will do this far more elegantly than docker.
Does your web app have a database?
Does this mean the qcow2 disk space usage in Mac is fixed?
Yep, just tested it. The qcow2 disk space gets reclaimed on Docker restart.
Sweet. Now time to play. I followed that bug until I got tired of it. Took a couple of months! Good on them for fixing it. Thanks!
As much as I welcome the CLI cleanup, I can't stop thinking that the 'docker ps -> docker container ls' change makes no sense to anyone who has any experience with bsd/unix/linux systems. Seriously, why?
I agree. It looks like `docker ps` still works so it's nothing to really be concerned about just yet.
`docker ps` will never be removed.
Secrets is a big one! Will really help speed up enterprise adoption.
Looks like there's a mistake about image pruning:
"Add -f to get rid of all unused images (ones with no containers running them)."
But the option is actually `-a` -- `-f` simply skips the prompt.
Oops. Thanks for bringing this to my attention. Fixing...
Like this?
docker rmi -af
I'm a bit confused by the backticks as I use them all the time scripting, but also in Markdown.
I have a gist for it: https://gist.github.com/pubkey/73dcb894cf5f7d262863
# stop and delete all containers
docker rm -f $(docker ps -a -q)
# delete all images
docker rmi -f $(docker images -q)
This is NOT equivalent. The OP was talking about removing unused images. Your commands remove all images.
Maybe this?
docker rm $(docker ps -qa --no-trunc --filter "status=exited")
docker rmi $(docker images --filter "dangling=true" -q --no-trunc)
fwiw, there's a new syntax for this, that is a bit more verbose, but probably worth adopting:
docker container rm $(docker container ls -qa)
docker image rm $(docker image ls -q)
Prune seems not that well thought out to me. Don't get me wrong, I do find it useful, but many people use containers as environments. Think about how many people are going to run prune only to find their work gone missing.
If you are gonna add a nuclear button, do it with a big red alert and give the option to whitelist some containers.
But that's really what `docker rm` is for, isn't it? I mean, if you want to only delete specific containers, use that. Prune has a specific purpose, which I think is very clear. If you're running the command, you (presumably) know what it should be doing.
I suppose you could argue it might be nice to be able to do something like `docker container prune startsWith*` or something similar. But on the other hand, that functionality is already available -- just use `docker rm` with xargs or something.
But the thing people complain about most isn't that they want to delete everything, but that they need docker rm, xargs, and complex bash foo to delete the containers and images they don't need.
For example I want to delete all old and all untagged versions of an image. I want to delete all stopped containers that use a specific image, or that were created more than two weeks ago. I want to delete all images starting with test.
Nuke everything? Not so much and to be honest this would be the easiest even with xargs and docker rm.
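For the record, a couple of these are already expressible with filters today (the image name here is hypothetical, and filter support varies by Docker version, so check `docker ps --help` first):

```shell
# stopped containers created from a specific image
docker rm $(docker ps -aq --filter status=exited --filter ancestor=test-image)

# untagged ("dangling") images
docker rmi $(docker images -q --filter dangling=true)
```

Time-based and name-prefix deletion still need external scripting, though.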
fyi, you do not need xargs.
`docker rm $(docker ps -q --filter blash)`
But agreed, `prune` is currently a sledgehammer and needs some refinement. It's not about not being well thought out; it's about getting something out there that can be built on top of.
Thanks, it was too long since I used filter and it wasn't that interesting. Seems much better now!
That's kind of like saying the 'rm' command is not well thought out, because many people wouldn't want to delete the whole file system when running 'rm -r /'.
Actually the rm command won't let you remove the root filesystem.
Also, when run in bash (and other shells support this too), you get extended pattern matching, letting you, for example, specify not which files you want deleted, but which files you want to keep.
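A quick sketch of that in bash (the scratch directory and file names are made up):

```shell
# work in a scratch directory with a few throwaway files
mkdir -p /tmp/extglob-demo
cd /tmp/extglob-demo
touch keep.txt a.log b.log

# enable extended pattern matching, then delete everything EXCEPT keep.txt
shopt -s extglob
rm -- !(keep.txt)

ls    # only keep.txt remains
```

`!(pattern)` matches anything except the pattern, so the `rm` never touches the file you want to keep.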
Curious what methods others use for handling secrets at build time (using docker-compose). I'm currently installing (private) dependencies at runtime by mounting my secrets as a volume. I couldn't find a method that didn't seem to have some risk of inadvertently exposing them.
There are only two methods that I'm aware of:
- Exposing the secrets on a (http) server that the Dockerfile can use to fetch
- What we use: Create a one time use secret that is destroyed after the image is built and before it is pushed.
> What we use: Create a one time use secret that is destroyed after the image is built and before it is pushed.
This approach has sparked my interest, could you post an example of any open source docker-compose file and/or associated scripts that would do this?
I did actually encounter this solution while researching the problem, didn't love it, but you can check out the solution at: https://github.com/docker/docker/issues/13490#issuecomment-1...
As long as you add the file and remove it in the same command it doesn't get committed as an extra layer, so the container won't have any history of the secrets. You'll run into problems if you do multiple RUN's or an ADD and then RUN.
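The shape of that trick, hedged heavily - the URL, key path, and repo below are all invented for illustration:

```dockerfile
FROM alpine:3.4
RUN apk add --no-cache git openssh-client

# Fetch the one-time key, use it, and delete it in a SINGLE RUN, so no
# committed layer ever contains the secret. Splitting this into separate
# ADD/RUN instructions would leak the key into the layer history.
RUN wget -q http://build-server.local:8000/deploy_key -O /tmp/deploy_key \
 && chmod 600 /tmp/deploy_key \
 && GIT_SSH_COMMAND="ssh -i /tmp/deploy_key -o StrictHostKeyChecking=no" \
      git clone git@github.com:myorg/private-dep.git /srv/private-dep \
 && rm /tmp/deploy_key
```

The key is served from a throwaway local server during the build and revoked afterwards, so even a leaked intermediate layer is useless.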
Stay tuned for `docker build` support for secrets, and more secret backends in later versions.
Why not one 'prune' command with 'containers', 'images', ... as an argument / subcommand?
Would have seemed more intuitive to me.
All of the other commands have been namespaced by what they deal with, so I think it makes more sense in the context of everything else.