Cloud Naming Convention (2019)
Resource types in the name are a pet peeve of mine. They should not be part of the name. An RDS instance can never be anything other than a database, so there's no need for 'database' or 'db' or 'rds' in the name. Likewise a Kubernetes cluster can never be anything other than a k8s cluster. It doesn't need a 'cluster' suffix.
Terraform makes this redundancy especially obvious, because you already include the type in any reference to a resource, e.g.
`google_container_cluster.payments_cluster.something`
I swear some companies do this out of some perverse habit: one place I worked at used to prefix repos, internal modules, infrastructure names, etc with the name of the company.
Which was so frustrating: we all work here, we know what it's called, it doesn't add anything, there's no sister company we share resources with, and all it does is make it harder to distinguish between two similarly named, but entirely too long, internal applications.
I worked with a company that did this as well - they even went as far as prefixing their _slack_ channels with $(companyName)…
I cannot agree with this more. Another thing you should never put in the name of a resource: its provider. Why would you put a GGL prefix on every Google Cloud resource? And lastly, if we are talking cloud, let's use tags to describe most of the useful metadata and make them first-class citizens, although the name will always stand out more.
I had the impression the types get put in the name exactly because they can't change.
How do you feel about Hungarian notation?
Possibly useful in dynamically typed languages. Definitely not useful at all in statically typed languages. And cloud resources are equivalent to statically typed - you can't turn an EC2 instance into an Elastic Load Balancer, for instance; it's only ever one or the other.
I'd argue that the environment (if you do specify it in your resource name) should be at the start, and ideally capitalised differently: in the awful event you somehow find yourself in some kind of environment where dev and prod services are inexplicably listed together, you probably want it to be as clear as possible which one you're accessing/reading/etc.
Personally I think applications should be blind to environment and have the relevant configs passed to them, and said environments should be as well-separated as possible. Ideally in different accounts, with different URLs, that are mutually inaccessible.
This falls down when Azure allows a-z, 0-9, hyphens and underscores on most resources; a-z and underscores (no hyphens or digits) on some resources; and on others, only a-z and around 20 characters.
That's not a fault with the article, though.
Also, with externally addressable Azure resources, the resource name is used as part of the FQDN, and you cannot change the hostname later. Which might impact how you choose to name resources.
The "environment" absolutely should _not_ be part of the name of the resource.
Coupling the notion of "environment" with your workloads (be it in their names or their configuration files) is an anti-pattern that I wish people would stop following.
If you have 3 environments, dev, staging, and prod, you want resources in these environments to be named _exactly the same_ in each environment.
Whatever your workloads are, wherever they run, they themselves should never _be aware of the environment they run in_. From a workload's point of view the "environment" and its label (dev, staging or prod) do not matter. What makes a workload a "dev" or a "production" workload is its configuration and the way those configurations differ, _not_ the name of the "environment" it runs in.
What makes a workload a "dev" workload is dictated by its configuration (which database host it talks to, for example).
When the environment is coupled into the configuration of your workloads, inevitably a developer will end up writing code like:
if env == 'dev' then use_database("dev.example.com")
This won't work at all at scale: as people start adding new environments (imagine "qa", "test", "dev_1", "alphonso's_test", etc.), developers will start adding more and more conditions:
if env == 'dev' then use_database("dev.example.com")
if env == 'qa' then use_database("qa.example.com")
if env == 'dev_1' or env == 'alphonso' then use_database("dev_00.example.com")
// ... add more and more and more conditions
Instead, if your "dev" environment must talk to a "dev.example.com" database, create a variable called "DATABASE_HOST". And for each environment, set "DATABASE_HOST" to the value of the database this specific environment needs to talk to.
For example, in your "dev" environment DATABASE_HOST = "dev.example.com", and in your prod environment DATABASE_HOST = "prod.example.com". Here we clearly have a "dev" and a "prod", yet "dev" and "prod" are merely labels for us humans to differentiate them; the _configuration_ of these environments is really what defines them.
The code above then simply becomes:
use_database(DATABASE_HOST)
and _this_ ^ will scale with an infinite number of environments. _Configuration_ defines the "environment", _not_ the name of the environment.
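To make that concrete, here's a minimal sketch of the pattern, assuming a Python workload (use_database and DATABASE_HOST come from the example above; everything else is illustrative):

    import os

    def use_database(host):
        # Connect to whatever host the configuration handed us; the code
        # never inspects an environment label like "dev" or "prod".
        print("connecting to " + host)

    # dev sets DATABASE_HOST=dev.example.com,
    # prod sets DATABASE_HOST=prod.example.com
    use_database(os.environ["DATABASE_HOST"])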
edit: I realize the article is talking about your cloud provider resources and people might be running multiple "environment" resources in a single account. The above applies to "workloads" talking to these "cloud provider resources", not to the resources themselves, since, of course, you can't have 2 DBs named the same under a single account (obviously the names would collide).
> If you have 3 environments, dev staging and prod, you want resources in these environments to be named _exactly the same_ in each environment.
So your production database server has the same resource name as your dev database server?
Good luck running that on Azure.
Re your edit: when people have strong views ("absolutely not") and rant about it, and yet do not seem to grasp what the article is about, I think their opinion should be ignored. Consider that.
Eh. Sure, I slightly misread the article, it happens.
Just preface my block of text with a "tangentially, when it comes to 'workloads' ... [the rest of the block of text]", and now you have a generic comment, not about the article, but about something related.
When people skim through something in a sloppy way and then focus on writing a rant about it I just don't take their view seriously. If people can't be bothered to carefully digest information, I just assume that they don't know what they are talking about. You may be right or wrong, but I would just choose to listen to people who did their homework instead.
Seeing as these are the only two comments you've made on this thread, it seems like you're not ignoring what you claim should be ignored and taking all this a bit too seriously.
Whether it makes sense to add the stage name to a resource name is a decision that is informed by a wider context that includes hosting environment, deployment pattern and configuration approach. It can make sense in some situations and can be a bad idea in other situations.
I think it is helpful to include the environment name - it shows up in UIs, consoles, DNS names, etc., and gives the operator an at-a-glance sanity check of which environment a resource is in.
I disagree. Sure, it might be an anti-pattern, but it's also for safety. Everything which is production should be called production. I have seen this in the places I've worked for the last 10 years or so, and never seen code like `if env == 'dev' then`.
There is a story of a bank that sent out cancellations of all their trades in production because of a mistake like that. That was a costly mistake.
Production and test environments should not even be on the same network. And, ideally, in my opinion, whoever has access to a production server should not have access to a test server, and the other way around.
Bottom line is that the systems tend not to care what the names are, so long as they resolve to information.
The naming conventions are for the humans to reason about the system, and help the new hire not trigger an outage.
Getting the "proper" amount of information in there is the acme of skill.
I tend to agree with this pattern as long as the commands you're running let you explicitly state which environment you're running these commands against, so there's never any confusion for the human running the commands or the human reviewing CI logs.
With Kubernetes / Helm you can have all of your resources named the same in each environment, each with their own set of same-named env vars that follow what you've described, but whenever you do anything that interacts with your cluster you can add a `-n prod` namespace flag so that each environment runs in its own isolated namespace.
Also for good measure it's not a bad idea IMO to add _dev, _test, _prod to the name of your database just as a double identifier. It still meshes well with the strategy of using a DATABASE_URL. I like using the full URL instead of just using the DATABASE_HOST since the password will be different across environments and I'd rather only have to set 1 env var instead of 2+.
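A sketch of the single-URL approach, assuming a Python client (DATABASE_URL is the variable named above; the URL itself is made up):

    import os
    from urllib.parse import urlparse

    # e.g. DATABASE_URL=postgres://app:s3cret@prod-db.example.com:5432/app_prod
    url = urlparse(os.environ["DATABASE_URL"])

    # One variable carries host, credentials, port and database name,
    # so each environment only has to set DATABASE_URL.
    print(url.hostname, url.port, url.username, url.path.lstrip("/"))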
In a sibling comment someone mentioned this pattern doesn't work for monitoring in different environments, but it does work. You can set an APP_ENV env var and then filter based on that. The same thing applies to logging: you can tag / filter your logs on the APP_ENV too.
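For example, a minimal sketch of stamping logs with that label, assuming Python's stdlib logging (APP_ENV is from the comment above; the format string is illustrative):

    import logging
    import os

    # Every record carries the environment label, so the log pipeline can
    # filter on it instead of relying on names baked into resources.
    logging.basicConfig(
        format="%(asctime)s app_env=" + os.environ.get("APP_ENV", "unknown") + " %(message)s",
        level=logging.INFO,
    )
    logging.info("payment processed")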
From the article:
> As usual, there’s no silver bullet and the actual naming convention should always be tailored to your environment. The main point is having one! And I hope this post gives you a head start.
Globally namespaced resources (like S3) would disagree with you.
This always bites me. I wish it could be opt-out. I'm fine namespacing my S3 buckets with my region and account, and I rarely ever use S3 domains.
Internally at Amazon people are expected to spin up entire separate AWS “conduit” accounts for different environments. So this only applies if you’re running everything out of one account, which is bad form.
To their point, S3 buckets must be uniquely named globally, across all of AWS.
Yeah, can’t escape things with global namespaces.
This is such a terrible idea.
When looking at log files you need to know which node is having a problem. It's also helpful to know what environment was responsible for a security alert at a glance. It's also helpful to know if the instance that is running is the same one that had the incident, or whether it's a brand new node and the old one is gone.
Naming servers the same name loses a ton of valuable information and provides almost no benefit. It is just inviting people to make mistakes, and creating a nightmare for your noc/soc and siem response teams.
I agree with the sibling comments that you should have the environment name in the resource name - that way you can directly know whether the host you're connected to/the monitoring alert/etc. are prod or not.
I was disappointed that this article was not about naming actual clouds.
Ha! I was similarly disappointed. But you might find this a good starting point, if you've not come across it before:
No more reliance on GPS for me!