Show HN: Bender – Let's Standardize Serverless ETL
engblog.nextdoor.com
ETL stands for Extract, Transform and Load. So this is a data pipeline framework. Nextdoor, please put that in your blog post, as I do not believe it is a very common acronym.
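For anyone who hasn't met the term, here is a minimal sketch of the three stages. The function names and sample data are made up for illustration and are not taken from Bender; a plain list stands in for the destination database.

```python
import csv
import io


def extract(raw_csv):
    """Extract: read rows from a CSV string (stand-in for a real source)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Transform: tidy up names and convert amounts to integers."""
    return [
        {"name": row["name"].strip().title(), "amount": int(row["amount"])}
        for row in rows
    ]


def load(rows, sink):
    """Load: append rows to a list standing in for a database table."""
    sink.extend(rows)
    return len(rows)


# Hypothetical sample input, just to show the stages chained together.
RAW = "name,amount\n alice ,10\nBOB,20\n"
warehouse = []
loaded = load(transform(extract(RAW)), warehouse)
```

A real pipeline would swap the CSV string and the list for actual sources and sinks, but the extract, transform, load shape stays the same.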
It's not that uncommon, especially in data science/analytics/engineering. I've definitely heard "ETL" more often than "data pipeline" or "analytics pipeline".
This is a nice list[0] of ETL software.
Not uncommon in ERP/enterprise/business computing either. (I say that while working on just such a project for a food distributor.)
It is an extremely common acronym.
It's an extremely common acronym in a small niche. So while you're right, sort of, it's still worth clarifying when posted to a more general site.
It’s not a small niche. Search any job board and ETL will show up in a large percentage of job posts and resumes related to software QA and data validation.
HTML does have the <ABBR> tag. How hard is it to do?
Even once?
<abbr title="Extract, Transform, Load">ETL</abbr>
I was not disagreeing with the recommendation. It’s almost always a good idea to spell out an acronym/initialism on first use.
It is the kind of acronym that is known by anyone who needs it, modulo those two random people somewhere in the world who just discovered today that they need it but haven't googled the problem yet.
Admittedly, it is also the sort of acronym that, when you don't know it, is really annoying in a headline.
I'll leave you with a link to Wikipedia [1]. ETL goes back a long, long way, and is still commonly used all through the industry, except maybe in startups composed of all young, inexperienced people.
Thank you for pointing that out.
I would have made the mistake myself of assuming the term is ubiquitous and that everyone already knows it. I have heard it used liberally at every job I've had in the last 15 years.
I would have termed it a "very very common acronym." Right up there with API.
But your post reminds me that not everyone's experiences are the same.
ETL as a term has been around for decades and is widely used and understood in pretty much all IT, data, software and enterprise scenarios.
It's nice, but AWS released Glue for a similar use case a couple of weeks back.
I have so far been unimpressed with Glue.
Gobblin (https://gobblin.apache.org/), which looks like it does something similar, can be packaged up as a single .jar, and run on AWS Lambda.
Has anyone had working experience with either?
Great work. I'm in the process of writing an ETL, and while I don't think this will suit our needs (so unfortunately I need to keep writing), this article provides a lot of great detail that helped me see the process more clearly.
It would be great to know what doesn't fit your needs - can you describe your project in more detail?
Interested in chatting about your ETL project more? We may be able to help.
Link to the source in the document doesn't work.
Can you point me to where? I scanned through the blog and haven't found a bad link yet.
The last link, to GitHub, was to some internal repo. It has since been fixed.