Show HN: Bender – Let's Standardize Serverless ETL

engblog.nextdoor.com

52 points by stlava 9 years ago · 21 comments

ETL stands for Extract, Transform and Load, so this is a data pipeline framework. Nextdoor, please put that in your blog post, as I do not believe it is a very common acronym.

techwizrd 9 years ago

It's not that uncommon, especially in data science/analytics/engineering. I've definitely heard "ETL" more often than "data pipeline" or "analytics pipeline".
This is a nice list[0] of ETL software.
0: https://github.com/pawl/awesome-etl
- sbuttgereit 9 years ago
  
  Not uncommon in ERP/enterprise/business computing either. (I say that while I'm working on just such a project for a food distributor.)
gipp 9 years ago

It is an extremely common acronym.
- vidarh 9 years ago
  
  It's an extremely common acronym in a small niche. So while you're right, sort of, it's still worth clarifying when posted to a more general site.
  - lloydde 9 years ago
    
    It’s not a small niche. Search any job board and ETL will show up in a large percentage of job posts and resumes related to software QA and data validation.
    
    spc476 9 years ago
    
    HTML does have the <ABBR> tag. How hard is it to do?
    <abbr title="Extract, Transform, Load">ETL</abbr>
    Even once?
    
    lloydde 9 years ago
    
    I was not disagreeing with the recommendation. It’s almost always a good idea to spell out an acronym/initialism on first use.
  - _jal 9 years ago
    
    It is the kind of acronym that is known by anyone who needs it, modulo those two random people somewhere in the world who just discovered today that they need it but haven't googled the problem yet.
    Admittedly, it is also the sort of acronym that, when you don't know it, is really annoying in a headline.
taude 9 years ago

I'll leave you with a link to Wikipedia [1]. ETL goes back a long, long way, and is still commonly used all through the industry, except maybe in startups composed of all young, inexperienced people.
[1] https://en.wikipedia.org/wiki/Extract,_transform,_load
throwaway2016a 9 years ago

Thank you for pointing that out.
I myself would have made the mistake of assuming the term is ubiquitous and everyone already knows it. I have heard it used liberally at every job I've had in the last 15 years.
I would have termed it a "very very common acronym." Right up there with API.
But your post reminds me that not everyone's experiences are the same.
manigandham 9 years ago

ETL as a term has been around for decades and is very well used and understood in pretty much all IT, data, software and enterprise scenarios.

john_teller02 9 years ago

It's nice, but AWS released GLUE for a similar use case a couple of weeks back.

snowpalmer 9 years ago

I have so far been unimpressed with GLUE.

slagfart 9 years ago

Gobblin (https://gobblin.apache.org/), which looks like it does something similar, can be packaged up as a single .jar, and run on AWS Lambda.

Anyone had working experience with either?

throwaway2016a 9 years ago

Great work. I'm in the process of writing an ETL and while I don't think this will suit our needs (so unfortunately I need to keep writing), this article provides a lot of great detail that helped me see the process more clearly.

diranged 9 years ago

It would be great to know what doesn't fit your needs - can you describe your project in more detail?
rbradk 9 years ago

Interested in chatting about your ETL project more? We may be able to help.

throway_foo_bar 9 years ago

Link to the source in the document doesn't work.

diranged 9 years ago

Can you point me to where? I scanned through the blog and haven't found a bad link yet.
- throway_foo_bar 9 years ago
  
  The last link, to github, was to some internal repo. It has since been fixed.
