The Data Engineering Handbook

github.com

185 points by matthewhefferon a year ago · 21 comments

teleforce a year ago

I highly recommend the book Fundamentals of Data Engineering [1].

Hopefully the authors can update the book soon to reflect the latest information and add a full chapter on data management, as they did for data architecture.

[1] Fundamentals of Data Engineering:

https://www.oreilly.com/library/view/fundamentals-of-data/97...

  • chrisvalleybay a year ago

    I want to second this. As a CTO, I was able to use its themes and concepts to make better decisions in a domain I didn’t know before reading the book.

gigatexal a year ago

This isn’t a handbook wtf. It’s more akin to an Awesome [topic] list like Awesome Python on GitHub showing all sorts of projects and things related to the language.

moandcompany a year ago

"This repo has all the resources you need to become an amazing data engineer!"

That's a bold claim. Is this a marketing post for selling courses?

  • wswope a year ago

    It’s a plug for the author’s discord server (first entry under the “must-join” communities list).

    Pro tip for any aspiring DEs: you can ignore 98% of the linked junk on this repo. Learn python, learn SQL, read DDIA, and you’ll do fine.

  • matthewhefferonOP a year ago

    He does sell a course, but the repo is open-source with contributions from others. It’s got solid resources, and I found it pretty useful.

benrutter a year ago

Nice list! Although as somebody who works on open source tools for data engineering, it kills me a little to see "companies" as the list header rather than, say, "projects".

(also, shameless plug for my latest project Wimsey, which is not affiliated with any company but does let you test data in a nice, lightweight way: https://github.com/benrutter/wimsey)

chris_wot a year ago

Kimball's Data Warehousing book is excellent.

  • emmanueloga_ a year ago

    "You can save more souls with roller skates and Easy-Bake Ovens than with this 2,000-page sleeping pill" The Simpsons Season 13, Episode 6 :-)

    I first learned about star schemas from one of Kimball's books years ago. The content was good, but the writing style wasn’t particularly engaging.

    I think the books remain relevant for foundational concepts like dimensional modeling, but Kimball's focus reflected the dominant databases of the time, like Oracle and SQL Server. Columnar databases such as MonetDB were niche and not widely adopted... If I remember right, I don't think Kimball's books give those more than a passing mention.

    Are there any more modern books about warehousing out there you would recommend? (other than DDIA, which is brought up all the time these days).

    • chris_wot a year ago

      lol! I haven't pursued this area for some time, so I genuinely can't say.

mettamage a year ago

Is there a data analyst handbook? Starting a role in that soon (switching from SWE - tired of full-time programming)

  • blub a year ago

    Is this still a career path worth investing in? It seemed super hot a couple of years ago and then people stopped talking about it. Saw many comments complaining about lack of jobs.

    • fifilura a year ago

      I think it should be worth it, because you are closer to the product you are building and you help define what to build.

      As opposed to programming which is more like plumbing work.

      • gregw2 a year ago

        Believe me, there's a lot of plumbing moving stuff from point A to B and dealing with poop ("dirty data" is the industry euphemism) in the data engineering and data analyst space.

        In my more analytic moments I try to convince myself that data engineering and analysis is like chemical refining, creating useful byproducts out of raw liquids, but in my cynical moments, the plumbing metaphors for it are just so much more evocative.

        • fifilura a year ago

          Still, somehow, I think it is where it all started in the 1950s. "I have all these numbers, I need a machine to help me do something useful".

          And then after this came a huge industry with programmers.

          For me it is more like back-to-basics.

    • jhrmnn a year ago

      I don’t think people stopped analyzing data, but the job titles probably changed? Data scientists and data engineers are probably now doing what data analysts used to do?

    • mettamage a year ago

      Despite having 4 years of experience as a SWE (excluding bachelor and CS master), it’s the only job I can get. So there’s that
