ProPublica Illinois Uses GNU Make to Load 1.4GB of Data Every Day – ProPublica
propublica.org
I once replaced a Hadoop ETL process that took several hours with a GNU Make ETL process that took several tens of minutes and loaded well more than 1.4GB of data, at a company that shall not be named. Its Java developers working on Windows machines blinked uncomprehendingly at it and muttered, "This is not enterprise." I quit within the month.
GNU Make's documentation is actually excellent. It's dense, but that is only because there is a lot to cover.
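For anyone wondering why Make suits this kind of ETL work: for ordinary file targets it reruns a step only when the target is missing or older than one of its prerequisites, so unchanged inputs are skipped. Here is a minimal Python sketch of that timestamp rule. It is not how ProPublica's pipeline is written, and the file names (`contributions.csv`, `contributions.loaded`) and the `demo` helper are made up for illustration.

```python
import os
import tempfile
import time


def needs_rebuild(target, prerequisites):
    """Return True if `target` is missing or older than any prerequisite.

    This mirrors the basic timestamp rule make applies to ordinary file targets.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


def demo():
    """Walk one hypothetical ETL step through three states and report each decision."""
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "contributions.csv")        # hypothetical download
        loaded = os.path.join(workdir, "contributions.loaded")  # hypothetical marker file

        with open(raw, "w") as f:
            f.write("id,amount\n1,100\n")
        before_load = needs_rebuild(loaded, [raw])  # marker does not exist yet

        with open(loaded, "w") as f:
            f.write("done\n")
        # Set explicit mtimes so the result does not depend on filesystem
        # timestamp resolution.
        now = time.time()
        os.utime(raw, (now - 10, now - 10))
        os.utime(loaded, (now, now))
        after_load = needs_rebuild(loaded, [raw])  # marker is newer than the input

        os.utime(raw, (now + 10, now + 10))  # simulate a fresh download
        after_new_data = needs_rebuild(loaded, [raw])

        return before_load, after_load, after_new_data


if __name__ == "__main__":
    print(demo())
```

The three values are the decisions before the first load, after it, and after new data arrives. Only the first and third trigger work, which is where the savings on a daily run come from.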
While the autotools system is very powerful, I think people’s problem with it (specifically `make`) is the syntax. `configure` is usually a shell script, but a `Makefile` isn’t. It’s pretty unintuitive to a newbie: it looks like a script, but it’s kind of not. The lack of arrays is a weird choice (AFAIK everything is a string) that requires one to use quotes in a lot of places so “array-like” “functions” work correctly.
When I was learning C++ a while ago, I tried writing basic `Makefile`s but switched to CMake. The downside is that practically every Unix-like OS has `make`, while CMake is not always installed, but I found the trade-off worth it for one-off projects.
I am partial to Rust over C/C++ here, but I do like Rust’s approach: a simple `cargo build`, with the ability to write a program (`build.rs`) for the non-Rust parts.
Just a question:
Is it really a good idea to download voting data over unencrypted FTP?
RTFA. It's campaign finance data. Contributions and spending, which is publicly available information. ProPublica is a watchdog group.