Range joins in DuckDB

duckdb.org

111 points by hfmuehleisen 4 years ago · 26 comments (25 loaded)

orthoxerox 4 years ago

DuckDB has become my preferred tool for hardcore data wrangling. Excel is fine for like 80% of data processing tasks, but the remaining twenty percent are a pain, especially when you're CPU bound on a remote desktop. Smuggling the DuckDB JDBC driver onto said remote machine was the most productive infosec violation I've ever committed.

  • minaguib 4 years ago

    I don't know why I waited so long to try it.

    I wrangle a ton of raw and aggregate data locally every day. I've had a 10-year habit of massaging via Unix CLI tools and pipes, then moving to Excel. I guess I didn't wanna write code. Funny thing is I love SQL.

    But with `duckdb_cli` it's a game-changer. I'm truly truly impressed.

  • hawkfish-rmgw 4 years ago

    Our lips are sealed.

eatonphil 4 years ago

I've been beating my head trying to get duckdb to statically link into a Go program (I'm neither an expert with cgo nor ld). If anyone else has been able to do this I'd love to see your build steps.

https://github.com/marcboeker/go-duckdb produces a non-static binary by default.

  • knome 4 years ago

    I'm not familiar with the project. Does it use any net-related code? That won't be static because it will want to load C-libs for using /etc/nsswitch.conf to handle DNS/name stuff.

    https://stackoverflow.com/questions/33228809/why-is-my-go-ap...

    • eatonphil 4 years ago

      I don't have the source code in a good state to publish yet but here's where I'm at. At some point before this CGO_LDFLAGS does work and the header is found (omit the -ldflags args). But when it goes to statically link it can no longer find the header.

        CGO_LDFLAGS="-L$(pwd)/duckdb/src/include" CGO_CFLAGS="-I$(pwd)/duckdb/src/include" go build -ldflags '-extldflags " -lstdc++ -lm -lduckdb -static"'
        # github.com/marcboeker/go-duckdb
        ../../go/pkg/mod/github.com/marcboeker/go-duckdb@v0.0.0-20220427142532-cd9f33e64d9a/connection.go:4:10: fatal error: duckdb.h: No such file or directory
          4 | #include <duckdb.h>
            |          ^~~~~~~~~~
        compilation terminated.
      
      Edit: never mind about not being in a good state! Here's my code: https://github.com/multiprocessio/duckdb-tests.

      • irq-1 4 years ago

        Put the file in quotes. Angle brackets are for built-in files. #include "duckdb.h"

        • eatonphil 4 years ago

          That's not my code.

          • eatonphil 4 years ago

            But also, just to double check, I modified the vendored code and no difference:

              CGO_LDFLAGS="-L$(pwd)/duckdb/src/include" CGO_CFLAGS="-I$(pwd)/duckdb/src/include" go build -ldflags '-extldflags " -lstdc++ -lm -lduckdb -static"'
              # github.com/marcboeker/go-duckdb
              vendor/github.com/marcboeker/go-duckdb/connection.go:4:10: fatal error: duckdb.h: No such file or directory
                4 | #include "duckdb.h"
                  |          ^~~~~~~~~~
              compilation terminated.

  • DemocracyFTW2 4 years ago

    I tried to replace SQLite with DuckDB for a customized install of better-sqlite3[1] and failed.

    [1] https://github.com/JoshuaWise/better-sqlite3

  • Cwizard 4 years ago

    I tried the same thing, also failed… I am also not an expert, however. But I am very interested in this. Could anyone reading this point me to some resources that might help?

  • folago 4 years ago

    • ignoramous 4 years ago

      Per this post [0] by Andrew Kelley, Zig's lead developer, projects with "large dependency trees" are better off using other tools than relying on Zig's cross-compile magic.

      DuckDB needs Python3 to build as well, so not sure how easy it might be to get it to cross-compile with Zig CC.

      [0] https://archive.is/7SuAf

      • eatonphil 4 years ago

        Also, the issue isn't cross-compiling; it's just static linking.

        • ignoramous 4 years ago

          Gotcha. Targeting musl instead of glibc with Zig CC should get you a statically linked binary, though I'm unsure whether duckdb and its deps play nice with musl.

          Personally, a duckdb golang binary interests me. But: I haven't yet mustered enough patience to sit through a time-consuming duckdb build.

lnsp 4 years ago

Great to see new features being implemented. I'm using DuckDB for a thesis project and integrating it into my own Python CLI/web tool has been super easy -- I especially love the direct integration with DataFrames, it makes things really seamless.

crimsoneer 4 years ago

I've been using DuckDB a fair bit recently and really enjoy it... When it has slightly better IDE support (e.g., I can use it in PyCharm) and can take in geospatial data, I'll be ecstatic.

chewbacha 4 years ago

Well, I feel silly: based on a slight misreading of the title, I totally thought that Range was some company that was acquired by DuckDB.

  • hawkfish-rmgw 4 years ago

    Oh dear, I can see that. Sorry for the confusion! I'll see if we can come up with something a bit longer. It was a bit nerdy...

  • zxv 4 years ago

    I also thought that DuckDB had acquired a company named Range. Interesting article regardless!

    • qorrect 4 years ago

      I was thinking there was some 10x programmer known only as Range that had joined.
