DTable: a new distributed table implementation in Julia using Dagger.jl

julialang.org

19 points by vchuravy 4 years ago · 5 comments

snicker7 4 years ago

It's a shame that JuliaDB was basically abandoned. I work in the financial industry, and I could see Julia competing with KDB+. Unfortunately, the Julia data engineering stack is far behind the data science stack.

  • jpsamaroo 4 years ago

    It is certainly a shame, but I'm confident that Dagger and its new DTable should be able to cover all of the ground that JuliaDB covers, while being far easier to maintain. I think JuliaDB had some great ideas, but it didn't go far enough with composability: it opted for a limited set of table types (no internal DataFrames.jl support), focused entirely on loading from CSV (which is a horrible data format, albeit a very common one), and supported only one CSV reader/writer (CSVFiles.jl). Of course, all of this could be fixed, but with Julia Computing no longer funding its direct development, and no one dedicating the large amounts of time necessary to fix the outstanding issues and to start developing and merging features, JuliaDB isn't going anywhere fast.

    Thankfully, Dagger is under active maintenance, and has financial support through the JuliaLab (by employing me). Krystian Guliński, the DTable's author and maintainer, is also interested in developing and maintaining the DTable further (having created it as part of his schooling), and will hopefully stay on the Dagger team for the foreseeable future.

1egg0myegg0 4 years ago

Have you thought about interfacing with DuckDB for out of core processing?

https://github.com/kimmolinna/DuckDB.jl

  • jpsamaroo 4 years ago

    First, work is already underway to implement out-of-core support directly in Dagger: https://github.com/JuliaParallel/Dagger.jl/pull/289.

    With that PR in place, it should be possible to define a "storage device" which is backed by a database. I haven't had a chance to actually try this, since the PR still needs quite some work and testing, but it's definitely something on my radar!

adgjlsfhk1 4 years ago

20x faster than Dask is pretty good! I hope this becomes production-ready.
