We put 1M files into DVC, Git-LFS, and Oxen.ai

docs.oxen.ai

7 points by sthoward a year ago · 5 comments

mathi0750 a year ago

Oxen is awesome. I've been having a lot of fun with your model inference tool. Any timeline on new models, and which models will you be adding next?

sthoward (OP) a year ago

Hey all. If you haven't seen the Oxen project yet, we have been building an open-source version control tool for unstructured data.

We were inspired by the idea of making large machine learning datasets living, breathing assets that people can collaborate on, rather than the static ones of the past. Lately we have been working hard on optimizing the underlying Merkle trees and data structures within Oxen.ai, and we just released v0.19.4, which brings a bunch of performance upgrades and more stable internal APIs.
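For readers who have not run into Merkle trees before, here is a minimal Python sketch of the general idea. It is an illustration only, not Oxen's actual implementation or on-disk format, and the names `hash_bytes`, `merkle_root`, and `snapshot_root` are hypothetical. Each file is hashed, pairs of hashes are hashed together level by level, and the single root hash changes whenever any file changes.

```python
import hashlib


def hash_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaf_hashes: list[str]) -> str:
    """Combine leaf hashes pairwise until a single root hash remains."""
    if not leaf_hashes:
        return hash_bytes(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            left = level[i]
            # With an odd number of nodes, pair the last node with itself.
            right = level[i + 1] if i + 1 < len(level) else left
            next_level.append(hash_bytes((left + right).encode("ascii")))
        level = next_level
    return level[0]


def snapshot_root(files: dict[str, bytes]) -> str:
    """Hash every (name, content) pair in sorted order and return the root."""
    leaves = [
        hash_bytes(name.encode("utf-8") + b"\0" + content)
        for name, content in sorted(files.items())
    ]
    return merkle_root(leaves)


# Toy stand-in for a dataset: three small "files" held in memory.
files = {
    "cat.jpg": b"cat pixels",
    "dog.jpg": b"dog pixels",
    "labels.csv": b"cat,dog",
}
root_before = snapshot_root(files)

# Changing one file changes the root, so a change is detectable
# by comparing a single hash.
files["dog.jpg"] = b"new dog pixels"
root_after = snapshot_root(files)

print(root_before != root_after)
```

In a real tool the leaves would be hashes of files on disk and the intermediate nodes would typically mirror the directory structure, so unchanged subtrees can be skipped when comparing two versions.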

To put it all to the test, we decided to benchmark the tool on the 1 million+ images in the classic ImageNet dataset.

The TL;DR: Oxen.ai is faster than raw uploads to S3, 13x faster than git-lfs, and 5x faster than DVC. The full breakdown is here:

https://docs.oxen.ai/features/performance

If you are in the ML/AI community, or a Rust aficionado, we would love to get your feedback on both the tool and the codebase. We would also welcome community contributions for different storage backends and integrations with other data tools.

  • neelm a year ago

    Have you measured the difference in speed in moving data to the GPU? For enterprise AI workflows this is a bottleneck to utilization, so improved speed can help reduce compute costs.

    • sthoward (OP) a year ago

      Great thought. Right now we are optimizing for moving data from machine A to machine B, but getting data to the GPU is interesting. We're on it.
