An introduction to libuv
nikhilm.github.io
I've written a couple of C tools directly on top of libuv[1] to play with it. I've got to say, it's a great little library. It's basically nodejs's entire cross-platform standard library & event loop exposed to C, without V8 and without npm. It performs well, the code is pretty high quality and the internal documentation is excellent. The external API docs aren't very good though - I found myself reading through nodejs a few times to figure out how I was expected to use some of the functions.
I'm quite curious to see how much performance you lose through libuv compared to using the lower level IO primitives directly (epoll, select and friends). I know redis doesn't use any high level event libraries, but I haven't seen any benchmarks.
[1] https://github.com/josephg/sharedb (Caveat: This was an experiment and libuv has probably changed in incompatible ways since I wrote this code)
Redis doesn't use libuv, and in fact rejected a patch from MS to add libuv support. The reason is dependency management, not performance. Redis takes very very few dependencies - basically just a C compiler and POSIX. This makes it easier to deploy Redis, and Redis doesn't have to worry about failures in its non-existent dependencies.
i am always a bit of jerk about these things because i constantly work with genuinely performance critical code, but the very first thing puts me off:
uv_loop_t* loop = uv_loop_new();
does the compiler know where this exists, is it allocated on demand, is there a lock involved? i hope to get the good answers to these questions but the naming of the function alone makes me skeptical. this skepticism turns out to be justified.
digging in:
loop = (uv_loop_t* )malloc(sizeof(uv_loop_t));
so the answers to all of my questions are the wrong ones for me. i might override the allocator to be less rubbish in my context. but simple things like this tell me that this library was not architected for the kinds of performance considerations that i need to make.
at this high level it's not so important, but the more digging i do the 'worse' it gets...
generally this does look helpful, but it gives me nothing over my existing solutions (in my context) which, for example, require zero run-time memory allocations - outside of OS level API calls that I have zero control over - and lean heavily towards lockless implementations, avoiding the massively (but understandably) heavyweight OS provided threading primitives...
thread safety of malloc and other standard library (i.e. libc) type stuff is, in reality, up to the implementor. even when things have no requirement to be thread safe implementors (Microsoft) will often insert what i call 'sledgehammer thread safety' to protect bad programmers from themselves. i can understand why, but it prevents me from being able to use these libraries.
when i can do a better job than your standard library, you have failed imo. but it is just my opinion...
Libuv author here. Libuv doesn't try to be all things to all people - its main users are Node.js and Rust - but if you have suggestions on how to improve the API or the implementation, please file issues[1] or join us in #libuv on irc.freenode.org. We welcome outside input.
As a bit of history, the reason why uv_loop_new() mallocs memory for the struct (and it's something of an anomaly in that respect, most other API functions don't) is that the thing that came before libuv, libev, worked like that. It's something we can change if there is demand for it.
And Julia! :)
thanks. great reply. :)
tbh, i was angling for 'my criticism isn't great because i am a specialist in a specialist field'.
(also, it does look like a genuinely useful library for most use cases - im just lazy and want everyone else to do my job for me :P)
In general, you can always do a better job than any given library for your specific use case. That might mean 10x more work though, it's all about choices/compromises.
He probably wants an alternative like:
uv_loop_t loop; uv_loop_init(&loop);
That would be my preference. Others assumed that one would want a lot of these loops, but the opposite case matters too: when you want only a few, especially when those few are fixed for the entire life of the program, you can do better by avoiding malloc altogether.
Could be my preference for working on systems that never malloc and only use specific pools and rely on knowing their exact memory requirements from the start to the end. I can acknowledge that this may be more anal than normal but then if the library didn't allocate in my name I could choose to allocate on heap or use it in a static fashion as I please.
As an undergrad I was taught about ADTs (Abstract Data Types), and back then the "proper" way was to provide a _new function to allocate the data. This also made it possible to hide the type completely, as the C header didn't need to show the contents of the struct being allocated.
If you go with not malloc'ing internally then you need to expose the entire struct and I tend to go about it by using an extra header X_internal.h that is explicitly showing the internals but expects you not to abuse this knowledge.
It's a tradeoff and I currently tend towards the second option more often than not.
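To make the tradeoff concrete, here is a minimal sketch of both styles side by side. All names (`counter_t`, `counter_new`, `tally_t`, `tally_init`, `demo_both_styles`) are hypothetical and have nothing to do with libuv's actual API; in a real project the `struct counter` definition would sit in the .c file rather than next to the typedef.

```c
#include <assert.h>
#include <stdlib.h>

/* Style 1: opaque type with a _new function (hypothetical names).
 * A public header would carry only this typedef and the prototypes. */
typedef struct counter counter_t;

/* The definition would stay private in counter.c. */
struct counter { int value; };

counter_t *counter_new(void) {
    counter_t *c = malloc(sizeof *c);   /* the library allocates */
    if (c != NULL) c->value = 0;
    return c;
}

int counter_bump(counter_t *c) { return ++c->value; }

void counter_delete(counter_t *c) { free(c); }

/* Style 2: the struct is exposed and the caller picks the storage. */
typedef struct { int value; } tally_t;

int tally_init(tally_t *t) {
    if (t == NULL) return -1;
    t->value = 0;
    return 0;
}

int tally_bump(tally_t *t) { return ++t->value; }

/* Exercises both styles; returns 0 if both behave the same. */
int demo_both_styles(void) {
    counter_t *heap = counter_new();
    if (heap == NULL) return -1;
    int a = counter_bump(heap);
    counter_delete(heap);

    static tally_t fixed;               /* static storage, no malloc */
    int rc = tally_init(&fixed);
    assert(rc == 0);
    int b = tally_bump(&fixed);

    return (a == 1 && b == 1) ? 0 : -1;
}
```

With the second style the caller can put the object on the stack, in static storage, or in a pool, at the cost of the struct layout becoming part of the public header.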
Various parts of libuv (e.g. the recv path) have pluggable allocators. I'm more interested in why you'd be wanting to create and destroy event loops at a high rate? That seems to imply you're perhaps creating and destroying threads at a high rate, in which case, you have bigger problems than malloc.
My theory: The library was designed for Node.js, not for ultimate performance. Design for Node.js means "easy to create bindings for", which means opaque pointers instead of structs in the API.
Just wondering if you could recommend some books or open source code that you'd consider to be a good role model of the kind of code you write. I use libuv in one of my projects and I was using it to better my C skills. I don't get to do C very much in my day job. But I'd like to be exposed to different styles of C so that I might understand why one style is used over another in a given context. Thanks.
this is part of the problem. most of it is locked away in proprietary source codes - something i am hoping to fix in the coming months by putting something into the public domain myself. :)
Genuine question: why would you want to create a significant number of loop contexts? (Your way does sometimes save one pointer dereference, but that's not what you appear to be talking about.)
Is there some introductory text available explaining what the purpose of libuv is, its intended usage scenarios etc.?
It's a C library to handle asynchronous IO. The library it replaced, libev, is essentially a wrapper around select which is a unix system call that looks for file descriptors that are ready for reading or writing (for more info you can use the command 'man select' in bash). My understanding is that select can be nondeterministic so there were predictability and performance improvements to be had by replacing it with a better model. The guide also links to this talk by one of the libuv authors which is a great help in understanding why they wrote libuv: https://www.youtube.com/watch?v=nGn60vDSxQ4
This is not quite correct. libev is a wrapper around the best available of select/epoll/kqueue (the same syscalls libuv uses), and it provides nice timers, thread wake (eventfd/pipe), etc.
What it doesn't provide that libuv does is high-level support for asynchronous filesystem I/O, a built-in asynchronous DNS resolver, process management abstractions and more high-level cross platform goodies for writing asynchronous apps. libev also doesn't have very good support on windows.
So, the main improvements provided by libuv are a more extensive high-level API and good windows support. I doubt speed (or deterministic latency or scalability.. etc) was a goal, as libev is very, very fast. Just lower-level.
For the record, Node's I/O performance is better now that it's based on libuv (after libuv removed its own libev dependency) than when it was based on libev. This doesn't necessarily mean that libuv is always faster than libev, but at least for Node's use case it was.
Thanks for the clarification. My experience with system level IO calls is limited. I formed my answer from my experience and the Bert Belder talk I linked to above. He actually describes libev as a wrapper around select, and says that select is slow but he does clarify in his slides that libev also uses epoll/kqueue, which I didn't notice until returning to the talk after your comment.
libuv is like libev and libevent, but with an async API instead of a level-triggered readiness notification API. It was specifically written for Node.js because libev didn't work for them anymore at some point (e.g. limited usefulness on Windows).
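For anyone unsure what "level-triggered readiness notification" means in practice, here is a small sketch using plain POSIX `select` on a pipe, with no libuv or libev involved. The helper names `is_readable` and `demo_level_triggered` are made up for illustration. The point is that the descriptor keeps being reported as readable for as long as unread data sits in it, and the caller still has to do the read itself.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error. */
int is_readable(int fd) {
    fd_set readfds;
    struct timeval timeout = {0, 0};    /* poll, do not block */

    assert(fd >= 0 && fd < FD_SETSIZE); /* select cannot watch larger fds */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    if (select(fd + 1, &readfds, NULL, NULL, &timeout) < 0) return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

/* Returns 0 if the pipe behaves in a level-triggered way, -1 otherwise. */
int demo_level_triggered(void) {
    int fds[2];
    char byte;
    int ok = 0;

    if (pipe(fds) != 0) return -1;

    int before = is_readable(fds[0]);         /* nothing written yet */
    if (write(fds[1], "x", 1) == 1) {
        int first = is_readable(fds[0]);      /* data is pending */
        int second = is_readable(fds[0]);     /* still pending, still ready */
        if (read(fds[0], &byte, 1) == 1) {
            int after = is_readable(fds[0]);  /* drained, no longer ready */
            ok = (before == 0 && first == 1 && second == 1 && after == 0);
        }
    }
    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}
```

A completion-style API, as described for libuv above, instead takes the buffer up front and calls you back once the operation has finished.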
high performance I/O library, with Windows support. I'm happy to note it has bindings for a lot of languages, including Python:
https://github.com/joyent/libuv/wiki/Projects-that-use-libuv
I'm very familiar with Twisted, Perl AnyEvent, and Gevent/Greenlet -- libuv seems to be like that.
http://redis.io/topics/internals-rediseventlib ae is good too.
Isn't this example wrong: (I'm not much familiar with libuv, but reading from the comments it might be):
http://nikhilm.github.io/uvbook/filesystem.html
    void on_read(uv_fs_t *req) {
        uv_fs_req_cleanup(req); // <-- bug? freeing here, later using req ptr?
        if (req->result < 0) {
            fprintf(stderr, "Read error: %s\n", uv_strerror(uv_last_error(uv_default_loop())));
        }
        else if (req->result == 0) {
            uv_fs_t close_req;
            // synchronous
            uv_fs_close(uv_default_loop(), &close_req, open_req.result, NULL);
        }
        else {
            uv_fs_write(uv_default_loop(), &write_req, 1, buffer, req->result, -1, on_write);
        }
    }
It's correct. uv_fs_req_cleanup() deletes some private data associated with the uv_fs_t. `result` is part of the public interface and is unaffected.
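A toy model of that ownership split, for anyone who finds the pattern surprising. This is not libuv's real struct or implementation: `toy_req_t`, `toy_req_start`, `toy_req_cleanup` and `toy_demo` are invented names, and the only claim being illustrated is that a cleanup function can free privately owned memory while leaving a public field readable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request type, NOT libuv's uv_fs_t. */
typedef struct {
    long result;    /* public: outcome of the operation */
    char *scratch;  /* private: heap memory owned by the library */
} toy_req_t;

/* Pretend to run an operation: copy the path privately, store a result. */
int toy_req_start(toy_req_t *req, const char *path, long result) {
    assert(req != NULL && path != NULL);
    size_t len = strlen(path) + 1;
    req->scratch = malloc(len);
    if (req->scratch == NULL) return -1;
    memcpy(req->scratch, path, len);
    req->result = result;
    return 0;
}

/* Frees only the private allocation; the struct itself belongs to the caller. */
void toy_req_cleanup(toy_req_t *req) {
    free(req->scratch);
    req->scratch = NULL;
}

/* Returns the result as read AFTER cleanup, or -1 if setup failed. */
long toy_demo(void) {
    toy_req_t req;
    if (toy_req_start(&req, "file.txt", 42) != 0) return -1;
    toy_req_cleanup(&req);
    return req.result;  /* still valid: req itself was never freed */
}
```

The request struct lives wherever the caller put it, so reading `result` after cleanup is fine as long as the struct itself is still in scope.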
It looks like the BeBook.
Author here.
It's because it uses the Haiku standard theme shipped with the sphinx documentation generator, which does come from the Haiku project :)