Why Rails 4 Live Streaming is a Big Deal

blog.phusion.nl

38 points by ninh 13 years ago · 59 comments

amix 13 years ago

Rails is turning into a framework that includes everything, including the kitchen sink. Personally, I prefer to use the best tool for the job and node.js seems to be a much saner choice when doing realtime communication, since everything in node.js is non-blocking. There are so many ways to shoot yourself in the foot if you develop large realtime systems in Ruby (or any other language that includes a lot of blocking libraries).

  • sjtgraham 13 years ago

    Actually, not everything in Node is non-blocking. I/O is largely non-blocking by default, but there is blocking I/O in Node too (the synchronous file system functions). Not to mention you will definitely block the reactor loop by doing something computationally intensive in a single tick.

    Have you ever written a non-trivial "real time" app in Ruby? I have (https://github.com/stevegraham/slanger). I think Ruby is actually very well suited to event driven apps. Eventmachine is a very mature library providing asynchronous I/O based on the same pattern as Node. Ruby also has fibers as a native language feature, allowing you to write asynchronous code that looks synchronous, i.e. no nested callback hell, and consequently this makes it a lot easier to write tests for.

    Comparing Node to Rails is also absolute nonsense. Rails is a web framework and Node is much lower level than that. Rails is essentially a suite of DSLs for building web applications. Of course there are costs associated with that amount of abstraction.
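The fiber pattern described above can be sketched outside any framework. Everything here is a stand-in: `async_fetch` plays the role of an evented client (e.g. em-http-request) and the `pending` array plays the role of the reactor's callback queue:

```ruby
pending = []  # stand-in for an event loop's callback queue

# Hypothetical async client: registers a callback instead of blocking.
async_fetch = lambda do |url, &callback|
  pending << [callback, "response for #{url}"]
end

# Synchronous-looking wrapper: suspend the current fiber until the
# callback fires, then return the result as if it were a plain call.
fetch = lambda do |url|
  fiber = Fiber.current
  async_fetch.call(url) { |result| fiber.resume(result) }
  Fiber.yield
end

results = []
f = Fiber.new do
  results << fetch.call("http://example.com")  # no nested callbacks
end

f.resume                        # runs until Fiber.yield suspends it
callback, data = pending.shift
callback.call(data)             # the "event loop" delivers the result
puts results.first
```

The calling code inside the fiber reads top-to-bottom with no callback nesting, which is the testability benefit sjtgraham is pointing at.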

    • sunkencity 13 years ago

      One thing that keeps me from considering EventMachine mature is that the built-in HTTP clients are very crude and undocumented. Example: to get working error handling one needs to use an external HTTP client library like igrigorik's em-http-request instead of the defaults. In this regard Node comes out ahead, with better core utilities to boot. Stuff like that is very confusing for new users and calls the whole framework into question (shipping with an HTTP client that is not suitable for production use).

    • amix 13 years ago

      Having blocking computation is much better than having blocking IO -- especially since most web applications spend most of their time waiting for the database/caching layer. And blocking computation can be solved in Node using the cluster feature.

  • masklinn 13 years ago

    > Rails is turning into a framework that includes everything

    It's really not turning into anything, Rails had response streaming back in the days, the whole stack has response streaming (as long as the underlying server handles it and there's no broken middleware anywhere) and most microframeworks have response streaming.

    There's nothing special to response streaming.

    > since everything in node.js is non-blocking.

    Not in your wildest dreams: IO is non-blocking and that's about it. Try a bit of computation in a Node response flow and watch your concurrency disappear. If you're looking for "everything is non-blocking", Node most definitely isn't going to cut it.

    > There are so many ways to shoot yourself in the foot if you develop large realtime systems in Ruby

    Just as there are in Node. And Ruby has eventmachine which works rather well.

  • malandrew 13 years ago

    This is exactly what I thought when I saw this on the front page. I wouldn't be surprised if they eventually add a kitchen sink library.

    Choosing a monolithic framework doesn't encourage separation of concerns and clean interfaces. It also makes scaling horizontally more difficult.

    So many Ruby gems are built around Rails instead of being built as separate, discrete libraries. This means that once Rails gets really long in the tooth (it has already started to), it's going to take a bunch of libraries along with it that could have been timeless, had they not been so deeply integrated with Rails' architectural features.

    The fact that a document like this exists... https://github.com/radar/guides/blob/master/sprockets/sprock...

        This is a detailed guide to the internal workings of Sprockets. Hopefully 
        with this information, somebody else besides Josh Peek, Sam Stephenson, 
        Yehuda Katz and (partially) myself can begin to understand how Sprockets works.
    
    ... that connects so many parts of rails isn't a good sign.

    node.js, or more specifically npm, got things really right by making the inclusion of libraries trivial, which makes it easy to keep concerns separate.

    I can't begin to imagine the complications with trying to do proper async around a framework that wasn't designed from the ground up to be async. I'd expect that you'll often have to dig into the internals of Rails to discover where something is functioning synchronously where you didn't expect it to. Don't get me wrong, I love the opportunity to dig into other people's code that I rely on, but that is only fun with libraries, not frameworks. When digging around in frameworks you often spend tons of time having to investigate glue code between parts to even understand the part you need to understand.

  • jherdman 13 years ago

    I think this is a bit of an overreaction given that Rails has always been a "batteries included" framework. The fact that Rails allows for socket-like behaviour now is not a tipping point, but simply a nice little feature that may prove useful to some.

    When Rails starts including a messaging stack, job server, and more, then I'd be right behind you in your complaint.

    • adriand 13 years ago

      > When Rails starts including a messaging stack, job server, and more, then I'd be right behind you in your complaint.

      Actually, Rails 4 does include a job queue:

      https://github.com/rails/rails/commit/adff4a706a5d7ad18ef053...

      However, I'm not complaining. ;)

      • FooBarWidget 13 years ago

        Well to be more accurate: it includes a job queue API with the actual implementation being pluggable. It does not do job queuing by itself. It's just like the cache API, actual implementation depends on the cache store.

        • adriand 13 years ago

          True, thanks for the correction. The person I responded to said "job server" and I'm not 100% clear on what that would cover - perhaps job queue API would fit that description, perhaps it wouldn't.

  • ZoFreX 13 years ago

    I'm actually quite excited about this, because I think it makes a lot of sense for this to be in Rails, for a couple of reasons.

    First of all, there aren't really any decent solutions for the problem in Ruby. There's a few projects with great potential, but nothing really mature - when I needed to do this, I went with cramp.in, which hasn't been updated in over 6 months now sadly. Node.js would unequivocally have been the better tool for the job, but I didn't want to learn Javascript or Node for this one-off.

    Secondly, even if there were mature Ruby equivalents to Node.js, or no impediment to me using Node, I can still see a lot of benefit to doing this in Rails. Personally I think it makes sense to be able to send events down to the clients and hook into the same models and business code you already have in place in the Rails app. (For certain use-cases, for example a project that's in Rails, is mostly a "regular" web app, but you want to give live updates for certain model changes, such as messages, ticket changes, new blog posts, whatever - obviously if the app is totally oriented around live functionality then node would probably be a smarter choice)

    (I apologise for using the term "Node.js" as if its another web framework, I know that's not entirely accurate but I don't know enough about it to write more accurately!)

    • randomdata 13 years ago

      > First of all, there aren't really any decent solutions for the problem in Ruby.

      Even ignoring other frameworks, Rails has supported streaming through various APIs since at least the 2.x days.

      In Rails 2, you would pass a Proc to the render method giving direct access to the response.

      Rails 3 changed the API to any Enumerable assigned to self.response_body, as described in the article.

      Rails 4 gets yet another API. It may be arguably cleaner, but the end functionality remains the same as it has always been.
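      The Rails 3 style randomdata mentions can be seen in miniature without Rails at all: the body is just an object responding to `#each`, and the server flushes every yielded chunk. `TickerBody` is a made-up illustration, not a Rails class:

```ruby
# Any Enumerable can serve as a Rails 3-style response body: the server
# calls #each and writes each chunk to the client as it is yielded.
class TickerBody
  include Enumerable

  def each
    3.times { |i| yield "tick #{i}\n" }
  end
end

chunks = []
TickerBody.new.each { |chunk| chunks << chunk }  # a server would flush here
print chunks.join
```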

    • masklinn 13 years ago

      > First of all, there aren't really any decent solutions for the problem in Ruby.

      Uh? Rack has supported streaming since the beginning, and Sinatra has had special support on top of that since 1.3.

      > Secondly, even if there were mature Ruby equivalents to Node.js

      http://rubyeventmachine.com/

      > I can still see a lot of benefit to doing this in Rails.

      Don't dismiss the drawbacks. Such as self-DOS-ing. Your Rails application can only have as many clients total as you have workers if they all keep a permanent connection.

      • saurik 13 years ago

        "Why your web framework should not adopt Rack API" by José Valim

        http://blog.plataformatec.com.br/2012/06/why-your-web-framew...

        "This blog post is an attempt to detail the limitations in the Rack/CGI-based APIs that the Rails Core Team has found while working with the streaming feature that shipped with Rails 3.1 and why we need better abstractions in the long term."

        • masklinn 13 years ago

          Broken middlewares won't be any less broken now than before.

          • blutonium 13 years ago

            They are less broken now. The Rails team put a lot of effort into fixing them in the past three releases.

      • ZoFreX 13 years ago

        What I meant was more top-to-bottom support, although I confess I never looked into the feasibility of just rolling my own thing with Rack. I should have clarified what I was looking for, which was an event-based micro web framework.

        EventMachine is what underpins Cramp, which is what I went with, and it does the job OK. The problems I had were to do with less-than-full support from the top to the bottom of the stack (basically the only server I could use was Thin, and there was no way to get anything working with TorqueBox that I could find).

        > Your Rails application can only have as many clients total as you have workers if they all keep a permanent connection.

        So even though it's event-based, if I have 1000 clients long-polling (or SSE'ing, or whatever) that would tie up all my workers? I may have misunderstood what we were getting in Rails 4, then.

        • masklinn 13 years ago

          > So even though it's event-based

          Rails is not event-based. An evented system can have an n:m relation between workers and clients.

          > I may have misunderstood what we were getting in Rails 4, then.

          Yeah, or I did. As far as I understood, Rails 4 just adds streaming responses: the client starts getting bytes as soon as you start generating them (by calling `response.write` or whatever). That's not evented, that's just writing to the response stream; you can do that in CGI if you want.
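That reading matches the shape of the Rails 4 API: a controller action writes chunks to `response.stream` and closes it when done. A runnable approximation, with `FakeStream` standing in for the real stream object (the real one comes from ActionController::Live):

```ruby
require 'stringio'

# Stand-in for response.stream: an object with #write and #close that
# the controller action pushes bytes into as they become available.
class FakeStream
  def initialize
    @io = StringIO.new
  end

  def write(data)
    @io.write(data)
  end

  def close; end  # the real stream finalizes the connection here

  def contents
    @io.string
  end
end

stream = FakeStream.new
# In a real Rails 4 controller this would be response.stream.write(...)
3.times { |i| stream.write("data: tick #{i}\n\n") }  # SSE-style framing
stream.close
print stream.contents
```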

      • bascule 13 years ago

        Rack 1.x doesn't have end-to-end streaming, and certainly not with a socket-style API. The Rack specification mandates buffering and rewinding input, and while it's possible to write a rewindable input stream (I've actually seen it done) nobody has ever written a stable one.

        EventMachine is poorly maintained and doesn't have the same level of community support as Node.js, not to mention an ugly API. Of course, Node.js doesn't have the same level of maturity as Twisted Python but Twisted doesn't get any hype whatsoever.

  • cageface 13 years ago

    I think the problem is that Rails was designed to solve the web problems of 2005 and a lot has changed since then.

    The shift to single-page, JS-driven applications and large numbers of dynamic updates requires a different set of design priorities and the performance characteristics of Ruby itself are more problematic in this environment.

    In many cases it makes more sense to use something like Go or Node or Erlang that was designed to handle these kinds of loads from the ground up.

    • bascule 13 years ago

      None of this is true. Single-page apps need great JSON APIs. Rails is awesome at making great JSON APIs. With Ember Data on the front end and ActiveModel::Serializers on the backend, you can completely free yourself from writing JSON serialization code.

      For websockets-based applications, check out Cramp. And note that as both Rails and Cramp are Rack-based, you can combine them.

      "Go or Node or Erlang"

      One of these things is not like the others...

      • malandrew 13 years ago

        Rails works just fine for REST, but you've forgotten the other side of this and that is the part of the framework that handles the javascript assets, i.e. Sprockets.

        The difference between handling the single page app in Rails and node.js is night and day. Sprockets is a nightmare to deal with.

        • PetrolMan 13 years ago

          Please explain.

          I've actually had a pretty good time dealing with Sprockets. There are the occasional deployment issues but the benefits tend to more than make up for any frustrations.

          • malandrew 13 years ago

            Problems:

            (1) Slow

            (2) Sprockets' C-style require directives, which exist because the people working in Ruby don't want to get their hands dirty crawling the JavaScript AST and parsing out CommonJS (or even Harmony/ES module system) requires.

            (3) Incompatibilities between various JS-focused gems because of sprockets issues. Getting Require.js to work took a long time and then it took a lot longer to get the require.js gem to work with jasmine.js, and even then it ended up being a jerryrigged approach.

            In general, sprockets is a mess with respect to Javascript handling versus what you can do in node.js.
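For reference, the directives in question are comments at the top of a Sprockets-managed JavaScript file, parsed by Sprockets rather than by the language. These two are real Sprockets directives; the file names are illustrative:

```javascript
//= require jquery
//= require_tree .
```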

          • bascule 13 years ago

            It's pretty slow, among other things

        • bascule 13 years ago

          Nobody's forcing you to use Sprockets. You can use e.g. rake-pipeline instead

      • cageface 13 years ago

        Knocking up a REST API in any decent web framework is easy. Rails makes it slightly easier.

        It's good that things like cramp exist but async is really something you want to design into your language and framework from day one and not bolt on via libraries later.

    • jcoder 13 years ago

      "Shift"? I'd go for "emergence." The classic use cases are still alive and well.

  • bascule 13 years ago

    "There are so many ways to shoot yourself in the foot if you develop large realtime systems in Ruby (or any other language that includes a lot of blocking libraries)."

    As opposed to doing I/O flow control in an asynchronous, callback-driven system? Have you ever heard of the "slow consumer problem"?

    Not only can you build realtime systems with threads, by using synchronous I/O you'll be taking advantage of all the flow control TCP has to offer, instead of unboundedly filling up write buffers.
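    The flow-control point can be demonstrated with a socket pair: a blocking write simply waits when the peer's buffers fill, so a slow consumer throttles the producer without any application-level bookkeeping. The sizes here are arbitrary:

```ruby
require 'socket'

reader, writer = UNIXSocket.pair
chunk = "x" * 1024

# The producer blocks inside write() whenever the kernel's socket
# buffers are full, i.e. whenever the consumer falls behind.
producer = Thread.new do
  8.times { writer.write(chunk) }
  writer.close
end

received = 0
while (data = reader.read(1024))
  received += data.bytesize
end
producer.join
puts received
```

In an evented system, by contrast, writes "succeed" immediately into a userland buffer, and bounding that buffer for slow consumers is left to the application.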

    • malandrew 13 years ago

      If you have the slow consumer problem then you start forking node.js into separate worker processes (via binary or counting semaphores) and do IPC via something like dnode (locally or even over something like seaport).

  • chaz 13 years ago

    Agree that best tool for the job is important, and that node.js may be the right tool if your app is very realtime-focused. But no need to go node.js if all you need is a progress indicator.

    I think the modern web stack does need an easy way to do realtime communication, and as a casual Rails engineer since pre 1.0, I'm happy to see this get support out of the box.

  • batista 13 years ago

    >There are so many ways to shoot yourself in the foot if you develop large realtime systems in Ruby (or any other language that includes a lot of blocking libraries).

    How is developing "large realtime systems" in Node any better?

    It's a thrown together library on top of V8, in a language that doesn't even have concurrency primitives.

    Doing non-blocking stuff in Node is like powering your car by pedalling. It works and takes you places, but it misses the point of needing a car in the first place.

    • malandrew 13 years ago

      If you approach it with a Rails/framework mindset, it's worse in node.js, but if you approach designing a large realtime system as many separate loosely coupled node.js processes, it's easier.

      Whenever I see someone trying to cobble together their own framework by prepackaging a bunch of node.js libraries I see someone who is doing it wrong.

      You shoot yourself in the foot in node.js when you treat it like rails. You shoot yourself in the foot with rails when you try to do anything outside the scope of problems for which it was designed.

alexyoung 13 years ago

"Can Rails compete with Node.js?"

For the perplexed: Node isn't a web framework.

  • FooBarWidget 13 years ago

    It isn't, but in the context of the article I was talking about competing on the ability to support certain I/O use cases, not comparing features.

  • mikeryan 13 years ago

    It took about 2 weeks of working with Node to realize where its place in your stack is, and it's very much not the same thing I use Rails for. It's also a pretty distinct line that is pretty easy to draw.

    I also find any Node/Rails comparisons pretty silly, honestly; they're very different tools, and though you can do the same thing with each, I wouldn't.

  • batista 13 years ago

    >For the perplexed: Node isn't a web framework.

    For the perplexed: it doesn't matter.

    Take it to mean "Node+whatever" vs Rails. Or even "raw Node + totally custom js framework on top" vs Rails.

bascule 13 years ago

"Cons: If a thread crashes, the entire process goes down."

I wrote this thing called Celluloid and I can assure you this isn't true. Ruby has "abort_on_exception" for threads, but the default is most assuredly false.

"Good luck debugging concurrency bugs."

Good luck debugging concurrency bugs in a callback-driven system!

  • FooBarWidget 13 years ago

    > I wrote this thing called Celluloid and I can assure you this isn't true. Ruby has "abort_on_exception" for threads, but the default is most assuredly false.

    I'm talking about CPU instruction level crashes, not language level crashes. Things like writing to an invalid memory address or heap corruption.

    > Good luck debugging concurrency bugs in a callback-driven system!

    Actually I already mentioned concurrency bugs in evented systems in the article.

    • bascule 13 years ago

      "Things like writing to an invalid memory address or heap corruption."

      So what you're trying to say is if the entire virtual machine crashes, you lose all running threads.

edwinnathaniel 13 years ago

It's becoming more like... GASP JavaEE GASP

  • bascule 13 years ago

    Rails: reinventing Java one feature at a time (and that's not necessarily a bad thing)

  • batista 13 years ago

    Yes, I can see how Rails evolving to match the evolving needs of its users can give that impression... /s

aoe 13 years ago

So these changes won't be available in the free version of Phusion Passenger 4?

parfe 13 years ago

>Several days ago Rails introduced Live Streaming: the ability to send partial responses to the client immediately.

Would this be analogous to what PHP does if you begin writing a response without output buffering?

  • FooBarWidget 13 years ago

    Yes you can do the same in PHP by disabling output buffering. You're limited by the web server's concurrency model however. Apache's mod_php only works on the prefork MPM so your concurrency is limited by the number of Apache processes you can spawn (which can be quite bloated because you run the PHP interpreter inside Apache). Another less commonly used but still notable setup is PHP via FastCGI (e.g. when using PHP through Nginx). Here you are limited by the number of PHP-FastCGI processes you spawn.

why-el 13 years ago

Pardon the ignorance, but can't this be achieved by simple Ajax requests provided by any of the js frameworks? How is this better?

  • jherdman 13 years ago

    Ajax requires long polling, this PUSHES the response to the server, thus obviating the need for long polling.

    • masklinn 13 years ago

      > this PUSHES the response to the server

      Erm... it pushes the response to the client, not the server, and only pushes after a normal HTTP request.

      And of course, it also ties up a huge amount of server resources (total number of clients = total number of workers, since each client permanently ties up a connection forever). Phusion Passenger's docs recommend 8 workers/GB of RAM, so hope you don't expect many users.
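      The arithmetic behind that warning, taking the 8-workers-per-GB figure at face value (the 4 GB machine is just an illustration):

```ruby
ram_gb = 4                      # illustrative machine size
workers_per_gb = 8              # figure quoted from Passenger's docs
max_workers = ram_gb * workers_per_gb

# With one worker pinned per open connection, concurrent streaming
# clients can never exceed the worker count.
max_streaming_clients = max_workers
puts max_streaming_clients
```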

      • jeltz 13 years ago

        Streaming works fine with Thin + nginx. A single Thin instance can handle hundreds (or thousands, depending on how much data you stream) of connected users without any problems. The only risk I see here is that long requests can lock up the streaming. Our solution is to have nginx route to one server for streaming and to another for normal requests.

      • mapgrep 13 years ago

        1. Long polling can tie up server resources too. There is a process on the other end of that long AJAX request. The mechanism for delivering the streaming connection to the client is orthogonal to the mechanism for handling that connection on the server.

        2. You say a streaming connection "ties up a huge amount of server resources," but the whole point of the linked article is that this does not have to be the case; Node.js can (when used correctly) handle loads of connections in a single process, and Phusion Passenger is clearly trying to evolve their model to achieve similar if not fully comparable results.

        • masklinn 13 years ago

          > Node.js can (when used correctly) handle loads of connections in a single process

          Node uses an evented IO layer, that is completely orthogonal (and thus irrelevant) to streaming responses. You can stream responses with blocking IO, and you can buffer responses with evented IO.

          > Phusion Passenger is clearly trying to evolve their model to achieve similar if not fully comparable results.

          If you think that can happen, you're deluding yourself. Ruby+Rails's model means you need one worker (OS-level, be it a process or a thread does not matter) per connection. With "infinite" streaming responses this means each client ties up a worker forever. OS threads may be cheaper than OS processes (when you need to load Ruby + Rails into your process), but that doesn't mean they're actually cheap when you need a thousand or two.

      • xentronium 13 years ago

        > total number of clients = total number of workers, since each client permanently ties up a connection forever

        Totally up to your application server. The actual formula is number of clients = total number of threads. It is up to the app server to handle parallel requests.

        • masklinn 13 years ago

          > Actual formula is number of clients = total number of threads.

          That's what I said/meant by "worker", whether it's a thread or process does not matter, it pretty much has to be an OS-level concurrency primitive.

      • jherdman 13 years ago

        Yup! You're right. My bad.

sergiotapia 13 years ago

Is this any different than what SignalR provides for ASP.Net Web Applications?

  • masklinn 13 years ago

    It's got essentially no relation with SignalR. It's equivalent to using `response.OutputStream.Write` in your HttpHandler.

    If you're looking for a SignalR equivalent in Ruby, you need EventMachine.
