Creating a Proxy Server with Go
codingcookies.com
I know it's irrational, but it drives me a little nuts how the proxy idiom in Go is "two goroutines implementing socket-to-socket copy". The inner handler loop of a proxy is a place where, to me, select/poll might actually make the code easier to follow; also, the idiom doubles the number of goroutines required to handle a given connection load, and while goroutines are cheap, they aren't free.
I know it's possible to pull select() into Golang programs (I ended up having to, to write a fast port scanner), but Golang people look at you weirdly when you tell them you did that.
Go is strongly opinionated and very normative (which is a good thing for its main target audience). It's not surprising it attracts people who would look down on you for departing from the norm, even if your choices are technically justified.
Do you have any examples or data showing a benefit from an explicit asynchronous IO pattern, as opposed to letting the Go IO libraries perform the select() (epoll or whatever...) syscall and relying on the goroutine scheduler to switch control to whichever goroutine is registered to act when the data is ready?
Is the default minimal stack segment size too big? Or the scheduler too heavyweight?
In the case of a proxy, the select(2) call implements the simple predicate "should I read from the client or the server". It turns a pair of coroutines into a single normal routine. The single routine is clearer, perhaps marginally less performant for a single connection, but probably marginally more performant across a large number of connections.
The "asynchronous pattern" thing is a religious canard. Golang devotees are (rightfully) happy to abandon event-structured programs in favor of programs that look like socket tutorial code but perform like event servers. But we're not talking about asynchronous control; we're talking about a synchronous loop. I've noticed this when talking to Golang people: they hear "select" or "poll" and automatically a switch goes off in their head that lights the "BAD!" lamp. I'm not sure that's valid.
Another example, which I gave upthread, was a high-speed port scanner; when I left it to the Golang scheduler to handle the sockets with Golang's timeout idiom, I quickly starved the program of sockets, because of the interaction between timeouts and the scheduler. I pulled select(2) into the program (for that one use only! Just to avoid using Golang timeouts for a simple connect(2) timeout) and the program not only ran quickly but properly handled the socket descriptors.
This isn't a critique of Golang, which I like working with. Rather, I'm criticizing a specific Golang idiom.
I'm sorry, I'm not sure I follow you. Does the problem with your high-speed port scanner lie in the implementation of DialTimeout, or did you for some reason have to use syscall.Connect in a new goroutine and implement the timeouts with select{}?
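For readers who haven't used it, here is a minimal sketch of the standard-library path the question refers to: a connect with a deadline via net.DialTimeout. This is not the select(2) approach described above, and it says nothing about how either behaves under scanner-level load. The names portOpen and probeLoopback are made up for illustration.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// portOpen reports whether a TCP connect to addr succeeds within timeout.
func portOpen(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false // refused, unreachable, or timed out
	}
	conn.Close() // release the descriptor right away
	return true
}

// probeLoopback (hypothetical helper) probes a loopback port while a
// listener is bound to it and again after the listener is closed.
func probeLoopback() (whileListening, afterClose bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	addr := ln.Addr().String()
	whileListening = portOpen(addr, time.Second)
	ln.Close()
	afterClose = portOpen(addr, time.Second)
	return whileListening, afterClose, nil
}

func main() {
	up, down, err := probeLoopback()
	fmt.Println(up, down, err)
}
```

Closing the connection immediately in portOpen matters for a scanner: every successful probe holds a file descriptor until it is closed.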
errrm, why not just `func Copy(dst Writer, src Reader) (written int64, err error)` ? avoid having to select/poll and just let the io.Reader and io.Writer interfaces do the work for you... and if they implement WriteTo, all the better.
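To make the idiom under discussion concrete, here is a rough sketch of "two goroutines implementing socket-to-socket copy" built on io.Copy. It is not the article's code; relay, closeWrite, and roundTrip are names invented for this example, and copy errors are ignored to keep it short.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// closeWrite half-closes a TCP connection so the peer sees EOF while the
// other direction keeps flowing; other connection types are fully closed.
func closeWrite(c net.Conn) {
	if tc, ok := c.(*net.TCPConn); ok {
		tc.CloseWrite()
		return
	}
	c.Close()
}

// relay is the two-goroutine idiom: one io.Copy per direction.
// Copy errors are ignored here, which a real proxy should not do.
func relay(a, b net.Conn) {
	done := make(chan struct{}, 2)
	oneWay := func(dst, src net.Conn) {
		io.Copy(dst, src)
		closeWrite(dst)
		done <- struct{}{}
	}
	go oneWay(a, b)
	go oneWay(b, a)
	<-done
	<-done
}

// roundTrip (hypothetical helper) wires client -> proxy -> echo server on
// loopback ports and returns whatever the client reads back.
func roundTrip(msg string) (string, error) {
	echoLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer echoLn.Close()
	go func() {
		c, err := echoLn.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		io.Copy(c, c) // echo until the proxy half-closes
	}()

	proxyLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer proxyLn.Close()
	go func() {
		client, err := proxyLn.Accept()
		if err != nil {
			return
		}
		defer client.Close()
		backend, err := net.Dial("tcp", echoLn.Addr().String())
		if err != nil {
			return
		}
		defer backend.Close()
		relay(client, backend)
	}()

	conn, err := net.Dial("tcp", proxyLn.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := io.WriteString(conn, msg); err != nil {
		return "", err
	}
	closeWrite(conn) // signal end of request
	out, err := io.ReadAll(conn)
	return string(out), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```

Each proxied connection costs two goroutines inside relay, which is the overhead the top comment objects to.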
I've written a lot of code using select and poll, and I don't think it's ever come out "simpler and easier to follow" than code which just used blocking I/O.
We're not talking about the same thing. I'm not suggesting that Golang would be better off with fully evented I/O. In fact, the comment I wrote that you replied to agrees with what you just said.
I'm not sure I agree with how channels are used here. What's the point of this spaces chan? Why couldn't a simple atomic counter solve this (see sync/atomic)? Why allocate a thousand bools?
    // The booleans representing the free active connection spaces.
    spaces := make(chan bool, *maxConnections)
    // Initialize the spaces
    for i := 0; i < *maxConnections; i++ {
        spaces <- true
    }

Is this really how people use Go???
This is generally the Go way of limiting concurrency and usually the recommended approach. Some use empty structs in place of bools, but otherwise there doesn't seem to be a problem with the approach.
yes, using a channel's buffer for resource management like this is a common pattern. I can't say that I necessarily like this pattern, but... yes, it is common.
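A small sketch of the pattern, using the empty-struct variant mentioned above. Unlike the quoted snippet, which pre-fills the channel with tokens and takes one per connection, this version sends to acquire and receives to release; both forms bound concurrency the same way. runLimited and its bookkeeping are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited starts `tasks` goroutines but lets at most `limit` of them run
// at once, and returns the highest concurrency it observed.
func runLimited(tasks, limit int) int {
	sem := make(chan struct{}, limit) // buffer capacity is the limit
	var (
		mu           sync.Mutex
		active, peak int
		wg           sync.WaitGroup
	)
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		sem <- struct{}{} // acquire: blocks while all slots are taken
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // release the slot

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			time.Sleep(time.Millisecond) // stand-in for real work

			mu.Lock()
			active--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak concurrency:", runLimited(20, 3))
}
```

The mutex exists only to measure the peak; the limiting itself is done entirely by the channel buffer.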
In this case, this particular part of the application is worth questioning, because the error condition isn't reasonable. Right now, if the proxy is full, it accepts a TCP connection from a client, and then it just ... stalls. It doesn't disconnect the client, it doesn't read from the client, it just hangs. So if a client were to actually connect to this proxy when it's full, they'd just open up a connection and wait.
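One possible way to avoid the stall, sketched under the assumption that rejecting is acceptable: try to take a slot without blocking, and close the connection when none is free. The limiter type and the admit helper are hypothetical, not part of the article.

```go
package main

import (
	"fmt"
	"io"
)

// limiter is a counting semaphore backed by a buffered channel.
type limiter chan struct{}

// tryAcquire takes a slot if one is free and never blocks.
func (l limiter) tryAcquire() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false // full: caller should reject instead of waiting
	}
}

// release frees a slot taken by a successful tryAcquire.
func (l limiter) release() { <-l }

// admit closes conn right away when the proxy is full, so the client gets
// a prompt disconnect rather than a connection that never answers.
func admit(l limiter, conn io.Closer) bool {
	if l.tryAcquire() {
		return true
	}
	conn.Close()
	return false
}

func main() {
	l := make(limiter, 2)
	fmt.Println(l.tryAcquire(), l.tryAcquire(), l.tryAcquire())
	l.release()
	fmt.Println(l.tryAcquire())
}
```

In an accept loop, a handler that was admitted would call release when it finishes with the connection.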
Using the sync/atomic package isn't common. Yes, you could do it, but it would be more common to just have a goroutine with a counter in it, and a select statement to serialize increment and decrement messages.
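A rough sketch of that alternative, with one goroutine owning the count and a select serializing every operation. The counter type and its method names are invented for the example.

```go
package main

import "fmt"

// counter keeps its value inside a single goroutine; all access goes
// through channels, so no locks or atomics are needed.
type counter struct {
	inc, dec chan struct{}
	get      chan int
	quit     chan struct{}
}

func newCounter() *counter {
	c := &counter{
		inc:  make(chan struct{}),
		dec:  make(chan struct{}),
		get:  make(chan int),
		quit: make(chan struct{}),
	}
	go c.loop()
	return c
}

// loop is the only code that touches n.
func (c *counter) loop() {
	n := 0
	for {
		select {
		case <-c.inc:
			n++
		case <-c.dec:
			n--
		case c.get <- n: // hand the current value to a reader
		case <-c.quit:
			return
		}
	}
}

func (c *counter) Inc()       { c.inc <- struct{}{} }
func (c *counter) Dec()       { c.dec <- struct{}{} }
func (c *counter) Value() int { return <-c.get }
func (c *counter) Stop()      { close(c.quit) }

func main() {
	c := newCounter()
	defer c.Stop()
	c.Inc()
	c.Inc()
	c.Dec()
	fmt.Println("active:", c.Value())
}
```

Because the channels are unbuffered, each Inc or Dec has been received by the loop before the call returns, so a later Value from the same caller reflects it.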
And then there are the actual handlers, which throw away the error information if there's an error actually writing to the connection.
Looking at this some more... Why wouldn't you just make the waiting chan buffered and eliminate any tracking of connections?
Hey, poster here.
You're right that sync/atomic could've taken care of this, I wasn't aware of that package and figured channels were the way to go in Go.
As for making the waiting chan buffered, the reason I wanted to keep track of pending connections and active connections is because I'd like to proxy from a high-power server to a low-power server such as a Raspberry Pi. I agree with you that it could have done without though.
Thanks for the tips! :-)
So perhaps I was a bit harsh. Do check out this set of slides though: http://talks.golang.org/2012/concurrency.slide
One of the last slides: http://talks.golang.org/2012/concurrency.slide#54
Thanks, that's an awesome resource!
I think in this case using atomic.Add* is much clearer, and probably better.
sync/atomic would indeed be the way to solve this problem.
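For comparison, a minimal sketch of what an atomic.AddInt64-based limit could look like. The connLimit type is hypothetical. One difference from the channel version: an atomic counter can only accept or reject, it cannot park a waiting goroutine until a slot frees up.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// connLimit caps the number of active connections with an atomic counter.
// active is the first field so it stays 64-bit aligned on 32-bit platforms.
type connLimit struct {
	active int64
	max    int64
}

// tryAcquire optimistically bumps the counter and backs out if that
// overshoots the limit.
func (l *connLimit) tryAcquire() bool {
	if atomic.AddInt64(&l.active, 1) > l.max {
		atomic.AddInt64(&l.active, -1)
		return false
	}
	return true
}

// release must be called once for every successful tryAcquire.
func (l *connLimit) release() {
	atomic.AddInt64(&l.active, -1)
}

func main() {
	l := &connLimit{max: 2}
	fmt.Println(l.tryAcquire(), l.tryAcquire(), l.tryAcquire())
	l.release()
	fmt.Println(l.tryAcquire())
}
```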
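The inbuilt reverse proxy is also handy for small tasks:

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // Backend that incoming requests are forwarded to.
        target, err := url.Parse("http://127.0.0.1:8000")
        if err != nil {
            log.Fatal(err)
        }
        // ListenAndServe only returns on failure, so report the error.
        log.Fatal(http.ListenAndServe(":80", httputil.NewSingleHostReverseProxy(target)))
    }
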
This will proxy HTTP from :80 to 127.0.0.1:8000.