Just thought I’d share this.
31K to 49K transactions per second on a single server. My message size varies, but it's 1-4 KB; compression obviously brings this figure down: Snappy from Kafka to the consumer, and gzip to disk.
The machine used has 12 cores (24 CPUs with hyper-threading) on a 10-gigabit network; most of the work is network and disk I/O.
My use case is:
-> read JSON (Snappy-compressed) data from Kafka,
-> insert into local files that are rolled over and sent to Hadoop.
The files are partitioned by queue, year, month, day and hour, and are compressed (configurable) using the Hadoop gzip codec.
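The partitioning scheme above can be sketched as a pure function that builds the directory path for a message. This is a minimal illustration, not the project's actual code; the parameter names are hypothetical.

```clojure
;; Sketch: build a partitioned path of the form queue/year/month/day/hour.
;; The real project derives these fields from the message and its queue;
;; here they are passed in directly for illustration.
(defn partition-path [queue year month day hour]
  (format "%s/%d/%02d/%02d/%02d" queue year month day hour))

(def example-path (partition-path "events" 2014 3 7 15))
;; => "events/2014/03/07/15"
```

Keeping the path a pure function of (queue, timestamp) makes it easy to decide, for each incoming message, which open file (agent) it belongs to.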
My implementation uses core.async alts!! to merge the Kafka consumer threads into a single stream. A single thread reads from this stream and pushes the results onto a Chronicle queue (for persistence while in memory; taken from the Reactor framework).
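The alts!! merging step can be sketched as follows, assuming each Kafka consumer thread writes its messages onto its own core.async channel; alts!! then lets one reader pull from whichever channel has data, collapsing the consumers into a single stream. This is a simplified sketch, not the project's consumer code (see the messages function in the references for the real one).

```clojure
(require '[clojure.core.async :refer [chan >!! alts!!]])

;; Sketch: turn a vector of consumer channels into one lazy stream of
;; messages. alts!! blocks until any of the channels has a value.
(defn merged-reader [chans]
  (lazy-seq
    (let [[msg _ch] (alts!! chans)]
      (when msg
        (cons msg (merged-reader chans))))))

;; Usage sketch: two "consumer" channels with one message each.
(def c1 (chan 10))
(def c2 (chan 10))
(>!! c1 {:topic "a" :payload "m1"})
(>!! c2 {:topic "b" :payload "m2"})

(def msgs (doall (take 2 (merged-reader [c1 c2]))))
```

The order of msgs is nondeterministic (alts!! picks among ready channels), which is fine here since each message carries its own partitioning information.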
There are 20 threads (in a cached thread pool) that read from this queue and send the results to agents. Each agent represents an open file: it writes, checks for roll conditions, and closes the file when required.
I think 30-40K tps is good enough to deem Clojure OK for server back-end software.
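The one-agent-per-open-file idea can be sketched like this: send-off serializes all writes to a given file while writes to different files proceed independently. The names and structure here are illustrative assumptions, not the project's fileresource API; roll checking and compression are omitted.

```clojure
;; Sketch: path -> agent holding an open writer. An atom guards the map
;; so each path gets exactly one agent.
(def file-agents (atom {}))

(defn writer-for [path]
  (agent (java.io.BufferedWriter. (java.io.FileWriter. ^String path true))))

(defn write-msg [path line]
  (let [a (get (swap! file-agents
                      (fn [m] (if (m path) m (assoc m path (writer-for path)))))
               path)]
    ;; send-off: the write runs on the agent's thread, so writes to this
    ;; file are serialized without explicit locking.
    (send-off a (fn [^java.io.Writer w]
                  (.write w ^String line)
                  (.write w "\n")
                  (.flush w)
                  w))))

;; Usage sketch against a temp file.
(def tmp-path (.getPath (java.io.File/createTempFile "agent-demo" ".log")))
(write-msg tmp-path "hello")
(await (get @file-agents tmp-path))
```

Because agents process their action queue one at a time, this gives per-file write ordering for free, which is what makes roll-over checks safe to do inside the same agent action in the real implementation.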
References:
https://github.com/gerritjvv/pseidon
https://github.com/gerritjvv/pseidon/blob/master/pseidon-kafka/src/pseidon/kafka/consumer.clj
(see the messages function)
https://github.com/gerritjvv/pseidon/blob/master/pseidon/src/pseidon/core/fileresource.clj
https://github.com/gerritjvv/pseidon/blob/master/pseidon/src/pseidon/core/queue.clj
(see the BlockingChannelImpl defrecord and consume-messages)
https://github.com/OpenHFT/Java-Chronicle
https://github.com/codahale/metrics/ (metrics)
Stats screenshot (reduced to be visible on this page):
At this precise moment the meanRate was at 45K tps and the oneMinuteRate at 31K tps. I did get higher speeds, up to 49K tps, but my Kafka queue was getting emptied faster than I could keep up producing.

More performance stats
