Using Apache Kafka for Consumer Metrics
Kafka is really wonderful in its simplicity: producers append messages to a log, and one or more consumers read those messages back in partition order. My app only does around 1M messages/minute, but LinkedIn does 13M per second. Granted, LinkedIn's usage is across all of their services, but Kafka's log structure offers great performance and its replication offers durability.
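The ordering guarantee above is per partition, not per topic. A toy sketch (plain Python, no broker required; `MiniLog` and its partition count are made up for illustration) of how key-based partitioning preserves order for any single key:

```python
from collections import defaultdict

class MiniLog:
    """A toy, in-memory stand-in for a partitioned Kafka topic."""

    def __init__(self, num_partitions=3):
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)

    def produce(self, key, value):
        # Kafka's default partitioner hashes the message key to pick a
        # partition, so every message with the same key lands in one
        # partition and is appended in arrival order.
        p = hash(key) % self.num_partitions
        self.partitions[p].append((key, value))
        return p

    def consume(self, partition):
        # A consumer reads a partition front to back, in offset order.
        return list(self.partitions[partition])

log = MiniLog()
for i in range(5):
    log.produce("user-42", f"event-{i}")

p = hash("user-42") % log.num_partitions
values = [v for _, v in log.consume(p)]
print(values)  # events for one key come back in production order
```

Ordering across different partitions is not guaranteed, which is exactly why keying related messages together matters.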
If you're looking to process data streams in real time, Kafka is definitely worth a look, and the team at Confluent is awesome.
It's indeed a good idea to measure delta and lag at each stage of the workflow. It helps you both get alerted when something goes wrong and pinpoint the likely spot and cause: a traffic spike, a stage without enough CPU or I/O resources, a late consumer within a group...
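A minimal sketch of that lag check: per partition, lag is the broker's log-end offset minus the consumer group's committed offset. In a real deployment you'd fetch both sets of offsets from Kafka; here they are hard-coded, and the threshold is an assumed value for illustration.

```python
LAG_ALERT_THRESHOLD = 1_000  # assumed alerting threshold, tune per workload

def partition_lags(end_offsets, committed_offsets):
    """Return {partition: lag}; a partition with no commit counts from 0."""
    return {
        p: end - committed_offsets.get(p, 0)
        for p, end in end_offsets.items()
    }

# Hypothetical snapshot: broker high-water marks vs. group commits.
end_offsets = {0: 5_200, 1: 4_800, 2: 9_900}
committed = {0: 5_150, 1: 4_800, 2: 7_400}

lags = partition_lags(end_offsets, committed)
laggards = {p: lag for p, lag in lags.items() if lag >= LAG_ALERT_THRESHOLD}
print(lags)       # {0: 50, 1: 0, 2: 2500}
print(laggards)   # partition 2 points at the late consumer
```

Tracking these numbers over time also separates the two failure modes: lag that grows on every partition suggests a traffic spike or an under-resourced stage, while lag on a single partition usually means one late consumer in the group.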