Ask HN: What is the best way to calculate percentile of streaming data?

1 points by rishiloyola 6 years ago · 2 comments · 1 min read


Hello,

I need to write a Python function that iterates over incoming requests and computes percentiles of the request body size on the fly. Which library or algorithm do you recommend?

Example: Requests arrive in batches. Let's say the first batch has 50 requests, the next one has 80, and so on. I need to calculate percentiles of the body size across all of these requests.
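
For reference, here is a minimal sketch of the exact baseline for the example above: keep every body size and read percentiles off the sorted list with the nearest-rank method. The class and method names are hypothetical, and nearest-rank is only one of several common percentile definitions. Memory grows with every request, which is the cost a streaming method has to avoid.

```python
import bisect
import math


class ExactPercentiles:
    """Exact percentiles of body sizes; stores every observation (unbounded memory)."""

    def __init__(self):
        self.sizes = []  # kept in ascending order

    def observe_batch(self, batch_sizes):
        # Insert each size at its sorted position so the list stays ordered.
        for size in batch_sizes:
            bisect.insort(self.sizes, size)

    def percentile(self, p):
        # Nearest-rank method: the smallest value with at least p% of the data at or below it.
        if not self.sizes:
            return None
        rank = max(1, math.ceil(p / 100 * len(self.sizes)))
        return self.sizes[rank - 1]


# Usage with made-up sizes in kB, two batches arriving one after the other.
tracker = ExactPercentiles()
tracker.observe_batch([10, 2])
tracker.observe_batch([5, 8])
median_kb = tracker.percentile(50)
```

Percentiles here are taken over everything seen so far, not per batch, so each new batch shifts the result.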

sigmaprimus 6 years ago

I think you need to provide a bit more info. Are you using Apache Kafka, or something else?

The function would be the individual batch's requests divided by total requests, multiplied by 100, but I don't think that's what you're looking for.

Edit: actually, for your question it would be the inverse of the batch size multiplied by 100, e.g. the first batch has 50 requests, so that would be 1/50 × 100, or 2%.

  • rishiloyola (OP) 6 years ago

    No, I am not using Kafka. It is just a basic Python server. I want to calculate the nth percentile of the size of my incoming request objects over the past hour.

    I don't want to store the size of every request in memory. It would eat too much of my RAM.

    Incoming traffic:

    - 1st batch

    --> 60 requests

    ---> size of 1st request is 10kb

    ---> size of 2nd request is 2kb

    ...

    ...

    - 2nd batch

    --> 10 requests

    ---> size of 1st request is 5kb

    ---> size of 2nd request is 8kb

    ...

    - 100th batch

    I am talking about percentiles (10th, 50th, 95th) of the request size.
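
One way to meet the constraints described above (percentiles over the past hour without storing each request's size) is a bucketed histogram kept per time slot. The following is a minimal sketch under stated assumptions, not a definitive implementation: the class name, the 1 KiB bucket width, and the 60-second slot length are all hypothetical choices, timestamps are assumed to be non-decreasing, and the answer is approximate (it can overshoot the true value by up to one bucket width, and the window edge is accurate to within one slot). Sketches such as t-digest or the P² algorithm are other known options for the same problem.

```python
import math
from collections import Counter, deque


class WindowedSizeHistogram:
    """Approximate percentiles of request sizes over a sliding time window.

    Memory is bounded by (time slots in the window) x (distinct size buckets),
    not by the number of requests.
    """

    def __init__(self, window_seconds=3600, slot_seconds=60, bucket_bytes=1024):
        self.window_seconds = window_seconds
        self.slot_seconds = slot_seconds
        self.bucket_bytes = bucket_bytes
        self.slots = deque()  # (slot_index, Counter{bucket_index: count}), oldest first

    def _expire(self, now):
        # Drop whole slots that ended before the start of the window.
        cutoff = int((now - self.window_seconds) // self.slot_seconds)
        while self.slots and self.slots[0][0] < cutoff:
            self.slots.popleft()

    def add(self, size_bytes, now):
        # Assumes `now` never goes backwards between calls.
        slot = int(now // self.slot_seconds)
        if not self.slots or self.slots[-1][0] != slot:
            self.slots.append((slot, Counter()))
        self.slots[-1][1][size_bytes // self.bucket_bytes] += 1
        self._expire(now)

    def percentile(self, p, now):
        self._expire(now)
        merged = Counter()
        for _, counts in self.slots:
            merged.update(counts)
        total = sum(merged.values())
        if total == 0:
            return None
        # Nearest-rank over bucket counts, walking buckets from smallest to largest.
        rank = max(1, math.ceil(p / 100 * total))
        seen = 0
        for bucket in sorted(merged):
            seen += merged[bucket]
            if seen >= rank:
                # Report the bucket's upper edge, in bytes.
                return (bucket + 1) * self.bucket_bytes


# Usage: timestamps are passed in explicitly (in a server, time.time() would do).
hist = WindowedSizeHistogram()
for size in (10 * 1024, 2 * 1024):   # 1st batch
    hist.add(size, now=0)
for size in (5 * 1024, 8 * 1024):    # 2nd batch, one minute later
    hist.add(size, now=60)
p50_bytes = hist.percentile(50, now=60)
```

Narrower buckets give tighter estimates at the cost of more counters. If sizes span several orders of magnitude, logarithmic bucket edges would be a reasonable variation.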
