A Tiny Startup’s Plot to Beat Google at Big Data
The thing that's missing from this article (which is relevant to this crowd) is the incredible focus on simplicity of Keen's API. Keen's API for reporting metrics is the right one. Any Keen SDK has essentially one API call for reporting metrics:
keen.add_event("collection_name", {
    "arbitrary": "dictionary properties",
    "with_any_type": -11.23
})
For developers accustomed to instrumenting their applications with various metrics libraries, this is akin to finding the holy grail. Most APIs for metrics collection require you to decide up front what metrics you want, whether it is a time series or a count or a gauge or a ratio. And even after you've figured _that_ out, there is a combinatorial explosion of different metric collections you have to create for every combination of filters that interest you.

For the developer, Keen's API is so powerful because it lets you defer almost all of your "question-asking" until later (which is when you want to think about it anyways because you can never predict up front all of the questions you want to ask about your data).
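To make the "defer the question" idea concrete, here is a minimal sketch in Python. It is a toy in-memory store, not Keen's SDK or implementation: the `EventStore` class, its `count` method, and the `tunnel_opened` event with its properties are all made up for illustration. The point is only that nothing about the metric (count, gauge, which filters) is declared when the event is recorded.

```python
from collections import defaultdict


class EventStore:
    """Toy in-memory event store (hypothetical; not Keen's implementation)."""

    def __init__(self):
        # Maps a collection name to the list of event dictionaries recorded in it.
        self._collections = defaultdict(list)

    def add_event(self, collection, properties):
        # Store the dictionary as-is; no metric type is declared up front.
        self._collections[collection].append(dict(properties))

    def count(self, collection, **filters):
        # The question is decided at query time: count the events whose
        # properties match every filter that was passed in.
        return sum(
            1
            for event in self._collections[collection]
            if all(event.get(key) == value for key, value in filters.items())
        )


store = EventStore()
# Hypothetical events; the property names are assumptions for this sketch.
store.add_event("tunnel_opened", {"protocol": "http", "region": "us", "latency_ms": 41.5})
store.add_event("tunnel_opened", {"protocol": "tcp", "region": "eu", "latency_ms": 88.0})
store.add_event("tunnel_opened", {"protocol": "http", "region": "eu", "latency_ms": 63.2})

# Questions nobody had to anticipate when the events were recorded.
http_total = store.count("tunnel_opened", protocol="http")
http_eu = store.count("tunnel_opened", protocol="http", region="eu")
```

With a pre-declared metrics library, each of those two counts would typically have needed its own counter created before any data arrived.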
When I began to evaluate options for monitoring ngrok's usage and performance, Keen struck me both for getting the abstraction right, and because I have watched company after company dump countless amounts of money and developer time into homegrown analytics systems that materialize either too late or far over budget.
Disclaimers:
- I am a Keen customer for ngrok.com (https://ngrok.com/status)
- Compelled by the power of their product, and the competence of their team, I now work for Keen.
Having written an analytics reporting library [1], I wholly agree with this. The library should not have preconceptions of what you want to report; it should just get out of the way and let you report it. I call it the "Report them all, The User will know His own" approach.
[1] The client complained that all available ones were NIH.
I see that they price in terms of events/month.
Is there a limit as to how big an "event" can be? Is there a limit to how long they hold onto it?
Hi! I'm on the Keen IO team. We have a limit around 1000 properties per event.
For small accounts, we don't currently enforce data archiving, so your data sticks around as long as you want it to.
For large volume customers, where storage & querying large data sets becomes a significant cost, we discuss retention requirements when doing the pricing negotiation.
not saying one's better than the other, but this is exactly the way Mixpanel does it too
As inaccurate titles go, this is an exemplary one. Other than that, a nice if fluffy overview of Keen. The "data" space is getting incredibly complicated and it will be interesting to see who finds a profitable niche.
The title was definitely a surprise! We don't see ourselves as David to Google's Goliath, though that title & dramatic shadowy photo might have you believe otherwise. :)
You should revisit the David to Goliath story :)
http://www.ted.com/talks/malcolm_gladwell_the_unheard_story_...
It is worth watching, if not for the accuracy of the facts, then for the amazing point of view.
I'm a big fan of MG and this was so great! It really does paint the story in a new way. Thanks for sharing.
Still, David was intent on destroying Goliath. I want our company to be the best at what we do, but that's not the same as wanting to beat and/or destroy Google. I'd like for Google to stick around for a long time, inventing new cars and better internet and whatnot :)
Why not a more automatic approach like http://heapanalytics.com? Their js snippet starts automatically collecting all click events, then I group and analyze events on their website.
How is this "big data" when I have to make backend code changes to manually add each event using keen's api library? There's no way I can get as many events adding them one-by-one.
We emphasize "event data" which goes beyond blanket monitoring. This is "big data" because what you're tracking could still happen at high volume and velocity. We leave the third V, variety, up to you - ultimately you will know your business best and can create and extend the data model you need. Making a backend code change to add a single event collection could immediately lead to millions of very rich data points!
My colleague wrote an excellent blog post about the event data approach and I think you might enjoy the read! https://keen.io/blog/53958349217/analytics-for-hackers-how-t...
(Note: I work at Keen IO)
Heap is cool! Clicks in JavaScript are just one type of event data. You might also have events coming from your backend servers, mobile apps, smart devices, etc.
A tool like heapanalytics is a great example of something that could be built on top of the Keen IO platform.
The main differentiator is that you analyze heap data using their web interface. Keen IO has a web interface, but we are fundamentally a query API.
You can use Keen IO to build your own custom data views and frontends, to white label analytics for your customers, or to use your query results programmatically.
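As a rough illustration of using query results programmatically, here is a small Python sketch. It does not call Keen IO's API: the `events` list stands in for whatever a query would return, and `count_by` and `render_rows` are hypothetical helpers invented for this example.

```python
from collections import Counter

# Stand-in for query results; the sources and properties are made up.
events = [
    {"source": "backend", "plan": "free"},
    {"source": "mobile", "plan": "paid"},
    {"source": "backend", "plan": "paid"},
    {"source": "browser", "plan": "free"},
]


def count_by(events, prop):
    """Group events by one property and count each group."""
    return Counter(event[prop] for event in events if prop in event)


def render_rows(counts):
    """Turn the counts into sorted text rows, as a custom view might."""
    return [f"{key}: {n}" for key, n in sorted(counts.items())]


by_source = count_by(events, "source")
rows = render_rows(by_source)
```

The same counts could just as easily feed a white-labeled dashboard or an alerting rule instead of text rows.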
Keen is one of those companies like GitHub, Heroku, or Mailchimp that makes it easier to focus on what I'm actually trying to do. Best of luck to them.
Keen has a great team, and some great ideas + technology. They were on it back at the Techstars Cloud program, and I'm glad to see them getting some press these days. Kyle and Ryan are both wonderful to talk to about big data problems. Keep up the good work, guys!
Beat Google at Big Data? OK, great sneaky way to put "Beat Google at X" in a headline.. too bad Google's business isn't primarily about providing tools in X (and if it is, it's probably < 0.01% of their focus/revenue).
You don't understand: whoever gets Big Data, beats Google. Period. Google's main business is web text search -- but the best text search doesn't stand a chance against good data search. All those pages will happily become data as soon as someone makes that breakthrough to make it feasible and compelling.
Text search is "horse-and-carriage" and Google is certainly its king. But data is the coming "automobile", and all bets in that space are off.
But who has that data? Google. And who else??? Big data is not just the tools to analyze that data. It's also HAVING Big Data.
Keen seems awesome, except I have a high volume startup with no funding and no revenue right now. We'd blow past the 50k events/month within the first few days for sure.
I really wish there was a more affordable way to do this. Though, I guess you get what you pay for.
It actually reminds me of Mixpanel. Both are too expensive for my liking :(
We offer discounts for startups! Email me! michelle@keen.io
.. tomorrow around 9am .. Google buys KeenIO for $900 million.
The thing I like most about Keen IO is their team. They are very open and would help anyone who finds their way to their Hipchat room.
You are not the first to note this about the team. Uniquely nice ppl.
The Next Big Thing You Missed: A Tiny Startup’s Plot to Be Purchased by Google