Welcome
Spark, Parquet and S3 – It’s complicated.
(A version of this post was originally published on AppsFlyer’s blog. Special thanks also go to Morri Feldman and Michael Spector from AppsFlyer’s data team, who did most of the work solving the problems discussed in this article.)
TL;DR: The combination of Spark, Parquet and S3 (& Mesos) is a powerful, flexible and cost-effective analytics platform (and, incidentally, an alternative to Hadoop). However, making all these technologies gel and play nicely together is not a simple task. This post describes the challenges we (AppsFlyer) faced when building our analytics platform on these technologies, and the steps we took to mitigate them and make it all work.
Spark is shaping up as the leading alternative to MapReduce for several reasons, including its wide adoption by the different Hadoop distributions, its combination of batch and streaming on a single platform, and a growing library of … Read More »
Data’s hierarchy of needs
This post was originally published on the AppsFlyer blog.
A couple of weeks ago, Nir Rubinshtein and I presented AppsFlyer’s data architecture at a meetup of Big Data & Data Science Israel. One of the concepts I presented there, which is worth expanding upon, is “Data’s Hierarchy of Needs”:
Data should Exist
Data should be Accessible
Data should be Usable
Data should be Distilled
Data should be Presented
How can we make data “achieve its pinnacle of existence” and be acted upon? In other words, which areas should you address when designing a data architecture if you want it to be complete and to enable creating insights and value from the data you generate and collect?
If done properly, your users might just act upon the data you provide. This list might seem a little simplistic, but it is not a prescription of what to do but … Read More »
Suprastructure – how come “Microservices” are getting small?
Now, I don’t want to get off on a rant here*, but it seems like “Microservices” are all the rage these days – at least judging from my Twitter, Feedly and Prismatic feeds. I already wrote that, in my opinion, “Microservices” is just a new name for SOA. I thought I’d give a couple of examples of what I mean.
Years ago (as early as 2004/5), I worked on systems that would pass for Microservices today. For instance, in 2007 I worked at a startup called xsights. We developed something like Google Goggles for brands (or a barcodeless barcode), so users could snap a picture of an ad, brochure, etc. and get relevant content or perks in response (e.g. we had campaigns in Germany with a book publisher where MMSing shots of newspaper ads or outdoor signage resulted in getting information and discounts on the advertised … Read More »








