Welcome


Spark, Parquet and S3 – It’s complicated.

Posted on August 10th, by Arnon Rotem-Gal-Oz in Big Data, Blog, Featured Posts. No Comments

(A version of this post was originally posted in AppsFlyer’s blog. Special thanks also to Morri Feldman and Michael Spector from the AppsFlyer data team, who did most of the work solving the problems discussed in this article.)

TL;DR: The combination of Spark, Parquet and S3 (& Mesos) is a powerful, flexible and cost-effective analytics platform (and, incidentally, an alternative to Hadoop). However, making all these technologies gel and play nicely together is not a simple task. This post describes the challenges we (AppsFlyer) faced when building our analytics platform on these technologies, and the steps we took to mitigate them and make it all work.

Spark is shaping up as the leading alternative to Map/Reduce for several reasons, including its wide adoption by the different Hadoop distributions, its combination of batch and streaming on a single platform, and a growing library of … Read More »



Data’s hierarchy of needs

Posted on June 1st, by Arnon Rotem-Gal-Oz in Big Data, Blog, Featured Posts. No Comments

This post was originally published in the AppsFlyer blog.

A couple of weeks ago Nir Rubinshtein and I presented AppsFlyer’s data architecture in a meetup of Big Data & Data Science Israel. One of the concepts that I presented there, which is worth expanding upon, is “Data’s Hierarchy of Needs”:

Data should Exist
Data should be Accessible
Data should be Usable
Data should be Distilled
Data should be Presented

How can we make data “achieve its pinnacle of existence” and be acted upon? In other words, what areas should be addressed when designing a data architecture if you want it to be complete and to enable creating insights and value from the data you generate and collect?

If done properly, your users might just act upon the data you provide. This list might seem a little simplistic but it is not a prescription of what to do but … Read More »



Suprastructure – how come “Microservices” are getting small?

Posted on April 26th, by Arnon Rotem-Gal-Oz in Blog, Featured Posts, SOA Patterns. 1 Comment

Now, I don’t want to get off on a rant here*, but it seems like “Microservices” are all the rage these days – at least judging from my Twitter, Feedly and Prismatic feeds. I already wrote that, in my opinion, “Microservices” is just a new name for SOA. I thought I’d give a couple of examples of what I mean.

Years ago (as early as 2004/5), I worked on systems that today would pass for Microservices. For instance, in 2007 I worked at a startup called xsights. We developed something like Google Goggles for brands (or a barcodeless barcode) so users could snap a picture of an ad, brochure etc. and get relevant content or perks in response (e.g. we had campaigns in Germany with a book publisher where MMSing shots of newspaper ads or outdoor signage resulted in getting information and discounts on the advertised … Read More »
