I presented big data to Amdocs’ product group last week. One of the sessions I did was recorded so I might be able to add here later. Meanwhile you can check out the slides.
Note that trying to keep the slide visual I put some of the information is in the slide notes and not on the slides themselves.
Big data Overview from Arnon Rotem-Gal-Oz
Google’s Jeffrey Dean and Sanjay Ghemawat filed the patent request and published the map/reduce paper 10 year ago (2004). According to WikiPedia Doug Cutting and Mike Cafarella created Hadoop, with its own implementation of Map/Reduce, one year later at Yahoo – both these implementations were done for the same purpose – batch indexing of the web.
Back than, the web began its “web 2.0″ transition, pages became more dynamic , people began to create more content – so an efficient way to reprocess and build the web index was needed and map/reduce was it. Web Indexing was a great fit for map/reduce since the initial processing of each source (web page) is completely independent from any other – i.e. a very convenient map phase and you need to combine the results to build the reverse index. That said, even the core google algorithm – the famous pagerank is … Read More »