The title says it all – These are slides from a session I was working on to explain the basics of software architecture based on…
The nexus of technology, business & people
Using PySpark’s pandas integration via apply_batch and transform_batch is very powerful, but the lack of documentation can cause hard-to-trace bugs – hopefully my experience (below)…
I gave a general overview of Apache Spark to our R&D teams. You can find the slides below.
I watched (the COVID-19-era version of “attended”) the latest Spark Summit, and in one of the keynotes Reynold Xin from Databricks presented the following two images…
Back in ancient history (2004) Google’s Jeff Dean & Sanjay Ghemawat presented their innovative idea for dealing with huge data sets – a novel idea…