Featured Posts


Services, Microservices, Nanoservices – oh my!

Posted on March 25th, by Arnon Rotem-Gal-Oz in Blog, SOA Patterns. 2 comments

Apparently there’s this new distributed architecture thing called microservices out and about – so last week I went ahead and read Martin Fowler and James Lewis’s extensive article on the subject, and my reaction was basically:

I guess it is easier to use a new name (Microservices) rather than say that this is what SOA actually meant – re http://t.co/gvhxDfDWLG

— Arnon Rotem-Gal-Oz (@arnonrgo) March 16, 2014

Similar arguments (nothing new here) were also expressed after Martin’s tweet about his article, e.g. Clemens Vasters’ comment:

@martinfowler @boicy but these are the very principles of SOA before vendors does pushed the hub in the middle, i.e. ESB — Clemens Vasters (@clemensv) March 16, 2014

Or Steve Jones’ post “Microservices is SOA, for those who know what SOA is.”

Autonomy, smart endpoints, events etc. that the article talks about are all SOA concepts – If … Read More »



and with YARN the game changes

Posted on October 17th, by Arnon Rotem-Gal-Oz in Big Data, Blog, Featured Posts. 2 comments

I’ve been working with Hadoop for a few years now, and the platform and its ecosystem have been advancing at an amazing pace, with new features and additional capabilities appearing almost on a daily basis. Some changes are small, like better scheduling in Oozie; some are still progressing, like support for NFS; some are cool, like full support for CPython in Pig; but, in my opinion, the most important change is the introduction of YARN in Hadoop 2.0.

Hadoop was created with HDFS, a distributed file system, and the Map/Reduce framework – a distributed processing platform. With YARN, Hadoop moves from being a distributed processing framework to being a distributed operating system.
“Operating system” sounded a little exaggerated when I wrote it, so just for fun I picked up the copy of Tanenbaum’s “Modern Operating Systems”* that I have lying around from my days as … Read More »



ReSQL?

Posted on July 23rd, by Arnon Rotem-Gal-Oz in Big Data, Blog, Featured Posts, Uncategorized. No Comments

The NoSQL moniker that was coined circa 2009 marked a move away from the “traditional” relational model. There were quite a few non-relational databases around prior to 2009, but in the last few years we’ve seen an explosion of new offerings (you can see, for example, the “NoSQL landscape” in a previous post I made). Generally speaking, and everything here is a wild generalization, since not all solutions are created equal and there are many types of solutions – NoSQL mostly means some relaxation of ACID constraints and, as the name implies, the removal of the “Structured Query Language” (SQL) both as a data definition language and, more importantly, as a data manipulation language, in particular SQL’s query capabilities.

ACID and SQL are a lot to lose, and NoSQL solutions offer a few benefits to compensate for them, mainly:

Scalability – either as relative scalability, … Read More »



SOA patterns added to Intel’s recommended reading list

Posted on July 23rd, by Arnon Rotem-Gal-Oz in Blog, SOA Patterns. No Comments

Last month I received a nice letter from Intel saying that my SOA Patterns book was added to the recommended reading list they curate. Below are the relevant quotes from the letter:

“We are pleased to announce that a book published by Manning, SOA Patterns, by Arnon Rotem-Gal-Oz, has been selected for Intel Corporation’s Recommended Reading List for 2H’13. Congratulations!

Our Recommended Reading Program partners with publishers worldwide to provide technical professionals a simple and handy reference list of what to read to stay abreast of new technologies. Dozens of industry technologists, corporate fellows, and engineers have helped by suggesting books and reviewing the list. This is the most comprehensive reading list available for professional computer developers and IT professionals.”



Fallacies of massively distributed computing

Posted on April 29th, by Arnon Rotem-Gal-Oz in Big Data, Blog, Featured Posts. 5 comments

In the last few years, we have seen the advent of highly distributed systems. Systems that have clusters with lots of servers are no longer the sole realm of the Googles and Facebooks of the world, and we are beginning to see multi-node and big data systems in enterprises. For example, I don’t think a company such as Nice (the company I work for) would have released a Hadoop-based analytics platform and solutions 5-6 years ago, something we did just last week.

So now that large(r) clusters are more prevalent, I thought it would be a good time to reflect on the fallacies of distributed computing: whether they are still relevant and whether they should be changed.
If you don’t know about the fallacies, you can see the list and read the article I wrote about them at the link mentioned above. In a few words … Read More »


SOA Patterns is “deal of the day” on Manning’s site (Apr. 14th)

Posted on April 13th, by Arnon Rotem-Gal-Oz in Blog, Featured Posts, SOA Patterns. No Comments

I just got a notice from Manning that my book SOA Patterns will be featured as “deal of the day” on Apr 14th – that means it will be available for 50% off starting at midnight US ET on April 14th (and considering it’s a world-wide offer, it will actually last for more than 24 hours).

To get the 50% discount use code dotd0414au at www.manning.com/rotem

If you’re not familiar with my book (which I guess is unlikely if you’re reading my blog, but anyway), you might want to check out the SOA Patterns page on my site, read one or more of the pattern drafts or check out the book reviews.

Reviews of SOA patterns

Cameron McKenzie @ TheServerSide.com
Tad Anderson @ Java Developers Journal
Roberto Casadei @ robertocasadei.it
Colin Jack @ losTechies (half a book review)
Jan Van Ryswyck @ ElegantCode.com (half a book review)
Karsten Strøbæk @ … Read More »


Herding Apache Pig – using pig with perl and python

Posted on March 4th, by Arnon Rotem-Gal-Oz in Big Data, Blog, Featured Posts. No Comments

In the past week or so we got some new data that we had to process quickly. There are quite a few technologies out there for quickly churning out map/reduce jobs on Hadoop (Cascading, Hive, Crunch and Jaql, to name a few of many); my personal favorite is Apache Pig. I find that the imperative nature of Pig makes it relatively easy to understand what’s going on and where the data is going, and that it produces efficient enough map/reduce jobs. On the down side, Pig lacks control structures, so working with Pig also means you need to extend it with user defined functions (UDFs) or Hadoop streaming. Usually I use Java or Scala for writing UDFs, but it is always nice to try something new, so we decided to check out some other technologies – namely Perl and Python. This post highlights some of … Read More »
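To make the streaming option a bit more concrete, here is a minimal sketch (not the code from the post) of the kind of Python script Pig can stream records through. It assumes Pig's default streaming format, where each tuple arrives as one tab-separated line and each output line is read back as a tuple. The function names and the choice to upper-case the second field are purely illustrative.

```python
import io


def transform(line):
    """Upper-case the second field of one tab-separated record (illustrative only)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) > 1:
        fields[1] = fields[1].upper()
    return "\t".join(fields)


def stream(lines, out):
    """Read records line by line and write one transformed record per line."""
    for line in lines:
        out.write(transform(line) + "\n")


# Demo on in-memory data. When used as a real streaming script you would
# call stream(sys.stdin, sys.stdout) instead (after importing sys).
demo = io.StringIO()
stream(["1\talice\n", "2\tbob\n"], demo)
print(demo.getvalue(), end="")
```

On the Pig side, a script like this would typically be wired in with a DEFINE that ships the file to the cluster and a STREAM ... THROUGH statement; the exact file name and relation names depend on your job.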



The Saga pattern and that architecture vs. design thing

Posted on January 24th, by Arnon Rotem-Gal-Oz in Blog, Featured Posts, SOA Patterns. No Comments

It has been a few months since SOA Patterns was published, and so far the book has sold somewhere between 2K and 3K copies, which I guess is not bad for an unknown author – so first off, thanks to all of you who bought a copy (by the way, if you found the book useful, I’d be grateful if you could also rate it on Amazon so that others would know about it too).

I know at least a few of you actually read the book, as from time to time I get questions about it :). Not all the questions are interesting to “the general public” but some are. One interesting question I got is about the so-called “Canonical schema pattern”. I have a post in the making (for too long now, sorry about that Bill) that explains why I don’t consider it … Read More »



Killing the HBase zombie table

Posted on January 15th, by Arnon Rotem-Gal-Oz in Big Data, Blog. 3 comments

One of our team leaders approached me in the hall today and asked if I could lend a hand in troubleshooting something. He and our QA lead were configuring one of our test Hadoop clusters after an upgrade and they had a problem with one table they were trying to set up:

When they tried to create the table in HBase shell they got an error that the table exists
When they tried to delete the table they got an error that the table does not exist
HBase ships with a health-check and fix utility called hbck (run it with: hbase hbck; see here for details) – they ran it, and it reported that everything was fine and dandy

Hmm. The first thing I tried to do was look at the .META. table. This is where HBase keeps the tables and the regions they use. I … Read More »



The NoSQL landscape in diagrams

Posted on November 3rd, by Arnon Rotem-Gal-Oz in Big Data, Blog. 1 Comment

Here’s the NoSQL landscape in 3 slides (and hey, at least mine looks different :) )

451 Research published their view of the NoSQL/NewSQL world in a unified diagram.

Infochimps published a similar diagram

And here’s mine from SOA Patterns chapter 10 (discussing “SOA & big data”)