Over the last few years we have seen the advent of highly distributed systems. Systems with clusters of many servers are no longer the sole realm of the Googles and Facebooks of the world, and we are beginning to see multi-node and big data systems in enterprises. For example, I don’t think a company such as Nice (the company I work for) would have released a Hadoop-based analytics platform and solutions 5-6 years ago, something we did just last week.
So now that large(r) clusters are more prevalent, I thought it would be a good time to reflect on the fallacies of distributed computing: are they still relevant, and should they be changed?
If you don’t know about the fallacies you can see the list and read the article I wrote about them at the link mentioned above. In a few words … Read More »
I just got a notice from Manning that my book SOA Patterns will be featured as “deal of the day” on Apr 14th – that means it will be available for 50% off starting at midnight US ET on April 14th (and since it’s a world-wide offer it will actually last for more than 24 hours).
To get the 50% discount use code dotd0414au at www.manning.com/rotem
If you’re not familiar with my book (which I guess is unlikely if you’re reading my blog, but anyway), you might want to check out the SOA Patterns page on my site, read one or more of the pattern drafts or check out the book reviews.
Reviews of SOA Patterns
Cameron McKenzie @ TheServerSide.com
Tad Anderson @ Java Developers Journal
Roberto Casadei @ robertocasadei.it
Colin Jack @ losTechies (half a book review)
Jan Van Ryswyck @ ElegantCode.com (half a book review)
Karsten Strøbæk @ … Read More »
Even though I mostly sit at work trying to look busy, every so often someone does stumble into my office with a question or a problem, so I’ve got to do something.
Interestingly enough, a lot of problems can be handled by some pretty basic stuff, like reminding people that a .jar/.war file is a zip file and you can take a look inside to see what’s there or what’s missing, or sending people to read the log files (turns out these buggers actually contain useful information) etc. – so now for today’s lesson: “It’s open source, so the source, you know, is open…”
We use a lot of open source projects at Nice (we’re also, slowly, starting to give something back to the community, but that’s another story). One of these is HBase. One of our devs was working on enabling … Read More »
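The point above, that a .jar/.war file is just a zip archive you can look inside, can be sketched in a few lines of Python using the standard zipfile module. This is only an illustration, not something from the original post: the helper names, the in-memory demo archive and the entry names in it are all made up for the example.

```python
import io
import zipfile


def list_archive_entries(archive):
    """Return the names of all entries in a .jar/.war (zip) archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


def find_missing(archive, expected):
    """Return the expected entry names that are absent from the archive."""
    present = set(list_archive_entries(archive))
    return [name for name in expected if name not in present]


def build_demo_jar():
    """Build a tiny in-memory 'jar' (hypothetical contents) so the sketch
    runs without needing a real file on disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        zf.writestr("com/example/Demo.class", b"\xca\xfe\xba\xbe")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    jar = build_demo_jar()
    # What's there?
    print(list_archive_entries(jar))
    # What's missing?
    jar.seek(0)
    print(find_missing(jar, ["META-INF/MANIFEST.MF", "log4j.properties"]))
```

With a real archive you would pass its path instead of the demo buffer, e.g. list_archive_entries("myapp.war"), where the file name is whatever you are troubleshooting.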
In the past week or so we got some new data that we had to process quickly. There are quite a few technologies out there for quickly churning out map/reduce jobs on Hadoop (Cascading, Hive, Crunch and Jaql, to name a few of many); my personal favorite is Apache Pig. I find that the imperative nature of Pig makes it relatively easy to understand what’s going on and where the data is going, and that it produces efficient enough map/reduce jobs. On the down side, Pig lacks control structures, so working with Pig also means you need to extend it with user defined functions (UDFs) or Hadoop streaming. Usually I use Java or Scala for writing UDFs, but it is always nice to try something new, so we decided to check out some other technologies – namely Perl and Python. This post highlights some of … Read More »
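To give a feel for the streaming route mentioned above, here is a minimal sketch of a Python script that could sit on the other end of a stream. It assumes the default streaming format, where each record arrives as one tab-delimited line on stdin and results go back as tab-delimited lines on stdout. The transformation itself (upper-casing the first field) and the function names are made up for illustration and are not taken from the original post.

```python
import sys


def transform(line):
    """Upper-case the first field of one tab-delimited record."""
    fields = line.rstrip("\n").split("\t")
    fields[0] = fields[0].upper()
    return "\t".join(fields)


def run(stdin, stdout):
    """Read records line by line, skip blank lines, write transformed records."""
    for line in stdin:
        if line.strip():
            stdout.write(transform(line) + "\n")


if __name__ == "__main__":
    run(sys.stdin, sys.stdout)
```

In a Pig script this kind of program is typically attached with the STREAM ... THROUGH operator; the script file name you ship to the cluster is up to you.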
It has been a few months since SOA Patterns was published, and so far the book has sold somewhere between 2K and 3K copies, which I guess is not bad for an unknown author – so first off, thanks to all of you who bought a copy (by the way, if you found the book useful I’d be grateful if you could also rate it on Amazon so that others would know about it too).
I know at least a few of you actually read the book, as from time to time I get questions about it :). Not all the questions are interesting to “the general public” but some are. One interesting question I got is about the so-called “Canonical schema pattern”. I have a post in the making (for too long now, sorry about that Bill) that explains why I don’t consider it … Read More »
One of our team leaders approached me in the hall today and asked if I could lend a hand in troubleshooting something. He and our QA lead were configuring one of our test Hadoop clusters after an upgrade and they had a problem with one table they were trying to set up:
When they tried to create the table in HBase shell they got an error that the table exists
When they tried to delete the table they got an error that the table does not exist
HBase ships with a health-check and fix utility called hbck (run it with hbase hbck; see here for details) – they ran it, and it reported that everything was fine and dandy
Hmm. The first thing I tried to do was look at the .META. table. This is where HBase keeps the tables and the regions they use. I … Read More »
In the last year and a half or so (since I joined Nice Systems) we’ve been hard at work building our big data platform based on a lot of open source technologies, including Hadoop, HBase and quite a few others. Building on open source brings a lot of benefits and helps cut development time by building on the knowledge and effort of others.
I personally think that this has to be a two-way street: as a company benefits from open source, it should also give something back. This is why I am very happy to introduce Nice’s first (hopefully the first of many) contribution back to the open source community: a UI dev tool for working with HBase called h-rider. H-rider offers a convenient user interface for poking around data stored in HBase, which our developers find very useful for both development and debugging.
h-rider … Read More »
I was poking around my old blog (rgoarchitects.com) and I found this post from 2007 which I think is worth re-iterating:
In a post called “Ignorance vs. Negligence”, Ayende blows off some steam about some of the so-called “professionals” that he met along the way. You know… those with a fancy title who don’t know jack and design some of the nightmares we see from time to time. I’ve seen this phenomenon in a lot of projects I consulted on or reviewed:
The senior security expert who recommended something which isn’t supported by the platform
The senior architect who threw the system down to hell by basing the whole system on a clunky asynchronous solution that should only have been used by a tiny portion of the application.
The geniuses that built this wonderful code generator that generated code with so many dependencies and singletons that made … Read More »
Here’s the NoSQL landscape in 3 slides (and hey, at least mine looks different :) )
451 Research published their view of the NoSQL/NewSQL world in a unified diagram.
Infochimps published a similar diagram
And here’s mine from SOA Patterns chapter 10 (discussing “SOA & big data”)
I gave a presentation on SOA and big data at the IGTCloud forum