Sunday, May 22, 2016

The five stages of coming to terms with Cassandra

The five stages of coming to terms with JavaScript are:
  1. Denial: “I won’t need this language.”
  2. Anger: “Why does the web have to be so popular?”
  3. Bargaining: “OK, at least let me compile a reasonable language to JavaScript.”
  4. Depression: “Programming is not for me, I’ll pursue a career in masonry, like I always wanted.”
  5. Acceptance: “I can’t fight it, I may as well prepare for it.”

The same goes for Cassandra - however, IMO in the opposite order:
  1. Acceptance: “I will use Cassandra. It's... AMAZING! Let me just quote the Apache Cassandra landing page:”
       The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
  2. Depression: “Damn, it's well designed, but it's a complex piece of software and it doesn't work as expected.”
  3. Bargaining: “OK, at least let me try to tune it or report some bugs.”
  4. Anger: “Why is it so popular? Why does it have such good PR?”
  5. Denial: “I won’t use it or recommend it ever again.”

The context

I did the research and checked multiple websites - I read about performance, architecture, hosting, maintenance, TCO, libraries, popularity... and Cassandra seemed to be a good database for time-series log storage, with 95% writes (with an SLA) and only 5% reads (without an SLA). I chose the prepared DataStax Cassandra virtual disk image on Amazon with bootstrap scripts, made a proof-of-concept solution and read a book or two about Cassandra. All seemed good. However, this is not a post about the good. So ...fast forward...
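To ground the workload: a write-heavy, time-series log store in Cassandra usually boils down to a table like the one sketched below (Python with the DataStax cassandra-driver). The keyspace, table and partitioning scheme are my illustration, not the actual production schema.

```python
# Minimal sketch of a time-series log table for a 95%-writes workload.
# All names and the per-source, per-day partitioning are illustrative
# assumptions, not the original project's schema.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])  # contact point is a placeholder
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS logs
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")

session.execute("""
    CREATE TABLE IF NOT EXISTS logs.events (
        source  text,       -- which system emitted the entry
        day     date,       -- daily buckets keep partitions bounded
        ts      timestamp,  -- event time, clustering column
        message text,
        PRIMARY KEY ((source, day), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
""")
```

Appending log entries is then a plain INSERT into the (source, day) partition - exactly the write pattern Cassandra advertises itself for.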

The bad

Some stories which I remember:
  • The Cassandra cluster is in production (along with the parallel, old solution for this purpose). The phone rings at 2 AM. The C* cluster is down. A quick look at the logs - an OutOfMemoryError in a random place in the JVM. Outage time: 1 hour - let me just remind you of the "proven fault-tolerance". After a cluster restart, it works again.
  • The next day at work, at a random hour, the same thing. Related bug: OutOfMemoryError after few hours from node restart
  • After a few days... running repair - the standard C* operation, which you have to run at least once every gc_grace_seconds, by default 10 days (see the sketch after this list). Usually it worked, but then the server unexpectedly died, and later again and again. Related issue: "Unknown type 0" Stream failure on Repair. Outage time: stopped counting.
  • Because of the failing servers in the cluster, I decided to scale it out a little. Unfortunately, the issue above also made scaling impossible.
  • After a while, I encountered a second (third?) problem with repair. Related bug: Repair session exception Validation failed
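For readers who haven't operated Cassandra: repair is not an exotic recovery tool, it's routine maintenance that every operator has to schedule. A minimal sketch of that loop, assuming SSH access to each node; the host names are placeholders, and the scheduling around it is left out:

```python
# Sketch of the routine repair loop every C* operator ends up running:
# each node must be repaired at least once per gc_grace_seconds (10 days
# by default), or deleted data can come back to life. Host names are
# placeholder assumptions.
import subprocess

NODES = ["cass-node-1", "cass-node-2", "cass-node-3"]

def repair_cluster():
    for node in NODES:
        # "-pr" repairs only the node's primary token ranges, so running
        # it once per node covers the whole ring exactly once.
        result = subprocess.run(
            ["ssh", node, "nodetool", "repair", "-pr"],
            capture_output=True, text=True)
        if result.returncode != 0:
            # This is the step that kept failing for me (see the bugs above).
            raise RuntimeError(f"repair failed on {node}: {result.stderr}")
```

When even this routine operation fails with the errors above, there isn't much an operator can do besides restarting nodes and waiting for a fix.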

Fail

Let's get back to the landing page:
The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
Now, let's look at the dates on the critical JIRA issues:
This means that for around a month, at least a few people could not scale or repair their Cassandra clusters. I fully understand - it's free, open-source software. However, even if something is free, you expect it to work - that's the harsh reality. If it doesn't work, you just look for something else. No offence to the DataStax/Apache Cassandra teams - you are doing truly amazing work - but in resilient software, stability is the TOP 1 requirement.

Maybe it's me? Maybe I'm the only one having problems?

Fortunately (for me), I'm not:
  1. Here is a presentation on how the guys at Adform switched from Cassandra to Aerospike: Big Data Strategy Minsk 2014 - Tadas Pivorius - Married to Cassandra
  2. A friend working at a different company also told me that they had used Cassandra and abandoned it.
  3. Just look at the linked issues and the number of watchers.
In all cases the problems were similar to mine.

Thursday, May 5, 2016

Specifying requirements for live notification mechanism for systems integration purposes

Recently I designed a mechanism to notify external systems (with which we cooperate) about changes in our system. This can obviously be done in multiple ways. Let's look at some high-level considerations, a few questions, and how they affect our requirements.

Assumptions

  • we want to notify other, external systems, owned by someone else
  • the allowed delay between a change in our system and the notification is around one minute
  • a change can carry multiple pieces of information, varying with the type of change
  • we expose an API which is currently used by those external systems - they fetch the changes periodically
  • the number of changes per second in our system is spiky in nature (assume 50-5000 notifications/second for now)
  • external systems will subscribe themselves for notifications (see the sketch after this list)
Those are real-life business assumptions, as delivered to the designer/programmer/you/me.
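To make the last assumption concrete: a subscription does not have to be more than a small record per external system. A hypothetical sketch (all field names are mine):

```python
# Hypothetical shape of a subscription record implied by the assumptions
# above; field names are illustrative, not a spec.
from dataclasses import dataclass

@dataclass
class Subscription:
    system_id: str           # which external system registered it
    callback_url: str        # HTTPS endpoint we will notify
    change_types: list[str]  # which types of change the system wants
```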

Questions?

  1. How to notify external systems?
  2. What information should we pass? When is the notification delivered?
  3. How long should we wait for the response?
  4. When should we retry? 
Let's try to answer those questions.

Answers

  1. There are multiple external systems, built with multiple different technologies. The most popular and basic method of integration is simply making HTTP(S) calls. Should it be GET, POST or something else? Let's consider the two most popular - GET and POST.
  2. We have to pass multiple values, depending on the notification type. For example, a typical amount of information is: a string (300 chars), 5 dates and 5 integers - therefore both GET (allowing ~2k characters in nearly all browsers and servers) and POST are viable. However, GET is very straightforward and simple: no issues with encoding, accepting compression or even reading the body stream. What is more, GET puts less pressure on your servers, as you do not have to send a request body. Unfortunately, the GET query string is also visible to (nearly) everyone, therefore only non-sensitive information can be passed. What about concurrent notifications? How could one build an "exactly-once" delivery model? Here is where we can make nice use of one of our assumptions. Because we already expose an API, we can have external systems fetch the details through it after we notify them. Such a notification can be delivered in an "at-least-once" model, carrying only non-sensitive, idempotent information about the change, which can then be used to fetch the full, sensitive data from our API. One can even imagine an optimization - keep pending notifications in a buffer and delete duplicates within a small time bucket (see the sketch after this list).
  3. The obvious thing is that the longer we wait for responses, the more resources are used. However, there is one more important aspect: by specifying the request timeout, we can shape the architecture of the external system. Saying "you have 30 seconds to process the notification" is like saying "you have plenty of time to receive our notification, process it, synchronously ask our API and then send us an HTTP 200 status code". Compare that with "you have 3 seconds to store the notification for later, or process it asynchronously". The implications are clear: a short timeout = fewer required resources + better integration.
  4. We want to be sure that the notification reaches the external system, and thanks to the design from the second point we can use the "at-least-once" delivery model. I see two options: a) hit the specified URL 3 times (for example), don't wait for the answer and never send this notification again; b) hit the specified URL and retry in X minutes if the HTTP status code was different from 200. The first option is very simple to implement, but it assumes that external systems will develop a mechanism to avoid processing the same notification multiple times - which will likely end in them hitting our API three times for every single notification. The second option is sketched below.
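Putting answers 2-4 together, the sender side could look roughly like the sketch below, using Python's requests library. The parameter names (change_id, change_type), the 3-attempt limit and the delays are my illustrative assumptions, not the production values:

```python
# Sketch of the delivery loop implied by answers 2-4: at-least-once GET
# notifications carrying only a non-sensitive, idempotent change id, a
# short timeout, and retries on anything other than HTTP 200. All names
# and numbers are illustrative assumptions.
import time
import requests

TIMEOUT_SECONDS = 3       # answer 3: force subscribers to respond fast
MAX_ATTEMPTS = 3          # answer 4, option b: bounded retries
RETRY_DELAY_SECONDS = 60  # "retry in X minutes", scaled for the example

def deduplicate(notifications):
    """Answer 2's optimization: drop duplicate change ids collected
    within one small time bucket before sending anything."""
    seen, unique = set(), []
    for n in notifications:
        if n["change_id"] not in seen:
            seen.add(n["change_id"])
            unique.append(n)
    return unique

def notify(callback_url, notification):
    """At-least-once delivery: the subscriber fetches the full, sensitive
    data from our API afterwards, so resending the same id is harmless."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            response = requests.get(
                callback_url,
                params={"change_id": notification["change_id"],
                        "change_type": notification["change_type"]},
                timeout=TIMEOUT_SECONDS)
            if response.status_code == 200:
                return True
        except requests.RequestException:
            pass  # network errors are treated like a non-200: retry
        if attempt < MAX_ATTEMPTS - 1:
            time.sleep(RETRY_DELAY_SECONDS)
    return False  # give up: at-least-once, not guaranteed, delivery
```

The key property is that the notification itself carries nothing sensitive, so duplicates and retries are harmless by construction - the external system always pulls the real data from our API.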

Conclusion

There we have it: answers which should lead to a simple, sleek design - one that is relatively easy to implement, completely fulfills the needs and demands a good design from the external systems.