While many enterprises across a wide range of verticals, including finance, healthcare, and technology, build applications on the big data stack, some are not fully aware of the performance management and operational challenges that often arise. Over the course of a two-part blog series, we’ll address the requirements at both the individual application level and the holistic level of clusters and workloads, and explore what type of architecture can provide automated solutions for these complex environments.
Composed of multiple distributed systems, the big data stack in an enterprise typically goes through the following evolution:
Industry analysts estimate that more than 10,000 enterprises worldwide run applications in production on a big data stack composed of three or more distributed systems.
Naturally, performance challenges are inherent in such a stack, whether they relate to failure, speed, SLAs, or behavior. Typical questions that are nontrivial to answer include:
Many operational performance requirements arise at the “macro” level rather than at the level of individual applications. These include:
Addressing performance challenges requires a sophisticated architecture that includes the following components:
There are several technical challenges involved in this process. Examples include:
Now that we’ve covered the modern big data stack and reviewed the operational challenges and the requirements a performance management solution must meet, we look forward to discussing the features such a solution should offer in our next post.