How to resolve performance issues of big data applications

I didn’t grow up celebrating Christmas, but every time I watched Chevy Chase, as Clark Griswold in National Lampoon’s Christmas Vacation, trying to unravel his Christmas lights, I could feel his pain. If you have ever had to put up lights, you might remember how time-consuming it was to untangle them. But that wasn’t the biggest problem. No, the biggest issue was troubleshooting why one or more lights were out. It could be an issue with the socket, the wiring, or just one bad light causing a section of good lights to go out. Figuring out the root cause was a frustrating trial-and-error process.

Today’s approach to diagnosing and resolving performance issues in big data systems is just like dealing with those pesky Christmas lights. Current performance monitoring and management tools don’t pinpoint the root cause of a problem or show how it affects other systems or components running across a big data platform. As a result, troubleshooting and resolving performance issues, such as rogue users and jobs degrading cluster performance, missed SLAs, stuck jobs, failed queries, or poor visibility into cluster usage and application performance, is very time-consuming and cannot scale to support big data applications in a production deployment.

There’s a better way to resolve big data performance issues than spending hours sifting through monitoring graphs and logs

Managing big data operations in a multi-tenant cluster is complex, and the problems listed above are hard to diagnose. It’s also hard to track who is doing what, understand cluster usage and application performance, justify resource demands, and forecast capacity needs.

Gaining full visibility across the big data stack is difficult because there is no single pane of glass that gives operations teams and users insight into what’s going on. Even with monitoring tools like Cloudera Manager, Ambari, and MapR Control System, people still have to piece together logs, monitoring graphs, configuration files, and so on to resolve application performance issues.

The Unravel platform gives you this important insight across multiple systems.