Spark Performance Management 2018-05-27T15:16:12+00:00

Application Performance Management for Apache Spark

Optimize, troubleshoot and analyze Apache Spark performance

Spark performance is the lifeblood of many big data applications

Many big data applications are now built on Spark. From data transformation and SQL applications to real-time streaming applications and data pipelines powered by AI and machine learning, Spark has made it easier than ever to create big data applications. However, moving these applications into production and then running them continuously and reliably is challenging. Unravel helps you manage Spark performance.

Running Spark apps in production is hard

Spark applications and pipelines can suffer from out-of-memory errors and configuration issues, which lead to unpredictable performance, slow jobs, and applications “stealing” resources in multi-tenant clusters.

Image set: Spark performance tuning prevents missed SLAs, failed jobs, and other issues.

Many factors affect performance and utilization of Spark apps and pipelines

The traditional approach cannot scale because there are too many potential problems, across too many different systems, for DevOps to troubleshoot through trial and error. For example, things can go wrong at multiple levels, including the app, containers, resource management, network performance, and data storage. Additional factors add to the complexity of moving Spark from pilot to production: there are many patterns of Spark applications, many submission methods, and many infrastructure choices. All of them call for a simpler, end-to-end way to manage performance and utilization.

Running Spark in production needs a full stack, intelligent, and automated approach to operations and performance management

Unravel was built specifically to manage performance and utilization of big data applications and platforms. Unravel not only collects performance data across the full stack, it also automatically correlates that data and provides specific recommendations for solving performance issues or improving resource utilization. With Unravel, DevOps can optimize performance and utilization, troubleshoot issues quickly, and analyze usage for chargeback reporting and for planning future scaling.

Optimize applications and pipelines

  • Monitor, detect and fix inefficient and failed Apache Spark applications
  • Troubleshoot multi-system pipelines from a single location
  • Ensure compliance on reliability, throughput, and response-time SLAs

Understand data usage and access throughout the Apache Spark stack

  • Ensure optimal use of in-memory data caching
  • Optimize HDFS, NoSQL, and Kafka usage for Spark
  • Detect and fix poor partitioning
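
As a rough illustration of what "poor partitioning" can look like in numbers, the sketch below flags skew by comparing the largest partition to the median partition. The function names and the threshold of 5 are hypothetical choices for this example, not Unravel's method or a Spark default. In PySpark, per-partition record counts could be gathered with something like `rdd.glom().map(len).collect()`.

```python
from statistics import median


def skew_ratio(partition_sizes):
    """Largest partition size divided by the median partition size."""
    mid = median(partition_sizes)
    if mid == 0:
        # Median of zero: any non-empty partition counts as extreme skew.
        return float("inf") if max(partition_sizes) > 0 else 1.0
    return max(partition_sizes) / mid


def is_skewed(partition_sizes, threshold=5.0):
    """Hypothetical heuristic: skewed if the largest partition is at least
    `threshold` times the median. The threshold is an assumption."""
    return skew_ratio(partition_sizes) >= threshold


if __name__ == "__main__":
    balanced = [100, 100, 100, 100]
    lopsided = [10, 10, 10, 10, 200]
    print(skew_ratio(balanced), is_skewed(balanced))
    print(skew_ratio(lopsided), is_skewed(lopsided))
```

A high ratio suggests one task will run far longer than its peers; repartitioning on a better-distributed key is a common remedy.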

Optimize Apache Spark performance tuning

  • Optimize container sizes for Spark on Mesos and YARN
  • Get instructions for tuning JVM for Spark drivers and executors
  • Minimize data shuffles
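
To make container sizing concrete, here is a minimal sketch of the arithmetic behind a Spark executor container on YARN. It assumes Spark's documented default memory overhead (10% of executor memory, with a 384 MiB floor) and assumes the scheduler rounds each request up to a multiple of an allocation increment. The function names, the 1024 MiB increment, and the 64 GiB node are illustrative assumptions; real clusters may be configured differently.

```python
import math

MIN_OVERHEAD_MIB = 384   # Spark's documented floor for executor memory overhead
OVERHEAD_FACTOR = 0.10   # Spark's documented default overhead fraction


def container_mib(executor_memory_mib, increment_mib=1):
    """Approximate memory one executor container requests, in MiB."""
    overhead = max(MIN_OVERHEAD_MIB, int(executor_memory_mib * OVERHEAD_FACTOR))
    requested = executor_memory_mib + overhead
    # Assumption: the scheduler rounds up to a multiple of increment_mib.
    return math.ceil(requested / increment_mib) * increment_mib


def executors_per_node(node_memory_mib, executor_memory_mib, increment_mib=1):
    """How many such containers fit in the memory a node offers."""
    return node_memory_mib // container_mib(executor_memory_mib, increment_mib)


if __name__ == "__main__":
    # Hypothetical node offering 64 GiB to containers, 4 GiB executors.
    print(container_mib(4096, increment_mib=1024))
    print(executors_per_node(65536, 4096, increment_mib=1024))
```

The point of the sketch is that the executor memory you request is not the container size you get: overhead and rounding can noticeably reduce how many executors fit on each node.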

Monitor, manage, and tune the performance of all your applications on the Apache Spark stack


  • Spark SQL
  • Streaming
  • MLlib
  • GraphX

  • Scala
  • Python
  • Java
  • R

  • Standalone
  • Mesos
  • YARN

Data Stores

  • HDFS
  • Cassandra
  • HBase
  • Kafka
  • Elasticsearch