Also see our blog post with stories from Untold speakers, “My Fitbit sleep score is 88…”
Unravel Data recently held its first-ever customer conference, Untold 2020. We promoted Untold 2020 as a convocation of #datalovers. And these #datalovers generated some valuable data – including the interesting fact that more than 60% of surveyed customers have SLAs for either “more than 50% of their pipelines” (42%) or “all of their pipelines” (21%).
All of this ties together. Unravel makes it much easier for customers to set attainable SLAs for their pipelines, and to meet those SLAs once they’re set. Let’s look at some more data-driven findings from the conference.
And, if you’re an Unravel Data customer, reach out to access much more information, including webinar-type recordings of all five customer and industry expert presentations, with polling and results presentations interspersed throughout. If you are not yet a customer, but want to know more, you can create a free account or contact us.
Unravel Data CEO Kunal Agarwal kicking off Untold.
Note: “The plural of anecdotes is not data,” the saying goes – so, nostra culpa. The findings discussed herein are polling results from our customers who attended Untold, and they fall short of statistical significance. But they do represent the opinions of some of the most intense #datalovers in leading corporations worldwide – Unravel Data’s customer base. (“The great ones skate to where the puck’s going to be,” after Wayne Gretzky…)
More Than 60% of Customers Have SLAs for Most of Their Pipelines
More than 60% of the Unravel customers we polled have SLAs for more than half of their pipelines:
- More than 20% have SLAs for all their pipelines.
- More than 40% have SLAs for more than half of their pipelines.
- Fewer than 40% have SLAs for half or fewer of their pipelines: 29% for between a quarter and half of them, and 8% for fewer than a quarter.
Pipelines were, understandably, a huge topic of conversation at Untold. Complex pipelines, and the need for better tools to manage them, are very common amongst our customers. And Unravel makes it possible for our customers to set, and meet, SLAs for their pipelines.
What percentage of your data pipelines have SLAs? | Share of respondents |
<25% | 8.3% |
25-50% | 29.2% |
>50% | 41.7% |
All of them | 20.8% |
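The "more than 60%" figure is simply the sum of the top two poll buckets. A minimal Python sketch of that arithmetic, using the poll shares reported here (the dictionary keys are illustrative names, not survey wording):

```python
# Poll shares from the SLA question, in percent of respondents.
sla_coverage = {
    "under_25": 8.3,
    "25_to_50": 29.2,
    "over_50": 41.7,
    "all": 20.8,
}

# Respondents with SLAs on more than half of their pipelines.
majority_share = sla_coverage["over_50"] + sla_coverage["all"]

# Respondents with SLAs on half or fewer of their pipelines.
minority_share = sla_coverage["under_25"] + sla_coverage["25_to_50"]

print(f"SLAs on most or all pipelines: {majority_share:.1f}%")
print(f"SLAs on half or fewer pipelines: {minority_share:.1f}%")
```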
Bad Applications Cause Cost Overruns
We asked attendees to name the biggest reason for cost overruns:
- Roughly three-quarters replied, “Bad applications taking too many resources.” Identifying which applications are consuming the resources is a key feature of Unravel software.
- One-fifth replied, “Oversized containers.” Not everyone is using containers yet, so we are left to wonder just how many container users are unnecessarily supersizing their containers. Most of them?
- The remaining 5% answered, “Small files.” One of the strongest presentations at the Untold conference covered the tendency of bad apps to generate a myriad of small files that consume a disproportionate share of resources; you can use Unravel to generate a small files report and track these problems down.
What is usually the biggest reason for cost overruns? | Share of respondents |
Bad applications taking too many resources | 75.0% |
Oversized containers | 20.0% |
Small files | 5.0% |
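To make the small files problem concrete, here is a rough Python sketch that counts files below a size threshold in a local directory tree. This is not Unravel's small files report; the function name, the default 1 MiB threshold, and the use of a local filesystem are all assumptions for illustration, and a real cluster would scan HDFS or object storage instead.

```python
import os
import tempfile
from typing import Dict


def small_file_summary(root: str, threshold_bytes: int = 1024 * 1024) -> Dict[str, int]:
    """Count files under `root` that are smaller than `threshold_bytes`."""
    total_files = 0
    small_files = 0
    small_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
            total_files += 1
            if size < threshold_bytes:
                small_files += 1
                small_bytes += size
    return {
        "total_files": total_files,
        "small_files": small_files,
        "small_bytes": small_bytes,
    }


# Demo on a throwaway directory: three 10-byte files and one 2 KiB file.
with tempfile.TemporaryDirectory() as root:
    for i in range(3):
        with open(os.path.join(root, f"part-{i}"), "wb") as f:
            f.write(b"x" * 10)
    with open(os.path.join(root, "big"), "wb") as f:
        f.write(b"x" * 2048)
    summary = small_file_summary(root, threshold_bytes=1024)

print(summary)
```

A high ratio of small files to total files, with only a small share of total bytes, is the pattern worth investigating.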
Two-Thirds Know Their Most Expensive App/User
Amongst Unravel customers, almost two-thirds can identify their most expensive app(s) and user(s). Identifying costly apps and users is one of the strengths of Unravel Data:
- On-premises, expensive apps and users consume far more than their share of system resources, slowing other jobs and contributing to instability and crashes.
- In moving to cloud, knowing who’s costing you in your on-premises estate – and whether the business results are worth the expense – is crucial to informed decision-making.
- In the cloud, “pay as you go” means that, as one of our presenters described it, “When you go, you pay.” It’s very easy for 20% of your apps and users to generate 80% of your cloud bill, and it’s very unlikely for that inflated bill to also represent 80% of your IT value.
Do you know who is the most expensive user / app on your system? | Share of respondents |
Yes | 65.0% |
No | 25.0% |
No, but would be great to know | 10.0% |
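The "20% of apps, 80% of the bill" pattern is easy to check once you have per-app costs. A minimal Python sketch follows; the app names and dollar figures are hypothetical, and the function is an illustration rather than anything Unravel provides.

```python
from typing import Dict, List, Tuple


def top_cost_drivers(costs: Dict[str, float], threshold: float = 0.8) -> List[Tuple[str, float]]:
    """Return the fewest apps, largest first, whose combined cost reaches `threshold` of the total."""
    total = sum(costs.values())
    if total <= 0:
        return []
    drivers: List[Tuple[str, float]] = []
    running = 0.0
    for app, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        drivers.append((app, cost))
        running += cost
        if running / total >= threshold:
            break
    return drivers


# Hypothetical monthly costs in dollars for ten apps.
app_costs = {"etl_nightly": 5200.0, "ad_hoc_sql": 3000.0}
app_costs.update({f"report_{i}": 225.0 for i in range(1, 9)})

drivers = top_cost_drivers(app_costs)
driver_cost = sum(cost for _, cost in drivers)
total_cost = sum(app_costs.values())

print(f"{len(drivers)} of {len(app_costs)} apps account for {driver_cost / total_cost:.0%} of cost")
for app, cost in drivers:
    print(f"  {app}: ${cost:,.0f}")
```

The same function works for users instead of apps: pass a mapping of user name to cost.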
An Ounce of Prevention is Worth a Pound of Cure
Knowing whether you have rogue users and/or apps on your system is very valuable:
- A plurality (43%) of Unravel customers have rogue users/apps on their cluster “all the time.”
- About one in five (19%) see this about once a day.
- The rest (38%) see this behavior only once a week or so.
We would bet good money that non-Unravel customers see a lot more rogue behavior than our customers do. With Unravel, you can know exactly who and what is “going rogue” – and you can help stakeholders get the same results with far less use of system resources and cost. This cuts down on rogue behavior, freeing up room for existing jobs, and for new applications to run with hitherto unattainable performance.
How often do you have rogue users/apps on your cluster? | Share of respondents |
All the time! | 42.9% |
Once a day | 19.0% |
Once a week | 38.1% |
Unravel Customers Are Fending Off Bad Apps
Once you find and improve the bad apps that are currently in production, the logical next step is to prevent bad apps from even reaching production in the first place. Unravel customers are meeting this challenge:
- More than 90% of attendees find automation helpful in preventing poor-quality apps from being promoted into production.
- More than 80% have a quality gate when promoting apps from Dev to QA, and from QA to production.
- More than half have a well-defined DataOps/SDLC (software development life cycle) process, and nearly a third have a partially-defined process. Only about one-eighth have neither.
- About one-quarter have operations people/sysadmins troubleshooting their data pipelines; another quarter put the responsibility onto the developers or data engineers who create the apps. Nearly half make a game-time decision, depending on the type of issue, or use an “all hands on deck” approach with everyone helping.
The Rest of the Story
- More than two-thirds of attendees find their data costs running over budget.
- More than half are in the process of migrating to cloud – though only a quarter have gotten to the point of actually moving apps and data, then optimizing and scaling the result.
- Half find that automating problem identification, root cause analysis, and resolution saves them 1-5 hours per issue; the other half save 6-10 hours or more.
- Somewhat more than half find their clusters to be on the over-provisioned side.
- Nearly half have ten or more technologies in representative data pipelines.
Finding Out More
Unravel Data customers – #datalovers all – look to be well ahead of the industry in managing issues that relate to big data, streaming data, and moving to the cloud. If you’re interested in joining this select group, you can create a free account or contact us. (There may still be some Untold conference swag left for new customers!)