As discussed in our recent Series C announcement, we are seeing unprecedented and accelerating demand for solutions to complex, business-critical challenges in dealing with data. Modern data systems are becoming impossibly complex. The burgeoning amount of data being processed in organizations today is staggering. Just a year ago, Forbes reported that 90% of the world’s data was created in the previous two years.
“New Stack” of Software Emerging
Organizations of all sizes and in all industries are transforming to deal with this change and a ‘New Stack’ of software is emerging to enable the building, operation and monitoring of these modern applications and the systems that support them. According to Morgan Stanley, ‘New Stack’ revenue is set to hit $50 Billion by 2022. In addition, according to their January 2019 CIO Survey, 50% of application workloads are expected to reside in public cloud environments by 2022, up from ~22% today.
With that backdrop, are the recent headlines about Cloudera and MapR surprising or anticipated? Is interest in data waning? Not even close. So why did Hortonworks get swallowed, why did Cloudera stumble, and why is MapR disappearing? The consensus is clear: execution gaps, plus an adoption of public cloud services that was expected but came much faster than anticipated. The other side of the equation is evidenced by Microsoft’s impressive latest earnings announcement, driven in large part by its $40 Billion Azure business, which grew at 73% last quarter. It is hard to post those kinds of growth figures on a number that big. (Disclaimer: Microsoft is an investor in Unravel.)
Rise of Big Data in the Cloud
I’ve written before about the rise of big data in the cloud, but Unravel has been doing a lot more than just talking about this shift. We’ve taken significant steps to support the transition of data workloads to the cloud. Unravel has forged partnerships with Azure and AWS, and our solution for migration and management of data workloads is available on both platforms. We have a particularly deep relationship with Azure and with M12 (Microsoft’s investment arm), which participated in both Unravel’s Series B and Series C funding rounds. Unravel caught Azure’s eye precisely because of the need to solve these large-scale operational data issues wherever the workloads are being executed.
With Unravel you get complete visibility into every aspect of your data pipelines, so you can answer questions such as:
- Is application code executing optimally (or failing)?
- Are resources being used, abused or constrained?
- How do I lower my cloud instance costs?
- Which workloads should I migrate to cloud first and what’s the performance/cost tradeoff?
- What are specific application and workload costs for Chargeback?
- Where are all my workloads being executed?
- How are all my services being utilized?
- How are users behaving and who are the bad actors?
These issues apply as much to systems located in the cloud as they do to systems on-premises. This is true for the breadth of public cloud deployment types:
- IaaS (Infrastructure as a Service): Cloudera, Hortonworks or MapR data platforms deployed on cloud VMs where your modern data applications are running
- PaaS (Platform as a Service): Managed Hadoop/Spark Platforms like AWS Elastic MapReduce (EMR), Azure HDInsight, Google Cloud Dataproc, etc.
- Cloud-Native: Products like Amazon Redshift, Azure Databricks, AWS Databricks, etc.
- Serverless: Ready-to-use services that require no setup like Amazon Athena, Google BigQuery, Google Cloud Dataflow, etc.
For those interested in learning more about specific services offered by the cloud platform providers, we recently published a blog post on “Understanding Cloud Data Services.”
We introduced a portfolio of capabilities that help customers plan, migrate, and manage modern data applications running on AWS, Microsoft Azure and the Google Cloud Platform. We have also talked frequently about what it takes to “Migrate and scale data pipelines on the AWS Platform” and about “Getting the most from data applications in the cloud.”
Building Expertise in Data Operations
Whether you are running an on-premises system such as Cloudera, a fully managed PaaS environment such as Azure HDInsight or AWS EMR, or some hybrid combination, you need to build expertise in data operations. DataOps is the discipline that ensures you are future-proofing your architectural, operational and commercial decisions as you transform your business, whether you migrate existing data workloads to the cloud, go straight to the cloud for new workloads, keep an application on-premises, or mix these approaches.
Planning for Cloud-based Data Services
So, wherever you are on your cloud adoption and workload migration journey, now is the time to start or accelerate your strategic thinking and execution planning for cloud-based data services. Very recent history shows us that we need to be proactive, not reactive, and to expect this pace of change to continue to accelerate.
As migration goes from planning to reality, make sure you invest in the critical skills, technology and process changes needed to establish a data operations center of excellence and to future-proof your critical data applications and systems.