On November 12, 2020, Databricks, a big data processing and distribution software company, announced that it is set to release SQL Analytics.
Databricks has built a platform that helps its customers unify their analytics across business, data science, and data engineering on top of an open-source framework.
Its current product suite consists of:
- Delta Lake, an open-source data lake product
- MLflow, an open-source project that helps data teams operationalize machine learning
- Koalas, which implements the pandas DataFrame API on top of Apache Spark
- Spark, the open-source analytics engine
Databricks announces SQL Analytics
Databricks seems to have found a sweet spot in data operations. It is now adding SQL Analytics to its suite of products, which will allow users to run BI and SQL workloads directly on the data lake. According to the company’s announcement, this will “let analysts query data lakes with the BI tools they already use.”
Source: Databricks
This solution helps companies solve the tricky problem of data management and data storage, as it has historically been difficult to perform advanced analytics and machine learning on data lakes. Databricks argues that with the right tools, like SQL Analytics, "it makes sense for customers to think about positioning their data lake as the center of data infrastructure."
SQL Analytics will bring to the table:
- A SQL-native workspace
- Built-in connectors to existing BI tools and broad partner support
- Fast query performance
- Governance and administration
Zooming out
Data does not live in a vacuum. It must be properly managed, maintained, and mined in order to derive insights. As mentioned previously, sellers are looking to bring together best-in-class tools into a comprehensive data cloud that serves all of a business’s data needs. The ultimate aim of this confluence is a solution that lets users both manage their data (including data preparation) and analyze it (through both analytics and data science).
Databricks is demonstrating how this can be built from the bottom up, starting with data management, while other sellers have built their comprehensive solutions from the top down, starting with strong analytics capabilities. Either way, the trend we are seeing is the confluence of data management and analytics into one comprehensive solution.