This is a new strategic shift for the American software vendor. Snowflake is no longer just a data platform, but also an application-oriented platform-as-a-service environment.
On the occasion of Snowflake Summit, which is taking place this week in Las Vegas, the American vendor of the same name is announcing a new phase in its strategy. By combining flexibility, big data volumes, and data warehouse performance, it now makes it possible to run applications in its Data Cloud. Until now, the vendor's marketplace was limited to sharing data from its platform. It now also makes room for applications, so that they too can be shared and monetized.
“Building on our environment, these data applications will be able to take advantage of transactional data or analytics data, but also include machine learning models if needed,” explains Benoît Dageville, Snowflake co-founder and chief product officer. The centerpiece of this offering, the Native Application Framework, is currently in private beta. It covers the entire application lifecycle, from development to sale through deployment and expansion. Applications will of course be able to take advantage of the functions specific to Snowflake: stored procedures, user-defined functions (UDFs), user-defined table functions (UDTFs), and so on.
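For readers unfamiliar with the last two terms, the difference between a UDF and a UDTF can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Snowflake's API: the function names and the sample rows are hypothetical, and nothing here runs inside Snowflake.

```python
from typing import Iterator, Tuple


def order_total(quantity: int, unit_price: float) -> float:
    """Scalar function (UDF-style): one input row in, exactly one value out."""
    return round(quantity * unit_price, 2)


def split_tags(order_id: int, tags: str) -> Iterator[Tuple[int, str]]:
    """Table function (UDTF-style): one input row in, zero or more rows out."""
    for tag in tags.split(","):
        tag = tag.strip()
        if tag:
            yield (order_id, tag)


# Hypothetical sample rows: (order_id, quantity, unit_price, tags)
orders = [
    (1, 3, 9.99, "gift, express"),
    (2, 1, 24.50, ""),
]

# The scalar function produces one result per order.
totals = [(oid, order_total(qty, price)) for oid, qty, price, _ in orders]

# The table function fans out: order 1 yields two rows, order 2 yields none.
tag_rows = [row for oid, _, _, tags in orders for row in split_tags(oid, tags)]

print(totals)
print(tag_rows)
```

In a database, the first kind of function is called once per row inside an expression, while the second is used where a table is expected and can change the number of rows.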
Once deployed, applications developed on Snowflake and their associated data will remain in the customer's (or tenant's) space while they run. Customers will thus remain in control of security and governance, regardless of how the applications are used. Of Snowflake's 425 partner vendors, only a few have had access to the new framework so far. This is the case for Informatica and ServiceNow, as part of the development of new Snowflake connectors, and also for Google, for integrating Google Analytics metrics into the platform.
Machine learning at the heart of the show
Snowpark complements the Native Application Framework. It is a library designed for massively parallel processing of data in Snowflake (Spark-style), running in a sandbox to keep it secure. Its purpose? To enable teams to build data pipelines and machine learning workflows. During its global event, Snowflake also announced the public beta of Snowpark for Python. The language joins Java and Scala, which were already supported. Snowpark for Python integrates with the Python development environment that came with the Streamlit acquisition. In parallel, several Python libraries are supported: NumPy and pandas on the data analytics side, scikit-learn and TensorFlow on the machine learning side.
To round out Snowpark for Python, Snowflake is working on several developments. First, Snowflake Worksheets for Python, which aims to integrate Streamlit into the core of the Snowsight GUI. Next, SQL Machine Learning, which is designed to build predictive machine learning models based on time series. These two components are currently in private beta. Finally, large-memory warehouses, currently in development, will handle memory-intensive operations such as feature engineering or machine learning processing applied to large data sets.
Transactions and data analytics
Beyond applications, Snowflake Summit has seen other big announcements. Chief among them is the introduction of Unistore. Released in private beta, this component extends Snowflake to transaction processing. The stated goal: to run data services with a latency of a few milliseconds, with state management and concurrent access. “And it's very useful in machine learning,” says Benoît Dageville. For the occasion, Snowflake is introducing Hybrid Tables to manage both transaction processing (OLTP) and data analytics. “The goal is to avoid separating current data from historical data, with the need for an ETL between the two,” the Snowflake co-founder continues.
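The idea of a single table serving both workloads can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in; it does not use Snowflake or its Hybrid Tables syntax, and the `orders` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in (assumption: this is
# not Snowflake and says nothing about how Unistore is implemented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, status TEXT)"
)

# Transactional side: small writes, committed atomically.
with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("INSERT INTO orders VALUES (1, 'acme', 120.0, 'new')")
    conn.execute("INSERT INTO orders VALUES (2, 'acme', 80.0, 'new')")
    conn.execute("INSERT INTO orders VALUES (3, 'globex', 50.0, 'new')")

with conn:
    conn.execute("UPDATE orders SET status = 'paid' WHERE id = ?", (1,))

# Transactional side: point lookup by primary key.
status = conn.execute(
    "SELECT status FROM orders WHERE id = ?", (1,)
).fetchone()[0]

# Analytical side: an aggregate over the very same table, with no copy
# and no ETL step in between.
revenue_by_customer = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

print(status)
print(revenue_by_customer)
```

The point of the sketch is only that the analytical query sees the latest committed writes directly, which is the benefit described in the quote above.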
In another announcement, Snowflake is improving real-time data ingestion. This evolution includes the launch of Snowpipe Streaming (in private beta), a technology that streams data in serverless mode. Materialized Tables will soon be added to this component, a feature in development intended to simplify declarative data transformation. For interacting with third-party data stores, Snowflake is also working on two new types of tables. The first, Iceberg Tables, currently in development, will open the door to the Apache Iceberg table format. The second, external tables for on-premises storage, will provide access from Snowflake to storage systems deployed on premises, such as those from Dell Technologies and Pure Storage.
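To make "declarative data transformation" concrete: the desired result is described once as a query, and the system is responsible for producing it as new data arrives. The sketch below uses a plain SQLite view as a stand-in, with hypothetical table and column names. It is not Snowflake syntax, and a view is recomputed at read time, whereas a materialized table would presumably store and refresh its results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id TEXT, event TEXT, amount REAL)")

# Declarative step: state WHAT the transformed data looks like, not HOW or
# WHEN to compute it. (A SQLite view is only a stand-in for the concept.)
conn.execute(
    """
    CREATE VIEW purchases_per_user AS
    SELECT user_id, COUNT(*) AS purchases, SUM(amount) AS total
    FROM raw_events
    WHERE event = 'purchase'
    GROUP BY user_id
    """
)


def ingest(rows):
    """Append newly arrived events to the raw table."""
    with conn:
        conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", rows)


def snapshot():
    """Read the transformed result as it stands right now."""
    return conn.execute(
        "SELECT * FROM purchases_per_user ORDER BY user_id"
    ).fetchall()


ingest([("u1", "purchase", 10.0), ("u1", "view", 0.0), ("u2", "purchase", 5.0)])
first = snapshot()

# New data arrives; the transformation definition does not change.
ingest([("u1", "purchase", 2.5)])
second = snapshot()

print(first)
print(second)
```

No pipeline code is rewritten between the two snapshots: only the raw table changes, and the declared result follows.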
When it went public in September 2020, Snowflake raised $3.4 billion. The company has more than 3,000 employees and more than 6,300 customers around the world.