Note: This is an English translation of the original Japanese article.
Hello, I'm Sagara, a Modern Data Stack consultant. The Modern Data Stack ecosystem sees new information published daily. In this article, I'll summarize the interesting Modern Data Stack-related information I've come across in the past two weeks.
Disclaimer: This article doesn't cover all the latest information about the mentioned products. It only includes information that **I personally found interesting**.
Data Extract/Load
Airbyte
Airbyte's annual conference Move(data) was held on March 20th
Airbyte held its annual conference, Move(data), on March 20th.
Related to this, they also published a blog post summarizing the new features announced during the winter period. (The following is an AI summary of the article.)
- **Data Access**
  - Enterprise Connector Bundle including premium connectors for Oracle, SAP HANA, NetSuite, etc.
  - Enhanced support for GraphQL and OAuth 2.0 for secure authentication and efficient data migration
  - Extended file transfer support for unstructured data from Google Drive, SharePoint, OneDrive, etc.
- **Data Control & Governance**
  - AWS PrivateLink support for secure cloud-to-cloud data transfer
  - Detailed audit log functionality for compliance monitoring and governance enhancement
  - Data privacy and governance mapping features compliant with GDPR, HIPAA, and SOC 2
- **Data Portability & AI Workloads**
  - Apache Iceberg destination support for highly scalable AI and analytics workloads
  - Enhanced schema evolution support and unstructured data migration
- **Platform & Performance Improvements**
  - Performance improvements for key connectors like Amazon Ads and Google Sheets
  - Efficient custom connector development with the Python-based CDK
  - Addition of connection tags and resource management features
  - Enhanced pipeline monitoring with OpenTelemetry (OTEL) metrics
https://airbyte.com/blog/airbyte-platform-winter-2025-release
Example of building an MCP Server for Airbyte
Airbyte's official blog published an article describing an example of building an MCP Server for Airbyte.
I personally wondered, "How do you use an MCP Server for Airbyte...?" This article includes examples such as checking connector status.
https://airbyte.com/blog/build-an-airbyte-mcp-server-for-claude-desktop
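For context, MCP tool calls are plain JSON-RPC 2.0 messages. A minimal sketch of what a client request to such a server might look like, where the tool name `get_connection_status` and its arguments are illustrative placeholders, not Airbyte's actual API:

```python
import json

# Hypothetical MCP "tools/call" request asking an Airbyte MCP server
# for the status of one connection. MCP messages follow JSON-RPC 2.0;
# the tool name and argument names below are made up for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_connection_status",
        "arguments": {"connection_id": "my-postgres-to-snowflake"},
    },
}

# Serialize as a client would before sending it over stdio or HTTP.
payload = json.dumps(request)
print(payload)
```

A server built for Claude Desktop, as in the article, would register such tools and answer these requests with connector status data.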
Data Warehouse/Data Lakehouse
Snowflake
Ability to evaluate specified conditions and trigger alerts when new records are added to target tables
Snowflake introduced a new feature that evaluates specified conditions and triggers alerts when new records are added to target tables.
https://docs.snowflake.com/en/release-notes/2025/other/2025-03-19-alerts-on-new-data
This is a very exciting feature, so I wrote two blog posts about it:
- Outputting task and Dynamic Table errors to an event table as logs and sending alert notifications when new logs appear
https://dev.classmethod.jp/articles/snowflake-alert-with-new-data-for-error-notification/
- Performing data quality checks when new data enters the target table and sending alert notifications if anomalies are detected
https://dev.classmethod.jp/articles/snowflake-alert-with-data-quality-check/
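Conceptually, the feature evaluates a condition only against rows that arrived since the last successful check. A self-contained Python sketch of that watermark-style logic, purely as an illustration of the idea rather than Snowflake's implementation:

```python
# Illustration of alert-on-new-data semantics: evaluate a condition
# only over rows added since the last check, tracked by a watermark.
def check_new_rows(rows, last_seen_id, condition):
    """Return (alerts, new_watermark) for rows with id > last_seen_id."""
    new_rows = [r for r in rows if r["id"] > last_seen_id]
    alerts = [r for r in new_rows if condition(r)]
    new_watermark = max((r["id"] for r in new_rows), default=last_seen_id)
    return alerts, new_watermark

table = [
    {"id": 1, "error_count": 0},
    {"id": 2, "error_count": 5},  # arrives after the first check
]

# First check sees only row 1; its condition is false, so no alert.
alerts, wm = check_new_rows(table[:1], 0, lambda r: r["error_count"] > 0)
# Second check evaluates only the newly arrived row 2 and alerts on it.
alerts, wm = check_new_rows(table, wm, lambda r: r["error_count"] > 0)
print(len(alerts), wm)  # 1 2
```

This is the pattern behind both blog posts above: the condition can be an error-log query or a data quality check, and it fires only for data that is new since the previous evaluation.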
terraform-provider-snowflake moving from Snowflake-Labs to snowflakedb for GA
The ROADMAP.md for terraform-provider-snowflake has been updated, indicating that terraform-provider-snowflake will be moved from Snowflake-Labs to snowflakedb for GA.
I wondered, "What's the difference between GA and v1.0?" It turns out that GA will add official Snowflake support:
- having official Snowflake support (ability to submit official Support Cases for the Provider);
- migrating the project to the snowflakedb GitHub organization (we are still in Snowflake-Labs, reserved for unofficial/experimental projects).
https://github.com/snowflakedb/terraform-provider-snowflake/blob/main/ROADMAP.md
Data Transform
General
Comparison of column-level lineage in dbt and SQLMesh
Recce's blog published an article summarizing the differences in column-level lineage between VS Code extensions, dbt Explorer, Recce, and SQLMesh.
https://datarecce.io/blog/column-level-lineage-options-for-dbt/
dbt
dbt Developer Day held on March 20th, announcing latest features including a new engine integrating SDF functionality
dbt Developer Day was held on March 20th, with many announcements of the latest features.
https://www.getdbt.com/blog/dbt-developer-day-2025
I was particularly interested in the following:
- Announcement of a new dbt engine with SDF functionality and an official VS Code extension to use the new engine (currently requires application for use)
- dbt Copilot reaching GA
- dbt Core 1.10 beta release, with new features including the `--sample` flag for sampling during builds
- Plans to support Python models with BigQuery DataFrames
dbt Copilot now generally available
"dbt Copilot," which allows you to generate SQL and YAML automatically by making requests in natural language in the dbt Cloud IDE, is now generally available.
https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes#march-2025
I actually tried it myself and wrote a blog post about it, so please check it out as well.
https://dev.classmethod.jp/articles/dbt-cloud-dbt-copilot-ga/
SELECT's summary of practices for slim dbt development and deployment
SELECT published three articles summarizing practices for slim dbt development and deployment.
They are quite specific and informative!
- Methods for slim local builds, including the `--defer` flag, the `--empty` flag, and ref macro modifications
- Methods for slim CI/CD, including `state:modified`, the `--defer` flag, and the use of clone and swap
- Methods for slim scheduled builds using `source_status:fresher+` and tagging
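As a rough illustration of how the flags mentioned above are typically combined (the state paths and selectors here are placeholders; see SELECT's articles for the full details):

```shell
# Local build that defers unmodified upstream models to production
# artifacts instead of rebuilding them
dbt build --select state:modified+ --defer --state path/to/prod-artifacts

# Compile-and-validate run that materializes models with zero rows
dbt build --empty

# Scheduled build limited to sources with fresher data since the last run
dbt source freshness
dbt build --select "source_status:fresher+" --state path/to/prod-artifacts
```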
Coalesce
Coalesce announces acquisition of CastorDoc
Coalesce, which offers GUI-based data modeling and transformation pipeline building, announced the acquisition of the data catalog service CastorDoc.
Since I've covered CastorDoc (formerly Castor) several times in my blog, this was shocking news to me personally.
https://coalesce.io/company-news/why-coalesce-acquired-castordoc/
SQLMesh (Tobiko Cloud)
Tobiko Cloud now generally available
Tobiko Cloud, the SaaS version of SQLMesh, is now generally available.
The following is an AI-generated summary of the features of Tobiko Cloud mentioned in the article:
- **Granular, actionable observability and insights**
  - Features three observability capabilities: runtime alerts, integrated debugger, and warehouse cost tracking
  - Alerts are easy to configure and detect failed runs or threshold exceedances, notifying via Slack, PagerDuty, or email
  - The debugger consolidates critical context for error resolution in one place
  - The cost tracker analyzes spending on engines like BigQuery and Snowflake, identifying the most expensive models
- **Intuitive, efficient built-in orchestration**
  - Features a native scheduler based on SQLMesh's state-aware architecture
  - Minimizes pipeline bottlenecks with concurrency and model execution pause capabilities
  - Can run multiple models in parallel, preventing blockage by long-running jobs
  - The model execution pause feature allows temporary suspension of production runs during maintenance or issue investigation
- **Best-in-class security and data governance**
  - Provides isolated Python environments and hybrid deployment options
  - Isolated Python environments install only the dependencies needed for each execution
  - Hybrid deployment allows users to keep all data and warehouse operations within their own infrastructure
- **Advanced impact and change analysis**
  - Advanced change categorization evaluates changed columns used downstream, reducing unnecessary backfills
  - Supports cross-database diff detection, comparing dataset versions across multiple warehouses
  - Uses hash algorithms to avoid high-cost full joins, streamlining vendor migrations and data drift detection
https://www.tobikodata.com/blog/introducing-enterprise-ready-tobiko-cloud
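The hash-based diff idea mentioned above can be sketched in a few lines: digest each row on both sides, compare digests per key, and only inspect rows whose digests differ. This is a conceptual illustration, not Tobiko's actual algorithm:

```python
import hashlib

def row_digest(row):
    """Stable digest of a row's values; comparing digests avoids
    shipping full rows between warehouses for a diff."""
    canonical = "|".join(str(row[k]) for k in sorted(row))
    return hashlib.sha256(canonical.encode()).hexdigest()

def diff_by_hash(left, right, key="id"):
    """Return keys whose row contents differ between two datasets."""
    left_map = {r[key]: row_digest(r) for r in left}
    right_map = {r[key]: row_digest(r) for r in right}
    return sorted(
        k for k in left_map.keys() | right_map.keys()
        if left_map.get(k) != right_map.get(k)
    )

prod = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
dev = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]
print(diff_by_hash(prod, dev))  # [2]
```

In a real cross-warehouse diff, each side would compute its digests locally, so only small hashes cross the network instead of full result sets.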
Business Intelligence
Tableau
Tableau 2025.1 has been released
The latest version of Tableau, 2025.1, has been released.
Key features include VizQL Data Service API, PrivateLink connection to Tableau Cloud, and multilingual support for Tableau Agent.
https://www.tableau.com/products/new-features
It can be downloaded from the following URL:
https://www.tableau.com/support/releases
Omni
New features released including source table support via CSV/XLSX uploads and manual data entry
Omni's ChangeLog has been updated, announcing new features including source table support via CSV/XLSX uploads and manual data entry.
https://omni.co/blog/introducing-data-input
Another interesting new feature shown in the ChangeLog is the ability to check from the IDE whether dbt Models are working correctly.
Data Catalog
Atlan
Announced an agent for sending metadata to Atlan from private networks
Atlan announced a new feature: an agent for sending metadata to Atlan from private networks. This eliminates the need for inbound connections from SaaS-based Atlan to databases in private networks, requiring only outbound connections from the private network to Atlan.
Currently, it supports Oracle, MS SQL Server, PostgreSQL, and Salesforce, with plans to support more connectors in the future.
https://shipped.atlan.com/securely-extract-metadata-from-within-your-org-1ui772
Data Activation (Reverse ETL)
Census
Released "AI Sheets" with free AI Column functionality
Census released "AI Sheets," a standalone tool separate from Census's main product that lets you add columns to your data based on natural language requests.
https://www.getcensus.com/blog/introducing-free-ai-sheets-transform-your-data-simply-and-easily
You can use it at the following URL. It allows you to upload a CSV with up to 1000 rows and add processed columns based on natural language requests.
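Since uploads are capped at 1,000 rows, a quick local pre-check can save a failed upload attempt. A trivial sketch, where the only detail taken from the announcement is the row limit itself:

```python
import csv
import io

MAX_ROWS = 1000  # AI Sheets' stated upload limit

def within_upload_limit(csv_text):
    """Count data rows (excluding the header) and check the limit."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader) <= MAX_ROWS

sample = "name,city\nAda,London\nGrace,Arlington\n"
print(within_upload_limit(sample))  # True: 2 data rows
```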