Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Updated May 10, 2025 - Python
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Apache DolphinScheduler is a modern data orchestration platform for quickly building high-performance workflows with low-code.
An orchestration platform for the development, production, and observation of data assets.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
🦀 Event stream processing for developers to collect and transform data in motion to power responsive, data-intensive applications.
Build data pipelines, the easy way 🛠️
Preswald is a WASM packager for Python-based interactive data apps: bundle complete, complex data workflows, particularly visualizations, into single files that run entirely in-browser using Pyodide, DuckDB, Pandas, Plotly, Matplotlib, and more. Build dashboards, reports, and notebooks that run offline, load fast, and share like a document.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
A system for agentic LLM-powered data processing and ETL
The best place to learn data engineering. Built and maintained by the data engineering community.
MLeap: Deploy ML Pipelines to Production
Concurrent Python made simple
The Feldera Incremental Computation Engine
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Visual Data Preparation and Transformation. Low-Code Python-based ETL.