Project |
Aliases |
Description |
Sponsor (Champion) |
Mentors |
Start Date |
Toree |
|
Toree provides applications with a mechanism to interactively and remotely access Apache Spark. |
Incubator
(Sam Ruby)
|
Luciano Resende, Ryan Blue, Weiwei Yang
|
2015-12-02 |
Pony Mail |
|
Pony Mail is a mail-archiving, archive-viewing, and interaction service that can be integrated with many email platforms. |
Incubator
(Suneel Marthi)
|
John D. Ament
|
2016-05-27 |
Annotator |
|
Annotator provides annotation-enabling code for browsers, servers, and humans. |
Incubator
(Daniel Gruno)
|
Nick Kew, Tommaso Teofili, Benjamin Young
|
2016-08-30 |
Livy |
|
Livy is a web service that exposes a REST interface for managing long-running Apache Spark contexts in your cluster. With Livy, new applications that require fine-grained interaction with many Spark contexts can be built on top of Apache Spark. |
Incubator
(Sean Busbey)
|
Bikas Saha, Luciano Resende, Jean-Baptiste Onofré, Madhawa Kasun Gunasekara
|
2017-06-05 |
Nemo |
|
Nemo is a data processing system to flexibly control the runtime behaviors of a job to adapt to varying deployment characteristics. |
Incubator
(Byung-Gon Chun)
|
Hyunsik Choi, Byung-Gon Chun, Jean-Baptiste Onofré, Markus Weimer
|
2018-02-04 |
Training |
|
The Training project aims to develop resources which can be used for training purposes in various media formats, languages and for various Apache and non-Apache target projects. |
Incubator
(Lars Francke)
|
Craig Russell, Christofer Dutz, Justin Mclean, Lars Francke
|
2019-02-21 |
Teaclave |
mesatee |
Teaclave is a universal secure computing platform. |
Incubator
(Zhijie Shen)
|
Felix Cheung, Furkan Kamaci, Jianyong Dai, Matt Sicker, Zhijie Shen, Gordon King
|
2019-08-20 |
NLPCraft |
|
A Java API for NLU applications |
Incubator
(Konstantin Boudnik)
|
Furkan Kamaci, Evans Ye, Paul King, Konstantin I Boudnik
|
2020-02-13 |
Pegasus |
|
Pegasus is a distributed key-value storage system which is designed to be simple, horizontally scalable, strongly consistent and high-performance. |
Incubator
(Von Gosling)
|
Duo Zhang, Liang Chen, Von Gosling, Liu Xun
|
2020-06-28 |
Wayang |
|
Wayang is a cross-platform data processing system that aims at decoupling the business logic of data analytics applications from concrete data processing platforms, such as Apache Flink or Apache Spark. Hence, it tames the complexity that arises from the "Cambrian explosion" of novel data processing platforms that we currently witness. |
Incubator
(Christofer Dutz)
|
Christofer Dutz, Lars George, Bernd Fondermann, Jean-Baptiste Onofré
|
2020-12-16 |
HugeGraph |
|
A large-scale and easy-to-use graph database |
Incubator
(Willem Ning Jiang)
|
Lidong Dai, Trista Pan, Xiangdong Huang, Yu Li, Willem Ning Jiang
|
2022-01-23 |
DevLake |
|
DevLake is a development data platform, providing the data infrastructure for developer teams to analyze and improve their engineering productivity. |
Incubator
(Willem Ning Jiang)
|
Felix Cheung, Liang Zhang, Lidong Dai, Sijie Guo, Jean-Baptiste Onofré, Willem Ning Jiang
|
2022-04-29 |
Baremaps |
|
Apache Baremaps is a toolkit and a set of infrastructure components for creating, publishing, and operating online maps. |
Incubator
(Bertrand Delacretaz)
|
Bertrand Delacretaz, Martin Desruisseaux, Julian Hyde, Calvin Kirs, George Percivall
|
2022-10-10 |
KIE |
|
KIE (Knowledge is Everything) is a community of solutions and supporting tooling for knowledge engineering and process automation, focusing on events, rules, and workflows. |
Incubator
(Brian Proffitt)
|
Brian Proffitt, Claus Ibsen, Andrea Cosentino
|
2023-01-13 |
ResilientDB |
|
ResilientDB is a distributed blockchain framework that is open-source, lightweight, modular, and highly performant. |
Incubator
(Atri Sharma)
|
Junping Du, Calvin Kirs, Kevin Ratnasekera
|
2023-10-21 |
Seata |
|
Seata (Simple Extensible Autonomous Transaction Architecture) is an easy-to-use, high-performance distributed transaction solution used to solve the data consistency problem. |
Incubator
(Sheng Wu)
|
Sheng Wu, Justin Mclean, Huxing Zhang, Heng Du, Xin Wang
|
2023-10-29 |
HoraeDB |
|
HoraeDB is a high-performance, distributed, cloud native time-series database. |
Incubator
(tison)
|
tison, Shaofeng Shi, Gang Li, Von Gosling
|
2023-12-11 |
Fury |
|
A blazing fast multi-language serialization framework powered by JIT and zero-copy |
Incubator
(tison)
|
tison, PJ Fanning, Yu Li, Xin Wang, Enrico Olivelli, Hao Ding
|
2023-12-15 |
Gluten |
|
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. |
Incubator
(Shaofeng Shi)
|
Yu Li, Wenli Zhang, Kent Yao, Shaofeng Shi, Felix Cheung
|
2024-01-11 |
XTable |
|
XTable is an omni-directional converter for table formats that facilitates interoperability across data processing systems and query engines. |
Incubator
(Jesús Camacho Rodríguez)
|
Jesús Camacho Rodríguez, Stamatis Zampetakis, Jean-Baptiste Onofré
|
2024-02-11 |
Amoro |
|
Amoro is a Lakehouse management system built on open data lake formats like Apache Iceberg and Apache Paimon. |
Incubator
|
Justin Mclean, Zhongyi Tan, Yu Li, Xinyu Zhou, Kent Yao
|
2024-03-11 |
StormCrawler |
|
StormCrawler is a collection of resources for building low-latency, customisable and scalable web crawlers on Apache Storm. |
Incubator
(PJ Fanning)
|
Dave Fisher, Lewis John McGibbney, Ayush Saxena, PJ Fanning
|
2024-03-19 |
GraphAr |
|
GraphAr is an open-source and language-independent data file format designed for efficient graph data storage and retrieval. |
Incubator
(Yu Li)
|
Calvin Kirs, tison, Xiaoqiao He, Yu Li
|
2024-03-25 |
HertzBeat |
|
HertzBeat is an easy-to-use, open source, real-time monitoring system. It features an agentless architecture, high-performance clustering, Prometheus compatibility, and powerful custom monitoring and status page building capabilities. |
Incubator
(Yonglun Zhang)
|
Yonglun Zhang, Yu Xiao, Justin Mclean, Francis Chuang
|
2024-04-05 |
Gravitino |
|
Gravitino is a high-performance, geo-distributed, and federated metadata lake designed to manage metadata seamlessly across diverse data sources, vendors, and regions. Its primary goal is to provide users with unified metadata access for both data and AI assets. |
Incubator
(JB Onofré)
|
Daniel Dai, Junping Du, Justin Mclean, Shaofeng Shi, Larry McCay, Jean-Baptiste Onofré
|
2024-06-04 |
OpenServerless |
|
OpenServerless is an open source, cloud-agnostic, serverless platform. It offers a complete environment for serverless application development, based on Kubernetes. With Apache OpenWhisk as its FaaS engine, it provides a unified developer experience with a plethora of services (SQL or NoSQL databases, key-value stores, object storage, LLM services, function schedulers) managed by the platform's core, the operator, along with tooling (the CLI) to simplify and interact with deployments, an integrated IDE, starter applications, and optimized runtimes integrated with the starters. |
Incubator
(JB Onofré)
|
Bertrand Delacretaz, Enrico Olivelli, François Papon, JB Onofré, PJ Fanning
|
2024-06-17 |
OzHera |
|
OzHera is an application observation platform (APM) for the cloud-native era, with the application as its core, integrating capabilities such as metric monitoring, trace tracking, logging, and alerting.
Incubator
(Duo Zhang)
|
Yu Xiao, Yu Li, Kevin Ratnasekera, Duo Zhang
|
2024-07-11 |
Polaris |
|
Polaris is a catalog for data lakes. It provides new levels of choice, flexibility and control over data, with full enterprise security and Apache Iceberg interoperability across a multitude of engines and infrastructure. |
Incubator
(JB Onofré)
|
Bertrand Delacretaz, Holden Karau, Kent Yao, Ryan Blue, JB Onofré
|
2024-08-09 |
Cloudberry |
|
Cloudberry Database, built on the latest PostgreSQL kernel, is one of the most advanced and mature open-source MPP (Massively Parallel Processing) databases available.
|
Incubator
(Roman Shaposhnik)
|
Roman Shaposhnik, Willem Ning Jiang, Kent Yao
|
2024-10-11 |
Otava |
hunter |
Otava is a command-line tool, written in Python, that detects statistically significant changes in time-series data stored either in databases or CSV files. Otava entered Incubation as Hunter.
|
Incubator
(Mick Semb Wever)
|
Dave Fisher, Enrico Olivelli, Lari Hotari, Mick Semb Wever
|
2024-11-27 |
Grails |
|
A powerful Groovy-based web application framework for the JVM built on top of Spring Boot |
Groovy
(Paul King)
|
Paul King, Soeren Glasius
|
2025-01-25 |
Iggy |
|
Iggy is a high-performance, ultra-low latency and large-scale persistent message streaming platform written in Rust. |
Incubator
(Yonik Seeley)
|
Hao Ding, Yonik Seeley, Zili Chen, Hulk Lin
|
2025-02-04 |
Hamilton |
|
Hamilton is a lightweight in-process framework to define, execute, and observe
directed acyclic graphs (DAGs) that express data transformations. In Hamilton,
one can express complex DAGs of transformations, ranging from dataframe transformations
(using pandas, Polars, PySpark) and machine learning pipelines through to regular
software engineering API requests and LLM API-based workflows. Observability hooks
are built into the framework. The Hamilton UI is a self-hostable service to capture
observability output from workflow runs. Apache Software Foundation incubation will
establish Hamilton as a community-driven standard.
|
Incubator
(PJ Fanning)
|
Kevin Ratnasekera, Ayush Saxena, PJ Fanning
|
2025-04-12 |
Texera |
|
Texera is an open-source system to support collaborative data science, AI, and ML
using GUI-based workflows. Our vision is to develop a system to support cloud
platforms on which users can easily analyze data and use AI/ML techniques provided
as operators. Users with various backgrounds, regardless of whether they can
code, can collaborate on the same project to construct a pipeline.
Experienced users can use programming languages such as Python, R, Java, and Scala
to implement customized computation logic. The platform allows users to pause the
execution of a workflow to investigate the operator states, and resume the execution
at a later time. The platform can be used by a research community to publish valuable
resources such as data sets, workflows, and ML models to share their domain-specific
knowledge and support reproducibility of scientific research. The platform also allows
users to elastically request computing resources from public clouds for
computationally intensive tasks.
|
Incubator
(PJ Fanning)
|
Cezar Andrei, Gordon King, PJ Fanning
|
2025-04-12 |
PouchDB |
|
PouchDB is an open-source JavaScript database inspired by Apache CouchDB that is designed to run well within the browser. |
Incubator
(Jan Lehnardt)
|
PJ Fanning, Jean-Baptiste Onofré
|
2025-04-15 |