Count: 44 PPMCs (history)

Mean age: 328 days

Median age: 459 days

Currently in incubation, sorted by age

| Project | Description | Sponsor (Champion) | Mentors | Start Date |
|---|---|---|---|---|
| Wave | A wave is a hosted, live, concurrent data structure for rich communication. It can be used like email, chat, or a document. | Incubator | Christian Grobmeier, Upayavira | 2010-12-04 |
| ODF Toolkit | Java modules that allow programmatic creation, scanning and manipulation of OpenDocument Format (ISO/IEC 26300 == ODF) documents. | Incubator | Sam Ruby, Nick Burch, Yegor Kozlov | 2011-08-01 |
| Blur | Blur is a search platform capable of searching massive amounts of data in a cloud computing environment. | Incubator (Patrick Hunt) | Doug Cutting, Patrick Hunt, Tim Williams | 2012-07-24 |
| Ripple | Ripple is a browser-based mobile phone emulator designed to aid in the development of HTML5-based mobile applications. Ripple is a cross-platform and cross-runtime testing/debugging tool. It currently supports such runtimes as Cordova, WebWorks and the Mobile Web. | Incubator (Ross Gardler) | Jukka Zitting, Christian Grobmeier, Andrew Savory | 2012-10-16 |
| Streams | Apache Streams is a lightweight server for ActivityStreams. | Incubator (Matt Franklin) | Matt Franklin, Ate Douma, Craig McClanahan | 2012-11-20 |
| MRQL | MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. | Incubator (Edward J. Yoon) | Alan Cabrera, Edward J. Yoon, Mohammad Nour El-Din | 2013-03-13 |
| Sentry | Sentry is a highly modular system for providing fine-grained role-based authorization to both data and metadata stored on an Apache Hadoop cluster. | Incubator | Arvind Prabhakar, Joe Brockmeier, David Nalley, Olivier Lamy, Patrick Hunt, Thomas White | 2013-08-08 |
| BatchEE | The BatchEE project aims to provide a JBatch implementation (aka JSR352) and a set of useful extensions for this specification. | Incubator (FIXME) | Jean-Baptiste Onofré, Olivier Lamy, Mark Struberg | 2013-10-03 |
| Sirona | Monitoring Solution. | Incubator (Olivier Lamy) | Olivier Lamy, Henri Gomez, Jean-Baptiste Onofre, Tammo van Lessen, Mark Struberg | 2013-10-15 |
| Twill | Twill is an abstraction over Apache Hadoop YARN that reduces the complexity of developing distributed applications, allowing developers to focus more on their business logic. | Incubator (Vinod K) | Arun C Murthy, Tom White, Patrick Hunt, Andrei Savu | 2013-11-14 |
| log4cxx2 | Logging for C++ | Logging Services (Christian Grobmeier) | Christian Grobmeier, Scott Deboy | 2013-12-09 |
| DataFu | DataFu provides a collection of Hadoop MapReduce jobs and functions in higher level languages based on it to perform data analysis. It provides functions for common statistics tasks (e.g. quantiles, sampling), PageRank, stream sessionization, and set and bag operations. DataFu also provides Hadoop jobs for incremental data processing in MapReduce. | Incubator (Jakob Homan) | Ashutosh Chauhan, Roman Shaposhnik, Ted Dunning | 2014-01-05 |
| Slider | Slider is a collection of tools and technologies to package, deploy, and manage long running applications on Apache Hadoop YARN clusters. | Incubator (Vinod K) | Arun C Murthy, Devaraj Das, Jean-Baptiste Onofré, Mahadev Konar | 2014-04-29 |
| Johnzon | Implementation of JSR-353 Java™ API for JSON Processing (Renamed from Fleece) | Incubator (Mark Struberg) | Justin Mclean, Daniel Kulp | 2014-06-09 |
| Ranger | The Ranger project is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. | Incubator (Owen O'Malley) | Alan Gates, Daniel Gruno, Devaraj Das, Jakob Homan, Owen O'Malley | 2014-07-24 |
| REEF | REEF (Retainable Evaluator Execution Framework) is a scale-out computing fabric that eases the development of Big Data applications on top of resource managers such as Apache YARN and Mesos. | Incubator | Chris Douglas, Chris Mattmann, Ross Gardler, Owen O'Malley | 2014-08-12 |
| Taverna | Taverna is a domain-independent suite of tools used to design and execute data-driven workflows. | Incubator (Andy Seaborne) | Andy Seaborne, Chris Mattmann, Suresh Srinivas, Suresh Marru, Marlon Pierce | 2014-10-20 |
| HTrace | HTrace is a tracing framework intended for use with distributed systems written in Java. | Incubator (Roman Shaposhnik) | Jake Farrell, Todd Lipcon, Lewis John Mcgibbney, Andrew Purtell, Billie Rinaldi, Michael Stack | 2014-11 |
| Tamaya | Tamaya is a highly flexible configuration solution based on a modular, extensible and injectable key/value based design, which should provide a minimal but extendible modern and functional API leveraging SE, ME and EE environments. | Incubator (David Blevins) | John D. Ament, Mark Struberg, Gerhard Petracek, David Blevins | 2014-11-14 |
| Kylin | Kylin is a distributed and scalable OLAP engine built on Hadoop to support extremely large datasets. | Incubator (Owen O'Malley) | Owen O'Malley, Ted Dunning, Henry Saputra, Julian Hyde, P. Taylor Goetz | 2014-11-25 |
| Corinthia | Corinthia is a toolkit/application for converting between and editing common office file formats, with an initial focus on word processing. It is designed to cater for multiple classes of platforms - desktop, web, and mobile - and relies heavily on web technologies such as HTML, CSS, and JavaScript for representing and manipulating documents. The toolkit is small, portable, and flexible, with minimal dependencies. The target audience is developers wishing to include office viewing, conversion, and editing functionality into their applications. | Incubator (Jan Iversen) | Daniel Gruno, Jan Iversen, Dave Fischer | 2014-12-08 |
| SAMOA | SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms that run on top of distributed stream processing engines (DSPEs). It features a pluggable architecture that allows it to run on several DSPEs such as Apache Storm, Apache S4, and Apache Samza. | Incubator (Daniel Dai) | Alan Gates, Ashutosh Chauhan, Enis Soztutar, Ted Dunning | 2014-12-15 |
| Zeppelin | A collaborative data analytics and visualization tool for distributed, general-purpose data processing systems such as Apache Spark, Apache Flink, etc. | Incubator (Roman Shaposhnik) | Konstantin Boudnik, Henry Saputra, Roman Shaposhnik, Ted Dunning, Hyunsik Choi | 2014-12-23 |
| TinkerPop | TinkerPop is a graph computing framework written in Java. | Incubator (David Nalley) | Rich Bowen, Daniel Gruno, Hadrian Zbarcea, Matt Franklin, David Nalley | 2015-01-16 |
| OpenAz | Tools and libraries for developing Attribute-based Access Control (ABAC) Systems in a variety of languages. | Incubator (Paul Fremantle) | Emmanuel Lecharny, Colm O Heigeartaigh, Hadrian Zbarcea | 2015-01-20 |
| AsterixDB | Apache AsterixDB is a scalable big data management system (BDMS) that provides storage, management, and query capabilities for large collections of semi-structured data. | Incubator (Chris Mattmann) | Ate Douma, Chris Mattmann, Henry Saputra, Jochen Wiedmann, Ted Dunning | 2015-02-28 |
| Myriad | Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. | Incubator (Benjamin Hindman) | Benjamin Hindman, Danese Cooper, Ted Dunning, Luciano Resende | 2015-03-01 |
| CommonsRDF | Commons RDF is a set of interfaces and classes for RDF 1.1 concepts and behaviours. The commons-rdf-api module defines interfaces and testing harness. The commons-rdf-simple module provides a basic reference implementation to exercise the test harness and clarify API contracts. | Incubator (Lewis John McGibbney) | Rob Vesse, John D Ament, Gary Gregory | 2015-03-06 |
| Groovy | Groovy is an object-oriented programming language for the Java platform. It is a language with features similar to those of Python, Ruby, Java, Perl, and Smalltalk. | Incubator (Roman Shaposhnik) | Andrew Bayer, Konstantin Boudnik, Bertrand Delacretaz, Jim Jagielski, Emmanuel Lecharny, Roman Shaposhnik | 2015-03-17 |
| Singa | Singa is a distributed deep learning platform. | Incubator (Thejas Nair) | Daniel Dai, Alan Gates, Ted Dunning, Thejas Nair | 2015-03-17 |
| Geode | Geode is a data management platform that provides real-time, consistent access to data-intensive applications throughout widely distributed cloud architectures. | Incubator (Roman Shaposhnik) | Konstantin Boudnik, Chip Childers, Justin Erenkrantz, Jan Iversen, Chris Mattmann, William A. Rowe Jr., Roman Shaposhnik | 2015-04-27 |
| Atlas | Apache Atlas is a scalable and extensible set of core foundational governance services that enables enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allows integration with the complete enterprise data ecosystem. | Incubator (Jitendra Nath Pandey) | Arun Murthy, Chris Douglas, Jakob Homan, Vinod Kumar Vavilapalli | 2015-05-05 |
| Climate Model Diagnostic Analyzer | CMDA provides web services for multi-aspect physics-based and phenomenon-oriented climate model performance evaluation and diagnosis through the comprehensive and synergistic use of multiple observational data, reanalysis data, and model outputs. | Incubator (Chris Mattmann) | James W. Carman, Chris Mattmann, Michael James Joyce, Kim Whitehall, Gregory D. Reddin | 2015-05-08 |
| Trafodion | Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Hadoop. | Incubator (Michael Stack) | Andrew Purtell, Devaraj Das, Enis Söztutar, Lars Hofhansl, Michael Stack | 2015-05-24 |
| Cotton | Cotton is an Apache Mesos framework for running MySQL instances. | Incubator (Jake Farrell) | Dave Lester, Benjamin Hindman, Henry Saputra | 2015-06 |
| FreeMarker | FreeMarker is a template engine, i.e. a generic tool to generate text output based on templates. FreeMarker is implemented in Java as a class library for programmers. | Incubator (Jacopo Cappellato) | Jacopo Cappellato, Jean-Frederic Clere, David E. Jones, Ralph Goers, Sergio Fernández | 2015-07-01 |
| Apex | Apex is an enterprise grade native YARN big data-in-motion platform that unifies stream processing as well as batch processing. | Incubator (Ted Dunning) | Chris Nauroth, Alan Gates, Hitesh Shah, Justin Mclean, P. Taylor Goetz, Ted Dunning | 2015-08-17 |
| HAWQ | HAWQ is an advanced enterprise SQL on Hadoop analytic engine built around a robust and high-performance massively-parallel processing (MPP) SQL framework evolved from Pivotal Greenplum Database. | Incubator (Roman Shaposhnik) | Alan Gates, Konstantin Boudnik, Justin Erenkrantz, Thejas Nair, Roman Shaposhnik | 2015-09-04 |
| HORN | HORN is a neuron-centric programming API and execution framework for large-scale deep learning, built on top of Apache Hama. | Incubator (Edward J. Yoon) | Luciano Resende, Robin Anil, Edward J. Yoon, Rich Bowen | 2015-09-04 |
| MADlib | Big Data Machine Learning in SQL for Data Scientists. | Incubator (Roman Shaposhnik) | Konstantin Boudnik, Ted Dunning, Roman Shaposhnik | 2015-09-15 |
| Rya | Rya (pronounced "ree-uh" /rēə/) is a cloud-based RDF triple store that supports SPARQL queries. Rya is a scalable RDF data management system built on top of Accumulo. Rya uses novel storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes. Rya provides fast and easy access to the data through SPARQL, a conventional query mechanism for RDF data. | Incubator (Adam Fuchs) | Josh Elser, Edward J. Yoon, Sean Busbey, Venkatesh Seetharam | 2015-09-18 |
| Unomi | Unomi is a reference implementation of the OASIS Context Server specification currently being worked on by the OASIS Context Server Technical Committee. It provides a high-performance user profile and event tracking server. | Incubator (Jean-Baptiste Onofre) | Bertrand Delacretaz, Roman Shaposhnik, Chris Mattmann | 2015-10-05 |
| Concerted | Apache Concerted is a Do-It-Yourself toolkit for building in-memory data engines. | Incubator (Roman Shaposhnik) | Chris Nauroth, Daniel Dai, Jake Farrell, Julian Hyde, Lars Hofhansl | 2015-10-14 |
| Mynewt | Mynewt is a real-time operating system for constrained embedded systems like wearables, lightbulbs, locks and doorbells. It works on a variety of 32-bit MCUs (microcontrollers), including ARM Cortex-M and MIPS architectures. | Incubator (Marvin Humphrey) | Sterling Hughes, Jim Jagielski, Justin Mclean, Greg Stein, P. Taylor Goetz | 2015-10-20 |
| Eagle | Eagle is a monitoring solution for Hadoop to instantly identify access to sensitive data, recognize attacks and malicious activities, and take actions in real time. | Incubator (Henry Saputra) | Owen O'Malley, Henry Saputra, Julian Hyde, P. Taylor Goetz, Amareshwari Sriramdasu | 2015-10-26 |
| SystemML | SystemML provides declarative large-scale machine learning (ML) that aims at flexible specification of ML algorithms and automatic generation of hybrid runtime plans ranging from single node, in-memory computations, to distributed computations such as Apache Hadoop MapReduce and Apache Spark. | Incubator (Luciano Resende) | Luciano Resende, Patrick Wendell, Reynold Xin, Rich Bowen | 2015-11-02 |