This was extracted (@ 2024-05-15 21:10) from a list of minutes which have been approved by the Board.
Please Note The Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

WARNING: these pages may omit some original contents of the minutes.
This is due to changes in the layout of the source minutes over the years. Fixes are being worked on.

Meeting times vary, the exact schedule is available to ASF Members and Officers, search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).

MADlib

17 Apr 2024 [Ed Espino / Rich]

## Description:
- Apache MADlib is a scalable, big data, SQL-driven machine learning
framework for data scientists.
## Project Status:
- Two community developers (Nikhil Kak & Ekta Khanna) participated in
 addressing MADLIB-1517. As part of that work, several project updates
 (NOTICE year, Wiki) were performed as well.
- The community will mainly continue to work on bug fixes for existing
 features, prioritized by severity; no new features are planned for
 upcoming releases.
- With few bugs reported, there are no plans for a new release anytime soon.
- The project is still actively used by the Greenplum Database project (mainly
 sponsored by Broadcom).
- The Greenplum Database project continues to strive to improve adoption of
 Apache MADlib by ensuring that plans generated by Greenplum are more
 performant. This work however is on the Greenplum side rather than MADlib.
## Project Activity:
- Release 2.1.0 occurred on September 8, 2023 which was
the 13th release as an Apache TLP project.
- Community plans to work on the following JIRAs for the next release:
* Fix empty string handling of grouping columns in regression model training
## Community Health:
- The community is small with 2 active committers since last report.
- Community focus is mainly on adoption of existing features and fixing bugs
as reported.
- There are no future releases planned.
## Membership Data:
- Currently stands at 12 PMC members, no new members added since last report
- Last addition was Chris Hajas on 2023-03-22.
## Committer base changes:
- Currently 23 committers, no new committers since last report.
- Last addition was David Kimura on 2023-03-23.
## Releases:
- v2.1.0 released on 2023-09-08
- v2.0.0 was released on 2023-06-23
- v1.21.0 was released on 2023-03-01
## Mailing list activity:
No activity on mailing lists since release announcement.

@Christofer: pursue a roll call vote

17 Jan 2024 [Ed Espino / Sharan]

## Description:
Apache MADlib is an open-source library for scalable in-database analytics.
It provides data-parallel implementations of mathematical, statistical,
graph and machine learning methods for structured and unstructured data.

## Project Status:
- The project has been relatively quiet over the past five months.
- There are no issues requiring board attention at this time.

## Membership Data:
Apache MADlib was founded 2017-07-18 (6 years ago)

There are currently 23 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.

## Project Activity:
- No releases have occurred since last project report (Oct 2023).
- Two issues (improvement: MADLIB-1516, bug: MADLIB-1517) may be
 worked on for the next release.

## Community Health:
We continue to have good voting participation from the newly formed PMC members.

@Rich: follow up on MADlib report prior to April

18 Oct 2023 [Ed Espino / Bertrand]

## Description:
Apache MADlib is an open-source library for scalable in-database analytics.
It provides data-parallel implementations of mathematical, statistical,
graph and machine learning methods for structured and unstructured data.

## Project Status:
- On the Apache MADlib v2 code base, the project completed its first
 minor (2.1.0) release.
- The project is maintaining a healthy Jira issue management level.
- There are no issues requiring board attention at this time.

## Membership Data:
Apache MADlib was founded 2017-07-18 (6 years ago)

There are currently 23 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.

## Project Activity:
Apache MADlib v2.1.0 was released on 2023-09-08

Improvements
- Build: Fix PG 15 support
- Assoc_rules: Fix SERIAL cache issue
- DL: Remove SERIAL from load_keras_model
- Build: Add ubuntu flag for PyXB installation
- Build: Add the actual path of $libdir to dynamic_library_path
- Build: Remove PyXB as a packaged dependency and replace it with external
 pyxb-x dependency.
- Build: Use PG15 in Jenkins CI
- CRF: Fix anyarray -> anycompatiblearray change for PG14

Release Manager
- Orhan Kislal

Vote Results
- The vote for releasing Apache MADlib 2.1.0 (RC2) passed with 4
 binding +1s and no 0 or -1 votes.

## Community Health:
We continue to have good voting participation from the newly formed PMC members.

@Sharan: follow up on committer and PMC membership changes

19 Jul 2023 [Ed Espino / Bertrand]

## Description
Apache MADlib is an open-source library for scalable in-database analytics.
It provides data-parallel implementations of mathematical, statistical,
graph and machine learning methods for structured and unstructured data.

## Project Status

- The project completed a major (2.0.0) release. This is an important
 milestone for the project.
- The project is maintaining a healthy Jira issue management level.
- There are no issues requiring board attention at this time.

## Membership Data
Apache MADlib was founded 2017-07-18 (6 years ago)

There are currently 23 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.

## Project Activity

* v2.0.0 was released on 2023-06-23

* New features
- Build: Add support for python3
- Build: Add support for GP7 Beta, GP6 python3 extension, Postgres 13/14/15

* Improvements
   - XGBoost: Add support for version 1.7.5
   - DL: Add support for tensorflow 2.10.1 and keras 2.10.0
   - DBScan: Add support for rtree 1.0.1

** Release Manager

Ed Espino served in the release manager capacity.

** Vote Results

The vote for releasing Apache MADlib 2.0.0 (RC1) passed with 4 binding
+1s and no 0 or -1 votes.

* Upgrading from MADlib 1.X to MADlib 2.X is not supported.

## Community Health

* We continue to have good voting participation from the newly formed
 PMC members.

* The dev community has started discussions on expanding the Apache
 MADlib adoption by discussing various projects (PostgreSQL, other
 PostgreSQL compliant backends - e.g. https://neon.tech/).

19 Apr 2023 [Ed Espino / Roman]

## Description:
Apache MADlib is an open-source library for scalable in-database analytics.
It provides data-parallel implementations of mathematical, statistical,
graph and machine learning methods for structured and unstructured data.

## Issues:
- There are no issues requiring board attention at this time.

## Membership Data:
At the ASF Board Meeting on March 22, 2023, Roman Shaposhnik (rvs) put
forth a proposal to "Reboot the Apache MADlib Project PMC". The
proposal passed unanimously. Here is the new PMC roster:

- Atri Sharma (2017-07-19)
- Chris Hajas (2023-03-23)
- David Kimura (2023-03-23)
- Ed Espino (2023-03-23)
- Ekta Khanna (2021-02-16)
- Greg Chase (2017-07-19)
- Hansome Yuan (2023-03-23)
- Jingyu Wang (2023-03-23)
- Nikhil Kak (2019-02-20)
- Orhan Kislal (2017-07-19)
- Roman Shaposhnik (2017-07-19)
- Venkatesh Raghavan (2023-03-23)

Ed Espino now serves as the Apache MADlib PMC chair. Ed has been added
to pmc-chairs (INFRA-24380). Ed has been added as a moderator to the
project's mailing lists (commits, dev, issues, private, user).

All team members have been subscribed to the private@madlib.apache.org
and dev@madlib.apache.org mailing lists.

Apache MADlib was founded 2017-07-18 (6 years ago)

There are currently 23 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.

## Project Activity:
* v1.21.0 was released on 2023-03-01

 New features include:
 - Graph: Add warm start for weakly connected components.
 - Graph: Add multicolumn identifier support for SSSP and APSP.
 - Build: Add support for Photon3 OS.

 Improvements:
 - XGBoost: Add support for bigint and varchar columns.
 - XGBoost: Enable eval_metrics parameter.

 Venkatesh (Venky) Raghavan served in the release manager capacity.

 The vote for releasing Apache MADlib 1.21.0 (RC2) passed with 4
 binding +1s, 1 non-binding +1, and no 0 or -1 votes.

 Official Vote thread:
   https://s.apache.org/5eghz

* The Apache MADlib project's web site was deficient in adhering to
 the Website Navigation Links Policy
 (https://www.apache.org/foundation/marks/pmcs#navigation).

 The following have been corrected (as reported in
 https://whimsy.apache.org/site/project/madlib)
 - Events
 - License
 - Thanks
 - Security
 - Sponsorship
 - Privacy
 - External resources

## Community Health:
* New PMC members have confirmed they are on the private and
 dev mailing lists.

* On March 2nd, Orhan Kislal put forth a proposal to support
 Python 3. This work will be targeted for the Apache MADlib v2.0
 release. The proposal passed unopposed.
 Proposal Thread:
 https://s.apache.org/pt3oa

* Thank you Roman Shaposhnik (rvs) for your dedication to ASF and in turn to
 the Apache MADlib project.

* Thank you ASF infrastructure for assisting with the project reboot.

* For the next project report, we hope to review and report on various
 facets of the project (project wiki & website, roadmap, jira,
 infrastructure, release processes, dev community participation).

15 Feb 2023 [Aaron Feng / Roy]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- Apache MADlib held a vote regarding moving it to the attic on 10/15/2022.
 The vote was not held in accordance with the Apache rules and was
 subsequently invalidated.


## Project Activity:

There hasn't been a new release since the last report.


## Community Health:

The community is small with a single active committer since the last report.
Two Apache members, Venkatesh Raghavan and Ed Espino, have shown interest in
joining as PMC members to shepherd the project.

## Membership Data:

- Currently stands at 16 PMC members, no new members added since the last report
- The most recent PMC members added were:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)


## Committer base changes:

- Currently 17 committers, no new committers since the last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: Currently working on v1.20.0
- v1.19.0 released on 2022-03-08
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

The mailing list activity was 7 posts to dev@
and 0 posts to user@ for the last 3 months Nov 2022-Jan 2023.


## JIRA Statistics:

- 0 JIRA tickets were created in the 3 months
- 0 JIRA tickets were resolved in the 3 months

@Roman: follow up on pending interested PMC members to MADlib

18 Jan 2023 [Aaron Feng / Sharan]

No report was submitted.

19 Oct 2022

Terminate the Apache MADlib Project

 WHEREAS, the Project Management Committee of the Apache MADlib project
 has chosen by vote to recommend moving the project to the Attic; and

 WHEREAS, the Board of Directors deems it no longer in the best interest
 of the Foundation to continue the Apache MADlib project due to
 inactivity;

 NOW, THEREFORE, BE IT RESOLVED, that the Apache MADlib project is
 hereby terminated; and be it further

 RESOLVED, that the Attic PMC be and hereby is tasked with oversight
 over the software developed by the Apache MADlib Project; and be it
 further

 RESOLVED, that the office of "Vice President, Apache MADlib" is hereby
 terminated; and be it further

 RESOLVED, that the Apache MADlib PMC is hereby terminated.

 Special Order 7C, Terminate the Apache MADlib Project, was
 tabled.

19 Oct 2022 [Aaron Feng / Roy]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning
framework for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Project Activity:

- Release 1.20.0 occurred on Aug 5, 2022 which was
the 10th release as an Apache TLP project.

New features include:
XGBoost: Python based XGBoost with single and grid search executions
(MADLIB-1425, MADLIB-1490)
Graph: Add multicolumn support for WCC and Pagerank (MADLIB-1502, MADLIB-1503)

Improvements:
Utilities: Reuse update plan in GroupIterationController
Documentation: Update online examples for various modules

Bug fixes:
Elastic Net - GLM - SVM: Adjust ORCA to reduce planning time

## Community Health:

The community is relatively small but very engaged with robust mailing
list traffic, interest in doing frequent releases, and new
functionality being developed by contributors.

The number of developers actively contributing to the code/documentation
is approximately 2 in the 3rd quarter of the calendar year 2022.

We will constantly be on the lookout for new community members to be
invited either as committers or PMC.


## Membership Data:

- Currently stands at 16 PMC members, no new members added since the last report
- The most recent PMC members added were:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)


## Committer base changes:

- Currently 17 committers, no new committers since the last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: Currently working on v1.21.0
- v1.20.0 released on 2022-08-03
- v1.19.0 released on 2022-03-08
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

The mailing list activity was 46 posts to dev@
and 6 posts to user@ for the last 3 months Jul-Oct 2022.


## JIRA Statistics:

- 6 JIRA tickets were created in the 3 months
- 2 JIRA tickets were resolved in the 3 months

20 Jul 2022 [Aaron Feng / Rich]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Project Activity:

- Release 1.19.0 occurred on Mar 8, 2022 which was the 9th release as an
 Apache TLP project.

New features include:
DBSCAN: Fast parallel-optimized DBSCAN.
MLP: Add rmsprop and Adam optimization techniques.

Improvements:
Graph: Improve WCC subtx count and catalog entry frequency.
MLP: Set lambda value for minibatch.
GLM-multinom: Use non-temp tables in GroupIterationController.
Jenkins: Add new dockerfile for PG11.
Build: Use dynamic_library_path for module pathname.


## Community Health:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases, and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 3 in the 2nd quarter of the calendar year 2022.

We will constantly be on the lookout for new community members to be invited
either as committers or PMC.


## Membership Data:

- Currently stands at 16 PMC members, no new members added since the last
 report
- The most recent PMC members added were:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)


## Committer base changes:

- Currently 17 committers, no new committers since the last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: Currently working on v1.20.0
- v1.19.0 released on 2022-03-08
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

The mailing list activity was 31 posts to dev@ and 5 posts to user@ for the
last 3 months Apr-Jul 2022.


## JIRA Statistics:

- 7 JIRA tickets were created in the 3 months
- 6 JIRA tickets were resolved in the 3 months

20 Apr 2022 [Aaron Feng / Rich]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning
framework for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Project Activity:

- Release 1.19.0 occurred on Mar 8, 2022 which was
the 9th release as an Apache TLP project.

New features include:
DBSCAN: Fast parallel-optimized DBSCAN.
MLP: Add rmsprop and Adam optimization techniques.

Improvements:
Graph: Improve WCC subtx count and catalog entry frequency.
MLP: Set lambda value for minibatch.
GLM-multinom: Use non-temp tables in GroupIterationController.
Jenkins: Add new dockerfile for PG11.
Build: Use dynamic_library_path for module pathname.


## Community Health:

The community is relatively small but very engaged with robust mailing
list traffic, interest in doing frequent releases and new
functionality being developed by contributors.

The number of developers actively contributing to the code/documentation
is approximately 3 in the 2nd quarter of calendar year 2022.

We will constantly be on the lookout for new community members to be
invited either as committers or PMC.


## Membership Data:

- Currently stands at 16 PMC members, no new members added since last report
- The most recent PMC members added were:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: Currently working on v1.20.0
- v1.19.0 released on 2022-03-08
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

Mailing list activity was 26 posts to dev@
and 14 posts to user@ for the last 3 months Jan-Mar 2022.


## JIRA Statistics:

- 4 JIRA tickets created in the 3 months
- 2 JIRA tickets resolved in the 3 months

19 Jan 2022 [Aaron Feng / Sam]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning
framework for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.18.0 occurred on Apr 5, 2021 which was
the 8th release as an Apache TLP project.

- Community is working on the 1.19.0 release including the following JIRAs:
* WCC: Optimize subtx count and catalog entry frequency
* next phase of DBSCAN clustering algorithm
* Deep learning minor fixes
* multilayer perceptron - added Adam and RMSprop optimizers
* Fix build failures for PMML and gppkg


## Health report:

The community is relatively small but very engaged with robust mailing
list traffic, interest in doing frequent releases and new
functionality being developed by contributors.

The number of developers actively contributing to the code/documentation
is approximately 3 in the 1st quarter of calendar year 2022.

We will constantly be on the lookout for new community members to be
invited either as committers or PMC.


## PMC changes:

- Currently stands at 16 PMC members, no new members added since last report
- The most recent PMC members added were:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: Currently working on v1.19.0
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

Mailing list activity was 4 posts to dev@
and 4 posts to user@ for the last 3 months Oct-Jan 2022.


## JIRA Statistics:

- 2 JIRA tickets created in the 3 months
- 3 JIRA tickets resolved in the 3 months

20 Oct 2021 [Aaron Feng / Craig]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning
framework for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.18.0 occurred on Apr 5, 2021 which was
the 8th release as an Apache TLP project.

- Community is working on the 1.19.0 release including the following JIRAs:
* next phase of DBSCAN clustering algorithm - merged in
* Deep learning minor fixes
* multilayer perceptron - added Adam and RMSprop optimizers
* Fix build failures for PMML and gppkg


## Health report:

The community is relatively small but very engaged with robust mailing
list traffic, interest in doing frequent releases and new
functionality being developed by contributors.

The number of developers actively contributing to the code/documentation
is approximately 3 in the 3rd quarter of calendar year 2021.

We will constantly be on the lookout for new community members to be
invited either as committers or PMC.


## PMC changes:

- Currently stands at 16 PMC members, no new members added since last report
- The most recent PMC members added were:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.19.0 planned for 2H 2021
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

Mailing list activity was 20 posts to dev@
and 5 posts to user@ for the last 3 months Apr-Jun 2021.


## JIRA Statistics:

- 3 JIRA tickets created in the 3 months
- 1 JIRA ticket resolved in the 3 months

21 Jul 2021 [Aaron Feng / Craig]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.18.0 occurred on Apr 5, 2021 which was the 8th release as an
 Apache TLP project.

- Community is working on the 1.19.0 release including the following JIRAs:
* multilayer perceptron - add Adam and RMSprop optimizers
* ARIMA - add GROUP BY feature
* weakly connected components and other graph methods - add incremental
 methods
* next phase of DBSCAN clustering algorithm

- Upcoming VLDB 2021 paper that includes recent work on Apache MADlib:
 https://adalabucsd.github.io/papers/2021_Cerebro-DS.pdf
 Several MADlib committers are co-authors on the paper together with
 UC San Diego researchers.


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 5 in the 2nd quarter of calendar year 2021.

We will constantly be on the lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- Most recent PMC members added:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)
- Currently stands at 16 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.19.0 planned for 2H 2021
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

Mailing list activity was 20 posts to dev@ and 5 posts to user@ for the last 3
months Apr-Jun 2021.


## JIRA Statistics:

- 3 JIRA tickets created in the 3 months
- 8 JIRA tickets resolved in the 3 months

21 Apr 2021 [Aaron Feng / Sheng]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.18.0 occurred on Apr 5, 2021 which was the 8th release as an
 Apache TLP project.

- Community is working on the 1.19.0 release including the following JIRAs:
* multilayer perceptron - add Adam and RMSprop optimizers
* ARIMA - add GROUP BY feature
* weakly connected components and other graph methods - add incremental
 methods
* next phase of DBSCAN clustering algorithm

- Recent blog post on Apache MADlib regarding the autoML 1.18.0 release
 feature:
https://tanzu.vmware.com/content/blog-tag-thought-leadership/massively-parallel-automated-model-building-for-deep-learning


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 5 in the 1st quarter of calendar year 2021.

We will constantly be on the lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- Added 2 new PMC members since last report:
Ekta Khanna (Feb 2021)
Domino Valdano (Feb 2021)
- Currently stands at 16 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.19.0 planned for 1H 2021
- v1.18.0 released on 2021-04-05
- v1.17.0 released on 2020-04-09
- v1.16.0 released on 2019-07-08


## Mailing list activity:

Mailing list activity was 85 posts to dev@ and 4 posts to user@ for the last 3
months Jan-Mar 2021.


## JIRA Statistics:

- 23 JIRA tickets created in the 3 months
- 19 JIRA tickets resolved in the 3 months

20 Jan 2021 [Aaron Feng / Justin]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.17.0 occurred on Apr 9, 2020 which was the 7th release as an
 Apache TLP project.

- Community is working on the 1.18.0 release with JIRAs related to deep
 learning and other ML methods:
* deep learning - improve GPU efficiency
* deep learning - support custom loss functions and custom metrics
* deep learning - add autoML methods Hyperband and Hyperopt
* DBSCAN clustering algorithm

- Recent blog post mentioning Apache MADlib:
https://tanzu.vmware.com/content/blog/analytic-workloads-bi-ai-vmware-tanzu-greenplum

## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 5 in the 4th quarter of calendar year 2020.

We will constantly be on the lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.18.0 planned for 1H 2021

- v1.17.0 released on 2020-04-09

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15


## Mailing list activity:

Mailing list activity was 32 posts to dev@ and 2 posts to user@ for the last 3
months Oct-Dec 2020.


## JIRA Statistics:

- 8 JIRA tickets created in the 3 months

- 5 JIRA tickets resolved in the 3 months

21 Oct 2020 [Aaron Feng / Shane]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.17.0 occurred on Apr 9, 2020 which was the 7th release as an
 Apache TLP project.

- Community is working on the 1.18.0 release with JIRAs related to deep
 learning and other ML methods:
* deep learning - improve GPU efficiency
* deep learning - support custom loss functions and custom metrics
* deep learning - add autoML methods Hyperband and Hyperopt
* DBSCAN clustering algorithm

- Community members presented sessions mentioning Apache MADlib on Greenplum
 at the VMworld conference in Sept 2020, e.g.
https://www.vmworld.com/en/video-library/video-landing.html?sessionid=1586467547979001ehEa
https://www.vmworld.com/en/video-library/video-landing.html?sessionid=1589580297282001SUMh


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 5 in the 3rd quarter of calendar year 2020.

We will constantly be on the lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.18.0 planned for 2H 2020

- v1.17.0 released on 2020-04-09

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15


## Mailing list activity:

Mailing list activity was 40 posts to dev@ and 1 post to user@ for the last 3
months Jul-Sep 2020.


## JIRA Statistics:

- 18 JIRA tickets created in the 3 months

- 18 JIRA tickets resolved in the 3 months

15 Jul 2020 [Aaron Feng / Sam]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Release 1.17.0 occurred on Apr 9, 2020 which was the 7th release as an
 Apache TLP project.

- Community is working on the 1.18.0 release with JIRAs related to deep
 learning and other ML methods:
* deep learning - improve GPU efficiency
* deep learning - support custom loss functions and custom metrics
* DBSCAN clustering algorithm
* add new solvers to multi-layer perceptron method

- Several new Jupyter notebook examples have been published to the community
 artifacts repo
 https://github.com/apache/madlib-site/tree/asf-site/community-artifacts


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 6 in the 2nd quarter of calendar year 2020.

We will constantly be on the lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were:
Ekta Khanna (2019-07-27)
Himanshu Pandey (2019-07-27)
Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.18.0 planned for 2H 2020

- v1.17.0 released on 2020-04-09

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15


## Mailing list activity:

Average monthly mailing list activity was 10 posts to dev@ and 4 posts to
user@ for the last 3 months Apr-Jun 2020.


## JIRA Statistics:

- 14 JIRA tickets created in the 3 months

- 2 JIRA tickets resolved in the 3 months

15 Apr 2020 [Aaron Feng / Craig]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
  for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Code complete and release in progress for 1.17 (as of the time of this
  writing), which will be the 7th release as an Apache TLP project.

- Main 1.17 JIRAs include:
* feature improvements for deep learning including training multiple models in
  parallel for parameter selection (hyper-parameter tuning and model
  architecture search), inference on models trained outside of MADlib, and
  performance improvements to mini-batch preprocessor and DL training
* performance improvements to correlation/covariance, association rules, and
  weakly connected components graph algorithm
* stopping criteria on LDA using perplexity
* auto selection of number of centroids for k-means clustering
* Postgres 12 support

- Next will be the 1.18 release with JIRAs related to deep learning and other
  ML methods

- Frank McQuillan (MADlib committer and PMC member) presented the latest deep
  learning work at FOSDEM'20 https://fosdem.org/2020/schedule/event/mppdb/ in
  a talk called: "Efficient Model Selection for Deep Neural Networks on
  Massively Parallel Processing Databases"

- Several new Jupyter notebook examples have been published to the community
  artifacts repo
  https://github.com/apache/madlib-site/tree/asf-site/community-artifacts


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 7 in the 1st quarter of calendar year 2020.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were: Ekta Khanna (2019-07-27) Himanshu
  Pandey (2019-07-27) Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.18 planned for 2H 2020

- v1.17.0 released early April 2020

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15


## Mailing list activity:

Average monthly mailing list activity was 56 posts to dev@ and 5 posts to
user@ for the last 3 months Jan-Mar 2020.


## JIRA Statistics:

- 8 JIRA tickets created in the last month

- 15 JIRA tickets resolved in the last month

15 Jan 2020 [Aaron Feng / Danny]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Community is at work on the 1.17 release, which will be the 7th release as
 an Apache TLP project. Main JIRAs include:
* feature improvements for deep learning including training multiple models in
 parallel for parameter selection (hyper-parameter tuning and model
 architecture search), inference on models trained outside of MADlib, and
 performance improvements to mini-batch preprocessor
* performance improvements to correlation/covariance, association rules, and
 weakly connected components graph algorithm
* stopping criteria on LDA using perplexity
* auto selection of number of centroids for k-means clustering
* Postgres 12 support

- After that will be the 2.0 release with JIRAs related to versioning models.

- Frank McQuillan (MADlib committer and PMC member) will present the latest
 deep learning work at FOSDEM'20
 https://fosdem.org/2020/schedule/event/mppdb/ in a talk called: "Efficient
 Model Selection for Deep Neural Networks on Massively Parallel Processing
 Databases"

## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 7 in the 4th quarter of calendar year 2019.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 17 committers, no new committers since last report.

- The most recent committers added were: Ekta Khanna (2019-07-27) Himanshu
 Pandey (2019-07-27) Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.17 planned for Jan 2020

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15

- v1.15.0 released on 2018-08-10


## Mailing list activity:

Average monthly mailing list activity was 138 posts to dev@ and 11 posts to
user@ for the last 3 months Oct-Dec 2019.


## JIRA Statistics:

- 2 JIRA tickets created in the last month

- 10 JIRA tickets resolved in the last month

16 Oct 2019 [Aaron Feng / Ted]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Community is at work on the 1.17 release, which will be the 7th release as
 an Apache TLP project. Main JIRAs include:
* feature improvements for deep learning including training multiple models in
 parallel for parameter selection (hyper-parameter tuning and model
 architecture search), inference on models trained outside of MADlib, and
 performance improvements to mini-batch preprocessor
* performance improvements to correlation/covariance, association rules, and
 weakly connected components graph algorithm
* stopping criteria on LDA using perplexity
* auto selection of number of centroids for k-means clustering

- After that will be the 2.0 release with JIRAs related to versioning models.

- Nikhil Kak and Nandish Jayaram (MADlib committers and PMC members) presented
 a community call on 2019-Aug-1 on the MADlib 1.16 release features:
 https://www.youtube.com/watch?v=uLW5By66Lf0

- Yuhao Zhang, a PhD candidate at University of California, San Diego
 completed his internship at Pivotal in Palo Alto on parameter selection in
 MADlib, which is an important area for deep learning practitioners.  Yuhao's
 advisor at UCSD is Arun Kumar in the Department of Computer Science and
 Engineering, whose research has contributed to MADlib in the past.  A
 presentation by Yuhao on his work on MADlib is at:
 https://www.youtube.com/watch?v=aZlKXqhyRKY

## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 7 in the 3rd quarter of calendar year 2019.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 17 committers.

- New committers added since last report: Ekta Khanna (2019-07-27) Himanshu
 Pandey (2019-07-27) Domino Valdano (2019-07-27)


## Releases:

- Next release: v1.17 planned for 4Q2019

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15

- v1.15.0 released on 2018-08-10


## Mailing list activity:

Average monthly mailing list activity was 503 posts to dev@ and 11 posts to
user@ for the last 3 months Jul-Sep 2019.


## JIRA Statistics:

- 3 JIRA tickets created in the last month

- 3 JIRA tickets resolved in the last month

17 Jul 2019 [Aaron Feng / Daniel]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Last release was 1.16 which was the 6th release as an Apache TLP project.
 This was a significant release that included initial support for distributed
 training of deep learning models with GPU acceleration, utilities to load
 model architectures and weights, preprocessing of images for mini-batch
 gradient descent, and support for Greenplum 6 and PostgreSQL 11.  Plus the
 usual bug fixes and minor improvements.

- Community is at work on the 1.17 release.  Scope is still being decided by
 the community, but JIRAs call for improvements to deep learning as a follow
 on to 1.16, and improvements to correlation/covariance, association rules
 and decision tree.

- After that will be the 2.0 release with JIRAs related to versioning models.

- Frank McQuillan (MADlib committer and PMC member) presented at Dell Tech
  World on 2019-Apr-30 on MADlib and Greenplum Database in a talk called "AI
  in a Box".

- Yuhao Zhang, a PhD candidate at University of California, San Diego is doing
 an internship at Pivotal in Palo Alto to work on parameter selection in
 MADlib, which is an important area for deep learning practitioners.  Yuhao's
 advisor at UCSD is Arun Kumar in the Department of Computer Science and
 Engineering, whose research has contributed to MADlib in the past.

## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 8 in the 2nd quarter of calendar year 2019.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in the last quarter.  Currently stands at 14 PMC members.


## Committer base changes:

- Currently 14 committers.

- Last committer additions were Jingyi Mei on 2018-06-14 and Nikhil Kak on
 2018-06-27.


## Releases:

- Next release: v1.17 planned for 3Q2019

- v1.16.0 released on 2019-07-08

- v1.15.1 released on 2018-10-15

- v1.15.0 released on 2018-08-10


## Mailing list activity:

Average monthly mailing list activity was 620 posts to dev@ and 11 posts to
user@ for the last 3 months Apr-Jun.


## JIRA Statistics:

- 12 JIRA tickets created in the last month

- 13 JIRA tickets resolved in the last month

17 Apr 2019 [Aaron Feng / Shane]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Last release was 1.15.1 which was the 5th release as an Apache TLP project.
 This was a minor release that included support for Ubuntu 16.04 as well as
 various feature improvements.

- Community is at work on the 1.16 release.  Key features are PostgreSQL 11
 support, a new method for k-NN nearest neighbors, and an early stage
 implementation of deep learning.

- After that will be the 2.0 release with JIRAs related to versioning models.

- Frank McQuillan (MADlib committer and PMC member) presented at FOSDEM’19 on
 2019-Feb-03 on deep learning on parallel databases, using MADlib and
 Greenplum Database as an example.

- Frank McQuillan also presented at PostgresConf 2019 March 18-22, New York on
 AI from model perspective


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 12 in the 1st quarter of calendar year 2019.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- Added Jingyi Mei (jingyimei@apache.org) and Nikhil Kak (nkak@apache.org) as
 new PMC members on 2019-Feb-20

- Jim Jagielski asked to be removed from the PMC

- Currently stands at 14 PMC members.


## Committer base changes:

- Currently 14 committers.

- Last committer additions were Jingyi Mei on 2018-06-14 and Nikhil Kak on
 2018-06-27.


## Releases:

- Next release: v1.16 planned for 1H2019

- v1.15.1 released on 2018-10-15

- v1.15.0 released on 2018-08-10

- v1.14.0 released on 2018-05-01



## Mailing list activity:

Average monthly mailing list activity was 243 posts to dev@ and 6 posts to
user@ for the last 3 months Jan-Mar.


## JIRA Statistics:

- 16 JIRA tickets created in the last month

- 3 JIRA tickets resolved in the last month

16 Jan 2019 [Aaron Feng / Rich]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
  for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Last release was 1.15.1 which was the 5th release as an Apache TLP project.
  This was a minor release that included support for Ubuntu 16.04 as well as
  various feature improvements.

- Community is at work on the 1.16 release which is anticipated in the next
  month or so.  Key features are PostgreSQL 11 support, and a new method for
  k-NN nearest neighbors.

- Community is also at work on the 2.0 release at the same time, with good
  progress on a JIRA related to versioning models.

- On 2018-Oct-30, Nandish Jayaram (MADlib committer) and Frank McQuillan
  (MADlib committer and PMC member) visited Arun Kumar, Assistant Professor of
  Computer Science and Engineering at the University of California, San Diego
  regarding possible collaboration projects related to in-database machine
  learning. Discussions went well and we will report back to the community if
  this collaboration moves forward. Note that Professor Kumar is a former
  colleague of MADlib PMC Chair Aaron Feng when both were at the University of
  Wisconsin-Madison.

- Frank McQuillan (MADlib committer and PMC member) will be presenting at
  FOSDEM’19 on 2019-Feb-03 on deep learning on parallel databases, using
  MADlib and Greenplum Database as an example.


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 6 in the 4th quarter of calendar year 2018.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in PMC, currently 13 PMC members.


## Committer base changes:

- Currently 15 committers.

- Last committer additions were Jingyi Mei on 2018-06-14 and Nikhil Kak on
  2018-06-27.


## Releases:

- Next release: v1.16 planned for Feb 2019

- v1.15.1 released on 2018-10-15

- v1.15.0 released on 2018-08-10

- v1.14.0 released on 2018-05-01



## Mailing list activity:

Average monthly mailing list activity was 119 posts to dev@ and 16 posts to
user@ for the last 3 months Oct-Dec.


## JIRA Statistics:

- 5 JIRA tickets created in the last month

- 3 JIRA tickets resolved in the last month

17 Oct 2018 [Aaron Feng / Bertrand]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning
framework for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Aaron Feng (PMC Chair) visited Pivotal in Palo Alto in
August and held discussions with a few other committers on
the topics of new features and testing infrastructure.

- A MADlib community call on the topic of the last 1.15
release occurred on 2018-Aug-23.  The presenters were project
committers Jingyi Mei and Frank McQuillan who reviewed
the main 1.15 features and gave some demos using Jupyter notebooks.
Demos included: variable importance in decision trees,
column/vector operations, and momentum methods for neural
networks (multi-layer perceptron).  Here is the link to the
community call:   https://youtu.be/9JpPWuiqweU

- Community is currently working on the 1.15.1 release which
will be the 5th release as an Apache TLP project.  We expect
voting on release artifacts in Oct.

- Ideas for the 2.0 release are being discussed in the JIRAs
and mailing list and may include model management and deep
learning, depending on community interest and contributions.

- Apache MADlib has been referenced by two 2018 VLDB papers:

“In-RDBMS Hardware Acceleration of Advanced Analytics”
http://www.vldb.org/pvldb/vol11/p1317-mahajan.pdf
Proceedings of the VLDB Endowment, Vol. 11, No. 11

“A Comparative Evaluation of Systems for Scalable Linear
Algebra-based Analytics”
http://www.vldb.org/pvldb/vol11/p2168-thomas.pdf
Proceedings of the VLDB Endowment, Vol. 11, No. 13


## Health report:

The community is relatively small but very engaged with robust mailing
list traffic, interest in doing frequent releases and new
functionality being developed by contributors.

The number of developers actively contributing to the code/documentation
is approximately 9 in the 3rd quarter of calendar year 2018,
which is about the same as the last report.

We will constantly be on a lookout for new community members to be
invited either as committers or PMC.


## PMC changes:

- No changes in PMC, currently 13 PMC members.


## Committer base changes:

- Currently 15 committers.

- Last committer additions were Jingyi Mei on 2018-06-14 and Nikhil Kak on 2018-06-27.


## Releases:

- Next release: v1.15.1 planned for October 2018

- v1.15.0 released on 2018-08-10

- v1.14.0 released on 2018-05-01

- v1.13.0 released on 2017-12-22


## Mailing list activity:

Average monthly mailing list activity was 169 posts to dev@
and 22 posts to user@ for the last 3 months Jul-Sep.


## JIRA Statistics:

- 7 JIRA tickets created in the last month

- 8 JIRA tickets resolved in the last month

18 Jul 2018 [Aaron Feng / Shane]

## Description:
- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:
- There are no issues requiring board attention at this time.


## Activity:
- A MADlib community call on the topic of the last 1.14 release occurred on
 2018-May-10.  This included demos of new features and improvements to
 existing features.
- Community is currently working on the 1.15 release which will be the 4th
 release as an Apache TLP project.  We expect voting on release artifacts in
 late July.
- Ideas are being generated for the 2.0 release which will come after 1.15.


## Health report:
The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors. The number of developers actively contributing to
the code/documentation is approximately 8 in the 2nd quarter of calendar year
2018, which is about the same as the last report. We will constantly be on a
lookout for new community members to be invited either as committers or PMC.


## PMC changes:
- No changes in PMC, currently 13 PMC members.


## Committer base changes:
- Currently 15 committers.
- Two new committers added in the last 3 months.
- Last committer additions were Jingyi Mei on 2018-06-14 and Nikhil Kak on
 2018-06-27.


## Releases:
- Next release: v1.15.0 planned for late July 2018
- v1.14.0 released on 2018-05-01
- v1.13.0 released on 2017-12-22
- v1.12.0 released on 2017-08-29


## Mailing list activity:
Average monthly mailing list activity was 83 posts to dev@ and 11 posts to
user@ for the last 3 months Apr-Jun.


## JIRA Statistics:
- 8 JIRA tickets created in the last month
- 8 JIRA tickets resolved in the last month

18 Apr 2018 [Aaron Feng / Isabel]

## Description:

- Apache MADlib is a scalable, big data, SQL-driven machine learning framework
 for data scientists.


## Issues:

- There are no issues requiring board attention at this time.


## Activity:

- Community is currently finalizing the 1.14 release which will be the third
 release as an Apache TLP project.  We expect voting on release artifacts to
 commence during the week of 2018-April-16.

- A MADlib community call on the topic of the 1.14 release will be scheduled
 towards the end of April.

- There was a MADlib community call on the topic of the 1.13 release on
 2018-January-17.

- Community is working on defining the scope of the 1.15 release in JIRA and
 mailing lists.

- Community has been building and posting data science notebooks as a quick
 start guide to using MADlib.  There are currently more than 25 notebooks
 available at
 https://github.com/apache/madlib-site/tree/asf-site/community-artifacts


## Health report:

The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors.

The number of developers actively contributing to the code/documentation is
approximately 9 in the first quarter of the calendar year, which is about the
same as the last report.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.


## PMC changes:

- No changes in PMC, currently 13 PMC members.


## Committer base changes:

- Currently 13 committers.

- No new committers added in the last 3 months

- Last committer addition was Nandish Jayaram on 2016-09-08


## Releases:

- Next release: v1.14.0 planned for April 2018

- v1.13.0 released on 2017-12-22

- v1.12.0 released on 2017-08-29

- v1.11.0-incubating released on 2017-05-17


## Mailing list activity:

Mailing activity has remained relatively stable with 223 posts to dev@ and 7
posts to user@ during the month of 2018-March.


## JIRA Statistics:

- 13 JIRA tickets created in the last month

- 57 JIRA tickets resolved in the last month

17 Jan 2018 [Aaron Feng / Phil]

[REPORT] MADlib - January 2018

## Description:
- Apache MADlib is a scalable, Big Data, SQL-driven machine learning framework
 for Data Scientists.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
- MADlib 1.13 was released on 2017-December-22, and this is the second release
 as an Apache TLP project.
- There is a MADlib community call on the topic of the 1.13 release scheduled
 for 2018-January-17.
- As a final (we think) post-graduation task, we cleaned up
 dist/incubator/madlib
- Community is currently working on 1.14 JIRAs.

## Health report:
The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and new functionality being
developed by contributors. The number of developers actively contributing to
the code/documentation has increased to around 9 per month, up from 6 at the
time of the last report. We will constantly be on a lookout for new community
members to be invited either as committers or PMC.

## PMC changes:
- No changes in PMC, currently 13 PMC members.

## Committer base changes:
- Currently 13 committers.
- No new committers added in the last 3 months
- Last committer addition was Nandish Jayaram on 2016-09-08

## Releases:
- Next release: v1.14.0 planned for Feb 2018
- v1.13.0 released on 2017-12-22
- v1.12.0 released on 2017-08-29
- v1.11.0-incubating released on 2017-05-17

## Mailing list activity:
Mailing activity has remained relatively stable with 148 posts to dev@ and 7
posts to user@ during the month of December.

## JIRA Statistics:
- 10 JIRA tickets created in the last month
- 13 JIRA tickets resolved in the last month

18 Oct 2017 [Aaron Feng / Bertrand]

## Description:
- Apache MADlib is a scalable, Big Data, SQL-driven machine learning framework
for Data Scientists.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
- Community activity related to post-graduation tasks has largely been
completed (website, wiki, ASF infrastructure, etc.)

- A MADlib community call on the topic of the 1.12 release happened on Sept 7,
2017.

- The community decided to take some of the proposed 2.0 release JIRAs and put
them into a 1.13 release to do first, targeted for November 2017. The reason
is that more time is needed to plan out the proposed interface changes for
2.0. Work on 1.13 JIRAs is in flight currently.


## Health report:
The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and a bunch of new functionality
being developed by contributors.

The number of committers actively contributing to the code/documentation has
been steady and remains at a level of half a dozen active committers each
month.

We will constantly be on a lookout for new community members to be invited
either as committers or PMC.

## PMC changes:
- No changes in PMC, currently 13 PMC members.


## Committer base changes:

- Currently 13 committers.

- No new committers added in the last 3 months

- Last committer addition was Nandish Jayaram on 2016-09-08


## Releases:
- Next release: v1.13.0 planned for Nov 2017

- v1.12.0 released on 2017-08-29

- v1.11.0-incubating released on 2017-05-17

- v1.10.0-incubating released on 2017-03-10


## Mailing list activity:
Mailing activity has remained relatively stable with 144 posts to dev@ and 18
posts to user@ during the month of Sep.


## JIRA Statistics:
- 6 JIRA tickets created in the last month

- 4 JIRA tickets resolved in the last month

20 Sep 2017 [Aaron Feng / Brett]

## Description:
- Apache MADlib is a scalable, Big Data, SQL-driven machine learning framework
for Data Scientists.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
- Community activity focused on post-graduation tasks related to MADlib's
promotion to ASF TLP status. This included working on the code base,
website, wiki, and ASF infrastructure.
- Trademark transfer from Pivotal to ASF was completed in the month of
August, 2017.
- The first TLP release of MADlib 1.12 happened on Aug 29, 2017.
- A MADlib community call on the topic of the 1.12 release happened on Sept
7, 2017.
- A set of 2.0 JIRAs has been proposed for review by the community.

## Health report:
The community is relatively small but very engaged with robust mailing list
traffic, interest in doing frequent releases and a bunch of new functionality
being developed by contributors. Work has begun in earnest on the next
release 2.0 planned in the fall.
The number of committers actively contributing to the code/documentation has
been steady and remains at a level of half a dozen active committers each
month.
We will constantly be on a lookout for new community members to be invited
either as committers or PMC.

## PMC changes:
- No changes in PMC, currently 13 PMC members.

## Committer base changes:
- Currently 13 committers.
- No new committers added in the last 3 months
- Last committer addition was Nandish Jayaram on 2016-09-08

## Releases:
- v1.12.0 released on 2017-08-29
- v1.11.0-incubating released on 2017-05-17
- v1.10.0-incubating released on 2017-03-10

## Mailing list activity:
Mailing activity has remained relatively stable with 324 posts to dev@ and
19 posts to user@ during the month of Aug.

## JIRA Statistics:
- 9 JIRA tickets created in the last month
- 13 JIRA tickets resolved in the last month

16 Aug 2017 [Aaron Feng / Jim]

## Description:
- Apache MADlib is a scalable, Big Data, SQL-driven machine learning framework for Data Scientists.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
- The bulk of the community activity is now focused on post-graduation tasks of promoting MADlib to the ASF's TLP status. This includes working on code base, website, wiki, and ASF infrastructure.
- We expect to finalize the trademark transfer from Pivotal to ASF within the
month of August, 2017.
- Ed Espino volunteered to drive the first TLP release of MADlib 1.12 which is expected to happen within the next couple of months.

## Health report:
The project has just graduated to the status of a TLP at ASF. The community is small but very engaged with robust mailing list traffic, interest in doing frequent releases and a bunch of new functionality being developed by contributors. The number of committers actively contributing to the code/documentation has been steady and remains at a level of half a dozen active committers each month. Since the project has just graduated we haven't had a chance to actively grow our PMC roster, but it must be noted that at this point all of our active committers are also PMC members. Of course, we will constantly be on a lookout for new community members to be invited either as committers or PMC.

## PMC changes:
- PMC has just been formalized as part of the graduation resolution
- Currently 13 PMC members.

## Committer base changes:
- Currently 13 committers.
- No new committers added in the last 3 months
- Last committer addition was Nandish Jayaram on 2016-09-08

## Releases:
- v1.11.0-incubating released on 2017-05-17
- v1.10.0-incubating released on 2017-03-10
- v1.9.1-incubating released on 2016-09-19

## Mailing list activity:
Mailing activity remains steady with 203 posts to dev@ and 23 posts to user@

## JIRA Statistics:
- 10 JIRA tickets created in the last month
- 3 JIRA tickets resolved in the last month

@Jim: follow up with trademark assignment issue

19 Jul 2017

Establish the Apache MADlib Project

 WHEREAS, the Board of Directors deems it to be in the best
 interests of the Foundation and consistent with the
 Foundation's purpose to establish a Project Management
 Committee charged with the creation and maintenance of
 open-source software, for distribution at no charge to the
 public, related to a scalable, Big Data, SQL-driven machine
 learning framework for Data Scientists.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management
 Committee (PMC), to be known as the "Apache MADlib Project",
 be and hereby is established pursuant to Bylaws of the
 Foundation; and be it further

 RESOLVED, that the Apache MADlib Project be and hereby is
 responsible for the creation and maintenance of software
 related to a scalable, Big Data, SQL-driven machine
 learning framework for Data Scientists.

 RESOLVED, that the office of "Vice President, Apache MADlib" be
 and hereby is created, the person holding such office to
 serve at the direction of the Board of Directors as the chair
 of the Apache MADlib Project, and to have primary responsibility
 for management of the projects within the scope of
 responsibility of the Apache MADlib Project; and be it further

 RESOLVED, that the persons listed immediately below be and
 hereby are appointed to serve as the initial members of the
 Apache MADlib Project:
   Sarah Aerni <saerni@apache.org>
   Greg Chase <gregchase@apache.org>
   Aaron Feng <aaronfeng@apache.org>
   Rahul Iyer <riyer@apache.org>
   Jim Jagielski <jim@apache.org>
   Nandish Jayaram <njayaram@apache.org>
   Anirudh Kondaveeti <akondave@apache.org>
   Orhan Kislal <okislal@apache.org>
   Frank McQuillan <fmcquillan@apache.org>
   Srivatsan R <vatsan@apache.org>
   Rashmi Raghu <rashmiraghu@apache.org>
   Roman Shaposhnik <rvs@apache.org>
   Atri Sharma <atri@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Aaron Feng
 be appointed to the office of Vice President, Apache MADlib, to
 serve in accordance with and subject to the direction of the
 Board of Directors and the Bylaws of the Foundation until
 death, resignation, retirement, removal or disqualification,
 or until a successor is appointed; and be it further

 RESOLVED, that the initial Apache MADlib PMC be and hereby is
 tasked with the creation of a set of bylaws intended to
 encourage open development and increased participation in the
 Apache MADlib Project; and be it further

 RESOLVED, that the Apache MADlib Project be and hereby
 is tasked with the migration and rationalization of the Apache
 Incubator MADlib podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Incubator MADlib podling encumbered upon the Apache Incubator
 Project are hereafter discharged.

 Special Order 7A, Establish the Apache MADlib Project, was
 approved by Unanimous Vote of the directors voting, with
 Shane Curcuru abstaining.

 As the MADlib trademark hasn't been transferred to the Foundation yet:

 1) Until the trademark is transferred, MADlib is required to include a
 disclaimer on its homepage and in its releases indicating that the mark
 does not yet belong to the ASF.

 2) The expectation is that the trademark handover will be completed
 before the end of 2017.

19 Jul 2017

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Finalize trademark transfer from Pivotal to ASF.
 2. Continue to produce regular Apache (incubating) releases.
 3. Continue to execute and manage the project according to governance model
    of the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?

 1. The Apache MADlib Project is ready for graduation out of the incubator.
    Discussion by Project:
    https://lists.apache.org/thread.html/070c6764fcd0448b2db8975936b52f7a28bd0e231c0e690288a6968e@%3Cdev.madlib.apache.org%3E
    Vote by IPMC and community:
    https://lists.apache.org/thread.html/733920464e8f8170d9cc831b701f275d757ee9448a7bfd05a1bf8dfd@%3Cgeneral.incubator.apache.org%3E
    Trademark transfer from Pivotal to ASF is being tracked in:
    https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-125
 2. The resolution for graduation was tabled by the board last month due to
    the trademark issue, and is now being re-submitted.

How has the community developed since the last report?

 1. Some related events in Q2 2017:
    * May 25, 2017 - MADlib community call.  Topic:  New Features in Apache
      MADlib 1.11 (Frank McQuillan)
    * Jun 21, 2017 - Greenplum meetup in San Francisco.  Topic:
      Apache Solr & MADlib (incubating): Enabling Massive Text Analytics
      In-Database (Bharath Sitaraman)
    * Jul 5-7, 2017 - PG Day Russia.  Topic: Various on “Greenplum Day”
      Jul 5 including in-database analytics (Roman Shaposhnik and others)
    * Jul 25, 2017 (upcoming) - SF Bay ACM Chapter meetup.  Topic:  Advanced
      Analytics for Security: Lateral Movement Detection (Anirudh Kondaveeti)
 2. See material technical conversations on user/dev mailing lists and in
    the appropriate JIRAs and pull requests.

How has the project developed since the last report?

 1. TLP readiness - maturity evaluation matrix https://cwiki.apache.org/confluence/display/MADLIB/ASF+Maturity+Evaluation
 2. TLP readiness - graduation resolution https://cwiki.apache.org/confluence/display/MADLIB/Graduation+Resolution
 3. TLP readiness - documented release process https://cwiki.apache.org/confluence/display/MADLIB/Release+Process
 4. Active work in progress for 6th ASF release MADlib v1.12 scheduled for
    Jul/Aug 2017.  Features include: more graph analytics (weakly connected
    components, breadth first search, all pairs shortest path, multiple
    graph measures), neural nets, stratified sampling, train-test split,
    improvements to decision tree & random forest, improvements to summary
    function
 5. Mailing list activity in Q2:  295 postings to dev, 77 postings to user.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [ ] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [X] Nearing graduation
 [ ] Other:

Date of last release:

 MADlib v1.11 on 5/16/17.

When were the last committers or PPMC members elected:

 Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.


Signed-off-by:

 [ ](madlib) Konstantin Boudnik
    Comments:
 [X](madlib) Ted Dunning
    Comments:
 [ ](madlib) Roman Shaposhnik
    Comments:

21 Jun 2017

Establish the Apache MADlib Project

 WHEREAS, the Board of Directors deems it to be in the best
 interests of the Foundation and consistent with the
 Foundation's purpose to establish a Project Management
 Committee charged with the creation and maintenance of
 open-source software, for distribution at no charge to the
 public, related to a scalable, Big Data, SQL-driven machine
 learning framework for Data Scientists.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management
 Committee (PMC), to be known as the "Apache MADlib Project",
 be and hereby is established pursuant to Bylaws of the
 Foundation; and be it further

 RESOLVED, that the Apache MADlib Project be and hereby is
 responsible for the creation and maintenance of software
 related to a scalable, Big Data, SQL-driven machine
 learning framework for Data Scientists.

 RESOLVED, that the office of "Vice President, Apache MADlib" be
 and hereby is created, the person holding such office to
 serve at the direction of the Board of Directors as the chair
 of the Apache MADlib Project, and to have primary responsibility
 for management of the projects within the scope of
 responsibility of the Apache MADlib Project; and be it further

 RESOLVED, that the persons listed immediately below be and
 hereby are appointed to serve as the initial members of the
 Apache MADlib Project:
   Sarah Aerni <saerni@apache.org>
   Greg Chase <gregchase@apache.org>
   Aaron Feng <aaronfeng@apache.org>
   Rahul Iyer <riyer@apache.org>
   Jim Jagielski <jim@apache.org>
   Nandish Jayaram <njayaram@apache.org>
   Anirudh Kondaveeti <akondave@apache.org>
   Orhan Kislal <okislal@apache.org>
   Frank McQuillan <fmcquillan@apache.org>
   Srivatsan R <vatsan@apache.org>
   Rashmi Raghu <rashmiraghu@apache.org>
   Roman Shaposhnik <rvs@apache.org>
   Atri Sharma <atri@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Aaron Feng
 be appointed to the office of Vice President, Apache MADlib, to
 serve in accordance with and subject to the direction of the
 Board of Directors and the Bylaws of the Foundation until
 death, resignation, retirement, removal or disqualification,
 or until a successor is appointed; and be it further

 RESOLVED, that the initial Apache MADlib PMC be and hereby is
 tasked with the creation of a set of bylaws intended to
 encourage open development and increased participation in the
 Apache MADlib Project; and be it further

 RESOLVED, that the Apache MADlib Project be and hereby
 is tasked with the migration and rationalization of the Apache
 Incubator MADlib podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Incubator MADlib podling encumbered upon the Apache Incubator
 Project are hereafter discharged.

 Special Order 7D, Establish the Apache MADlib Project,
 was tabled.

19 Apr 2017

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Continue to produce regular Apache (incubating) releases.
 2. Continue to execute and manage the project according to governance model
    of the "Apache Way".
 3. Continue to build community.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?

 1. The next release v1.11 will be the 5th as an incubating project.  We
    believe this release will meet all requirements for a clean ASF release,
    based on listening to guidance from the IPMC over the previous releases.
    After that, the community would ideally like to move towards top level
    status.
 2. The licensing issues have been resolved.  Should anyone want to review,
    we have summarized the issue and resolution with relevant links on the
    MADlib wiki at
    https://cwiki.apache.org/confluence/display/MADLIB/ASF+Licensing+Guidance

How has the community developed since the last report?

 1. Some related events in Q1 2017:
    * Feb 4, 2017 - Presentation at FOSDEM’17 Graph devroom.  Topic:  Graph
      Analytics on Massively Parallel Processing Databases (Frank McQuillan)
    * Feb 2, 2017 - Greenplum meetup in SF.  Topic:  Machine Learning and
      Cyber Security with Greenplum and Apache MADlib (Anirudh Kondaveeti,
      Frank McQuillan)
    * Mar 23, 2017 - MADlib community call.  Topic:  New Features in Apache
      MADlib 1.10 (Frank McQuillan)
 2. See material technical conversations on user/dev mailing lists and in
    the appropriate JIRAs and pull requests.

How has the project developed since the last report?

 1. Build infra set up on Apache infra
    https://builds.apache.org/job/madlib-master-build/
 2. Docker image with necessary dependencies required to compile and test
    MADlib on PostgreSQL 9.6
    https://cwiki.apache.org/confluence/display/MADLIB/Quick+Start+Guide+for+Developers#QuickStartGuideforDevelopers-Dock
 3. Active work in progress for 5th ASF release MADlib v1.11 scheduled for
    Apr 2017.  Features include: PageRank, connected components, stratified
    sampling, improvements to decision tree & random forest, array & sparse
    vector output for pivot
 4. Mailing list activity in Q1 to date:  274 postings to dev, 111 postings
    to user.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [ ] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [X] Nearing graduation
 [ ] Other:

Date of last release:

 MADlib v1.10 on 3/10/17.

When were the last committers or PMC members elected:

 Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.

Signed-off-by:

 [X](madlib) Konstantin Boudnik
    Comments:
 [ ](madlib) Ted Dunning
    Comments:
 [x](madlib) Roman Shaposhnik
    Comments: we hope to submit a TLP resolution next month

18 Jan 2017

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Need guidance from Incubator PMC on how to resolve the BSD licensing
    switch over to Apache License.  What should be the content of the license
    headers for files that were previously BSD licensed and then granted to
    ASF?  Related legal-discuss threads:
    http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201609.mbox/%3CCALGG8z03zHhbFegXoi4fH+vXtF+9m7x6hak9RjKQjapuzi67gQ@mail.gmail.com%3E
    http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201603.mbox/%3C9D1AF43C-370B-4E58-B0EF-2E29D242F50B%40jaguNET.com%3E
 2. Continue to produce regular Apache (incubating) releases.
 3. Continue to execute and manage the project according to governance model
    of the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?

 1. Yes, please see #1 above and provide guidance.
 2. The next release v1.10 will be the 4th as an incubating project.  After
    that, the community would ideally like to move towards top level status.

How has the community developed since the last report?

 1. Some related events in Q4 2016 and upcoming:
    * Feb 4, 2017 - Presentation accepted at FOSDEM’17 Graph devroom.
      Topic:  Graph Analytics on Massively Parallel Processing Databases
      (Frank McQuillan)
    * Dec 1, 2016 - MADlib community call.  Topic:  New features in R
      interface and MADlib user survey results (hosted by Greg Chase, Orhan
      Kislal, Frank McQuillan)
    * Nov 16, 2016 - Presentation at PGConf Silicon Valley.  Topic:
      Distributed In-Database Machine Learning with Apache MADlib
      (incubating) (Frank McQuillan)
    * Nov 14, 2016 - Presentation at Apache Big Data Europe.  Topic:
      Distributed In-Database Machine Learning with Apache MADlib
      (incubating) (Roman Shaposhnik)
 2. Material technical conversations on user/dev mailing lists and in the
    appropriate JIRAs and pull requests.
 3. New contributors to the project have been working on KNN module and
    Python interface.

How has the project developed since the last report?

 1. Active work in progress for 4th ASF release MADlib v1.10 scheduled for Jan
    2017.  Features include: single source shortest path graph algorithm,
    completely new module for encoding categorical variables, R interface
    update, grouping support in elastic net and PCA, cross validation in
    elastic net, verbose output option for decision tree visualization.
 2. Mailing list activity in Q4:  227 postings to dev, 66 postings to user.

Date of last release:

 MADlib v1.9.1 on 9/19/16.

When were the last committers or PMC members elected:

 Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.

Signed-off-by:

 [ ](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

(rvs) I had a chat with ASF VP Legal and the proposal is to go ahead with the
release as it is. If the IPMC raises concerns during the review of this
upcoming release, Jim has volunteered to be directly involved in working
through them.

19 Oct 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Need guidance from Incubator PMC on how to resolve the BSD licensing
   switch over to Apache License.  What should be the content of the license
   headers for files that were previously BSD licensed and then granted to
   ASF?  Related legal-discuss threads:
   http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201609.mbox/%3CCALGG8z03zHhbFegXoi4fH+vXtF+9m7x6hak9RjKQjapuzi67gQ@mail.gmail.com%3E
   http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201603.mbox/%3C9D1AF43C-370B-4E58-B0EF-2E29D242F50B%40jaguNET.com%3E
 2. Continue to produce regular Apache (incubating) releases.
 3. Continue to execute and manage the project according to governance model
   required by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

Yes, please see #1 above and provide guidance.

How has the community developed since the last report?

 1. Two new committers added to the project:
       * Orhan Kislal (9/7/16)
       * Nandish Jayaram (9/7/16)
 2. MADlib related events in Q3 2016:
       * Jul 27 - MADlib community call.  Topic:  Open discussion on Apache
         MADlib project (hosted by Greg Chase, Frank McQuillan)
       * Aug 19 - Presentation to Hortonworks.  Topic:  Apache MADlib, Apache
         HAWQ (incubating) and Apache Zeppelin (Rahul Iyer, Frank McQuillan)
       * Sep 13 - MADlib community call.  Topic:  Deep dive on MADlib 1.9.1
         release (hosted by Greg Chase, presentation by Frank McQuillan)
       * Sep 21 - Meetup at Hortonworks San Francisco.  Topic:  Future of data
         - Apache MADlib and Apache HAWQ (Tushar Pednekar)
       * Sep 22 - Meetup at Hortonworks Santa Clara.  Topic:  Future of data -
         Apache MADlib and Apache HAWQ (Tushar Pednekar)
 3. Material technical conversations on dev mailing lists and in the
   appropriate JIRAs and pull requests.

How has the project developed since the last report?

 1. 3rd ASF release MADlib v1.9.1 released on Sep 19, 2016.  Features include:
   path functions (phase 2), 1-class support vector machines for novelty
   detection, prediction metrics, sessionization, pivoting.
 2. Community has started active development on the v1.10 release.
 3. 13 JIRAs created and 5 resolved in last 30 days.

Date of last release:

 MADlib v1.9.1 on 9/19/16.

When were the last committers or PMC members elected:

 Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.

Signed-off-by:

 [x](madlib) Konstantin Boudnik
 [x](madlib) Ted Dunning
 [x](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

 tdunning:
   This project seems to be ticking along pretty reasonably. The only worry I
     have about it is that it seems to be strongly centered around a few (or
     even just one) very strong contributors. That is a worry relative to
     longevity and community building. Overall, I don't think that the project
     is getting much marginal value from incubation.

 johndament:
   It's unclear what guidance from the IPMC is required if the podling is
     already reaching out to legal, which would be the main thing I can think
     of to recommend to them right now.

 rvs:
   @johndament: I think we need to formalize whatever decision by legal. I'll
     create a formal LEGAL JIRA soon.

20 Jul 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Continue to produce regular Apache (incubating) releases.
 2. Expand the community, increase dev list activity and add new
    contributors.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. MADlib related events in Q2 2016:
       * April 19 - Joint community call MADlib - Greenplum Database.
         Topic:  MADlib v1.9 new features (Nandish Jayaram, Ivan Novick,
         Cesar Rojas, Frank McQuillan)
       * May 5 - MADlib community call.  Topic:  Detailed review of the
         MADlib v1.9 release (Xiaocheng Tang, Frank McQuillan)
       * June 21 - MADlib community call.  Topic:  Apache Zeppelin meets
         Apache MADlib (incubating) and Apache HAWQ (incubating) (Moon soo
         Lee, Rahul Iyer, Frank McQuillan)
       * June 21 - Data Engineers Guild meetup in Palo Alto.  Topic: The
         Analytics and Science Behind Connected Transportation (Srivatsan
         Ramanujam, Esther Vasiete, Ralph Rabbat, Frank McQuillan)
 2. Material technical conversations on dev mailing lists and in the
    appropriate JIRAs and pull requests.
 3. We are seeing some PostgreSQL experts chipping in on SQL coding and making
    good suggestions in the pull requests.

How has the project developed since the last report?

 1. 2nd ASF release MADlib v1.9 released on April 6, 2016.  The goal of
    this 2nd release was general availability of MADlib v1.9 for community
    use.
 2. 3rd ASF release MADlib v1.9.1 anticipated this summer depending on
    community input.  Features include:  path functions (phase 2), 1-class
    support vector machines for novelty detection, prediction metrics,
    sessionization, pivoting.
 3. 2 JIRAs created and 14 resolved in last 30 days.

Date of last release:

 MADlib v1.9 on 4/6/16.

When were the last committers or PMC members elected?

 Xiaocheng Tang on 1/14/16.

Signed-off-by:

 [ ](madlib) Konstantin Boudnik
 [x](madlib) Ted Dunning
 [x](madlib) Roman Shaposhnik

20 Apr 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Continue to produce regular Apache (incubating) releases.
 2. Expand the community, increase dev list activity and add new
    contributors.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. Held three community calls in Q1 2016.  Each call featured a different
    member of the Apache MADlib community presenting on a topic of interest
    to them:
    * Jan 15th - Bayesian Analysis of Binomial Response Models on MPP
      Databases (Gautam Muralidhar)
    * Feb 16th - An Overview of GWR Analysis of Spatial Data (Chenliang
      Wang)
    * Mar 16th - MADlib on PostgreSQL and PGXN (AJ Welch)
 2. One new committer has been added to the project (Xiaocheng Tang)
 3. Presentation of Apache MADlib at FOSDEM’16 in Brussels (Frank
    McQuillan)
 4. Material technical conversations on dev mailing lists and in the
    appropriate JIRAs (e.g., 111 emails on dev@ mailing list in Feb)
 5. Several Google Summer of Code (GSoC) candidates have expressed interest
    in working on MADlib projects via dev@ mailing list.

How has the project developed since the last report?

 1. 1st ASF release MADlib v1.9alpha on 3/11/16 which was intended to clear
    all potential IP issues in the code base and make it legally ready to
    be adopted by the community.
 2. 2nd ASF release MADlib v1.9 is currently in IPMC voting as of this
    writing on 4/6/16.  The goal of this 2nd release is general
    availability of MADlib v1.9 for community use.
 3. Some features in the latest release:  path functions, support vector
    machines, advanced matrix operations, covariance matrix, proportion of
    variance for PCA, support for Apache HAWQ (incubating) 2.0.
 4. 15 JIRAs created and 27 resolved in last 30 days.

Date of last release:

 Apache MADlib (incubating) v1.9alpha on 3/11/16.

When were the last committers or PMC members elected?

 Xiaocheng Tang on 1/14/16.

Signed-off-by:

 [ ](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

20 Jan 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache (incubating) release.
 2. Expand the community, increase dev list activity and add new
    committers/pmc members.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 None

How has the community developed since the last report?

 1. Second community call held 12/18/15.  On the call Roman suggested that
    the MADlib community do a 1.9 alpha release in the near term.  There
    was general agreement and this is planned for January.  A Release
    Manager has not yet been identified.
 2. Material technical conversations on dev mailing lists and in the
    appropriate JIRAs.  E.g., 51 emails on dev in Dec.
 3. One new committer has been proposed and voting is under way on the
    private mailing list.
 4. Two comprehensive proposals were posted to the dev mailing list from
    community members.  One relates to the addition of geographically
    weighted regression (GWR) algorithms.  The second involves Bayesian
    analysis of binomial response models for MPP databases, which makes
    extensive use of MADlib’s new matrix operations.  Both proposals are
    under active discussion on the mailing list currently.
 5. We have been accepted to present a full talk at FOSDEM 2016/Brussels in
    the HPC, Big Data & Data Science Devroom on Jan 31.  The title of the
    talk is: "MADlib: Distributed In-Database Machine Learning for Fun and
    Profit"

How has the project developed since the last report?

 1. 5 JIRAs created and 4 resolved in last 30 days.
 2. A SQL API guide has been added to the MADlib wiki
    https://cwiki.apache.org/confluence/display/MADLIB/SQL+API+Guide.

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 One new committer has been proposed and voting is under way on the private
 mailing list.

Signed-off-by:

 [x](madlib) Konstantin Boudnik
 [x](madlib) Ted Dunning
 [x](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

16 Dec 2015

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache (incubating) release.
 2. Expand the community, increase dev list activity and add new
    committers/pmc members.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. First community call held 11/20/15.  There were approximately 10
    attendees, about half were from outside of the current group of MADlib
    contributors.  This will be a monthly call, possibly moving to 2x per
    month in the future.
 2. Meetup 12/3/15 @ Pivotal Labs, San Francisco:  “MADlib and HAWQ for
    Advanced SQL Machine Learning on Hadoop”.  One goal of this meetup is
    to invite new community participation in MADlib.
 3. Material technical conversations are now happening on the dev mailing
    lists and in the appropriate JIRAs.  E.g., 53 emails on dev in Nov
    compared with 7 in Oct.

How has the project developed since the last report?

 1. 31 JIRAs created and 7 resolved in last 30 days.
 2. Mailing list subscribers: user - 19, dev - 20
 3. Proposed scope for first Apache MADlib release has been described to
    the community for comment.  This release includes IP cleanliness and
    new features.
 4. The MADlib wiki <http://s.apache.org/0lQ> has been updated with new
    content, including a new contributors guideline, an FAQ and a page
    listing suggestions for first time contributors (these have also been
    labeled “starter” in the JIRAs).

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 No new members added on top of the initial committer list.

Signed-off-by:

 [X](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

18 Nov 2015

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache (incubating) release.
 2. Expand the community, increase dev list activity and add new
    committers/pmc members.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. Meetup 10/1/15 @ Pivotal Labs, New York, NY:  “MADlib and HAWQ for
    Advanced SQL Machine Learning on Hadoop” http://s.apache.org/VbG
 2. Meetup 10/29/15 @ Pivotal Palo Alto, CA:  “Data Science at Scale for
    IoT” http://www.meetup.com/Pivotal-Open-Source-Hub/events/225426787/

How has the project developed since the last report?

 1. All known issues related to IP cleanliness described in
    https://wiki.apache.org/incubator/MADlibProposal have been fixed and
    pushed to the Apache repo.
 2. All software activity tracking has migrated to Apache MADlib JIRA from
    previous tool.  18 JIRAs created and 2 resolved in last 30 days.
 3. All commits and code are now being done on the Apache Git repo.
 4. Three new quick start guides have been written: i) install, ii) user,
    and iii) developer.  The goal is to make it easier to onboard new
    community members.
 5. A new Greenplum DB sandbox VM with MADlib pre-installed has been
    created and made available publicly at
    https://github.com/greenplum-db/gpdb-sandbox-tutorials.  The goal is to
    make it easier to onboard new community members - they can download and
    start trying MADlib right away with no install/setup.
 6. A "catchup JIRA" was filed
    https://issues.apache.org/jira/browse/MADLIB-912 in order to catch up
    between the time of the code grant to Apache and bringing in dev work
    that was already in flight at the time.  We apologize for any
    inconvenience in clubbing together these multiple items; it was a
    one-time operation.

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 No new members added on top of the initial committer list.

Signed-off-by:

 [X](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

 Konstantin Boudnik (cos):

   I don't see much info on the community development. How many new
   contributors has the project gained? Were there any additions to the
   mailing lists? Please consider providing this information in the next
   report.

21 Oct 2015

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache release.
 2. Finalize infrastructure migration and ICLAs from committers.
 3. Expand the community, increase dev list activity and add new
    committers/pmc members.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 Just started incubation, nothing specific to report at this time.

How has the community developed since the last report?

 We are approximately 2 weeks into the incubation process.

   1. The project is still at an early stage.
   2. Most of the core contributors have completed their ICLAs and have
      established Apache IDs.
   3. Several presentations and meet-ups at ApacheCon EU and Strata NYC to
      discuss MADlib move to ASF governance.
   4. Formal announcements from Pivotal, press briefings and blogs related
      to the move of the project into Apache aimed at growing awareness and
      interest in the project.  Specialty press have picked up the story and
      reported widely.

How has the project developed since the last report?

 Early activity:

 1. Initial code drop provided to Apache
 2. Core infrastructure is in the process of being migrated from existing
    infrastructure: mailing lists, git, jira, wiki, website

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 Most of the initial list of committers have been on-boarded, some still
 outstanding.  No new members added on top of the initial committer list.

Signed-off-by:

 [x](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [ ](madlib) Roman Shaposhnik

Shepherd/Mentor notes: