Formal board meeting minutes from 2010 through present. Please Note: The board typically approves minutes from one meeting during the next board meeting, so minutes will be published roughly one month later than the scheduled date. Other corporate records are published, as is an alternate categorized view of all board meeting minutes.

MADlib

18 Jan 2017

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Need guidance from Incubator PMC on how to resolve the BSD licensing
    switch over to Apache License.  What should be the content of the license
    headers for files that were previously BSD licensed and then granted to
    ASF?  Related legal-discuss threads:
    http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201609.mbox/%3CCALGG8z03zHhbFegXoi4fH+vXtF+9m7x6hak9RjKQjapuzi67gQ@mail.gmail.com%3E
    http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201603.mbox/%3C9D1AF43C-370B-4E58-B0EF-2E29D242F50B%40jaguNET.com%3E
 2. Continue to produce regular Apache (incubating) releases.
 3. Continue to execute and manage the project according to governance model
    of the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?

 1. Yes, please see #1 above and provide guidance.
 2. The next release v1.10 will be the 4th as an incubating project.  After
    that, the community would ideally like to move towards top level status.

How has the community developed since the last report?

 1. Some related events in Q4 2016 and upcoming:
    * Feb 4, 2017 - Presentation accepted at FOSDEM’17 Graph devroom.
      Topic:  Graph Analytics on Massively Parallel Processing Databases
      (Frank McQuillan)
    * Dec 1, 2016 - MADlib community call.  Topic:  New features in R
      interface and MADlib user survey results (hosted by Greg Chase, Orhan
      Kislal, Frank McQuillan)
    * Nov 16, 2016 - Presentation at PGConf Silicon Valley.  Topic:
      Distributed In-Database Machine Learning with Apache MADlib
      (incubating) (Frank McQuillan)
    * Nov 14, 2016 - Presentation at Apache Big Data Europe.  Topic:
      Distributed In-Database Machine Learning with Apache MADlib
      (incubating) (Roman Shaposhnik)
 2. Material technical conversations on user/dev mailing lists and in the
    appropriate JIRAs and pull requests.
 3. New contributors to the project have been working on KNN module and
    Python interface.

How has the project developed since the last report?

 1. Active work in progress for 4th ASF release MADlib v1.10 scheduled for Jan
    2017.  Features include: single source shortest path graph algorithm,
    completely new module for encoding categorical variables, R interface
    update, grouping support in elastic net and PCA, cross validation in
    elastic net, verbose output option for decision tree visualization.
 2. Mailing list activity in Q4:  227 postings to dev, 66 postings to user.

Date of last release:

 MADlib v1.9.1 on 9/19/16.

When were the last committers or PMC members elected:

 Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.

Signed-off-by:

 [ ](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

(rvs) I had a chat with ASF VP Legal and the proposal is to go ahead with the release as it is. If the IPMC raises concerns during the review of this upcoming release, Jim has volunteered to be directly involved in working through them.

19 Oct 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Need guidance from Incubator PMC on how to resolve the BSD licensing
   switch over to Apache License.  What should be the content of the license
   headers for files that were previously BSD licensed and then granted to
   ASF?  Related legal-discuss threads:
   http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201609.mbox/%3CCALGG8z03zHhbFegXoi4fH+vXtF+9m7x6hak9RjKQjapuzi67gQ@mail.gmail.com%3E
   http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201603.mbox/%3C9D1AF43C-370B-4E58-B0EF-2E29D242F50B%40jaguNET.com%3E
 2. Continue to produce regular Apache (incubating) releases.
 3. Continue to execute and manage the project according to governance model
   required by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

Yes, please see #1 above and provide guidance.

How has the community developed since the last report?

 1. Two new committers added to the project:
       * Orhan Kislal (9/7/16)
       * Nandish Jayaram (9/7/16)
 2. MADlib related events in Q3 2016:
       * Jul 27 - MADlib community call.  Topic:  Open discussion on Apache
         MADlib project (hosted by Greg Chase, Frank McQuillan)
       * Aug 19 - Presentation to Hortonworks.  Topic:  Apache MADlib, Apache
         HAWQ (incubating) and Apache Zeppelin (Rahul Iyer, Frank McQuillan)
       * Sep 13 - MADlib community call.  Topic:  Deep dive on MADlib 1.9.1
         release (hosted by Greg Chase, presentation by Frank McQuillan)
       * Sep 21 - Meetup at Hortonworks San Francisco.  Topic:  Future of data
         - Apache MADlib and Apache HAWQ (Tushar Pednekar)
       * Sep 22 - Meetup at Hortonworks Santa Clara.  Topic:  Future of data -
         Apache MADlib and Apache HAWQ (Tushar Pednekar)
 3. Material technical conversations on dev mailing lists and in the
   appropriate JIRAs and pull requests.

How has the project developed since the last report?

 1. 3rd ASF release MADlib v1.9.1 released on Sep 19, 2016.  Features include:
   path functions (phase 2), 1-class support vector machines for novelty
   detection, prediction metrics, sessionization, pivoting.
 2. Community has started active development on the v1.10 release.
 3. 13 JIRAs created and 5 resolved in last 30 days.

Date of last release:

 MADlib v1.9.1 on 9/19/16.

When were the last committers or PMC members elected:

 Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.

Signed-off-by:

 [x](madlib) Konstantin Boudnik
 [x](madlib) Ted Dunning
 [x](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

 tdunning:
   This project seems to be ticking along pretty reasonably. The only worry I
     have about it is that it seems to be strongly centered around a few (or
     even just one) very strong contributors. That is a worry relative to
     longevity and community building. Overall, I don't think that the project
     is getting much marginal value from incubation.

 johndament:
   It's unclear what guidance from the IPMC is required if the podling is
     already reaching out to legal, which would be the main thing I can think
     of to recommend to them right now.

 rvs:
   @johndament: I think we need to formalize whatever decision legal makes. I'll
     create a formal LEGAL JIRA soon.

20 Jul 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Continue to produce regular Apache (incubating) releases.
 2. Expand the community, increase dev list activity and add new
    contributors.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. MADlib related events in Q2 2016:
       * April 19 - Joint community call MADlib - Greenplum Database.
         Topic:  MADlib v1.9 new features (Nandish Jayaram, Ivan Novick,
         Cesar Rojas, Frank McQuillan)
       * May 5 - MADlib community call.  Topic:  Detailed review of the
         MADlib v1.9 release (Xiaocheng Tang, Frank McQuillan)
       * June 21 - MADlib community call.  Topic:  Apache Zeppelin meets
         Apache MADlib (incubating) and Apache HAWQ (incubating) (Moon soo
         Lee, Rahul Iyer, Frank McQuillan)
       * June 21 - Data Engineers Guild meetup in Palo Alto.  Topic: The
         Analytics and Science Behind Connected Transportation (Srivatsan
         Ramanujam, Esther Vasiete, Ralph Rabbat, Frank McQuillan)
 2. Material technical conversations on dev mailing lists and in the
    appropriate JIRAs and pull requests.
 3. We are seeing some PostgreSQL experts chipping in on SQL coding and making
    good suggestions in the pull requests.

How has the project developed since the last report?

 1. 2nd ASF release MADlib v1.9 released on April 6, 2016.  The goal of
    this 2nd release was general availability of MADlib v1.9 for community
    use.
 2. 3rd ASF release MADlib v1.9.1 anticipated this summer depending on
    community input.  Features include:  path functions (phase 2), 1-class
    support vector machines for novelty detection, prediction metrics,
    sessionization, pivoting.
 3. 2 JIRAs created and 14 resolved in last 30 days.

Date of last release:

 MADlib v1.9 on 4/6/16.

When were the last committers or PMC members elected?

 Xiaocheng Tang on 1/14/16.

Signed-off-by:

 [ ](madlib) Konstantin Boudnik
 [x](madlib) Ted Dunning
 [x](madlib) Roman Shaposhnik

20 Apr 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Continue to produce regular Apache (incubating) releases.
 2. Expand the community, increase dev list activity and add new
    contributors.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. Held three community calls in Q1 2016.  Each call featured a different
    member of the Apache MADlib community presenting on a topic of interest
    to them:
    * Jan 15th - Bayesian Analysis of Binomial Response Models on MPP
      Databases (Gautam Muralidhar)
    * Feb 16th - An Overview of GWR Analysis of Spatial Data (Chenliang
      Wang)
    * Mar 16th - MADlib on PostgreSQL and PGXN (AJ Welch)
 2. One new committer has been added to the project (Xiaocheng Tang)
 3. Presentation of Apache MADlib at FOSDEM’16 in Brussels (Frank
    McQuillan)
 4. Material technical conversations on dev mailing lists and in the
    appropriate JIRAs (e.g., 111 emails on dev@ mailing list in Feb)
 5. Several Google Summer of Code (GSoC) candidates have expressed interest
    in working on MADlib projects via dev@ mailing list.

How has the project developed since the last report?

 1. 1st ASF release MADlib v1.9alpha on 3/11/16 which was intended to clear
    all potential IP issues in the code base and make it legally ready to
    be adopted by the community.
 2. 2nd ASF release MADlib v1.9 is currently in IPMC voting as of this
    writing on 4/6/16.  The goal of this 2nd release is general
    availability of MADlib v1.9 for community use.
 3. Some features in the latest release:  path functions, support vector
    machines, advanced matrix operations, covariance matrix, proportion of
    variance for PCA, support for Apache HAWQ (incubating) 2.0.
 4. 15 JIRAs created and 27 resolved in last 30 days.

Date of last release:

 Apache MADlib (incubating) v1.9alpha on 3/11/16.

When were the last committers or PMC members elected?

 Xiaocheng Tang on 1/14/16.

Signed-off-by:

 [ ](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

20 Jan 2016

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache (incubating) release.
 2. Expand the community, increase dev list activity and add new
    committers/pmc members.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 None

How has the community developed since the last report?

 1. Second community call held 12/18/15.  On the call Roman suggested that
    the MADlib community do a 1.9 alpha release in the near term.  There
    was general agreement and this is planned for January.  A Release
    Manager has not yet been identified.
 2. Material technical conversations on dev mailing lists and in the
    appropriate JIRAs.  E.g., 51 emails on dev in Dec.
 3. One new committer has been proposed and voting is under way on the
    private mailing list.
 4. Two comprehensive proposals were posted to the dev mailing list from
    community members.  One relates to the addition of geographically
    weighted regression (GWR) algorithms.  The second involves Bayesian
    analysis of binomial response models for MPP databases, which makes
    extensive use of MADlib’s new matrix operations.  Both proposals are
    under active discussion on the mailing list currently.
 5. We have been accepted to present a full talk at FOSDEM 2016/Brussels in
    the HPC, Big Data & Data Science Devroom on Jan 31.  The title of the
    talk is: "MADlib: Distributed In-Database Machine Learning for Fun and
    Profit"

How has the project developed since the last report?

 1. 5 JIRAs created and 4 resolved in last 30 days.
 2. A SQL API guide has been added to the MADlib wiki
    https://cwiki.apache.org/confluence/display/MADLIB/SQL+API+Guide.

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 One new committer has been proposed and voting is under way on the private
 mailing list.

Signed-off-by:

 [x](madlib) Konstantin Boudnik
 [x](madlib) Ted Dunning
 [x](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

16 Dec 2015

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache (incubating) release.
 2. Expand the community, increase dev list activity and add new
    committers/pmc members.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. First community call held 11/20/15.  There were approximately 10
    attendees, about half were from outside of the current group of MADlib
    contributors.  This will be a monthly call, possibly moving to 2x per
    month in the future.
 2. Meetup 12/3/15 @ Pivotal Labs, San Francisco:  “MADlib and HAWQ for
    Advanced SQL Machine Learning on Hadoop”.  One goal of this meetup is
    to invite new community participation in MADlib.
 3. Material technical conversations are now happening on the dev mailing
    lists and in the appropriate JIRAs.  E.g., 53 emails on dev in Nov
    compared with 7 in Oct.

How has the project developed since the last report?

 1. 31 JIRAs created and 7 resolved in last 30 days.
 2. Mailing list subscribers: user - 19, dev - 20
 3. Proposed scope for first Apache MADlib release has been described to
    the community for comment.  This release includes IP cleanliness and
    new features.
 4. The MADlib wiki <http://s.apache.org/0lQ> has been updated with new
    content, including a new contributors guideline, an FAQ and a page
    listing suggestions for first time contributors (these have also been
    labeled “starter” in the JIRAs).

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 No new members added on top of the initial committer list.

Signed-off-by:

 [X](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

18 Nov 2015

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache (incubating) release.
 2. Expand the community, increase dev list activity and add new
    committers/pmc members.
 3. Execute and manage the project according to governance model required
    by the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 1. Meetup 10/1/15 @ Pivotal Labs, New York, NY:  “MADlib and HAWQ for
    Advanced SQL Machine Learning on Hadoop” http://s.apache.org/VbG
 2. Meetup 10/29/15 @ Pivotal Palo Alto, CA:  “Data Science at Scale for
    IoT” http://www.meetup.com/Pivotal-Open-Source-Hub/events/225426787/

How has the project developed since the last report?

 1. All known issues related to IP cleanliness described in
    https://wiki.apache.org/incubator/MADlibProposal have been fixed and
    pushed to the Apache repo.
 2. All software activity tracking has migrated to Apache MADlib JIRA from
    previous tool.  18 JIRAs created and 2 resolved in last 30 days.
 3. All commits and code are now being done on the Apache Git repo.
 4. Three new quick start guides have been written: i) install, ii) user,
    and iii) developer.  The goal is to make it easier to onboard new
    community members.
 5. A new Greenplum DB sandbox VM with MADlib pre-installed has been
    created and made available publicly at
    https://github.com/greenplum-db/gpdb-sandbox-tutorials.  The goal is to
    make it easier to onboard new community members - they can download and
    start trying MADlib right away with no install/setup.
 6. A "catchup JIRA" was filed
    https://issues.apache.org/jira/browse/MADLIB-912 in order to catch up
    between the time of the code grant to Apache and bringing in dev work
    that was already in flight at the time.  We apologize for any
    inconvenience in clubbing together these multiple items; it was a
    one-time operation.

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 No new members added on top of the initial committer list.

Signed-off-by:

 [X](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [X](madlib) Roman Shaposhnik

Shepherd/Mentor notes:

 Konstantin Boudnik (cos):

   I don't see much info on the community development. How many new
   contributors has the project gained? Were there any additions to the
   mailing lists? Please consider providing this information in the next
   report.

21 Oct 2015

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

 1. Produce a first Apache release.
 2. Finalize infrastructure migration and ICLAs from committers.
 3. Expand the community, increase dev list activity and add new
    committers/pmc members.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 Just started incubation, nothing specific to report at this time.

How has the community developed since the last report?

 We are approximately 2 weeks into the incubation process.

   1. So still at early stage.
   2. Most of the core contributors have completed their ICLAs and have
      established Apache IDs.
   3. Several presentations and meet-ups at ApacheCon EU and Strata NYC to
      discuss MADlib move to ASF governance.
   4. Formal announcements from Pivotal, press briefings and blogs related
      to the move of the project into Apache aimed at growing awareness and
      interest in the project.  Specialty press have picked up the story and
      reported widely.

How has the project developed since the last report?

 Early activity:

 1. Initial code drop provided to Apache
 2. Core infrastructure is in the process of being migrated from existing
    infrastructure: mailing lists, git, jira, wiki, website

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 Most of the initial list of committers have been on-boarded; some are still
 outstanding.  No new members added on top of the initial committer list.

Signed-off-by:

 [x](madlib) Konstantin Boudnik
 [X](madlib) Ted Dunning
 [ ](madlib) Roman Shaposhnik

Shepherd/Mentor notes: