This was extracted (@ 2018-11-21 21:10) from a list of minutes which have been approved by the Board.
Please note: the Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

DataFu

17 Oct 2018 [Matthew Hayes / Ted]

## Description:

 - DataFu provides a collection of Hadoop MapReduce jobs and Pig UDFs to
   perform data analysis. It provides functions for common statistics tasks
   (e.g. quantiles, sampling), PageRank, stream sessionization, and set and
   bag operations. DataFu also provides Hadoop jobs for incremental data
   processing in MapReduce.

## Issues:

 - There are no issues requiring board attention at this time.

## Activity:

 - Code for an initial Spark subproject has been contributed (from Eyal
   Allweil and new contributor Ohad Raviv).
 - Some work on adding a macro for deduping.
 - Javadoc updates.  Noted deprecated methods.

## PMC changes:

 - Currently 11 PMC members.
 - No new PMC members added in the last 3 months
 - No new PMC members added since podling graduation in February 2018.

## Committer base changes:

 - Currently 18 committers.
 - No new changes to the committer base since last report.
 - No new committers added since podling graduation in February 2018.

## Releases:

 - Last release was 1.4.0 on Wed Mar 21 2018

## JIRA activity:

 - 1 JIRA ticket created in the last 3 months
 - 2 JIRA tickets closed/resolved in the last 3 months

18 Jul 2018 [Matthew Hayes / Rich]

## Description:

 - DataFu provides a collection of Hadoop MapReduce jobs and Pig UDFs to
   perform data analysis. It provides functions for common statistics tasks
   (e.g. quantiles, sampling), PageRank, stream sessionization, and set and
   bag operations. DataFu also provides Hadoop jobs for incremental data
   processing in MapReduce.

## Issues:

 - There are no issues requiring board attention at this time.

## Activity:

 - Updated to compile with Java 8.
 - Updated Ruby gems used for website generation.
 - Upgraded build system to Gradle v4.8.1.
 - Added new macro.

## PMC changes:

 - Currently 11 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Casey Stella on Tue Feb 20 2018

## Committer base changes:

 - Currently 18 committers.
 - No new changes to the committer base since last report.

## Releases:

 - Last release was 1.4.0 on Wed Mar 21 2018

## JIRA activity:

 - 2 JIRA tickets created in the last 3 months
 - 3 JIRA tickets closed/resolved in the last 3 months

16 May 2018 [Matthew Hayes / Shane]

## Description:

 - DataFu provides a collection of Hadoop MapReduce jobs and Pig UDFs to
   perform data analysis. It provides functions for common statistics tasks
   (e.g. quantiles, sampling), PageRank, stream sessionization, and set and
   bag operations. DataFu also provides Hadoop jobs for incremental data
   processing in MapReduce.

## Issues:

 - There are no issues requiring board attention at this time.

## Activity:

 - No activity to report in the past month.

## PMC changes:

 - Currently 11 PMC members.
 - No new PMC members added in the last 3 months

## Committer base changes:

 - Currently 18 committers.
 - No new changes to the committer base since last report.

## Releases:

 - 1.4.0 was released on Wed Mar 21 2018

## JIRA activity:

 - 9 JIRA tickets created in the last 3 months
 - 11 JIRA tickets closed/resolved in the last 3 months

18 Apr 2018 [Matthew Hayes / Bertrand]

## Description:

 - DataFu provides a collection of Hadoop MapReduce jobs and Pig UDFs to
   perform data analysis. It provides functions for common statistics tasks
   (e.g. quantiles, sampling), PageRank, stream sessionization, and set and
   bag operations. DataFu also provides Hadoop jobs for incremental data
   processing in MapReduce.

## Issues:

 - There are no issues requiring board attention at this time.

## Activity:

 - Post-graduation work has been completed.
 - Released 1.4.0, the first release since graduating from incubator.

## PMC changes:

 - Currently 11 PMC members.
 - No new PMC members added in the last 3 months

## Committer base changes:

 - Currently 18 committers.
 - No changes (the PMC was established in the last 3 months)

## Releases:

 - 1.4.0 was released on Wed Mar 21 2018

## JIRA activity:

 - 13 JIRA tickets created in the last 3 months
 - 16 JIRA tickets closed/resolved in the last 3 months

21 Mar 2018 [Matthew Hayes / Bertrand]

## Description:

 - DataFu provides a collection of Hadoop MapReduce jobs and Pig UDFs to
   perform data analysis. It provides functions for common statistics tasks
   (e.g. quantiles, sampling), PageRank, stream sessionization, and set and
   bag operations. DataFu also provides Hadoop jobs for incremental data
   processing in MapReduce.

## Issues:

 - There are no issues requiring board attention at this time.

## Activity:

 - Much of the recent activity has focused on tasks related to incubator
   graduation.
 - A Download page (http://datafu.apache.org/docs/download.html) was added
   with clearer instructions for getting the most recent source release and
   validating it.
 - Infra set up the new domain for Apache DataFu: http://datafu.apache.org/
 - There is some upcoming work on other post-graduation items, such as
   updating the website to reflect being a TLP now, building artifacts without
   "incubating" in the name, etc.

## Health report:

 - A JIRA was filed from a new user.  It was found to not be an issue.
 - The last release was in January 2018 while still in incubation.  Planning
   to do a new release soon now that the project has graduated to TLP.

## PMC changes:

 - Currently 11 PMC members.
 - No new PMC members added in the last 3 months

## Committer base changes:

 - Currently 18 committers.
 - No changes (the PMC was established in the last 3 months)

## Releases:

 - No releases so far since graduating to TLP.
 - Last release during incubation was 1.3.3 on January 26th, 2018.

## JIRA activity:

 - 10 JIRA tickets created in the last 3 months
 - 11 JIRA tickets closed/resolved in the last 3 months

21 Feb 2018

Establish the Apache DataFu Project

 WHEREAS, the Board of Directors deems it to be in the best interests
 of the Foundation and consistent with the Foundation's purpose to
 establish a Project Management Committee charged with the creation and
 maintenance of open-source software, for distribution at no charge to
 the public, consisting of well-tested libraries that help developers
 solve common data problems in Hadoop and similar distributed systems.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
 (PMC), to be known as the "Apache DataFu Project", be and hereby is
 established pursuant to Bylaws of the Foundation; and be it further

 RESOLVED, that the Apache DataFu Project be and hereby is responsible
 for the creation and maintenance of libraries that help solve common
 data problems and work with large-scale data in Hadoop and similar
 distributed systems; and be it further

 RESOLVED, that the office of Vice President, Apache DataFu be and
 hereby is created, the person holding such office to serve at the
 direction of the Board of Directors as the chair of the Apache DataFu
 Project, and to have primary responsibility for management of the
 projects within the scope of responsibility of the Apache DataFu
 Project; and be it further

 RESOLVED, that the persons listed immediately below be and hereby are
 appointed to serve as the initial members of the Apache DataFu
 Project:

   * Casey Stella <cestella@apache.org>
   * Evion Kim <evion@apache.org>
   * Eyal Allweil <eyal@apache.org>
   * Jarek Jarcec Cecho <jarcec@apache.org>
   * Josh Wills <jwills@apache.org>
   * Matthew Hayes <mhayes@apache.org>
   * Mitul Tiwari <mitultiwari@apache.org>
   * Roman Shaposhnik <rvs@apache.org>
   * Russell Jurney <rjurney@apache.org>
   * Sam Shah <samshah@apache.org>
   * William Vaughan <wvaughan@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Matthew Hayes be
 appointed to the office of Vice President, Apache DataFu, to serve in
 accordance with and subject to the direction of the Board of Directors
 and the Bylaws of the Foundation until death, resignation, retirement,
 removal or disqualification, or until a successor is appointed; and be
 it further

 RESOLVED, that the initial Apache DataFu PMC be and hereby is tasked
 with the creation of a set of bylaws intended to encourage open
 development and increased participation in the Apache DataFu Project;
 and be it further

 RESOLVED, that the Apache DataFu Project be and hereby is tasked with
 the migration and rationalization of the Apache Incubator DataFu
 podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache Incubator
 DataFu podling encumbered upon the Apache Incubator Project are
 hereafter discharged.

 Special Order 7E, Establish the Apache DataFu Project, was
 approved by Unanimous Vote of the directors present.

17 Jan 2018

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Address IPMC feedback raised during graduation discussion
 2. Positive IPMC recommendation vote for graduation
 3. Continue releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 None

How has the community developed since the last report?

 One new contributor (Yuval Allweil)

How has the project developed since the last report?

 Upgraded Guava and Gradle versions. Addressed many website issues raised in
 graduation discussion.  Whimsy report now mostly green. Rat task
 automatically run as part of build. New UDFs for diffing tuples and
 computing hashes.

How would you assess the podling's maturity? Please feel free to add your own
commentary.

 [ ] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [x] Nearing graduation
 [ ] Other:

Date of last release:

 2017-03-10

When were the last committers or PPMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan Comments:
 [X](datafu) Roman Shaposhnik Comments: In my view the podling is ready to
  graduate at this point.
 [ ](datafu) Ted Dunning Comments:

IPMC/Shepherd notes:

18 Oct 2017

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Positive IPMC recommendation vote for graduation
 2. Address any IPMC feedback regarding graduation
 3. Continue releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 No new developments

How has the project developed since the last report?

 Added support for Pig macros distributed in JAR.
 Added a couple counting macros and TFIDF.
 Started graduation discussion in general list.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [ ] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [x] Nearing graduation
 [ ] Other:

Date of last release:

 2017-03-10

When were the last committers or PPMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
    Comments:
 [ ](datafu) Roman Shaposhnik
    Comments:
 [X](datafu) Ted Dunning
    Comments:

IPMC/Shepherd notes:

 johndament: The podling is attempting to graduate, however there are some concerns raised within the discussion over how big the actual PMC is.  There have also been concerns raised over the removal of committer status.

16 Aug 2017

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Positive IPMC recommendation vote for graduation
 2. Address any IPMC feedback regarding graduation
 3. Continue releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

* Community voted positively for graduation from Incubator.
* New contributor opened and fixed DATAFU-124

How has the project developed since the last report?

* Completed maturity evaluation checklist
 (https://cwiki.apache.org/confluence/display/DATAFU/Maturity+Evaluation)
* Drafted a graduation resolution
 (https://cwiki.apache.org/confluence/display/DATAFU/Graduation+Resolution)

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [ ] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [x] Nearing graduation
 [ ] Other:

Date of last release:

 2017-03-10

When were the last committers or PPMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
    Comments:
 [x](datafu) Roman Shaposhnik
    Comments:
    Best of luck to this podling's graduation. They are a small but viable community.
 [x](datafu) Ted Dunning
    Comments: Good luck to the community

IPMC/Shepherd notes:
Project is ready to graduate. I noticed a problem with their answer to QU30 on the Maturity model.
It was in part a documentation issue. Raised it with the project and they are addressing the way that
they are asking for security issues. Raised the documentation issue with ComDev and we fixed it.
Dave Fisher

19 Apr 2017

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Complete maturity evaluation checklist
 2. Draft graduation resolution
 3. Continue releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 No changes

How has the project developed since the last report?

 Version 1.3.2 was released.  This addressed an issue with released
 convenience binaries not including LICENSE, NOTICE, and DISCLAIMER
 in META-INF of JARs.  This was considered an important item to
 tackle before graduation.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [x] Initial setup
 [x] Working towards first release (released 1.3.0, 1.3.1, and 1.3.2)
 [x] Community building (4 committers added since incubation, 24
     contributors in total)
 [x] Nearing graduation (maturity evaluation is nearly complete)
 [ ] Other:

Date of last release:

 2017-03-10

When were the last committers or PPMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
    Comments:
 [x](datafu) Roman Shaposhnik
    Comments: I believe the podling is ready for graduation
 [ ](datafu) Ted Dunning
    Comments:

27 Feb 2017

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Resolve NOTICE and LICENSE issues for binary distributions
 2. Continued releases
 3. Increased committer activity

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 Received a patch from a new contributor.

How has the project developed since the last report?

 No updates

Date of last release:

 2016-08-10

When were the last committers or PPMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [ ](datafu) Ted Dunning

Shepherd/Mentor notes:

 Roman Shaposhnik:

   I really think we need to do one final push and either graduate or retire.

16 Nov 2016

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Resolve NOTICE and LICENSE issues for binary distributions
 2. Continued releases
 3. Increased committer activity

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 * No updates

How has the project developed since the last report?

 * Released 1.3.1.  Now using ASF-associated signing key.  Feedback from
   previous release addressed.
 * Website updated alongside 1.3.1 release.
 * Cleaned up release instructions.

Date of last release:

 2016-08-10

When were the last committers or PMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [ ](datafu) Ted Dunning

Shepherd/Mentor notes:

 Roman Shaposhnik:

   Pushing this community towards graduation is pretty high on my TODO list.
   I think they are as ready as they are ever going to be.

19 Oct 2016

 johndament:

   Discussions on this podling seem to have stopped completely.  There was a
     graduation discussion back in August, which seems to have dropped
     completely after some release content issues were identified.

20 Jul 2016

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Grow user and contributor base
 2. Increased committer activity
 3. Continued releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 * Eyal Allweil was voted in as the newest committer and member of the
   PPMC.

How has the project developed since the last report?

 * A new UDF provided by Eyal was committed and another was submitted for
   review.
 * ASF-associated signing key committed in prep for next release,
   addressing feedback from previous release.

Date of last release:

 2015-11-14

When were the last committers or PMC members elected?

 July 2016 (Eyal Allweil)

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [x](datafu) Roman Shaposhnik
 [x](datafu) Ted Dunning

20 Apr 2016

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Grow user and contributor base
 2. Increased committer activity
 3. Continued releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 * A new contributor opened several JIRAs regarding improvements and
   contributed patches.  Two have been committed so far.

How has the project developed since the last report?

 * Improved instructions on loading projects in Eclipse based on discussion
   in JIRA.
 * Added checks in build system to catch issues using wrong JDK version.
 * Some UDFs were improved to be more efficient.
 * A new UDF is pending review.

Date of last release:

 2015-11-14

When were the last committers or PMC members elected?

 November 2014

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [X](datafu) Ted Dunning

20 Jan 2016

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Grow user and contributor base
 2. Increased committer activity
 3. Continued releases

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 * None

How has the community developed since the last report?

 * No new activity in the community since the last report.

How has the project developed since the last report?

 * Apache DataFu 1.3.0 source release completed, which is the first release
   since entering the Incubator.  DataFu 1.3.0 was also released to Maven.
 * Website (http://datafu.incubator.apache.org/) has been updated with
   instructions on how to use the source release or artifacts from Maven.

Date of last release:

 * 2015-11-14

When were the last committers or PMC members elected?



Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [x](datafu) Ted Dunning

Shepherd/Mentor notes:

 Roman Shaposhnik (rvs):

   The community appears to be in the final stretch before graduation,
   hopefully there's enough critical mass for it to happen.

18 Nov 2015

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Do first release
 2. Grow user and contributor base
 3. Increased committer activity

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 * Performing the initial release remains the most important milestone.

How has the community developed since the last report?

 * No new activity in the community since the last report.

How has the project developed since the last report?

 * The website documentation (http://datafu.incubator.apache.org/) has been
   updated and brought up to date with the current state of the project and
   build system, making it easier for newcomers to get started.  This was
   the last major task blocking release.
 * All the release tasks filed for our first release have now been
   completed.  A discussion has been opened in the dev mailing list on the
   topic of doing our first release.  A vote will likely be held in the
   next few days.

Date of last release:

 * Not yet released.  First release will likely happen within the coming
   weeks.

When were the last committers or PMC members elected?

 * November 2014

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [X](datafu) Ted Dunning

Shepherd/Mentor notes:

19 Aug 2015

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Do first release
 2. Grow user and contributor base
 3. Increased committer activity

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 * Performing the initial release remains the most important milestone.
   However, the majority of the known tasks for this have been completed.
   The remaining tasks are related to documentation.  We are able to
   generate a signed, versioned source release from the build system.

How has the community developed since the last report?

 * A couple new contributors have submitted patches.  One of these has been
   committed and the other is nearly ready to be committed.

How has the project developed since the last report?

 * The build system has been updated to provide tasks for generating
   signed, versioned source releases.  Documentation has been updated as
   well with instructions on how to do this.
 * Apache DataFu has been updated to run against Hadoop 2.  There was an
   issue running the Hourglass integration tests against Hadoop 2, which
   had been blocking this update.
 * A couple new patches from two new contributors for Pig UDFs have been
   submitted.  One of these is an improvement to the HyperLogLog UDF
   cardinality estimator that makes it much more efficient.  The other is a
   helper for getting a tuple out of a bag.
 * A patch has been submitted for a UDF to incrementally process
   date-partitioned data in Pig.  This provides functionality similar to
   what is available in Hourglass.

Date of last release:

 * Not yet released

When were the last committers or PMC members elected?

 * November 2014

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [X](datafu) Ted Dunning

Shepherd/Mentor notes:

 Ted Dunning:

   The generation of signed releases on shared hardware has historically
   raised serious security questions from infra. I think that this process
   needs to be vetted very carefully.

21 Jan 2015

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Grow user and contributor base.
 2. Make first release.
 3. Increase activity for initial committers.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 1. Has not yet made a release, but is in process of preparing first.
 2. Need to dramatically grow the contributor base.

How has the community developed since the last report?

 New committer and PMC member.  Several JIRAs filed by new users.

How has the project developed since the last report?

 1. 16 issues created, several from new contributors.
 2. 8 issues closed.
 3. Reasonable amount of mailing list traffic.

Date of last release:

 None yet. Currently preparing release: DATAFU-53.

When were the last committers or PMC members elected?

 Nov 2014, Russell Jurney, both committer and PPMC.

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [X](datafu) Ted Dunning

15 Oct 2014

DataFu provides a collection of Hadoop MapReduce jobs and functions in higher
level languages based on it to perform data analysis. It provides functions
for common statistics tasks (e.g. quantiles, sampling), PageRank, stream
sessionization, and set and bag operations. DataFu also provides Hadoop jobs
for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Building an ASF-based community.
 2. Release.
 3. Adding support for Hadoop 2.x

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None.

How has the community developed since the last report?

 Three new users have contributed code since the last report.

How has the project developed since the last report?

 A couple more UDFs have been committed.  One bug fix was committed.  All
 JARs have been removed from the repo (a blocker for source release).  A
 build task has been added for creating a source release.  No open blockers
 for release left at this point.  Several more UDFs have been contributed but
 are still under review.

Date of last release:

 No release yet.

When were the last committers or PMC members elected?

 2014-02-22

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [ ](datafu) Ted Dunning

Shepherd/Mentor notes:

(jmclean): Did not report on time. Low level mentor activity but no obvious
issues other than missing release. (Release mentioned in last report and
DATAFU-53 blocking release has been resolved).

16 Jul 2014

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

  1. Building an ASF-based community.
  2. Release.
  3. Decide on the future home of the project.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

  None.

How has the community developed since the last report?

  Will Vaughan gave a talk on DataFu at ApacheCon in April, and
  Casey Stella gave a talk on Pig and DataFu at the Hadoop Summit in
  June.

How has the project developed since the last report?

 Lots of JIRAs on bug fixes and new features, especially in April and May.
 Work slowed significantly in June, which probably means it's time for a
 release to mark our progress thus far.

Date of last release:

  None. Six months of incubation.

When were the last committers or PMC members elected?

  2014-02-22

Signed-off-by:

  [ ](datafu) Ashutosh Chauhan
  [X](datafu) Roman Shaposhnik
  [ ](datafu) Ted Dunning

Shepherd/Mentor notes:

(jmclean): Mentor active, no obvious issues.

16 Apr 2014

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Building ASF community
 2. Release
 3. Remaining incubator paperwork

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None.

How has the community developed since the last report?

 A talk was given at an Apache Pig meetup held on March 14th.  A talk
 is scheduled to be given at ApacheCon in Denver on April 7th.  Jian
 Wang accepted the invitation to become a committer.

How has the project developed since the last report?

 Two new JIRAs have been filed and received patches.

Date of last release:

 None. Third month of incubation.

When were the last committers or PMC members elected?

 2014-02-22

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [ ](datafu) Ted Dunning

Shepherd/Mentor notes:

 Justin Mclean (jmclean):

   Relatively new podling yet to make a release. One mentor is active on
   the public mailing list; no obvious issues that need attention.

19 Mar 2014

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Building ASF community
 2. Release
 3. Remaining incubator paperwork

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None.

How has the community developed since the last report?

 More contributions have been received from Jian Wang, who has also
 been voted in as the newest committer and PPMC member.  A talk
 is planned at the Apache Pig meetup to be held on March 14th.

How has the project developed since the last report?

 Three JIRAs have been opened, four have been closed.  The project has
 migrated from Ant to the Gradle build system, which will make it easier
 to add libraries for Hive, Crunch, etc.

Date of last release:

 None. Second month of incubation.

When were the last committers or PMC members elected?

 2014-02-22

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [X](datafu) Ted Dunning

19 Feb 2014

DataFu provides a collection of Hadoop MapReduce jobs and functions in
higher level languages based on it to perform data analysis. It provides
functions for common statistics tasks (e.g. quantiles, sampling), PageRank,
stream sessionization, and set and bag operations. DataFu also provides
Hadoop jobs for incremental data processing in MapReduce.

DataFu has been incubating since 2014-01-05.

Three most important issues to address in the move towards graduation:

 1. Building ASF community
 2. Release
 3. Remaining incubator paperwork

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None.

How has the community developed since the last report?

 Since entering incubation, the project has received contributions from
 two new contributors.

How has the project developed since the last report?

 First report.  The project has obtained all the necessary infrastructure
 (git, JIRA, wiki, etc.).  Thirty JIRAs have been opened, 14 have been
 closed.  Active discussion on the mailing list about community
 development, etc.

Date of last release:

 None. First month of incubation.

When were the last committers or PMC members elected?

 None. First month of incubation.

Signed-off-by:

 [ ](datafu) Ashutosh Chauhan
 [X](datafu) Roman Shaposhnik
 [ ](datafu) Ted Dunning

Shepherd/Mentor notes:

 Dave Fisher (wave):

   New community to the Incubator, just getting started. Good guidance from
   mentors. Needs Apache trademark attribution on site. Should have links
   to mailing lists on the site.