This was extracted (@ 2024-11-19 16:10) from a list of minutes
which have been approved by the Board.
Please Note
The Board typically approves the minutes of the previous meeting at the
beginning of every Board meeting; therefore, the list below does not
normally contain details from the minutes of the most recent Board meeting.
WARNING: these pages may omit some original contents of the minutes.
Meeting times vary; the exact schedule is available to ASF Members and Officers, who can search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).
WHEREAS, the Project Management Committee of the Apache Crunch project has arrived at a consensus to recommend moving the project to the Attic; and WHEREAS, the Board of Directors deems it no longer in the best interest of the Foundation to continue the Apache Crunch project due to inactivity; NOW, THEREFORE, BE IT RESOLVED, that the Apache Crunch project is hereby terminated; and be it further RESOLVED, that the Attic PMC be and hereby is tasked with oversight over the software developed by the Apache Crunch Project; and be it further RESOLVED, that the office of "Vice President, Apache Crunch" is hereby terminated; and be it further RESOLVED, that the Apache Crunch PMC is hereby terminated. Special Order 7B, Terminate the Apache Crunch Project, was approved by Unanimous Vote of the directors present.
## Description: The mission of Crunch is the creation and maintenance of software related to Simple and Efficient MapReduce Pipelines ## Issues: We're discussing the future of the project on the PMC mailing list and could use some input from the board. ## Membership Data: Apache Crunch was founded 2013-02-19 (7 years ago) There are currently 15 committers and 12 PMC members in this project. The Committer-to-PMC ratio is 5:4. Community changes, past quarter: - No new PMC members. Last addition was Micah Whitacre on 2014-04-02. - No new committers. Last addition was Stephen Durfey on 2018-02-09. ## Project Activity: No activity since the last release in January; some COVID-19 related work and an incorrectly configured email server caused the chair to miss the last report deadline, apologies for that. ## Community Health: Things are quiet, it feels like the core of the work is mostly complete and we are talking about how best to wrap things up.
No report was submitted.
@Shane: pursue potential Attic resolution for Crunch
## Description: The mission of Crunch is the creation and maintenance of software related to building simple and efficient data pipelines on Hadoop and Spark. ## Issues: There are no issues requiring board attention. ## Membership Data: Apache Crunch was founded 2013-02-19 (7 years ago) There are currently 15 committers and 12 PMC members in this project. The Committer-to-PMC ratio is 5:4. Community changes, past quarter: - No new PMC members. Last addition was Micah Whitacre on 2014-04-02. - No new committers. Last addition was Stephen Durfey on 2018-02-09. ## Project Activity: We did our 1.0.0 release on 2019-10-24, and are currently working on a major dependency upgrade to keep Crunch compatible with our myriad upstream dependencies, likely followed quickly by yet another release so that users who need to be on Hadoop 2.8.2 and later versions can keep working: https://issues.apache.org/jira/browse/CRUNCH-692 ## Community Health: Quiet quarter, aside from the release vote and a bit of traffic related to the upgrade.
## Description: The mission of Apache Crunch is to make it easy to create and maintain large-scale data pipelines within the Apache Hadoop ecosystem. ## Issues: There are no issues requiring board attention at this time. ## Membership Data: Apache Crunch was founded 2013-02-19 (7 years ago) There are currently 15 committers and 12 PMC members in this project. The Committer-to-PMC ratio is 5:4. Community changes, past quarter: - No new PMC members. Last addition was Micah Whitacre on 2014-04-02. - No new committers. Last addition was Stephen Durfey on 2018-02-09. ## Project Activity: We are currently in the middle of the 1.0.0 release vote. Feels good to be reaching this milestone as a project. ## Community Health: It will be interesting to see how the release vote progresses and where the community wants to take the project going forward; we hope that this section will be more interesting in our next report three months from now.
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: There are no issues requiring board attention at this time. ## Activity: Some good work the past couple of months resolving some long-standing issues with S3 compatibility/utility that put us into good position for completing a release and adding new committers. ## Health report: Same structural issues as our last report; the utility of the project is primarily for developers who are using MapReduce pipelines either in local Hadoop clusters and/or migrating them to the cloud, so there isn't much new work to do besides those efforts. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 15 committers. - No new committers added in the last 3 months - Last committer addition was Stephen Durfey at Fri Feb 09 2018 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 7 JIRA tickets created in the last 3 months - 7 JIRA tickets closed/resolved in the last 3 months
## Description: - Apache Crunch is a JVM library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - Most of the work this quarter focused on integrations with Hadoop's FileSystem extensions to better support reading/writing pipeline data from/to various object stores (mainly S3.) ## Health report: - The project's current work is focused on the needs of community members who are moving MapReduce-based pipelines that ran in legacy clusters to various cloud-based Hadoop clusters, and makes sense given Hadoop's general focus on this area as well. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 15 committers. - No new committers added in the last 3 months - Last committer addition was Stephen Durfey at Fri Feb 09 2018 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 10 JIRA tickets created in the last 3 months - 6 JIRA tickets closed/resolved in the last 3 months
No report was submitted.
## Description: - Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - A few issues were filed this quarter related to extensions/fixes to better integrate with object stores (like S3) at the end of MapReduce jobs, with a patch promised from a developer who has made some small contributions to the project in the past and looks like an excellent candidate to become a committer in the near future. ## Health report: - The needs of the community are largely unchanged and are mainly focused on integrating legacy MapReduce jobs with cloud platforms that only use HDFS as a caching layer while persisting data for the long term in object storage; there isn't much other work to do on the core of the Crunch system aside from these compatibility updates and the occasional bug fix. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 15 committers. - No new committers added in the last 3 months - Last committer addition was Stephen Durfey at Fri Feb 09 2018 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 2 JIRA tickets created in the last 3 months - 0 JIRA tickets closed/resolved in the last 3 months
## Description: - Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - After the Hive and HBase work last quarter, we had a relatively quiet block of work on S3-compatibility changes and some fixes for Avro support with the Apache Spark runtime engine, along with some preparation for changes to the public/private settings of APIs in the next release of Apache HBase. ## Health report: - It was a quiet quarter, just a couple of interesting bugs that needed to be worked through. The work for the next quarter should be focused on Java upgrades (ideally to JDK11) ahead of the next major version release. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 3 JIRA tickets created in the last 3 months - 2 JIRA tickets closed/resolved in the last 3 months
## Description: - Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - The only real activity this quarter was a few small changes to improve S3 compatibility/reliability and an interesting discussion/debugging exercise on the user mailing list about an incompatibility between Crunch's Apache Spark runtime and the code for reading/writing Avro serialized data. ## Health report: - I believe the overall state of the project reflects the overall state of open-source data engineering in general, i.e., a steady move away from MapReduce-based pipelines to running pipelines on top of modern engines like Spark or the generalized APIs of a system like Apache Beam. There simply isn't much interest (or much need) to extend Crunch's functionality as opposed to providing a smooth migration off of it and on to Spark or Beam. The only real exception to this is certain extremely large workloads that cannot move to Spark or Beam for whatever reason, and for those workloads, we should be making it easy (as the rest of the Hadoop community has) to run those jobs in either an on-premise cluster or on the cloud by not assuming that data will be stored in HDFS permanently. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 15 committers. 
- No new committers added in the last 3 months - Last committer addition was Stephen Durfey at Fri Feb 09 2018 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## Mailing list activity: - dev@crunch.apache.org: - 79 subscribers (down -2 in the last 3 months): - 14 emails sent to list (51 in previous quarter) - user@crunch.apache.org: - 152 subscribers (down -1 in the last 3 months): - 36 emails sent to list (6 in previous quarter) ## JIRA activity: - 2 JIRA tickets created in the last 3 months - 1 JIRA tickets closed/resolved in the last 3 months
## Description: - Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - This quarter was relatively quiet and was primarily focused on fixing small bugs in the HBase, Kafka, and Hive integrations that Crunch provides. A number of these fixes came from Clement Mathieu, who looks like a good candidate to be a new committer on the project along with Stephen Durfey, who became a committer in February. ## Health report: - Crunch continues to move at a steady pace of commits and bug fixes, but there has not been a major push from the community to add or update the existing functionality beyond the set of things that Crunch does well already. The biggest obvious improvement to the project is an upgrade of the APIs for the HBase dependency, but that isn't necessarily the kind of work that a developer would be interested in doing for fun (as opposed to as a result of a specific need for their work and a desire to contribute that work back to the community.) ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 15 committers. - Stephen Durfey was added as a committer on Fri Feb 09 2018 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 8 JIRA tickets created in the last 3 months - 7 JIRA tickets closed/resolved in the last 3 months
## Description: - Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: - There are no issues requiring board attention at this time. ## Activity: - The big push this quarter was to upgrade the Hive dependencies for Crunch in order to add a new HCatalog module for reading and writing data to the Hive metastore. This was one of the major upgrades to the project that we wanted to get done before the 1.0 release, and we're working on getting the HBase version upgrades and fixes into the mainline of the codebase now. ## Health report: - Acting on the feedback we received after our last report, the PMC recently voted to add the primary developer of the Hive/HCatalog functionality as a new committer on the project, and are reaching out to him now to kick off the process with the ASF. We're optimistic about adding another new committer for the HBase work in the next quarter. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 4 JIRA tickets created in the last 3 months - 5 JIRA tickets closed/resolved in the last 3 months
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: There are no issues requiring board attention at this time. ## Activity: Aside from a few small bug fixes, the main focus of development/discussion in the last quarter was how to execute the series of upgrades on our dependencies (mainly Apache Hadoop, HBase, and Hive) that we need to complete the 1.0 release. We need to do some heavy lifting and compatibility-breaking changes to the HBase module in order to make the move to HBase 2.x, which is necessary to make the move to Hive 2.x. The JIRA issue here tracks this: https://issues.apache.org/jira/browse/CRUNCH-659 ## Health report: The fact that we're no longer really working on new features or functionality in favor of bug fixes and version upgrades is the most significant issue with the health of the project. The good news is that a number of new contributors have been driving the version upgrade effort, and several of them clearly have the potential to become committers and PMC members once this work is done. The biggest need (and where I have fallen short as PMC chair) is providing them with guidance, feedback, and support for their efforts so that they can complete this work and earn their committerships. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 8 JIRA tickets created in the last 3 months - 2 JIRA tickets closed/resolved in the last 3 months
No report was submitted.
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: There are no issues requiring board attention at this time. ## Activity: Activity since the most recent release (February 2017) has been focused on upgrading the versions of major dependencies (especially HBase and Spark) in preparation for a 1.0 release, which will be synced with the latest and greatest from downstream projects and will allow us to clean up some deprecated parts of the API. ## Health report: ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - Last release was 0.15.0 on Sat Feb 25 2017 ## JIRA activity: - 8 JIRA tickets created in the last 3 months - 2 JIRA tickets closed/resolved in the last 3 months
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: There are no issues requiring board attention at this time. ## Activity: Activity since the most recent release (February 2017) has been focused on upgrading the versions of major dependencies (especially HBase and Spark) in preparation for a 1.0 release, which will be synced with the latest and greatest from downstream projects and will allow us to clean up some deprecated parts of the API. ## Health report: Although the current focus of the project is good and useful, the question now is what to do after the 1.0 release is complete, which brings us back to broader questions about the future of Crunch and how it should relate to similar top-level projects like Apache Beam. We'll begin this conversation in earnest on the dev mailing list once the 1.0 release is finished. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - 0.15.0 was released on Sat Feb 25 2017 ## JIRA activity: - 10 JIRA tickets created in the last 3 months - 10 JIRA tickets closed/resolved in the last 3 months
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: There are no issues requiring board attention at this time. ## Activity: The JIRAs for the past few months have focused on core bug fixes and iteration/fixes on the Apache Kafka support that we started on after our last release in May. ## Health report: The Crunch code does what it does well, and it has for at least a few releases. Beyond bug fixes and supporting upgrades to new Hadoop/Spark releases, there isn't an obvious new direction to take the project in that would stay true to its original mission while remaining useful to developers. In terms of project goals, API design, and even a subset of committers, Crunch has a lot in common with the newly top-level Apache Beam project, which is focused on the next generation of data processing engines that unify batch and streaming use cases into a single API. Finding a way to join forces with Beam is one available way forward for the project, but figuring out what that move would look like would require some extensive discussions on the mailing lists, both about the future of data pipelines in general as well as the role that the Crunch community most wants to play. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - Last release was 0.14.0 on Wed May 04 2016 ## JIRA activity: - 9 JIRA tickets created in the last 3 months - 7 JIRA tickets closed/resolved in the last 3 months
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: We had a report this quarter that a file that was used in some of our tests contained content that had an unclear copyright status outside of the US (maugham.txt, containing a sampling of work from W. Somerset Maugham.) We removed this file from our source repo and updated the tests that made use of it, as discussed in this JIRA issue: https://issues.apache.org/jira/browse/CRUNCH-616 If the board feels that any additional action is required here, please let the PMC know and we will take it. ## Activity: We had a normal amount of activity this month, primarily focused on performance and debugging for very large MapReduce pipelines executed by Crunch and improvements to the new Kafka Streams-based pipeline executor. ## Health report: Work proceeds apace to improve what Crunch does well (executing large and complex MapReduce pipelines), but as MapReduce gradually declines in use and is replaced by Apache Spark as the execution engine of choice, we expect that patches will come in more slowly and be primarily focused on fixing bugs as opposed to adding new functionality. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - 0.14.0 was released on Fri May 06 2016 ## JIRA activity: - 10 JIRA tickets created in the last 3 months - 12 JIRA tickets closed/resolved in the last 3 months
## Description: Apache Crunch is a Java library for writing, testing, and running MapReduce and Apache Spark pipelines on Apache Hadoop. ## Issues: There are no issues requiring board attention at this time. ## Activity: There has been the normal bug fixing work after our most recent release in May, along with some new work to explore using Apache Kafka as a data source in Crunch pipelines that should make a good foundation for our next major release. ## Health report: Things are generally good: the core library does what it was designed to reasonably well, bugs are reported and addressed in a timely manner, and we have some new and interesting development avenues to explore in leveraging Crunch as a way of doing simplified stream processing without requiring the deployment of more heavyweight frameworks like Apache Storm or Apache Spark's streaming engine. ## PMC changes: - Currently 12 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Micah Whitacre on Wed Apr 02 2014 ## Committer base changes: - Currently 14 committers. - No new committers added in the last 3 months - Last committer addition was David Whiting at Mon Nov 30 2015 ## Releases: - 0.14.0 was released on Fri May 06 2016 ## JIRA activity: - 9 JIRA tickets created in the last 3 months - 4 JIRA tickets closed/resolved in the last 3 months
WHEREAS, the Board of Directors heretofore appointed Micah Whitacre (mkwhit) to the office of Vice President, Apache Crunch, and WHEREAS, the Board of Directors is in receipt of the resignation of Micah Whitacre from the office of Vice President, Apache Crunch, and WHEREAS, the Project Management Committee of the Apache Crunch project has chosen by vote to recommend Josh Wills (jwills) as the successor to the post; NOW, THEREFORE, BE IT RESOLVED, that Micah Whitacre is relieved and discharged from the duties and responsibilities of the office of Vice President, Apache Crunch, and BE IT FURTHER RESOLVED, that Josh Wills be and hereby is appointed to the office of Vice President, Apache Crunch, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed. Special Order 7I, Change the Apache Crunch Project Chair, was approved by Unanimous Vote of the directors present.
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving at a slightly slower pace than the previous quarter, with 13 new JIRAs being created since the previous board report and 11 (10 new + 1 old) issues being resolved in that time. The majority of the work on the project continues to focus on maintenance such as bug fixes, but also on improvements to the Java 8 Lambda support and HBase efficiencies. There are no board-level issues at this time. Community --------- Community activity slowed but remained steady. The user mailing list activity has slowed (1 question every 3 days). Over the last reporting period the activity on the developer mailing list has remained steady (2 per day). David Whiting was added as a committer on Dec 2nd, 2015. Josh Wills was re-elected as PMC Chair on April 10, 2016, and a resolution was sent to the board. Releases -------- * Apache Crunch 0.13.0 was released August 5, 2015
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving at a slightly slower pace than the previous quarter, with 19 new JIRAs being created since the previous board report and 18 (11 new + 7 old) issues being resolved in that time. The majority of the work on the project continues to focus on maintenance such as bug fixes but also improvements to Scala and Spark. A significant design and implementation effort for Java 8 Lambda API support has been underway. There are no board-level issues at this time. Community --------- Community activity continues to be similar to the previous reporting period. The user mailing list has maintained a steady rate of questions and answers (1 per day). Over the last reporting period the activity on the developer mailing list has increased (2 per day). Micah Whitacre was added to the PMC on April 3rd, 2014. David Whiting was added as a committer on Dec 2nd, 2015. Releases -------- * Apache Crunch 0.13.0 was released August 5, 2015
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving at a similar pace to the previous quarter, with 34 new JIRA issues logged since the previous board report and 24 (23 new + 1 old) issues being resolved in that time. The majority of the work on the project continues to focus on maintenance such as bug fixes, with a heavier focus on rounding out HBase support, building out better Spark support, and support for Java 8 Lambdas. The project also successfully released version 0.13.0 this quarter, which featured 27 issues. The release focussed on several bug fixes, but the major effort was to upgrade to HBase 1.0 and remove support for Hadoop 1.0. There are no board-level issues at this time. Community --------- Community activity continues to be similar to the previous reporting period. The user mailing list has maintained a steady rate of questions and answers (1 per day). Over the last reporting period the activity on the developer mailing list has increased (2 per day). Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013. Releases -------- * Apache Crunch 0.13.0 was released August 5, 2015
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving at a similar pace to the previous quarter, with 33 new JIRA issues logged since the previous board report and 24 (22 new + 2 old) issues being resolved in that time. The majority of the work on the project continues to focus on maintenance such as bug fixes but also improvements to upgrade the technology stack such as HBase and Java. The project also successfully released version 0.12.0 this quarter, which featured 35 issues. The release focussed on several bug fixes but also improvements to the project's Scala and Spark support. There are no board-level issues at this time. Community --------- Community activity continues to be similar to the previous reporting period. The user mailing list has maintained a steady rate of questions and answers (1 per day). Over the last reporting period the activity on the developer mailing list has increased (2 per day). Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013. Releases -------- * Apache Crunch 0.12.0 was released May 8, 2015
WHEREAS, the Board of Directors heretofore appointed Gabriel Reid to the office of Vice President, Apache Crunch, and WHEREAS, the Board of Directors is in receipt of the resignation of Gabriel Reid from the office of Vice President, Apache Crunch, and WHEREAS, the Project Management Committee of the Apache Crunch project has chosen by vote to recommend Micah Whitacre as the successor to the post; NOW, THEREFORE, BE IT RESOLVED, that Gabriel Reid is relieved and discharged from the duties and responsibilities of the office of Vice President, Apache Crunch, and BE IT FURTHER RESOLVED, that Micah Whitacre be and hereby is appointed to the office of Vice President, Apache Crunch, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed. Special Order 7C, Change the Apache Crunch Project Chair, was approved by Unanimous Vote of the directors present.
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving at a similar pace to the previous quarter, with 20 new JIRA issues logged since the previous board report and 15 of them being closed in that time. The majority of the work on the project continues to focus on maintenance. The addition of a new committer was successfully voted on, but the person in question decided to make a major career change at the same moment, which was an unfortunate loss for the Crunch community. This report also marks the resignation (due to the term of one year being up) of Gabriel Reid, the current PMC chair. There has been a successful vote to recommend Micah Whitacre as the new PMC chair. There are no board-level issues at this time. Community --------- Community activity continues to be similar to the previous reporting period. The user mailing list was more active, with an average of more than one message per day (nearly double that of the previous reporting period), while the dev mailing list activity has dropped slightly in comparison with the previous reporting period. Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013. Releases -------- There were no releases made in this quarter. The last releases were: * Apache Crunch 0.11.0, released Sept 10, 2014 * Apache Crunch 0.8.4, released Sept 13, 2014
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving at a slightly slower pace than in past quarters. Since the last board report there have been 15 new issues logged in Jira, with 8 of them being closed in that time. Similar to the previous couple of board reports, the majority of recent work has been focused on minor improvements and bug fixes. A particularly interesting recent jira ticket was the donation of a number of Crunch utilities from Spotify (CRUNCH-484). Spotify also posted an interesting blog post about how they currently use Crunch for analytics pipelines [1]. There are no board-level issues at this time. Community --------- Community activity has been similar, although slightly lower, in comparison with recent quarters, with an average of several mails on the developer list per day and an average of a message every two or three days on the user list. Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013. Releases -------- There were no releases made in this quarter. 1. https://labs.spotify.com/2014/11/27/crunch/
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving along at a steady pace in the past quarter, at a similar velocity to the previous two quarters, although in the final weeks before this report things have quieted down quite a bit. Since the last board report there have been 32 new issues logged in Jira, with 27 of them being closed in that time. Similar to the previous board report, the majority of recent work has been focused on minor improvements and bug fixes. There was a minor hiccup in releasing version 0.11.0, when the released Maven artifacts were correctly pushed to Nexus, but not synced to Maven Central. This was noticed by a user (reported on the user mailing list), and turned out to be an infrastructure issue resolved in INFRA-8333. There are no board-level issues at this time. Community --------- Community activity has continued to be in line with recent quarters, with an average of several mails on the developer list per day and an average of a message every day or two on the user list. Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013. Releases -------- There were two releases made in this quarter: * 0.11.0 was released on September 2nd, 2014 * 0.8.4 was released on September 13th, 2014
Apache Crunch is a Java library for writing, testing, and running MapReduce and Spark pipelines on Apache Hadoop. Project Status -------------- The project has been moving along at a steady pace in the past quarter, at a similar velocity to the previous two quarters. Since the last board report there have been 53 new issues logged in Jira, with 44 of them being closed in that time. The majority of recent work has been focused on minor improvements and bug fixes, and there have also been quite a few tickets related to improvements in Scrunch (the Scala API for Crunch). There are no board-level issues at this time. Community --------- Community activity has continued to be in line with recent quarters, with an average of several mails on the developer list per day and an average of a message every day or two on the user list, and first-time contributions from new contributors every few weeks. Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013. Releases -------- The last two releases (0.10.0 and 0.8.3) were both made on June 9th 2014.
WHEREAS, the Board of Directors heretofore appointed Josh Wills to the office of Vice President, Apache Crunch, and WHEREAS, the Board of Directors is in receipt of the resignation of Josh Wills from the office of Vice President, Apache Crunch, and WHEREAS, the Project Management Committee of the Apache Crunch project has chosen by vote to recommend Gabriel Reid as the successor to the post; NOW, THEREFORE, BE IT RESOLVED, that Josh Wills is relieved and discharged from the duties and responsibilities of the office of Vice President, Apache Crunch, and BE IT FURTHER RESOLVED, that Gabriel Reid be and hereby is appointed to the office of Vice President, Apache Crunch, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed. Special Order 7B, Change the Apache Crunch Project Chair, was approved by Unanimous Vote of the directors present.
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. General: The Crunch community has been steadily fixing issues for the last quarter, with 48 issues filed and fixed since the last release in December 2013, which means it's just about time for a new release. Since our last report we dramatically improved the quality and depth of the user guide [1] and getting started information [2] for the project on our website, and we have a proposal for the board to approve a new PMC chair for the project. We have also added one new PMC member since our last report. [1] http://crunch.apache.org/user-guide.html [2] http://crunch.apache.org/getting-started.html Releases: Last releases were 0.9.0 and 0.8.2, both made on December 17th, 2013. Community: Micah Whitacre was added to the PMC on April 3rd, 2014. Micah Whitacre was added as a committer on July 11th, 2013.
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. General: The Crunch community had a large number of releases in the last quarter, primarily focused on updating the libraries to work against the major releases of Apache Hadoop (2.2.0) and Apache HBase (0.96) that came out in the past quarter. 42 issues were created and 40 issues were resolved over this period, including bug fixes, new features, and support for a new Hadoop-based execution engine that is currently in the incubator, Apache Spark (incubating). Releases: The 0.9.0 release was made on December 17th, 2013. The 0.8.2 release was made on December 17th, 2013. The 0.8.1 release was made on November 20th, 2013. The 0.8.0 release was made on November 8th, 2013. Community: Chao Shi was added to the PMC on August 20th, 2013. Micah Whitacre was added as a committer on July 11th, 2013.
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. General: The Crunch community continues to develop features and fix bugs at a healthy pace: there are currently 28 JIRA issues that have been resolved since the 0.7.0 release, with input from 10 different contributors. Given our current rate of issue resolution, I believe that the community will vote to create a new release within the next few weeks. Releases: The 0.7.0 release was made on July 25, 2013. The 0.6.0 release was made on May 13th, 2013. Community: Chao Shi was added to the PMC on August 20th, 2013, our first new PMC member since leaving the Incubator. Micah Whitacre was added as a committer on July 11th, 2013.
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. Issues: There are no issues requiring board attention at this time. Releases: We made one release last quarter, version 0.6.0 on May 13th, 2013. Work is currently underway on version 0.7.0. Community: The Crunch PMC voted to add Micah Whitacre as a committer on the project and he has accepted. There have been no changes to the PMC since becoming a TLP in February 2013. Activity on the development list has been at a steady cadence for the last quarter, with new issues and patches being submitted by a diverse set of new and veteran contributors almost every day. Eli Collins gave a talk about using the Crunch libraries with Apache Avro to build applications on top of Apache Hadoop at QCon in June 2013. [1] [1] http://s.apache.org/Kr4 (PDF)
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. Issues: There are no issues requiring board attention at this time. Releases: We are currently holding a vote for our 0.6.0 release, our first release since leaving the incubator at the end of February. We have received three +1 votes for the current release candidate from PMC members and expect the vote to pass when voting closes in a couple of days. Community & Development: No new PMC members or committers have been added since our report last month, when we added two new committers. We had a tough month on the dev list, primarily due to a strange and somewhat random Java compiler error that caused Crunch builds to fail consistently in some environments but not others, which was frustrating to debug and caused lots of Jenkins failures. We believe that we have resolved these issues with the latest release candidate and are looking forward to our next release and to getting back to working on new features and bug fixes.
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. ISSUES * There are no issues to raise with the board. COMMUNITY * We added two new committers to the Crunch project since our last board report. These are our first new committers since graduation. * Activity on the dev mailing list is stable and healthy. The user mailing list saw some questions from new users and roughly the same number of threads, but the overall volume of messages fell by about half. * We completed an overhaul of the Crunch website and added an About page. * Two Crunch PMC members are working on third-party projects that build on Crunch. Cloudera ML [1] integrates Apache Hive, Apache Mahout, and Crunch to perform data preparation and model evaluation tasks on Apache Hadoop. Also, work began to integrate Crunch with ElasticSearch's Hadoop libraries. [2] RELEASES * No new releases since our February 2013 release just before graduation. We expect to perform our first TLP release within the next few weeks. [1] http://github.com/cloudera/ml [2] http://github.com/tzolov/elasticsearch-hadoop#crunch
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines on Apache Hadoop. MILESTONES We completed our most recent release (0.5.0-incubating) on 2/19/13, just before we left the Incubator. ACTIVITY * Most of the Incubator transfer procedures have been completed. We still have our old release directory at the Incubator, but we plan to update that upon our next release. * The PMC is currently voting on a set of bylaws for the project modeled after the bylaws of the Apache ZooKeeper project with some small tweaks based on the bylaws of the Apache Pig project. * Nine JIRAs have been resolved since our most recent release, primarily small bug fixes. One major feature was adding the ability to start and monitor a MapReduce pipeline asynchronously, which was contributed by a new developer on the project. COMMUNITY * 50 subscribers to the dev mailing list, 62 subscribers to the user mailing list. * There have been no changes to the PMC or committer composition since our recent graduation, although the PMC is currently holding a vote on adding two new committers. * The user mailing list has seen small but steady traffic in the form of questions and requests from Crunch users. INFRASTRUCTURE * The major TLP creation tasks are completed, but we have one outstanding issue from the move from the Incubator: our GitHub mirror hasn't been updated to reflect the new repo name. This is tracked in INFRA-5933. * The website has been updated to reflect the project's new status as a TLP. LEGAL No known issues. BRANDING No known issues.
WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software, for distribution at no charge to the public, related to the development of Java libraries for writing, testing, and running MapReduce pipelines. NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the "Apache Crunch Project", be and hereby is established pursuant to Bylaws of the Foundation; and be it further RESOLVED, that the Apache Crunch Project be and hereby is responsible for the creation and maintenance of software related to development of Java libraries for writing, testing, and running MapReduce pipelines; and be it further RESOLVED, that the office of "Vice President, Apache Crunch" be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache Crunch Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache Crunch Project; and be it further RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache Crunch Project: * Brock Noland <brock@apache.org> * Christian Tzolov <tzolov@apache.org> * Gabriel Reid <greid@apache.org> * Josh Wills <jwills@apache.org> * Kiyan Ahmadizadeh <kiyan@apache.org> * Matthias Friedrich <mafr@apache.org> * Rahul Sharma <rsharma@apache.org> * Robert Chu <robertchu@apache.org> * Tom White <tomwhite@apache.org> * Vinod Kumar Vavilapalli <vinodkv@apache.org> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Josh Wills be appointed to the office of Vice President, Apache Crunch, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further RESOLVED, that the initial Apache Crunch PMC be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Apache Crunch Project; and be it further RESOLVED, that the Apache Crunch Project be and hereby is tasked with the migration and rationalization of the Apache Incubator Crunch podling; and be it further RESOLVED, that all responsibilities pertaining to the Apache Incubator Crunch podling encumbered upon the Apache Incubator Project are hereafter discharged. Special Order 7B, Establish the Apache Crunch Project, was approved by Unanimous Vote of the directors present.
Crunch is a Java library for writing, testing, and running pipelines of MapReduce jobs on Apache Hadoop. Crunch has been incubating since 2012-05-26. Three most important issues to address in the move towards graduation: * None Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? * Nothing that currently requires IPMC attention. How has the community developed since the last report? The Apache Crunch development team released version 0.4.0-incubating in November, the second release at the Apache Incubator. We have worked with the Apache BigTop project and our release is now part of Apache BigTop 0.5.0. For our next release, we have discussed and agreed on some large-scale API cleanup and implemented the necessary changes. We have performed the podling name search, and the name Apache Crunch has been approved by the trademarks team. This was our last blocker for graduation; we have already started a vote on a graduation resolution within the community and expect to start the vote on incubator-general in February. Development activity around Christmas was a bit lower than usual but is now picking up again. There has been a significant increase in traffic on crunch-user; it is great to see that more and more users show up, file bug reports and contribute patches or test cases. How has the project developed since the last report? - 53 issues were created on the Crunch JIRA in November to January, 39 issues have been resolved - crunch-dev has seen 615 emails in the reporting period, while 126 emails were posted to crunch-user Signed-off-by: Arun Murthy: [ ](crunch) Patrick Hunt: [X](crunch) Tom White: [ ](crunch) Shepherd notes:
Crunch is a Java library for writing, testing, and running pipelines of MapReduce jobs on Apache Hadoop. Crunch entered incubation on 2012-05-29. The most important steps towards graduation: - Create another release or two - Perform the name search Nothing that currently requires IPMC attention. Community: The Crunch community has been very active and continues to grow. Two new committers have been voted in and one existing committer has joined the PPMC. As a result, Crunch now has 10 committers from 7 different organizations. We created our first release in September and published a website using the Apache CMS a few days later. Our second release will follow in November. Development: - 76 issues were created on the Crunch JIRA in August to October, 70 of those were resolved. - crunch-dev has been active: 922 emails in the reporting period - Apache CMS and ReviewBoard for Crunch are up and running - All ICLAs are in place, including those for the new committers Signed-off-by: tomwhite, jukka
Crunch is a Java library for writing, testing, and running pipelines of MapReduce jobs on Apache Hadoop. Crunch entered incubation on 2012-05-29. The most important steps towards graduation: - Infrastructure setup (CMS for the Crunch website) - Add new committers - Create a release Nothing that currently requires IPMC attention. Community: The Crunch developer community continues to grow. The project received code submissions from six new developers representing five distinct organizations in the month of July. One of the new developers made such substantial contributions to the design and testability of the Crunch code base that the PPMC voted to add him as a committer, increasing the number of distinct organizations on the committer list from four to five. We look forward to adding new committers from our pool of contributors, and have also added documentation to the wiki to explain to new contributors how to get started with the project. Development: - 29 issues were created on the Crunch JIRA in the month of July, 23 of those were resolved. - All ICLAs are in place, including the one for the committer the project just added. - Cloudera submitted the software grant documents to the Apache Secretary on 2012-07-11, and the Secretary registered the grant the same day. - crunch-dev has been active: 308 emails on the list in July. Signed-off-by: jukka
Crunch is a Java library for writing, testing, and running pipelines of MapReduce jobs on Apache Hadoop. Crunch entered incubation on May 27, 2012. The most important steps towards graduation: - Infrastructure setup (JIRA, Confluence, etc.) - CCLA licensing of the existing Crunch code - Adding new contributors - Creating a release Nothing that currently requires IPMC attention. Community: The developer mailing list has been very active with bug fixes, new features, and discussions of infrastructure setup and project policies, both from the existing committers and other developers with an interest in the project. The first patch from a non-committer is currently being prepared for submission: the code is written, but we were blocked on getting JIRA set up so that the copyright on the code could cleanly be assigned to the ASF. The JIRA issues were resolved earlier this week. All ICLAs are in place. Cloudera has gathered all of the copyright assignments for the existing Crunch code from non-Cloudera developers and is preparing the CCLA to assign the copyrights on the existing Crunch code to the ASF. Development: The 15 commits on the project this month were primarily for documentation and bug fixes, although we are evaluating two larger patches that bring additional functionality to the library: 1) adding map-side joins and 2) supporting interactive pipeline creation and execution via the Scala REPL. Signed off by mentor: phunt, tomwhite
Crunch is a Java library for writing, testing, and running pipelines of MapReduce jobs on Apache Hadoop. Crunch entered incubation on May 27, 2012. Community - Mailing lists have been created. - New committer accounts are being created, some pending ICLAs. - The Incubator status page has been created. Issues Before Graduation - Create Confluence instance. - Create JIRA issue tracker (CRUNCH) - Migrate code to Apache Git repository from Cloudera's GitHub repository. - Create Crunch website. - Make an incubating release. - Grow the size and diversity of the community. Licensing and other issues Work to obtain CCLA from Cloudera regarding license grant for existing Crunch GitHub repository is underway. Signed off by mentor: phunt