This was extracted (@ 2024-10-16 22:10) from a list of minutes
which have been approved by the Board.
Please Note
The Board typically approves the minutes of the previous meeting at the
beginning of every Board meeting; therefore, the list below does not
normally contain details from the minutes of the most recent Board meeting.
WARNING: these pages may omit some original contents of the minutes.
Meeting times vary; the exact schedule is available to ASF Members and Officers. Search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).
Report was filed, but display is awaiting the approval of the Board minutes.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (9 years ago)
There are currently 49 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:2.

- No new PMC members. Last addition was Pavan Lanka on 2023-03-30.
- Shaoyun Chen was added as committer on 2024-05-14.
- Yuanping Wu was added as committer on 2024-05-14.

## Project Activity:
According to our release cadence, we released two maintenance releases in this quarter and helped other Apache communities use them.

- 1.8.7 was released on 2024-04-14.
- 2.0.1 was released on 2024-05-14.

In addition, we are preparing the following milestones for the next quarter.

- 1.9.4 (July)
- 2.0.2 (August)
- 1.7.11 (September)

## Community Health:
In this quarter, traffic on the dev, issues, and user mailing lists decreased by 29%, 44%, and 75% respectively. Activities have been slowing down due to the season. However, we are looking to return to normal soon.
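The 7:2 figure above is what you get by reducing the reported 49 committers and 14 PMC members to lowest terms. A minimal Python sketch of that arithmetic follows; the helper name `reduce_ratio` is hypothetical, and the sketch does not try to reproduce whatever rounding yields the "roughly 3:1" values in other reports.

```python
from math import gcd


def reduce_ratio(committers: int, pmc_members: int) -> str:
    """Reduce a committer/PMC count pair to lowest terms (hypothetical helper)."""
    divisor = gcd(committers, pmc_members)
    return f"{committers // divisor}:{pmc_members // divisor}"


# Counts taken from the report above.
print(reduce_ratio(49, 14))
```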
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (9 years ago)
There are currently 47 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

- No new PMC members. Last addition was Pavan Lanka on 2023-03-30.
- Deshan Xiao was added as committer on 2024-01-11.

## Project Activity:
In addition to our current distribution channels, we have begun to support two additional channels, ConanCenter and vcpkg.

According to our release cadence, we released one major release and one maintenance release in this quarter and helped other Apache communities use them.

- 1.9.3 was released on 2024-03-21.
- 2.0.0 was released on 2024-03-08.

In addition, we are preparing the following milestones for the next quarter.

- 1.8.7 (April)
- 2.0.1 (May)
- 1.9.4 (June)

## Community Health:
In this quarter, traffic on the dev, issues, and user mailing lists increased by 36%, 86%, and 300% respectively. According to the commit mailing list, the number of commits increased by 59%, which is a good sign for community growth.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (9 years ago)
There are currently 46 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

- No new PMC members. Last addition was Pavan Lanka on 2023-03-30.
- No new committers. Last addition was Xin Zhang on 2023-02-06.

We are continuing to encourage community engagement, and we have voted on and invited a new committer candidate.

## Project Activity:
We started the Apache ORC Format repository, which includes the ORC specifications and a ProtocolBuffer file. We released version 1.0.0 and successfully migrated the ORC repository.

- https://github.com/apache/orc-format
- ORC-1572: Use ORC Format 1.0.0

According to our release cadence, we released three maintenance releases in this quarter and helped other Apache communities use them.

- 1.9.2 was released on 2023-11-10.
- 1.8.6 was released on 2023-11-10.
- 1.7.10 was released on 2023-11-10.

In addition, we are preparing the following milestones in 2024.

- 2.0.0 (January)
- 1.9.3 (March)
- 1.8.7 (April)
- 1.7.11 (September)

## Community Health:
In this quarter, traffic on the dev and issues mailing lists increased by 33% and 6% respectively. The number of code contributors also increased by 33%, which is a good sign for community growth.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (8 years ago)
There are currently 46 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

- No new PMC members. Last addition was Pavan Lanka on 2023-03-30.
- No new committers. Last addition was Xin Zhang on 2023-02-06.

## Project Activity:
According to our release cadence, we released two maintenance releases in this quarter and helped other Apache communities use them.

- 1.9.1 was released on 2023-08-16.
- 1.8.5 was released on 2023-09-05.

In addition, we are preparing the following milestones.

- 2.0.0 (January)
- 1.9.2 (November)
- 1.8.6 (December)
- 1.7.10 (November)

## Community Health:
In this quarter, activities have slowed down overall due to the summer season. We are preparing for the planned releases ahead and looking to return to normal soon.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (8 years ago)
There are currently 46 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

- No new PMC members. Last addition was Pavan Lanka on 2023-03-30.
- No new committers. Last addition was Xin Zhang on 2023-02-06.

## Project Activity:
According to our release cadence, we released one minor and two maintenance releases in this quarter and helped other Apache communities use them.

- 1.9.0 was released on 2023-06-28.
- 1.8.4 was released on 2023-06-14.
- 1.7.9 was released on 2023-05-07.

In addition, we are preparing the following milestones.

- 2.0.0 (January)
- 1.9.1 (August)
- 1.8.5 (September)
- 1.7.10 (November)

## Community Health:
In this quarter, the number of closed GitHub and Jira issues increased by 47% and 160% respectively, and the number of code contributors increased by 13%.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (8 years ago)
There are currently 47 committers and 15 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- Pavan Lanka was added to the PMC on 2023-03-30
- Xin Zhang was added as committer on 2023-02-06

In addition, there are two member status changes inside the ORC PMC.

- Alan Gate announced his resignation on 2023-04-07
- Dongjoon Hyun became an ASF Member in 2023-03

## Project Activity:
According to our release cadence, we released two maintenance releases in this quarter and helped Apache Spark and Iceberg use them.

- 1.8.3 was released on 2023-03-15.
- 1.7.8 was released on 2023-02-14.
- 1.8.2 was released on 2023-01-13.

In addition, we are preparing the following milestones. Gang Wu volunteered as a new release manager.

- 1.7.9 (May, Release Manager: Gang Wu)
- 1.8.4 (June)
- 1.9.0 (September)

## Community Health:
There was the following comment in the last report.

> rbowen: I would love to hear more about your initiative of "helping candidates."

We have added one new committer and promoted one new PMC member.

In this quarter, the user/dev/issues mailing lists saw 520%/51%/24% increases in traffic, respectively, and the number of code contributors increased by 33%. This is a good sign of a healthier community.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (8 years ago)
There are currently 46 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- No new PMC members. Last addition was Yiqun Zhang on 2022-05-06.
- No new committers. Last addition was Pavan Lanka on 2022-05-23.

## Project Activity:
According to our release cadence, we released one feature release and one maintenance release in this quarter.

- 1.8.1 was released on 2022-12-02.
- 1.7.7 was released on 2022-11-17.

We have prepared an SBOM (Software Bill of Materials).

- https://cwiki.apache.org/confluence/display/COMDEV/SBOM

We are looking to release the following versions soon in order to get user feedback on the SBOM. We have adjusted the release date to this month.

- 1.7.8 (January)
- 1.8.2 (January)

## Community Health:
In this quarter, the number of code contributors increased by 20%, and closed GitHub issues increased by 27%. Activities have slowed down due to the holiday season, but we are looking to return to normal soon.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (7 years ago)
There are currently 46 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- William Hyun became the new Chair of Apache ORC PMC on 2022-09-21.
- No new PMC members. Last addition was Yiqun Zhang on 2022-05-06.
- No new committers. Last addition was Pavan Lanka on 2022-05-23.

## Project Activity:
According to our release cadence, we released one feature release and one maintenance release in this quarter.

- 1.8.0 was released on 2022-09-03.
- 1.7.6 was released on 2022-08-17.

In addition, we are preparing the following milestones.

- 1.7.7 (November)
- 1.8.1 (December)

## Community Health:
In this quarter, the number of GitHub issues opened and closed increased by 20% and 22% respectively. In addition, the number of PRs increased slightly.
WHEREAS, the Board of Directors heretofore appointed Dongjoon Hyun (dongjoon) to the office of Vice President, Apache ORC, and WHEREAS, the Board of Directors is in receipt of the resignation of Dongjoon Hyun from the office of Vice President, Apache ORC, and WHEREAS, the Project Management Committee of the Apache ORC project has chosen by vote to recommend William Hyun (william) as the successor to the post; NOW, THEREFORE, BE IT RESOLVED, that Dongjoon Hyun is relieved and discharged from the duties and responsibilities of the office of Vice President, Apache ORC, and BE IT FURTHER RESOLVED, that William Hyun be and hereby is appointed to the office of Vice President, Apache ORC, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed. Special Order 7B, Change the Apache ORC Project Chair, was approved by Unanimous Vote of the directors present.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

In response to the following comment on the last report, the Apache ORC PMC explicitly updated the GitHub notification settings for pull requests, issues, and commits in order to archive all activities to the ASF mailing lists.

> rbowen: Do you find that the decreased mailing list activity is offset by
> discussion in the GitHub issues, or has that discussion simply gone away?

## Membership Data:
Apache ORC was founded 2015-04-21 (7 years ago)
There are currently 46 committers and 14 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- Yiqun Zhang was added to the PMC on 2022-05-06
- Pavan Lanka was added as committer on 2022-05-23

## Project Activity:
First, in accordance with the new ASF data privacy policy, the Apache ORC PMC added a link to the ASF Data Privacy Policy on the Apache ORC website front page and also double-checked that the Apache ORC website has no custom privacy policy and no Google Analytics.

Second, according to our release cadence, we released three maintenance releases in this quarter and helped the Apache Spark, Iceberg, and Arrow projects use them.

- 1.7.5 (2022-06-16)
- 1.7.4 (2022-04-15)
- 1.6.14 (2022-04-14)

In addition, we are preparing the following milestones.

- 1.7.6 (August)
- 1.8.0 (September)
- 1.6.15 (September)

1.6.15 is the End-Of-Support release, coming 3 years after 1.6.0 was released in September 2019.

## Community Health:
In this quarter, GitHub PR open/close activities increased by 32% and 47% respectively, and commit activity increased by 22%. JIRA issue open/close activities also increased slightly.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (7 years ago)
There are currently 45 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- No new PMC members. Last addition was William Hyun on 2021-09-30.
- Quanlong Huang was added as committer on 2022-03-04.
- One invited candidate has been working on ICLA/CCLA.

## Project Activity:
According to our release cadence, we released two maintenance releases in this quarter and helped the Apache Spark, Iceberg, and Arrow projects use them.

- 1.6.13 (2022-01-20)
- 1.7.3 (2022-02-09)

In addition, we are actively preparing two more releases this month.

- 1.7.4 (April)
- 1.6.14 (April)

The Apache ORC community aims to become an increasingly user-friendly project. We collaborated with the Arrow community and added the official 'USING IN PYTHON' pages for Python users.

https://orc.apache.org/docs/pyarrow.html
https://orc.apache.org/docs/dask.html

Also, William proposed using the 'GitHub issue' feature to lower the hurdle to contribute, and it was implemented via 'ORC-1094: Enable GitHub issues tab'. So far, it has helped us a lot as a more user-friendly channel because it skips the JIRA login.

## Community Health:
Since we started to use 'GitHub issues', the mailing list activity has decreased. However, all other activities have increased.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (7 years ago)
There are currently 44 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- No new PMC members. Last addition was William Hyun on 2021-09-30.
- Yiqun Zhang was added as committer on 2021-11-23
- One invited candidate has been working on ICLA/CCLA.

## Project Activity:
In 2022, we started to build and share the release plan to improve the visibility and collaboration of the Apache ORC releases for all active branches. This is based on our release cadence discussion in 2021.

- 1.6.13 (2022-01-15)
- 1.7.3 (2022-02-15)
- 1.8.0 (2022-09-15)

We also released three maintenance releases in this quarter and helped the downstream Apache projects: Spark, Arrow, Iceberg.

- 1.7.2 was released on 2021-12-20
- 1.7.1 was released on 2021-11-07
- 1.6.12 was released on 2021-11-07

In addition, we updated the Apache ORC adopters page for the users.

https://orc.apache.org/docs/adopters.html

Lastly, we revisited our test coverage and simplified our CI environment by migrating the AppVeyor CI and Travis CI jobs to GitHub Actions CI. We expect better control over CI and a faster review process.

## Community Health:
For seasonal reasons, all activity metrics decreased in this quarter. However, 2022 is off to a good start in the number of commits, PRs, and new contributors.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (6 years ago)
There are currently 43 committers and 13 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

The PMC has been focusing on growing a healthier community by helping candidates.

- William Hyun was added to the PMC on 2021-09-30.
- The PMC voted in a new committer and sent an invitation. The committer candidate is currently working on ICLA/CCLA.

## Project Activity:
In the last quarter, we succeeded in launching a new 1.7.0 release in addition to three maintenance releases.

- 1.7.0 was released on 2021-09-15 as the new latest version.
- 1.6.10 was released on 2021-08-10.
- 1.6.11 was released on 2021-09-15 and became the new stable version.
- 1.5.13 was released on 2021-09-15.

Since ORC 1.5.0 was released on 2018-05-14, `branch-1.5` had been maintained for 40 months. 1.5.13 is the last release and no more 1.5.x releases should be expected even for bug fixes.

The community has been investing more and more in helping the downstream projects use new releases in order to help ORC users.

- Apache ORC 1.5.13 will be used by Apache Spark 3.1.3.
- Apache ORC 1.6.11 will be used by Apache Spark 3.2.0.
- Apache ORC 1.7.0 will be used by
  * Apache Spark 3.3
  * Apache Iceberg 0.13
  * Apache Arrow 6.0
  * Apache Druid 0.23

## Community Health:
Thanks to new releases, the ORC mailing lists, JIRA activity, commits, and PR activities increased greatly in this quarter.
WHEREAS, the Board of Directors heretofore appointed Owen O'Malley (omalley) to the office of Vice President, Apache ORC, and WHEREAS, the Board of Directors is in receipt of the resignation of Owen O'Malley from the office of Vice President, Apache ORC, and WHEREAS, the Project Management Committee of the Apache ORC project has chosen by vote to recommend Dongjoon Hyun (dongjoon) as the successor to the post; NOW, THEREFORE, BE IT RESOLVED, that Owen O'Malley is relieved and discharged from the duties and responsibilities of the office of Vice President, Apache ORC, and BE IT FURTHER RESOLVED, that Dongjoon Hyun be and hereby is appointed to the office of Vice President, Apache ORC, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed. Special Order 7B, Change the Apache ORC Project Chair, was approved by Unanimous Vote of the directors present.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-22 (6 years ago)
There are currently 43 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Panagiotis Garefalakis on 2021-02-02.
- William Hyun was added as committer on 2021-04-14

## Project Activity:
The PMC is asking that Dongjoon Hyun become the new ORC VP. I'll add the resolution to the board agenda.

We've had two bug fix releases (1.6.8 and 1.6.9) in the last quarter.

We're closing in on making a new 1.7 release.

## Community Health:
The community is doing well. The email lists are down a bit, but the commits, jira traffic, and number of code contributors are all up.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Big Data workloads.

## Issues:
There are no issues that require board attention. Based on the feedback last quarter, we had one PMC member join the private email list and another go emeritus.

## Membership Data:
Apache ORC was founded 2015-04-22 (6 years ago)
There are currently 42 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- Panagiotis Garefalakis was added to the PMC on 2021-02-02
- Jeongseok Hyun was added as a committer on 2021-04-12.

## Project Activity:
The community is working on a couple of features:
* row level filtering that supports lazy loading of columns
* automatic generation of row level filters based on Search Arguments

It is probably time to make a new 1.7 release soon.

## Community Health:
We've had a significant upswing in activity in the last quarter, especially since there wasn't a new release:
- dev@orc.apache.org had a 94% increase in traffic in the past quarter (842 emails compared to 434)
- issues@orc.apache.org had an 81% increase in traffic in the past quarter (355 emails compared to 196)
- user@orc.apache.org had a 1900% increase in traffic in the past quarter (20 emails compared to 1)

We've also had some contributions from several new members.
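The percentages in the mailing list figures above can be recomputed from the raw email counts. The following is a minimal Python sketch, not the tooling that generated the report; the helper name `percent_change` is hypothetical, and rounding down to a whole percent is an assumption that happens to match the figures quoted here.

```python
def percent_change(current: int, previous: int) -> int:
    """Percentage change from previous to current, rounded down (assumed rounding)."""
    return (current - previous) * 100 // previous


# (list name, emails this quarter, emails previous quarter), taken from the report above.
traffic = [("dev", 842, 434), ("issues", 355, 196), ("user", 20, 1)]

for name, current, previous in traffic:
    print(f"{name}@orc.apache.org: {percent_change(current, previous)}% change")
```

The user list shows why the percentage alone can mislead: a move from 1 email to 20 registers as a 1900% increase even though the absolute volume is small.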
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-22 (6 years ago)
There are currently 42 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is 7:2.

Community changes, past quarter:
- No new PMC members. Last addition was Jesús Camacho Rodríguez on 2019-12-27.
- Panagiotis Garefalakis was added as committer on 2020-11-16

## Project Activity:
Recent releases:
- 1.6.6 was released on 2020-12-10.
- 1.6.5 was released on 2020-10-01.
- 1.5.12 was released on 2020-09-30.

Most of 1.6.6 resolves backwards compatibility problems that blocked Spark from moving to 1.6.

Blog post:
- FastIngest: Low-latency Gobblin with Apache Iceberg and ORC format
  https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin?

## Community Health:
Development has substantially picked up in the project with both increased email traffic and pull requests.
- dev@orc.apache.org had a 40% increase in traffic in the past quarter (442 emails compared to 314)
- issues@orc.apache.org had a 35% increase in traffic in the past quarter (196 emails compared to 145)
- 61 PRs opened on GitHub, past quarter (90% increase)
- 59 PRs closed on GitHub, past quarter (55% increase)
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues that require the board's attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (6 years ago)
There are currently 41 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Jesús Camacho Rodríguez on 2019-12-26.
- No new committers. Last addition was Renat Valiullin on 2019-06-19.

The Apache ORC PMC recently started to discuss new committer candidates. In 2020, Panagiotis Garefalakis has been the most active contributor. We are planning to open a vote for him during November 2020.

## Project Activity:
Recent releases:
- 1.6.5 was released on 2020-10-01.
- 1.5.12 was released on 2020-09-30.
- 1.5.11 was released on 2020-09-14.
- 1.6.4 was released on 2020-09-14.

New releases are being adopted as follows:
- ICEBERG-1546 Upgrade to 1.6.5 (https://github.com/apache/iceberg/pull/1546)
- SPARK-33050 Upgrade Apache ORC to 1.5.12
- HIVE-24222 Upgrade ORC to 1.5.12

We have been improving backward compatibility in 1.7/1.6/1.5 to provide easier migration paths, and we should start discussing an ORC 1.7 release soon.

## Community Health:
Thanks to new releases, all activities in the Apache ORC community increased. Although the Apache ORC PMC expects some decrease for seasonal reasons in the next quarter, it will focus on the growth of the community in order to keep the community vital and sustainable.
- dev@orc.apache.org had a 152% increase (371 emails compared to 147)
- issues@orc.apache.org had a 77% increase (163 emails compared to 92)
- 28 issues opened in JIRA, past quarter (16% increase)
- 34 issues closed in JIRA, past quarter (100% increase)
- 89 commits in the past quarter (140% increase)
- 14 code contributors in the past quarter (16% increase)
- 35 PRs opened on GitHub, past quarter (40% increase)
- 40 PRs closed on GitHub, past quarter (66% increase)
No report was submitted.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads

## Issues:
There are no issues that require the board's attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (5 years ago)
There are currently 41 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Jesús Camacho Rodríguez on 2019-12-26.
- No new committers. Last addition was Renat Valiullin on 2019-06-19.

## Project Activity:
We released two bug fix releases (1.5.10 and 1.6.3) on 2020-04-26.

We need to release another 1.6 release soon to address some issues raised by Apache Iceberg and Presto.

We should start discussion of making an ORC 1.7 release soon.

## Community Health:
The project has been quiet, although traffic on the dev list has gone up 65% this quarter. We have several new contributors and it would be good to make them committers.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (5 years ago)
There are currently 41 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Jesús Camacho Rodríguez on 2019-12-26.
- No new committers. Last addition was Renat Valiullin on 2019-06-19.

## Project Activity:
Recent releases:
- 1.5.9 was released on 2020-01-30.
- 1.4.5 was released on 2019-12-09.
- 1.5.8 was released on 2019-11-25.

We're planning a 1.6 release soon.

There was a talk on ORC Deep Dive that was given internally at Cloudera. The slides are available here: https://www.slideshare.net/oom65/orc-deep-dive-2020

## Community Health:
It has been a relatively quiet quarter and the queue of waiting PRs has built up. We need to spend more time working through the queue and create some new committers to help.

- dev@orc.apache.org had a 22% decrease in traffic in the past quarter (56 emails compared to 71)
- user@orc.apache.org had a 200% increase in traffic in the past quarter (15 emails compared to 5)
- 30 issues opened in JIRA, past quarter (no change)
- 22 issues closed in JIRA, past quarter (no change)
- 55 commits in the past quarter (34% decrease)
- 14 code contributors in the past quarter (7% increase)
- 34 PRs opened on GitHub, past quarter (6% increase)
- 29 PRs closed on GitHub, past quarter (3% increase)

On a personal note, I left Cloudera in early February, took some time off, and started at LinkedIn in mid-March. I think that having more experience with the ORC code available there will encourage additional contributions from LinkedIn.
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache ORC was founded 2015-04-21 (5 years ago)
There are currently 41 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- Dongjoon Hyun was added to the PMC on 2019-12-08
- Jesús Camacho Rodríguez was added to the PMC on 2019-12-26
- No new committers. Last addition was Renat Valiullin on 2019-06-19.

## Project Activity:
Recent releases:
- 1.4.5 was released on 2019-12-09.
- 1.5.8 was released on 2019-11-25.
- 1.6.2 was released on 2019-11-25.

ORC 1.6.2 has stabilized the ORC 1.6 branch, although most of the downstream projects are still using 1.5. Iceberg recently committed updated ORC support that uses the new 1.6 release.

Active Apache downstream projects include:
- Apache Arrow
- Apache Druid
- Apache Flink
- Apache Hive
- Apache Iceberg
- Apache Spark

## Community Health:
It was a busy quarter, with a couple of new contributors coming in with patches. We're watching the contributors, looking for new committers.

- dev@orc.apache.org had an 84% increase in traffic in the past quarter (70 emails compared to 38)
- 26 issues opened in JIRA, past quarter (no change)
- 21 issues closed in JIRA, past quarter (22% decrease)
- 83 commits in the past quarter (33% increase)
- 12 code contributors in the past quarter (20% increase)
- 30 PRs opened on GitHub, past quarter (30% increase)
- 27 PRs closed on GitHub, past quarter (8% increase)
## Description:
The mission of ORC is the creation and maintenance of software related to the smallest, fastest columnar storage for Hadoop workloads

## Issues:
- There are no issues requiring board attention at this time.

## Membership Data:
Apache ORC was founded 2015-04-21 (4 years ago)
There are currently 41 committers and 10 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Gang Wu on 2019-01-16.
- No new committers. Last addition was Renat Valiullin on 2019-06-19.

## Project Activity:
- ORC 1.6.0 was released on 2019-09-04.
- Talks on ORC column encryption at:
  - ApacheCon NA 2019
  - Strata NYC 2019
- We should release new bug fix releases on the 1.5 and 1.6 lines in the coming week.

## Community Health:
- The project has been relatively quiet this quarter as users start testing out the new 1.6 release.
- Some new contributors have been submitting patches and we are tracking their performance as potential committers.
## Description:
- A high-performance columnar file format for Hadoop workloads.

## Issues:
- There are no issues requiring the board's attention.

## Activity:
- We released the bug fix release 1.5.6.
- We are preparing to make release candidates for 1.6 in the coming week.
- The column encryption has been committed to master.
- We added user annotations on the types to support the Iceberg project.
- We worked with the Spark community to help them adopt the 1.5.6 release.

## Health report:
- It has been challenging to get code reviews on the Java side. We prefer to have code reviewed before it is committed, but we allow commit-then-review so the work doesn't get stuck. Hopefully the new committers will give us additional review bandwidth.
- Two new committers were added this quarter.

## PMC changes:
- Currently 10 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Gang Wu on Wed Jan 16 2019

## Committer base changes:
- Currently 41 committers.
- New committers:
  - Sandeep More was added as a committer on Wed Jun 12 2019
  - Renat Valiullin was added as a committer on Wed Jun 19 2019

## Releases:
- 1.5.6 was released on Wed Jun 26 2019

## Mailing list activity:
- With the 1.5 line being stable, there has been relatively little traffic on the users list, but lots of traffic on dev & issues as we close in on a 1.6 release.
- dev@orc.apache.org:
  - 67 subscribers (up 0 in the last 3 months):
  - 104 emails sent to list (50 in previous quarter)
- issues@orc.apache.org:
  - 20 subscribers (down 1 in the last 3 months):
  - 136 emails sent to list (78 in previous quarter)
- user@orc.apache.org:
  - 68 subscribers (down 1 in the last 3 months):
  - 3 emails sent to list (55 in previous quarter)

## JIRA activity:
- 39 JIRA tickets created in the last 3 months
- 29 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - We released the bug fix release 1.5.5. - The column encryption work is nearing completion. - We gave a presentation about column encryption at Dataworks Summit in Barcelona. https://s.apache.org/orc-encryption - The C++ reader is being integrated into Apache Impala. - Work is starting on improving the integration with Apache Arrow. ## Health report: - The community is doing well. ## PMC changes: - Currently 10 PMC members. - Gang Wu was added to the PMC on Wed Jan 16 2019 ## Committer base changes: - Currently 39 committers. - No new committers added in the last 3 months - Last committer addition was Dongjoon Hyun at Fri Jan 11 2019 ## Releases: - 1.5.5 was released on Wed Mar 13 2019 ## Mailing list activity: - There has been a noticeable uptick in the user traffic. - dev@orc.apache.org: - 67 subscribers (up 3 in the last 3 months): - 65 emails sent to list (97 in previous quarter) - issues@orc.apache.org: - 21 subscribers (up 1 in the last 3 months): - 85 emails sent to list (323 in previous quarter) - user@orc.apache.org: - 69 subscribers (up 1 in the last 3 months): - 57 emails sent to list (10 in previous quarter) ## JIRA activity: - 32 JIRA tickets created in the last 3 months - 18 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - There are talks about ORC scheduled for: * DataWorks Summit - Barcelona (18-21 March 2019) * DataWorks Summit - Washington DC (20-23 May 2019) - HAWQ is adding ORC support. ## Health report: - The project is doing well and working toward the next release. ## PMC changes: - Currently 9 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Eugene Koifman on Tue Sep 05 2017 ## Committer base changes: - Currently 39 committers. - Dongjoon Hyun was added as a committer on Fri Jan 11 2019 ## Releases: - 1.5.4 was released on Wed Dec 19 2018 ## Mailing list activity: - The mailing list activity has been stable. There has been a bit less traffic on dev this month because we are between releases. - dev@orc.apache.org: - 64 subscribers (up 5 in the last 3 months): - 76 emails sent to list (223 in previous quarter) - issues@orc.apache.org: - 20 subscribers (up 0 in the last 3 months): - 290 emails sent to list (267 in previous quarter) - user@orc.apache.org: - 66 subscribers (up 2 in the last 3 months): - 7 emails sent to list (9 in previous quarter) ## JIRA activity: - 40 JIRA tickets created in the last 3 months - 27 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - We made the 1.5.3 bug fix release. - There were two presentations about ORC at ApacheCon in Montreal: + A presentation about ORC column encryption + A presentation about benchmarks for ORC and other file formats read performance from Spark ## Health report: - The community has been a little quieter this quarter in between releases. - We continue to evaluate new committers and PMC members. ## PMC changes: - Currently 9 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Eugene Koifman on Tue Sep 05 2017 ## Committer base changes: - Currently 38 committers. - No new committers added in the last 3 months - Last committer addition was Xiening Dai at Fri Apr 06 2018 ## Releases: - 1.5.3 was released on Mon Sep 24 2018 ## JIRA activity: - 25 JIRA tickets created in the last 3 months - 24 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for big data workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - Released ORC 1.5.0 (and a few follow up bug fixes 1.5.1 and 1.5.2) + Added new C++ writer + Added support for variable length blocks in HDFS + Implemented a CSV to ORC converter + Improved performance for decimal types + Added support for compiling C++ code under MSVC. - Presentation about Avro, ORC, and Parquet benchmarks from Spark was given at Berlin Buzzwords https://berlinbuzzwords.de/18/session/fast-access-your-complex-data-avro-json-orc-and-parquet The code is in ORC-386 at https://github.com/apache/orc/pull/290 - Presentation about how to split up projects, such as ORC out of Hive, at FOSS Backstage https://foss-backstage.de/session/untangling-spaghetti-when-and-how-split-projects - Did a FeatherCast about ORC in Berlin. ## Health report: - After more discussions and a thread on Apache legal, we have reincorporated the benchmark code, which depends on the GPL'ed JMH framework back in to the project. Consensus was that because the benchmarks are not user facing they qualify as "optional components" from https://www.apache.org/legal/resolved.html#optional . To strengthen this characterization, we ensured that the benchmarks are not built by default and are not distributed on Maven Central. ## PMC changes: - Currently 9 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Eugene Koifman on Tue Sep 05 2017 ## Committer base changes: - Currently 38 committers. - No new committers added in the last 3 months - Last committer addition was Xiening Dai at Fri Apr 06 2018 ## Releases: - 1.4.4 was released on Sun May 13 2018 - 1.5.0 was released on Sun May 13 2018 - 1.5.1 was released on Thu May 24 2018 - 1.5.2 was released on Thu Jun 28 2018 ## Mailing list activity: - With the new version coming out, we've had a lot of developer list traffic. 
- dev@orc.apache.org: + 57 subscribers (up 1 in the last 3 months): + 393 emails sent to list (305 in previous quarter) - issues@orc.apache.org: + 20 subscribers (up 0 in the last 3 months): + 462 emails sent to list (388 in previous quarter) - user@orc.apache.org: + 60 subscribers (up 2 in the last 3 months): + 7 emails sent to list (37 in previous quarter) ## JIRA activity: - 47 JIRA tickets created in the last 3 months - 40 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - We made two bug fix releases for the 1.4 branch and will make another soon. - We also need to finalize the ORC 1.5 release. - We decided to create a security list. - Expanded the development page on the website to help new committers. - Refactored the format specification on the website to make it easier to add the new ORCv2 specification as we work on it. - The Iceberg project https://github.com/Netflix/iceberg from Netflix added ORC support. - Presentations got accepted about ORC for Berlin Buzzwords and Dataworks Summit San Jose. ## Health report: - Discovered that our benchmarking code was using a GPL dependency and so moved the code to an external github site. - The activity level is increasing and drawing from an increasing number of contributors. We continue to track the contributions looking for new committers and PMC members. ## PMC changes: - Currently 9 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Eugene Koifman on Tue Sep 05 2017 ## Committer base changes: - Currently 38 committers. - New committers: - Gang Wu was added as a committer on Wed Mar 28 2018 - Xiening Dai was added as a committer on Fri Apr 06 2018 ## Releases: - 1.4.2 was released on Mon Jan 22 2018 - 1.4.3 was released on Thu Feb 08 2018 ## Mailing list activity: The traffic is up this quarter, but in line with expectations.
- dev@orc.apache.org: - 55 subscribers (up 5 in the last 3 months) - 328 emails sent to list (261 in previous quarter) - issues@orc.apache.org: - 20 subscribers (up 0 in the last 3 months) - 409 emails sent to list (293 in previous quarter) - user@orc.apache.org: - 58 subscribers (up 2 in the last 3 months) - 36 emails sent to list (14 in previous quarter) ## JIRA activity: - 51 JIRA tickets created in the last 3 months - 38 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - We made bug fix releases for both the 1.3 and 1.4 branches. - The C++ writer can write the original (Hive 0.11) version and will be extended to write the current version. - Apache Hive has moved up to the ORC 1.4.1 release. - Apache Arrow has started adding support for ORC. - Apache Flink is working on a patch to add ORC support. - We are working on getting column encryption implemented. - We've started release discussions for ORC 1.5. ## Health report: - We have several active contributors that we are encouraging and tracking to committership. ## PMC changes: - Currently 9 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Eugene Koifman on Tue Sep 05 2017 ## Committer base changes: - Currently 36 committers. - No new committers added in the last 3 months - Last committer addition was Deepak Majeti at Tue May 09 2017 ## Releases: - Last releases were ORC-1.4.1 and ORC-1.3.4 on Sun Oct 15 2017 ## Mailing list activity: - The email traffic is down a bit this quarter, but I expect it to pick back up again this quarter as we move forward on ORC 1.5. - dev@orc.apache.org: - 50 subscribers (up 1 in the last 3 months): - 250 emails sent to list (327 in previous quarter) - issues@orc.apache.org: - 20 subscribers (up 0 in the last 3 months): - 265 emails sent to list (417 in previous quarter) - user@orc.apache.org: - 56 subscribers (up 3 in the last 3 months): - 14 emails sent to list (18 in previous quarter) ## JIRA activity: - 34 JIRA tickets created in the last 3 months - 31 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - We are planning bug fix releases on the 1.3 and 1.4 branches with Prasanth as the release manager. - Spark has started using the ORC 1.4 release (instead of the ORC from Hive 1.2), which has led to large performance improvements. The work to improve the Spark bindings for ORC continues. - Presentations: - Ingesting Data at Blazing Speed Using Apache ORC https://dataworkssummit.com/sydney-2017/sessions/ingesting-data-at-blazing-speed-using-apache-orc/ - Performance Update: When Apache ORC Met Apache Spark. https://dataworkssummit.com/sydney-2017/sessions/performance-update-when-apache-orc-met-apache-spark/ - Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and Parquet. https://dataworkssummit.com/sydney-2017/sessions/big-data-storage-comparing-speed-and-features-for-avro-json-orc-and-parquet/ - We are discussing a new version of the ORC format for ORC 2.0. ## Health report: - The community continues to gain strength. ## PMC changes: - Currently 9 PMC members. - New PMC members: - Eugene Koifman was added to the PMC on Tue Sep 05 2017 - Deepak Majeti was added to the PMC on Tue Sep 05 2017 ## Committer base changes: - Currently 36 committers. - No new committers added in the last 3 months - Last committer addition was Deepak Majeti at Tue May 09 2017 ## Releases: - Last release was 1.4.0 on Sun May 07 2017 ## Mailing list activity: - The number of subscribers to the various lists has been increasing.
- dev@orc.apache.org: - 50 subscribers (up 10 in the last 3 months): - 330 emails sent to list (397 in previous quarter) - issues@orc.apache.org: - 20 subscribers (up 2 in the last 3 months): - 420 emails sent to list (406 in previous quarter) - user@orc.apache.org: - 54 subscribers (up 5 in the last 3 months): - 18 emails sent to list (9 in previous quarter) ## JIRA activity: - 40 JIRA tickets created in the last 3 months - 30 JIRA tickets closed/resolved in the last 3 months
## Description: - A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - A presentation on "ORC File - Optimizing Your Big Data" was given at the Dataworks Summit in San Jose. - Alibaba is contributing a C++ ORC file writer, which is a very welcome contribution. ## Health report: - Unfortunately, the C++ code reviews have been going slower than we would like. The combination of summer vacations, the small pool of ORC committers that feel comfortable reviewing the C++ code, and the influx of C++ from Alibaba has led to longer review cycles. Clearly we need more committers on the C++ side of the house. ## PMC changes: - Currently 7 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Gopal Vijayaraghavan on Sun Jan 08 2017 ## Committer base changes: - Currently 36 committers. - Deepak Majeti was added as a committer on Tue May 09 2017 ## Releases: - 1.4.0 was released on Sun May 07 2017 ## Mailing list activity: - We've seen a large uptick in developer activity, which is great to see. - dev@orc.apache.org: - 40 subscribers (up 5 in the last 3 months): - 411 emails sent to list (173 in previous quarter) - issues@orc.apache.org: - 18 subscribers (up 2 in the last 3 months): - 413 emails sent to list (231 in previous quarter) - user@orc.apache.org: - 49 subscribers (up 3 in the last 3 months): - 9 emails sent to list (38 in previous quarter) ## JIRA activity: - 39 JIRA tickets created in the last 3 months - 20 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - Hive's master branch has moved over to the ORC project's artifacts. - There was a talk at Hadoop Summit Munich about our file format benchmark comparing size and speed of Avro, Json, ORC, and Parquet. ## Health report: - We are doing well and are actively encouraging the new contributors towards becoming committers. ## PMC changes: - Currently 7 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Gopal Vijayaraghavan on Sun Jan 08 2017 ## Committer base changes: - Currently 35 committers. - New committers: - Carl Steinbach was added as a committer on Mon Jan 30 2017 - Jimmy Xiang was added as a committer on Mon Jan 30 2017 - Naveen Gangam was added as a committer on Fri Feb 17 2017 ## Releases: - 1.3.0 was released on Sun Jan 22 2017 - 1.3.1 was released on Thu Feb 02 2017 - 1.3.2 was released on Sun Feb 12 2017 - 1.3.3 was released on Mon Feb 20 2017 ## Mailing list activity: - Now that the code base has completely moved from Hive, we are seeing a significant upswing in activity. - dev@orc.apache.org: - 35 subscribers (up 2 in the last 3 months): - 175 emails sent to list (109 in previous quarter) - issues@orc.apache.org: - 16 subscribers (up 1 in the last 3 months): - 234 emails sent to list (101 in previous quarter) - user@orc.apache.org: - 46 subscribers (up 7 in the last 3 months): - 38 emails sent to list (18 in previous quarter) ## JIRA activity: - 42 JIRA tickets created in the last 3 months - 31 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - There was considerable concern among the Hive committers about losing their ability to make changes to the ORC code base as we finish replacing the copy of ORC's code base in Hive with a reference to the ORC project's release artifacts. The ORC PMC voted to allow the current Hive committers to become ORC committers. Twenty of the Hive committers accepted the offer and have been added. I feel this is good for the project, but it may induce some short term turmoil with the large influx of new committers. - On a separate note, the ORC PMC also added Gopal Vijayaraghavan as a PMC member. Gopal has done a lot of great work on ORC's performance and is a great addition to the PMC. - We made two bug fix releases on the 1.2 line and will likely make a 1.3 release in the next quarter. ## Health report: - We need to finish Hive's migration from its own ORC code base to use our release artifacts. Having two copies of the code in different projects is very hard to maintain. - We are seeing some new contributors, which is great. ## PMC changes: - Currently 7 PMC members. - Gopal Vijayaraghavan was added to PMC on Jan 08 2017 ## Committer base changes: - Currently 32 committers.
- New committers: - Aihua Xu was added as a committer on Fri Dec 16 2016 - Lalam Chinna Rao was added as a committer on Fri Dec 16 2016 - Chaoyu Tang was added as a committer on Thu Dec 15 2016 - Jianyong Dai was added as a committer on Fri Dec 16 2016 - Eugene Koifman was added as a committer on Thu Dec 15 2016 - Ashutosh Chauhan was added as a committer on Thu Dec 15 2016 - Jesús Camacho Rodríguez was added as a committer on Fri Dec 16 2016 - Jason Dere was added as a committer on Thu Dec 15 2016 - Lars Francke was added as a committer on Wed Dec 21 2016 - Rui Li was added as a committer on Fri Dec 16 2016 - Mithun Radhakrishnan was added as a committer on Thu Dec 15 2016 - Matt McCline was added as a committer on Thu Dec 15 2016 - Pengcheng Xiong was added as a committer on Thu Dec 15 2016 - Rajesh Balamohan was added as a committer on Thu Dec 15 2016 - Sergio Peña was added as a committer on Fri Dec 16 2016 - Siddharth Seth was added as a committer on Mon Jan 09 2017 - Vaibhav Gumashta was added as a committer on Thu Dec 15 2016 - Wei Zheng was added as a committer on Thu Dec 15 2016 - Ferdinand Xu was added as a committer on Wed Jan 04 2017 - Yongzhi Chen was added as a committer on Fri Dec 16 2016 ## Releases: - 1.2.2 was released on Wed Nov 30 2016 - 1.2.3 was released on Sun Dec 11 2016 ## JIRA activity: - 24 JIRA tickets created in the last 3 months - 20 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - We've made the 1.2.0 release and a 1.2.1 bug fix release. - We've updated the http://orc.apache.org/ home page to better support the Apache branding guidelines. - We've implemented benchmarks between Avro, JSON, ORC, and Parquet. - We've given presentations about the benchmarks at Hadoop Summit San Jose and Melbourne, and Strata New York. https://s.apache.org/file-format-bench ## Health report: - We continue efforts to increase the size and diversity of the contributor base by encouraging people to contribute to the project. ## PMC changes: - Currently 6 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Aliaksei Sandryhaila on Wed Nov 18 2015 ## Committer base changes: - Currently 12 committers. - No new changes to the committer base since last report. - No new committers added in the last 3 months - Last committer addition was more than 2 years ago ## Releases: - 1.2.0 was released on Wed Aug 24 2016 - 1.2.1 was released on Tue Oct 04 2016 ## Mailing list activity: - The mailing list is relatively stable with fewer issues, but a little more dev and user traffic. - dev@orc.apache.org: - 33 subscribers (up 4 in the last 3 months): - 149 emails sent to list (186 in previous quarter) - issues@orc.apache.org: - 15 subscribers (up 0 in the last 3 months): - 136 emails sent to list (186 in previous quarter) - user@orc.apache.org: - 32 subscribers (up 3 in the last 3 months): - 18 emails sent to list (4 in previous quarter) ## JIRA activity: - 23 JIRA tickets created in the last 3 months - 17 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - The Java code in Hive was finally separated into a stand alone module and copied over to the ORC project. - We've made the 1.1.0 release and two bug fix releases keeping synchronized with the Hive code base. - The HIVE-14007 jira for removing the hive-orc module from Hive is currently being reviewed. ## Health report: - We continue efforts to increase the size and diversity of the contributor base by encouraging people to contribute to the project. ## PMC changes: - Currently 6 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Aliaksei Sandryhaila on Wed Nov 18 2015 ## Committer base changes: - Currently 12 committers. - No new changes to the committer base since last report. - No new committers added in the last 3 months - Last committer addition was in April 2015. ## Releases: - 1.1.0 was released on Fri Jun 10 2016 - 1.1.1 was released on Mon Jun 13 2016 - 1.1.2 was released on Sat Jul 09 2016 ## Mailing list activity: - With moving the Java code in, the development activity has increased dramatically. - dev@orc.apache.org: - 29 subscribers (up 6 in the last 3 months): - 192 emails sent to list (63 in previous quarter) - issues@orc.apache.org: - 15 subscribers (up 2 in the last 3 months): - 192 emails sent to list (65 in previous quarter) - user@orc.apache.org: - 29 subscribers (up 5 in the last 3 months): - 4 emails sent to list (14 in previous quarter) ## JIRA activity: - 34 JIRA tickets created in the last 3 months - 26 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - There are no issues requiring the board's attention. ## Activity: - After the last board report, it was suggested that we go ahead and make a release of the C++ code without waiting for the Java to be separated out of Hive. We went ahead and did that and released 1.0.0. - We're about to create a first pull request for ORC-1, which is the import of the Java code from Hive. We plan to make a 1.1 release soon after the code is pulled in to ORC. ## Health report: - The project has been relatively quiet. Getting the Java reader and writer released will enable projects such as Spark to import a much smaller set of dependencies when they use ORC. ## PMC changes: - Currently 6 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Aliaksei Sandryhaila on Wed Nov 18 2015 ## Committer base changes: - Currently 12 committers. - No new changes to the committer base since last report. - No new committers added in the last 3 months - Last committer addition was in April 2015. ## Releases: - 1.0.0 was released on Mon Jan 25 2016 ## Mailing list activity: - The mailing lists have been relatively quiet during this quarter. - dev@orc.apache.org: - 23 subscribers (up 0 in the last 3 months): - 64 emails sent to list (31 in previous quarter) - issues@orc.apache.org: - 13 subscribers (up 0 in the last 3 months): - 72 emails sent to list (68 in previous quarter) - user@orc.apache.org: - 26 subscribers (up 2 in the last 3 months): - 7 emails sent to list (33 in previous quarter) ## JIRA activity: - 11 JIRA tickets created in the last 3 months - 7 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - We really need a release, which is blocked by wanting to get the Java reader and writer moved out of Hive. ## Activity: - Progress continues on moving the Java reader and writer out of Hive. We've created the ORC module in Hive and moved the API, utilities and writer into it. We are currently cleaning up some of the ORC APIs before Hive 2.0.0 gets released. The final bits to move over to the ORC module will be the reader. Once the reader is moved over, the entire module can be moved to the ORC project. ## Health report: - The activity level is relatively low. The C++ reader works and is stable. There continues to be user interest in reading and writing ORC files outside of Hive. ## PMC changes: - Currently 6 PMC members. - Aliaksei Sandryhaila was added to the PMC on Wed Nov 18 2015 ## Committer base changes: - Currently 12 committers. - No new changes to the committer base since last report. - No new committers added in the last 3 months - Last committer addition was more than 2 years ago ## Releases: - No release has been made yet. ## Mailing list activity: - dev@orc.apache.org: - 23 subscribers (up 3 in the last 3 months): - 29 emails sent to list (37 in previous quarter) - user@orc.apache.org: - 24 subscribers (up 4 in the last 3 months): - 30 emails sent to list (21 in previous quarter) - issues@orc.apache.org: - 13 subscribers (up 3 in the last 3 months): - 73 emails sent to list (63 in previous quarter) ## JIRA activity: - 5 JIRA tickets created in the last 3 months - 3 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Issues: - Progress in separating out the Java ORC reader & writer from Hive has gone slowly. The progress on HIVE-10171 is currently 19 of 24 sub-tasks done. Part of the problem has been a lot of conflicting activity in the ORC code base from the development of LLAP. The LLAP branch has merged, which should slow down the churn on the code base. - The project needs to make a release soon to start developing the community. The release is blocked by getting the Java code out of Hive. ## Activity: - The Java ORC reader & writer, which are still in Hive, had 142 commits in the last 3 months. - The C++ ORC reader has only had 2 commits. ## Health report: - The activity is low. There are users asking for the Java release, which we are working on. ## PMC changes: - Currently 5 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Prasanth Jayachandran at Tue Apr 21 2015 ## LDAP changes: - Currently 12 committers and 5 committee group members. - No new changes to the committee group or committership since last report. ## Releases: - No releases yet. ## Mailing list activity: - dev@orc.apache.org: - 20 subscribers (up 0 in the last 3 months): - 26 emails sent to list (49 in previous quarter) - user@orc.apache.org: - 20 subscribers (up 3 in the last 3 months): - 21 emails sent to list (4 in previous quarter) - issues@orc.apache.org: - 10 subscribers (up 0 in the last 3 months): - 69 emails sent to list (41 in previous quarter) ## JIRA activity: - 7 JIRA tickets created in the last 3 months - 6 JIRA tickets closed/resolved in the last 3 months
## Description: A high-performance columnar file format for Hadoop workloads. ## Activity: - The project is continuing work on refactoring the Hive and Orc Java code base inside the Hive project to minimize the time that the code is forked before it can be replaced with a reference to the new project. The work can be tracked in HIVE-10171. - The C++ implementation has been moved over from github. ## PMC/Committership changes: - Currently 12 committers and 5 PMC members in the project. - The PMC was created on 22 April 2015 and no members have been added since. - The project added 7 committers on 11 May 2015. ## Releases: - No releases have been made. ## Mailing list activity: - dev@orc.apache.org: - 18 subscribers (up 3 in the last month): - 35 emails sent to list (17 in previous month) - user@orc.apache.org: - 15 subscribers (up 2 in the last month): - 4 emails sent to list (1 in previous month) - issues@orc.apache.org: - 9 subscribers (up 0 in the last month): - 33 emails sent to list (11 in previous month) ## JIRA activity: - 19 JIRA tickets created in the last 3 months - 4 JIRA tickets closed/resolved in the last 3 months
@Owen please update the records on IP clearance for the code move from github
## Description: ORC is a high-performance columnar file format for Hadoop workloads. ## Activity: - Two talks were given about ORC at the Hadoop Summit San Jose. - We've created the orc.apache.org website. - Work continues on factoring ORC out of Hive. - We're working with the Flink project on integration efforts. ## Issues: - There are no issues that require the board's attention. ## PMC/Committership changes: - Currently 7 committers and 5 PMC members in the project. - No new changes to the PMC or committership since last report. ## Releases: - We have not made a release yet. ## Mailing list activity: - dev@orc.apache.org: - 16 subscribers (up 2 in the last month): - 17 emails sent to list (2 in previous month) - user@orc.apache.org: - 13 subscribers (up 9 in the last month): - 1 emails sent to list (0 in previous quarter) - issues@orc.apache.org: - 9 subscribers (up 3 in the last month): - 11 emails sent to list (0 in previous month) ## JIRA activity: - 11 JIRA tickets created in the last 3 months - 2 JIRA tickets closed/resolved in the last 3 months
Report from the Orc project [Owen O'Malley] ## Description: - ORC is a high-performance columnar file format for Hadoop workloads. ## Activity: - Infra created the git repository. - Infra created the jira. - Infra created the mailing lists. - Infra created the website and we are creating the content for it. - We added 7 new committers based on their previous work on ORC. - We've started the work to refactor Hive so that ORC can depend on a minimal subset of Hive. - We are working on the code grant to pull the C++ reader code from github.com/hortonworks/orc.git. ## Issues: - There are no issues that require the board's attention. ## PMC/Committership changes: - Currently 7 committers and 5 PMC members in the project. ## Releases: - No releases have been made yet. ## Mailing list activity: - dev@orc.apache.org: - 10 subscribers (up 10 in the last 3 months): - 2 emails sent to list (0 in previous quarter) - user@orc.apache.org: - 9 subscribers (up 9 in the last 3 months): - 1 emails sent to list (0 in previous quarter) - issues@orc.apache.org: - 6 subscribers (up 6 in the last 3 months) ## JIRA activity: - 3 JIRA tickets created in the last 3 months - 0 JIRA tickets closed/resolved in the last 3 months
WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software, for distribution at no charge to the public, related to high performance columnar file formats for distributed computing. NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the "Apache Orc Project", be and hereby is established pursuant to Bylaws of the Foundation; and be it further RESOLVED, that the Apache Orc Project be and hereby is responsible for the creation and maintenance of software related to composite oriented programming; and be it further RESOLVED, that the office of "Vice President, Apache Orc" be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache Orc Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache Orc Project; and be it further RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache Orc Project Management Committee: * Chris Douglas <cdouglas@apache.org> * Alan Gates <gates@apache.org> * Prasanth Jayachandran <prasanthj@apache.org> * Lefty Leverenz <leftyl@apache.org> * Owen O'Malley <omalley@apache.org> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Owen O'Malley be appointed to the office of Vice President, Apache Orc, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further RESOLVED, that the initial Apache Orc Project be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Apache Orc Project. 
Special Order 7A, Establish the Apache Orc Project, was approved by Unanimous Vote of the directors present.