
This was extracted (@ 2025-02-19 22:10) from a list of minutes
which have been approved by the Board.
Please Note
The Board typically approves the minutes of the previous meeting at the
beginning of every Board meeting; therefore, the list below does not
normally contain details from the minutes of the most recent Board meeting.
WARNING: these pages may omit some original contents of the minutes.
Meeting times vary; the exact schedule is available to ASF Members and Officers, who can search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).
Report was filed, but display is awaiting the approval of the Board minutes.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Project Status:
Current project status: Ongoing with high activity
Issues for the board: None

## Membership Data:
Apache Impala was founded 2017-11-15 (7 years ago)
There are currently 72 committers and 40 PMC members in this project.
The Committer-to-PMC ratio is 9:5.

Community changes, past quarter:
- No new PMC members. Last addition was Riza Suminto on 2024-03-06.
- Jason Fehr was added as committer on 2024-08-27
- Stephen Carlin was added as committer on 2024-11-01

## Project Activity:
Over the last three months, the Impala community has implemented the following:
- MERGE statement for Iceberg tables
- Progress was made on the new Calcite planner
- Enabling new Clang tidy checks
- Improved Workload Management features
- Perf improvements to Iceberg V2 tables
- Reading column statistics from Puffin files
- Improved visualization of query timelines
- Filtering criteria for the OPTIMIZE statement

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were a bit over 3000 emails to that list in September, October, and November (until 11th)
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Project Status:
Current project status: Ongoing with high activity
Issues for the board: none

## Membership Data:
Apache Impala was founded 2017-11-15 (7 years ago)
There are currently 70 committers and 40 PMC members in this project.
The Committer-to-PMC ratio is 7:4.

Community changes, past quarter:
- No new PMC members. Last addition was Riza Suminto on 2024-03-06.
- No new committers. Last addition was Zihao Ye on 2024-03-24.

## Project Activity:
Impala 4.4.0 was released on May 25th
Impala 3.4.2 was released on June 22nd

Over the last three months, the Impala community has implemented the following:
- Continuous work on using Calcite for query planning
- Improvements and fixes for the current query planner
- Improvements to CatalogD and event handling
- Iceberg V2 read performance improvements
- JDBC table enhancements
- Numerous bug fixes
- Performance improvements
- Test improvements

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were 2100 emails to that list in June, July, and Aug (until 9th)
WHEREAS, the Board of Directors heretofore appointed Jim Apple (jbapple) to the office of Vice President, Apache Impala, and

WHEREAS, the Board of Directors is in receipt of the resignation of Jim Apple from the office of Vice President, Apache Impala, and

WHEREAS, the Project Management Committee of the Apache Impala project has chosen by vote to recommend Zoltán Borók-Nagy (boroknagyz) as the successor to the post;

NOW, THEREFORE, BE IT RESOLVED, that Jim Apple is relieved and discharged from the duties and responsibilities of the office of Vice President, Apache Impala, and

BE IT FURTHER RESOLVED, that Zoltán Borók-Nagy be and hereby is appointed to the office of Vice President, Apache Impala, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed.

Special Order 7A, Change the Apache Impala Project Chair, was approved by Unanimous Vote of the directors present.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Project Status:
Current project status: Healthy and active
Issues for the board: None

## Membership Data:
Apache Impala was founded 2017-11-15 (6 years ago)
There are currently 70 committers and 40 PMC members in this project.
The Committer-to-PMC ratio is 7:4.

Community changes, past quarter:
- Riza Suminto was added to the PMC on 2024-03-06
- Zihao Ye was added as committer on 2024-03-24
- Peter Rozsa was added as committer on 2024-03-02

## Project Activity:
Over the last three months, the Impala community has implemented the following:
- Numerous test fixes and test coverage improvements.
- Enhancements for Data Source table Scanner
- New features for Iceberg table format:
  - Query nested columns from metadata views
  - Create table with primary keys
  - SHOW METADATA TABLES IN
- Intermediate result caching node
- Query history table
- JDBC table enhancements
- Observability enhancements for query profiles
- Enhancements for BINARY type
- Event processor bug fixes and improvements
- Introduced Calcite in query planning for simple queries
- Numerous bug fixes
- Numerous performance improvements

The last release (4.3) was last October; however, 4.4 is on the way now.

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were a bit over 4000 emails to that list in March, April, and May (until 8th)
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Project Status:
Current project status: Ongoing with high activity
Issues for the board: none

## Membership Data:
Apache Impala was founded 2017-11-15 (6 years ago)
There are currently 69 committers and 40 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:
- Riza Suminto was added to the PMC on 2024-03-06
- Peter Rozsa was added as committer on 2024-03-02

## Project Activity:
Over the last three months, the Impala community has implemented the following:
- Improvements to Iceberg support (UPDATE, equality deletes, metadata tables)
- Table maintenance operations for Iceberg tables (OPTIMIZE)
- Workload management functionalities
- Improvements to the event processor
- Improvements to JSON support
- Impala to Impala federation
- Predicate push down to external data sources
- Various optimizations (count star, intra-node communication, etc.)
- Numerous bug fixes

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were 3628 emails to that list in January, February, and March (until 15th)
No report was submitted.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Project Status:
Current project status: Ongoing with high activity
Issues for the board: none

## Membership Data:
Apache Impala was founded 2017-11-14 (6 years ago)
There are currently 68 committers and 39 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:
- Michael Smith was added to the PMC on 2023-11-02
- No new committers. Last addition was Gergely Fürnstáhl on 2023-08-12.

## Project Activity:
The latest release was 4.3.0 on 2023-10-03.

Over the last three months, the Impala community has implemented the following:
- Improved support for JDK 17, aarch64, Python 3, and Unicode names
- Improvements to Iceberg support
- A graphical timeline view of query execution
- Support for querying external RDBMS via JDBC
- Improved memory tracking for codegen caching
- High availability for Impala's Statestore
- Implemented small string optimization for StringValue to speed serialization
- Improvements to Hive Metastore event processing
- Improvements to cardinality estimation
- Improvements to runtime filter aggregation
- Numerous bug fixes

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were 3921 emails to that list in October, November, and December.
No report was submitted.
No report was submitted.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Project Status:
Current project status: Ongoing with high activity
Issues for the board: none

## Membership Data:
Apache Impala was founded 2017-11-14 (6 years ago)
There are currently 67 committers and 38 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:
- No new PMC members. Last addition was Andrew Sherman on 2023-04-20.
- Kurt Deschler was added as committer on 2023-07-03

## Project Activity:
The latest release was 4.1.2 on 2023-04-10.

Over the last three months, the Impala community has implemented the following:
- Fixed numerous race conditions and one null pointer exception
- Fixed several build failures (flaky pre-merge tests)
- Improved compatibility with JDK, aarch64, cgroups, Redhat and Ubuntu, LLVM, OpenSSL, and Spring
- Improved compatibility with Apache projects Ozone, Maven, Hive, Iceberg, Avro, Hadoop, Atlas, Ranger, Thrift, Kudu, and ORC
- Added high availability to the catalog service
- Added support for building DEB and RPM packages
- Improved performance on TPC-DS
- Fixed two query correctness bugs
- Made many improvements to cardinality estimations

## Community Health:
In answer to the question about community health from the last board report, each patch produces several emails - every review, commit, new version of a patch, and bot linter run produces an email.

reviews@ is the most reliable metric of Impala community activity level. There were 4042 emails to that list in June, July, and August.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-15 (5 years ago)
There are currently 66 committers and 38 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:
- Andrew Sherman was added to the PMC on 2023-04-21
- Fang-Yu Rao was added as committer on 2023-03-10

Inactive PMC members:
After the recent message from the ASF Board about PMC member responsibilities we've identified 13 PMC members who are not subscribed to the project's private@ mailing list. We sent them an initial reminder about their responsibilities hoping they become active again.

## Project Activity:
4.1.2 was released on 2023-04-10.
4.2.0 was released 2022-12-12.

The Impala community has implemented the following over the last three months:
- Bug fixes for nested types handling
- Various improvements on the Web UI
- Apache Iceberg support improvements
- Apache Kudu support improvements
- Fix for incorrectly written Iceberg test data
- Added Hive's ESRI geospatial functions
- Document changes
- Various performance improvements
- Improved logging for different existing features
- Apache Ozone support improvements
- Introduced support for OBS file system
- Various test fixes
- Switch to C++17
- Bumped versions for various dependencies
- Improved decision making in query Planner
- Various improvements for Python3 support
- Fixes and improvements for impala-shell

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were 3606 emails to that list in February, March and April together. This is one of the highest numbers for a quarter in recent years. Impala remains a vibrant project.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (5 years ago)
There are currently 65 committers and 37 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:
- Daniel Becker was added to the PMC on 2023-01-16
- Yida Wu was added as committer on 2023-01-20
- Li Penglin was added as committer on 2022-12-20
- Michael Smith was added as committer on 2022-11-07

## Project Activity:
The Impala community has implemented the following over the last three months:
- Improved support for Apache projects including Iceberg, Hadoop, Ozone, Thrift, Ranger, Hive, Parquet, Avro, Hudi, and Kudu
- Multiple dependency upgrades due to their CVEs
- Multiple tracing and debugging improvements
- Multiple fixes for flaky tests
- Numerous documentation fixes
- Some tightening of authorization constraints
- DDL support for bucketed tables
- Improved support for Docker and for Ubuntu 16.04
- Made the docs much prettier
- Added support for Aliyun Object Storage Service
- Fixed multiple crashes

## Community Health:
4.2.0 was released 2022-12-12.

reviews@ is the most reliable metric of Impala community activity level. There were 2980 emails to that list in November, December, and January. Impala remains a vibrant project.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (5 years ago)
There are currently 63 committers and 36 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:5.

Community changes, past quarter:
- No new PMC members. Last addition was Tamás Máté on 2022-07-10.
- No new committers in August, September, and October. Michael Smith was added as committer on 2022-11-07

## Project Activity:
- Improved support for Apache projects including Hadoop, Iceberg, Hive, Ozone, Commons, Kudu, Ranger, ORC, Parquet, Tez, and Thrift
- Improved support for Guava, Jackson, AWS S3, Tencent COS, Ubuntu 18+, log4j 1.x -> reload4j, Docker, Java 11, Redhat and Redhat-based Linux distributions, Spring, flatbuffers, GCC 10.4, zlib, and zstd
- Reduced compile times and built binaries' size
- Improved debugging support
- Increased decimal performance
- Added support for TBLPROPERTIES on views
- Fixed multiple flaky tests
- Fixed multiple memory leaks
- Added support for map type in SELECT list
- Added support for TLS 1.3
- Added support for BINARY columns
- Made multiple improvements to code review tooling

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were 3847 emails to that list in August, September, and October. Impala remains a vibrant project.

4.1.1 was released on 2022-10-20.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (5 years ago)
There are currently 62 committers and 36 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:5.

Community changes, past quarter:
- Tamás Máté was added to the PMC on 2022-07-10
- Riza Suminto was added as committer on 2022-05-27

## Project Activity:
- Improved support for Apache projects including Iceberg, Parquet, Ozone, Kudu, Hive, Avro, HBase, ORC, Ranger, Thrift, Tez, YARN, and Hadoop
- Improved support for re2, Google Cloud, Ubuntu 20, Kerberos, CentOS, and Tlinux
- Multiple improvements to the build system
- Improved support for timestamps
- Improved support for views
- Fix multiple undefined behaviors in C++ code
- Multiple flaky test improvements
- Multiple improvements to our Python test and shell environments, including transposed result printing
- Increase security of transport protocols by eliminating default support for RC4
- Support for various statistical UDAFs

## Community Health:
reviews@ is the most reliable metric of Impala community activity level. There were 3083 emails to that list in May, June, and July. Impala remains a vibrant project.

4.1.0 was released on 2022-06-01.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (4 years ago).
There are currently 61 committers and 35 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:5.

Community changes, past quarter:
- No new PMC members. Last addition was Laszlo Gaal on 2022-01-20.
- No new committers. Last addition was Daniel Becker on 2021-12-07.

## Project Activity:
- Improved support for compatibility with Apache projects Iceberg, ORC, Hive, Ranger, Parquet, Thrift, and Kudu
- Many fixes for flaky tests
- Improved support for materialized views
- Upgraded Spring past three CVEs
- Upgraded other packages past other CVEs
- Numerous performance improvements, including some queries improved by 50% or more

## Community Health:
reviews@ is the best gauge of Impala community activity. There were 2532 emails to reviews@ in the last three months; Impala remains a busy community.

The most recent release was Impala 3.4.1, on 2022-04-07.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (4 years ago).
There are currently 61 committers and 35 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:5.

Community changes, past quarter:
- Laszlo Gaal was added to the PMC on 2022-01-20.
- Amogh Margoor was added as committer on 2021-11-23.
- Daniel Becker was added as committer on 2021-12-07.

## Project Activity:
- Improved support for compatibility with Apache projects Iceberg, Parquet, Kudu, Hive, DataSketches, ORC, and HBase
- Improved support for or compatibility with protobuf, Java UDFs, Boost, glog, gutil, the HTTP/1.1 RFC, and PEP-0503
- Support for multiple resource groups (unfinished)
- Fixes for multiple flaky tests
- Faster query analysis for queries containing VALUES()
- Multiple fixes for more consistent metadata change application
- Support for Tencent Cloud Object Storage
- Support for zipping when unnesting arrays

## Community Health:
reviews@ is the best gauge of Impala community activity. There were 2180 emails to reviews@ in the last three months; Impala remains a busy community.

The most recent release was Impala 4.0.0, on 2021-07-12.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention

## Membership Data:
Apache Impala was founded 2017-11-14 (4 years ago)
There are currently 59 committers and 34 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:5.

Community changes, past quarter:
- No new PMC members. Last addition was Vihang Karajgaonkar on 2021-06-03.
- No new committers. Last addition was Wenzhe Zhou on 2021-07-09.

## Project Activity:
During August, September, and October, the Impala community:
- Improved support for integrations with Apache projects Ozone, ORC, Iceberg, Hive, Ranger, Kudu, DataSketches, Parquet, and HDFS
- Improved integration with non-Apache projects, formats, or protocols S3, CentOS 7, PyPi, flame graphs, Docker, and LDAP

This quarter few new features landed that weren't integrations as mentioned above. Most other patches were bug fixes.

## Community Health:
Perhaps the most stable indicator of Impala activity is reviews@, which registers an email for each code review, each submit, and each Jenkins job completion. This decreased this quarter to 2415 from 2819, a 14% decline. Impala is still a thriving community.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention

## Membership Data:
Apache Impala was founded 2017-11-14 (4 years ago)
There are currently 59 committers and 34 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:5.

Community changes, past quarter:
- Vihang Karajgaonkar was added to the PMC on 2021-06-03
- Qifan Chen was added as committer on 2021-06-25
- Tamás Máté was added as committer on 2021-06-11
- Wenzhe Zhou was added as committer on 2021-07-09

## Project Activity:
Impala 4.0.0 was released on 2021-07-12.

CVE-2021-28131 was filed, fixed, and announced.

The Impala community also accomplished:
* Increased compatibility with other Apache projects, including Parquet, Hive, Iceberg, ORC, Ranger, Kudu, DataSketches
* Improved support for z-order
* Added functionality to impala-shell (a rarely touched part of the codebase)
* Added support for JSON Web Tokens ("JWT")
* Added more support for running Impala in containers
* Fixed multiple DDL race conditions
* Added multiple planner heuristic improvements to join cardinality estimates
* Added multiple expansions to use of min/max filters
* Added some support for Alibaba cloud
* Made multiple fixes to ACID table support

## Community Health:
Perhaps the most stable indicator of Impala activity is reviews@, which registers an email for each code review, each submit, and each Jenkins job completion. This decreased this quarter to 2902 from 3153, an 8% decline. Impala is still a thriving community.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:

## Membership Data:
Apache Impala was founded 2017-11-14 (3 years ago)
There are currently 56 committers and 33 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:5.

Community changes, past quarter:
- No new PMC members. Last addition was Csaba Ringhofer on 2020-02-18.
- No new committers. Last addition was Abhishek Rawat on 2020-12-08.

## Project Activity:
During February, March, and April, the Impala community:
* Upgraded dependencies: DataSketches, thrift, Impyla, Bzip2, LZ4, Snappy, Zlib, ZStd, urllib3, python requests, Paramiko, springframework, JacksonDatabind, and slf4j
* Added improvements to compatibility with ABFS, RHEL 8, Iceberg, S3, Ubuntu 20.04, Ranger, Kudu, Calcite, Google Cloud Storage, UTF-8, Hive, ORC, and docker hub
* Addressed reliability for failed nodes and teardowns
* Added result spooling
* De-flaked many tests
* Added most components needed for supporting external frontends
* Added support for spilling to S3

## Community Health:
The community is overall healthy. This quarter has a common amount of variability in some previous metrics. It is not infrequent that this variability has no plainly obvious cause.
* 157 patches were committed this quarter, vs. 153 the previous quarter
* 212 tickets were opened, up 24%, and 152 tickets were closed, down 67%
* reviews@ traffic was up 33% to 3288 emails
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (3 years ago)
There are currently 56 committers and 33 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:5.

Community changes, past quarter:
- No new PMC members. Last addition was Csaba Ringhofer on 2020-02-18.
- Abhishek Rawat was added as committer on 2020-12-08

## Project Activity:
During November, December, and January, the Impala community added support (or improved support) for:
- Codegen in the sorter
- FIPS compliance
- More sketches from Apache DataSketches
- Cookie authentication in impala-shell
- Numerous fixes for flaky tests, including many with timing requirements that were too tight
- More support for parallelism within a single node ("dop")
- Role-related statements using Apache Ranger
- Unicode
- An admission control daemon
- More integration with Apache Iceberg

## Community Health:
The community is overall healthy. This quarter has a common amount of variability in some previous metrics. It is not infrequent that this variability has no plainly obvious cause, though the US holiday season is sometimes correlated with lower activity.
- 2,576 reviews were sent to reviews@, 39% down from the previous quarter. This metric is the most notable change.
- 170 new JIRA tickets were filed, 28% lower than the previous quarter.
- 153 patches were committed this quarter, 15% down from last quarter. There is a notable dip around Christmas, in which weekly commits increased from 3 to 22 within a week.
- Notable increases in activity are visible in total JIRA traffic as well as a 125% increase in JIRAs closed.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (3 years ago)
There are currently 55 committers and 33 PMC members in this project.
The Committer-to-PMC ratio is 5:3.

Community changes, past quarter:
- No new PMC members. Last addition was Csaba Ringhofer on 2020-02-18.
- Aman Sinha was added as committer on 2020-09-08
- Shant Hovsepian was added as committer on 2020-10-13
- Sheng Wang was added as committer on 2020-11-06

## Project Activity:
During August, September, and October, the Impala community added support (or improved support) for:
- More Iceberg support, including ALTER TABLE, INSERT INTO (for non-partitioned tables), ORC, and more
- Movement towards FIPS compliance
- Error message readability and location improvements
- System internals visibility improvements into artifacts like queues and skews
- Daily aarch64 build-and-test runs
- Many more patches than a typical quarter about developer experience. Eyeballing it, maybe twice as much? This includes fixing some long-standing build and test issues.
- Impala's first patches from contributors at @tencent.com
- The addition of support for Alluxio
- First SIMD support outside of the x86-64 family

## Community Health:
The community is overall healthy. This quarter has a common amount of variability in some previous metrics. It is not infrequent that this variability has no plainly obvious cause.
- 4,278 reviews were sent to reviews@, 2% down from the previous quarter
- 272 new JIRA tickets were filed vs. 315 last quarter
- 184 patches were committed this quarter, 5% down from last quarter
- user@ saw traffic decrease (31 emails to 19), while dev@ saw it increase (73 emails to 89)
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (3 years ago)
There are currently 52 committers and 33 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:5.

Community changes, past quarter:
- No new PMC members. Last addition was Csaba Ringhofer on 2020-02-18.
- Anurag Mantripragada was added as committer on 2020-05-13

## Project Activity:
This quarter, the Impala community added support (or improved support) for:
- GROUPING, INTERSECT DISTINCT, EXCEPT DISTINCT, and uncorrelated subqueries in HAVING
- Development environment bootstrapping with GCC 7 and on Ubuntu 20.04 and SLES12 sp5
- Sanitizers like ASAN and TSAN in developer testing
- Asynchronous code execution so a query can start in interpreted mode and switch to native code when code generation is complete
- TPCDS queries in the test suite
- Running in containerized environments

The Impala community improved compatibility with other Apache projects by:
- Adopting Apache DataSketches KLL structure for quantile estimation
- Recognizing the new ASF URL practices when downloading Maven and Ant
- Improving support for Apache Hive ACID tables
- Adding Apache Iceberg CREATE TABLE support
- Adding a number of Apache Kudu compatibility improvements
- Supporting Apache Parquet FIXED_LEN_BYTE_ARRAY DECIMAL
- Supporting Apache Hadoop Ozone in "load data inpath"

The Impala community removed some or all support for the following in the 4.0 branch:
- Dateless timestamps
- Impala-lzo
- Sentry
- Hive 2

## Community Health:
The community is overall healthy. This quarter has a common amount of variability in some previous metrics. It is not infrequent that this variability has no plainly obvious cause.
- Commits are down this quarter from 221 to 197.
- Six community members authored their first patch.
- JIRAs created are down to 315 from 360; JIRAs resolved are up to 357 from 243. A significant number of these are Later, WontFix, CannotReproduce, etc.
- user@ traffic is up 50% to 30 emails; dev@ traffic is down 48% to 69 emails.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (2 years ago)
There are currently 51 committers and 33 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:2.

Community changes, past quarter:
- Csaba Ringhofer was added to the PMC on 2020-02-18
- Norbert Luksa was added as committer on 2020-04-09

## Project Activity:
- Support for Apache Hudi tables
- 3.4.0 release and move of HEAD to 4.0, allowing breaking changes to land
- Fix numerous flaky tests caused by races, including many found using ThreadSanitizer.
- Improvements to interoperability (or interoperability documentation) with many Apache projects, including Parquet, Kudu, Ranger, HDFS Ozone, and ORC
- Continued significant efforts towards aarch64 support
- Improvements to zstd read support
- Reduction in duplicate codegen work by sharing codegen models between fragment instances
- Numerous improvements to Kerberos ergonomics
- Significant performance improvements via query rewrites as well as work sharing of codegen and join builds
- Support for CentOS 8.1 and Ubuntu 18.04

## Community Health:
Activity on most metrics increased last quarter: dev@ +86%, issues@ +56%, reviews@ +33%, commits +37%.
## Description:
The mission of Apache Impala is the creation and maintenance of software related to a high-performance distributed SQL engine

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Impala was founded 2017-11-14 (2 years ago)
There are currently 50 committers and 32 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:4.

Community changes, past quarter:
- No new PMC members. Last addition was Fredy Wijaya on 2019-07-27.
- No new committers. Last addition was Laszlo Gaal on 2019-06-19.

## Project Activity:
- Discussions on a release of 3.4 have begun
- Planner and executor improvements for multi-threaded execution
- Improvements to tests on ACID tables
- Continued iterations on local catalog mode
- The enablement of primary/foreign key hints during table creation
- A number of improvements to test reproducibility
- A correctness fix for negative zero
- Numerous improvements to ORC file handling
- Several Apache Ranger related improvements, including support for column masking
- Ten tickets with activity on aarch64 support; Impala has traditionally only supported x86-64

## Community Health:
Activity on many metrics decreased last quarter. This is typical for the project, and it corresponds to the US holiday season. The most prominent decrease was in the number of commits, which was down to 164. The November-December-January quarter has, in years past, seen 238, 258, 310, 199, and 183 commits (reverse chronological order).
## Description: Apache Impala is a high-performance distributed SQL engine. ## Issues: There are no issues requiring board attention. ## Membership Data: Apache Impala was founded 2017-11-14 (2 years ago) There are currently 50 committers and 32 PMC members in this project. The Committer-to-PMC ratio is roughly 7:4. Community changes, past quarter: - No new PMC members. Last addition was Fredy Wijaya on 2019-07-27. - No new committers. Last addition was Laszlo Gaal on 2019-06-19. ## Project Activity: Notable activity in the last quarter includes: - Version 3.3.0 was released on August 22 - A number of correctness and functional-parity fixes for transactional tables and their tests. Transactional tables are relatively new in Impala - A number of minor improvements to the webui - Improves memory estimation for clusters with dedicated coordinators (i.e. nodes that are not acting as executors) - Continued support for the new catalog version in the pre-merge tests - A JSON formatting of the query profile. For years, the query profile has been unstable, in that the developers reserved the right to change it or its formatting at any time. This is a first step in the direction of stability, which could increase usability and allow tooling built on top of the profile to be more reliable. - Support for .DEFLATE text data files in tables - By default, limits SQL statements to 16 million characters or fewer - A variety of improvements to compatibility with other Apache projects, including Knox, Tez, Derby, Kudu, Ranger, and Hive - The publication of CVE-2019-10084 - Support for cookie-based authentication - Numerous improvements to the end-stages of query lifespans, including some enhancements in resource deallocation - A large number of commits about spooling, which had zero presence in the commit log before July of 2019 - Some support for ZORDER - The addition of DATE support for Avro files; the removal of DATE support for the year 0 - Support for distributable impala-shell. 
It can be installed from PyPI ## Community Health: While the number of commits labeled a "fix" has held steady over the last three quarters at 58, 54, and 59, this last quarter the number of non-"fix" commits dropped to 196 from 252 and 247 the previous quarters. Impala has not had this few commits (of any flavor) in an August-September-October timeframe before (although the repository contains an anomaly in which almost all pre-2014 commits landed in a single moment in January 2014). Overall activity is a mixed bag, but mostly down, with a decrease in email traffic, JIRA activity, and number of distinct patch authors, in addition to the commit number mentioned above. Furthermore, there is usually a lull in activity during our November-December-January reporting quarter due to US holidays. While this is a slowdown, development activity is still high in the context of open-source projects, with dozens of patch authors and activity on hundreds of JIRAs.
## Description: Apache Impala is a high-performance distributed SQL engine. ## Issues: There are no issues requiring board attention. ## Membership Data: Apache Impala was founded 2017-11-14 (2 years ago) There are currently 50 committers and 32 PMC members in this project. The Committer-to-PMC ratio is roughly 7:4. Community changes, past quarter: - Bikramjeet Vig was added to the PMC on 2019-05-29 - Fredy Wijaya was added to the PMC on 2019-07-27 - Gabor Kaszab was added to the PMC on 2019-05-22 - Andrew Sherman was added as committer on 2019-06-07 - Laszlo Gaal was added as committer on 2019-06-19 - Sahil Takiar was added as committer on 2019-05-22 - Vihang Karajgaonkar was added as committer on 2019-05-14 ## Project Activity: Notable activity in the last quarter includes: - Numerous commits related to support for Hive's ACID table format - Improvements to the consideration of nodes as individual executors or coordinators, including: - Improvements to admission control and executor pool management when there is a dedicated coordinator - "Executor groups", a feature that allows users to run different queries in disjoint sets of executors - The addition of admission control parameters that scale with the number of executors - Improvements to the developer experience on Docker - Increased compatibility with Apache projects, including Hive 3, erasure coding and S3Guard in HDFS, page skipping and Zstd and lz4 compression in Parquet, and miscellaneous compatibility improvements for Ranger, Knox, Kudu, and Atlas - The addition of several built-in functions for the DATE type as well as the ability to read and write DATE in Parquet - Move closer to deprecating the Beeswax protocol by adding HS2 support to the Impala shell - Numerous patches improving tracing, logging, and metrics - Multiple improvements to build times or isolation - The addition of a data cache for remote reads, improving TPC-DS performance on S3 by 30% in one scenario, which made S3 performance as good as
HDFS-on-EBS ## Community Health: By almost all metrics, Impala activity is down quarter-over-quarter and year-over-year. That said, the project is still very active, with each day featuring approximately: - Three commits - Two dev@ emails - 25 JIRA updates - 70 code reviews or patch updates
## Description: Impala is a high-performance distributed SQL engine. ## Issues: There are no issues requiring board attention at this time. ## Activity and health report: The previous three months had 237 patches to the master branch, while this three-month period had 286. This is likely the recovery from the usual seasonal dip around the end of each calendar year. Prominent work in the last three months includes: - An admission controller debugging page - Thousands of lines of new planner tests - A number of improvements to the shell scripts used to build and to start the various daemons - A few patches that reduced the disk space needed for development by tens of gigabytes - Support for development on Ubuntu 18.04. - Support for Apache Ranger and decoupling Apache Sentry - Support for complex types in ORC files - Better hardware detection - Compatibility with Hive 3.x for data loading and the Metastore - Numerous improvements to metrics and counters - Continued work on supporting Docker in development environments and in production - Initial support for a timeless DATE type Health is a subjective metric, but the increased compatibility with other open source and Apache projects is a good sign, as is the nine new patch authors. ## PMC changes: - Currently 29 PMC members. - Quanlong Huang was added to the PMC on Sun Mar 10 2019 ## Committer base changes: - Currently 46 committers. - New committers: - Pooja Nilangekar was added as a committer on Tue Apr 09 2019 - Paul Rogers was added as a committer on Mon Feb 04 2019 ## Releases: - 3.2.0 was released on Wed Mar 27 2019 ## JIRA activity: - 326 JIRA tickets created in the last 3 months - 296 JIRA tickets closed/resolved in the last 3 months
@Myrle: follow up about trademark issue
## Description: Impala is a high-performance distributed SQL engine. ## Activity: The previous three months had 330 patches to the master branch, while this three-month period had 237. This is likely a seasonal dip. Prominent work in the last three months includes: - The revival of the 2.x branch - Modernization of the documentation of the HBase integration - Significant changes to enable Impala to run better in containers - Improvements to the user experience of dealing with changing metadata - Multiple improvements to profile statistics - Numerous build process improvements for performance - Support for reading additional Parquet field types ## Health report: The project remains healthy and metrics (number of commits, bugs filed, and mailing list activity) remain healthy. Three new contributors had patches committed. ## PMC changes: - Currently 28 PMC members. - Zoltán Borók-Nagy was added to the PMC on Thu Jan 03 2019 ## Committer base changes: - Currently 45 committers. - Paul Rogers was added as a committer on Mon Feb 04 2019 - Zoram Thanga was added as a committer on Fri Nov 16 2018 ## Releases: - 3.1.0 was released on Wed Dec 05 2018 ## Mailing list and JIRA activity: Activity dropped, consistent with a seasonal dip during US holidays that Impala sees every year: reviews@, issues@, dev@ traffic decreased by about 30%.
## Description: Impala is a high-performance distributed SQL engine. ## Activity: The previous three months had 350 patches to the master branch, while this three-month period had 330. Prominent work in the last three months includes: - Support for multiple DISTINCT - The first Apache two-dot release (3.0.1) was made; normally we only do x.y.0 releases. This was done to fix two security issues. - Official CentOS support for developers. - A number of changes to reduce the number of undefined behaviors in the C++ code. - Support for Hadoop's connector for Azure's new storage system, "Azure Data Lake Storage Gen2". - Multiple improvements in resource estimation and resource management. - Continued improvements in "local catalog" mode. - The addition of builtin JSON parsing functions. - Graceful node shutdown (with drain/quiesce). ## Health report: The project remains healthy and metrics (number of commits, bugs filed, and mailing list activity) remain healthy. Four new contributors had patches committed. ## PMC changes: - Currently 27 PMC members. - Joe McDonnell was added to the PMC on Mon Aug 20 2018 ## Committer base changes: - Currently 44 committers. - Quanlong Huang was added as a committer on Thu Aug 23 2018 ## Releases: - 3.0.1 was released on Tue Oct 23 2018 ## Mailing list activity: Mailing list metrics that held steady: - user@: 83 emails sent in the past 3 months, 87 in the previous cycle. Mailing list metrics that changed more: - dev@: 205 emails sent in the past 3 months, 299 in the previous cycle. There is no obvious immediate cause and this is likely just statistical fluctuation. - reviews@: 7312 emails sent in the past 3 months, 5638 in the previous cycle. ## JIRA activity: - 409 JIRA tickets created in the last 3 months - 386 JIRA tickets closed/resolved in the last 3 months
## Description: Impala is a high-performance distributed SQL engine. ## Issues: As mentioned in the May report, Impala asked Google to add a disclaimer to their "IMPALA: Scalable Distributed DeepRL in DMLab-30"[0] acknowledging that the ASF owns the Impala trademark. Google had declined to do so, but as of June 5th, they "made a decision to launch the code by reference to its full name (Importance Weighted Actor-Learner Architecture) - i.e. we will no longer be using the acronym IMPALA". A project named "Palo" was accepted into the Incubator as "Doris". It is based on a forked Impala code base and a few Impala PMC members have expressed concern about some of their previous practices from a community-oriented perspective. These were addressed directly by the Palo/Doris community in the incubation proposal and there will hopefully be upcoming opportunities for both projects to start more direct contributions to each other. ## Activity: The previous three months had 364 patches to the master branch, while this three-month period had 350. Prominent work in the last three months includes: - The release of Impala 3.0 - The start of a very large effort to enable Impala to run without catalogd, which can sometimes be a bottleneck - A number of improvements to developer workflow, including documentation build changes and improvements to automate our CI system even more - The addition of fine-grained privileges A new book with multiple chapters about Impala was published, "Next-Generation Big Data: A Practical Guide to Apache Kudu, Impala, and Spark", by Butch Quinto. ## Health report: The project remains healthy and the cadence of most metrics (number of commits, emails, releases) is steady compared to previous quarters. One notable change is user@, which declined in volume by roughly 64%. The previous quarter, some large threads dominated user@: - A user inquiry on local join vs. 
exchange (10%) - A user inquiry on installation on Ubuntu (10%) - A user inquiry on admission control (7%) - A user inquiry on memory estimation (10%) ## PMC changes: - Currently 26 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Philip Zeyliger on Sun Apr 22 2018 ## Committer base changes: - Currently 42 committers. - New committers: - Attila Jeges was added as a committer on Fri May 18 2018 - Zoltán Borók-Nagy was added as a committer on Fri May 04 2018 - Csaba Ringhofer was added as a committer on Mon May 28 2018 - Fredy Wijaya was added as a committer on Mon Jun 25 2018 - Gabor Kaszab was added as a committer on Fri May 18 2018 - Jin Chul Kim was added as a committer on Fri May 18 2018 ## Releases: - 3.0.0 was released on Sun May 06 2018 ## Mailing list activity: Mailing list metrics that held very steady: - dev@: 341 emails (309 in previous quarter) - issues@: 932 emails (929 in previous quarter) - reviews@: 6250 emails (6502 in previous quarter) ## JIRA activity: - 419 JIRA tickets created in the last 3 months - 418 JIRA tickets closed/resolved in the last 3 months
## Description: Impala is a high-performance distributed SQL engine. ## Issues: The PMC requested that Google, the owners of "IMPALA: Scalable Distributed DeepRL in DMLab-30"[0], add a disclaimer acknowledging that the ASF owns the Impala trademark. Google declined to do so. We engaged with Mark Thomas, ASF's VP, Brand. He suggested we not pursue this further. See [1] for details. 0: https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/ 1: https://s.apache.org/impala-deeprl-rm ## Activity: The previous three months had 248 patches to the master branch, while this three-month period had 364. This pattern in the past was usually a result of a slowdown during the US winter holiday months, and I expect that's true of this uptick, as well. Prominent work in the last three months includes: - Decimal support in Kudu tables - End-to-end compression of metadata - Support for LLVM 5 - Support for Hadoop 3 - Support for the ORC format ## Health report: The project remains very active. Statistics pertaining to this are covered above and below: more than 100 commits per month, more than 100 JIRAs resolved per month, four new people added as PMC members or committers. We have continued a release cadence of about once per quarter. ## PMC changes: - Currently 26 PMC members. - New PMC members: - Philip Zeyliger was added to the PMC on Sun Apr 22 2018 - Thomas Marshall was added to the PMC on Sun Feb 04 2018 ## Committer base changes: - Currently 36 committers. - New committers: - Alexandra Rodoni was added as a committer on Fri Apr 06 2018 - Vuk Ercegovac was added as a committer on Tue Apr 03 2018 ## Releases: - 2.12.0 was released on Mon Apr 23 2018 - 3.0 is being voted on as of May 3. ## JIRA activity: - 504 JIRA tickets created in the last 3 months - 408 JIRA tickets closed/resolved in the last 3 months
## Description: Impala is a high-performance distributed SQL engine. ## Issues: There are no special issues the board should be aware of. ## Activity: The most prominent activity in January was the branching to prepare for a 3.0 release. Version 2.11.0 was also released in January, and a 2.x branch is maintained with the anticipated possible release of versions 2.12.0 and beyond. This branching enabled a number of breaking changes to finally land. Other notable efforts include: * Enhancements to sampled statistics collection * The continuation of long-term efforts around the buffer pool * The continuation of long-term efforts around Kudu's RPC * The enablement of different decimal semantics (for the 3.x branch only) * Improved usage of OpenSSL (both performance and correctness) * The exposure of more system information in the web UI * Multiple improvements to test parallelism performance and correctness ## Health report: The project remains healthy. There were 124 dev@ emails, 62 user@ emails, 106 tickets opened and 104 resolved, and three new patch authors. There were 98 commits, which is consistent with the average rate over 2017 of 92 commits per month. ## PMC and committer changes: Tianyi Wang was added as a committer on January 5. Philip Zeyliger was added as a committer on January 9. The most recent new PMC member was added on 2017-09-27. ## Releases: 2.11.0 was released on January 18.
## Description: Impala is a high-performance distributed SQL engine. ## Issues: There are no special issues the board should be aware of. ## Activity: Notable efforts in December include work on decimal and floating point correctness, test and build infrastructure refactoring, the addition of more debugging and profiling information (and the removal of some less helpful information), performance improvements for computing table statistics, support for processors with AVX-512, a variety of fixes to runtime filters, and Kerberos handling improvements. ## Health report: The project is healthy. December was a slower month than November, likely due to two US holidays at the end of the month. There were 60 commits, 112 dev@ emails, 45 user@ emails, 66 tickets resolved, and 94 tickets opened. ## PMC and committer changes: Greg Rahn was added as a committer on December 12. The most recent new PMC member was added on 2017-09-27. ## Releases: The release process for 2.11 is underway: https://s.apache.org/impala-2.11-vote-results
## Description: Impala is a high-performance distributed SQL engine. ## Issues: There are no special issues the board should be aware of. ## Activity: Impala graduated from the incubator to a TLP on 15 November 2017. The incubator report covering August, September, and October is available at https://wiki.apache.org/incubator/November2017. Notable efforts in November include Hadoop 3.0 compatibility work, TABLESAMPLE work to compute statistics more quickly, test reliability, decimal arithmetic type changes, client connectivity bug fixes, and changing the RPC mechanism for data stream service. ## Health report: The project is healthy. November had 90 commits, the same as October. The dev list had 139 emails, compared to 175 in the previous 30 days. The user list had 31 emails, compared to 17 in the previous 30 days. 109 tickets were resolved, compared to 101 in October. 122 tickets were created, compared to 130 in October. Several new contributors have been active. ## PMC and committer changes: The most recent new PMC member was added on 2017-09-27. Two new committers were added on 2017-09-29. ## Releases: The last release was 2017-09-14. Discussions for the next release are in progress. Discussions of a major (compatibility-breaking) release have occurred, but there does not appear to be much enthusiasm to do such a release right now. Some compat-breaking changes, like DECIMAL_V2, are available already behind feature flags.
WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software, for distribution at no charge to the public, related to a high-performance distributed SQL engine. NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the "Apache Impala Project", be and hereby is established pursuant to Bylaws of the Foundation; and be it further RESOLVED, that the Apache Impala Project be and hereby is responsible for the creation and maintenance of software related to a high-performance distributed SQL engine; and be it further RESOLVED, that the office of "Vice President, Apache Impala" be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache Impala Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache Impala Project; and be it further RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache Impala Project: * Alex Behm <abehm@apache.org> * Bharath Vissapragada <bharathv@apache.org> * Brock Noland <brock@apache.org> * Carl Steinbach <cws@apache.org> * Casey Ching <casey@apache.org> * Daniel Hecht <dhecht@apache.org> * Dimitris Tsirogiannis <dtsirogiannis@apache.org> * Henry Robinson <henry@apache.org> * Ishaan Joshi <ishaan@apache.org> * Jim Apple <jbapple@apache.org> * John Russell <jrussell@apache.org> * Juan Yu <jyu@apache.org> * Lars Volker <lv@apache.org> * Lenni Kuff <lskuff@apache.org> * Marcel Kornacker <marcel@apache.org> * Martin Grund <mgrund@apache.org> * Matthew Jacobs <mjacobs@apache.org> * Michael Brown <mikeb@apache.org> * Michael Ho <kwho@apache.org> * Sailesh Mukil <sailesh@apache.org> * Skye Wanderman-Milne <skye@apache.org> * Taras 
Bobrovytsky <tarasbob@apache.org> * Tim Armstrong <tarmstrong@apache.org> * Todd Lipcon <todd@apache.org> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Jim Apple be appointed to the office of Vice President, Apache Impala, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further RESOLVED, that the initial Apache Impala PMC be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Apache Impala Project; and be it further RESOLVED, that the Apache Impala Project be and hereby is tasked with the migration and rationalization of the Apache Incubator Impala podling; and be it further RESOLVED, that all responsibilities pertaining to the Apache Incubator Impala podling encumbered upon the Apache Incubator PMC are hereafter discharged. Special Order 7D, Establish the Apache Impala Project, was approved by Unanimous Vote of the directors present.
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters. Impala has been incubating since 2015-12-03. Three most important issues to address in the move towards graduation: Our graduation proposal is in the works. Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? No How has the community developed since the last report? There have been 279 Commits: git log --format='%ci' | grep -cE '2017-(08|09|10)' 62 of those commits were by non-committers: git log --format='%an %ci' | grep -E '2017-(08|09|10)' | tr -d '0-9\-' | cut -d ' ' -f -2 | sort | uniq -c | sort -n Of the 37 patch authors, 16 were not committers at the beginning of this reporting period. There are three new committers and one new PPMC member: https://lists.apache.org/list.html?dev@impala.apache.org:dfr=2017-8-1|dto=2017-10-31:%22has%20invited%22 Impala has done a fourth release with a third release manager. Impala has begun graduation procedures: we have held a community discussion and a community vote on graduation, both unanimous. We have established our intended PMC. Next, we will draft our charter and hold a discussion on general@incubator. How has the project developed since the last report? Impala has removed the old unpartitioned hash and aggregation nodes, relics from years ago that were kept around for backwards compatibility: the new buffer management makes these obsolete. Code generation for decimal and timestamp types has been added to the text scanner, increasing the performance of some queries by up to 19%. More robust query plans in case of data skew have made some aggregations eight times as fast. A number of large changes are in-flight, including changes to equivalence class computation in the planner, more decimal semantics adjustments, min-max filters for Kudu, and multi-threaded metadata loading that increases the performance of some metadata operations by 8x.
How would you assess the podling's maturity? Please feel free to add your own commentary. [ ] Initial setup [ ] Working towards first release [ ] Community building [X] Nearing graduation [ ] Other: Date of last release: 2017-09-14 When were the last committers or PPMC members elected? 2017-09-29 Signed-off-by: [ ](impala) Tom White Comments: [X](impala) Todd Lipcon Comments: [ ](impala) Carl Steinbach Comments: [X](impala) Brock Noland Comments: IPMC/Shepherd notes: Drew Farris (shepherd): Three mentors active on the mailing lists, healthy project, excellent progress towards graduation.
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters. Impala has been incubating since 2015-12-03. Three most important issues to address in the move towards graduation: 1. Growth of the developer community 2. 3. Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? No How has the community developed since the last report? There have been 268 Commits: git log --format='%ci' | grep -cE '2017-0(5|6|7)' 51 of those commits were by non-committers: git log --format='%ae %ci' | grep -E '2017-0(5|6|7)' | cut -d ' ' -f 1 | sort | uniq -c | sort -n There are two new PPMC members: https://lists.apache.org/list.html?dev@impala.apache.org:dfr=2017-2-1|dto=2017-4-30:%22has%20invited%22 Impala has done a third release with a second release manager. Two CVEs were issued, our first ones under the Apache security guidelines. How has the project developed since the last report? There have been big changes to the buffer pool, as outlined in https://lists.apache.org/thread.html/f573698455bf2ff9ac2073c778802d0d5c9f3c8be43ede80614259cb@%3Cdev.impala.apache.org%3E . There have also been big changes landing to the RPC layer to improve scalability. Impala now has TABLESAMPLE to allow running queries on only a small percentage of the table for experimenting with queries quickly, and it now works on ADLS. How would you assess the podling's maturity? Please feel free to add your own commentary. [ ] Initial setup [ ] Working towards first release [X] Community building [X] Nearing graduation [ ] Other: Once the developer community has grown a bit, Impala will be ready to contemplate graduation. Date of last release: 2017-06-16 When were the last committers or PPMC members elected? 2017-07-17 Signed-off-by: [ ](impala) Tom White Comments: [x](impala) Todd Lipcon Comments: [x](impala) Carl Steinbach Comments: [ ](impala) Brock Noland Comments:
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters. Impala has been incubating since 2015-12-03. Three most important issues to address in the move towards graduation: 1. Growth of the developer community 2. 3. Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? No How has the community developed since the last report? There have been 267 Commits: git log --format='%ci' | grep -cE '2017-0(2|3|4)' 42 of those commits were by non-committers: git log --format='%ae %ci' | grep -E '2017-0(2|3|4)' | cut -d ' ' -f 1 | sort | uniq -c | sort -n There were 114 emails to the user list. Of the top nine participants, six were not committers: https://lists.apache.org/trends.html?user@impala.apache.org:dfr=2017-2-1|dto=2017-4-30: There are six new committers: https://lists.apache.org/list.html?dev@impala.apache.org:dfr=2017-2-1|dto=2017-4-30:%22has%20invited%22 Two new contributors have announced plans to take on large development efforts (JSON support and ppc64le support). How has the project developed since the last report? Two of the "three most important issues to address in the move towards graduation" from our last report in February have been completed: The bug tracker was transitioned to issues.apache.org, and the documentation now describes Apache Impala specifically, not any non-Apache extensions. The documentation has also been posted to http://impala.incubator.apache.org/docs/build/html/index.html and http://impala.incubator.apache.org/docs/build/impala.pdf. Many commits have landed since our last report towards increasing the performance of metadata. How would you assess the podling's maturity? Please feel free to add your own commentary. [ ] Initial setup [ ] Working towards first release [X] Community building [X] Nearing graduation [ ] Other: Once the developer community has grown a bit, Impala will be ready to contemplate graduation. 
Date of last release: 2017-01-22 When were the last committers or PPMC members elected? 2017-04-24 Signed-off-by: [X](impala) Tom White Comments: [X](impala) Todd Lipcon Comments: [X](impala) Carl Steinbach Comments: [X](impala) Brock Noland Comments:
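The commit counts in the incubator reports above come from shell pipelines over `git log --format='%ae %ci'`, which print one "author-email commit-date" pair per commit. The same tally can be sketched in Python as follows. This is a minimal illustration, not the script the podling used: the function name `tally_commits`, the sample addresses, and the idea of passing in a set of committer emails are all assumptions, since the reports do not say how committers were identified. Unlike the `grep` in the reports, which matches anywhere in the line, the sketch matches the pattern against the date field only.

```python
import re
from collections import Counter


def tally_commits(log_lines, month_pattern, committers):
    """Count commits per author email for commit dates matching month_pattern.

    log_lines: iterable of "<author-email> <commit-date>" strings, as printed
        by `git log --format='%ae %ci'`.
    month_pattern: regex applied to the start of the date field.
    committers: set of committer emails (hypothetical input).
    Returns (total commits, commits by non-committers, per-author Counter).
    """
    month_re = re.compile(month_pattern)
    per_author = Counter()
    for line in log_lines:
        # Split off the email; the rest of the line is the %ci date.
        email, _, date = line.strip().partition(" ")
        if email and month_re.match(date):
            per_author[email] += 1
    total = sum(per_author.values())
    non_committer = sum(
        count for email, count in per_author.items() if email not in committers
    )
    return total, non_committer, per_author


# Made-up sample data standing in for real `git log` output.
sample = [
    "alice@example.org 2017-02-03 10:00:00 -0800",
    "bob@example.org 2017-03-15 09:30:00 +0100",
    "alice@example.org 2017-04-30 23:59:59 -0700",
    "carol@example.org 2017-05-01 00:00:01 -0700",  # outside the quarter
]
total, outside, per_author = tally_commits(
    sample, r"2017-0[234]-", {"alice@example.org"}
)
print(total, outside)
```

To run it on a real checkout, the output of `git log --format='%ae %ci'` could be piped in and passed as `sys.stdin`, with the committer set supplied from the project's roster.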
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters. Impala has been incubating since 2015-12-03. Three most important issues to address in the move towards graduation: 1. Community growth 2. Transition of bug tracker to issues.apache.org 3. Evolution of documentation to describe specifically Apache Impala Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? No How have the community and project developed since the last report? Our last report was in November. Since then, there have been 148 commits. 49 commits were authored by non-committers, of which 4 new commits come from 3 new contributors. dev@ received 496 emails and user@ received 53. 443 new issues have been filed. There has been one release (our second Apache release) and we have added one new PPMC member. Our infrastructure has been transitioning: we moved our pre-commit testing out of our old, pre-Apache hosting and we have been actively working with Gavin McDonald on migrating our JIRA hosting. Date of last release: 2017-01-22 When were the last committers or PPMC members elected? 2017-01-12 Signed-off-by: [x](impala) Tom White [x](impala) Todd Lipcon [ ](impala) Carl Steinbach [ ](impala) Brock Noland
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters. Impala has been incubating since 2015-12-03. Three most important issues to address in the move towards graduation: 1. Community growth 2. Transition of user documentation to Apache hosting 3. Migration of pre-commit continuous integration testing to publicly-available infrastructure Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? No How has the community developed since the last report? Our last report was in August. Since then, we have five new contributors who have authored patches, while two relatively recent contributors who were active before August have continued their involvement by authoring new patches. Traffic to our developer mailing list has grown by about 60%. How has the project developed since the last report? There have been 241 commits since the last report. Our status website now has 16 of the 17 listed work items complete. We had our first Apache release and have a wiki page describing how to perform the release in detail. We scrubbed our code using the RAT tool for copyright notices not compliant with the ASF rules. We wrote guidelines for contributors on how to become a committer and added a new committer. All developer documentation has now moved to the Apache-hosted wiki. Date of last release: 2016-10-05 When were the last committers or PMC members elected? 2016-08-18 Signed-off-by: [X](impala) Tom White [ ](impala) Todd Lipcon [ ](impala) Carl Steinbach [ ](impala) Brock Noland
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters. Three most important issues to address in the move towards graduation: 1. Transition of development workflows to ASF (see https://issues.cloudera.org/browse/IMPALA-3221) 2. Initial release as incubating project. 3. Community growth Any issues that the Incubator PMC or ASF Board might wish/need to be aware of? No. How has the community developed since the last report? Our last report was in April. Since then * Six new contributors have submitted patches for review, and two contributors new to the project since incubation have continued to send patches. * Mailing list activity more than doubled in the four months since our last report compared to the four months before that, from 31 threads to 75 threads (excluding patch review comments) How has the project developed since the last report? * The podling name search was completed: https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-96 * Trademark handoff was completed * Impala's git repository is now hosted on ASF infrastructure * Impala's website's source is hosted on ASF’s git infrastructure and the website is now available on https://impala.apache.org * Project bylaws have been ratified: https://impala.apache.org/bylaws.html * Developer documentation has started to move to the ASF-hosted wiki: https://cwiki.apache.org/confluence/collector/pages.action?key=IMPALA * Work has begun in migrating to ASF-hosted JIRA * A patch changing the copyright headers is in review Date of last release: No releases have been made yet. When were the last committers or PMC members elected? No committers or PMC members have been added since incubation began. Signed-off-by: [X](impala) Tom White [ ](impala) Todd Lipcon [ ](impala) Carl Steinbach [ ](impala) Brock Noland
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters.

Impala has been incubating since 2015-12-03.

Three most important issues to address in the move towards graduation:
1. Transition of development workflows to ASF (see https://issues.cloudera.org/browse/IMPALA-3221)
2. Initial release as incubating project.
3. Community growth

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?
None.

How has the community developed since the last report?
There have been no additions to the committer or PMC lists since incubation began. We continue to see an uptick in external contributions, with two patches from new contributors this month. One contributor pleasingly reported it was "great to work closely with Impala community".

How has the project developed since the last report?
We have put together a list of tasks required to move development of Impala from Cloudera's infrastructure to the ASF. Since Impala was already a well-established project before the Incubator proposal, there is perhaps more decoupling required than for more nascent projects. The list is at https://issues.cloudera.org/browse/IMPALA-3221, and is being actively worked on. Note that this doesn't cover the standard podling tasks (like name search, etc.).

Date of last release: None since entering incubation.

When were the last committers or PMC members elected? None since entering incubation.

Signed-off-by:
[X](impala) Tom White
[ ](impala) Todd Lipcon
[ ](impala) Carl Steinbach
[ ](impala) Brock Noland
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters.

Impala has been incubating since 2015-12-03.

Three most important issues to address in the move towards graduation:
1. Movement of existing JIRA / Git / wiki resources to Apache equivalents
2. Initial release as incubating project.
3. Community growth

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?
None.

How has the community developed since the last report?
There have been no additions to the committer or PMC lists since incubation began. However, we have seen an uptick in external contributions, both in code and in discussion on the mailing list. One contributor has been attempting to port Impala to PPC, and has reported some success after asking many questions.

How has the project developed since the last report?
We have made some slow progress with our initial infrastructure tasks. Code review traffic is now copied to dev@impala.incubator.apache.org, which means that developer discussions are now happening on the mailing lists.

We have a number of infrastructure tasks ahead of us which are blocked on the current Impala team at Cloudera being very busy with an internal release (this is one reason we look forward to a more diverse community!). For example:
1. We would like to move our Git repository to .apache.org in short order, but as it stands the existing repo is 10 GB in size and historically contains many binary artifacts that, while acceptably licensed, have no useful place in Impala's repository. We need to strip these artifacts from the Git history, and then adjust Gerrit to commit to the new branch in the new repo. This is not hard but takes some time.
2. We would also like to move our JIRA tickets from issues.cloudera.org to issues.apache.org. Experience in a sister podling has shown that this isn't straightforward if we wish to preserve existing release labels, user assignments and so on, so it will require some time.

We anticipate having much more time to work on these basic issues after the end of February. We look forward to getting Impala into a position where it is easier for the larger community to collaborate on these kinds of project management issues.

Date of last release: None since entering incubation.

When were the last committers or PMC members elected? None since entering incubation.

Signed-off-by:
[X](impala) Tom White
[ ](impala) Todd Lipcon
[ ](impala) Carl Steinbach
[ ](impala) Brock Noland
Impala is a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters.

Impala has been incubating since 2015-12-03.

Three most important issues to address in the move towards graduation:
1. Resolve any issues around use of Gerrit as code-review tool.
2. Movement of existing JIRA / Git / wiki / e-mail resources to Apache equivalents
3. Initial release as incubating project.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?
None.

How has the community developed since the last report?
Slowly. Impala is still in the very early stages of incubation, and performing the mechanical tasks of code movement and infrastructure setup is our first priority. The holiday period in the United States has slowed this effort slightly, but we look forward to picking up the pace in early 2016. There have been no additions to the committer or PMC lists since incubation began.

How has the project developed since the last report?
We have performed some of the basic initial tasks for incubation: establishing wiki pages, Git repositories and accounts for the initial committer set. Our next steps are:
1. Finalize the SGA from Cloudera
2. Move existing @cloudera.org e-mail aliases to their @impala.incubator.apache.org equivalents.
3. Move source code from Cloudera git repository to Apache git repo.
4. Improve out-of-box build and test experience so that community can easily evaluate release artifacts.
5. Migrate cloudera.org JIRA tickets to issues.apache.org.

Date of last release: NA

When were the last committers or PMC members elected? At the time of the Incubation vote, 2015-12-03.

Signed-off-by:
[X](impala) Tom White
[X](impala) Todd Lipcon
[ ](impala) Carl Steinbach
[ ](impala) Brock Noland

Shepherd/Mentor notes: