Iceberg

This was extracted (@ 2025-07-16 21:10) from a list of minutes which have been approved by the Board.

Please note: the Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting. ASF Members may have access to a private draft of these still-unapproved minutes.

WARNING: these pages may omit some original contents of the minutes.
This is due to changes in the layout of the source minutes over the years. Fixes are being worked on.

18 Jun 2025 [Ryan Blue / Shane]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (5 years ago)
There are currently 34 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:2.

Community changes, past quarter:
- No new PMC members. Last addition was Amogh Jahagirdar on 2024-08-12.
- No new committers. Last addition was Huaxin Gao on 2025-02-06.

## Project Activity:

Releases:
- Table spec v3 was adopted on 2025-05-22
- Java 1.9.1 was released on 2025-05-28
- Java 1.9.0 was released on 2025-04-28
- Java 1.7.2 was released on 2025-03-19
- PyIceberg 0.9.1 was released on 2025-04-30
- Rust 0.5.1 was released on 2025-05-31
- Rust 0.5.0 was released on 2025-05-26
- Go 0.3.0 was released on 2025-05-29
- Go 0.2.0 was released on 2025-03-26

Table spec:
- v3 of the Iceberg spec was adopted by a community vote! Full support for v3 is
 targeted for the upcoming Java 1.10.0 release and other implementations are
 adding support.
- Planning for Iceberg v4 has started, with groups self-organizing around
 projects like faster commits, columnar metadata, and relative path support

Java:
- Added support for Spark 4.0, removing support for 3.3
- Implemented v3 row lineage support in Spark MERGE and UPDATE
- Added a connector for the BigQuery catalog
- Added support for Flink 2.0 and removed Flink 1.18
- Ongoing work to develop a dynamic Flink sink that handles table schema changes
- Added Zookeeper locking for Flink table maintenance
- Refactored REST catalog client to use AuthManager
- Added reader/writer for partition stats files
- Completed the core implementation of Variant type

PyIceberg:
- Exceeded 500,000 downloads in a single day
- Refactoring OAuth for REST catalogs with an AuthManager, like Java
- Working on adding optimistic concurrency
- Added support for decimal backed by int32/int64
- Fixed "upsert" with complex types

Rust:
- Adding support for v3 metadata fields and encryption fields
- Now exports DataFusion table provider to Python bindings
- Added support to add existing Parquet files
- Added support for Apache Arrow dictionary type
- Added support for writing Puffin files
- Runs sqllogictests using DataFusion

Go:
- Added write and commit support
- Added support for the Glue catalog
- Supports REST catalog integration tests

C++:
- Added virtual classes for API concepts: Catalog, Table, file readers/writers
- Added manifest and manifest list structures
- Added TypeVisitor, support for converting schemas to Avro
- Added expressions, sort orders, and partition specs
- Added support for configuration files similar to those used by PyIceberg

## Community Health:

Health metrics:
- Code contributors increased by 23%, on top of a 20% rise last quarter!
- Most metrics were stable; issues closed dropped due to an outlier day

Iceberg Summit 2025 was held April 8th (in person) and 9th (virtual). 62 talks
are now available from the project's YouTube channel.
https://www.youtube.com/playlist?list=PLkifVhhWtccxMcqWlXXFvjJybisFF7ESh

In the 11 months between Iceberg Summit 2024 and 2025, there were:
- 16 releases across Java, Python, Rust, and Go
- 250 new contributors
- 7 new committers
- 5 new PMC members
- 1 new language implementation (C++)

19 Mar 2025 [Ryan Blue / Sander]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (5 years ago)
There are currently 34 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:2.

Community changes, past quarter:
- No new PMC members. Last addition was Amogh Jahagirdar on 2024-08-12.
- Huaxin Gao was added as committer on 2025-02-06
- Scott Donnelly was added as committer on 2024-12-12

## Project Activity:

Releases:
- Java 1.7.1 was released on 2024-12-06
- Java 1.8.0 was released on 2025-02-13
- Java 1.8.1 was released on 2025-02-28
- PyIceberg 0.8.1 was released on 2024-12-06
- PyIceberg 0.9.0 was released on 2025-03-06
- Rust 0.4.0 was released on 2024-12-23
- Go 0.1.0 was released on 2024-11-18

Java:
- License cleanup to ensure all distributed source and binaries are compliant
- Ongoing work to implement the v3 format before adoption
- Deletion vectors have been added
- Default value support was implemented
- Readers and writers for unknown, timestamp(9), and variant are committed
- Expression support for filtering on shredded variant metrics was completed
- Spark support for DataFusion Comet integration
- Added support for v3 row lineage to api/core
- Added InternalData to allow using columnar formats for table metadata

PyIceberg:
- Support for upsert operations
- Added residuals for scan tasks
- Use partition values from metadata
- Support for reading v3 DVs
- Write support for bucket and truncate transforms

Rust:
- Support for reading puffin file metadata
- Added manifest metadata table

Go:
- Added create/commit table support
- Completed commit updates/requirements for REST Catalog
- Add support for register table
- Add view support
- Add listing pagination support
- Improved manifest scanning

C++:
- Added Schema and Types
- New language repository!

## Community Health:

The community has seen new first-time contributors across all projects with
recent releases including:
- Java: 37 new contributors as of 1.8.1 release
- Python: 33 new contributors as of 0.9.0 release
- Rust: 17 new contributors as of 0.4.0 release

This quarter also saw a 20% increase in the number of contributors.

The community has self-organized a significant number of Iceberg-focused
meetups spanning the globe with recent meetups in the following locations:
- Austin, TX
- Palo Alto, CA
- San Francisco, CA
- Seattle, WA
- Singapore
- Tokyo, Japan
- Hyderabad, India

The PMC will discuss and document guidelines for using the trademark to make it
easier for meetups to happen while meeting the ASF requirements.

A second Iceberg Summit will be held April 8th and 9th, 2025. It combines
in-person and virtual events.

Last, I want to clarify that what I said on PR 11670 was this:
> Right now, we're focusing on describing how the community operates, rather
> than discussing how it might operate in the future.

18 Dec 2024 [Ryan Blue / Jeff]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (5 years ago)
There are currently 32 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:3.

Community changes, past quarter:
- No new PMC members. Last addition was Amogh Jahagirdar on 2024-08-12.
- Matthew Topol was added as committer on 2024-12-09
- Scott Donnelly was added as committer on 2024-12-10

## Project Activity:

Releases
- 1.7.1 was released on 2024-12-06.
- 1.7.0 was released on 2024-11-08.
- PyIceberg 0.8.1 was released on 2024-12-06.
- PyIceberg 0.8.0 was released on 2024-11-18.
- Go 0.1.0 was released on 2024-11-18.

Table format (v3)
- Added deletion vectors and synchronous maintenance to improve row-level ops
- Added row lineage fields and requirements for fine-grained row tracking
- Proposal for geography and geometry types is close to consensus
- Update to add Parquet's variant type is approved, waiting on Parquet upstream
- Finalized new type promotion rules

Puffin format
- Added deletion vector blob type to support DVs in tables

REST catalog spec
- Added storage credentials passing
- Added credential refresh
- Created a docker image for catalog testing
- Discussing proposal for partial metadata commits
- Discussed partial metadata loading

Views
- Discussions about materialized view metadata are ongoing

Java
- Released new Kafka Connect sink
- Added default values implementation for Avro
- Added nanosecond timestamps
- Added v3 DV support in core, ongoing work in Spark
- Flink: Made FLIP-27 source the default
- Spark: Removed Spark 3.3 support
- Hive: Removing Hive 2.x and 3.x (Iceberg support is in Hive for 4.x and on)
- Pig: Removed the iceberg-pig module that is no longer used

PyIceberg
- Support: Added Python 3.12, dropped Python 3.8

Rust
- Support for default values and type promotion in reads
- Added TableMetadataBuilder
- Implemented table requirements

Go
- Produced the first Go release!
- Supports scan planning and reading (data and metadata)
- Supports loading and listing tables with the Glue catalog
- Supports local and S3 storage

C++
- Added a C++ repository for a Puffin implementation

## Community Health:

The PMC has published guidelines for contributors that want to know more about
how they can become committers on the Iceberg site. This guide should help
contributors understand how Iceberg and other ASF communities decide and add
committers, and should set expectations clearly. This was the most important
follow up from discussions on the dev list earlier this year, where it became
clear that contributors did not understand the requirements or process.

The community has started planning a second Iceberg Summit, intended to be held
in Spring of 2025. The proposal details are being finalized (such as the members
of the selection committee) and will be submitted for approval in the next few
weeks.

The community added two new committers this quarter and had a slight increase in
the number of contributors.

There were also a number of commercial announcements from companies adding or
expanding support for Iceberg.

18 Sep 2024 [Ryan Blue / Justin]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (4 years ago)
There are currently 31 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:3.

Community changes, past quarter:
- Amogh Jahagirdar was added to the PMC on 2024-08-12
- Eduard Tudenhoefner was added to the PMC on 2024-08-12
- Honah J. was added to the PMC on 2024-07-22
- Renjie Liu was added to the PMC on 2024-07-22
- Peter Vary was added to the PMC on 2024-08-12
- Piotr Findeisen was added as committer on 2024-07-24
- Kevin Liu was added as committer on 2024-07-24
- Sung Yun was added as committer on 2024-07-24
- Hao Ding was added as committer on 2024-07-23

## Project Activity:
Releases:
- Java 1.6.1 was released on 2024-08-28
- Rust 0.3.0 was released on 2024-08-20
- PyIceberg 0.7.1 was released on 2024-08-18
- PyIceberg 0.7.0 was released on 2024-07-30
- Java 1.6.0 was released on 2024-07-23

Table format:
- Work for v3 is picking up
- Committed timestamp_ns implementation
- Ongoing discussion/proposal for improvements to row-level deletes
- Ongoing discussion/proposal for row-level metadata for change tracking
- Discussion for adding variant type and where to maintain the spec (Parquet)
- Making progress on geometry types
- Clarified transform requirements to add transforms as needed (to support geo)
- Discovered issues affecting new type promotion cases, reduced scope

View format:
- Ongoing discussions for tracking metadata for materialized views

REST protocol specification:
- Added server-side scan planning
- Support for removing partition specs
- Support for endpoint discovery for future additions
- Clarified failure requirements for unknown actions or validations

Java:
- Added classes for v3 table writes
- Fixed rewrites in tables with 1000+ columns
- Added Kafka Connect runtime bundle
- Support for Flink 1.20
- Added range distribution support in Flink
- Dropped support for Java 8

PyIceberg:
- Discussed adding a dependency on iceberg-rust for native extensions
- Write support for time and identity transforms
- Parallelized large writes
- Support for deletes using filter predicates
- Staged table creation for atomic CTAS
- Support manifest merging on write
- Better integration with PyArrow to produce lazy readers from scans
- New API to add existing Parquet files
- Support custom catalogs

Rust:
- Established subproject pyiceberg_core to support PyIceberg
- Implemented OAuth for catalog REST client
- Added Parquet writer and reader capabilities with support for data projection.
- Introduced memory catalog and memory file IO support
- Initialized SQL Catalog
- Added support for GCS storage and AWS session tokens
- Implemented concurrent table scans and data file fetching
- Enhanced predicate builders and expression evaluators
- Added support for timestamp columns in row filters

Go:
- Implemented expressions and expression visitors

## Community Health:
Several new committers and PMC members were added this quarter, which is a good
indicator for community health. There was also a significant number of threads
on the mailing list about setting expectations for contributors and clearly
documenting how the community operates. New guidelines for merging PRs have been
added to the website and the community is also discussing guidelines for how
contributors can become committers. This builds on work from last quarter that
clarified the process for design discussions.

Many of the topics under discussion were raised because of the acquisition that
was noted in the last board report. The community has been working to address
the concerns raised, which are primarily in 3 areas:
- How decisions are made about designs and commits (now clarified)
- How contributors become committers and PMC members (under discussion)
- How the community operates when people cannot reach consensus

The last concern has historically not been a problem; people have so far
chosen to "disagree and commit" when a large majority in the community has
a different opinion. However, the first instance of this was encountered near
the end of the quarter. The community and PMC need to discuss how to make
progress on the issue.

19 Jun 2024 [Ryan Blue / Christofer]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (4 years ago)
There are currently 27 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:4.

Community changes, past quarter:
- No new PMC members. Last addition was Szehon Ho on 2023-04-20.
- No new committers. Last addition was Renjie Liu on 2024-03-06.

## Project Activity:

Releases:
- 1.5.1 was released on 2024-04-25
- 1.5.2 was released on 2024-05-09
- PyIceberg 0.6.1 was released on 2024-04-30

PyIceberg:
- Contributors are working to release more often
- Improved retries for Hive catalog locking
- Added register table support for Glue catalogs
- Adding metadata table support (snapshots, manifests, etc.)
- Working toward 0.7.0 release with partitioned writes and staged table creation

Rust:
- Implemented projection to support partition-based file pruning
- Implemented the inclusive metrics evaluator and predicate pushdown to Parquet
- Added Hive catalog support
- Improved REST catalog with OAuth2 and custom headers
- Added integration with DataFusion

Go:
- Working toward full expression support; added literals

Iceberg Java:
- The next Java release, 1.6.0, is targeted for release in June

Specs:
- Discussions about standardizing metadata for materialized views have made good
 progress. The community decided to use existing objects rather than creating a
 new combined table/view object and is working on metadata details.
- An extension to the REST protocol for privilege GRANT and REVOKE operations
 was proposed.
- Many discussions for extending the REST protocol are ongoing, including adding
 routes to plan scans, adding auth decisions, and appending data files
- There are also discussions for v3 features, like additional types (variant,
 timestamp_ns, and others)

## Community Health:

The Iceberg community continues to be healthy, with a large number of commits
and individual contributors over the past quarter. Although overall commits
decreased, the change corresponds with the number of opened PRs so the change is
not a concern for health; PRs are getting reviewed.

The community is formalizing design discussions and has added github labels and
documented a process for making changes to community specs.

The community also held the first Iceberg Summit this quarter, with 32 sessions
that are now available on YouTube (https://tinyurl.com/iceberg-summit).
Community members also spoke at CoC EU.

A company that employs 3 PMC members and 2 committers was acquired. The PMC
members (2 of whom are ASF members) have been reminded to act as individuals,
not as representatives of their employer, when interacting in the community.
Concentration of PMC members is a risk that the community is aware of and will
note in future board reports.

Other projects and announcements:
- Trino added support for Iceberg views
- Beam has added an Iceberg sink
- Confluent, Teradata, and Oracle announced Iceberg support
- Snowflake announced a new open source REST catalog project, Polaris
- Databricks released its Unity catalog that implements the REST protocol
- Nessie added support for the Iceberg REST catalog protocol
- Gravitino, which supports the REST protocol, was added to the incubator

20 Mar 2024 [Ryan Blue / Shane]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (4 years ago)
There are currently 27 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:4.

Community changes, past quarter:
- No new PMC members. Last addition was Szehon Ho on 2023-04-20.
- Bryan Keller was added as committer on 2024-03-02
- Honah J. was added as committer on 2024-01-11
- Renjie Liu was added as committer on 2024-03-06

## Project Activity:

Releases:
- Java 1.5.0 was released on 2024-03-11
- Rust 0.2.0 was released on 2024-02-20 (first release!)
- PyIceberg 0.6.0 was released on 2024-02-19
- Java 1.4.3 was released on 2023-12-27

Java implementation:
- 1.5.0 is the first release supporting Iceberg Views
- Added View resolution support in Spark engine integration
- Added View commands to Spark (SHOW/CREATE/DROP/etc.)
- View support in Trino is unblocked by the 1.5.0 release
- Added View support to REST, Nessie, and JDBC catalogs
- Discussing Materialized View extensions to Iceberg specs
- Added EncryptingFileIO to minimize encryption-related API changes
- Added StandardEncryptionManager to implement Iceberg Encryption spec
- Added Parquet (native) and Avro (AES GCM) encryption support
- Added pagination to listing in the REST catalog protocol
- Discussing multiple extensions to the REST protocol (appends, planning)
- Added delete file cache to Spark
- Added support for Flink 1.18
- Removed support for Spark 3.2

PyIceberg Python implementation:
- 0.6.0 is the first release supporting native writes
- Append and full table overwrite are supported
- Only writes to unpartitioned tables are supported
- Added commit support to JDBC, Glue, and Hive catalogs
- Implemented name mapping support for reading Parquet files without field IDs
- Actively working on writes to partitioned tables and engine integration

Rust implementation:
- 0.2.0 is the first Rust release
- Supports reading metadata files
- Supports REST catalog interaction
- Scan planning is the next active area of work

Documentation:
- Switched to new site build in the iceberg repository so contributing is easier

## Community Health:

The Iceberg community continues to be healthy. Although commit and PR activity
declined, the metrics indicate that activity was still strong (with 70
contributors and nearly 1,000 commits). This quarter also included holidays
(which usually have decreased activity) and a huge increase in mailing list
traffic (60%) because the community has been having many design discussions
about evolving the REST spec, introducing new specs (materialized views), and
discussions around how to keep track of new design proposals.

The community also started organizing an Iceberg Summit, to be held May 14-15.
The summit has been cleared by trademarks and the call for proposals has been
posted. More information can be found at:
* The Iceberg Summit website: https://iceberg-summit.org/
* The Call for Proposals: https://sessionize.com/iceberg-summit-2024/

@Jean-Baptiste: follow up with Iceberg PMC about committer requirements

17 Jan 2024 [Ryan Blue / Rich]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Iceberg was founded 2020-05-19 (4 years ago)
There are currently 24 committers and 17 PMC members in this project.
The Committer-to-PMC ratio is 3:2.

Community changes, past quarter:
- No new PMC members. Last addition was Szehon Ho on 2023-04-20.
- Rushan Jiang was added as a committer on 2023-01-05.

## Project Activity:

Releases:
- 1.4.3 was released on 2023-12-27.
- 1.4.2 was released on 2023-11-02.
- Python 0.5.0 was released on 2023-09-18.

REST protocol spec:
- Considering an extension to delegate scan planning to the catalog
- Discussing how to exchange access decisions/restrictions for tables
- An extension was proposed for server-side commits

Java:
- Started planning for a 2.0.0 release to clean up deprecated APIs
- Added an encryption manager that supports Parquet native encryption
- Ongoing effort to add encryption for table metadata using AES GCM streams
- Added support for Flink 1.18
- Completed the view API and support in the REST and Nessie catalogs
- Added view read support in Spark
- Ongoing work to improve Spark delete file performance

PyIceberg:
- Write support is nearing completion

Rust:
- Working toward first release (documentation, additional tests)
- Readers and writers for manifests and manifest lists were committed

Documentation:
- The Iceberg site is moving back into the main repo to make contribution easier

## Community Health:

The project continues to be healthy, with no concerning changes to metrics.
Technical progress is strong and growing in the new language implementations.
The community also expects a proposal for a third-party organized Iceberg
conference.

20 Dec 2023 [Ryan Blue / Sander]

No report was submitted.

20 Sep 2023 [Ryan Blue / Shane]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: none

## Membership Data:
Apache Iceberg was founded 2020-05-19 (3 years ago)
There are currently 24 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is 3:2.

Community changes, past quarter:
- No new PMC members. Last addition was Szehon Ho on 2023-04-20.
- No new committers. Last addition was Amogh Jahagirdar on 2023-04-25.

## Project Activity:

Releases:
* PyIceberg 0.4.0 was released on 2023-07-23
* 1.3.1 was released on 2023-07-25

Java:
* Preparing for a 1.4.0 release in Sept/Oct
* Added dependency bundles for AWS, GCP, and Azure
* Added Azure FileIO implementation
* Added API for multi-table commits
* Performance optimizations for delete file scan planning
* Spark: Implemented adaptive split sizing
* Spark: Implemented function pushdown in v2 expressions
* Flink: Added bucketing only key-by strategy
* Build: Updated to Gradle version catalog
* Making progress on the reference implementation of common views
* Continuing work on table encryption

Python:
* 0.5.0 rc1 vote is under way
* Added support for serverless environments
* Implemented schema evolution
* Moved to Pydantic v2
* Added support for positional deletes
* Substantially improved Avro read performance
* Added conversion from Parquet to Iceberg schemas
* Added support for FSSpec and HDFS data
* Added SQL filter parsing

Rust:
* Created a repository for the Rust implementation, iceberg-rust
* 25 PRs merged
* Implemented base table metadata (e.g., types, transforms)
* Implemented visitors for working with nested structures
* Added Avro/Iceberg schema conversion
* Added build tooling

Go:
* Created a repository for the Go implementation, iceberg-go
* Added schema and types

## Community Health:

The largest development in the community is the addition of the Rust and Go
repositories, which is shown in the increase in code contributors this quarter.
The new implementations will also lead to new committers and PMC members. The
community has had good discussions about how to manage contributions, to build
confidence in the implementations as well as to help new contributors become
familiar with the way the Apache community operates (along with ASF
requirements like license documentation).

Two community metrics show decreases. Dev list traffic tends to vary because of
how the community uses the dev list -- that is, mostly for large design
discussions. The number of issues closed was also lower than normal and is not
expected to fluctuate. We will take a look and see what the difference is.

21 Jun 2023 [Ryan Blue / Willem]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Project Status:
Current project status: Ongoing
Issues for the board: none

## Membership Data:
Apache Iceberg was founded 2020-05-19 (3 years ago)
There are currently 24 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is 3:2.

Community changes, past quarter:
- Fokko Driesprong was added to the PMC on 2023-04-06
- Steven Wu was added to the PMC on 2023-04-06
- Szehon Ho was added to the PMC on 2023-04-20
- Yufei Gu was added to the PMC on 2023-04-06
- Amogh Jahagirdar was added as committer on 2023-04-25
- Eduard Tudenhoefner was added as committer on 2023-04-25

## Project Activity:

* 1.3.0 was released on 2023-05-26
* 1.2.1 was released on 2023-04-11
* 1.2.0 was released on 2023-03-20

The 1.3.0 release added support for Spark 3.4 and Flink 1.17. It also included
several updates and fixes, including:
* Better Spark file distribution for row-level plans like MERGE
* Improved bit density in the object storage layout
* Readable metrics in metadata tables
* Optimized vectorized reads for decimal types
* Spark timestamp_ntz and UUID support

The Python implementation is nearing a 0.4.0 release that will include:
* Delete file support
* Metadata updates for tables
* Improved compatibility

The community is also continuing to build a view specification, expand REST
catalog support, and add encryption to the table spec.

## Community Health:

The community continues to be healthy, with most metrics steady this quarter.

19 Apr 2023 [Ryan Blue / Sander]

## Description:

Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:

There are no issues requiring board attention.

## Membership Data:

Apache Iceberg was founded 2020-05-19 (3 years ago)
There are currently 22 committers and 15 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:2.

Community changes, past quarter:
- Fokko Driesprong was added to the PMC on 2023-04-06
- Steven Wu was added to the PMC on 2023-04-06
- Yufei Gu was added to the PMC on 2023-04-06
- No new committers. Last addition was Steven Wu on 2022-10-07.

## Project Activity:

Releases:
* 1.2.0 was released on 2023-03-20, followed by 1.2.1 on 2023-04-11
* Python 0.3.0 was released on 2023-02-09

The Python implementation has reached feature parity with the "legacy" codebase,
so the legacy code that was never part of an ASF release has been removed! The
Python implementation now supports full read planning, including parallel
metadata reads, manifest pruning, partition pruning, and column stats pruning.
Python frameworks that use Apache Arrow can use data from Iceberg tables,
including Arrow compute, Pandas, DuckDB, and Ray. Write support is the next
milestone for the Python implementation.

The Java implementation's latest release included several new capabilities:
* Branching and tagging, with support in Flink and Spark using VERSION AS OF
* Spark DDL for branches and tags
* Metadata query pushdown in Spark
* Changelog reads in Spark
* Throttling for streaming reads in Flink
* FileIO support for ORC readers and writers
* SigV4 support for REST catalog auth
* Remote signing client for S3
* The ability to read Snowflake Iceberg tables

There are also efforts to add encryption to the format and to support
multi-table transactions.

The community is also discussing a Rust or C++ implementation hosted by the ASF.

## Community Health:

The community remains healthy, with a reasonable increase in both opened and
closed pull requests, as well as a steady number of unique contributors. The
Python implementation has been bringing a lot of new contributors.

Iceberg was featured in 14 talks at Subsurface, as well as in a panel.

22 Mar 2023 [Ryan Blue / Sander]

No report was submitted.

@Sander: pursue a roll call for Iceberg

15 Feb 2023 [Ryan Blue / Rich]

No report was submitted.

18 Jan 2023 [Ryan Blue / Christofer]

No report was submitted.

21 Dec 2022 [Ryan Blue / Sharan]

No report was submitted.

19 Oct 2022 [Ryan Blue / Roy]

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 years ago)
There are currently 22 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 2:1.

Community changes, past quarter:
- No new PMC members. Last addition was Jack Ye on 2021-11-14.
- Fokko Driesprong was added as committer on 2022-08-21
- Steven Wu was added as committer on 2022-10-07
- Yufei Gu was added as committer on 2022-08-25

## Project Activity:
The community had 2 releases in the 0.14.x line and an initial Python release,
0.1.0. In addition, the vote for a 1.0.0 release is currently passing.

The Python release is the result of significant community effort and includes
a new CLI utility (pyiceberg), support for Hive and REST catalogs, and the
ability to read table metadata. The next goal is a 0.2.0 release that can handle
query planning to enable reads in Python and Python-based engines.

The 1.0.0 JVM release adds API guarantees to the API module, but is closely
based on 0.14.1 to make transitioning to a new major version simple.

Next, the community is preparing a 1.1.0 release with significant new updates:
* The ability to read and write table branches
* Scan metrics reporting
* Support for Spark FunctionCatalog
* FLIP-27 reader support in Flink SQL
* Z-order support when rewriting or compacting data files
* Support for Puffin stats in table metadata

## Community Health:
The community continues to be healthy in terms of commits. The number of
unique contributors decreased slightly, which indicates the community should
ensure pull requests from contributors are getting enough attention.

The increase of issues closed is due to setting up a stale issues bot to help
keep issues fresh and relevant. The community also added issue templates to
make bug reports and feature requests clearer.

This year, there were 4 presentations about Iceberg at ApacheCon:
* Accelerate Data Lakehouse deployment with Apache Iceberg in Cloudera Data
 Platform  (Attila Turoczy, Bill Zhang)
* Apache Iceberg's REST Catalog - A Gateway to Enriching Data Access via the
 Simplicity of an HTTP Service (Sam Redai)
* Iceberg's Best Secret: Exploring Metadata Tables (Szehon Ho)
* Integrated Audits: Streamlined Data Observability with Apache Iceberg
 (Sam Redai)

There were also 2 Iceberg presentations at Flink Forward:
* Batch Processing at Scale with Flink & Iceberg (Andreas Hailu)
* Tame the small files problem and optimize data layout for streaming ingestion
 to Iceberg (Steven Wu, Gang Ye)

21 Sep 2022 [Ryan Blue / Roman] ¶

No report was submitted.

20 Jul 2022 [Ryan Blue / Roman] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 years ago)
There are currently 19 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:3.

Community changes, past quarter:
- No new PMC members. Last addition was Jack Ye on 2021-11-14.
- No new committers. Last addition was Szehon Ho on 2022-03-07.

## Project Activity:
The community recently released 0.13.2 on 2022-06-13 and is currently voting on
a candidate for the 0.14.0 release. 0.14.0 will be followed closely by a 1.0.0,
which will make API stability guarantees.

The 0.14.0 release contains significant new features, including:
* Support for Apache Spark 3.3
* Support for Apache Flink 1.15
* MERGE and UPDATE plans using row-level deletes
* A FLIP-27 reader for Flink
* The new REST catalog implementation with change-based commits
* A new file format for index and stats data, Puffin
* Zorder sorting when rewriting data files
* Range and tail reads for IO
* Additional metrics collection

The community has also been working on new features, including:
* Table-level statistics and data sketches using Puffin
* Table branching and tagging
* View metadata tracking
* Default values in schemas
* A native Python implementation

The Python implementation has been making good progress and may see a release
next quarter.

The project has also been working to improve documentation and has a new site
design that tracks older versions.

## Community Health:
Community health continues to be good. The community's primary gauge of activity
is pull requests and commits, and there were 1088 PRs opened this quarter
(a 5% increase) and 780 commits (a slight decrease). There were approximately the
same number of contributors as the previous quarter, 79.

15 Jun 2022 [Ryan Blue / Sander] ¶

No report was submitted.

16 Mar 2022 [Ryan Blue / Bertrand] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 years ago)
There are currently 19 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:3.

Community changes, past quarter:
- No new PMC members. Last addition was Jack Ye on 2021-11-14.
- Szehon Ho was added as committer on 2022-03-07

## Project Activity:
Iceberg 0.13.0 was released on 2022-02-01, and was followed quickly by 0.13.1 on
2022-02-14 to fix a performance regression.

The 0.13 release included many significant new features:
* Spark 3.2 support and overhauled row-level plans
* Flink 1.13 and 1.14 support
* Spark and Flink modules built and tested against each engine version
* GCS and Aliyun OSS IO integration

The community has also been working on some major features:
* Delta-based MERGE INTO and UPDATE plans for Spark (complete)
* Scala 2.13 support for Spark 3.2 (complete)
* IO metrics collection (complete)
* Vectorized reads with delete files (complete)
* An implementation of table branching and tagging
* Addition of a REST catalog spec, like the Hive Thrift interface for Iceberg
* Addition of a view spec that tracks SQL or other plan representations
* Spec updates for secondary indexes and metrics
* Spec updates for default values

In addition to features, the community also overhauled the ASF site. The new
site better communicates Iceberg's major features and has version-specific docs.

There were 6 Iceberg talks at the Subsurface conference and the conference
organizers noted it was a major theme. On the last report, we were asked whether
the presentations are available on the Iceberg site. They are present under the
Talks tab.

## Community Health:
Community health continues to be good. There were some decreases this quarter in
metrics like dev list traffic and issues opened, but this isn't concerning given
the last few quarters of growth and the fact that this report covers December,
when many people are on holiday. In addition, the number of unique
contributors increased to 81 (20% higher).

15 Dec 2021 [Ryan Blue / Roman] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 years ago)
There are currently 18 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is 3:2.

Community changes, past quarter:
- Jack Ye was added to the PMC on 2021-11-14
- Russell Spitzer was added to the PMC on 2021-11-13
- No new committers. Last addition was Jack Ye on 2021-07-02.

## Project Activity:
0.12.1 was released on 2021-11-08. The community is also working on the next
release, 0.13.0.

* A spec for table branching and tagging was written and is nearing completion
* Iceberg's documentation is being updated so that multiple versions can be
 easily maintained and updated.
* Delete file compaction was added to the rewrite files action and stored
 procedure. Additional compaction options are planned.
* Sort based compaction was added
* Flink and Spark plugins have been refactored so that each version is
 independent and is compiled against the correct engine version. While this
 duplicates some code, it makes integrating new features easier and reduces
 the risk of runtime incompatibilities.
* Added support for Flink 1.14.x and Spark 3.2.x
* A REST catalog API spec is taking shape. This should standardize an interface
 for providing a table catalog, similar to the thrift metastore interface used
 by Hive.
* Aliyun OSS support was added as an IO module
* The community decided on goals for a 1.0 release, targeted for early next year
* Python implementation is making progress

## Community Health:
Community metrics show healthy growth. Notably, there were 66 unique
contributors this quarter, up from 50 last quarter. More than 750 PRs were
submitted, about 50% more than the 500 last quarter. Similarly, PRs closed also
increased to 682 from about 400 last quarter, a 64% increase.

The most significant stat is the increase in unique contributors, which signals
that more people are interested in the project.

This quarter, there were talks featuring Iceberg at AWS re:Invent (where
Athena announced support), Trino summit, and community events for PrestoDB,
lakeFS, and SF Big Analytics.

15 Sep 2021 [Ryan Blue / Sharan] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (a year ago)
There are currently 18 committers and 10 PMC members in this project.
The Committer-to-PMC ratio is 9:5.

Community changes, past quarter:
- Zheng Hu was added to the PMC on 2021-06-28
- Jack Ye was added as committer on 2021-07-02

## Project Activity:
0.12.0 was released on 2021-08-15 and is a significant update from 0.11.1.

The community voted to adopt version 2 of the Iceberg table format that adds
row-level updates and deletes.

The community is also working on several improvements:
* Preparing for 1.0 of the Java reference implementation
* Adding an Iceberg specification for SQL views
* Spark implementations of MERGE and UPDATE that use row-level deletes
* Flink UPSERT support
* Z-order specification
* Relative path support for disaster recovery
* Branching and tagging table snapshots
* Additional storage integration modules (Dell EMC, Aliyun OSS)
* Encryption

## Community Health:
The community is healthy and continues to grow. Unique contributors grew by
2% this quarter to 50. Contributions increased to more than 500 PRs and more
than 400 PRs were addressed.

The community is discussing how to scale and coordinate with a published roadmap
and GitHub projects to link the roadmap to individual issues.

The community is also forming a new group of contributors around Python. This
group has held sync meetings and is planning to make the Python API more
Pythonic and to get to an initial Python release.

16 Jun 2021 [Ryan Blue / Roman] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

Apologies that this report is late.
The community will report next month if needed.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (a year ago)
There are currently 17 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:5.

Community changes, past quarter:
- No new PMC members. Last addition was Anton Okolnychyi on 2020-05-19.
- Ted Gooch was added as committer on 2021-05-11
- Russell Spitzer was added as committer on 2021-04-02
- Ryan Murray was added as committer on 2021-03-26
- Yan Yan was added as committer on 2021-03-23

## Project Activity:

0.11.1 was released on 2021-04-03.

The community is currently working on the 0.12.0 release, which will
update support for Spark 3.1 to fix the Iceberg SQL extensions.

Several features were finished:
* Spark UPDATE support was committed
* Row identifier fields were added to schemas to support Flink UPSERT
* An action to import existing data files was added
* Hive integration has been updated to allow using multiple catalogs

In addition, there are several on-going projects:
* The community is working on updates for Spark 3.1
* Spark data file compaction strategies and a new implementation have been
 discussed and should be available in 0.12.0
* A design for encryption support has been proposed that will support
 Parquet and ORC encryption, as well as encryption for the metadata tree.
* There have been design discussions for adding secondary indexes that can be
 updated asynchronously to keep commit latency low.
* There have been design discussions for adding default field values
* Support for Spark 3.0 structured streaming with the DSv2 API is under review
* A DynamoDB catalog has been submitted as a PR


## Community Health:
The community is healthy and showed an increase in contributors in the past
quarter. New contributors are working on significant projects, like Spark
streaming support and default values.

The community also added 4 new committers in the past quarter!

17 Mar 2021 [Ryan Blue / Roy] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (9 months ago)
There are currently 13 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:5.

Community changes, past quarter:
- No new PMC members. Last addition was Anton Okolnychyi on 2020-05-19.
- Peter Vary was added as committer on 2021-01-23

## Project Activity:
0.11.0 was released on 2021-01-26 and included several important new features:
* Support for partition evolution
* Spark SQL extensions with support for MERGE INTO, DELETE FROM, and new DDL
* Spark support for table maintenance through stored procedures
* Streaming reads, filter pushdown, and experimental CDC writes in Flink
* AWS module with better integration for S3 and Glue metastore
* Nessie metastore module

The community is working toward finalizing the v2 format spec and the next
release. There is good progress on metrics collection for Avro data files,
Hive integration, Spark UPDATE support, and more table maintenance actions.

## Community Health:
The overall number of pull requests merged in the last quarter decreased from
the previous quarter, but this is mostly due to annual holidays. Leading up to
the release in January, the project set a new record at 78 PRs merged in a week.
More importantly, although there was a decrease in PRs merged, the number of
contributors still increased slightly, to 50 code contributors in the quarter.
Peter Vary was also added as a committer in January.

16 Dec 2020 [Ryan Blue / Sander] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (7 months ago)
There are currently 12 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is 4:3.

Community changes, past quarter:
- No new PMC members. Last addition was Anton Okolnychyi on 2020-05-19.
- Jingsong Lee was added as committer on 2020-10-09
- Zheng Hu was added as committer on 2020-10-09

## Project Activity:
Recent releases:
* 0.10.0 was released on 2020-11-11.

The 0.10.0 release included:
* A new Flink module supporting DataStreams and SQL writes and (batch) reads
* A new Hive module supporting reads
* Row-level delete implementation, part of the v2 spec, for engine integration

More recently, the community has added:
* Stored procedures for Spark that perform table maintenance from SQL
* New catalog implementations for Nessie and Glue
* Writers to support Flink CDC events and Spark MERGE plans
* Handling for NaN values in metadata, and NaN predicates

The project is making significant progress.

## Community Health:
Community activity continues to increase. Recent video sync calls have had 20+
participants, code contributions are increasing in frequency (588 PRs opened and
552 PRs closed), and there are many new community members joining in.

The community added two new committers this quarter.

16 Sep 2020 [Ryan Blue / Sam] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (4 months ago)
There are currently 10 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is roughly 1:1.

Community changes, past quarter:
- No new PMC members since graduation on 2020-05-19
- Shardul Mahadik was added as committer on 2020-07-25

## Project Activity:
Recent releases:
* 0.9.0 was released on 2020-07-13
* 0.9.1 was released on 2020-08-14

The community expects to release 0.10.0 soon with support for Hive reads,
Flink writes, and the utilities needed to implement row-level deletes in
external processing engines, like Presto.

Notable improvements this month include:
* Implemented end-to-end row-level deletes in the client library (direct reads)
* Committed Flink write support for both DataStreams and SQL
* Added Hive predicate pushdown and a runtime bundle
* Committed name mapping support for reading ORC files from non-Iceberg tables
* Added a new snapshot expiration action that runs in parallel using Spark
* Added metadata to configure tables with a preferred sort order

The community is actively working on Hive column pruning, Hive write support,
Flink read support, and row-level deletes in more processing engines.

## Community Health:
The number of unique contributors increased in the last month to 26, from the
previous high watermark of 21. Contributions are still healthy, with 74 commits
in the past month. New community members have been contributing documentation
and build improvements (PR labels, fixing warnings); it is great to have these
valuable contributions in addition to features and bug fixes.

19 Aug 2020 [Ryan Blue / Shane] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 months ago)
There are currently 10 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is roughly 1:1.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- Shardul Mahadik was added as committer on 2020-07-25

## Project Activity:
0.9.0 was released, including support for Spark 3 and SQL DDL commands, support
for JDK 11, vectorized Parquet reads, and an action to compact data files.

Since the 0.9.0 release, the community has made progress in several areas:
- The Hive StorageHandler now provides access to query Iceberg tables
 (work is ongoing to implement projection and predicate pushdown).
- Flink integration has made substantial progress toward using native RowData,
 and the first stage of the Flink sink (data file writers) has been committed.
- An action to expire snapshots using Spark was added and is an improvement on
 the incremental approach because it compares the reachable file sets.
- The implementation of row-level deletes is nearing completion. Scan planning
 now supports delete files, merge-based and set-based row filters have been
 committed, and delete file writers are under review. The delete file writers
 allow storing deleted row data in support of Flink CDC use cases.

Releases:
- 0.9.0 was released on 2020-07-13
- 0.9.1 has an ongoing vote

## Community Health:
The month since the last report has been one of the busiest since the project
started. 80 pull requests were merged in the last 4 weeks, and more importantly,
came from 21 different contributors. Both of these are new high watermarks.

Community members gave 2 Iceberg talks at Subsurface Conf, on enabling Hive
queries against Iceberg tables and working with petabyte-scale Iceberg tables.
Iceberg was also mentioned in the keynotes.

15 Jul 2020 [Ryan Blue / Craig] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (2 months ago)
There are currently 9 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- No new committers were added.

## Project Activity:
In July, the community held one sync meeting to discuss general topics, and
one specifically to discuss how to include both groups that have been working
on integration with Hive.

To address the question on the last board report, the community sync meetings
are video conferences that anyone in the community is welcome to attend. The
discussion is documented and summarized for anyone that can't attend. We have
found these to be a good way to exchange context and ideas more quickly, but
recognize that this isn't the best way for some people to participate and so
we don't consider these a forum for making decisions or voting. If we come to
a tentative conclusion on a topic, it is still open for further discussion
on the dev list. The idea for this comes from the Parquet community that has
been doing this for several years.

Development activity:
* Spark vectorized reads for flat schemas were merged and benchmarked
* The Spark 3 integration branch was merged into master
* Name mapping for Parquet files without IDs was committed
* An action to compact data files was added
* Support was added for managing and adding delete files in table metadata
* Refactoring to support reusing Spark components for Flink
* Several PRs for Flink support have been committed and more are open
* CI tests for JDK 11 have been added

The community also plans to release 0.9.0 with Spark 3 support soon.

## Community Health:
Most community metrics have again increased in the last month, although dev
list traffic is a bit lower. More importantly, the community has made further
progress on several large areas with different groups leading the efforts,
like Hive support, Spark 3 support, and Flink support.

17 Jun 2020 [Ryan Blue / Roy] ¶

## Description:
Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Iceberg was founded 2020-05-19 (21 days ago)
There are currently 9 committers and 9 PMC members in this project.
The Committer-to-PMC ratio is 1:1.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- No new committers were added.

## Project Activity:
There were two community syncs in May, with good discussions on adding secondary
indexes and fixing some persistent issues, like Guava library conflicts and how
to support multiple Spark versions.

Development activity:
- Row-level delete progress continues with several PRs merged
- Added support for ORC predicate push-down and metrics filtering, which is a
 significant step toward performance parity with Parquet
- The vectorized Parquet read path is passing end-to-end tests for flat data
- Guava is now shaded and relocated, unblocking integration with Hive
- The build changed dependency locking plugins to unblock Hive and Spark 3 work
- Flink contributors opened pull requests to merge the prototype sink

## Community Health:
Nearly all metrics (list traffic, pull requests, and issues opened) are showing
an increase in the last month, and the community has made significant progress
on several large extensions (ORC and Flink, notably).

20 May 2020 ¶

Establish the Apache Iceberg Project

 WHEREAS, the Board of Directors deems it to be in the best interests of
 the Foundation and consistent with the Foundation's purpose to establish
 a Project Management Committee charged with the creation and maintenance
 of open-source software, for distribution at no charge to the public,
 related to managing huge analytic datasets using a standard at-rest
 table format that is designed for high performance and ease of use.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
 (PMC), to be known as the "Apache Iceberg Project", be and hereby is
 established pursuant to Bylaws of the Foundation; and be it further

 RESOLVED, that the Apache Iceberg Project be and hereby is responsible
 for the creation and maintenance of software related to managing huge
 analytic datasets using a standard at-rest table format that is designed
 for high performance and ease of use; and be it further

 RESOLVED, that the office of "Vice President, Apache Iceberg" be and
 hereby is created, the person holding such office to serve at the
 direction of the Board of Directors as the chair of the Apache Iceberg
 Project, and to have primary responsibility for management of the
 projects within the scope of responsibility of the Apache Iceberg
 Project; and be it further

 RESOLVED, that the persons listed immediately below be and hereby are
 appointed to serve as the initial members of the Apache Iceberg Project:

  * Anton Okolnychyi <aokolnychyi@apache.org>
  * Carl Steinbach   <cws@apache.org>
  * Daniel C. Weeks  <dweeks@apache.org>
  * James R. Taylor  <jamestaylor@apache.org>
  * Julien Le Dem    <julien@apache.org>
  * Owen O'Malley    <omalley@apache.org>
  * Parth Brahmbhatt <parth@apache.org>
  * Ratandeep Ratti  <rdsr@apache.org>
  * Ryan Blue        <blue@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Ryan Blue be appointed to
 the office of Vice President, Apache Iceberg, to serve in accordance
 with and subject to the direction of the Board of Directors and the
 Bylaws of the Foundation until death, resignation, retirement, removal
 or disqualification, or until a successor is appointed; and be it
 further

 RESOLVED, that the Apache Iceberg Project be and hereby is tasked with
 the migration and rationalization of the Apache Incubator Iceberg
 podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache Incubator
 Iceberg podling encumbered upon the Apache Incubator PMC are hereafter
 discharged.

 Special Order 7G, Establish the Apache Iceberg Project, was
 approved by Unanimous Vote of the directors present.

15 Jan 2020 ¶

Iceberg is a table format for large, slow-moving tabular data.

Iceberg has been incubating since 2018-11-16.

### Three most important unfinished issues to address before graduating:

 1. Grow the Iceberg community
 2. Add more committers and PPMC members

### Are there any issues that the IPMC or ASF Board need to be aware of?

 No issues.

### How has the community developed since the last report?

 In the 4 months since the last report, 138 pull requests were merged for an
 average of 34.5 per month. While this is down from the previous monthly
 average of 49.6 per month for June through August, this contribution rate
 is still very active and healthy. Contributions are coming from a regular
 group of contributors outside of the initial set of committers, which is a
 positive indication for adding new committers and PPMC members over the
 next few months.

 The community released the first version of Apache Iceberg,
 0.7.0-incubating. This release used the "standard" incubator disclaimer and
 included convenience binaries. The release candidate votes were very active
 with community members testing out the release and reporting problems.

 There was an Apache Iceberg talk at ApacheCon NA in September.

### How has the project developed since the last report?

 - The community is building support for the upcoming Spark 3.0 release
 - The first PR from the vectorization branch has been merged into master
 - Support for IN and NOT IN predicates was contributed
 - Python added support for Hive metastore tables and the read path is
 near commit
 - Flaky tests have been fixed
 - Baseline checks (style, errorprone, findbugs) are now applied to all
 modules

### How would you assess the podling's maturity?
Please feel free to add your own commentary.

 - [ ] Initial setup
 - [ ] Working towards first release
 - [x] Community building
 - [x] Nearing graduation
 - [ ] Other:

### Date of last release:

 - 0.7.0-incubating was released 25 October 2019

### When were the last committers or PPMC members elected?

 - Anton Okolnychyi was added 30 August 2019

### Have your mentors been helpful and responsive?

 Yes. 4 of 5 mentors voted on the 0.7.0-incubating IPMC vote. Thanks to our
 mentors for being active!

### Is the PPMC managing the podling's brand / trademarks?

 Yes, the podling is managing the brand and is not aware of any issues.
 The project name has been approved.

### Signed-off-by:

 - [x] (iceberg) Ryan Blue
    Comments:
 - [ ] (iceberg) Julien Le Dem
    Comments:
 - [X] (iceberg) Owen O'Malley
    Comments:
 - [ ] (iceberg) James Taylor
    Comments:
 - [ ] (iceberg) Carl Steinbach
    Comments:

### IPMC/Shepherd notes:

18 Sep 2019 ¶

Iceberg is a table format for large, slow-moving tabular data.

Iceberg has been incubating since 2018-11-16.

### Three most important unfinished issues to address before graduating:

 1. Make the first Apache release.
 (https://github.com/apache/incubator-iceberg/milestone/1)
 2. Grow the Iceberg community
 3. Add more committers and PPMC members

### Are there any issues that the IPMC or ASF Board need to be aware of?

 No issues.

### How has the community developed since the last report?

 The community continues to grow steadily. In the last month:
 * 59 pull requests have been merged
 * 17 people contributed the merged PRs
 * 18 issues have been closed, 22 issues were opened

 For comparison, the last report had 74 pull requests merged over 3 months.

### How has the project developed since the last report?

 * License documentation has been completed for the Java project, unblocking
 the first release
 * Added more documentation to iceberg.apache.org
 * Started vectorized read branch with significantly better performance
 * Added metadata tables
 * Added configuration to control statistics and truncate long values
 * Improved Hive Metastore integration
 * A working python read path has been submitted in PRs

### How would you assess the podling's maturity?

 - [ ] Initial setup
 - [x] Working towards first release
 - [x] Community building
 - [x] Nearing graduation
 - [ ] Other:

### Date of last release:

 * No release yet

### When were the last committers or PPMC members elected?

 * Anton Okolnychyi was added 30 August 2019

### Have your mentors been helpful and responsive?

 Yes

### Signed-off-by:

 - [x] (iceberg) Ryan Blue
    Comments:
 - [ ] (iceberg) Julien Le Dem
    Comments:
 - [X] (iceberg) Owen O'Malley
    Comments:
      The project also gave two presentations:
        * Berlin Buzzwords (June 2019)
        * ApacheCon NA (Sep 2019)
      Iceberg is being used in production at Netflix on huge tables, up to
      25 petabytes.

 - [X] (iceberg) James Taylor
    Comments:
 - [X] (iceberg) Carl Steinbach
    Comments:
      Approval added by Ryan Blue, Carl had trouble editing the new report
      location

### IPMC/Shepherd notes:
 Justin Mclean: The included stats don't really mean much to anyone
 outside of your project, please drop them from future reports.
 The community growth section might as well be blank.
 I find it surprising that this project thinks that it is near graduation.
 Please discuss this with your mentors.

17 Jul 2019 ¶

Iceberg is a table format for large, slow-moving tabular data.
Iceberg has been incubating since 2018-11-16.

### Three most important unfinished issues to address before graduating:

 1. Update build for Apache release, add LICENSE/NOTICE to Jars.
 2. Make the first Apache release.
 (https://github.com/apache/incubator-iceberg/milestone/1)
 3. Grow the Iceberg community

### Are there any issues that the IPMC or ASF Board need to be aware of?

 * No issues that require attention.

### How has the community developed since the last report?

 * Community growth has continued with several new contributors and
 reviewers
 * Community has decided on style and added checking to CI for most modules
 * Community has started work on extending the spec for new use cases

### How has the project developed since the last report?

 * Much more content on iceberg.apache.org has been added
 * 74 pull requests have been merged, many reviewed by new community
 members
 * Work has begun to add row-level deletes and upserts to the format
 * Added support for Spark streaming, a catalog API, and numerous bug fixes
 * Contributors are reviewing code, submitting substantial features, and
 improving dev practices

### How would you assess the podling's maturity?
 Please feel free to add your own commentary.

 - [ ] Initial setup (name clearance approval pending)
 - [X] Working towards first release
 - [X] Community building
 - [ ] Nearing graduation
 - [ ] Other:

### Date of last release:

 None yet

### When were the last committers or PPMC members elected?

 None yet

### Have your mentors been helpful and responsive?

 Yes.

### Signed-off-by:

 - [X](iceberg) Ryan Blue
    Comments: I wrote the first pass of the report.
 - [ ](iceberg) Julien Le Dem
    Comments:
 - [X](iceberg) Owen O'Malley
    Comments: +1 from discussion on dev list
 - [ ](iceberg) James Taylor
    Comments:
 - [ ](iceberg) Carl Steinbach
    Comments:

### IPMC/Shepherd notes:

20 Mar 2019 ¶

Iceberg is a table format for large, slow-moving tabular data.

Iceberg has been incubating since 2018-11-16.

Three most important issues to address in the move towards graduation:

 1. Update build for Apache release, add LICENSE/NOTICE to Jars.
 2. Make the first Apache release.
 (https://github.com/apache/incubator-iceberg/milestone/1)
 3. Grow the Iceberg community

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 * No issues that require attention.

How has the community developed since the last report?

 * The community has continued to receive new contributors
 * Several contributors are reliably helping review pull requests. Because
   of these review contributions and the small number of committers, the
   community voted to relax the RTC requirements and allow committers to
   push their own changes if the community has reviewed the PR. This helps
   develop reviewers and gets changes in faster. The vote also set reasonable
   limits for this practice: PRs must be up for at least 2 days and this is only
   for the first year, while we are working with a small set of committers.

How has the project developed since the last report?

 * Podling name search concluded that Iceberg is a suitable name.
   (See PODLINGNAMESEARCH-163)
 * The community voted to accept a large PR with a Python implementation.
 * Contributors are fixing important predicate push-down issues, including
   case sensitivity, filtering on nested types, missing file metrics, etc.
 * Contributors added support for plugging in file stream encryption.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [ ] Initial setup (name clearance approval pending)
 [X] Working towards first release
 [X] Community building
 [ ] Nearing graduation
 [ ] Other:

Date of last release:

 None yet

When were the last committers or PPMC members elected?

 None yet

Have your mentors been helpful and responsive or are things falling
through the cracks? In the latter case, please list any open issues
that need to be addressed.

 Yes.

Signed-off-by:

 [X](iceberg) Ryan Blue
    Comments: I wrote the first pass of the report.
 [ ](iceberg) Julien Le Dem
    Comments:
 [X](iceberg) Owen O'Malley
    Comments: (Approval copied from +1 on dev list)
 [ ](iceberg) James Taylor
    Comments:
 [ ](iceberg) Carl Steinbach
    Comments:

20 Feb 2019 ¶

Iceberg is a table format for large, slow-moving tabular data.

Iceberg has been incubating since 2018-11-16.

Three most important issues to address in the move towards graduation:

 1. Update build for Apache release, add LICENSE/NOTICE to Jars.
 2. Make the first Apache release.
 (https://github.com/apache/incubator-iceberg/milestone/1)
 3. Grow the Iceberg community

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 * No issues that require attention.

How has the community developed since the last report?

 * Pull requests from 6 contributors were merged, 7 new contributors

How has the project developed since the last report?

 * Submitted evidence for podling name search: PODLINGNAMESEARCH-163
 * Netflix submitted a revised trademark agreement for counter-signing
 * Abstracted data file locations for community use cases
 * Reviewing proposed API update for file stream encryption plugins
 * New contributor highlights:
   - A new contributor is fixing case sensitivity in expressions
   - A new contributor opened a PR to add a startsWith predicate
   - A new contributor reviewed 4 pull requests and opened another

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [X] Initial setup (name clearance approval pending)
 [X] Working towards first release
 [X] Community building
 [ ] Nearing graduation
 [ ] Other:

Date of last release:

 None yet

When were the last committers or PPMC members elected?

 None yet

Have your mentors been helpful and responsive or are things falling
through the cracks? In the latter case, please list any open issues
that need to be addressed.

 Yes.

Signed-off-by:

 [X](iceberg) Ryan Blue
    Comments: dev list traffic appears to be increasing also
 [ ](iceberg) Julien Le Dem
    Comments:
 [ ](iceberg) Owen O'Malley
    Comments:
 [X](iceberg) James Taylor
    Comments:
 [X](iceberg) Carl Steinbach
    Comments: From dev list: "Looks good to me. +1"

IPMC/Shepherd notes:

16 Jan 2019 ¶

Iceberg is a table format for large, slow-moving tabular data.

Iceberg has been incubating since 2018-11-16.

Three most important issues to address in the move towards graduation:

 1. Finish the name clearance and trademark agreement.
 2. Make the first Apache release.
 (https://github.com/apache/incubator-iceberg/milestone/1)
 3. Grow the Iceberg community

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 * Gitbox traffic is now going to issues@. The community was losing dev@
   subscribers because of the high volume of traffic from Gitbox. However,
   now all updates are sent to issues@. It would be nice to have emails
   from creation go to dev@, while updates and resolutions would go to
   issues@.
 * The trademark agreement proposed by Netflix was not acceptable to the
   ASF. It would be helpful if the ASF published the terms that the ASF
   requires to avoid trial and error. Netflix is drafting a new agreement.

How has the community developed since the last report?

 * Moved Gitbox notifications to avoid loss of dev@ subscribers
 (self-reported leaving dev@).
 * New contributor activity: 3 new issues opened, 4 PRs submitted
 * 5 PRs from non-committers merged
 * 2 contributors started reviewing PRs
 * New design doc proposed by a community contributor
 * Moved issues from Netflix repository to Apache repository

How has the project developed since the last report?

 * Planned blockers for first release, 0.1.0, in milestone 1
 * Partial Python implementation submitted
 * Manifest listing file added to the spec and implementation committed
 (blocker for initial release). Resulted in a significant improvement in
 query planning time for large tables.
 * Abstracted file IO API to support community use cases
 * Reviewing community proposal for external plugins to support file-level
 encryption
 * Added doc strings to schemas

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [X] Initial setup (name clearance pending)
 [X] Working towards first release
 [ ] Community building
 [ ] Nearing graduation
 [ ] Other:

Date of last release:

 None yet

When were the last committers or PPMC members elected?

 None yet

Have your mentors been helpful and responsive or are things falling
through the cracks? In the latter case, please list any open issues
that need to be addressed.

 Last month was December, so traffic has been low and both PPMC members and
 mentors were slow to respond. This is not abnormal, but the PPMC missed
 the deadline to file this report. We will ensure this doesn't recur.

Signed-off-by:

 [X](iceberg) Ryan Blue
    Comments: I wrote the first pass of the report, but after the deadline.
 [ ](iceberg) Julien Le Dem
    Comments:
 [X](iceberg) Owen O'Malley
    Comments: Approval from +1 on dev list.
 [ ](iceberg) James Taylor
    Comments:
 [ ](iceberg) Carl Steinbach
    Comments:

19 Dec 2018 ¶

Iceberg is a table format for large, slow-moving tabular data.

Iceberg has been incubating since 2018-11-16.

Three most important issues to address in the move towards graduation:

 1. Get the SGA accepted.
 2. Finish the name clearance.
 3. Make the first Apache release.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 * Gitbox integration has helped a lot, although it is frustrating that
   the team members are not allowed to configure the project and must go
   through infra for every change.
 * The traffic on the dev list from GitHub pull requests and issues is
   pretty heavy. It would be nice to have emails from creation go to dev@,
   while updates and resolutions would go to issues@.

How has the community developed since the last report?

 This is the first report.

How has the project developed since the last report?

 This is the first report.
 Both the software grant and trademark agreements have been submitted.
 Code has been imported and updated to use the ASF license header. LICENSE
 and NOTICE files have been updated to comply with ASF policy.
 Podling website is up at https://iceberg.apache.org.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [X] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [ ] Nearing graduation
 [ ] Other:

Date of last release:

 None yet

When were the last committers or PPMC members elected?

 None yet

Have your mentors been helpful and responsive or are things falling
through the cracks? In the latter case, please list any open issues
that need to be addressed.

 We're working through the issues as they come up.

Signed-off-by:

 [X](iceberg) Ryan Blue
    Comments:
 [ ](iceberg) Julien Le Dem
    Comments:
 [X](iceberg) Owen O'Malley
    Comments: I wrote the first pass of the report.
 [X](iceberg) James Taylor
    Comments:
 [X](iceberg) Carl Steinbach
    Comments:

IPMC/Shepherd notes: