The Apache Software Foundation

This was extracted (@ 2024-11-20 22:10) from a list of minutes which have been approved by the Board.
Please note: the Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

WARNING: these pages may omit some original contents of the minutes.
This is due to changes in the layout of the source minutes over the years. Fixes are being worked on.

Meeting times vary; the exact schedule is available to ASF Members and Officers. Search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).

Parquet

16 Oct 2024 [Julien Le Dem / Sander]

## Description:
A column-oriented data file format designed for efficient data storage and
retrieval. It provides high performance compression and encoding schemes to
handle complex data in bulk and is supported in many programming languages and
analytics tools.

## Project Status:
Current project status: Ongoing
 - Improve Parquet footer metadata using FlatBuffers
 - Define a Variant type based on the work from the Spark project
 - Encryption feature improvements
 - New geometry logical type
Issues for the board: none

## Membership Data:
Apache Parquet was founded 2015-04-21 (9 years ago)
There are currently 39 committers and 30 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:5.

Community changes, past quarter:
- Antoine Pitrou was added to the PMC on 2024-07-17
- Micah Kornfield was added to the PMC on 2024-07-17
- No new committers. Last addition was Xuwei Fu on 2024-07-11.

## Project Activity:
There are several ongoing projects in the community. Three are highlighted
below.

Improved metadata footer:
 - the mechanism for replacing the existing footer in a backwards compatible
   way with a transition period is approved
 - the new footer in flatbuffers format is under POC. Parquet users are
   encouraged to donate anonymized footers to test performance on real-life
   metadata.

New Variant type:
 - The Parquet community has agreed in principle to adopt the binary format
   defined in the Spark community
 - The spec is being iterated on as part of the parquet-format repo
 - implementations will be contributed as well
 - a columnar shredding algorithm is also under discussion

New geometry type:
- Two PoC implementations (Java and C++) are finished
- The spec proposal has reached consensus from other communities (GeoParquet,
  Sedona, Iceberg)
- The Parquet community is finalizing the spec, which will hopefully be
  released together with Iceberg V3.

## Community Health:
Healthy community. Regular discussions are held:
- on the mailing list (traffic is back to normal after a surge of discussions
  around the start of the V3 effort)
- in a recurring bi-weekly online meeting open to all; notes are posted on the
  mailing list.

17 Jul 2024 [Julien Le Dem / Sander]

## Description:
A column-oriented data file format designed for efficient data storage and
retrieval. It provides high performance compression and encoding schemes to
handle complex data in bulk and is supported in many programming languages and
analytics tools.

## Project Status:
Current project status: Parquet is an ongoing, fairly mature project. As a file
format, new features are added relatively slowly as backward compatibility is
required. There is an increase of activity towards making changes to
improve the format under the "Parquet V3" label (see project activity below).
Issues for the board: none

## Membership Data:
Apache Parquet was founded 2015-04-21 (9 years ago)
There are currently 38 committers and 28 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- Gang Wu was added to the PMC on 2024-05-10
- No new committers. Last addition was Gang Wu on 2023-02-28.
- Julien Le Dem is now the PMC chair. Thank you Xinli for your service!

## Project Activity:
- Discussions on adding Parquet extension support (Parquet extensions:
  https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit).
  The end goal is to allow fast iteration for new features and
  accelerate innovation.

- Adding support for geo data types in Parquet. This is a feature that
  is progressing in the wider Open Source data ecosystem
  (including in Iceberg for example).

- There are discussions to clarify the process for adopting new features for
  parquet-format and release for Parquet Java
  https://lists.apache.org/thread/nq7n6pbp222txrfo232ybgpvlvpmykbp

- "Parquet V3":
   parquet-format 2.10.0 was released on 2023-11-20
   There are a few discussions under the "Parquet V3" label. I
   put this in quotes as the goal is not to make a major incompatible release
   but instead to add functionality or change the format in a backwards
   compatible way in a few areas:
  - Improve footer metadata format to improve wide schemas access: Wide
    schemas are schemas with many columns (1000s, 10,000s or more). Currently,
    the footer is one thrift data structure. This means that when reading a
    few columns of a very wide file, one must scan all the columns' metadata
    to read the few interesting columns. When the metadata is large, this is
    significant overhead. Current discussion includes splitting the thrift
    metadata or using flatbuffers (like the Arrow project). In particular this
    requires a mechanism to add a new footer in a way that doesn't break old
    readers in the transition period.
  - New encodings: In particular, encodings that better compress time series
    or strings. Consensus is to add a few encodings that will solve this well
    on average. A few research papers on this topic have been mentioned.
  - Cross validation: As the ecosystem has grown quite a bit since the initial
    release of Parquet, there are discussions to introduce a new cross
    compatibility testing framework to ensure various integrations in open
    source or proprietary projects are compatible and respect the same
    semantics. See https://github.com/apache/parquet-format/issues/441

- Parquet-MR has been renamed to Parquet-Java to better reflect what’s in
  the repository. Parquet-Java has done two releases: 1.14.0 in May 2024,
  and 1.14.1 in June 2024.

- Parquet C++ implementation location: A while back the Parquet C++ was moved
  to the Arrow repo to ease dependency management between the 2 code bases.
  The C++ language in particular makes cross repo dependencies difficult. This
  has raised questions on whether the Parquet C++ code base should move back
  to its own repo to clarify governance. The current consensus (across the
  Parquet and Arrow PMCs) is to keep it as is because of technical
  difficulties in moving it without making C++ development across the two
  repos painful.

- Issue migration to GitHub: as issue tracking was being migrated for the
  parquet-cpp codebase, moving other issues to GitHub added relatively little
  overhead. We migrated 2485 past and current issues from Parquet Jira to
  GitHub issue trackers. We strived to keep contents and metadata as close to
  the originals as possible to minimize disruption to work of contributors and
  keep the historical record of work. Comments, issue crosslinks, attachments,
  versions, priorities and labels were preserved wherever possible. Authorship
  is indicated with Jira and GitHub (where known) usernames. All issues for
  Apache Parquet are now tracked in GitHub issue trackers of parquet-java,
  parquet-format, parquet-testing, parquet-site and arrow (for parquet-cpp).

- There is some effort to document the client feature compatibility matrix
  across the ecosystem that is currently under discussion:
  https://github.com/apache/parquet-site/pull/34
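
To illustrate the footer overhead described under "Parquet V3" above, here is a
minimal sketch in Python of how a reader locates the footer. It relies only on
the published file layout: a file ends with the serialized footer metadata, a
4-byte little-endian footer length, and the magic bytes "PAR1". The function
name and the synthetic file are assumptions made for illustration; the
placeholder footer bytes are not real Thrift, and no Parquet library is used.

```python
import io
import struct

MAGIC = b"PAR1"  # magic bytes at the start and end of an unencrypted Parquet file


def read_footer_bytes(f):
    """Return the raw footer bytes of a Parquet-style file object.

    Illustrative helper (not part of any Parquet library). It shows that the
    footer is fetched as one blob, however few columns the caller needs.
    """
    f.seek(0, io.SEEK_END)
    size = f.tell()
    # Smallest possible file: leading magic + footer length + trailing magic.
    if size < 12:
        raise ValueError("file too small to be a Parquet file")
    f.seek(size - 8)
    tail = f.read(8)
    (footer_len,) = struct.unpack("<I", tail[:4])
    if tail[4:] != MAGIC:
        raise ValueError("missing trailing magic bytes")
    if footer_len > size - 12:
        raise ValueError("footer length exceeds file size")
    f.seek(size - 8 - footer_len)
    return f.read(footer_len)


# Synthetic stand-in for a file: the footer is placeholder bytes, not Thrift.
fake_footer = b"x" * 1000
fake_file = (
    MAGIC + b"column-data" + fake_footer
    + struct.pack("<I", len(fake_footer)) + MAGIC
)
footer = read_footer_bytes(io.BytesIO(fake_file))
print(len(footer))
```

With a single Thrift structure, the whole blob returned here has to be decoded
before any one column's metadata can be used, which is the cost the footer
discussion aims to reduce for wide schemas.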

## Community Health:
There is a surge in email traffic linked to the "Parquet V3" discussion
summarized above (~+300% on the dev list). This should sustain over the next
few quarters as we make progress towards a V3.

19 Jun 2024

Change the Apache Parquet Project Chair

 WHEREAS, the Board of Directors heretofore appointed Xinli Shang
 (shangxinli) to the office of Vice President, Apache Parquet, and

 WHEREAS, the Board of Directors is in receipt of the resignation of
 Xinli Shang from the office of Vice President, Apache Parquet, and

 WHEREAS, the Project Management Committee of the Apache Parquet project
 has chosen by vote to recommend Julien Le Dem (julien) as the successor
 to the post;

 NOW, THEREFORE, BE IT RESOLVED, that Xinli Shang is relieved and
 discharged from the duties and responsibilities of the office of Vice
 President, Apache Parquet, and

 BE IT FURTHER RESOLVED, that Julien Le Dem be and hereby is appointed
 to the office of Vice President, Apache Parquet, to serve in accordance
 with and subject to the direction of the Board of Directors and the
 Bylaws of the Foundation until death, resignation, retirement, removal
 or disqualification, or until a successor is appointed.

 Special Order 7F, Change the Apache Parquet Project Chair, was
 approved by Unanimous Vote of the directors present.

17 Apr 2024 [Xinli Shang / Shane]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Project Status:
Current project status: Ongoing
Issues for the board: No

## Membership Data:
Apache Parquet was founded 2015-04-21 (9 years ago)
There are currently 38 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gang Wu on 2023-02-28.

## Project Activity:
Recent releases:
Format 2.10.0 was released on 2023-11-20.
1.13.1 was released on 2023-05-18.
MR-1.11.2 was released on 2021-10-06.

## Community Health:
dev@ had an 87% decrease in traffic (190 emails compared to 1436)
issues@ had a 352% increase in traffic (661 emails compared to 146)
A low number of issues and PRs was seen in the past quarter.
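
As a sanity check on the traffic figures above, the following sketch recomputes
the quarter-over-quarter change from the raw email counts. It assumes the usual
relative-change formula, (current - previous) / previous; how the reporting
tool rounds is not stated in the minutes, so the function name and the
one-decimal formatting are illustrative choices only.

```python
def percent_change(previous, current):
    """Relative change from previous to current, in percent."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100.0


# Email counts taken from the report above.
dev_change = percent_change(1436, 190)
issues_change = percent_change(146, 661)

print(f"dev@:    {dev_change:+.1f}%")
print(f"issues@: {issues_change:+.1f}%")
```

The results agree with the reported whole-number percentages to within one
point.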

17 Jan 2024 [Xinli Shang / Sander]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop
ecosystem

## Project Status:
Current project status: Ongoing
Issues for the board: n/a

## Membership Data:
Apache Parquet was founded 2015-04-21 (9 years ago)
There are currently 38 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gang Wu on 2023-02-28.

## Project Activity:
Format 2.10.0 was released on 2023-11-20.
1.13.1 was released on 2023-05-18.
MR-1.11.2 was released on 2021-10-06.

## Community Health:
dev@parquet.apache.org had a 88% increase in traffic in the past quarter
issues@parquet.apache.org had a big increase in traffic in the past quarter
47 issues opened in JIRA, past quarter (17% increase)
62 issues closed in JIRA, past quarter (195% increase)
100 commits in the past quarter (100% increase)
22 code contributors in the past quarter (100% increase)
77 PRs opened on GitHub, past quarter (37% increase)
86 PRs closed on GitHub, past quarter (79% increase)

18 Oct 2023 [Xinli Shang / Sharan]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Project Status:
Current project status: Ongoing
Issues for the board: No issues

## Membership Data:
Apache Parquet was founded 2015-04-21 (8 years ago)
There are currently 38 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gang Wu on 2023-02-28.

## Project Activity:
Recent releases:
MR-1.13.1 was released on 2023-05-18.
MR-1.11.2 was released on 2021-10-06.
MR-1.12.2 was released on 2021-10-06.

## Community Health:
dev@parquet.apache.org had 842 emails in the past quarter (-42% change)
39 issues opened in JIRA, past quarter (-30% change)
23 issues closed in JIRA, past quarter (-41% change)
48 commits in the past quarter (-65% change)
12 code contributors in the past quarter (-55% change)
49 PRs opened on GitHub, past quarter (-46% change)
46 PRs closed on GitHub, past quarter (-47% change)

19 Jul 2023 [Xinli Shang / Justin]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Project Status:
Current project status: green
Issues for the board: no issues

## Membership Data:
Apache Parquet was founded 2015-04-21 (8 years ago)
There are currently 38 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gang Wu on 2023-02-28.

## Project Activity:
Recent releases:
1.13.1 was released on 2023-05-18.
MR-1.11.2 was released on 2021-10-06.
MR-1.12.2 was released on 2021-10-06.

## Community Health:
dev@parquet.apache.org had a 28% decrease in traffic in the past quarter
43 issues opened in JIRA, past quarter (-15% change)
25 issues closed in JIRA, past quarter (-58% change)
109 commits in the past quarter (37% increase)
21 code contributors in the past quarter (-22% change)
65 PRs opened on GitHub, past quarter (-4% change)
68 PRs closed on GitHub, past quarter (17% increase)

19 Apr 2023 [Xinli Shang / Shane]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Issues:
There were no issues found in the community.

## Membership Data:
Apache Parquet was founded 2015-04-21 (8 years ago)
There are currently 38 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- Gang Wu was added as committer on 2023-02-28

## Project Activity:
MR-1.13.0 was released on 2023-04-06.
MR-1.12.3 was released on 2022-05-26.
MR-1.12.2 was released on 2021-10-06.
MR-1.11.2 was released on 2021-10-06.

## Community Health:
40 issues opened in JIRA, past quarter (81% increase)
44 issues closed in JIRA, past quarter (238% increase)
48 commits in the past quarter (242% increase)
22 code contributors in the past quarter (100% increase)
46 PRs opened on GitHub, past quarter (100% increase)
40 PRs closed on GitHub, past quarter (100% increase)
dev@parquet.apache.org had a 151% increase in traffic in the past quarter

18 Jan 2023 [Xinli Shang / Willem]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Issues:
No issues found

## Membership Data:
Apache Parquet was founded 2015-04-21 (8 years ago)
There are currently 37 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gidon Gershinsky on 2021-04-05.

## Project Activity:
Recent releases:
MR-1.11.2 was released on 2021-10-06.
MR-1.12.2 was released on 2021-10-06.
MR-1.12.0 was released on 2021-03-25.

## Community Health:
dev@parquet.apache.org had a 65% decrease in traffic in the past quarter
28 issues opened in JIRA, past quarter (-22% change)
14 issues closed in JIRA, past quarter (40% increase)
16 commits in the past quarter (-27% change)
12 code contributors in the past quarter (-33% change)
25 PRs opened on GitHub, past quarter (-24% change)
22 PRs closed on GitHub, past quarter (-8% change)

19 Oct 2022 [Xinli Shang / Sharan]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Issues:
No issues found

## Membership Data:
Apache Parquet was founded 2015-04-21 (7 years ago)
There are currently 37 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gidon Gershinsky on 2021-04-05.

## Project Activity:
Recent releases:
MR-1.11.2 was released on 2021-10-06.
MR-1.12.2 was released on 2021-10-06.
MR-1.12.0 was released on 2021-03-25.

## Community Health:
dev@parquet.apache.org had a 65% decrease in traffic
37 issues opened in JIRA, past quarter (32% increase)
9 issues closed in JIRA, past quarter (no change)
23 commits in the past quarter (-41% change)
17 code contributors in the past quarter (30% increase)
34 PRs opened on GitHub, past quarter (21% increase)
22 PRs closed on GitHub, past quarter (29% increase)

20 Jul 2022 [Xinli Shang / Rich]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Issues:
No issues were found.

## Membership Data:
Apache Parquet was founded 2015-04-21 (7 years ago)
There are currently 37 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gidon Gershinsky on 2021-04-05.

## Project Activity:
MR-1.12.3 was released on 2022-05-26.
MR-1.11.2 was released on 2021-10-06.
MR-1.12.2 was released on 2021-10-06.
MR-1.12.0 was released on 2021-03-25.

## Community Health:
dev@ had a 65% decrease in the past quarter (270 emails compared to 751)
27 issues opened in JIRA, past quarter (no change)
8 issues closed in JIRA, past quarter (-52% change)
38 commits in the past quarter (18% increase)
12 code contributors in the past quarter (20% increase)
27 PRs opened on GitHub, past quarter (-20% change)
17 PRs closed on GitHub, past quarter (-43% change)

20 Apr 2022 [Xinli Shang / Christofer]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Issues:
No issues found

## Membership Data:
Apache Parquet was founded 2015-04-21 (7 years ago)
There are currently 37 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Gidon Gershinsky on 2021-11-23.
- No new committers. Last addition was Gidon Gershinsky on 2021-04-05.

## Project Activity:
Recent releases:
MR-1.11.2 was released on 2021-10-06
MR-1.12.2 was released on 2021-10-06
MR-1.12.0 was released on 2021-03-25
New website parquet.apache.org was launched in March 2022

## Community Health:
25 issues opened in JIRA, past quarter (150% increase)
18 issues closed in JIRA, past quarter (63% increase)
30 commits in the past quarter (172% increase)
10 code contributors in the past quarter (42% increase)
32 PRs opened on GitHub, past quarter (190% increase)
29 PRs closed on GitHub, past quarter (163% increase)
dev@parquet.apache.org had a 65% decrease in traffic in the past quarter

19 Jan 2022 [Xinli Shang / Sheng]

## Description:
The mission of Parquet is the creation and maintenance of software related to
columnar storage format available to any project in the Apache Hadoop ecosystem

## Issues:
No issues found

## Membership Data:
Apache Parquet was founded 2015-04-21 (7 years ago)
There are currently 37 committers and 27 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- Gidon Gershinsky was added to the PMC on 2021-11-23
- No new committers. Last addition was Gidon Gershinsky on 2021-04-05.

## Project Activity:
Recent releases:
MR-1.11.2 was released on 2021-10-06.
MR-1.12.2 was released on 2021-10-06.

## Community Health:
dev@parquet.apache.org had a 65% decrease in traffic in the past quarter
9 issues opened in JIRA, past quarter (-75% change)
11 issues closed in JIRA, past quarter (-45% change)
7 commits in the past quarter (-85% change)
7 code contributors in the past quarter (-53% change)
11 PRs opened on GitHub, past quarter (-47% change)
10 PRs closed on GitHub, past quarter (-54% change)

20 Oct 2021 [Xinli Shang / Sander]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Parquet was founded 2015-04-21 (6 years ago)
There are currently 37 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Xinli Shang on 2020-11-09.
- No new committers. Last addition was Gidon Gershinsky on 2021-04-05.

## Project Activity:
- Released parquet-mr 1.12.1 on 2021-09-13.
- Adding high throughput column encryption rewriter.
- Support native 'in' predicate in FilterAPI
- Bug fixes

## Community Health:
Commit activity dropped over the summer (-26%), and overall activity is
lower than last quarter. We will see whether it recovers next quarter.

21 Jul 2021 [Xinli Shang / Sheng]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Parquet was founded 2015-04-21 (6 years ago)
There are currently 37 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Xinli Shang on 2020-11-09.
- Gidon Gershinsky was added as committer on 2021-04-05

## Project Activity:
- define core features / compliance levels for different implementations of
 parquet-format.
- bug fixes
- improvements related to ZSTD, INT96, and reliability

## Community Health:
The regained activity continues, although it is slightly down in June. It
could be attributed to the coming of summer.

19 May 2021

Change the Apache Parquet Project Chair

 WHEREAS, the Board of Directors heretofore appointed Julien Le Dem
 (julien) to the office of Vice President, Apache Parquet, and

 WHEREAS, the Board of Directors is in receipt of the resignation of
 Julien Le Dem from the office of Vice President, Apache Parquet, and

 WHEREAS, the Project Management Committee of the Apache Parquet project
 has chosen by vote to recommend Xinli Shang (shangxinli) as the
 successor to the post;

 NOW, THEREFORE, BE IT RESOLVED, that Julien Le Dem is relieved and
 discharged from the duties and responsibilities of the office of Vice
 President, Apache Parquet, and

 BE IT FURTHER RESOLVED, that Xinli Shang be and hereby is appointed to
 the office of Vice President, Apache Parquet, to serve in accordance
 with and subject to the direction of the Board of Directors and the
 Bylaws of the Foundation until death, resignation, retirement, removal
 or disqualification, or until a successor is appointed.

 Special Order 7A, Change the Apache Parquet Project Chair, was
 approved by Unanimous Vote of the directors present.

21 Apr 2021 [Julien Le Dem / Sander]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Parquet was founded 2015-04-21 (6 years ago)
There are currently 37 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 5:4.

Community changes, past quarter:
- No new PMC members. Last addition was Xinli Shang on 2020-11-09.
- Gidon Gershinsky was added as committer on 2021-04-05

## Project Activity:
Latest release: MR-1.12.0 was released on 2021-03-25.
main features:
 - encryption
 - bloom filter
 - BYTE_STREAM_SPLIT encoding
many bug fixes
https://github.com/apache/parquet-mr/blob/master/CHANGES.md#version-1120

## Community Health:
Nice to see an increase in activity after the somewhat slower activity for
the past year.

20 Jan 2021 [Julien Le Dem / Sander]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Parquet was founded 2015-04-21 (6 years ago)
There are currently 36 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- Xinli Shang was added to the PMC on 2020-11-09
- No new committers. Last addition was Antoine Pitrou on 2020-05-21.

## Project Activity:
- bug fixes
- improvements related to encryption feature
- dependency maintenance updates

## Community Health:
Activity has picked up again after the pandemic slowdown,
in particular on the mailing list and among contributors on GitHub.

21 Oct 2020 [Julien Le Dem / Bertrand]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Parquet was founded 2015-04-21 (5 years ago)
There are currently 36 committers and 25 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- No new PMC members. Last addition was Gábor Szádovszky on 2019-06-27.
- No new committers. Last addition was Antoine Pitrou on 2020-05-21.

## Project Activity:
Ongoing efforts:
 - encryption
 - integrations improvements / bug fixes: (avro, thrift, protobuf)
 - release

## Community Health:
 - Still lower activity at the moment. Possibly attributed to the pandemic.
 - github activity is picking up.

15 Jul 2020 [Julien Le Dem / Shane]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Parquet was founded 2015-04-21 (5 years ago)
There are currently 36 committers and 25 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- No new PMC members. Last addition was Gábor Szádovszky on 2019-06-27.
- Antoine Pitrou was added as committer on 2020-05-21
- Micah Kornfield was added as committer on 2020-05-21

## Project Activity:
Ongoing discussion regarding:
 - encryption feature (now used in production at Uber)
 - Hardware acceleration (in particular for compression)
 - bug fixes
 - next release

## Community Health:
- Somewhat lower activity this quarter that might be related to the ongoing
  pandemic.

15 Apr 2020 [Julien Le Dem / Justin]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
there are no issues requiring board attention at this time

## Membership Data:
Apache Parquet was founded 2015-04-21 (5 years ago)
There are currently 34 committers and 25 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- No new PMC members. Last addition was Gábor Szádovszky on 2019-06-27.
- Xinli Shang was added as committer on 2020-03-12

## Project Activity:
Work in progress:
- encryption
- bloom filters
- improvements to CLI
working on release 1.11.1

Recent releases:
- Parquet Format 2.8.0 was released on 2020-01-13.
- Parquet 1.11.0 was released on 2019-12-06.
- Parquet Format 2.7.0 was released on 2019-09-29.

## Community Health:
JIRA and PRs are opened and resolved at a healthy pace
discussions happening around: releases, encryption, bloom filters,
CLI improvement

15 Jan 2020 [Julien Le Dem / Danny]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now as part of apache arrow)

## Issues:
there are no issues requiring board attention at this time

## Membership Data:
Apache Parquet was founded 2015-04-21 (5 years ago)
There are currently 33 committers and 25 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- No new PMC members. Last addition was Gábor Szádovszky on 2019-06-27.
- No new committers. Last addition was Fokko Driesprong on 2019-06-25.

## Project Activity:
We released 1.11.0:
 https://github.com/apache/parquet-mr/blob/master/CHANGES.md#version-1110
In particular, it includes:
 - column indexes.
 - new logical types
 - nanosecond precision timestamps
It also includes many bug fixes and dependency updates.

Still in progress: encryption

Recent releases:
1.11.0 was released on 2019-12-06.
Format 2.7.0 was released on 2019-09-29.

## Community Health:
dev@parquet.apache.org had a 9% increase in traffic in the past quarter
(624 emails compared to 569)
We're closing tickets at a reasonable rate
61 issues opened in JIRA, past quarter (17% increase)
54 issues closed in JIRA, past quarter (8% increase)

16 Oct 2019 [Julien Le Dem / Myrle]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now part of Apache Arrow)

## Issues:
there are no issues requiring board attention at this time

## Membership Data:
Apache Parquet was founded 2015-04-21 (4 years ago)
There are currently 33 committers and 25 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:7.

Community changes, past quarter:
- No new PMC members. Last addition was Gábor Szádovszky on 2019-06-27.
- No new committers. Last addition was Fokko Driesprong on 2019-06-25.

## Project Activity:
- Format 2.7.0 was released on 2019-09-29.
working towards a parquet-mr release to go with it.

## Community Health:
JIRA activity is fairly stable, tickets are opened and closed at a similar
rate.
- 51 issues opened in JIRA, past quarter (21% decrease)
- 46 issues closed in JIRA, past quarter (4% decrease)
There is some activity in finalizing large efforts that have
been in the works for a while (encryption, bloom filters).
- 40 commits in the past quarter (21% increase)
- 19 code contributors in the past quarter (72% increase)
- 40 PRs opened on GitHub, past quarter (16% decrease)
- 49 PRs closed on GitHub, past quarter (40% increase)
Nice exploration of floating point compression from the community.

17 Jul 2019 [Julien Le Dem / Ted]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now part of Apache Arrow)

## Issues:
there are no issues requiring board attention at this time

## Activity:
Following up on activity from last report. We have recently added new
committers and a PMC member. We are still working on releasing 1.10. There has
been agreement on a plan to get there, but progress is still slow.

## Health report:
The discussion volume on the mailing lists is stable.
Tickets get created and closed at a reasonable pace.

## PMC changes:

 - Currently 25 PMC members.
 - Gábor Szádovszky was added to the PMC on Fri Jun 28 2019

## Committer base changes:

 - Currently 33 committers.
 - New committers:
    - Fokko Driesprong was added as a committer on Tue Jun 25 2019
    - Nándor Kollár was added as a committer on Tue Jun 25 2019

## Releases:

 - Last release was Format 2.6.0 on Tue Oct 02 2018

## Mailing list activity:

 - dev@parquet.apache.org:
    - 238 subscribers (up 14 in the last 3 months):
    - 693 emails sent to list (684 in previous quarter)


## JIRA activity:

 - 63 JIRA tickets created in the last 3 months
 - 48 JIRA tickets closed/resolved in the last 3 months

17 Apr 2019 [Julien Le Dem / Rich]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration. (Now part of Apache Arrow)

## Issues:
there are no issues requiring board attention at this time

## Activity:
We have been working towards releasing parquet 1.11.0
We have been slow at validating the release.
This is due in part to the scope of the release affecting the file
format itself and warranting more scrutiny to ensure backwards compatibility.
We are actively discussing how to improve our processes.
Current actions considered:
 - Clarify the vetting process for such releases.
 - Simplify/Automate the release validation process.
 - Review potential PMC candidates in our current contributors.

## Health report:
The discussion volume on the mailing lists is stable.
Tickets get created and closed at a reasonable pace.

## PMC changes:

 - Currently 24 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Zoltan Ivanfi on Sun Apr 15 2018

## Committer base changes:

 - Currently 31 committers.
 - No new committers added in the last 3 months
 - Last committer addition was Benoit Hanotte on Mon May 28 2018

## Releases:

 - Last release was Format 2.6.0 on Mon Oct 01 2018

## Mailing list activity:

 - email volume is stable, JIRA opened and closed at a similar pace

 - dev@parquet.apache.org:
    - 224 subscribers (up 5 in the last 3 months):
    - 684 emails sent to list (517 in previous quarter)


## JIRA activity:

 - 66 JIRA tickets created in the last 3 months
 - 65 JIRA tickets closed/resolved in the last 3 months

16 Jan 2019 [Julien Le Dem / Ted]

## Description:
Parquet is a standard and interoperable columnar file format for efficient
analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
  definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
  integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
  bindings and arrow integration.

## Issues:
 No issue at this time

## Activity:
Current activity around:
- encryption
- Page indexing
- cutting a new release
- improvement on parquet-proto

## Health report:
The discussion volume on the mailing lists is stable. Tickets get created and
closed at a reasonable pace.

## PMC changes:

 - Currently 24 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Zoltan Ivanfi on Sun Apr 15 2018

## Committer base changes:

 - Currently 31 committers.
 - No new committers added in the last 3 months
 - Last committer addition was Benoit Hanotte on Mon May 28 2018

## Releases:

 - Last release was Format 2.6.0 on Mon Oct 01 2018

## Mailing list activity:

 - dev@parquet.apache.org:
    - 216 subscribers (up 2 in the last 3 months):
    - 529 emails sent to list (757 in previous quarter)


## JIRA activity:

 - 49 JIRA tickets created in the last 3 months
 - 65 JIRA tickets closed/resolved in the last 3 months

17 Oct 2018 [Julien Le Dem / Shane]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration.

## Issues:
  No issue at this time

## Activity:
As mentioned in the Arrow report:
 - the Arrow and Parquet communities resolved by vote to merge their respective
  C++ codebases in the Apache Arrow repository. This work was completed this
  quarter.
 - We now need to update the parquet-cpp repository accordingly.

Current activity around:
- encryption
- Page indexing
- Bug fixes

## Health report:
The discussion volume on the mailing lists is stable.
Tickets get created and closed at a reasonable pace

## PMC changes:
- Currently 24 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Zoltan Ivanfi on Sun Apr 15 2018

## Committer base changes:
- Currently 31 committers.
- No new committers added in the last 3 months
- Last committer addition was Benoit Hanotte on Mon May 28 2018

## Releases:
- CPP-1.5.0 was released on Wed Sep 19 2018
- Format 2.6.0 was released on Mon Oct 01 2018

## Mailing list activity:
- dev@parquet.apache.org:
    - 215 subscribers (up 7 in the last 3 months):
    - 797 emails sent to list (880 in previous quarter)

## JIRA activity:
- 94 JIRA tickets created in the last 3 months
- 55 JIRA tickets closed/resolved in the last 3 months

18 Jul 2018 [Julien Le Dem / Brett]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration.

## Issues:
there are no issues requiring board attention at this time

## Activity:
- Progress on encryption functionality
- bloom filters discussions
- Page indexes implementation

## Health report:
The discussion volume on the mailing lists is increasing.
Tickets get created and closed at a reasonable pace

## PMC changes:

 - Currently 24 PMC members.
 - Zoltan Ivanfi was added to the PMC on Mon Apr 16 2018

## Committer base changes:

 - Currently 31 committers.
 - New committers:
    - Benoit Hanotte was added as a committer on Mon May 28 2018
    - Costi Muraru was added as a committer on Sat May 19 2018
    - Gábor Szádovszky was added as a committer on Wed May 16 2018

## Releases:

 - 1.8.3 was released on Fri May 11 2018
 - Format 2.5.0 was released on Wed Apr 18 2018

## Mailing list activity:

Steady activity

 - dev@parquet.apache.org:
    - 208 subscribers (up 3 in the last 3 months):
    - 898 emails sent to list (875 in previous quarter)


## JIRA activity:

 - 77 JIRA tickets created in the last 3 months
 - 61 JIRA tickets closed/resolved in the last 3 months

18 Apr 2018 [Julien Le Dem / Ted]

## Description:
Parquet is a standard and interoperable columnar file format for efficient
analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
  definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
  integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
  bindings and arrow integration.

## Issues:
  No issue at this time

## Activity:
 - Statistics bug fixes
 - ongoing votes for parquet-format and parquet-mr release
 - ongoing votes for parquet-rust contribution
 - parquet-proto improvements and new contributors

## Health report:
 The discussion volume on the mailing lists is increasing. Tickets get created
 and closed at a reasonable pace

## PMC changes:
 - Currently 23 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Uwe Korn on Sun Mar 26 2017

## Committer base changes:
 - Currently 28 committers.
 - No new committers added in the last 3 months
 - Last committer addition was Zoltan Ivanfi on Fri Oct 27 2017

## Releases:
 - CPP-1.4.0 was released on Mon Mar 05 2018

## Mailing list activity:
 - mailing list activity up this quarter
 - dev@parquet.apache.org:
    - 202 subscribers (up 5 in the last 3 months):
    - 933 emails sent to list (573 in previous quarter)

## JIRA activity:
 - 77 JIRA tickets created in the last 3 months
 - 54 JIRA tickets closed/resolved in the last 3 months

17 Jan 2018 [Julien Le Dem / Mark]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration.

## Issues:
 No issue at this time

## Activity:
Current activity around:
- Deprecating int96 timestamp
- Preparing for a release
- Supporting .NET integration
- Min max stats improvement
- Page indexing new features
- Bloom filters support

## Health report:
The discussion volume on the mailing lists is stable.
Tickets get created and closed at a reasonable pace

## PMC changes:

- Currently 23 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Uwe Korn on Sun Mar 26 2017

## Committer base changes:

- Currently 28 committers.
- New committers:
   - Lars Volker was added as a committer on Mon Oct 16 2017
   - Zoltan Ivanfi was added as a committer on Fri Oct 27 2017

## Releases:

- CPP-1.3.1 was released on Fri Oct 27 2017
- Format 2.4.0 was released on Sat Oct 21 2017

## Mailing list activity:
- dev@parquet.apache.org:
   - 200 subscribers (up 12 in the last 3 months):
   - 633 emails sent to list (432 in previous quarter)


## JIRA activity:

- 54 JIRA tickets created in the last 3 months
- 41 JIRA tickets closed/resolved in the last 3 months

18 Oct 2017 [Julien Le Dem / Shane]

## Description:
Parquet is a standard and interoperable columnar file format for efficient
analytics.

## Issues:
there are no issues requiring board attention at this time.

## Activity:
- Ongoing work to add Bloom Filters to parquet format. Discussion around the
prototype and java<->cpp interoperability
- Prototype is ready for adding page offset metadata in the footer and using
it for better push down. Ready to proceed with merging metadata.
- compression with Brotli and Zstandard
- Preparing a parquet-format release

## Health report:
- issues: tickets are closed at about the same rate they are opened
- mailing list email level is stable.

## PMC changes:
- Currently 23 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Uwe Korn on Sun Mar 26 2017

## Committer base changes:
- Currently 26 committers.
- Deepak Majeti was added as a committer on Tue Aug 01 2017

## Releases:

- CPP-1.2.0 was released on Sun Jul 30 2017
- CPP-1.3.0 was released on Sun Sep 24 2017

## Mailing list activity:
- activity stable since the last report

- dev@parquet.apache.org:
 - 188 subscribers (up 3 in the last 3 months):
 - 468 emails sent to list (618 in previous quarter)


## JIRA activity:
- 74 JIRA tickets created in the last 3 months
- 54 JIRA tickets closed/resolved in the last 3 months

19 Jul 2017 [Julien Le Dem / Phil]

## Description: Parquet is a standard and interoperable columnar file format
for efficient analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration.

## Issues: there are no issues requiring board attention at this time.

## Activity:
- Ongoing work to add Bloom Filters to parquet format. Discussion around the
prototype and java<->cpp interoperability
- Ongoing prototype for adding page offset metadata in the footer and using it
for better push down.
- Preparing a patch level release of parquet-mr
- Planning release 1.2.0 of parquet-cpp
- activity around protocol buffer integration

## Health report:
- issues: tickets are closed at about the same rate they are opened
- mailing list email level is stable.

## PMC changes:
- Currently 23 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Uwe Korn on Sun Mar 26 2017

## Committer base changes:
- Currently 25 committers.
- No new committers added in the last 3 months
- Last committer addition was Uwe Korn on Sun Sep 04 2016

## Releases:
- CPP-1.1.0 was released on Sun May 21 2017

## Mailing list activity:
- activity stable since the last report
- dev@parquet.apache.org:
 - 185 subscribers (up 8 in the last 3 months):
 - 637 emails sent to list (638 in previous quarter)

## JIRA activity:
- 101 JIRA tickets created in the last 3 months
- 91 JIRA tickets closed/resolved in the last 3 months

19 Apr 2017 [Julien Le Dem / Brett]

## Description:
Parquet is a standard and interoperable columnar file format for efficient
analytics. Parquet has 3 sub-projects:
- parquet-format: format reference doc along with thrift based metadata
definition (used by both sub-projects below)
- parquet-mr: java apis and implementation of the format along with
integrations to various projects (thrift, pig, protobuf, avro, ...)
- parquet-cpp: C++ apis and implementation of the format along with Python
bindings and arrow integration.

## Issues:
there are no issues requiring board attention at this time

## Activity:
- We had our first parquet-cpp release (kudos Uwe and Wes!)
- Several threads relating to time types and statistics.

## Health report:
We host regular public sync ups on hangout. Notes are sent to the mailing list
and follow ups happen on JIRA and github pull requests. Recently we're also
using google docs attached to JIRAs to drive the discussion.

## PMC changes:
- Currently 23 PMC members.
- Uwe Korn was added to the PMC on Sun Mar 26 2017

## Committer base changes:
- Currently 25 committers.
- No new committers added in the last 3 months
- Last committer addition was Uwe Korn on Sun Sep 04 2016

## Releases:
- 1.8.2 was released on Mon Jan 23 2017
- CPP-1.0.0 was released on Mon Mar 13 2017

## Mailing list activity:
Some of the extra activity is related to the 1.8.2 release for Spark, the time
types discussion, the parquet-cpp release, and statistics discussions.

- dev@parquet.apache.org:
 - 178 subscribers (up 2 in the last 3 months):
 - 656 emails sent to list (418 in previous quarter)

## JIRA activity:
- 120 JIRA tickets created in the last 3 months
- 90 JIRA tickets closed/resolved in the last 3 months

18 Jan 2017 [Julien Le Dem / Marvin]

## Description:
Parquet is a standard and interoperable columnar file format
for efficient analytics.

## Issues:
there are no issues requiring board attention at this time

## Activity:
- parquet-arrow integration has been added in parquet-cpp
- We're preparing a 1.8.2 patch release for the Apache Spark project
- We're preparing parquet-cpp 0.1: its first release (PARQUET-713)

## Health report:
Discussion is happening on the mailing list, JIRA and
regular hangout sync up. Notes are sent to the mailing list.

## PMC changes:
- Currently 22 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Wes McKinney on Thu Sep 01 2016

## Committer base changes:
- Currently 25 committers.
- No new committers added in the last 3 months
- Last committer addition was Uwe Korn on Sun Sep 04 2016

## Releases:
- 1.9.0 was released on Sun Oct 23 2016

## Mailing list activity:
- Activity on the mailing list remains roughly the same
- JIRAs are resolved at about the same pace they are opened.

- dev@parquet.apache.org:
  - 176 subscribers (up 3 in the last 3 months):
  - 452 emails sent to list (436 in previous quarter)

## JIRA activity:
- 81 JIRA tickets created in the last 3 months
- 67 JIRA tickets closed/resolved in the last 3 months

19 Oct 2016 [Julien Le Dem / Sam]

Report from the Apache Parquet committee [Julien Le Dem]

## Description:
Parquet is a standard and interoperable columnar file format for
efficient analytics.

## Issues:
there are no issues requiring board attention at this time

## Activity:
The community has been converging toward a 1.9 release. The vote will start in
the coming days. Discussions about better encoding and vectorization APIs are
ongoing. The parquet-cpp repo has reached a stable state and should release
soon. Integration with arrow-cpp is now in the parquet-cpp repo.

## Health report:
The PMC and committer list are growing. Discussion is happening on the mailing
list, JIRA and regular hangout sync up. Notes are sent to the mailing list.

## PMC changes:
 - Currently 22 PMC members.
 - Wes McKinney was added to the PMC on Thu Sep 01 2016

## Committer base changes:
 - Currently 25 committers.
 - Uwe Korn was added as a committer on Sun Sep 04 2016

## Releases:
 - Last release was Format 2.3.1 on Thu Dec 17 2015
 - parquet-mr 1.9.0 vote ongoing

## Mailing list activity:
 - Activity on the mailing list remains roughly the same
 - JIRAs are resolved at about the same pace they are opened.

 - dev@parquet.apache.org:
    - 172 subscribers (up 9 in the last 3 months):
    - 486 emails sent to list (394 in previous quarter)

## JIRA activity:
 - 85 JIRA tickets created in the last 3 months
 - 74 JIRA tickets closed/resolved in the last 3 months

20 Jul 2016 [Julien Le Dem / Isabel]

## Description:
Parquet is a standard and interoperable columnar file format for
efficient analytics.

## Issues:
there are no issues requiring board attention at this time

## Activity:
- Work on stabilizing master preparing for a release of parquet-mr (ByteBuffer)
- encoding strategy experiments
- ByteBuffer stabilization.
- Brotli compression experiments
- parquet-cpp development
- discussion about vectorized reads and Apache Arrow integration

## Health report:
- JIRAs opened and closed at the same rate
- email activity was higher last quarter due to the parquet-cpp kickoff and discussions.

## PMC changes:
 - Currently 21 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Alex Levenson on Tue Apr 21 2015

## Committer base changes:
 - Currently 24 committers.
 - No new committers added in the last 3 months
 - Last committer addition was Wes McKinney on Thu Mar 03 2016

## Releases:
 - Last release was Format 2.3.1 on Thu Dec 17 2015

## Mailing list activity:
Last quarter had more email activity due to the kickoff of parquet-cpp
- dev@parquet.apache.org:
 - 163 subscribers (up 5 in the last 3 months):
 - 427 emails sent to list (901 in previous quarter)

## JIRA activity:
 - 81 JIRA tickets created in the last 3 months
 - 80 JIRA tickets closed/resolved in the last 3 months

20 Apr 2016 [Julien Le Dem / Bertrand]

## Description:
 Parquet is a standard and interoperable columnar file format for
efficient analytics.

## Issues:
 there are no issues requiring board attention at this time

## Activity:
There is a surge of activity related to the development of the Parquet-cpp
library.
Initially Parquet had a Java implementation as well as reference
implementations for some encodings in C++. The C++ version is now being
fully implemented. A new committer has been recently invited based on that
work.

## Health report:
 The project is healthy. We have new contributors. Communication happens
on the mailing list and on regular public hangout sync ups for which
notes are published on the mailing list.

## PMC changes:
 - Currently 21 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Alex Levenson on Tue Apr 21 2015

## Committer base changes:
 - Currently 24 committers.
 - Wes McKinney was added as a committer on Thu Mar 03 2016

## Releases:
 - Format 2.3.1 was released on Thu Dec 17 2015

## Mailing list activity:
   A surge of emails related to the development of parquet-cpp

 - dev@parquet.apache.org:
    - 152 subscribers (up 14 in the last 3 months):
    - 940 emails sent to list (361 in previous quarter)

## JIRA activity:
 - 158 JIRA tickets created in the last 3 months
 - 109 JIRA tickets closed/resolved in the last 3 months

20 Jan 2016 [Julien Le Dem / David]

## Description:
   Apache Parquet is a general-purpose columnar storage format.

## Issues:
  there are no issues requiring board attention at this time

## Activity:
 All changes required by Apache Drill have been merged into Apache Parquet,
 getting Drill off of its Parquet fork.  Releases are ongoing to allow Drill
 to upgrade its dependencies.  Several efforts are ongoing to improve
 vectorized reads from Java and C++.  They involve collaboration among several
 organizations.  Communication is happening in JIRA.

## Health report:
  We now have a rotation in which someone is responsible for answering JIRAs
  and emails each week.  Level of ticket creation and resolution is about the
  same, keeping opened tickets to a reasonable amount.  Typically user
  activity shows up in the user lists of other projects depending on parquet
  (drill, impala, presto, spark, ...)

## PMC changes:

 - Currently 21 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Alex Levenson on Tue Apr 21 2015

## Committer base changes:

 - Currently 23 committers.
 - New committers:
    - Cheng Lian was added as a committer on Wed Dec 02 2015
    - Sergio Peña was added as a committer on Wed Dec 02 2015

## Releases:

 - Parquet-Format 2.3.1 was released on Thu Dec 17 2015
 - Parquet-mr 1.9.0 in preparation

## Mailing list activity:

 - dev@parquet.apache.org:
    - 147 subscribers (up 17 in the last 3 months):
    - 466 emails sent to list (396 in previous quarter)

## JIRA activity:

 - 40 JIRA tickets created in the last 3 months
 - 36 JIRA tickets closed/resolved in the last 3 months

21 Oct 2015 [Julien Le Dem / Rich]

## Description:
Apache Parquet is a general-purpose columnar storage format.

## Issues:
there are no issues requiring board attention at this time

## Activity:
- Bloom filters: need to finalize the design. Have use cases to validate it
  (query execution, etc)
- Vectorized read API: refactoring of the code based on feedback.
- Using dict in filter push down: rework to have better code reuse.
- ByteBuffer: close to being merged.

## Health report:
The project is fairly stable with new features and
compatibility testing underway.

## PMC changes:
- Currently 21 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Alex Levenson on Tue Apr 21 2015

## LDAP changes:
- Currently 21 committers and 21 committee group members.
- No new changes to the committee group or committership since last report.

## Releases:
- 1.8.1 was released on Tue Jul 21 2015

## Mailing list activity:
- dev@parquet.apache.org:
  - 130 subscribers (up 13 in the last 3 months):
  - 367 emails sent to list (705 in previous quarter)

## JIRA activity:
- 53 JIRA tickets created in the last 3 months
- 25 JIRA tickets closed/resolved in the last 3 months

15 Jul 2015 [Julien Le Dem / Shane]

Apache Parquet is a general-purpose columnar storage format.

## Activity:
We're working towards a 1.8.0 release and merging the ByteBuffer PR (ZeroCopy
HDFS reads). Our goal is to keep master in a releasable state and to do
releases quickly.

## Issues:
- there are no issues requiring board attention at this time

## LDAP committee group/Committership changes:
- Currently 21 committers and 21 LDAP committee group members.
- No new changes to the LDAP committee group or committership since last
 report. Two new PMC members Alex Levenson and Daniel Weeks were added on Dec
 28th 2014

## Releases:
- 1.7.0 was released on Mon May 18 2015
- 1.8.0 is being voted on.

## Mailing list activity:
- dev@parquet.apache.org:
  - 116 subscribers (up 5 in the last 3 months):
  - 707 emails sent to list (722 in previous quarter)

## JIRA activity:
- 79 JIRA tickets created in the last 3 months
- 64 JIRA tickets closed/resolved in the last 3 months

17 Jun 2015 [Julien Le Dem / Shane]

## Description:
   Apache Parquet is a general-purpose columnar storage format.

## Activity:
We're working towards a 1.8.0 release and merging the ByteBuffer PR (ZeroCopy
HDFS reads). Our goal is to keep master in a releasable state and to do
releases quickly.

## Issues:
 there are no issues requiring board attention at this time

## PMC/Committership changes:

 - Currently 21 committers and 21 PMC members in the project.
 - No new changes to the PMC or committership since last report. Two new PMC
   members Alex Levenson and Daniel Weeks were added on Dec 28th 2014

## Releases:

 - 1.7.0 was released on Mon May 18 2015

## Mailing list activity:

 - dev@parquet.apache.org:

    - 112 subscribers (up 12 in the last 3 months):
    - 829 emails sent to list (459 in previous quarter)


## JIRA activity:

 - 91 JIRA tickets created in the last 3 months
 - 57 JIRA tickets closed/resolved in the last 3 months

20 May 2015 [Julien Le Dem / Jim]

Parquet is a columnar file format for Hadoop.

## Project Status

The project just graduated from the incubator and is voting on its first
release as a TLP.  No issues to report.

## Community

 - Two new PMC members Alex Levenson and Daniel Weeks on Dec 28th 2014
 - No new committer or PMC member since last report in April
 - JIRA past 30 days: 30 created and 22 resolved as of May 18th
https://issues.apache.org/jira/browse/PARQUET
 - 114 subscribers to the dev mailing list as of May 18th
 - emails on the dev list: Apr: 397, Mar: 319, Feb: 135, Jan: 112
http://mail-archives.apache.org/mod_mbox/parquet-dev/
 - commits: Apr: 84, Mar: 38, Feb: 47, Jan: 9
http://mail-archives.apache.org/mod_mbox/parquet-commits/
 - regular project sync ups are held on hangout.
They are open to anyone and advertised on the dev mailing list;
notes are then published on the list as well
 - several Parquet related presentations scheduled at the Hadoop summit in June
http://2015.hadoopsummit.org/san-jose/agenda/

## Community Objectives

The community's main objectives (not excluding other ongoing efforts):
 - Working towards merging the ByteBuffer access work
 - Vectorized execution improvements (and integration with Apache Drill,
   Apache Hive, Presto)
 - Improving Projection and Predicate APIs
 - Standardizing nested type representations (thrift and avro write-side)
 - Improving high-level type specs (microsecond time/timestamp)

## Releases

 - Last releases:
    - parquet-mr 1.6.0-incubating on Apr 12th:
https://dist.apache.org/repos/dist/release/parquet/parquet-mr-1.6.0-incubating/
    - parquet-mr 1.7.0 on May 18th (just voted):
https://dist.apache.org/repos/dist/release/parquet/parquet-mr-1.7.0/
 - Next release: a parquet-format release will happen soon.

22 Apr 2015

Establish the Apache Parquet Project

 WHEREAS, the Board of Directors deems it to be in the best
 interests of the Foundation and consistent with the
 Foundation's purpose to establish a Project Management
 Committee charged with the creation and maintenance of
 open-source software, for distribution at no charge to the
 public, related to a columnar storage format for Hadoop.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management
 Committee (PMC), to be known as the "Apache Parquet Project",
 be and hereby is established pursuant to Bylaws of the
 Foundation; and be it further

 RESOLVED, that the Apache Parquet Project be and hereby is
 responsible for the creation and maintenance of software
 related to a columnar storage format for Hadoop; and be it further

 RESOLVED, that the office of "Vice President, Apache Parquet" be
 and hereby is created, the person holding such office to
 serve at the direction of the Board of Directors as the chair
 of the Apache Parquet Project, and to have primary responsibility
 for management of the projects within the scope of
 responsibility of the Apache Parquet Project; and be it further

 RESOLVED, that the persons listed immediately below be and
 hereby are appointed to serve as the initial members of the
 Apache Parquet Project:

    * Chris Aniszczyk <caniszczyk@apache.org>
    * Ryan Blue <blue@apache.org>
    * Jonathan Coveney <jcoveney@apache.org>
    * Tim <tianshuo@apache.org>
    * Jake Farrell <jfarrell@apache.org>
    * Marcel Kornacker <marcel@apache.org>
    * Mickael Lacour <mlacour@apache.org>
    * Julien Le Dem <julien@apache.org>
    * Alex Levenson <alexlevenson@apache.org>
    * Nong Li <nong@apache.org>
    * Todd Lipcon <todd@apache.org>
    * Chris Mattmann <mattmann@apache.org>
    * Aniket Mokashi <aniket486@apache.org>
    * Lukas Nalezenec <lukas@apache.org>
    * Brock Noland <brock@apache.org>
    * Wesley Graham Peck <wesleypeck@apache.org>
    * Remy Pecqueur <rpecqueur@apache.org>
    * Dmitriy Ryaboy <dvryaboy@apache.org>
    * Roman Shaposhnik <rvs@apache.org>
    * Daniel Weeks <dweeks@apache.org>
    * Thomas White <tomwhite@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Julien Le Dem
 be appointed to the office of Vice President, Apache Parquet, to
 serve in accordance with and subject to the direction of the
 Board of Directors and the Bylaws of the Foundation until
 death, resignation, retirement, removal or disqualification,
 or until a successor is appointed; and be it further

 RESOLVED, that the initial Apache Parquet PMC be and hereby is
 tasked with the creation of a set of bylaws intended to
 encourage open development and increased participation in the
 Apache Parquet Project; and be it further

 RESOLVED, that the Apache Parquet Project be and hereby
 is tasked with the migration and rationalization of the Apache
 Incubator Parquet podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Incubator Parquet podling encumbered upon the Apache Incubator
 Project are hereafter discharged.

 Special Order 7D, Establish the Apache Parquet Project, was
 approved by Unanimous Vote of the directors present.

22 Apr 2015

Parquet is a columnar storage format for Hadoop.

Parquet has been incubating since 2014-05-20.

Three most important issues

 - 1st releases toward org.apache Parquet 1.6.0 GA
 - Expanding the community and adding new committers
 - Ensuring timely code reviews by committers, developing reviewers

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 - None at this time

Latest Additions:

 * PMC addition:        None
 * Committer addition:  Dan Weeks and Alex Levenson (from last report)

Issue backlog status since last report:

 * Created:   34
 * Resolved:  50

Mailing list activity since last report:

 * dev      560 messages: 111 in Jan, 136 in Feb, and 313 in Mar

How has the project developed since the last report?

 - Preparing last commits for the first parquet-mr release candidate
 - Planned parquet-mr 1.6.0 release schedule
 - ASF required changes to parquet-mr are finished
 - Released parquet-format 2.3.0, with org.apache packages
 - Parquet presentation at Strata 2015 San Jose and the Presto meetup

Date of last release:

 - parquet-format 2.3.0 released 19 Feb
 - Not yet released: parquet-mr and parquet-cpp

Signed-off-by:

 [ ](parquet) Todd Lipcon
 [X](parquet) Jake Farrell
 [X](parquet) Chris Mattmann
 [X](parquet) Roman Shaposhnik
 [ ](parquet) Tom White

21 Jan 2015

Parquet is a columnar storage format for Hadoop.

Parquet has been incubating since 2014-05-20.

Three most important issues

 - Expanding the community and adding new committers
 - 1st releases toward org.apache Parquet 1.6.0 GA
 - Identifying how to ensure timely code reviews by committers

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 - None at this time

Latest Additions:

 * PMC addition:          None
 * Contributor addition:  Dan Weeks and Alex Levenson

Issue backlog status since last report:

 * Created:   45
 * Resolved:  20

Mailing list activity since last report:

 * dev      310 messages: 90 in Oct, 126 in Nov, and 94 in Dec

How has the project developed since the last report?

 - Completed first release, Apache Parquet Format (incubating) 2.2.0
 - Established a by-law for adding committers
 - Added 2 new committers
 - Parquet presentation accepted for Strata San Jose

Date of last release:

 - parquet-format released 14 November 2014
 - Not yet released: parquet-mr and parquet-cpp

Signed-off-by:

 [ ](parquet) Todd Lipcon
 [X](parquet) Jake Farrell
 [X](parquet) Chris Mattmann
 [X](parquet) Roman Shaposhnik
 [ ](parquet) Tom White

Shepherd/Mentor notes:

 Mailing lists are active; most mentors are active.

15 Oct 2014

Parquet is a columnar storage format for Hadoop.

Parquet has been incubating since 2014-05-20.

Three most important issues

 - Expanding the community and adding new committers
 - 1st releases
 - Identifying how to ensure timely code reviews by committers

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 - None at this time

Latest Additions:

 * PMC addition:          N/A
 * Contributor addition:  N/A

Issue backlog status since last report:

 * Created:   27
 * Resolved:  19

Mailing list activity since last report:

 * dev      144 messages

How has the project developed since the last report?

 - Attempted parquet-format release twice, next RC in early October.
 - Assembled tasks to complete for a parquet-mr release
 - New push-down filter API and task-side block metadata reading

Date of last release:

 - No releases as of yet.

Signed-off-by:

 [ ](parquet) Todd Lipcon
 [X](parquet) Jake Farrell
 [ ](parquet) Chris Mattmann
 [X](parquet) Roman Shaposhnik
 [X](parquet) Tom White

20 Aug 2014

Parquet is a columnar storage format for Hadoop.

Parquet has been incubating since 2014-05-20.

Three most important issues to address in the move towards graduation:

 1. Expanding the community and adding new committers
 2. 1st release
 3. Identifying how to ensure timely code reviews by committers

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None at this time

Latest Additions:

 * PMC addition:          N/A
 * Contributor addition:  N/A

Issue backlog status since last report:

 * Created:   60
 * Resolved:  17

Mailing list activity since last report:

 * dev      212 messages

How has the project developed since the last report?

 * New commit workflow has been documented and commits have been
 increasing using the commit script.
 * Project website is posted: parquet.incubator.apache.org; working on
 moving more content from GitHub hosting
 * Moved to issues.apache.org for all new issues
 * Planning first release of parquet-format and parquet-mr. Using
 parquet-format release to identify steps needed to release the
 larger projects (e.g., parquet-mr)
 * Adding documentation on reviews and contacts for specific modules

Signed-off-by:

 [X](parquet) Jake Farrell
 [ ](parquet) Chris Mattmann
 [ ](parquet) Roman Shaposhnik
 [X](parquet) Tom White
 [X](parquet) Todd Lipcon

Shepherd/Mentor notes:
 The mailing list has healthy traffic, mostly bug reports. Mentors are
 active and participating in the community.

16 Jul 2014

Parquet is a columnar storage format for Hadoop.

Parquet has been incubating since 2014-05-20.

Three most important issues

 - Finish bootstrapping project (completed), IP clearance (completed),
 initial website (in progress)
 - Expanding the community and adding new committers
 - 1st release

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 - None at this time

Latest Additions:

 * PMC addition:          N/A
 * Contributor addition:  N/A

Issue backlog status since last report:

 * Created:   8
 * Resolved:  2

Mailing list activity since last report:

 * @dev       69 messages

How has the project developed since the last report?

 - All bootstrap tickets have been completed and status page updated
   - Mailing lists created, Jira setup, Code imported
 - Jira issues starting to be imported to issues.apache.org
 - Website is in the works and will be available soon; infra for this is
 already set up
 - Working on documenting the contributing guide and committer workflow
   - We have now set up the mechanisms to accept contributions through
   the Apache GitHub and have already accepted one external contribution.

Date of last release:

 - No releases as of yet.

Signed-off-by:

 [X](parquet) Todd Lipcon
 [X](parquet) Jake Farrell
 [ ](parquet) Chris Mattmann
 [X](parquet) Roman Shaposhnik
 [X](parquet) Tom White

18 Jun 2014

Parquet is a columnar storage format for Hadoop.

Parquet has been incubating since 2014-05-20.

Three most important issues

 - Finish bootstrapping project, IP clearance, initial website
 - Expanding the community and adding new committers
 - 1st release

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 - None at this time

How has the community developed since the last report?

 - All initial committers have submitted ICLAs and the accounts have been
   created. The mailing lists have been set up and we are starting to use
   them for communication.

How has the project developed since the last report?

 - We have set up the incubator status page and are waiting on the final
   SGA to be sent in to start the code import (INFRA-7782)

Date of last release

 - No releases as of yet. Working through initial IP clearance.

When were the last committers or PMC members elected?

 - N/A, still bootstrapping the project.

Signed-off-by:

 [ ](parquet) Todd Lipcon
 [X](parquet) Jake Farrell
 [ ](parquet) Chris Mattmann
 [X](parquet) Roman Shaposhnik
 [X](parquet) Tom White