The Apache Software Foundation

This was extracted (@ 2024-12-18 22:10) from a list of minutes which have been approved by the Board.
Please note: the Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

WARNING: these pages may omit some original contents of the minutes.
This is due to changes in the layout of the source minutes over the years. Fixes are being worked on.

Meeting times vary. The exact schedule is available to ASF Members and Officers; search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).

Beam

18 Dec 2024 [Kenneth Knowles / Willem]

Report was filed, but display is awaiting the approval of the Board minutes.

18 Sep 2024 [Kenneth Knowles / Justin]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Project Status:
Current project status: ongoing
Issues for the board: none

## Membership Data:
Apache Beam was founded 2016-12-20 (8 years ago)
There are currently 96 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Alex Van Boxel on 2023-10-01.
- XQ Hu was added as committer on 2024-06-24

## Project Activity:

Recent releases:
- 2.59.0 was released on 2024-09-11.
- 2.58.1 was released on 2024-08-16.
- 2.58.0 was released on 2024-08-06.
- 2.57.0 was released on 2024-06-26.

Technical and community activity highlights:

- We just held the 2024 Beam Summit which saw 170+ people from 23 countries,
 with 50+ speakers. Highlights:
  - heavy emphasis on ML-related talks, which comprised about 1/3
  - notably high volume of Beam-on-Flink subject matter (including talks on
    other topics)
  - continued emphasis on new ways of using Beam's core tech: _another_ Go
    SDK, a Swift SDK, YAML pipelines, data lineage
- There has been some discussion of Beam 3.0 and what it would mean for our
 community. The early consensus is that we do not intend to break
 backwards compatibility, and we do want to make 3.0 features available
 early, but still want to signal a new era of Beam releases.
- Through continuous refinement of our release process, we are able to execute
 patch releases when necessary. An example of this is the release of version
 2.58.1 to resolve an issue in our KafkaIO connector. Additionally, we have
 established a policy specifically tailored for patch releases.
- Initial experimental support for using Prism with the Java and Python SDKs.
 This is our project to have a single performant local/testing runner that
 supports all of Beam's new advanced features. Instead of one local runner
 per SDK language, with lots of drift between them, we have one that is built
 in Go. Also, notably, Beam pipelines are inherently multi-language, so it is
 a benefit that the runner be implemented with no bias toward any SDK.

Dependencies/integrations updates:
- First release with Flink 1.18 support
- First release with Python 3.12 support
- Go SDK Minimum Go Version updated to 1.21
- Added Feast feature store handler for enrichment transform (Python)
- Support for Solace source (SolaceIO.Read) added (Java)

Detailed release notes at
https://github.com/apache/beam/blob/master/CHANGES.md

## Community Health:
Community health metrics are about the same, perhaps a bit lower than in the
previous quarter. The activity that is taking place is transparent and in good
open source spirit.

19 Jun 2024 [Kenneth Knowles / JB]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Project Status:
Current project status: Ongoing
Issues for the board: none.

## Membership Data:
Apache Beam was founded 2016-12-20 (7 years ago)
There are currently 95 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Alex Van Boxel on 2023-10-01.
- No new committers. Last addition was Svetak Sundhar on 2024-02-09.

## Project Activity:

Recent releases:

 - 2.56.0 was released on 2024-05-02.
 - 2.55.1 was released on 2024-04-08.
   Notably, Beam's first point release!
   Our release automation has gotten much better so this was finally
   worthwhile to do.
 - 2.55.0 was released on 2024-03-25.

Technical development notes:

 - Added a new API for "Managed" transforms that represents an innovative
   direction for Beam: these transforms are explicitly constructed from
   a machine-readable config rather than just code, with the intention
   that OSS runners and/or Cloud providers can use the config to
   manage them more effectively. Up to this point, with a few exceptions,
   Beam transforms have been "guest" code managed by the user, with runners
   treating them as black boxes. With this API, we hope to enable an even
   smoother user experience than Beam's portability APIs enabled, for example
   by transparently applying upgrades to address CVEs.
 - New Ordered Processing PTransform added for encapsulating a common
   pattern for processing order-sensitive stateful data.
 - Added bad record handling for BigQueryIO and PubsubIO connectors.
 - Added Vertex AI Feature Store handler for the Enrichment transform (a
   best-effort pseudo-join for when just grabbing data from an auxiliary
   store is good enough).

Dependency/related project updates:

 - Arrow version was bumped to 15.0.0 from 5.0.0 (a breaking change
   that we determined was justified)
 - Go SDK base container image moved to distroless/base-nossl-debian12,
   reducing vulnerable container surface to kernel and glibc (also a
   potentially breaking change per Hyrum's Law [1],
   since the container surface is reduced)
 - First release with Flink 1.17 support.
 - Added Flink 1.18 support

[1] https://www.hyrumslaw.com/

## Community Health:
Community health is steady. Traffic on the dev@ list has settled into a new
activity level that isn't changing too much. The same is true for code
contributions, bug reports, and code review.

20 Mar 2024 [Kenneth Knowles / Sander]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Project Status:
Current project status: Ongoing
Issues for the board: None

## Membership Data:
Apache Beam was founded 2016-12-20 (7 years ago)
There are currently 95 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Alex Van Boxel on 2023-10-01.
- Svetak Sundhar was added as committer on 2024-02-09

## Project Activity:

Recent releases:
- 2.54.0 was released on 2024-02-14.

Highlighted technical developments:
 - New capability to auto-generate Python wrappers for Java-based transforms,
   which will rapidly increase the features available to Python users.
 - Added new Enrichment transform for joining a data stream with side storage,
   with support for BigTable and Vertex Feature Store
 - Added DLQ (dead-letter queue) support to MLTransform and many widely-used
   connectors
 - New transform "RequestResponseIO" to read/write Web APIs without
   overwhelming them or getting banned.

Dependency upgrades: we are continuing to improve processes to stay
ahead of emerging vulnerabilities, so it is worth reporting on some
highlights here.
 - Java: Upgraded GCP libraries BOM to 26.32.0 (a major lift that upgrades
   a huge number of dependencies)
 - Python: Upgraded from a very old and deprecated GCS client to the latest
   recommended by GCP.
 - Go (used in our containers as well as SDK): upgraded to 1.21.6

## Community Health:

The overall volume on the mailing list has decreased, but there has
been a greater focus on proposals and design discussions. The average
volume on dev@beam seems steady, though lower than late 2023 due to a spike
last fall. The volume on user@beam is steady.

17 Jan 2024 [Kenneth Knowles / Craig]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Project Status:
Current project status: ongoing
Issues for the board: none

## Membership Data:
Apache Beam was founded 2016-12-20 (7 years ago)
There are currently 94 committers and 26 PMC members in this project.
The Committer-to-PMC ratio is roughly 3:1.

Community changes since last report:
 - Valentyn Tymofieiev was added to the PMC on 2023-10-02
 - Robert Burke was added to the PMC on 2023-10-02
 - Alex Van Boxel was added to the PMC on 2023-10-02

 - Sam Whittle was added as committer on 2023-10-09
 - Byron Ellis was added as a committer on 2023-10-13

## Project Activity:

Recent releases:
 - 2.53.0 was released on 2024-01-05.
 - 2.52.0 was released on 2023-11-17.
 - 2.51.0 was released on 2023-10-11.

Highlighted technical developments:
 - Beam YAML (a YAML format for writing a pipeline) has its
   stable release!
 - Lots of focus on Beam ML, a collection of utility transforms
   that handle loading models, performing inference, and increasingly
   pre- and post-processing steps specific to ML workloads.
 - Running multi-language pipelines locally no longer requires Docker. This
   addresses pain points for users of operating systems with weaker
   Docker support, as well as for users whose corporate policies
   forbid it.
 - Avro dependency finally removed from the core SDK, fixing dependency
   conflicts that plagued users.
 - Explicit Java 21 support added to our released artifacts.
 - Deprecated the Euphoria DSL, which has been made obsolete by the main
   Beam SDK.
 - Finished migrating all Jenkins jobs to GitHub Actions.
 - Upgraded to Go 1.21.5.

Highlights of community activities:
 - Beam College 2023 (https://beamcollege.dev/step/2023/) took place from
   October 23 to November 3, 2023 as an online training event. More than 800
   attendees joined this season. It comprised three tracks:
   - Dive into data processing
   - Hands-on Apache Beam
   - Graduate to streaming
 - Beam Blogs have been active and varied, for example:
   - Two part series on scaling up Beam on Flink
     (https://beam.apache.org/blog/apache-beam-flink-and-kubernetes/)
   - A "Contributor spotlight" blog post
     (https://beam.apache.org/blog/contributor-spotlight-johanna-ojeling/)

## Community Health:
Variations in community metrics are within the normal range,
especially considering the season.

20 Dec 2023 [Kenneth Knowles / Rich]

No report was submitted.

20 Sep 2023 [Kenneth Knowles / Bertrand]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Project Status:
Current project status: Ongoing
Issues for the board: none

## Membership Data:
Apache Beam was founded 2016-12-20 (7 years ago)
There are currently 92 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is approximately 9:2.

Community changes, past quarter:
- No new PMC members. Last addition was Jan Lukavský on 2023-02-14.
- Ahmed Abualsaud was added as committer on 2023-08-24

## Project Activity:

Top-level technical notes to show direction of the project:
 - The "Prism Runner", mentioned in prior reports, is now complete enough and
   the default for the Go SDK. Next up is to make it the best local runner
   for other SDKs to test multi-language pipelines (which are expected to
   be the norm, not the exception, as major libraries are built in one language
   and used in all languages).
 - Multi-language pipelines continue to get easier and more transparent to
   author, this time with a new automatically launched and managed subprocess
   that can serve multiple "external" transforms.
 - BigTable Change Streams support was added. While it is just one connector,
   it is notable for being part of an increased interest in streaming
   applications, in which most storage may be shipping changes around,
   including those not traditionally considered "streaming" systems.
 - ML conveniences continue to be added to Beam Python, such as:
   - hugging face model handler
   - Vertex AI model handler
   - new "MLTransform" for pre/postprocessing, complementing RunInference
   - prebuilding docker containers to bundle large dependencies
 - All Beam released container images are now multi-arch images that support
   both x86 and ARM CPU architectures.
 - Go SDK requires Go 1.20 to build
 - SparkRunner now defaults to Spark 3.2.2

Community:
- Beam Summit 2023 <https://beamsummit.org/> was a huge success. It was the
  eighth and biggest edition of the flagship conference for the Apache Beam
  community. It took place on June 13-15, 2023 as an in-person event, bringing
  the community together in NYC, and on July 18-20, 2023 as a virtual edition.
  [impact report]

[impact report] https://lists.apache.org/thread/l7hxz8wpl9rqt8jotv64620sl2zmdx2p

Recent releases:
- 2.50.0 was released on 2023-08-30.
- 2.49.0 was released on 2023-07-17.

Our 6 week release cadence is going quite well. Increased release automation
has reduced the time from cutting a release branch to finalizing a release.

## Community Health:

Issues and code traffic are within normal variation, aka flat. No growth or
shrinking trends.

The community faced email delivery issues, causing friction when communicating
on the email lists, and making project activities (e.g. release validation)
difficult to coordinate. Although the issues were addressed, there was no
permanent fix, so we are not sure whether we will run into them again.
(Examples: https://issues.apache.org/jira/browse/INFRA-24574
https://issues.apache.org/jira/browse/INFRA-24790
https://issues.apache.org/jira/browse/INFRA-24872)

21 Jun 2023 [Kenneth Knowles / Shane]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Project Status:
Current project status: Ongoing
Issues for the board: none

## Membership Data:
Apache Beam was founded 2016-12-20 (6 years ago)
There are currently 91 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:2.

Community changes, past quarter:
- No new PMC members. Last addition was Jan Lukavský on 2023-02-14.
- Anand Inguva was added as committer on 2023-04-21.
- Damon Douglas was added as committer on 2023-04-21.

## Project Activity:

- 2.48.0 was released on 2023-06-03.
- 2.47.0 was released on 2023-05-10.

Highlights of community activities:

- Beam Summit 2023 (https://beamsummit.org/) was just held June 13-15. The
 Summit reached 500+ registrations. This is the 6th iteration of this annual
 community-organized summit.
- Interactive Beam Playground updated with lots of new features
 (https://lists.apache.org/thread/15phr0h5q007pjgfotwqcvdr7hyotks1).
- Google Cloud Skills Boost launched a Beam "quest"
 (https://www.cloudskillsboost.google/quests/310), paid educational content
  where users earn a completion "badge".

Some technical highlights relevant to community development:

- Python 3.11 support added
- Flink 1.16.x support added
- The Go SDK is now very nearly at feature parity with Python and Java. The
 gaps are small enough that they are more than compensated for by the
 qualitative differences between the SDKs, and the existing gaps and bugs in
 each. It has arrived!
- "Experimental" annotation cleanup: the annotation and concept have been
  removed from Beam to avoid the misperception of code as "not ready". They
  were there to signal that something might change or disappear, but in
  practice we rarely did so and we almost always neglected to "graduate"
  features from this status. We will just make case-by-case judgment.
- A new local Beam runner called the "Prism Runner" is authored in Go and
 poised to become the definitive local portable runner, serving as a proper
 reference for the Beam model. Rapid developments in Beam have resulted in a
 major gap in this area, with no runner supporting every corner of the model
 so users could reliably test their work prior to running on a cloud service.

## Community Health:

Traffic on various communication channels is roughly stable. The number and
variety of attendees at the Beam Summit shows a very healthy diversity of
stakeholders in Beam and interest in the project's development. There were
many talks of unexpected and interesting work that took place outside the
project's communication channels and code repository. Creating an ecosystem
bigger than itself is a good sign, but also some of these may be opportunities
to invite work to merge into Beam itself to grow committers and PMC members
from those stakeholders.

22 Mar 2023 [Kenneth Knowles / Rich]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (6 years ago)
There are currently 90 committers and 24 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:2.

Community changes, past quarter:
- Jan Lukavský was added to the PMC on 2023-02-14
- No new committers. Last addition was Yi Hu on 2022-11-05.

## Project Activity:
- Google Summer of Code processes are kicked off, and there are a few project
proposals.
- Beam Summit 2023 will be held in New York, and the CFP recently closed.

Recent releases:
- 2.46.0 was released on 2023-03-10
- 2.45.0 was released on 2023-02-15

## Community Health:
There is an across-the-board 20-30% reduction in many measures of
community health traffic. This is not a cause for concern (yet). One
could speculate about how global events and major disruptions in
big tech could influence activity over this period.

18 Jan 2023 [Kenneth Knowles / Roman]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (6 years ago)
There are currently 90 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:2.

Community changes, past quarter:
- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
 2021-01-20.
- Ritesh Ghorse was added as committer on 2022-11-02
- Yi Hu was added as committer on 2022-11-05

## Project Activity:

Highlights of community activities:
- We started planning the Beam Summit for June 13-15th, 2023 in NYC.
- New webpage on ML/RunInference
- Java Multi-language pipelines support including support for using Python
 RunInference from Java SDK.
- We have been using GitHub Issues for a while now, and community response is
 overall positive.
- There is a trend of many IO connectors being made into "schema transforms"
 which means they have a cross-language schema and become more language-SDK
 agnostic. It signals the continued trend of Beam as a language-independent
 framework to be used with any big data processing engine.


Recent releases:

- 2.44.0 was released on 2023-01-13.
- 2.43.0 was released on 2022-11-17.
- 2.42.0 was released on 2022-10-16.

## Community Health:
The community on the mailing list and in code seems to be holding steady. We
have had departures of very active committers but also addition of new ones.
If you aren't growing you are shrinking! Change is the only constant! :-)

21 Dec 2022 [Kenneth Knowles / Roman]

No report was submitted.

21 Sep 2022 [Kenneth Knowles / Bertrand]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (6 years ago)
There are currently 88 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:2.

Community changes, past quarter:
- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
  2021-01-20.
- John Casey was added as committer on 2022-07-27
- Steve Niemitz was added as committer on 2022-07-19

## Project Activity:

### Recent releases

- 2.41.0 was released on 2022-08-23.

### Integrations and deprecations

- Support for Spark 2.4.x is deprecated and will be dropped with the release
  of Beam 2.44.0 or soon after.
- The modules amazon-web-services and kinesis for AWS Java SDK v1 are
  deprecated in favor of amazon-web-services2 and will be eventually removed
  after a few Beam releases

### Events

 - Beam Summit held as a hybrid event in Austin, TX, USA with about 200
   in-person attendees and 2000+ online attendees.

## Community Health:
Issues, pull requests, and the dev list are all holding at roughly the same
activity level.

We have additional metrics at https://metrics.beam.apache.org/d/code_velocity/
which show some improvements in "time to first response" on pull requests.

20 Jul 2022 [Kenneth Knowles / Christofer]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (6 years ago)
There are currently 86 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 9:2.

Community changes, past quarter:
- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
  2021-01-20.
- Danny McCormick was added as committer on 2022-06-17
- Jack McCluskey was added as committer on 2022-06-17
- Ke Wu was added as committer on 2022-05-27

## Project Activity:

### Recent releases:

- 2.40.0 was released on 2022-06-28.
- 2.39.0 was released on 2022-05-26.
- 2.38.0 was released on 2022-04-20.

We continue to average one release every six weeks, per our intention. To keep
the number of changes in each release roughly comparable, we cut a branch
every six weeks regardless of how long it takes to finalize each release.
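
As a rough illustration of the fixed-cadence policy described above, the following Python sketch computes branch cut dates spaced exactly six weeks apart, independent of when each release is finalized. The starting date and the function name `branch_cut_dates` are hypothetical and are not taken from Beam's release tooling.

```python
from datetime import date, timedelta
from typing import List

# Six-week cadence between release branch cuts, as stated in the report.
CADENCE = timedelta(weeks=6)


def branch_cut_dates(first_cut: date, count: int) -> List[date]:
    """Return `count` branch cut dates, each six weeks after the previous one.

    The schedule depends only on the first cut date, so a release that takes
    longer to finalize does not shift any later cut.
    """
    return [first_cut + i * CADENCE for i in range(count)]


# Hypothetical first cut date, chosen only for illustration.
cuts = branch_cut_dates(date(2022, 1, 5), 4)
for cut in cuts:
    print(cut.isoformat())
```

Because every cut date is derived from the first one rather than from the previous release date, the number of changes accumulated per branch stays roughly comparable even when finalization times vary.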

### Misc Highlights

 - We have completed our migration from Jira to GitHub Issues.
 - The TypeScript SDK has been merged to the repository.
 - New RunInference API, a framework-agnostic transform for ML inference,
   supporting PyTorch and scikit-learn.
 - Beam Summit will take place July 18-20. There are 180 in-person
   registrations and 2100 online registrations.

### Go SDK

 - Go 1.18 required to support generics.
 - Watermark estimation supported.
 - Pipeline drain support added.
 - Generic function registration for optimizing DoFn execution.
 - TextIO moved to Splittable DoFn.
 - User can author self-checkpointing Splittable DoFns to read from
   streaming sources.

### Runner ecosystem

 - Flink 1.14.x support added.
 - Beam 2.38.0 will be the last minor release to support Flink 1.11
 - Scala 2.12 support added for Flink,
   because most of the libraries support version 2.12 onwards.
 - Support for Spark 2.4.x is deprecated and will be dropped with the release
   of Beam 2.44.0 or soon after.
 - Interactive Beam supports remotely executing Flink pipelines on
   Google Cloud Dataproc via JupyterLab extension.
 - Support for impersonation credentials added to the Dataflow runner in the
   Java and Python SDKs.
 - Two new Python-native runners proposed and under way! Dask and Ray.

### IO ecosystem and other integrations

 - Significant work on CdapIO which gives access to a whole ecosystem
   of connectors maintained by CDAP.
 - Support for Elasticsearch 8.x
 - Upgrade to ZetaSQL 2022.04.1
 - More IO standard documents proposed and reviewed
   (https://s.apache.org/beam-io-api-standard-documentation &
   https://s.apache.org/beam-io-api-standard).
 - A new IO for Neo4j graph databases was added with the ability to update
   nodes and relationships using UNWIND statements and to read data using
   cypher statements with parameters.
 - Connectors for AWS v2 APIs reached parity, and additionally gained support
   for Kinesis writes and sharded record aggregation, plus fixes to
   connectors for S3, DynamoDB, and SQS. The previous
   modules for AWS v1 and Kinesis are now deprecated.
 - The march of progress toward eliminating null pointer exceptions in all
   code has proceeded to include much of KafkaIO, BigQueryIO, and the core
   SDK, but not before high profile NPEs had caused major user problems.
 - ExternalPythonTransform API added for easily invoking Python transforms
   from Java. Previously, multi-language pipelines were focused on making
   mature Java connectors available to other languages. This one, conversely,
   makes ML and scientific transforms available in Java.
 - JmsIO gains dynamic writes and more flexible input handling.
 - Upgraded to Hive 3.1.3 for HCatalogIO. Users can still provide their own
   version of Hive.
 - Implemented Apache PulsarIO.

### Other developments worth noting

 - Added an early projection pushdown optimizer to the Java SDK. It is
   somewhat limited in which pipelines it applies to, but it proves the
   concept.
 - Pandas compatibility continues to improve, specifically adding
   unstack, stack, and pivot.

## Community Health:

Both dev@ and user@ traffic declined during this period, but the community is
still very active. The overall throughput of pull requests is nearly
identical.

With the Go SDK reaching maturity we are somewhat hopeful that we will reach
a new community of users that previously had nothing comparable available
to them.

15 Jun 2022 [Kenneth Knowles / Bertrand]

No report was submitted.

16 Mar 2022 [Kenneth Knowles / Willem]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (5 years ago)
There are currently 82 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:1.

Community changes, past quarter:

- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
  2021-01-20.
- Kiley Sok was added as committer on 2022-01-27
- Moritz Mack was added as committer on 2022-03-04

## Project Activity:

### Recent releases

 - 2.37.0 was released on 2022-03-04.
 - 2.36.0 was released on 2022-02-07.
 - 2.35.0 was released on 2021-12-29.

We continue to average one release every six weeks, per our intention. To keep
the number of changes in each release roughly comparable, we cut a branch
every six weeks regardless of how long it takes to finalize each release.

### log4j

While the core of Apache Beam does not depend on log4j, many transitive
dependencies do, since Beam integrates with "every" storage system, at least
in theory. Our community really came together around this very quickly.

 - Upgraded test setup to test non-vulnerable recent versions of log4j2
 - Upgrade transitive dependencies to non-vulnerable versions
 - Upgrade to Gradle 7, a multi-week, multi-person effort
   (https://lists.apache.org/thread/ovn4f7ymg6dcy1yn7pdljh4v094yjyrg).

### Ecosystems

 - Added Java 17 support and testing
 - Added Python 3.9 support and testing
 - Added pandas 1.4 support and testing

Previously such changes were difficult one-off endeavors, but they are
becoming part of the project's routine now.

### Multi-language

Parity across languages continues to improve, with Go and Python adding more
core model features to match our first language, Java.

Most notably, though, Beam's central technology - a language-and-engine
agnostic model of big data computation - has yielded rapid progress in
multiple arenas.

 - Go SDK connectors. By leveraging the existing Java-based connectors, Go SDK
   gained access to JDBC, Debezium, SQL, BigQuery, and Kafka.
 - TypeScript / JavaScript SDK! At a new-year hackathon, about a half dozen
   contributors (a few experienced and the rest quite new) built
   a working SDK for TypeScript in a single week!
   (https://lists.apache.org/thread/orxnz7p8mg22ys92dbo034g9335oc2sl)

### Ease of onboarding

 - Beam Playground, a new online interface for getting to know Beam
   (https://lists.apache.org/thread/r088lzjnk4khfrcp8m0q1oymw1mmtmo0)
 - Starter repositories, template repositories instead of maven archetypes or
   less-discoverable subdirectories of our main repo
   (https://lists.apache.org/thread/x16ykz3lrtc48sgo4m7sxgjlyp1y1ffl)

### Other notable developments and discussions

 - IO standards for APIs, testing, and documentation. This should help the
   community and software grow while having some regularity and reliability
   for users
   (https://lists.apache.org/thread/pl13km8y6xo448q9jbrftqblodks831w)
 - Kafka Streams runner proposed
   (https://lists.apache.org/thread/sp9yvbxyfn4mrbmj91d2trhk8hs7ln7n)
 - Automated reviewer assignment. Like many OSS projects, we have a review
   latency and backlog problem. Previous attempts at this were not successful,
   but we still need to keep trying things to solve the problem.
   (https://lists.apache.org/thread/6xg35sw72k8k1rj4od86q9wrsol8p7dc)
 - Migrating from Jira to GitHub issues now has consensus and is in the
   planning stage
   (https://lists.apache.org/thread/zh2t7ql83z45syqj4yd75dgstlo14nmp)

### Beam Summit

Beam Summit 2022 is accepting submissions:
https://lists.apache.org/thread/js0vfljlkvs9l1k1knpwsbxw8obsl56f

## Community Health:

We have seen an influx of new contributors, with a lot of new ideas mentioned
in "project activity". A lot of the ideas are specifically around improving
community health.

Mailing lists were a bit quieter this round, but probably just due to the
winter holidays. There has been very good transparency and discussion on the
lists on all major developments, and there have been quite a few of them.

15 Dec 2021 [Kenneth Knowles / Roy]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (5 years ago)
There are currently 80 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:1.

Community changes, past quarter:
- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
 2021-01-20.
- No new committers. Last addition was Emily Ye on 2021-07-22.

## Project Activity:
Releases:
- 2.34.0 was released on 2021-11-11

Notable technical developments:
- The Beam Java API for inserting SQL into a pipeline is no longer
 "experimental". This has been available for users for many years, but this
 represents a declaration of confidence to our users.
- New support for `pip install apache-beam[dataframe]` to track the pandas
 versions that we have compatibility with.
- Experimental support for the new BigQuery Storage Read API, which should
 be a simpler and more efficient choice for many use cases.

Detailed technical change log at
https://github.com/apache/beam/blob/master/CHANGES.md

Notable discussions:
- There seems to be consensus to migrate from Jira to GitHub Issues,
 with the primary goal being familiarity for new and/or casual
 contributors. The technical effort involved is not yet clear. [issues]
- An update to schema-aware transforms, to use this system for
 even more of Beam. These are transforms that have a known schema
 for their configuration parameters and also have schemas for their
 input and output (vs just passing blobs of bytes). The increased
 development and adoption of this should be good for debugging
 and performance. [schema]

[issues] https://lists.apache.org/thread/q5nbwxqvfkzlz664c4kchzkbj26c3r89
[schema] https://lists.apache.org/thread/8yxt3bo5h6xs4vqhvch7mrpln04sjtqj

## Community Health:
Community metrics show nothing remarkable. A typical dip in the later part of
the year and otherwise largely stable.

17 Nov 2021 [Kenneth Knowles / Sharan]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (5 years ago)
There are currently 80 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 4:1.

Community changes, past quarter:
- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
 2021-01-20.
- No new committers. Last addition was Emily Ye on 2021-07-22.

## Project Activity:

Recent releases:
- 2.33.0 was released on 2021-10-07 (6 weeks from 2.32.0)
- 2.32.0 was released on 2021-08-26 (7 weeks from 2.31.0)
- 2.31.0 was released on 2021-07-08 (4 weeks from 2.30.0)

Notable developments:
- Beam's Go SDK exits "experimental" status, bringing in a third ecosystem
 and community! https://beam.apache.org/blog/go-sdk-release/
- Beam's Dataframe API (mentioned last report) also has graduated out of
 "experimental" status.
- Beam Summit was held online August 4-6, 2021. https://2021.beamsummit.org/
 850 live attendees from 50+ countries. 4.58/5 Average Event Rating

Interesting functional improvements to Beam:
- Initial support for pushing projections into sources when programming
 using Beam's schema-driven transforms, for some big performance gains
- Google Cloud Firestore connector
- Beam SQL supports `CREATE FUNCTION` syntax from Calcite
- New append-only variant of ElasticSearch sink
- Partitioned reads over JDBC
- Improved Beam schema / Avro schema / JDBC schema interoperability

## Community Health:
Community metrics are about the same, in terms of dev list, user list, GitHub
pull requests, and Jira.

There is a statistical uptick in emails to the dev list, but this is due to
automated alerts about high priority issues in Jira. There does seem to be a
major increase in Jira issues closed, but I think this is due to clean up.

20 Oct 2021 [Kenneth Knowles / Sharan]

No report was submitted.

15 Sep 2021 [Kenneth Knowles / Sheng]

No report was submitted.

16 Jun 2021 [Kenneth Knowles / Justin]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (4 years ago)
There are currently 79 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 8:2.

Community changes, past quarter:
- No new PMC members. Last addition was Chamikara Madhusanka Jayalath on
 2021-01-20.
- Ning was added as committer on 2021-03-18
- Tomo Suzuki was added as committer on 2021-03-31
- Yichi Zhang was added as committer on 2021-04-13

## Project Activity:

Recent releases (we start release process every 6 weeks):
- 2.30.0 was released on 2021-06-08 (6 weeks since 2.29.0).
- 2.29.0 was released on 2021-04-27 (9 weeks since 2.28.0).

Maintenance work on runners and Java ecosystem:
- Drop support for Flink 1.10.
- Spark Classic and Portable runners officially support Spark 3.
- Official Java 11 support for most runners (Dataflow, Flink, Spark).

Some new features integrating with other notable projects:
- Pandas-compatible DataFrame API: Added support for collecting DataFrame
 objects in interactive Beam. Interactive Beam is how one uses Beam in a
 notebook, so there is a good synergy.
- DebeziumIO cross-language wrapper for Python.

Misc work worth noting:
- New contributor flow improvements (CI, documentation, automation).

Issue management:

We were starting to develop a large backlog of "P1" issues. This priority is
reserved for critical issues, including failing tests that obscure visibility
into health of the code [beam-jira-priorities]. This backlog was largely
invisible to the broader community. To increase awareness and voluntary
activity around this we started automated daily emails (our policy is
continuous updates on P1s after all) listing and linking to all of them. It
seems to have helped somewhat for both flakes [beam-flake-trend] and non-flake
P1s [beam-p1-trend] but there is more to do. Another backlog that most
community members ignore is untriaged issues, which you could view as a
contrast (no email, significant backlog growth) [beam-triage-trend].

[beam-jira-priorities] https://beam.apache.org/contribute/jira-priorities/
[beam-flake-trend] https://s.apache.org/beam-flake-trend
[beam-p1-trend] https://s.apache.org/beam-p1-trend
[beam-triage-trend] https://s.apache.org/beam-triage-trend

## Community Health:
Community health metrics are remarkably stable again. Exactly the same number
of code contributors and closed PRs as last quarter. Almost the same
number of Jiras were opened and closed. Mailing list traffic is up, but that
fluctuates a lot anyhow, and we added two daily emails which account for a
significant fraction.

17 Mar 2021 [Kenneth Knowles / Bertrand]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention at this time.

## Membership Data:
Apache Beam was founded 2016-12-20 (4 years ago)
There are currently 76 committers and 23 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:2.

Community changes, past quarter:
- Chamikara Madhusanka Jayalath was added to the PMC on 2021-01-20
- Piotr Szuberski was added as committer on 2021-01-19

## Project Activity:

Recent releases:
- 2.28.0 was released on 2021-02-22.
- 2.27.0 was released on 2021-01-08.
- 2.26.0 was released on 2020-12-11.

Technical improvements are steady. There are no disruptive technical changes
to mention, just healthy enhancements and bugfixes for a variety of modules:
ParquetIO, BigQueryIO, PubsubIO, SQL. [changes]

Highlights of community activities:

- A new design for the website is now done and deployed [website].
- Beam College [college] starts on April 7th. This is a training event
 offering five single-day sessions for Beam users to go deep on topics.
- Beam Summit 2021 planning has begun [summit].

[changes] https://github.com/apache/beam/blob/master/CHANGES.md
[website] https://s.apache.org/7rr5d
[college] http://beamcollege.dev/
[summit] https://s.apache.org/8cp13

## Community Health:
Community health metrics are remarkably stable across all mailing lists and
code review. I will mention one specific statistic so that no one is
concerned. In the data, we see "300 issues closed in JIRA, past quarter (782%
increase)" but this is due to breakage and subsequent repair of our Jira
workflows.

20 Jan 2021 [Kenneth Knowles / Roy]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (4 years ago)
There are currently 75 committers and 22 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:2.

Community changes, past quarter:

- No new PMC members. Last addition was Alexey Romanenko on 2020-06-11.
- No new committers. Last addition was Heejong Lee on 2020-09-03.

The cause of the stall is simply lowered PMC activity. It has been noted
by the PMC and we are getting it moving again.

## Project Activity:

In the core model there is big news: "Splittable DoFn" is now the default
recommended way to write new data connectors. In simple terms: data sources
are now dynamic. Previously, data connectors were a root of the computation
graph (no inputs) and you said what you wanted to read before you started your
job. Now data connectors take their input specification at runtime. This opens
up a whole new realm of data processing, as you can take a "big data" number
of Kafka topics or HDFS paths on input and read from all of them, and the
rest of the Beam model "just works" with this (including unification of bounded
and unbounded data and watermarks, etc).
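
The idea that a source description arrives as ordinary data at runtime, and
that the work of reading it can be divided, can be sketched in plain Python.
This is a toy model only and does not use Beam's API: `OffsetRange`,
`read_source`, and the `(name, record_count)` source tuples are hypothetical
names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class OffsetRange:
    """Hypothetical restriction: a half-open range [start, stop) of offsets."""
    start: int
    stop: int

    def split(self, parts: int) -> List["OffsetRange"]:
        """Divide the range into at most `parts` contiguous sub-ranges."""
        size = self.stop - self.start
        step = max(1, -(-size // parts))  # ceiling division, at least 1
        return [OffsetRange(lo, min(lo + step, self.stop))
                for lo in range(self.start, self.stop, step)]


def read_source(source: Tuple[str, int], parts: int = 2) -> Iterator[str]:
    """Toy splittable read: the source to read is itself an input element."""
    name, num_records = source
    for sub in OffsetRange(0, num_records).split(parts):
        # Each sub-range could be handed to a different worker.
        for offset in range(sub.start, sub.stop):
            yield f"{name}[{offset}]"


# The list of sources is data computed at runtime, not fixed up front.
sources = [("topic-a", 3), ("topic-b", 2)]
records = [r for s in sources for r in read_source(s)]
print(records)
```

In this sketch the list of sources could just as well be the output of an
earlier processing step, which is the property the paragraph above describes.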

In the Python realm:
- Python 2 and Python 3.5 support dropped!
- Performance-driven type checking added (opt-in) [pytypes].
- An exciting new avenue for users is a Pandas-compatible API. The goal is
exact compatibility. To that end, we are running Pandas' own test suite
against the Beam module.
- Beam's cross-language capabilities continue to expand: the Java-based
KinesisIO and SnowflakeIO are available for Beam Python users.

In the Java realm:
- Java 11 is officially supported and tested. Users are invited to use
 Java 11.
- We have started to develop BOMs that simplify dependency management for
 users who have committed to a particular ecosystem (where "ecosystem" is
 deliberately undefined and user demand can drive new BOMs being made).
- Our Hadoop connectors are now tested against Hadoop 3.

In the SQL realm: a bunch more connector capabilities:
- Avro, JSON, and Protobuf over Kafka
- Avro over Pubsub
- Bigtable connector
- Thrift format support

For the Flink runner there is a major change in the works: it was cloning
every item of data needlessly. This was noticed, diagnosed, and fixed, reducing
some pipeline runtimes by 80%.

For the Dataflow runner there is a major migration happening: "Dataflow V2"
is going more "all in" on Beam. Rather than translating Beam's pipeline model
to the Dataflow API it is using Beam's model directly. This also enables
cross-language pipelines and users to have simplified custom containers for
their UDFs. FlinkRunner and SparkRunner already had "portable" variants, and
this is the "portable" variant of Dataflow. (the term "portable" refers to using
Beam's new "portability" APIs that allow all the language-agnostic goodness).

Recent releases (we have a target cadence of 6 weeks):

- 2.27.0 was released on 2021-01-08.
- 2.26.0 was released on 2020-12-11.
- 2.25.0 was released on 2020-10-23.

[pytypes] https://beam.apache.org/blog/python-performance-runtime-type-checking/


## Community Health:

There is an overall trend of reduced activity. The variance in usual quarters
is pretty high, but I would guess the pandemic has had a significant effect.

Verbatim stats, for reference:

- dev@beam.apache.org had a 21% decrease in traffic in the past quarter (811
 emails compared to 1017)
- github@beam.apache.org had a 37% decrease in traffic in the past quarter
 (7584 emails compared to 11968)
- issues@beam.apache.org had a 39% decrease in traffic in the past quarter
 (13471 emails compared to 21750)
- 565 issues opened in JIRA, past quarter (3% increase)
- 697 commits in the past quarter (-37% decrease)
- 120 code contributors in the past quarter (-27% decrease)
- 610 PRs opened on GitHub, past quarter (-27% decrease)
- 586 PRs closed on GitHub, past quarter (-30% decrease)
- 114 issues closed in JIRA, past quarter (570% increase)

16 Dec 2020 [Kenneth Knowles / Shane]

No report was submitted.

16 Sep 2020 [Kenneth Knowles / Patricia]

## Description:

The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:

There are no issues requiring board attention.

## Membership Data:

Apache Beam was founded 2016-12-20 (3.75 years ago)
There are currently 75 committers and 22 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:2.

Community changes, past quarter:
- Alexey Romanenko was added to the PMC on 2020-06-11
- Aizhamal Nurmamat kyzy was added as committer on 2020-06-17
- Austin Bennett was added as committer on 2020-06-22
- Heejong Lee was added as committer on 2020-09-03
- Reza Ardeshir Rokni was added as committer on 2020-08-17

## Project Activity:

- Beam 2.24.0 is the last release with Python 2 and Python 3.5 support.
- “Cross-language transforms” continue to grow: JdbcIO (Java-based) now
  available to Beam Python users.
- Twister2 runner is merged
- Python 3.8 support added
- More effort on Splunk, Snowflake, and Google Healthcare API integrations

Recent releases:

- 2.23.0 was released on 2020-07-29.
- 2.22.0 was released on 2020-06-08.
- 2.21.0 was released on 2020-05-27.

## Community Health:

Community metrics steady:
 - Mailing list activity about the same on user@ (about 500) and
   dev@ (between 1000 and 1500)
 - Pull request open and close rate about the same (just under 800). The fact
   that equal numbers of pull requests are opened and closed is nice.

The board may enjoy this basic analysis [1] of code contributors presented at
the Beam Summit. Highlighted points:

 - Each release has a bit under 100 unique contributors (steady for a long
   time)
 - About 20 of which are new each time (which means 20 depart as well)
 - This is explained because the majority of Beam's contributors have under
   10 commits total, perhaps mostly "scratching an itch" by fixing a one-off
   issue.

[1] https://s.apache.org/beam-summit-code-contributor-analysis

17 Jun 2020 [Kenneth Knowles / Sander]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (3 years ago)
There are currently 71 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:2.

Community changes, past quarter:

- No new PMC members. Last addition was Pablo Estrada on 2019-05-13.
- Robin Qiu was added as committer on 2020-05-18

## Project Activity:

Some updates:

- Last report, the community had received drafts of our new mascot, the
 firefly. Now the final drafts are done and committed to the website.
 [mascot]
- Last report, website migration to Hugo/docsy was just beginning. It is now
 complete, ready for i18n.
- Last report, we were working on moving to a dedicated Jenkins instance. That
 stalled for a bit, but in the last couple of days moved rapidly and is
 almost done.

[mascot] https://beam.apache.org/community/mascot/

Other milestones:

- As of April, docker containers that Beam releases adhere to the guidance of
 LEGAL-503. [LEGAL-503]
- Beam's "cross-language" features are maturing, with a focus on making Beam
 Java features available for Beam Python. This is not just bridging
 languages, but communities/ecosystems. Only some runners support executing
 such a pipeline for now; as each one fully migrates to Beam's
 "portability framework" this will be enabled. What can Pythonistas use on
  enabled runners now?
   - SQL (Java-based, built on Apache Calcite)
   - KafkaIO (a connector to Apache Kafka authored in Java)
- New IO Connectors: Beam now has IO connectors for Snowflake and Google
 Healthcare APIs.

[LEGAL-503] https://issues.apache.org/jira/browse/LEGAL-503

Some work on project health, removing things we don't want/need to maintain:

- Following Gearpump retiring from the incubator, the Gearpump runner was
 removed.
- Following Apex moving to the attic, the Apex runner will be removed.

Other highlighted activity:

- The community discussed how many / which Python 3.x versions Beam should
 support concurrently, with the conclusion that 3.5 and 3.7 were highest
 priority. [py3]
- A Beam "fixit" week was proposed. Contributors would add testing /
 reliability / quality related Jiras to a label `beam-fixit` and we could fix
 some. 105 Jiras were added to the label, and about 30 were fixed.
- Beam has 3 Google Summer of Code students working on Beam SQL and Beam
 Python
- Beam was accepted to Google Season of Docs program. Currently the project is
 accepting proposals for tech writers to improve documentation.
- We moved to our own Jira priority scheme, strictly numerical (P0, P1, etc)
 with tooltips we authored, and explanations on the Beam site
 [jira-priorities]. This reduced friction for users and release managers,
 since Jira's built-in priorities like "Blocker", "Critical", "Major" were
 ambiguous and caused confusion.
- We activated Jira automation to:
 - Unassign issues that were likely forgotten. This identified many issues
   that could be picked up by new contributors, and many that could be
   closed.
 - Lower priority from P2 ("default") to P3 ("nice to have") for unassigned
   issues that were very old, to match how they were prioritized in practice.
   This identified many issues at the wrong priority, and also prompted
   discussions between users and Beam developers.

[py3] https://s.apache.org/beam-py3-discussion
[jira-priorities] https://beam.apache.org/contribute/jira-priorities/

In the current pandemic situation, conferences have been altered or canceled,
but we have activity to note:

- "Distributed Processing for Machine Learning Production Pipelines" presented
 at Flink Forward Virtual 2020 (https://www.youtube.com/embed/jV1WFTmm4qg)
- Beam Summit organizers committed to working more transparently with the
 community [beam-summit-transparency]. In June we have received weekly status
 reports. [beam-summit-20200603] [beam-summit-20200610]
- Beam Summit 2020 rescheduled and converted to Beam Digital Summit.
 [beam-digital-summit]
- Organized a May digital learning month, to keep community engaged with
 weekly talks during COVID19. Hosted 4 webinars introducing different
 features of Apache Beam. Received 100~200 viewers on average for each
 webinar.

[beam-summit-transparency] https://s.apache.org/beam-summit-transparency
[beam-summit-20200603] https://s.apache.org/beam-summit-20200603
[beam-summit-20200610] https://s.apache.org/beam-summit-20200610
[beam-digital-summit] https://s.apache.org/beam-digital-summit

Recent releases:

- 2.22.0 was released on 2020-06-08.
- 2.21.0 was released on 2020-05-27.
- 2.20.0 was released on 2020-04-15.


## Community Health:
- Traffic on dev@ was about the same, PRs and commits about the same.
- Traffic on user@ was double, seemingly a real and sustained increase.

15 Apr 2020 [Kenneth Knowles / Niclas]

## Description:

The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:

There are no issues requiring board attention.

## Membership Data:

Apache Beam was founded 2016-12-20 (3 years ago)
There are currently 70 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:2.

Community changes, past quarter:

- No new PMC members. Last addition was Pablo Estrada on 2019-05-13.
- Alex Van Boxel was added as committer on 2020-02-03
- Chad Dombrova was added as committer on 2020-02-20
- Hannah Jiang was added as committer on 2020-02-03
- Jincheng Sun was added as committer on 2020-02-18
- Kamil Wasilewski was added as committer on 2020-02-27
- Katarzyna Kucharczyk was added as committer on 2019-12-19
- Michał Walenia was added as committer on 2020-01-25

## Project Activity:

- Improvements and fixes on several IOs have been done or on the way (updated
  ElasticsearchIO, JmsIO new message types support, …)
- Beam now has an official Beam Improvement Proposal (BIP) process [bip] and a
  first BIP [bip1]. This gives a clear way for people to propose
  enhancements, followed by an official voting process. We are looking forward
  to evolving the process as we gain experience, in order to be more clear
  about the status of proposals, for the Beam dev community and also broader
  community including users.
- Last report, the community had chosen the firefly as mascot. Now we have
  received some draft artwork from a vendor.
- Lots of activity around Google Summer of Code projects. Beam docker images
  are now transitioned to the apache org (off the apachebeam). [docker]
- A draft communications strategy makes for really interesting reading about
  outreach and awareness. [comms]
- The new twister2 runner is approaching merge. [twister2]
- Website transition to docsy has an update that it is beginning shortly.
   [docsy]
- Starting with Beam 2.21.0 support for Flink 1.7 will be removed. [flink17]
- Starting with Beam 2.21.0 support for Flink 1.10 has been added. [flink110]
- Starting with the approach to the 2.20.0 release, Beam has adopted a
  CHANGES.md file to track and draft release notes. It should also help make
  it easier to have informative board reports.
- We still have not got the isolated Jenkins instance finished, which would
  allow precommits to run on pull requests from untrusted parties.

[bip] https://s.apache.org/iwaoz
[bip1] https://s.apache.org/yo5zh
[docker] https://s.apache.org/y5cmf
[twister2] https://s.apache.org/iwuuw
[docsy] https://s.apache.org/j5nds
[comms] https://s.apache.org/ccqs8
[flink17] https://s.apache.org/8dky5
[flink110] https://issues.apache.org/jira/browse/BEAM-9295

Recent releases:

- 2.19.0 was released on 2020-02-03.
- 2.18.0 was released on 2020-01-23.
- 2.17.0 was released on 2020-01-06.

## Community Health:

Busiest email thread: this thread has been re-used over time for people to
request a committer to trigger testing on the PR

 - dev@beam.apache.org Jenkins jobs not running for my PR 10438 (98 emails)

The traffic on builds@ indicates a lot more test failures on master. It does
mean the community needs to communicate and come together around test health.

- builds@beam.apache.org had a 56% increase in traffic in the past quarter
  (11442 emails compared to 7317):

Traffic on dev@, issues@, and user@ doesn't show interesting changes.

JIRA continues to grow faster than it shrinks, in a healthy ratio:

- 514 issues opened in JIRA, past quarter (-29% decrease)
- 398 issues closed in JIRA, past quarter (4% increase)

Could we possibly be catching up on PRs? Let's wait and see. We are down to
118 open at the time of this writing (we were previously on the move from
about 100 open all the time up to 150+ open all the time).

GitHub PR activity:

- 720 PRs opened on GitHub, past quarter (-11% decrease)
- 745 PRs closed on GitHub, past quarter (-6% decrease)

18 Mar 2020 [Kenneth Knowles / Rich]

No report was submitted.

@Rich: pursue a report for Beam

18 Dec 2019 [Kenneth Knowles / Dave]

## Description:
The mission of Apache Beam is the creation and maintenance of software related
to a unified programming model for both batch and streaming data processing,
enabling efficient execution across diverse distributed execution engines and
providing extensibility points for connecting to different technologies and
user communities.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache Beam was founded 2016-12-20 (3 years ago)
There are currently 63 committers and 21 PMC members in this project.
The Committer-to-PMC ratio is 3:1.

Community changes, past quarter:
- No new PMC members. Last addition was Pablo Estrada on 2019-05-13.
- Alan Myrvold was added as committer on 2019-09-24
- Brian Hulette was added as committer on 2019-11-14
- Daniel Oliveira was added as committer on 2019-11-20

## Project Activity:
We crossed 10,000 pull requests! That is just a cumulative milestone, but this
quarter alone was extremely active. Number of days between PR #1 and PR #1000:
211. Number of days between PR #9000 and PR #10000: 71.
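
To put the two figures on a common scale, they can be converted to pull
requests per day. The sketch below is only arithmetic on the numbers quoted
above; the helper name `prs_per_day` is made up for illustration, and it
assumes PR numbers are assigned consecutively.

```python
def prs_per_day(first_pr: int, last_pr: int, days: int) -> float:
    """Average number of PRs opened per day between two PR numbers."""
    return (last_pr - first_pr) / days


early_rate = prs_per_day(1, 1000, 211)      # first thousand PRs
recent_rate = prs_per_day(9000, 10000, 71)  # most recent thousand PRs
speedup = recent_rate / early_rate

print(f"early:  {early_rate:.1f} PRs/day")
print(f"recent: {recent_rate:.1f} PRs/day")
print(f"ratio:  {speedup:.1f}x")
```

Under that assumption the rate rose from roughly 4.7 to roughly 14.1 PRs per
day, about a threefold increase.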

Some notable technical developments, focusing mostly on integrations:
 - A new experimental Spark runner based on Spark structured streaming
   framework is available on master for testing. To fully support the Beam
   model, it will require Spark structured streaming to support multiple
   aggregations, but it can be tested for batch jobs in the meantime. [1]
 - A new Jupyter notebook integration, dubbed "interactive beam", was proposed
   and implemented, and remains under heavy development. [2]
 - Portability continues to mature, with significant use and development of
   Python on Flink, and maturation of multi-language pipelines, in which a
   "Beam Python" pipeline can include connectors and SQL from Java. The Go SDK
    intends to primarily support this mode, to avoid re-implementing any
    connectors, so this will lead to "Go on any data processing engine".
 - The beginning of transition from AWS v1 to AWS v2 [3]. Many other
   improvements across many connectors to storage systems.

[1] https://lists.apache.org/thread.html/0135c726abf454ea381c1075fe6b588b42b8e6b1e69964e749a0621d%40%3Cdev.beam.apache.org%3E
[2] https://lists.apache.org/thread.html/6ed9a410089b86c7c99a0f0ad8e9ce97b6414eb95ffb69f5a52dc0dc%40%3Cdev.beam.apache.org%3E
[3] https://lists.apache.org/thread.html/130cb60e6bcdd58c5afdd0c375663eaf05e705aab9ee0196535cd17f%40%3Cdev.beam.apache.org%3E

Some notable community resolutions and discussions:
 - Discussion of Beam Summit 2020 [4]
 - The Beam community has decided to adopt a mascot, the Firefly. [5] [6]
   Currently being designed by community members.
 - We documented Jira priority explanations and release blocking policies [7]
 - We joined a pledge on https://python3statement.org/ to discontinue Python 2
   support in 2020. [8]
 - A renewed discussion of, and interest in, communicating effectively to Beam
   users the maturity and stability of different components. [9]
 - Another renewed conversation around a more formal "BIP" (Beam Improvement
   Proposal) process, to improve clarity of approval and development of bigger
   changes. [10]
 - Our Outreachy proposals did not receive contributions in the needed
   timeframe, despite some initial interest. They may not have been
   appropriately scoped for an Outreachy internship. [11]
 - LTS (Long Term Support) version has not been very successful, with zero
   patch releases, because no one really seemed to want one. We may designate
   another one and try harder next time to really finish a patch release and
   measure its uptake. [12]

[4] https://lists.apache.org/thread.html/bd9a1cebbcc6994b0f9a5f1cdb402a19efe9c5acc54d6aa65bc671a2%40%3Cdev.beam.apache.org%3E
[5] https://lists.apache.org/thread.html/ff60eabbf8349ba6951633869000356c2c2feb48bbff187cf3c60039%40%3Cdev.beam.apache.org%3E
[6] https://lists.apache.org/thread.html/fd8146e3e79fc41e8c760924be3b29b1c5314024336f473f9f0e7723%40%3Cdev.beam.apache.org%3E
[7] https://lists.apache.org/thread.html/05fa80345f9e9ed5c9233f1dd2aa7ffbf1b5691dfeef5b449f6be338%40%3Cdev.beam.apache.org%3E
[8] https://lists.apache.org/thread.html/634f7346b607e779622d0437ed0eca783f474dea8976adf41556845b%40%3Cdev.beam.apache.org%3E
[9] https://lists.apache.org/thread.html/0f769736be1cf2fc5227f7a25dd3fdbb9296afe8a071761cb91f588a%40%3Cdev.beam.apache.org%3E
[10] https://lists.apache.org/thread.html/9236522d9006d6b8747d179bc369f5b082801e31fbecd4bdfce8f3e1%40%3Cdev.beam.apache.org%3E
[11] https://lists.apache.org/thread.html/217daec97fbcf04c71a93a2d306593f01c18f09aaad7abd69ec33eef%40%3Cdev.beam.apache.org%3E
[12] https://lists.apache.org/thread.html/100e13251b31ca601ddd53ab7e819de0960e826e96a0aece43045861%40%3Cdev.beam.apache.org%3E

A continuing pain point is our release process being slow and cumbersome. The
2.17.0 release has been underway for over 6 weeks. Such a burden is not
approachable for volunteer contributors.


## Community Health:
The community is vigorous, but the balance of activities is imperfect.

Mailing list stats are worth mentioning because dev@ traffic is getting *very*
large. It does not include Jira/Jenkins/GitHub notifications. Even some steady
contributors have indicated that they cannot really follow the dev@ list.

- dev@beam.apache.org had a 23% increase in traffic in the past quarter (1820
  emails compared to 1478)
- user@beam.apache.org had a 32% increase in traffic in the past quarter (446
  emails compared to 336)

The open PR count (steady state) has climbed from about 100 to over 150. This
is not due to more PRs being opened. The PR open rate is about the same. The
PR close rate is also about the same. I think this implies we have been
steadily falling behind further and further. Cultivating new
contributors/committers may help, as well as highlighting or trying to
incentivize code review by existing committers and also by non-committers.
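
The reasoning above can be illustrated with a small sketch: if PRs are opened
slightly faster than they are closed, the open count grows steadily even
though neither rate changes. The weekly rates and the 13-week quarter below
are hypothetical values chosen only to reproduce a climb from about 100 to
about 150; they are not taken from project data.

```python
def open_backlog(start_open, opened_per_week, closed_per_week, weeks):
    """Open PR count after `weeks`, assuming constant open and close rates."""
    return start_open + (opened_per_week - closed_per_week) * weeks


# Hypothetical rates: 60 PRs opened and 56 closed per week, over 13 weeks.
backlog_after_quarter = open_backlog(100, 60, 56, 13)
print(backlog_after_quarter)  # 100 + (60 - 56) * 13 = 152
```

Under these assumptions a gap of only four PRs per week is enough to account
for the growth, which is consistent with both rates looking "about the same".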

18 Sep 2019 [Kenneth Knowles / Ted]

## Description:
 - Apache Beam is a unified programming model for both batch and streaming
   data processing, enabling efficient execution across diverse distributed
   execution engines and providing extensibility points for connecting to
   different technologies and user communities.

## Issues:
 - There are no issues requiring board attention at this time

## Activity:

Beam Summit NA occurred at ApacheCon NA. About 80 people attended 20 talks in
2 rooms over 2 days, plus a day-long Beam introductory workshop. Good
cross-pollination of talks and audience: 4 Beam-related talks on other tracks;
2 non-Beam talks by Beam Summit organizers on the community track; Beam Summit
attendees and speakers checked out the other tracks; other ApacheCon attendees
were lured into Beam Summit :-)

Many large contributions:
 - A translator from Google's recently open sourced ZetaSQL dialect into
   Apache Calcite's relational algebra. This enables ZetaSQL as a choice of
   dialect for using Beam SQL.
 - An integration with ZetaSketch HyperLogLog algorithm, recently open sourced
   as well.
 - A new sort-merge join algorithm from Spotify. Still under discussion and
   review.

Technically, there is a noticeable trend toward schema-aware data processing,
expanding from Beam SQL to other proposals like dataframe-style APIs and
columnar processing using Apache Arrow.

A SQL collaboration: PMC members from Flink, Beam, and Calcite, together with
researchers from Oak Ridge National Lab who sit on the ISO SQL committee,
wrote a proposal for what streaming SQL should look like across their projects
and the industry as a whole, to influence streaming SQL standardization. It
was presented at the SIGMOD industry track and later at ApacheCon (and other
venues).

Other dev list discussions of interest, due to community relevance or
integrations with other Apache projects:
 - A new integration - Ananas Analytics Desktop, a GUI for building pipelines.
   Built without Beam's involvement; a sign of relevance and accessibility.
   [0]
 - How to support users best, across user@, StackOverflow, and Slack: Covered
   tradeoffs between synchronous vs asynchronous, mostly. Did not get into
   ASF-hosted vs third party, nor was licensing of code snippets discussed.
   [1]
 - Which Flink versions to support and how best to support multiple versions.
   [2]
 - One Google Summer of Code project wrapped up, on optimized inserts to
   BigQuery. [3]
 - We have improved issue triage significantly. We added a default Jira status
   of "Needs Triage" to make sure all bugs get some attention from a
   knowledgeable community member. I asked for help triaging, and the
   community collaboratively kept untriaged issues steadily under 100, for the
   first time in a long time. [4]
 - Protocols for managing Beam's social media presence were discussed more, and
   we have a system in place (after long discussion) where the community can
   contribute easily and the PMC can review and approve. [5]
 - Improvements to our release process of vendored artifacts and documentation
   of it. [6]

[0] https://lists.apache.org/thread.html/ce3a051789868e362680e358569da26711d6b513cf2396094a242230@%3Cdev.beam.apache.org%3E
[1] https://lists.apache.org/thread.html/ed90f898d571856a5b92df23150d3417732a9f5f1b4c6ff2a41bf237@%3Cdev.beam.apache.org%3E
[2] https://lists.apache.org/thread.html/124200de15d88321d590bf83be3ba0e8bdfc3a161a0bcd66a12921ed@%3Cdev.beam.apache.org%3E
[3] https://gist.github.com/ttanay/80f84b7b852e0867d5a00d3b345e1dad
[4] https://lists.apache.org/thread.html/dd0048c68c1b5511ca5a0f668a848159fd441d51c21f332b43510163@%3Cdev.beam.apache.org%3E
[5] https://lists.apache.org/thread.html/babceeb52624fd4dd129c259db8ee9017cb68cba069b68fca7480c41@%3Cdev.beam.apache.org%3E
[6] https://lists.apache.org/thread.html/e2c49a5efaee2ad416b083fbf3b9b6db60fdb04750208bfc34cecaf0@%3Cdev.beam.apache.org%3E

## Health report:

Dev and user list subscription and traffic steady.

Each release continues to include commits from 60-100 contributors.

One sign of community bonding I've noticed happily was people letting dev@
know when they were going on vacation.

## PMC changes:

 - Currently 21 PMC members.
 - No new PMC members. Last addition was Pablo Estrada on 2019-05-13.

## Committer base changes:

 - Currently 60 committers.
 - New committers:
    - Rui Wang was added as committer on 2019-07-30
    - Kyle Weaver was added as committer on 2019-08-02
    - Jan Lukavský was added as committer on 2019-07-25
    - Robert Burke was added as committer on 2019-06-28
    - Mikhail Gryzykhin was added as committer on 2019-06-16
    - Valentyn Tymofieiev was added as committer on 2019-08-09

## Releases:

 - 2.15.0 was released on 2019-08-22.
 - 2.14.0 was released on 2019-08-01.
 - 2.13.0 was released on 2019-06-04.

## Mailing list activity:

Mailing list activity does not indicate any significant change.

 - dev@beam.apache.org:
    - 627 subscribers (up 17 in the last 3 months):
    - 1549 emails sent to list (2110 in previous quarter)

 - user@beam.apache.org:
    - 641 subscribers (up 14 in the last 3 months):
    - 359 emails sent to list (416 in previous quarter)

## JIRA activity:

 - 615 JIRA tickets created in the last 3 months
 - 334 JIRA tickets closed/resolved in the last 3 months

19 Jun 2019 [Kenneth Knowles / Craig]

## Description:
 - Apache Beam is a unified programming model for both batch and streaming
   data processing, enabling efficient execution across diverse distributed
   execution engines and providing extensibility points for connecting to
   different technologies and user communities.

## Issues:
 - There are no issues requiring board attention at this time

## Activity:

 - A portable Spark runner was added, capable of running Python and Go
   pipelines in batch mode.
 - A new runner has been added based upon Hazelcast Jet, with "experimental"
   status, a signal to users that it is new and may have breaking changes
   before it becomes finalized.
 - Beam Katas (interactive programming exercises of gradually increasing
   complexity) based on JetBrains Education Products have been added to the
   project to help grow the user base. They are available for Python and
   Java.
 - Cross-language transform support for the Flink runner was added. It is now
   possible (with some effort) to build a pipeline in Python that utilizes
   transforms authored in Java.


## Health report:

Dev and user list subscription steady, but great increase in traffic on dev@.
There have been healthy discussions around technical decisions.

Each release tends to include commits from 60-100 contributors.

## PMC changes:

 - Currently 21 PMC members.
 - Pablo Estrada was added to the PMC on Mon May 13 2019

## Committer base changes:

 - Currently 54 committers.
 - New committers:
    - Boyuan Zhang was added as a committer on Tue Apr 09 2019
    - Jozef Vilcek was added as a committer on Sat Jun 08 2019
    - Udi Meiri was added as a committer on Fri May 03 2019
    - Yifan Zou was added as a committer on Mon Apr 22 2019

## Releases:

 - 2.12.0 was released on Wed Apr 24 2019
 - 2.13.0 was released on Tue Jun 04 2019

## Mailing list activity:

 - dev@beam.apache.org:
    - 627 subscribers (up 17 in the last 3 months):
    - 2110 emails sent to list (1352 in previous quarter)

 - user@beam.apache.org:
    - 641 subscribers (up 14 in the last 3 months):
    - 416 emails sent to list (383 in previous quarter)

## JIRA activity:

 - 719 JIRA tickets created in the last 3 months
 - 610 JIRA tickets closed/resolved in the last 3 months

20 Mar 2019 [Kenneth Knowles / Shane]

## Description:
Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:
There are no issues requiring board attention at this time

## Activity:
Apache Beam has a number of major technical endeavors maturing. As usual for
Beam, these include many integrations with other projects / communities:

 - Beam Python can be executed on Flink and is reported to be used in
   production
 - Beam Java on Samza is reported to be used in production, and some success
   reported running Beam Python on Samza
 - Our first release with partial Python 3 support
 - Beam's IO connector ecosystem is healthy, with additions or upgrades to
   connectivity to RabbitMQ, Redis, Spanner, Parquet, Hadoop, MongoDb, Kafka,
   BigQuery, Cassandra, JDBC. And Beam SQL has added Hive Metastore support.
   This reflects the growing diversity of users and use cases represented in
   the Beam community.

Community event activity has seen good developments and cross-community
building too:

 - A San Francisco-based Beam meetup group started and seems self-sustaining
 - Beam Python on Flink has gained interest and was presented at a Seattle
   Flink Meetup
 - Kettle on Beam was presented at a London Pentaho Meetup

Beam Summit Europe 2019 is approved, to occur June 19-20 in Berlin (convenient
to Berlin Buzzwords).

There have been three Beam Newsletters [1, 2, 3] since the last report was
authored. These are collaboratively authored - anyone can suggest content
about technical achievements, what they are working on, events, blog posts,
etc. Much of the above and other details can be found in the newsletter.

A nice community touch is that the newsletter also gathers information from
threads where people introduce themselves, so it is a place to learn/reflect
on new members of the community.

The community has recently discussed moving finalized newsletters to the blog
or creating a "News" section on the website to boost visibility of the
information and clarity around the newsletter's publication date.

[1] https://s.apache.org/beam-newsletter-2018-12
[2] https://s.apache.org/beam-newsletter-2019-01
[3] https://s.apache.org/beam-newsletter-2019-02

## Health report:

Dev and user list subscription is slightly up, while traffic is slightly down.
There's no indication of major health changes. The content of both lists
remains qualitatively about the same.

Each release tends to include commits from 60-100 contributors.

## PMC changes:

 - Currently 20 PMC members.
 - Etienne Chauchot was added to the PMC on Thu Jan 24 2019

## Committer base changes:

 - Currently 50 committers.
 - New committers:
    - Gleb Kanterov was added as a committer on Thu Jan 24 2019
    - Mark Liu was added as a committer on Fri Mar 08 2019
    - Michael Luckey was added as a committer on Fri Feb 22 2019
    - Raghu Angadi was added as a committer on Thu Mar 07 2019

## Releases:

 - 2.9.0 was released on Thu Dec 13 2018
 - 2.10.0 was released on Sun Feb 10 2019
 - 2.11.0 was released on Thu Feb 28 2019

## Mailing list activity:

 - dev@beam.apache.org:
    - 613 subscribers (up 28 in the last 3 months):
    - 1464 emails sent to list (1941 in previous quarter)


 - user@beam.apache.org:
    - 627 subscribers (up 28 in the last 3 months):
    - 391 emails sent to list (465 in previous quarter)

## JIRA activity:

 - 607 JIRA tickets created in the last 3 months
 - 462 JIRA tickets closed/resolved in the last 3 months

19 Dec 2018 [Kenneth Knowles / Shane]

## Description:
 - Apache Beam is a unified programming model for both batch and streaming
   data processing, enabling efficient execution across diverse distributed
   execution engines and providing extensibility points for connecting to
   different technologies and user communities.

## Issues:
 - There are no issues requiring board attention at this time

## Activity:
 - We are happy to welcome a new integration: A Beam runner for Apache Nemo
   (incubating) has been authored, and resides in the Nemo repository.
 - The community held a 2-day Beam Summit London in October with 80 attendees,
   mostly users. It was considered a success, and the community intends to
   hold more, likely with a bit more advance planning.

 - The project has also added a “Roadmap” to the website, to share with users
   exciting developments underway that are otherwise only discoverable on
   dev@. Based on a good discussion, it emphasizes how a roadmap for a
   community driven ASF project differs from a commercial roadmap.

 - Other recent community decisions include:
   - Releasing “vendored” artifacts as an alternative to shading, much as
     Apache Flink does.
   - Clarifying the conditions under which Beam’s “rollback first” policy
     applies. Notably, it does not apply to downstream (potentially
     non-public) integrations.
   - Sending Jira and Jenkins notifications to separate lists, issues@ and
     builds@, respectively.
   - Previously, the community agreed to establish a long-term support (LTS)
     branch. This quarter, the 2.7 minor release family was chosen for a 6
     month pilot.

 - IP clearance has been completed for:
   - Dataflow Java Worker

## Health report:
 - Notable this quarter is greatly increased attention to the website and wiki
   pages pertaining to onboarding new contributors.
 - The dev@ and user@ mailing lists continue the prior modest linear growth
   trend.
 - Email volume to dev@ has increased markedly, which is especially notable
   given that we have rerouted all automated emails to issues@beam.apache.org
   and builds@beam.apache.org.

## PMC changes:

 - Currently 19 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Thomas Weise on Fri Jun 08 2018

## Committer base changes:

 - Currently 46 committers.
 - New committers:
    - David Morávek was added as a committer on Mon Oct 29 2018
    - Ankur Goenka was added as a committer on Mon Oct 22 2018
    - Matthias Baetens was added as a committer on Mon Nov 26 2018
    - Xinyu Liu was added as a committer on Mon Oct 15 2018

## Releases:

 - Since the last report, Apache Beam has published two releases, with one
   more currently in progress:
   - 2.7.0 was released on Fri Sep 28 2018
   - 2.8.0 was released on Thu Oct 25 2018
   - 2.9.0 is in progress

 - The community determined to start the release process every 6 weeks, and we
   have stuck to this. The smaller gap between 2.7.0 and 2.8.0 is due to
   variance in the time to a final RC.

## Mailing list activity:

 - dev@beam.apache.org:
    - 575 subscribers (up 28 in the last 3 months):
    - 1939 emails sent to list (1937 in previous quarter)

 - user@beam.apache.org:
    - 593 subscribers (up 17 in the last 3 months):
    - 416 emails sent to list (559 in previous quarter)

## JIRA activity:
 - 881 JIRA tickets created in the last 3 months (811 in the previous quarter)
 - 622 JIRA tickets closed/resolved in the last 3 months (501 in the previous
   quarter)

19 Sep 2018

Change the Apache Beam Project Chair

 WHEREAS, the Board of Directors heretofore appointed Davor Bonaci
 (davor) to the office of Vice President, Apache Beam, and

 WHEREAS, the Board of Directors is in receipt of the resignation of
 Davor Bonaci from the office of Vice President, Apache Beam, and

 WHEREAS, the Project Management Committee of the Apache Beam project
 has chosen by vote to recommend Kenneth Knowles (kenn) as the successor
 to the post;

 NOW, THEREFORE, BE IT RESOLVED, that Davor Bonaci is relieved and
 discharged from the duties and responsibilities of the office of Vice
 President, Apache Beam, and

 BE IT FURTHER RESOLVED, that Kenneth Knowles be and hereby is appointed
 to the office of Vice President, Apache Beam, to serve in accordance
 with and subject to the direction of the Board of Directors and the
 Bylaws of the Foundation until death, resignation, retirement, removal
 or disqualification, or until a successor is appointed.

 Special Order 7B, Change the Apache Beam Project Chair, was
 approved by Unanimous Vote of the directors present.

19 Sep 2018 [Davor Bonaci / Roman]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

The Board is presented with a Special Order 7B to appoint Kenneth Knowles
(kenn) to the office of Vice President, Apache Beam. Kenneth has served on the
PMC since its inception, and is very active and effective in growing the
community. His exemplary posts have been cited in other projects.

## Activity:

Apache Beam is now approaching its second anniversary as a top-level project.
Major technical efforts going on include:
  - Finishing the portable Flink runner, which adds support for Python and Go
    SDKs.
  - Adding Schema support.
  - Beam SQL.
  - Infrastructure and automation.

Recent community decisions include:
  - Providing designated "Long Term Support" releases.
  - Better management of outdated dependencies.
  - Using JIRA to track and highlight non-code contributions.

Several blog posts have been published this quarter, primarily promoting the
releases. Google is organizing a Beam Summit in London next month, expecting
modest attendance. Additionally, Beam was featured at several conferences,
including Flink Forward 2018 in Berlin.

Going forward, the main focus should be on the community growth, particularly
on the user side using non-proprietary engines. This goes hand in hand with
the next major technical milestone of delivering on the portability framework,
making Beam available to Python and Go communities.

## Health report:

The user community grew modestly, as evidenced by the increased mailing list
activity, which is encouraging.

Activity on the development mailing list decreased, but there were quite a few
new contributors joining, improving the diversity. Lifetime unique code
contributors grew to 322, with 53 new first-time contributors.

## PMC changes:

Currently 19 PMC members. No new PMC members have been added since the last
report. The last PMC addition was Thomas Weise on Fri Jun 08 2018.

Frances Perry requested to resign from the PMC, though that resignation has
been put on hold pending discussions around establishing an emeritus policy
instead, taking into account recent Board discussions and recommendations to
other projects.

## Committer base changes:

Currently 42 committers. Five new committers have been added since the last
report. New committers:
  - Scott Wegner was added as a committer on Thu Jun 21 2018.
  - Łukasz Gajowy was added as a committer on Wed Jun 27 2018.
  - Anton Kedin was added as a committer on Wed Aug 01 2018.
  - Andrew Pilloud was added as a committer on Wed Aug 01 2018.
  - Tim Robertson was added as a committer on Thu Aug 23 2018.

The PMC recognizes two areas for improvement: (1) diversity of affiliations
among active committers, and (2) an imbalance of contributors to active
committers. The main cause of these recent imbalances is turnover over the
last year, perhaps among other causes. The plan of inviting quite a few new
committers over a period of time has materialized. We continue to be cautious
not to grow so quickly as to jeopardize the community, or negatively affect
where the project business is handled.

## Releases:

Since the last report, Apache Beam has published two releases, with one more
currently in progress:
  - 2.5.0 was released on Thu Jun 21 2018.
  - 2.6.0 was released on Tue Aug 07 2018.
  - 2.7.0 is currently under preparation.

Going forward, we expect to publish a release every 6 weeks, a target that we
have become better at achieving.

## Mailing list activity:

Mailing list subscriptions and activity continue to increase modestly. The
activity on the development mailing list is down compared to the previous
quarter, likely due to seasonal effects of summer vacations. The activity on
the user mailing list has increased to a new high, which is very encouraging.

- dev@beam.apache.org
  - 547 subscribers (up 27 in the last 3 months).
  - 1757 emails sent to list (2000 in previous quarter).

- user@beam.apache.org
  - 574 subscribers (up 25 in the last 3 months).
  - 525 emails sent to list (353 in previous quarter).

## JIRA activity:

For the third quarter in a row, the JIRA activity is increasing, turning over
the earlier trend.

- 811 JIRA tickets created in the last 3 months (705 in the previous quarter).
- 501 JIRA tickets closed/resolved in the last 3 months (368 in the previous
  quarter).

20 Jun 2018 [Davor Bonaci / Phil]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

Apache Beam is now in its second year as a top-level project, and just
celebrated the one-year anniversary of its first stable release.

Major technical efforts going on include:
 - Building a portable Flink runner, which adds support for Python and Go
   SDKs.
 - Beam SQL.
 - Infrastructure and automation.

Beam desires to serve as a glue in the ecosystem, interconnecting SDKs,
engines and storage/messaging systems. On the execution side, Apache Samza
runner has seen increased activity, while other prototype runners are mostly
dormant on feature branches. On the IO connector side, the healthy growth
continues, with new connectors being contributed or improved month-over-month.

IP clearances have been completed for:
 - Euphoria API.
 - Go SDK.

Recent community decisions include:
 - Publishing guidelines for becoming (and behaving as) a Beam committer.
 - Releasing Go SDK.
 - Automation of stale pull requests.

No blog posts have been published this quarter. Beam was featured at several
conferences, including Flink Forward San Francisco, and DataWorks Summit
Berlin.

Going forward, the main focus should be on the community growth, particularly
on the user side using non-proprietary engines. This goes hand in hand with
the next major technical milestone of delivering on the portability framework,
making Beam available to Python and Go communities.

## Health report:

The community continues to grow steadily, as follows:
 - Lifetime unique contributors grew to 269, with 24 new first-time
   contributors.
 - Increased subscriptions and activity on the mailing list and in JIRA.
 - Contribution of new components into the project by external entities.

The amount of open discussion and design on the mailing list is at a new high,
benefited by arrival of new contributors and increased openness by existing
contributors.

## PMC changes:

Currently 19 PMC members. One PMC member has been added since the last report:
 - Thomas Weise was added to the PMC on Fri Jun 08 2018.

## Committer base changes:

Currently 37 committers. Six new committers have been added since the last
report. New committers:
 - Jason Kuster was added as a committer on Fri Apr 27 2018.
 - Pablo Estrada was added as a committer on Fri Apr 27 2018.
 - Gris Cuevas was added as a committer on Thu May 03 2018.
 - Charles Chen was added as a committer on Fri Jun 08 2018.
 - Henning Rohde was added as a committer on Fri Jun 08 2018.
 - Alexey Romanenko was added as a committer on Tue Jun 12 2018.

The PMC recognizes two areas for improvement: (1) diversity of affiliations
among active committers, and (2) an imbalance of contributors to active
committers. The main cause of these recent imbalances is turnover over the
last year, perhaps among other causes. The general plan is to invite quite a
few new committers over the next period of time, but not so quickly as to
jeopardize the community, or negatively affect where the project business is
handled.

Other recent actions include:
 - Revision of new contributor materials to be more welcoming.
 - Publishing the (subjective) guidelines for becoming a committer.
 - Adding an explicit "Community" section to the web site, highlighting
   ongoing projects to join.

PMC member Kenneth Knowles deserves (a rare) mention by name for proactively
reaching out to a large number of contributors, offering encouragement and
individual coaching to those interested. This effort meaningfully moved the
needle forward.

## Releases:

Since the last report, Apache Beam has published one release, with one more
currently in progress:
 - 2.4.0 was released on Mon Mar 19 2018.
 - 2.5.0 is currently under preparation and voting.

Version 2.0.0 was the first release that comes with API stability guarantees.
Going forward, we expect to publish a release every 6 weeks. We have been
short of our declared goal recently; however, the community is tackling this
issue.

## Mailing list activity:

Mailing list subscriptions and activity continue to increase modestly. The
number of emails on the development mailing list is up ~47%, and the number of
threads is up ~45%. We continue to see an increase in frequency and depth of
mailing list discussions, as well as better participation and diversity of
opinion compared to last year.

- dev@beam.apache.org
 - 519 subscribers (up 15 in the last 3 months).
 - 2127 emails sent to list (1393 in previous quarter).

- user@beam.apache.org
 - 548 subscribers (up 15 in the last 3 months).
 - 389 emails sent to list (456 in previous quarter).

## JIRA activity:

For the second quarter in a row, the JIRA activity is increasing, turning over
the earlier trend.

- 705 JIRA tickets created in the last 3 months (507 in the previous quarter).
- 368 JIRA tickets closed/resolved in the last 3 months (324 in the previous
 quarter).

21 Mar 2018 [Davor Bonaci / Chris]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

Apache Beam is now in its second year as a top-level project, and the
community continues to grow modestly.

In this quarter, the main technical focus continues to be on the portability
framework, and its adoption across all components of the project, which would,
among other benefits, extend the Python and Go SDKs to all Beam runners. A
sizeable portion of the community is working on this effort.

As usual, the project kept interconnecting additional execution engines and
data storage/messaging systems, and serves as a glue in the ecosystem. On the
execution side, runners for JStorm, Apache Hadoop MapReduce, Apache Samza and
Apache Tez are being prototyped in feature branches, but without too much
recent activity. On the IO connector side, the healthy growth continues, with
new connectors being contributed or improved month-over-month.

Seznam.cz decided to donate the Euphoria API to Apache Beam. Also, an SGA for
Google’s previous donation of the Go SDK is still pending. Both IP clearances
should complete by the next report.

Recent major community decisions include:
- Dropping Java 7 support, and requiring users to upgrade to Java 8.
- Dropping Apache Spark 1.6 support, and requiring users to upgrade to a Spark
 2.x cluster.
- Completely switching the build system to Gradle.

In this quarter, the community published two blog posts, one as a look-back at
2017 and one about the most recent release. Beam was featured at Strata Data
Conference San Jose 2017. Additionally, Google hosted a day-long Beam Summit
with solid participation.

Going forward, the main focus should be on the community growth, particularly
on the user side using non-proprietary engines. On the technical side, the
next major milestone is the completion of the portability framework across all
components of the project.

## Health report:

The community continues to grow steadily, as follows:
- Lifetime unique contributors grew to 245, with 30 new first-time
 contributors.
- Increased mailing list subscriber/activity.
- Increased JIRA activity.
- Contribution of new components into the project by external entities.
- Continued release cadence.

The overall health is solid, improving from a recent low, and is benefited by
addition of new community members with foundation membership and/or experience
in other projects.

## PMC changes:

Currently 18 PMC members. No new members have been added since the last
report. Last PMC addition was on Wed Nov 08 2017. We are watching several
potential candidates.

## Committer base changes:

Currently 31 committers. No new members have been added since the last report.
Last committer addition was on Wed Nov 08 2017.

There are clear candidates, probably five or so. I’m confident the PMC will
address this very quickly.

## Releases:

Since the last report, Apache Beam has published one release:
- 2.3.0 was released on Thu Feb 15 2018.

Version 2.0.0 was the first release that comes with API stability guarantees.
Going forward, we expect to publish a release every 2 months.

## Mailing list activity:

Mailing list subscriptions and activity continue to increase modestly. It is
worth noting that we saw an increase in frequency and depth of mailing list
discussions, as well as better participation and diversity of opinion compared
to last year.

- dev@beam.apache.org
 - 501 subscribers (up 28 in the last 3 months).
 - 1595 emails sent to list (1452 in previous quarter).

- user@beam.apache.org
 - 530 subscribers (up 36 in the last 3 months).
 - 503 emails sent to list (456 in previous quarter).

## JIRA activity:

Whereas JIRA activity was going down for a few quarters, it is great to report
that we’ve turned the trend back upwards.

- 507 JIRA tickets created in the last 3 months (449 in the previous quarter).
- 324 JIRA tickets closed/resolved in the last 3 months (171 in the previous
 quarter).

20 Dec 2017 [Davor Bonaci / Mark]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

While the project is doing well, I do want to point out that for the first
time there have been some departures from the community as noted in the Health
report below. This is to be expected with projects of this size and requires
no Board attention at this time.

## Activity:

This month we are celebrating the one year anniversary of becoming a top-level
project. Over the past year, the project has grown substantially, crossing 200
lifetime individual contributors, and nearing 500 mailing list subscribers.
The project published 7 releases, including a major one, version 2.0.0, the
first release that comes with an API stability promise.

In this quarter, the main technical focus continues to be on the portability
framework, and its adoption across all components of the project, which would,
among other benefits, extend the Python SDK to all Beam runners. A sizeable
portion of the community is working on this effort.

An SDK for Go has been contributed/donated to the project by Google, after
design and initial development stage outside of the community. The project has
accepted this contribution, and it is currently managed as a new component in
a feature branch. Hopefully, with community involvement, the component can be
merged into master sometime next year.

As usual, the project kept interconnecting additional execution engines and
data storage/messaging systems, and serves as a glue in the ecosystem. On the
execution side, runners for JStorm, Apache Hadoop MapReduce, Apache Samza and
Apache Tez are being prototyped in feature branches, but without too much
recent activity. The Spark runner migration from Apache Spark 1.6 to 2.x is
nearing completion. On the IO connector side, the healthy growth continues,
with new connectors being contributed month-over-month (Redis, RabbitMQ, and
others).

Among the major discussions that affect the future of the project, the
following are worth noting:
- Continuing to support Java 7 vs. requiring users to upgrade to Java 8 in a
 future release.
- Continuing to support Apache Spark 1.6 vs. requiring users to have a Spark
 2.x cluster.
- Switching the build system from Apache Maven to Gradle.

In all three cases, the majority preference seems to be trending towards
upgrading, but with varying degrees of opposing opinion as well.

With respect to outreach, there have been no blog posts or press releases this
quarter. Beam was featured at Strata Data Conference New York and Singapore,
QCon San Francisco, as well as several local meetups in the Bay Area, New
York, London, Singapore, Guadalajara and Stockholm.

Outside the project, IBM launched an Apache Beam runner for IBM Streams as a
part of their cloud offering.  Enabling users to easily run Beam pipelines on
IBM Cloud is good for the overall project growth.

Going forward, the main focus should be on community growth,
particularly on the user side using non-proprietary engines. On the technical
side, the next major milestone is the completion of the portability framework
across all components of the project.

## Health report:

The community continues to grow steadily, as follows:
- Lifetime unique contributors grew to 215, with 19 new first-time
 contributors.
- Both PMC and committer base grew by 2 members each.
- Mailing list subscribers/activity continue the healthy growth, with over 40
 new user@ mailing list subscribers and 50% increase in dev@ email volume.
- The release cadence continues, albeit significantly slower than before.

The community diversity has decreased somewhat with the departure or
inactivity of a handful of early PMC members that were community champions.
The effects are visible in the community tone, behavior and consensus
building. This is not unexpected for a project of this size and at this point
in time, but it is something that we will work to regain over the next few
months.

## PMC changes:

Currently 18 PMC members. Two new PMC members have been added since the last
report:
- Ismaël Mejía was added to the PMC on Wed Nov 08 2017.
- Reuven Lax was added to the PMC on Wed Nov 08 2017.

## Committer base changes:

Currently 31 committers. Two new committers have been added since the last
report:
- Etienne Chauchot was added as a committer on Wed Nov 08 2017.
- Melissa Pashniak was added as a committer on Wed Nov 08 2017.

## Releases:

Since the last report, Apache Beam has published one feature release, as well
as one patch release:
- 2.2.0 was released on Sat Dec 02 2017.
- 2.1.1 was released on Fri Sep 22 2017.

Version 2.0.0 was the first release that comes with API stability guarantees.
Going forward, we expect to publish a release every 2 months.

## Mailing list activity:

Mailing list subscriptions continue to increase modestly, along with the
healthy increases in the overall email volume. It is worth noting that we saw
an increase in frequency and depth of mailing list discussions, as well as
better participation and diversity of opinion compared to the previous
quarter.

- dev@beam.apache.org
 - 467 subscribers (up 16 in the last 3 months).
 - 1499 emails sent to list (934 in previous quarter).

- user@beam.apache.org
 - 487 subscribers (up 42 in the last 3 months).
 - 483 emails sent to list (404 in previous quarter).

## JIRA activity:

While JIRA activity continues to be healthy, this is the second quarter with
decreasing participation. Earlier in the year, when the community was working
towards the first stable release, we had 650 resolved issues in the quarter,
falling first to 278, and now dropping to 171. Going forward, this is an area
for improvement for the community, as we make sure to hear user feedback.

- 449 JIRA tickets created in the last 3 months (505 in the previous quarter).
- 171 JIRA tickets closed/resolved in the last 3 months (278 in the previous
 quarter).

20 Sep 2017 [Davor Bonaci / Phil]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

In the previous quarter, we had achieved a major milestone for the project --
the completion of the first stable release, version 2.0.0. It signified a
statement from the community that it intends to maintain API stability with
all releases for the foreseeable future, and made Beam suitable for
enterprise deployment.

In this quarter, we continued to build on that momentum and kept
interconnecting additional execution engines and data storage/messaging
systems, serving as glue in the ecosystem. On the execution side, the
Apache Gearpump (incubating) runner effort has merged into the master branch
as a new component, and will be included in the next release. JStorm, Apache
Hadoop MapReduce, and Apache Tez runners are making further progress. The
Spark runner migration from Apache Spark 1.6 to 2.x is nearing completion.

A major effort to create an SQL extension, based on Apache Calcite, got merged
into the master branch as a new component, and is slated for the next release.

On the IO connector side, a connector for Apache Solr has been contributed,
and additional connectors for Redis, Apache DistributedLog (incubating),
Apache Parquet, RabbitMQ, and Advanced Message Queuing Protocol (AMQP) are in
progress. Major improvements to file-based connectors have been contributed,
making them capable of handling dynamic source and sink locations.

We have published two technical blog posts regarding the recent innovation in
the project:
- Powerful and modular IO connectors with Splittable DoFn in Apache Beam
- Timely (and Stateful) Processing with Apache Beam

Beam was covered at several industry conferences over the past quarter,
including the DataWorks Summit Sydney 2017, Kafka Summit San Francisco 2017,
YOW! Data Sydney 2017, and Flink Forward Berlin 2017.

Going forward, the main focus continues to be on the user growth, with
outreach continuing across conferences and meetups. On the technical side, the
next major milestone is the completion of the portability framework across all
components of the project, which would, among other benefits, extend the
Python SDK to all Beam runners.

## Health report:

The community continues to grow steadily, as follows:
- For the first time since the top-level project was established, we have
 added new PMC members, and have added a record of five new committers in the
 quarter.
- The number of contributors continues to increase. We are now at 196 unique
 code contributors, up from 176 in the last report.
- Releases continue at a regular pace of 1-2 months per release.
- The mailing list subscribers continue to increase.

## PMC changes:

Currently 16 PMC members. Two new PMC members have been added since the last
report:
- Ahmet Altay was added to the PMC on Thu Aug 10 2017.
- Aviem Zur was added to the PMC on Thu Aug 10 2017.

## Committer base changes:

Currently 29 committers. Five new committers have been added since the last
report:
- Jingsong Lee was added as a committer on Thu Jun 22 2017.
- Reuven Lax was added as a committer on Fri Aug 11 2017.
- James Xu was added as a committer on Fri Aug 11 2017.
- Mingmin Xu was added as a committer on Fri Aug 11 2017.
- Manu Zhang was added as a committer on Fri Aug 11 2017.

## Releases:

Since the last report, Apache Beam has published one release with another one
currently being worked on:
- 2.1.0 was released on Mon Aug 21 2017.
- 2.2.0 is being prepared, with an expected publication in September 2017.

Version 2.0.0 was the first release that comes with API stability guarantees.
Going forward, we expect to publish a release every 1-2 months.

## Mailing list activity:

Mailing list subscriptions continue to increase. The small decrease in the
email volume is the effect of comparison with the previous quarter, which
included the major effort of publishing the first stable release.

- dev@beam.apache.org
 - 451 subscribers (up 26 in the last 3 months).
 - 1024 emails sent to list (1139 in previous quarter).

- user@beam.apache.org
 - 445 subscribers (up 56 in the last 3 months).
 - 413 emails sent to list (512 in previous quarter).

## JIRA activity:

JIRA activity continues to be healthy. The small decrease is the effect of
comparison with the previous quarter, which included the major effort of
publishing the first stable release.

- 505 JIRA tickets created in the last 3 months (725 in the previous quarter).
- 278 JIRA tickets closed/resolved in the last 3 months (650 in the previous
 quarter).

21 Jun 2017 [Davor Bonaci / Chris]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

We have achieved a major milestone for the project -- the completion of the
first stable release, version 2.0.0. It signifies a statement from the
community that it intends to maintain API stability with all releases for the
foreseeable future, and makes Beam suitable for enterprise deployment.
Additionally, version 2.0.0 improves user experience across the project,
focusing on seamless portability across execution environments, including
engines, operating systems, on-premise clusters, cloud providers, and data
storage systems.

Beam continues to interconnect additional execution engines and data
storage/messaging systems, and serves as a glue in the ecosystem. On the
execution side, the work continues on the Apache Gearpump (incubating) runner,
and a new effort on the JStorm runner has started. On the IO connector side,
connectors for Apache Cassandra and Apache Hive’s HCatalog have been
contributed, and additional connectors for Redis, Apache DistributedLog
(incubating), Apache Solr, Apache Parquet, RabbitMQ, and Advanced Message
Queuing Protocol (AMQP) are in progress. Finally, we have started a major
effort to create a SQL extension, based on Apache Calcite.

We have published a press release and a blog post regarding the first stable
release:
- https://blogs.apache.org/foundation/entry/the-apache-software-foundation-announces12
- https://beam.apache.org/blog/2017/05/17/beam-first-stable-release.html
We have also refreshed the design of our website.

Beam was covered at seven major industry conferences over the past quarter,
including the "Apache: Big Data" conference in Miami, FL, where we have had 4
talks, a birds-of-a-feather session and a social event. Additionally, we
organized the first meetup in the Bay Area, hosted by Hortonworks and Future
of Data.

Going forward, the main focus continues to be on the user growth, with
outreach continuing across conferences and meetups. On the technical side, the
next major milestone is the completion of the portability framework across all
components of the project, which would, among other benefits, extend the
Python SDK to all Beam runners.

## Health report:

The community continues to grow steadily, as follows:
- The number of contributors continues to increase. We are now at 179 unique
 code contributors, with 76 individuals contributing to the latest release
 alone (which spanned less than 2 months).
- Releases continue at a regular pace of 1-2 months per release.
- The activity on the user@ mailing list more than doubled.

## PMC changes:

Currently 14 PMC members. No new PMC members have been added since graduation
six months ago.

We are watching for potential new PMC members.

## Committer base changes:

Currently 24 committers. Four new committers have been added since the last
report:
- Aviem Zur was added as a committer on Fri Mar 17 2017.
- Chamikara Jayalath was added as a committer on Fri Mar 17 2017.
- Ismaël Mejía was added as a committer on Fri Mar 17 2017.
- Eugene Kirpichov was added as a committer on Fri Mar 17 2017.

## Releases:

Since the last report, Apache Beam has published two releases:
- 0.6.0 was released on Mon Mar 13 2017.
- 2.0.0 was released on Mon May 15 2017.

Version 2.0.0 is the first release that comes with API stability guarantees.
Going forward, we expect to publish a release every 1-2 months.

## Mailing list activity:

Mailing list activity continues to increase across all metrics, with the
number of user@ emails more than doubling compared to the previous quarter.

- dev@beam.apache.org
 - 424 subscribers (up 63 in the last 3 months).
 - 1162 emails sent to list (1094 in previous quarter).

- user@beam.apache.org
 - 384 subscribers (up 73 in the last 3 months).
 - 547 emails sent to list (250 in previous quarter).

## JIRA activity:

JIRA activity continues to increase across all metrics, with the number of
resolved issues nearly doubling.

- 725 JIRA tickets created in the last 3 months (542 in the previous quarter).
- 650 JIRA tickets closed/resolved in the last 3 months (347 in the previous
 quarter).

15 Mar 2017 [Davor Bonaci / Rich]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

Apache Beam was established as a top-level project at December’s Board
meeting. This is the third in the series of three consecutive monthly reports
for new projects.

Since last month's report, we have started work on the next release, version
0.6.0. This will be the first release with the new Python SDK, a highly
anticipated component that opens up a new user community. Pipelines built with
the Python SDK currently run on a limited number of runners, but work is ongoing
to extend runner support.

Beam continues to interconnect additional execution engines and data
storage/messaging systems. Since the last report, an IO connector for Apache
HBase has been contributed, and additional connectors for Redis, Apache
Cassandra, Apache DistributedLog, Apache Parquet, Apache Solr, RabbitMQ, and
Advanced Message Queuing Protocol (AMQP) are in progress. The work has resumed
on the Apache Gearpump (incubating) runner.

Going forward, the main focus continues to be on the community growth,
particularly users. Beam will be covered at 6 major conferences over the next
2 months, including 2 talks and a tutorial at the upcoming Apache: Big Data
North America 2017 conference.

On the technical side, the next major milestone is the availability of the
first stable release, which will include backward-compatibility guarantees.
This stabilization effort has started recently.

## Health report:

The community continues to grow steadily, as follows:

- The number of contributors continues to increase.
- Releases continue at a regular pace of 1-1.5 months per release.
- Mailing list activity continues to increase significantly.

## PMC changes:

Currently 14 PMC members. No new PMC members have been added since graduation
three months ago.

## Committer base changes:

Currently 20 committers. Three new committers have been added since
graduation:

- Ahmet Altay was added as a committer on Tue Jan 31 2017.
- Pei He was added as a committer on Tue Jan 31 2017.
- Stas Levin was added as a committer on Tue Jan 31 2017.

## Releases:

In the two months following graduation, Apache Beam has published two
releases:

- 0.4.0 was released on Sun Jan 01 2017.
- 0.5.0 was released on Mon Feb 06 2017.

In addition, the 0.6.0 release is in progress.

## Mailing list activity:

Mailing list activity continues to increase across all metrics.

- dev@beam.apache.org
 - 351 subscribers (up 60 in the last 3 months)
 - 1161 emails sent to list (866 in previous quarter)

- user@beam.apache.org
 - 298 subscribers (up 58 in the last 3 months)
 - 282 emails sent to list (241 in previous quarter)

## JIRA activity:

- 542 JIRA tickets created in the last 3 months
- 347 JIRA tickets closed/resolved in the last 3 months

27 Feb 2017 [Davor Bonaci / Isabel]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

Apache Beam was established as a top-level project at December’s Board
meeting. This is the second in the series of three consecutive monthly reports
for new projects.

Since last month's report, we have:
- published the second post-graduation release, version 0.5.0,
- added 3 new committers from two different organizations,
- promoted the Python SDK to the master branch with support for two runners.

Over the last month, Apache Beam graduation has been covered in more than a
dozen technical publications and received endorsements from multiple
organizations.

Beam continues to interconnect additional execution engines and data
storage/messaging systems. Since the last report, IO connectors for
Elasticsearch and MQ Telemetry Transport have been released, and additional
connectors for Redis, Apache Cassandra, Apache DistributedLog, Apache Parquet,
RabbitMQ, and Advanced Message Queuing Protocol (AMQP) are in progress.

Going forward, the main focus continues to be on the community growth. On the
technical side, the next major milestone is the availability of the first
stable release, which will include backward-compatibility guarantees.

## Health report:

The community continues to grow steadily, as follows:
- The number of contributors continues to increase.
- Releases continue at a regular pace of 1-1.5 months per release.
- Mailing list activity continues to increase significantly.

## PMC changes:

Currently 14 PMC members. No new PMC members have been added since graduation
two months ago.

## Committer base changes:

Currently 20 committers. Three new committers have been added in the last
month:
- Ahmet Altay was added as a committer on Tue Jan 31 2017.
- Pei He was added as a committer on Tue Jan 31 2017.
- Stas Levin was added as a committer on Tue Jan 31 2017.

## Releases:

In the two months following graduation, Apache Beam has published two
releases:
- 0.4.0 was released on Sun Jan 01 2017.
- 0.5.0 was released on Mon Feb 06 2017.

## Mailing list activity:

Mailing list activity continues to increase across all metrics.

- dev@beam.apache.org
 - 332 subscribers (up 56 in the last 3 months)
 - 1032 emails sent to list (762 in previous quarter)

- user@beam.apache.org
 - 276 subscribers (up 50 in the last 3 months)
 - 301 emails sent to list (203 in previous quarter)

## JIRA activity:

- 481 JIRA tickets created in the last 3 months
- 322 JIRA tickets closed/resolved in the last 3 months

## Appendix:

More details about graduation media coverage are available in the “media
recap” blog post:
https://beam.apache.org/blog/2017/02/01/graduation-media-recap.html

18 Jan 2017 [Davor Bonaci / Shane]

## Description:

Apache Beam is a unified programming model for both batch and streaming data
processing, enabling efficient execution across diverse distributed execution
engines and providing extensibility points for connecting to different
technologies and user communities.

## Issues:

There are no issues that require the Board's attention at this time.

## Activity:

Apache Beam was established as a top-level project at last month's Board
meeting. This is the first in the series of three consecutive monthly reports
for new projects.

Since becoming a top-level project, we have:
* completed administrative and infrastructure-related tasks to transition from
 a podling to a TLP,
* published the press release and a follow-up blog,
* published the first non-incubating release, version 0.4.0.

In addition, since the last report, we have participated in major conferences
and meetups, including:
* presented at Apache: Big Data Europe 2016 and ApacheCon's Podling Shark
 Tank, as well as the Birds of a Feather session,
* presented at QCon San Francisco 2016,
* presented at Strata + Hadoop World Singapore 2016, along with a hands-on
 Beam tutorial,
* co-organized a meetup with an Apache Apex user group, and presented at
 another meetup.

Beam continues to interconnect additional execution engines and data
storage/messaging systems. Since the last report, a runner for Apache Apex was
merged from a feature branch and released, and IO connectors for Elasticsearch
and MQ Telemetry Transport have been contributed.

Going forward, the main focus continues to be on community growth. On the
technical side, the next major milestone is the availability of the first
stable release, which will include backward-compatibility guarantees.

## Health report:

The community continues to grow steadily, as follows:
* The number of contributors continues to increase, with an expectation of
 additional committers in the near future.
* Releases continue at a regular pace of 1-1.5 months per release.
* Mailing list activity continues to increase, with some metrics doubling
 quarter-over-quarter (see below).

## PMC changes:

Currently 14 PMC members. No new PMC members have been added since graduation
a month ago.

## Committer base changes:

Currently 17 committers. No new committers have been added since graduation a
month ago.

## Releases:

The first post-graduation release, version 0.4.0, was published on January 1,
2017.

## Mailing list activity:

Mailing list activity continues to increase, with some metrics doubling
quarter-over-quarter.

- dev@beam.apache.org:
  - 310 subscribers (up 49 in the last 3 months)
  - 1079 emails sent to list (519 in previous quarter)

- user@beam.apache.org:
  - 261 subscribers (up 54 in the last 3 months)
  - 231 emails sent to list (246 in previous quarter)

## JIRA activity:

- 512 JIRA tickets created in the last 3 months
- 338 JIRA tickets closed/resolved in the last 3 months

21 Dec 2016

Establish the Apache Beam Project

 WHEREAS, the Board of Directors deems it to be in the best
 interests of the Foundation and consistent with the
 Foundation's purpose to establish a Project Management
 Committee charged with the creation and maintenance of
 open-source software, for distribution at no charge to
 the public, related to a unified programming model for both
 batch and streaming data processing, enabling efficient
 execution across diverse distributed execution engines
 and providing extensibility points for connecting to different
 technologies and user communities.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management
 Committee (PMC), to be known as the "Apache Beam Project",
 be and hereby is established pursuant to Bylaws of the
 Foundation; and be it further

 RESOLVED, that the Apache Beam Project be and hereby is
 responsible for the creation and maintenance of software
 related to a unified programming model for both batch and
 streaming data processing, enabling efficient execution across
 diverse distributed execution engines and providing extensibility
 points for connecting to different technologies and user
 communities; and be it further

 RESOLVED, that the office of "Vice President, Apache Beam" be
 and hereby is created, the person holding such office to
 serve at the direction of the Board of Directors as the chair
 of the Apache Beam Project, and to have primary responsibility
 for management of the projects within the scope of
 responsibility of the Apache Beam Project; and be it further

 RESOLVED, that the persons listed immediately below be and
 hereby are appointed to serve as the initial members of the
 Apache Beam Project:

   * Tyler Akidau <takidau@apache.org>
   * Davor Bonaci <davor@apache.org>
   * Robert Bradshaw <robertwb@apache.org>
   * Ben Chambers <bchambers@apache.org>
   * Luke Cwik <lcwik@apache.org>
   * Stephan Ewen <sewen@apache.org>
   * Dan Halperin <dhalperi@apache.org>
   * Kenneth Knowles <kenn@apache.org>
   * Aljoscha Krettek <aljoscha@apache.org>
   * Maximilian Michels <mxm@apache.org>
   * Jean-Baptiste Onofré <jbonofre@apache.org>
   * Frances Perry <frances@apache.org>
   * Amit Sela <amitsela@apache.org>
   * Josh Wills <jwills@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Davor Bonaci
 be appointed to the office of Vice President, Apache Beam, to
 serve in accordance with and subject to the direction of the
 Board of Directors and the Bylaws of the Foundation until
 death, resignation, retirement, removal or disqualification,
 or until a successor is appointed; and be it further

 RESOLVED, that the initial Apache Beam PMC be and hereby is
 tasked with the creation of a set of bylaws intended to
 encourage open development and increased participation in the
 Apache Beam Project; and be it further

 RESOLVED, that the Apache Beam Project be and hereby
 is tasked with the migration and rationalization of the Apache
 Incubator Beam podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Incubator Beam podling encumbered upon the Apache Incubator
 Project are hereafter discharged.

 Special Order 7C, Establish the Apache Beam Project, was
 approved by Unanimous Vote of the directors present.

16 Nov 2016

Apache Beam is an open source, unified model and set of language-specific
SDKs for defining and executing data processing workflows, and also data
ingestion and integration flows, supporting Enterprise Integration Patterns
(EIPs) and Domain Specific Languages (DSLs). Beam pipelines simplify the
mechanics of large-scale batch and streaming data processing and can run on
a number of runtimes such as Apache Flink, Apache Gearpump, Apache Apex,
Apache Spark, and Google Cloud Dataflow. Beam also brings SDKs in different
languages, allowing users to easily implement their data integration
processes.

Beam has been incubating since 2016-02-01.

The most important issue to address in the move towards graduation:

  1. Make it easier for the Beam community to learn, use, and grow by
     expanding and improving the Beam documentation, code samples, and the
     website

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

  None.

How has the community developed since the last report?

  * 441 closed/merged pull requests
  * High engagement on dev and user mailing lists (742 / 179 messages)
  * Several public talks, articles, and videos including:
    - @Scale San Jose (“No shard left behind: APIs for massive parallel
      efficiency in Apache Beam”)
    - Strata + Hadoop World NYC (“Learn stream processing with Apache Beam”)
    - Paris Spark Meetup (“Introduction to Apache Beam”)
    - Hadoop Summit Melbourne (“Stream/Batch processing portable across
      on-prem (Spark, Flink) and Cloud with Apache Beam”)
    - Hadoop User Group Taipei (“Stream Processing with Beam and Google Cloud
      Dataflow”)
    - Data Science Lab London (“Apache Beam: Stream and Batch Processing;
      Unified and Portable!”)

How has the project developed since the last report?

  Major developments on the project since last report include the following:

  * Second and third incubating releases (0.2.0 and 0.3.0) and a release
    guide [1]
  * New DirectRunner support for testing streaming pipelines[2]
  * Continued improvements to the Flink, Spark, and Dataflow runners
  * Added support for new IO connectors, including MongoDB, Kinesis, and JDBC,
    with Cassandra and MQTT support pending in pull requests
  * Addition of the Apache Apex runner on a feature branch, and continued
    work on the Apache Gearpump runner and Python SDK feature branches. [3]
  * Continued reorganization and refactoring of the project
  * Continued improvements to documentation and testing

 [1]: http://beam.incubator.apache.org/contribute/release-guide/
 [2]: http://beam.incubator.apache.org/blog/2016/10/20/test-stream.html
 [3]: http://beam.incubator.apache.org/contribute/work-in-progress/#feature-branches

Dates of last releases:

  * 2016/08/07 - 0.2.0-incubating
  * 2016/10/31 - 0.3.0-incubating

When were the last committers or PMC members elected?

  The following committers were elected on 2016/10/20:

  * Thomas Weise
  * Jesse Anderson
  * Thomas Groh

Signed-off-by:
 [X](beam) Jean-Baptiste Onofré
 [ ](beam) Venkatesh Seetharam
 [ ](beam) Ted Dunning

17 Aug 2016

Apache Beam is an open source, unified model and set of language-specific SDKs
for defining and executing data processing workflows, and also data ingestion
and integration flows, supporting Enterprise Integration Patterns
(EIPs) and Domain Specific Languages (DSLs). Beam pipelines simplify the
mechanics of large-scale batch and streaming data processing and can run on a
number of runtimes such as Apache Flink, Apache Gearpump, Apache Spark, and
Google Cloud Dataflow (a cloud service). Beam also brings SDKs in different
languages, allowing users to easily implement their data integration
processes.

Beam has been incubating since 2016-02-01.

Three most important issues to address in the move towards graduation:

 1. Additional and continued Beam releases
 2. Grow the community of Beam users and contributors
 3. Add to and improve upon documentation, code samples, and project
    website

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of? None.

How has the community developed since the last report?

 * 425 closed/merged pull requests
 * High engagement on dev and user mailing lists (590 / 455 messages)
 * Several public talks, articles, and videos including:
   - Hadoop Summit San Jose ("Apache Beam: A Unified Model for Batch and
     Streaming Data Processing" & "The Next Generation of Data Processing &
     OSS")
   - O’Reilly & The New Stack ("Future-proof and scale-proof your code")
   - QCon NY ("Apache Beam: The Case for Unifying Streaming API's")
   - JBCN Barcelona ("Introduction to Apache Beam")

How has the project developed since the last report?

 Major developments on the project since last report include the following:

 * First incubating release (0.1.0-incubating)
 * Second incubating release (0.2.0-incubating)
 * Addition of Apache Beam Python SDK
 * Addition of the Apache Gearpump runner
 * Added support for writing to Apache Kafka clusters
 * Added support for reading from and writing to Java Message Services,
   including Apache ActiveMQ, GeronimoJMS, and RabbitMQ
 * Ratified new Beam model APIs to improve efficiency and failure handling:
   DoFn setup, teardown, and reuse
 * Optimized key components such as data serialization and shuffle
 * Continued improvements to the Flink, Spark, and Dataflow runners
 * Continued reorganization and refactoring of the project
 * Continued improvements to documentation and testing

Date of last release:

 * 2016/06/15 - 0.1.0-incubating
 * 2016/08/08 - 0.2.0-incubating

When were the last committers or PMC members elected?

 N/A - no changes since last report.

Signed-off-by:

 [X](beam) Jean-Baptiste Onofre
 [ ](beam) Venkatesh Seetharam
 [X](beam) Bertrand Delacretaz
 [X](beam) Ted Dunning

18 May 2016

Apache Beam is an open source, unified model and set of language-specific SDKs
for defining and executing data processing workflows, and also data ingestion
and integration flows, supporting Enterprise Integration Patterns
(EIPs) and Domain Specific Languages (DSLs). Dataflow pipelines simplify the
mechanics of large-scale batch and streaming data processing and can run on a
number of runtimes like Apache Flink, Apache Spark, and Google Cloud Dataflow
(a cloud service). Beam also brings DSLs in different languages, allowing users
to easily implement their data integration processes.

Beam has been incubating since 2016-02-01.

Three most important issues to address in the move towards graduation:

 1. Continued releases
 2. Grow the user and contributor communities
 3. Improve and extend documentation and samples on the website

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 None

How has the community developed since the last report?

 * Both user and dev mailing lists activity increased
 * We sustain a high level of activity on the pull request cycles (submit,
   review, ...)

How has the project developed since the last report?

 * All resources have been created (website, Jira, git & github mirror, ...)
 * The code donation has been completed
 * The website has been published; we are still in the process of donating
   documentation and sample resources
 * We renamed all packages to match the Apache convention
 * We started the re-organization and refactoring of the project structure
   (isolating and moving some modules)

Date of last release:

 N/A

When were the last committers or PMC members elected?

 N/A

Signed-off-by:

 [X](beam) Jean-Baptiste Onofre
 [X](beam) Jim Jagielski
 [X](beam) Venkatesh Seetharam
 [ ](beam) Bertrand Delacretaz
 [X](beam) Ted Dunning