This was extracted (@ 2017-05-22 18:10) from a list of minutes which have been approved by the Board.
Please note: The Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

Arrow

19 Apr 2017 [Jacques Nadeau / Brett]

## Description:
Arrow is a columnar in-memory analytics layer designed to accelerate big data.
It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
 - CodeBase/Format:
   - Substantial progress and a 0.2 release since last report, close to 0.3
   - Example additions include: a large contribution of GLib support from a
     new contributor, adding support for Ruby, Lua and Go; enhancements to
     HDFS support, including partitioned directories; clarification of and
     improvements to Time types; TensorFlow compatibility; support for
     fixed-width binary types; Python read enhancements; incorporation of the
     Feather file format; and many other items.
   - Spark integration (https://s.apache.org/arrowspark) looks promising and
     will hopefully expose Arrow to a large group of additional users.

 - Awareness and evangelism:
   - Talks at conferences and meetups including:
      Spark Summit East https://s.apache.org/arrowss17
      Strata San Jose https://s.apache.org/arrowstrata17
      Dataworks Munich https://s.apache.org/arrowdataworks17

 - Community:
   - Continued influx of new contributors. Some PMC members have been
     especially effective at engaging new communities, through discussions on
     Twitter as well as other means.

## Health report:
 - The number of dev and issue emails doubled over the previous quarter, which
   means that the people who are active in the community are very active.
 - At the same time, the PMC just started a discussion about how to continue to
   grow the team. There have been various casual contributions, which is good,
   but the core group of prolific contributors is growing slowly.
 - We need to continue to make a concerted effort to provide example use cases
   to help more people understand and appreciate Arrow benefits.
 - We're seeing demand for this type of solution by other groups of people,
   some inside the foundation, some outside. We're doing community outreach to
   try to engage others but always worry about NIH thinking. Our open and
   collaborative approach to building and extending the Arrow format and
   software will hopefully convince more people to join the project rather than
   creating competing technologies. Only time will tell in each case.

## PMC changes:

 - Currently 19 PMC members.
 - Last PMC addition was Uwe Korn on Thu Apr 13 2017

## Committer base changes:

 - Currently 21 committers.
 - No new committers added in the last 3 months
 - Last committer addition was Uwe Korn at Thu Oct 27 2016

## Releases:

 - 0.2.0 was released on Sat Feb 18 2017

## JIRA activity:

 - 332 JIRA tickets created in the last 3 months
 - 282 JIRA tickets closed/resolved in the last 3 months

18 Jan 2017 [Jacques Nadeau / Marvin]

## Description:
Arrow is a columnar in-memory analytics layer designed to accelerate big data.
It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
- Arrow has made great progress since the last report.
- The community has actively been driving towards a set of cross-language
 compatibility tests. These are now complete.
- The compatibility tests were identified as a key gate to considering the
 specification solidified. Now that it is, the community will be starting
 work on our second release.
- This release will show the Arrow components (Java Arrow, arrow-cpp and
 py-arrow) and Parquet's parquet-cpp all working nicely together.

## Health report:
- A core group of community members continues to make good progress on various
 aspects of the Java, C++ and Python projects.
- We've seen a small number of casual contributors arrive and provide
 additional patches to the project.
- Multiple people have been doing community outreach through the various blog
 posts, meetups and conference presentations. Examples include
 - Upcoming talk at Strata San Jose in March
 - Upcoming talk at Dataworks Summit Munich in April
 - Arrow and Pandas vision: https://s.apache.org/arrow_1701_01
 - Python Data Wrangling talk: https://s.apache.org/arrow_1701_02
- We continue to see nice growth in mailing list and JIRA activity.

## PMC changes:

- Currently 18 PMC members.
- Wes McKinney was added to the PMC on Wed Oct 19 2016

## Committer base changes:

- Currently 21 committers.
- Uwe Korn was added as a committer on Thu Oct 27 2016

## Releases:

- Last release was 0.1.0 on Tue Oct 11 2016

## JIRA activity:

- 140 JIRA tickets created in the last 3 months
- 117 JIRA tickets closed/resolved in the last 3 months

19 Oct 2016 [Jacques Nadeau / Chris]

## Description:
Arrow is a columnar in-memory analytics layer designed to accelerate big data.
It houses a set of canonical in-memory representations of flat and hierarchical
data along with multiple language-bindings for structure manipulation. It also
provides IPC and common algorithm implementations.

## Issues:

- There are no issues requiring board attention at this time.

## Activity:
- Arrow made its first release.
- In preparation for the release, multiple discussions were focused on
 formalizing various Arrow specification details.
- Discussion was good and we reworked some integration to invert the
 dependency model between the Parquet project and the Arrow project.
- A new Arrow file format was defined and implemented in both Java and C++.
 It is also available from Python.
- Community members covered Arrow at multiple conferences including Strata
 NYC.
- Arrow <> Parquet interchange has been made available in C++.
- The new Arrow file format is planned to be used to move forward on both
 cross-language IPC implementations and enabling cross-language compatibility
 tests.
- We've seen good growth in the Arrow developer mailing list, which has
 increased to 467 subscribers (up 43 in the last 3 months).

## Health report:
- The first release is a good step in engaging a broader range of contributors
 and users. Having bits for use, albeit alpha, allows us to engage a wider
 range of engineers.
- We need to continue to add new examples and more documentation to better
 describe how to use and extend Arrow.

## PMC changes:

- Currently 17 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Abdel Hakim Deneche on Tue Jan 19 2016

## Committer base changes:

- Currently 20 committers.
- No new committers added in the last 3 months
- Last committer addition was Ippokratis Pandis at Thu Feb 18 2016

## Releases:

- 0.1.0 was released on Wed Oct 12 2016

## JIRA activity:

- 95 JIRA tickets created in the last 3 months
- 73 JIRA tickets closed/resolved in the last 3 months

20 Jul 2016 [Jacques Nadeau / Marvin]

## Description:

Arrow is a columnar in-memory analytics layer designed to accelerate big data.
It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:

- There are no issues requiring board attention at this time.

## Activity:

- Awareness continues to increase with the community having done presentations
 at various meetups as well as the following conferences: PyData Paris, Hadoop
 Summit Ireland, Hadoop Summit San Jose and Berlin Buzzwords.
- The CPP work has made good progress.
- The cross-project work with Parquet has seen substantial work (both in the
 Parquet project and the Arrow project). This should be a great first example
 proof-of-concept integration showing the benefits of an in-memory columnar
 layer.
- There has been substantial progress on development of the IPC / memory
 sharing.
- Java development has slowed some but appears to be picking up again.
- A new independent project called Feather is using Arrow as a format for
 writing to disk. This has also increased engagement with Arrow itself and we
 have a number of excited communities, including R and Python (with the Julia
 community experimenting).

## Health report:

- We've seen good discussion and development activity since the last report.
- We need to get to a first release.
- Prior to doing so, the community is working on rudimentary integration tests
 between Java and C++ and more formal format specification.
- More work can be done to make the project approachable to newly interested
 parties by creating additional documentation and quickstart. A sample
 application will also help.

## PMC changes:

- Currently 17 PMC members.
- No new PMC members added in the last 3 months.
- Last PMC addition was Abdel Hakim Deneche on Tue Jan 19 2016

## Committer base changes:

- Currently 20 committers.
- No new committers added in the last 3 months
- Last committer addition was Ippokratis Pandis at Thu Feb 18 2016

## Releases:

- No releases yet.

## JIRA activity:

- 71 JIRA tickets created in the last 3 months
- 40 JIRA tickets closed/resolved in the last 3 months

@Marvin: Links to unreleased source code must be removed from the Arrow home page.

20 Apr 2016 [Jacques Nadeau / Shane]

## Description:

Arrow is a columnar in-memory analytics layer designed to accelerate big
data. It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:

- There are no issues requiring board attention at this time.

## Activity:

- A number of public presentations have been done about Arrow including at
 Strata SJ, Hadoop Summit Europe and various meetups. Response at each was
 strong and we saw subsequent increased interaction on the mailing list.
- We've seen a great new project, Feather, a collaboration between the Python
 and R communities built on top of Arrow to provide an ephemeral cross-system
 format that performs faster and has better typing than the traditionally
 used CSV format.
- A number of interested organizations have posted blogs about their interest
 in and support for Arrow.

## Health report:

- We continue to see new community members engage.
- Public discussions and contributions from both committers and casual
 contributors continue to improve the Arrow specification.
- We're working with the incubating Mnemonic community to support alternatives
 to ephemeral memory for storing Arrow vectors. This will likely first appear
 as an optional extension module of the Java API.
- We're still negotiating the final Arrow switch over in the Drill community.
 The goal is to do it as part of the 2.0 branch to avoid any disruption to
 the active stable branch (master).

## PMC changes:

- Currently 17 PMC members.
- No new PMC members added in the last 3 months

## Committer base changes:

- Currently 20 committers.
- Most recently added committers:
  - Ippokratis Pandis was added as a committer on Thu Feb 18 2016
  - David Alves was added as a committer on Wed Feb 17 2016
  - Wes McKinney was added as a committer on Mon Feb 01 2016

## Releases:

- No releases yet.

## JIRA activity:

- 100 JIRA tickets created in the last 3 months
- 67 JIRA tickets closed/resolved in the last 3 months

16 Mar 2016 [Jacques Nadeau / Rich]

## Description:
- Arrow is a columnar in-memory analytics layer designed to accelerate big
data. It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:

- A number of articles were posted about Arrow in the last month. This has
brought in a large number of interested parties and we've seen a nice increase
in community engagement.
- There have been solid ideation and design discussions around IPC, metadata
and shared memory semantics.
- Development of the Python bindings is underway with a number of JIRAs
focused on that component.
- Since people have become aware of the project, we've seen a nice increase in
activity on the mailing list. Within the last month we've gone from 17
subscribers to 263 on the dev list and have seen 136 messages on the list.

## Health report:

- JIRAs are being opened and closed at a solid rate given the freshness of the
project.
- A number of design discussions have included great feedback and engagement
from people outside the initial PMC/committers.
- We've seen several code contributions from first-time contributors.
- The final separation of code from the Drill codebase is up for review and
will likely move forward after the 1.6 Drill release (voting nearly underway).

## PMC changes:

- Currently 17 PMC members.
- No new PMC members since project was established.

## Committer base changes:

- Currently 20 committers.
- New committers:
  - Ippokratis Pandis was added as a committer on Thu Feb 18 2016
  - Wes McKinney was added as a committer on Mon Feb 01 2016

## Releases:

- No releases yet.

## JIRA activity:

- 41 JIRA tickets created in the last 3 months
- 20 JIRA tickets closed/resolved in the last 3 months

17 Feb 2016 [Jacques Nadeau / Shane]

## Description:
- Arrow is a columnar in-memory analytics layer designed to accelerate big
data. It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:
- The project was established at the last board meeting.
- Mailing lists, repositories and issue tracking have been established by
infrastructure.
- An initial website is underway and should be available by the time of the
board meeting.
- The community also has worked with Sally from press@ to announce the project
via a press release on February 17th.
- Various community members are working on putting together better
documentation and communication around Arrow.


## Health report:
- The project is just getting started as an independent project.
- One of the key initial efforts is finalizing the extraction of code from the
Drill codebase. This is tracking well and we hope to complete this before the
next board report.
- A number of community talks are being submitted to upcoming conferences and
meetups to make more people aware of Arrow.
- We need to start spending more time growing the community beyond the initial
PMC and committers.
 - The Project voted to add 5 additional committers to the core project due to
   their involvement in the initial Arrow discussions.


## PMC changes:

- Currently 17 PMC members.
- No new PMC members added since the project was established.

## Committer base changes:

- Currently 18 committers.
- Wes McKinney was added as a committer on Mon Feb 01 2016
- Offers are out to four other community members to become committers.

## Releases:

- No Arrow releases have yet been made.

## Mailing list activity:

- dev@arrow.apache.org:
   - 17 subscribers (up 17 in the last 3 months)

- issues@arrow.apache.org:
   - 4 subscribers (up 4 in the last 3 months)

20 Jan 2016

Establish the Apache Arrow Project

 WHEREAS, the Board of Directors deems it to be in the best
 interests of the Foundation and consistent with the
 Foundation's purpose to establish a Project Management
 Committee charged with the creation and maintenance of
 open-source software, for distribution at no charge to the
 public, related to columnar in-memory processing and data
 interchange

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management
 Committee (PMC), to be known as the "Apache Arrow Project",
 be and hereby is established pursuant to Bylaws of the
 Foundation; and be it further

 RESOLVED, that the Apache Arrow Project be and hereby is
 responsible for the creation and maintenance of software
 related to columnar in-memory processing and data interchange;
 and be it further

 RESOLVED, that the office of "Vice President, Apache Arrow" be
 and hereby is created, the person holding such office to
 serve at the direction of the Board of Directors as the chair
 of the Apache Arrow Project, and to have primary responsibility
 for management of the projects within the scope of
 responsibility of the Apache Arrow Project; and be it further

 RESOLVED, that the persons listed immediately below be and
 hereby are appointed to serve as the initial members of the
 Apache Arrow Project:

 * Todd Lipcon <todd@apache.org>
 * Ted Dunning <tdunning@apache.org>
 * Michael Stack <stack@apache.org>
 * P. Taylor Goetz <ptgoetz@apache.org>
 * Reynold Xin <rxin@apache.org>
 * Julian Hyde <jhyde@apache.org>
 * Julien Le Dem <julien@apache.org>
 * Jacques Nadeau <jacques@apache.org>
 * James Taylor <jamestaylor@apache.org>
 * Jake Luciani <jake@apache.org>
 * Parth Chandra <parthc@apache.org>
 * Alex Levenson <alexlevenson@apache.org>
 * Marcel Kornacker <marcel@apache.org>
 * Steven Phillips <smp@apache.org>
 * Hanifi Gunes <hg@apache.org>
 * Jason Altekruse <json@apache.org>
 * Abdel Hakim Deneche <adeneche@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Jacques Nadeau
 be appointed to the office of Vice President, Apache Arrow, to
 serve in accordance with and subject to the direction of the
 Board of Directors and the Bylaws of the Foundation until
 death, resignation, retirement, removal or disqualification,
 or until a successor is appointed.

 RESOLVED, that the Apache Arrow Project be and hereby
 is tasked with the migration and rationalization of the Apache
 Drill Arrow sub-project; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Drill Arrow sub-project encumbered upon the
 Apache Drill Project are hereafter discharged.

 Special Order 7A, Establish the Apache Arrow Project, was
 approved by Unanimous Vote of the directors present.