This was extracted (@ 2017-10-16 20:10) from a list of minutes which have been approved by the Board.
Please note: the Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

CarbonData

19 Jul 2017 [Liang Chen / Ted]

## Description:

- Apache CarbonData is an indexed columnar store file format for fast
analytics on Big Data platforms (including Apache Hadoop and Apache Spark,
among others). It helps speed up queries by an order of magnitude over
petabytes of data, with the aim of using a unified file format to satisfy all
kinds of data analysis cases.

## Issues:

- There are no new issues requiring board attention at this time.


## Activity:

- The community is quite active; we are receiving around 100-200 pull requests
 per month from community contributors.
- A new patch release, 1.1.1, was completed on 10th July.
- Most contributors are working on the next major release, Apache
 CarbonData 1.2.0, which includes some significant features (partition, sort
 column, etc.) that should further improve performance and usability.
- We are abstracting and refactoring the index framework (datamap), aiming to
 let users extend it with other index techniques (for example, Lucene for
 fast search over text data).
- We are optimizing the API for easier integration with other big data projects
 (Beam, Presto, Hive, Flink, etc.).
- We are improving the test cases by adding Hadoop and Spark cluster test cases.
- Liang gave a presentation at the L3C conference on 21st June.
- We plan 3 meetups in the 2nd half of 2017: a Shanghai meetup in Sep,
 a Bangalore meetup in Oct, and a Bay Area meetup in Nov/Dec.


## Health Report:

- The project is healthy; the community remains active in all the
various categories (dev mailing list, JIRAs, and pull requests).
- There are 3 potential new committers, who are working on the partition
 feature and the update & delete feature.

## Releases:

- Apache CarbonData 1.1.0 released on 2017-05-16
- Apache CarbonData 1.1.1 released on 2017-07-10


## PMC changes:

- Ravindra Pesala was added to the PMC on Mon May 22 2017
- Currently 9 PMC members

## Committer base changes:

Currently 13 committers, two new committers added in the past quarter:

- hexiaoqiao was added as a committer on 2017-02-21.
- qiangcai was added as a committer on 2017-05-09.


## Mailing list activity:
- Mailing list activity stays at a high level

21 Jun 2017 [Liang Chen / Mark]

## Description:
- Apache CarbonData is an indexed columnar store file format for fast
  analytics on Big Data platforms (including Apache Hadoop and Apache Spark,
  among others). It helps speed up queries by an order of magnitude over
  petabytes of data, with the aim of using a unified file format to satisfy
  all kinds of data analysis cases.

## Board comment on last month's report:
- For future reports, please include the date of the last PMC addition (as well
  as the last committer addition). Re: this month's report already includes
  this information.

## Issues:
- There are no new issues requiring board attention at this time.

## Activity:
- We finished our 1st release as a top-level project. Apache CarbonData 1.1.0
  is a milestone release with the V3 format, which improves performance by 50%
  for aggregation cases.
- Liang Chen gave a talk introducing Apache CarbonData during ApacheCon in
  Miami.
- Jackylk presented "CarbonData with SparkSQL" practices at the China Spark
  Summit on 2017-05-19.
- We are preparing Apache CarbonData 1.1.1 and Apache CarbonData 1.2.0.
- Important work is in progress on providing an index framework that lets
  users add more index types, for example, integrating with Apache Lucene for
  data search.

## Health Report:
- The project is healthy; the community remains active in all the various
  categories (dev mailing list, JIRAs, and pull requests).

## Releases:
- Apache CarbonData 1.1.0 released on 2017-05-16

## PMC changes:
- Currently 9 PMC members.

## Committer base changes:
Currently 13 committers, two new committers added in the past quarter:
 - hexiaoqiao was added as a committer on 2017-02-21.
 - qiangcai was added as a committer on 2017-05-09.

## Mailing list activity:

dev@carbondata.apache.org:
- 152 subscribers (up 34 in the last 3 months).
- 862 emails sent in the past 3 months, 758 in the previous cycle

issues@carbondata.apache.org:
- 7 subscribers (up 1 in the last 3 months).
- 4951 emails sent in the past 3 months, 4266 in the previous cycle

user@carbondata.apache.org:
- 35 subscribers (up 29 in the last 3 months).
- 27 emails sent in the past 3 months, 0 in the previous cycle

## JIRA activity:
- 393 JIRA tickets created in the last 3 months
- 266 JIRA tickets closed/resolved in the last 3 months

17 May 2017 [Liang Chen / Rich]

Apache CarbonData is an indexed columnar store file format for fast
analytics on Big Data platforms (including Apache Hadoop and Apache Spark,
among others). It helps speed up queries by an order of magnitude over
petabytes of data, with the aim of using a unified file format to satisfy all
kinds of data analysis cases.

## Issues:
- There are no new issues requiring board attention at this time.

## Activity:
- Apache CarbonData graduated on April 19, 2017. Prepared press work to
announce the graduation, content work to remove incubation disclaimers,
and coordination with Infra for the necessary adjustments there.

- Nominated 1 committer and 1 PMC member on the private mailing list.
- Preparing Apache CarbonData 1.1.0, to be released in May.
- Jackyli and Jihongma presented Apache CarbonData at Spark Summit East on
  2017-02-08.
- Liang Chen presented Apache CarbonData at a Beijing big data meetup on
  2017-05-05.
- Liang Chen will present Apache CarbonData at ApacheCon on 2017-05-16.
- Jackylk will present Apache CarbonData at the China Spark Summit on
  2017-05-19.

## Health Report:
- The project is healthy; the community remains active in all the
various categories (dev mailing list, JIRAs, and pull requests).

## Releases:
- Apache CarbonData 1.0.0-incubating released on 2017-01-29
- Started the new release, Apache CarbonData 1.1.0; RC2 has been submitted for
  the PMC vote.

## PMC changes:
- Currently 8 PMC members. We are discussing a new PMC candidate.

## Committer base changes:
Currently 13 committers, two new committers added in the past quarter:
 - hexiaoqiao was added as a committer on 2017-02-21.
 - qiangcai was added as a committer on 2017-05-09.

## Mailing list activity:

dev@carbondata.apache.org:
- 141 subscribers (up 50 in the last 3 months).
- 654 emails sent to list in March, April, May (672 in previous quarter).

issues@carbondata.apache.org:
- 7 subscribers (up 1 in the last 3 months).
- 3552 emails sent to list in March, April, May (3964 in previous quarter).

## JIRA activity:
- 320 JIRA tickets created in March, April, May.
- 268 JIRA tickets closed/resolved in the last 3 months.

19 Apr 2017

Establish the Apache CarbonData Project

 WHEREAS, the Board of Directors deems it to be in the best
 interests of the Foundation and consistent with the
 Foundation's purpose to establish a Project Management
 Committee charged with the creation and maintenance of
 open-source software, for distribution at no charge to
 the public, related to an indexed columnar data format
 for fast analytics on big data platforms.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management
 Committee (PMC), to be known as the "Apache CarbonData Project",
 be and hereby is established pursuant to Bylaws of the
 Foundation; and be it further

 RESOLVED, that the Apache CarbonData Project be and hereby is
 responsible for the creation and maintenance of software
 related to an indexed columnar data format for fast analytics
 on big data platforms; and be it further

 RESOLVED, that the office of "Vice President, Apache CarbonData" be
 and hereby is created, the person holding such office to
 serve at the direction of the Board of Directors as the chair
 of the Apache CarbonData Project, and to have primary responsibility
 for management of the projects within the scope of
 responsibility of the Apache CarbonData Project; and be it further

 RESOLVED, that the persons listed immediately below be and
 hereby are appointed to serve as the initial members of the
 Apache CarbonData Project:

    * Liang Chen <chenliang613@apache.org>
    * Jean-Baptiste Onofré <jbonofre@apache.org>
    * Henry Saputra <hsaputra@apache.org>
    * Uma Maheswara Rao G <umamahesh@apache.org>
    * Jihong Ma <jihongma@apache.org>
    * Jacky Li <jackylk@apache.org>
    * Vimal Das Kammath <vimaldas@apache.org>
    * Heng Qiu <jarray888@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Liang Chen be
 appointed to the office of Vice President, Apache CarbonData, to
 serve in accordance with and subject to the direction of the
 Board of Directors and the Bylaws of the Foundation until
 death, resignation, retirement, removal or disqualification,
 or until a successor is appointed; and be it further

 RESOLVED, that the initial Apache CarbonData PMC be and hereby is
 tasked with the creation of a set of bylaws intended to
 encourage open development and increased participation in the
 Apache CarbonData Project; and be it further

 RESOLVED, that the Apache CarbonData Project be and hereby
 is tasked with the migration and rationalization of the Apache
 Incubator CarbonData podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Incubator CarbonData podling encumbered upon the Apache Incubator
 Project are hereafter discharged.

 Special Order 7E, Establish the Apache CarbonData Project, was
 approved by Unanimous Vote of the directors present.

16 Nov 2016

Apache CarbonData is a new Apache Hadoop native file format for faster
interactive query using advanced columnar storage, index, compression and
encoding techniques to improve computing efficiency, which in turn helps
speed up queries by an order of magnitude over petabytes of data.

CarbonData has been incubating since 2016-06-02.

Three most important issues to address in the move towards graduation:

 1. Prepare a couple of new releases
 2. Grow the communities
 3. Prepare website

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 The community activity increased: many new users started to use and test
 CarbonData, and more than 300 issues had been created up to Nov. Two new
 finance enterprises have formally deployed CarbonData to their business
 systems, and query performance sped up 10-70 times in comparison to the old
 systems (both are banks in China).

 We held the 2nd meetup in Beijing on 29th Oct, and CarbonData gained
 10+ contributors in the last month.

How has the project developed since the last report?

 Code donation has been done and all resources have been created by
 INFRA(git, github mirror, mailing list, Jira, ...).

 We also created the Jenkins CI jobs and are preparing the org website.

 We did the 2nd release (0.1.1-incubating) in Oct and we are preparing a new
 one (0.2.0) in Nov.

 We held 2 technical talks in the Bay Area with Databricks and Alluxio in the
 last month to discuss ecosystem integration with Spark and Alluxio.

Date of last release:

 2016-10-10

When were the last committers or PMC members elected?

 We elected a new committer Kumar Vishal on 2016-10-15.

Signed-off-by:

 [X](carbondata) Henry Saputra
 [X](carbondata) Jean-Baptiste Onofré
 [X](carbondata) Uma Maheswara Rao G

21 Sep 2016

Apache CarbonData is a new Apache Hadoop native file format for faster
interactive query using advanced columnar storage, index, compression and
encoding techniques to improve computing efficiency, which in turn helps
speed up queries by an order of magnitude over petabytes of data.

CarbonData has been incubating since 2016-06-02.

Three most important issues to address in the move towards graduation:

 1. Prepare new releases
 2. Increase both dev and user communities
 3. Publish and update website to promote CarbonData

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 None

How has the community developed since the last report?

 The community activity increased: many new users started to use and
 test CarbonData, and more than 100 new issues were created during Aug. One
 user formally deployed CarbonData to their business system, and
 query efficiency sped up 50 times (the user is the biggest internet
 finance company in China).

How has the project developed since the last report?

 Code donation has been done and all resources have been created by INFRA
 (git, github mirror, mailing list, Jira, ...).

 We also created the Jenkins CI jobs.

 We did the first release (0.1.0-incubating) and we are preparing a new one.

 We're also working on the website (with an improved look and feel).

 We're also preparing some talks and presentations about CarbonData.

Date of last release:

 2016-08-27

When were the last committers or PMC members elected?

 2016-07-15

Signed-off-by:

 [X](carbondata) Henry Saputra
 [X](carbondata) Jean-Baptiste Onofre
 [X](carbondata) Uma Maheswara Rao G

17 Aug 2016

Apache CarbonData is a new Apache Hadoop native file format for faster
interactive query using advanced columnar storage, index, compression and
encoding techniques to improve computing efficiency, which in turn helps
speed up queries by an order of magnitude over petabytes of data.

CarbonData has been incubating since 2016-06-02.

Three most important issues to address in the move towards graduation:

 1. Prepare first CarbonData release
 2. Prepare first CarbonData website
 3. Promote the project and grow user and dev community

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 None

How has the community developed since the last report?

 We voted two new committers: Venkata Ramana and Ravindra Pesala.

 We are preparing the CarbonData website, blog posts and talks to promote
 CarbonData and grow the user and dev communities.

 The community activity increased: many new users started to use and test
 CarbonData, we had more than 100 new issues created during July.

 In addition, presentation material has been created and a first talk
 has been given:

 http://www.slideshare.net/liangchen18/apache-carbondatanew-high-performance-data-format-for-faster-data-analysis

How has the project developed since the last report?

 Code donation has been done and all resources have been created by INFRA
 (git, github mirror, mailing list, Jira, ...).

 We also created the Jenkins CI jobs.

 We are now cleaning up and polishing the build and legal aspects to
 prepare the first release.

Date of last release:

 Not yet available

When were the last committers or PMC members elected?

 2016-07-15

Signed-off-by:

 [ ](carbondata) Henry Saputra
 [X](carbondata) Jean-Baptiste Onofre
 [X](carbondata) Uma Maheswara Rao G

20 Jul 2016

Apache CarbonData is a new Apache Hadoop native file format for faster
interactive query using advanced columnar storage, index, compression and
encoding techniques to improve computing efficiency, which in turn helps
speed up queries by an order of magnitude over petabytes of data.

CarbonData has been incubating since 2016-06-02.

Three most important issues to address in the move towards graduation:

 1. Finalize code cleanup and code
 2. Prepare releases
 3. Grow the community

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 No

How has the community developed since the last report?

 It's the first CarbonData report

How has the project developed since the last report?

 We created the resources (git, github integration, Jira, ...).
 The code donation has been done, and the first PR merged.

 We are in the process of website creation (CWIKI requested) and creating
 the Jenkins CI jobs.

Date of last release:

 XXXX-XX-XX

When were the last committers or PMC members elected?

 N/A

Signed-off-by:

 [X](carbondata) Henry Saputra
 [X](carbondata) Jean-Baptiste Onofre
 [ ](carbondata) Uma Maheswara Rao G

Shepherd/Mentor notes:

 Jean-Baptiste Onofre:

   This report is the first CarbonData report. The activity is focused on
   resource creation.