The Apache Software Foundation

This was extracted (@ 2021-05-12 21:10) from a list of minutes which have been approved by the Board.
Please note: the Board typically approves the minutes of the previous meeting at the beginning of every Board meeting; therefore, the list below does not normally contain details from the minutes of the most recent Board meeting.

Meeting times vary; the exact schedule is available to ASF Members and Officers, who can search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).

DataSketches

17 Mar 2021 [Lee Rhodes / Craig]

## Description:
The mission of Apache DataSketches is the creation and maintenance of software
related to an open source, high-performance library of streaming algorithms
commonly called "sketches" in the data sciences. Sketches are small, stateful
programs that process massive data as a stream and can provide approximate
answers, with mathematical guarantees, to computationally difficult queries
orders-of-magnitude faster than traditional, exact methods.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache DataSketches was founded 2020-12-15 (3 months ago)
There are currently 15 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is 5:4.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- Charlie Dickens was added as committer on 2020-12-18

## Project Activity:
DataSketches-java (java core) 2.0.0 was released 2021-02-22.
DataSketches-cpp (C++ core) 3.0.0 will be released week of 2021-03-08.

## Community Health:
Health is good. We are getting new sources of contribution. For example,
Prof. Braverman at Johns Hopkins wants to contribute to our library.

17 Feb 2021 [Lee Rhodes / Justin]

## Description:
The mission of Apache DataSketches is the creation and maintenance of software
related to an open source, high-performance library of streaming algorithms
commonly called "sketches" in the data sciences. Sketches are small, stateful
programs that process massive data as a stream and can provide approximate
answers, with mathematical guarantees, to computationally difficult queries
orders-of-magnitude faster than traditional, exact methods.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache DataSketches was founded 2020-12-15 (2 months ago)
There are currently 15 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is 5:4.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- Charlie Dickens was added as committer on 2020-12-18

## Project Activity:
We have completed the transition from podling to TLP.
DataSketches-memory was released Jan 22nd.
A DataSketches-java (Java core) release is expected in the next week.
The ASF Press-Release graduation announcement was Feb 3rd.

## Community Health:
Health is good. We are continuing to get new inquiries about our project.
For example, we were asked to do a comparison of BlinkDB to DataSketches.

20 Jan 2021 [Lee Rhodes / Justin]

## Description:
The mission of Apache DataSketches is the creation and maintenance of software
related to an open source, high-performance library of streaming algorithms
commonly called "sketches" in the data sciences. Sketches are small, stateful
programs that process massive data as a stream and can provide approximate
answers, with mathematical guarantees, to computationally difficult queries
orders-of-magnitude faster than traditional, exact methods.

## Issues:
There are no issues requiring board attention.

## Membership Data:
Apache DataSketches was founded 2020-12-15 (a month ago)
There are currently 15 committers and 12 PMC members in this project.
The Committer-to-PMC ratio is 5:4.

Community changes, past quarter:
- No new PMC members (project graduated recently).
- Charlie Dickens was added as committer on 2020-12-18

## Project Activity:
Over the past month (since graduation) we have been busy with the transition.
With the holidays, we have had only two weeks to work on the transition.
Nonetheless, as of this writing, we are about 95% complete. We have a number
of releases to do, which will be a strong test that we have all the pieces
in the right place.

Our last release was our C++/Python core on Sep 22, 2020.
We plan for a new release of Java Memory this month with a new release of
our Java core shortly thereafter.

## Community Health:
We suspect that some of the decrease in traffic on dev@ and users@ may be due
to the holidays. Also, much of our code has been very stable in its quality,
which is a good thing. We will be introducing some new sketches soon, which
will indubitably have concomitant traffic.

16 Dec 2020

Establish the Apache DataSketches Project

 WHEREAS, the Board of Directors deems it to be in the best interests of
 the Foundation and consistent with the Foundation's purpose to establish
 a Project Management Committee charged with the creation and maintenance
 of open-source software, for distribution at no charge to the public,
 related to an open source, high-performance library of streaming
 algorithms commonly called "sketches" in the data sciences. Sketches
 are small, stateful programs that process massive data as a stream and
 can provide approximate answers, with mathematical guarantees, to
 computationally difficult queries orders-of-magnitude faster than
 traditional, exact methods.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
 (PMC), to be known as the "Apache DataSketches Project", be and hereby
 is established pursuant to Bylaws of the Foundation; and be it further

 RESOLVED, that the Apache DataSketches Project be and hereby is responsible for
 the creation and maintenance of software related to an open source,
 high-performance library of streaming algorithms commonly called
 "sketches" in the data sciences. Sketches are small, stateful programs
 that process massive data as a stream and can provide approximate
 answers, with mathematical guarantees, to computationally difficult
 queries orders-of-magnitude faster than traditional, exact methods; and
 be it further

 RESOLVED, that the office of "Vice President, Apache DataSketches" be and
 hereby is created, the person holding such office to serve at the
 direction of the Board of Directors as the chair of the Apache
 DataSketches Project, and to have primary responsibility for management
 of the projects within the scope of responsibility of the Apache
 DataSketches Project; and be it further

 RESOLVED, that the persons listed immediately below be and hereby are
 appointed to serve as the initial members of the Apache DataSketches
 Project:

 * Alexander Saydakov <alsay@apache.org>
 * Dave Fisher <wave@apache.org>
 * Edo Liberty <edo@apache.org>
 * Eshcar Hillel <eshcar@apache.org>
 * Evans Ye <evansye@apache.org>
 * Furkan Kamaci <kamaci@apache.org>
 * Jon Malkin <jmalkin@apache.org>
 * Justin Thaler <jthaler@apache.org>
 * Kenneth Knowles <kenn@apache.org>
 * Lee Rhodes <leerho@apache.org>
 * Liang Chen <chenliang613@apache.org>
 * Roman Leventov <leventov@apache.org>

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Lee Rhodes be appointed to
 the office of Vice President, Apache DataSketches, to serve in accordance
 with and subject to the direction of the Board of Directors and the Bylaws
 of the Foundation until death, resignation, retirement, removal or
 disqualification, or until a successor is appointed.

 RESOLVED, that the Apache DataSketches Project be and hereby is tasked
 with the migration and rationalization of the Apache Incubator
 DataSketches podling; and be it further

 RESOLVED, that all responsibilities pertaining to the Apache Incubator
 DataSketches podling encumbered upon the Apache Incubator PMC are
 hereafter discharged.

 Special Order 7D, Establish the Apache DataSketches Project,
 was approved by Unanimous Vote of the directors present.

18 Nov 2020

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Adding more committers. We added one last quarter and we have a
    few more individuals that we have been considering.
 2. We have created a draft Maturity model, which is undergoing review.
 3. Prepare for Graduation. We have a Graduation checklist that we are
    going through.

### Are there any issues that the IPMC or ASF Board need to be aware of?

 No.

### How has the community developed since the last report?
 Public presentations since last report:

 - ACM-KDD conference in August.
 - DataCon2020 in Taiwan in September.
 - ApacheCon 2020 in September.

 We are seeing increased interest from scientific communities that
 work with big data and platforms that want to use our code
 (e.g. Apache Impala).

### How has the project developed since the last report?

 We released a new minor release of C++: 2.1.0.

 Based on feedback from our community, we are developing a Docker
 deployable version of our library, which hopefully will be
 released soon.

 We are working on a brand new sketch as part of the Quantiles family.

 To the best of our knowledge all of our licensing and website issues
 have been addressed and have been implemented in formal releases or
 are in master-branch staging, awaiting the next release.

 We are continuing to respond to new users' requests for help.

### How would you assess the podling's maturity?
 Please feel free to add your own commentary.

 - [ ] Initial setup
 - [ ] Working towards first release
 - [X] Community building
 - [X] Nearing graduation
 - [ ] Other:

### Date of last release:

 - 2020-06-19 incubating-datasketches-cpp  2.1.0

### When were the last committers or PPMC members elected?

 - 2020-08-17 (LDAP create date)

### Have your mentors been helpful and responsive?

 Generally our mentors have been very helpful. However, a
 little more help from our mentors on timely approval of
 our releases would be appreciated. Our last release took
 18 days to get 3 IPMC members to vote. We don't know what
 is typical, but this seems a bit long. Please advise.

### Is the PPMC managing the podling's brand / trademarks?

 To the best of our knowledge, yes.

 * Are 3rd parties respecting and correctly using the podling's
  name and brand?

  As far as we know, yes.

 * If not, what actions has the PPMC taken to correct this?

   We have not had to face this issue yet.

 * Has the VP, Brand approved the project name?

  Yes, and it is clearly stated as such on
  http://incubator.apache.org/projects/datasketches.html

### Signed-off-by:

 - [ ] (datasketches) Liang Chen
    Comments:
 - [ ] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:
 - [X] (datasketches) Evans Ye
    Comments:
 - [X] (datasketches) Dave Fisher
    Comments:  I think that DataSketches will be ready to graduate at the
      December Board meeting.

### IPMC/Shepherd notes:

19 Aug 2020

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Adding more committers. We have just added our first new committer
    since incubation! We have a few more individuals that have been
    consistent contributors to the project that we will soon want to
    go through the new committer election process. This is a big change
    from our last report where we had no candidates at all.
 2. Fill out the Maturity Model
 3. Prepare for Graduation.

### Are there any issues that the IPMC or ASF Board need to be aware of?

 We could use some help in finding people who would find working in
 the sketching algorithms area really interesting and would want
 to work with us to become committers.

### How has the community developed since the last report?

 The word is getting out! We presented talks at the USPTO 2020 tech
 conference and the Spark & AI 2020 conference, mentioned in the last
 report, with lots of good feedback.

 We will be co-authors in a tutorial on sketching technology at the
 upcoming ACM-KDD conference in August with one of the world's
 leading scientists in streaming algorithms and sketching.

 We have been invited to give a keynote talk at the upcoming
 DataCon2020 in Taiwan in early September.

 We have been accepted for a talk at ApacheCon again this year.

 We also are seeing a big increase in the number of single PRs coming
 from a number of different people, especially for our C++ components,
 which is very good news. This proves that there is growing
 interest in the project and there are folks out there that want to
 contribute to the project.

### How has the project developed since the last report?

 See the releases since the last report below.

 In addition we have made significant improvements to our website
 thanks to some external contributors!

 To the best of our knowledge all of our licensing and website issues
 have been addressed and have been implemented in formal releases or
 are in master-branch staging, awaiting the next release.

### How would you assess the podling's maturity?
 Please feel free to add your own commentary.

 - [ ] Initial setup
 - [ ] Working towards first release
 - [X] Community building
 - [X] Nearing graduation
 - [ ] Other:

### Date of last release:

 - 2020-07-06 incubating-datasketches-hive 1.1.0
 - 2020-06-19 incubating-datasketches-cpp  2.0.0
 - 2020-05-07 incubating-datasketches-java 1.3.0

### When were the last committers or PPMC members elected?

 August, 2020

### Have your mentors been helpful and responsive?

 Yes, in general. However, we do have to prod them with reminders
 to check-off our releases. Our releases have been taking
 longer and longer to get through the voting process especially
 when it is in the 2nd IPMC phase. A little help here would
 be appreciated.

### Is the PPMC managing the podling's brand / trademarks?

 To the best of our knowledge, yes.

 * Are 3rd parties respecting and correctly using the podling's
   name and brand?

   As far as we know, yes.

 * If not, what actions has the PPMC taken to correct this?

   We have not had to face this issue yet.

 * Has the VP, Brand approved the project name?

   Yes, and it is clearly stated as such on
   http://incubator.apache.org/projects/datasketches.html

### Signed-off-by:

 - [X] (datasketches) Liang Chen
    Comments:
 - [X] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:
 - [X] (datasketches) Dave Fisher
    Comments:
 - [X] (datasketches) Evans Ye
    Comments:

### IPMC/Shepherd notes:

20 May 2020

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a
stream and can provide approximate answers, with mathematical
guarantees, to computationally difficult queries orders-of-magnitude
faster than traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Clearly, the most important issue for us is to add more committers.
    From the Clutch and Podling Website reports, this is the last
    major issue for us.

    We have tried to encourage folks that ask questions or raise issues
    to get more involved, and we have one or two folks that have
    expressed interest in submitting PRs or even a new sketch. But,
    alas, none have followed through, yet.

    Developing sketch code is very tricky and understanding how these
    algorithms work, and the math and statistics behind them, is a hurdle
    for most people. Yet, we have been very clear that we are prepared to
    train someone to become a committer.  All we ask is that the
    candidate be open to learning about these fascinating algorithms and
    committed to work with us. We could use some active help from our
    Mentors or from the Board to help us find someone that would find
    this work interesting.

    I am convinced that there are folks in the greater Apache community
    that would really enjoy working on this library, we just need to
    discover who they are!

 2. Referring to last month's report, we have made progress in setting up
    TODO lists on our major sites: Java and C++. And we keep working
    away at these lists.  We have also improved our Downloads page and
    brought it up to Apache standards. I don't feel these should be
    issues for graduation.

### Are there any issues that the IPMC or ASF Board need to be aware of?

 The issue mentioned above. We could use some help in finding someone
 who would find working in the sketching algorithms area really
 interesting and would want to work with us to become a committer.

### How has the community developed since the last report?

 We have been accepted to present at two conferences this Summer, the
 USPTO technology conference and the Spark & AI conference.

 We also have interest from Apache Flink and Apache Impala to
 integrate sketches into their systems. There has also been interest
 from Apache Beam, but so far no action.

### How has the project developed since the last report?

 We have done a lot of work making the C++ code more robust and will
 likely have a major new release of the C++ library before this
 report is read by the Board.  We are also in the voting process for a
 new Java release that cleans up some licensing glitches and fixes
 a bug found by Druid.

 Our activity on Slack has increased quite a bit with
 interesting queries from all over.

 We also have done a lot of work on the website, adding content and
 improving navigation. The Community and Downloads pages are all new.
 Please have a look!

 We continue to improve our release process with more guided scripts
 and fix issues as we discover them.

### How would you assess the podling's maturity?
 Please feel free to add your own commentary.

 - [ ] Initial setup
 - [ ] Working towards first release
 - [X] Community building -- this is a continuous, on-going effort
 - [X] Nearing graduation
 - [ ] Other:

### Date of last release:

 * 2020-01-26 Java release 1.2.0-incubating.
 * The Java 1.3.0-incubating release will be out before the Board
   meeting.
 * A new C++ 2.0.0-incubating release may be out before
   the Board meeting.

### When were the last committers or PPMC members elected?

 No new committers since April, 2019.

### Have your mentors been helpful and responsive?

 Yes. No open issues.

### Is the PPMC managing the podling's brand / trademarks?

  To the best of our knowledge, yes.

 * Are 3rd parties respecting and correctly using the podling's name and
  brand?

  As far as we know, yes.

 * If not, what actions has the PPMC taken to correct this?

  We have not had to face this issue yet.

 * Has the VP, Brand approved the project name?

  Yes, and it is clearly stated as such on
  http://incubator.apache.org/projects/datasketches.html

### Signed-off-by:

 - [X] (datasketches) Liang Chen
    Comments:
 - [ ] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:
 - [X] (datasketches) Dave Fisher
    Comments:
 - [X] (datasketches) Evans Ye
    Comments:

### IPMC/Shepherd notes:
 Justin Mclean: Perhaps one way of attracting more interest is to have more
 conversation on the mailing list?

19 Feb 2020

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Be more communicative and document our code changes more clearly.
 2. We need to have more substantive discussions on dev@ especially about
    our growing TODO list and how we plan to address them -- create a
    roadmap as a guide for others to contribute.
 3. Find / Attract new code committers outside Yahoo!

### Are there any issues that the IPMC or ASF Board need to be aware of?
 No

### How has the community developed since the last report?
 We are presenting at more conferences which has attracted some interest.
 We are definitely getting more traffic on our forum, GitHub issues
 and email lists.  We recently added two channels on the-asf@slack:
 #datasketches and #datasketches-dev. The traffic has been fairly low on
 Slack as well as the forum. We could do more to publicize the slack
 channels.  I could be optimistic and believe the low traffic is due to
 the holidays -- or that the code just works :)

 Nonetheless, the download traffic measured by repository.a.o has grown
 exponentially since our first Apache release on Sep 23. We are over 1000
 unique IPs/month and had a recent high of 22K downloads/month.  Bear in
 mind that this is all traffic that has migrated from the older, pre-Apache
 artifacts at com.yahoo.datasketches and is already higher than our peak
 downloads prior to Apache. These numbers also do not reflect any downloads
 of our Zip artifacts from a.o./dist (which includes our C++ artifacts) or
 other external download repositories (for example, specific to PostgreSQL).

### How has the project developed since the last report?
 Our releases are becoming easier, more polished and routine.
 Nonetheless, our website needs a lot of work (as mentioned above) and
 this will become our focus for the next month or so.

### How would you assess the podling's maturity?
Please feel free to add your own commentary.

 - [ ] Initial setup
 - [ ] Working towards first release
 - [X] Community building
 - [ ] Nearing graduation
 - [ ] Other:

### Date of last release:
 These are the major components and their last release dates:

 * DataSketches-Java       2020-01-26
 * DataSketches-Memory     2019-11-21
 * DataSketches-CPP        2019-09-17
 * DataSketches-Hive       2019-10-11
 * DataSketches-Pig        2019-10-18
 * DataSketches-Postgresql 2019-10-29

### When were the last committers or PPMC members elected?
 No new committers since April, 2019.

### Have your mentors been helpful and responsive?
 Yes.
 No open issues.

### Is the PPMC managing the podling's brand / trademarks?
 To the best of our knowledge, yes.

 * Are 3rd parties respecting and correctly using the podling's name and
   brand?
   As far as we know, yes.

 * If not, what actions has the PPMC taken to correct this?
   We have not had to face this issue yet.

 * Has the VP, Brand approved the project name?
   Yes, and it is clearly stated as such on
   http://incubator.apache.org/projects/datasketches.html

### Signed-off-by:

 - [X] (datasketches) Liang Chen
    Comments:
 - [X] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:
 - [X] (datasketches) Dave Fisher
    Comments:
 - [X] (datasketches) Evans Ye
    Comments:

### IPMC/Shepherd notes:

20 Nov 2019

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Finish the transfer and bring-up of our website to
    github.com/apache/...  This is now in process.
 2. __Team Interactions:__ We want to have our exchanges on the ASF
    Slack DataSketches-dev channel posted to our dev@datasketches.a.o
    list on a daily basis for improved visibility and searchability.
    We have an open INFRA ticket on this issue.
    We are searching for a solution to provide more open access to
    our video conference sessions when we have them. We are in the
    process of moving more of our interactions into the slack
    DS-dev channel and dev@ list. This is a culture change for us
    and will take some getting used to. We clearly want open
    access to our team discussions.
 3. We would like to see a few more folks
    join our contributors list.  We have several folks that
    have come forward and offered help because they are interested
    in the project.  This is great.  It is our hope that they will
    grow into active contributors.

### Are there any issues that the IPMC or ASF Board need to be aware of?
 None

### How has the community developed since the last report?
 * We have added 1 new Mentor, Dave Fisher (thank you!) to our project
   and we have been approached by another Apache member
   who would also like to be a mentor, and eventually a contributor
   as well. This is very positive!

### How has the project developed since the last report?

 * We have now managed 7 releases,  6 Java releases and 1 C++ release.
   We have one more C++ release pending.  These are across 6 different
   components of the DataSketches library.  With the last pending C++
   release, all of the code components targeted for release will
   be complete.

### How would you assess the podling's maturity?
Please feel free to add your own commentary.

 - [ ] Initial setup
 - [ ] Working towards first release
 - [X] Community building
 - [ ] Nearing graduation
 - [ ] Other:

### Date of last release:

 2019-10-19  01:55 GMT DataSketches-pig

### When were the last committers or PPMC members elected?
 * Dave Fisher: 16 Sep 2019

### Have your mentors been helpful and responsive?
 * Helpful and responsive, Yes.
   Having additional mentors has helped the voting
   move forward more expeditiously!
 * I want to thank Dave Fisher for jumping in and helping us
   with a number of issues!


### Signed-off-by:

 - [X] (datasketches) Liang Chen
    Comments:
 - [x] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:
 - [X] (datasketches) Dave Fisher
    Comments:

### IPMC/Shepherd notes:

21 Aug 2019

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Our vote letter on general@ had no responses from anyone (not just
    IPMC members) for the first 73 hours. After sending a pleading
    reminder email I finally got 3 +1 binding votes. I'm trying to be
    polite and not needle folks, but I need guidance on how to get IPMC
    members' attention. I realize the vote must stay open for at least
    72 hours, but having to wait until the last minute to get any response
    is very aggravating. Would it be fair to send out reminder notices on
    24 hour intervals?
 2. Continue to perfect the release process.
 3. After we get this first release, we need to finish migrating the
    remaining repos.

### Are there any issues that the IPMC or ASF Board need to be aware of?
 1. Yes. In addition to #1 above, not all of our Mentors have been
    involved. Why do Mentors sign up if they do not or cannot mentor?

### How has the community developed since the last report?
 Not too much at the committer level. We have drawn the
 interest of a few new scientists in our work, but they did not
 learn of our work from Apache.
 It is still very early.  I am speaking at ApacheCon in September;
 hopefully we can attract some interest there.
 I am hoping to attract some committers.

### How has the project developed since the last report?
 We continue to evolve the project and make commits to the code base.
 We are also heavily integrated into the Druid platform.

### How would you assess the podling's maturity?
Please feel free to add your own commentary.

 - [X] Initial setup
 - [X] Working towards next release
 - [ ] Community building
 - [ ] Nearing graduation
 - [ ] Other:

### Date of last release:

 2019-08-02 Our first release of our first component!
 Thanks to: Kenneth Knowles, Furkan Kamaci, Paul King and
 Justin Mclean for their help.

### When were the last committers or PPMC members elected?
 When we entered incubation.

### Have your mentors been helpful and responsive?
 Two (of 3) of our Mentors have been responsive, except when they were
 unavailable (vacation, work, etc.).

### Signed-off-by:

 - [X] (datasketches) Liang Chen
    Comments:
 - [ ] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:

### IPMC/Shepherd notes:
 Justin Mclean: 72 hours is a minimum and a podling may not attract
 all needed votes in that time. I understand it may be frustrating
 but remember IPMC members are volunteers and mostly do this work
 unpaid in their spare time. If you need more Mentors just ask on
 the incubator general list.

17 Jul 2019

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Complete a successful 1st snapshot release of Memory repo to DIST and
 Nexus. This is a blocking issue.
 2. Finish refactoring/snapshot releasing the other repos, which depend on
 #1.
 3. Move, refactor Website.

### Are there any issues that the IPMC or ASF Board need to be aware of?
 For the IPMC:
 As a newbie podling, my experience so far has been exasperating. Finding
 how to accomplish key tasks is difficult.  The information is spread all
 over and the essential details of how to actually
 accomplish tasks are often missing.

 I have run into multiple roadblocks, especially with regards to
 permissions. I have to keep filing new tickets with INFRA to setup access
 to infrastructure and they reply that the Mentors need to do this. When
 I ask on general@incubator, the replies I get suggest I need to file
 tickets with INFRA. So I am  confused.

### How has the community developed since the last report?
 Not much. I wish I could spend more time on this, but I need to get
 the migration done.

### How has the project developed since the last report?
 We continue to evolve the project's functionality with commits to our
 GitHub repos.

### How would you assess the podling's maturity?
Please feel free to add your own commentary.

 - [x] Initial setup
 - [x] Working towards first release
 - [ ] Community building
 - [ ] Nearing graduation
 - [ ] Other:

### Date of last release:

 No releases yet.

### When were the last committers or PPMC members elected?
 At the initial incubation date.

### Have your mentors been helpful and responsive?

 1. I have opened INFRA issues that have not yet been addressed and there
 will be more to come.
 2. I could REALLY use some 1:1 help from an experienced release engineer
 (perhaps from another project) who is very familiar with the Apache/Maven
 release process and POM to get us off the ground.
 Once we have created our first release, we can continue from there. But
 getting this first one out is turning out to be quite a challenge.
 I don't think we need more than an hour with an experienced Apache
 release engineer; our project just isn't that complicated.
 3. I haven't heard from any of the mentors for the last week or so;
 perhaps they are all on vacation.

### Signed-off-by:

 - [ ] (datasketches) Liang Chen
    Comments:
 - [X] (datasketches) Kenneth Knowles
    Comments:
 - [ ] (datasketches) Furkan Kamaci
    Comments:

### IPMC/Shepherd notes:
 Justin Mclean: Please ask your mentors for help; they can set up most
 things or direct you to where you can get help. If your mentors can't help
 then ask on the incubator general list.

19 Jun 2019

 DataSketches is an open source, high-performance library of stochastic
 streaming algorithms commonly called "sketches" in the data sciences.
 Sketches are small, stateful programs that process massive data as a stream
 and can provide approximate answers, with mathematical guarantees, to
 computationally difficult queries orders-of-magnitude faster than
 traditional, exact methods.

 DataSketches has been incubating since 2019-03-30.

### Three most important unfinished issues to address before graduating:

 1. Finish code migration
 2. Set up automated builds
 3. Establish code review practices

### Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?

 No

### How has the community developed since the last report?

 We are still in the process of setting up permissions and figuring out
 the Apache environment.

### How has the project developed since the last report?

 Most DataSketches repos have been moved to Apache repos.

### How would you assess the podling's maturity?
Please feel free to add your own commentary.

 - [X] Initial setup
 - [ ] Working towards first release
 - [ ] Community building
 - [ ] Nearing graduation
 - [ ] Other:

### Date of last release:

 No releases yet

### When were the last committers or PPMC members elected?

 We have just signed up our initial committers

### Have your mentors been helpful?

 Yes, very helpful.

### Signed-off-by:

 - [ ] (datasketches) Liang Chen
    Comments:
 - [X] (datasketches) Kenneth Knowles
    Comments:
 - [X] (datasketches) Furkan Kamaci
    Comments:

### IPMC/Shepherd notes:

15 May 2019

DataSketches is an open source, high-performance library of stochastic
streaming algorithms commonly called "sketches" in the data sciences.
Sketches are small, stateful programs that process massive data as a stream
and can provide approximate answers, with mathematical guarantees, to
computationally difficult queries orders-of-magnitude faster than
traditional, exact methods.

DataSketches has been incubating since 2019-03-30.

Three most important unfinished issues to address before graduating:

 1. Finish IP Assignments
 2. Code Migration
 3. Perform a Release

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 No

How has the community developed since the last report?

 We have the key committers signed up. We are all learning how
 to navigate in the Apache environment and how to find things.

How has the project developed since the last report?

 This is our first report.

How would you assess the podling's maturity?
Please feel free to add your own commentary.

 [X] Initial setup
 [ ] Working towards first release
 [ ] Community building
 [ ] Nearing graduation
 [ ] Other:

Date of last release:

 Our DataSketches.GitHub.io site remains active: we are still publishing
 new code and releases from it. For example, our latest release of
 sketches-core was yesterday, 25 April 2019.

 We are a long way from being able to release from the migrated
 Apache code base as it doesn't yet exist.

When were the last committers or PPMC members elected?

 We have just signed up our initial list of committers.

Have your mentors been helpful and responsive or are things falling
through the cracks? In the latter case, please list any open issues
that need to be addressed.

 Kenneth Knowles has been extremely helpful! Thank you!

Signed-off-by:

 [X] (datasketches) Liang Chen
 Comments:
 [X] (datasketches) Kenneth Knowles
 Comments: Initial set up has been a bit slow; that's on me
 [X] (datasketches) Furkan Kamaci
 Comments:

IPMC/Shepherd notes: