This was extracted (@ 2024-09-21 23:10) from a list of minutes
which have been approved by the Board.
Please Note
The Board typically approves the minutes of the previous meeting at the
beginning of every Board meeting; therefore, the list below does not
normally contain details from the minutes of the most recent Board meeting.
WARNING: these pages may omit some original contents of the minutes.
Meeting times vary; the exact schedule is available to ASF Members and Officers. Search for "calendar" in the Foundation's private index page (svn:foundation/private-index.html).
IPMC/Shepherd notes: johndament: Podling is expecting to retire soon.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, Flink, and Storm. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? One new MRQL committer was accepted since our last report. He developed the new evaluation mode for MRQL that runs on top of Apache Storm. How has the project developed since the last report? There was very little activity on JIRA since our last report. The resolved issues on JIRA included various bug fixes and performance improvements. How would you assess the podling's maturity? Please feel free to add your own commentary. [ ] Initial setup [ ] Working towards first release [X] Community building [ ] Nearing graduation [ ] Other: Date of last release: 2016-03-02 When were the last committers or PPMC members elected? 2016-12-22 Signed-off-by: [X](mrql) Alan Cabrera Comments: [ ](mrql) Edward J. Yoon Comments: [ ](mrql) Mohammad Nour El-Din Comments:
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? There was one new developer who developed the new evaluation mode for MRQL that runs on top of Apache Storm. There were no new committers since our last report. How has the project developed since the last report? A new evaluation mode for MRQL queries was developed based on Apache Storm (using the Trident API). MRQL can now process stream queries on a single stream using Storm, but will soon be extended to support multiple streams. Date of last release: 2016-03-02 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Edward J. Yoon [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: Alan Cabrera: Not a huge amount of activity but at least there's a small consistent trickle. I think that the fact that someone was interested in adding an Apache Storm adapter speaks somewhat to the viability of the project.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? No How has the community developed since the last report? There were no new developers or new committers since our last report. How has the project developed since the last report? There were 8 JIRA issues reported during this period, of which 7 have already been resolved. The main work during this period was on algebra-based and source-level debugging of MRQL queries. Basically, results from an MRQL query are annotated with lineage information, which links each result value with the input data that were used to calculate the value. Given a result value, our debugger can not only print the input values that contributed to the result value, but can also display the detailed workflow that was used to calculate this value (how-to provenance). In contrast to other data-centric debuggers, such as Titian for Spark, our debugger supports on-line browsing and searching using GUIs and can work on all MRQL supported platforms (map-reduce, Spark, Flink, and Hama). It also allows trace points to be inserted in the query source to debug the query at the source level. We hope that this debugger will be a valuable tool for developing and understanding MRQL queries. Date of last release: 2016-03-02 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [X](mrql) Edward J. Yoon [ ](mrql) Mohammad Nour El-Din
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? There were no new developers or new committers since our last report. How has the project developed since the last report? We had a new release on March 2nd. It included various bug fixes and performance improvements. The most important new feature in the new release is support for incremental query processing in MRQL streaming, called Incremental MRQL. There was very little activity on JIRA after the release (only 4 JIRA issues were reported and fixed). Date of last release: 2016-03-02 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [ ](mrql) Alan Cabrera [x](mrql) Edward J. Yoon [ ](mrql) Mohammad Nour El-Din
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? There were no new developers or new committers since our last report. How has the project developed since the last report? We have released our fourth release under Apache incubation. There were various bug fixes and performance improvements before the release. The most important new feature in the new release is support for incremental query processing in MRQL streaming, called Incremental MRQL. Date of last release: 2016-03-02 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [ ](mrql) Alan Cabrera [x](mrql) Edward J. Yoon [ ](mrql) Mohammad Nour El-Din
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? There were no new developers or new committers since our last report. There are 3 new UTA students who are planning to contribute to the MRQL projects as part of their Master's thesis in Spring'16, but they have not participated in the MRQL community yet. We hope to get more contributions/committers from the Flink and Spark communities to help us fine-tune the MRQL evaluation engine on these platforms. How has the project developed since the last report? Very little activity on JIRA since the last report (only 2 issues reported that were both fixed). One of the JIRA issues was related to a major extension to MRQL to support incremental query processing (to convert any stream-based MRQL query to an incremental query that merges the previous query results with the results of applying the query to the new data batches only). This was based on an idea presented at ApacheCon'15 as a future plan for MRQL. It works on Spark Streaming mode for now, but it will soon support Flink Streaming too. Date of last release: 2015-02-25 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [ ](mrql) Alan Cabrera [X](mrql) Edward J. Yoon [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes:
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? There were no new developers or new committers since our last report, but the project got contributions from a non-committer (from the Flink community). How has the project developed since the last report? During the last three months, 5 JIRA issues were reported, of which 4 were fixed. Most of these issues were related to MRQL query evaluation in Flink and Hama modes running on Yarn. Date of last release: 2015-02-25 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [x](mrql) Alan Cabrera [x](mrql) Edward J. Yoon [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: Alan Cabrera (adc): Things seem to be chugging along nicely. The community is not the size of Apache Hadoop, but it's moving in the right direction.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? None How has the community developed since the last report? There were no new developers or new committers since our last report. We presented our project at ApacheCon NA in April. The talk was well-attended and some developers from other Apache projects expressed interest in collaboration, which resulted in discussions on the dev@mrql list and new JIRA issues. How has the project developed since the last report? We have created a GitHub mirror and synchronized it with the existing Apache git repository and with JIRA. The GitHub mirror makes it easier to merge git branches and provides better integration with JIRA. We have added more features to MRQL Streaming (such as new input stream formats) and have fixed various bugs. Date of last release: 2015-02-25 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: There are members of MRQL who are enthusiastic about keeping the project around and have taken steps to attract new committers to the project.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? There were no new developers or new committers since our last report. We are scheduled to present our project at ApacheCon North America in April. We hope that this presentation will attract new developers and users to our project and will open new opportunities for collaboration with other Apache projects. How has the project developed since the last report? We have released our third release under Apache incubation. We have introduced a major extension to MRQL, called MRQL streaming, that supports continuous MRQL queries on streams of data. It currently works on Spark Streaming but we are planning to add support for Flink Streaming in the future. Date of last release: 2015-02-25 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: Drew Farris (drew): One active mentor this month. There was some delay in obtaining sufficient PPMC/IPMC votes on this month's release. The project currently appears to have a single active developer, however I concur with the comments on potential community growth as indicated in the report. It looks like things are headed in the appropriate direction to attract new users and developers. Congratulations on the release.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Increase the number of active committers 2. Increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? The project is not making much progress in increasing the number of active committers and expanding the user community. This is already being discussed within the project community, and the main active committer has asked that the project be given a few more months, as he will try to spread the word about the project through some public events. How has the community developed since the last report? None How has the project developed since the last report? During the last three months, 12 JIRA issues were reported, of which 10 were fixed. Most of these issues were related to MRQL query evaluation in Flink mode. Finally, we have set up MRQL on ASF Jenkins, which is a CI server that continuously checks the integrity of our builds and validates our tests. Date of last release: 2014-06-26 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Alex Karasulu [X](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: Konstantin Boudnik (cos): Very light email list traffic compared to the previous month. Mentors are active.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. increase the number of active committers 2. increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? There were no new developers or new committers since our last report. How has the project developed since the last report? During the last three months, 12 JIRA issues were reported, of which 10 were fixed. Most of these issues were related to MRQL query evaluation in Flink mode. Finally, we have set up MRQL on ASF Jenkins, which is a CI server that continuously checks the integrity of our builds and validates our tests. Date of last release: 2014-06-26 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [ ](mrql) Alan Cabrera [ ](mrql) Anthony Elder [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: Podling looks very healthy.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. increase the number of active committers 2. increase adoption, expand user community, and increase user list activity Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? We have made our second release under Apache incubation. After we added Apache Flink as an evaluation backend for MRQL, the Flink community showed interest in our project, which may result in collaboration between the projects and may expand our user community. How has the project developed since the last report? We have added support for Apache Flink. Now users can run MRQL queries on a Yarn cluster using 4 different backends (Hadoop map-reduce, Hama, Spark, and Flink), without having to change their queries. Date of last release: 2014-06-26 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Anthony Elder [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: jmclean: Mentors active and project healthy. Project has discussed graduation but feels it needs more active committers.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. increase adoption, expand user community, and increase user list activity 2. have at least one more incubator release Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? A new PMC member was elected, Moon Soo Lee. The company DataSayer and the project Zeppelin (zeppelin-project.org) are now using MRQL. How has the project developed since the last report? We have switched to JUnit for query testing, instead of evaluating queries from files using plain Java code. New tests were introduced and some bugs were corrected based on these tests. Date of last release: 2013-10-31 When were the last committers or PMC members elected? 2014-04-17 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Anthony Elder [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. increase adoption, expand user community, and increase user list activity 2. recruit more developers, committers, and PMCers 3. have at least one more incubator release Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? Nothing to report. No new developers. No new committers. How has the project developed since the last report? Ported MRQL to Yarn. Added support for Spark 0.9.0. Improved the MRQL build process in many ways. Changed the run scripts to construct assembly jars at runtime. Date of last release: 2013-10-31 When were the last committers or PMC members elected? 2013-03-13 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Anthony Elder [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd/Mentor notes: Roman Shaposhnik (rvs): A reasonably sized community. Given that it has been incubating for about a year now, I'd encourage graduation activity. A few more releases would be a good first step.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. recruit more users, developers, committers and PMCers Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? We have made our first release. How has the project developed since the last report? Extended the interface between the query translator and the run-time evaluation engine to make query evaluation on Spark more efficient. Date of last release: 2013-10-31 When were the last committers or PMC members elected? 2013-03-13 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Anthony Elder [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd notes: Dave Fisher (wave): This podling achieved one benchmark towards successful incubation - a release. When commits and mailing list activity are reviewed this looks like a tiny community with only a single developer and some administrative support. I am not sure if this is another example of a community that is mostly from a single corporation and is doing all of the work outside and contributing it through one person with off-list decision making and communication. If so then the mentors need to look into it and get the real decision making process in public. There is a user ML, but the only email ever on that list was the release announcement.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop and Hama. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Complete the first release a) Ensure proper transfer of code b) Verify distribution rights Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? We have already addressed most issues that prevented us from making our first release, which we now expect to take place in the next few weeks. We believe that, after we make the first release available, we will get many new users and contributors. How has the project developed since the last report? We have switched to Maven as our project management tool. We refactored the source tree to make it compliant with Maven; we created Maven modules for compilation; we used the Maven assembly plugin to create binary and source release artifacts; we updated the MRQL wiki page accordingly (getting started, how to contribute/build/commit). We have sent a request to ASF Infrastructure to set up our project in Nexus so that we can deploy our snapshots there. We also need to verify the copyright information. As soon as we get these two tasks done, we will upload our first release on Nexus and stage it for vote. Date of last release: none yet When were the last committers or PMC members elected? 2013-03-13 Signed-off-by: [X](mrql) Alan Cabrera [ ](mrql) Anthony Elder [ ](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd notes: (rvs) Project looks reasonably well off -- no major surprises. I was a little bit surprised by a very low JIRA activity but on the other hand they've got an exemplary Wiki and ML traffic is not too shabby either.
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop and Hama. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Complete the first release a) Ensure proper transfer of code b) Verify distribution rights 2. Use Maven as our project management tool Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? No new PPMC members or committers have been added since the last report. We are still working on our first release. We believe that, as soon as we make the first release available, many users will start using the system and will become new contributors. How has the project developed since the last report? We have sent a request to trademarks@apache.org to review and verify our searches that establish that "Apache MRQL" is a suitable name. More specifically, our US and EU trademark searches, and a Google search, show that the project name MRQL does not conflict with a previously registered trademark. We are now in the process of completing the MRQL codebase to make it ready for the first release. There are two components missing that delay this release: switching to Maven as our project management tool and verifying copyright information. Date of last release: none yet Signed-off-by: [X](mrql) Alan Cabrera [X](mrql) Alex Karasulu [ ](mrql) Mohammad Nour El-Din Shepherd notes:
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop and Hama. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Complete the first release a) Ensure proper transfer of code b) Verify distribution rights 2. Establish whether "Apache MRQL" is a suitable name Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? We have posted a project description on the HadoopSphere blog, which initiated some discussion and interest. Since the last report, there was little activity in recruiting new users and developers, but we are very close to completing the first release. We believe that, as soon as we make the first release available, many users will start using the system, will request bug fixes, improvements, and additional functionality, and will become new contributors. How has the project developed since the last report? We are in the process of completing the MRQL codebase to make it ready for the first release, which we expect to take place in the next few weeks. The only major component missing that delays this release is switching to Maven as our project management tool. We believe that using Maven is important because it will facilitate contributions by (and recruitment of) committers and will ease version management. The reasons for the delay are: 1) none of the current developers has prior experience with Maven 2) the codebase is non-standard because it uses non-standard tools to generate Java code. Date of last release: none yet Please check this [ ] when you have filled in the report for MRQL. Signed-off-by: Alan Cabrera: [X](mrql) Anthony Elder: [ ](mrql) Alex Karasulu: [ ](mrql) Mohammad Nour El-Din: [X](mrql) Shepherd notes:
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop and Hama. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Expand the development community to include more diverse committers 2. Complete the first release 3. Improve query performance on large clusters Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? We are in the process of completing the infrastructure for the MRQL podling. We have created a new wiki site for MRQL with detailed developer documentation, which includes a guide for potential committers on how to contribute to the MRQL project, gives a detailed roadmap of the codebase, and lists the future release plans. This wiki also contains user documentation of the MRQL query language. We have also created a JIRA site for MRQL. So far we have created 9 JIRA issues, some of them related to infrastructure improvement, of which 6 have already been resolved. How has the project developed since the last report? The existing code has been substantially modified in many ways to facilitate contributions by committers. The codebase now includes a testbed of queries to automatically test various aspects of the system. We are now in the process of refactoring the code based on our coding guidelines and of using Maven as our project management tool. Please check this [ ] when you have filled in the report for MRQL. Signed-off-by: Alex Karasulu: [X](mrql) Anthony Elder: [ ](mrql) Alan Cabrera: [X](mrql) Mohammad Nour El-Din: [ ](mrql) Shepherd notes:
MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop and Hama. MRQL has been incubating since 2013-03-13. Three most important issues to address in the move towards graduation: 1. Expand the development community to include more diverse committers 2. Complete the first release 3. Create a wiki to document both the software use and the software development process Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of? none How has the community developed since the last report? This is the first report for MRQL. We have decided to adopt the review-then-commit policy, which requires at least two +1 votes from committers and no -1 vote from any committer. In addition, we have started the process of creating a new infrastructure for the MRQL podling. We have already created three email lists, mrql-private, mrql-dev, and mrql-user, and we have requested the creation of a new JIRA project and a new wiki site for MRQL. How has the project developed since the last report? We have restructured the original source code and we have modified the in-file copyright notices of the source code to be compliant with the Apache License 2.0, to make it ready for the first release. Please check this [ ] when you have filled in the report for MRQL. Signed-off-by: Alex Karasulu: [X](mrql) Anthony Elder: [ ](mrql) Alan Cabrera: [ ](mrql) Mohammad Nour: [ ](mrql) Shepherd notes: