Document To Read Before Coding
GitHub repo: https://github.com/JingGe/101
This document focuses on how to get involved in Flink development and how to contribute code. It is written for experienced developers who are proficient with Flink concepts and architecture, e.g. stateful distributed stream processing, Flink runtime components, layered APIs, etc. For developers who are not familiar with these, Flink provides the Hands-On Training (one of the best tutorials in the industry).
The Flink roadmap, especially the Feature Radar section, is a great source of information for understanding the big picture. The Feature Stages show the most interesting components and their development status:
Feature Stages
MVP: Have a look, consider whether this can help you in the future.
Beta: You can benefit from this, but you should carefully evaluate the feature.
Ready and Evolving: Ready to use in production, but be aware you may need to make some adjustments to your application and setup in the future, when you upgrade Flink.
Stable: Unrestricted use in production
Reaching End-of-Life: Stable, still feel free to use, but think about alternatives. Not a good match for new long-lived projects.
Deprecated: Start looking for alternatives now
Focus on features in the MVP, Beta, and Ready and Evolving stages, and avoid spending time on features that are being phased out and will be deprecated.
A new roadmap for the upcoming Flink 2.0 is under construction.
It is highly recommended to read the whole roadmap page. The content is very valuable. After that, you can glance over Community & Project Info to get information about the mailing lists, Jira issue tracker, project wiki, how to contact committers, etc. for daily development.
All important information about contributing is described in How To Contribute and the sections underneath it. You can contribute code, documentation, and website changes. The process described there is very precise and detailed. Some parts are common to all contributions: you have to follow the code style and code formatting rules and fulfill the code quality requirements, and you should understand the PR review process well so that your PR contains the right information to be reviewed and accepted.
This document only shows you some of the most important rules. You will not make a big mistake if you start contributing code based only on this document. Still, it is highly recommended that you read all the information under How To Contribute.
Beyond these common parts, there are some special matters that need attention:
Consensus is king. Use the mailing list to start a discussion and reach consensus. A big design change that affects public APIs has to be described in a FLIP (Flink Improvement Proposal). Use Jira to summarize the result and break down the tasks. And, obviously, use GitHub for PR review and merging.
Documentation contributions must be made in both English and Chinese.
There is a template for creating a new PR.
Separation of concerns: pull requests must put cleanup, refactoring, and core changes into separate commits. These commits should be described in the Brief change log section of the PR. You can find an excellent example in https://github.com/apache/flink/pull/7264.
Flink has a naming scheme for commits. The basic scheme is [Jira issue|hotfix] [component] Message. Multiple commits may refer to the same issue if the issue is fixed in multiple steps.
Flink has its own annotations that you should pay attention to when reading or contributing code.
Flink emphasizes how important it is to have high-quality, well-engineered code. I personally strongly recommend the Clean Code concept from Uncle Bob. Furthermore, professional tools such as SonarGraph address deeper issues in the code and software architecture.
Code for data-intensive processing involves trade-offs, while code for coordination should stay simple and clean: again, Clean Code.
Flink keeps its dependency footprint small to avoid dependency clashes; for example, an extra framework like Guice is not used for dependency injection.
Avoid Mockito; use reusable test implementations instead.
The community has decided to migrate to JUnit 5 and AssertJ. New unit tests should be written with them.
ArchUnit is used to check for architectural violations in the codebase.
Reach consensus with a committer before you start contributing code. You can find all committers here. The "People" section on the Flink Community & Project Info page might be out of date.
Well engineered code is a must.
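To make the "avoid Mockito" rule above more concrete, here is a minimal sketch of what a reusable test implementation can look like. All names in it (AlertSink, ThresholdMonitor, CollectingAlertSink) are invented for this illustration and are not Flink classes; the point is only that a small hand-written implementation of an interface can record calls and be shared across tests, instead of stubbing behavior with a mocking framework.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical production interface, invented for this example (not a Flink API). */
interface AlertSink {
    void send(String alert);
}

/** Hypothetical code under test: emits an alert when a value exceeds the threshold. */
final class ThresholdMonitor {
    private final AlertSink sink;
    private final long threshold;

    ThresholdMonitor(AlertSink sink, long threshold) {
        this.sink = sink;
        this.threshold = threshold;
    }

    void report(String name, long value) {
        if (value > threshold) {
            sink.send(name + " exceeded " + threshold + ": " + value);
        }
    }
}

/** Reusable test implementation: records every alert so tests can inspect it later. */
final class CollectingAlertSink implements AlertSink {
    private final List<String> alerts = new ArrayList<>();

    @Override
    public void send(String alert) {
        alerts.add(alert);
    }

    /** Returns a copy, so callers cannot modify the recorded state. */
    List<String> getAlerts() {
        return new ArrayList<>(alerts);
    }
}

final class ReusableTestImplDemo {
    public static void main(String[] args) {
        CollectingAlertSink sink = new CollectingAlertSink();
        ThresholdMonitor monitor = new ThresholdMonitor(sink, 100L);

        monitor.report("latency", 50L);  // below the threshold, no alert
        monitor.report("latency", 150L); // above the threshold, one alert

        System.out.println(sink.getAlerts()); // prints [latency exceeded 100: 150]
    }
}
```

In a real test you would check the recorded alerts with JUnit 5 and AssertJ; they are left out here only to keep the sketch free of external dependencies.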
The Flink PR template has the following sections:
What is the purpose of the change
Open Architecture Questions (optional)
Further ToDos and Follow-ups (optional)
Brief change log
Verifying this change
Does this pull request potentially affect one of the following parts:
Documentation
Naming scheme for commits [Jira issue|hotfix] [component] Message, e.g.:
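As an illustration of this scheme, the sketch below checks commit subjects with a regular expression. It is not an official Flink tool: the class name CommitSubjectCheck is hypothetical, the issue number and component names in the samples are placeholders, and accepting an optional space between the two bracket groups is an assumption based on how the scheme is written above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Flink): checks a commit subject against the naming scheme. */
final class CommitSubjectCheck {

    // "[FLINK-<number>]" or "[hotfix]", then "[component]", then a non-empty message.
    // Assumption: an optional single space is allowed between the two bracket groups.
    private static final Pattern SCHEME =
            Pattern.compile("^\\[(FLINK-\\d+|hotfix)\\] ?\\[([^\\]\\s]+)\\] (\\S.*)$");

    private CommitSubjectCheck() {}

    /** Returns true if the subject follows [Jira issue|hotfix] [component] Message. */
    static boolean isValid(String subject) {
        return SCHEME.matcher(subject).matches();
    }

    /** Returns the component tag, or null if the subject does not follow the scheme. */
    static String component(String subject) {
        Matcher m = SCHEME.matcher(subject);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Placeholder subjects for illustration only; they do not refer to real commits.
        String[] samples = {
            "[FLINK-12345][runtime] Fix race condition in checkpoint cleanup",
            "[FLINK-12345][runtime] Add test for checkpoint cleanup",
            "[hotfix] [docs] Fix typo in README",
            "Fix typo in README"
        };
        for (String subject : samples) {
            System.out.println(isValid(subject) + "  " + subject);
        }
    }
}
```

The first two samples share one issue key, which matches the rule that multiple commits may refer to the same issue when it is fixed in several steps. The last sample is rejected because it has neither an issue or hotfix tag nor a component tag.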