Best Practices for Working with ITCase
A real case study, based on HBaseConnectorITCase, of how to set up logging to gain insight into how SQL queries executed via TableEnvironment run in an ITCase environment.
GitHub repo: https://github.com/JingGe/101
When solving an issue in an ITCase, the first thing to do is to understand the running process. You can debug the ITCase to get detailed runtime information. If you want to quickly get the big picture, a good choice is to change the log setup so that the logs give you insight into the running process.
After reading this section, you will learn:
how to change the log setup to get the required information
how to find the root cause of a real issue in HBaseConnectorITCase
how the MiniCluster affects TableEnvironment
how to control the lifecycle of the MiniCluster for ITCase
how to write an ITCase efficiently
Change the log setup
Tests in Flink use the log4j2-test.properties file under test/resources.
By default, the root logger level is set to OFF.
Since getting the setup right takes some effort, I have prepared one that you can take as the starting point for your own purposes.
To minimise the log output, this setup writes all INFO-level messages from the hbase2 connector and the MiniCluster, but only WARN-level messages from all other components, to the rolling files.
If you also want to see the log in the console, activate the STDOUT appenderRef.
It is recommended to set rootLogger.level = INFO for your first check of an ITCase.
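As a rough illustration of the setup described above, here is a minimal log4j2-test.properties sketch. It is not the author's original file: the appender names, file paths, patterns, rollover limits, and the two logger package names are assumptions chosen for illustration and may need adjusting for your module.

```properties
# Sketch only: appender names, file paths, patterns and limits are assumptions.
# WARN for everything by default; switch to INFO for a first look at an ITCase.
rootLogger.level = WARN
rootLogger.appenderRef.rolling.ref = RollingFile
# Uncomment to also log to the console:
# rootLogger.appenderRef.stdout.ref = STDOUT

appender.console.type = Console
appender.console.name = STDOUT
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{HH:mm:ss,SSS} [%t] %-5p %c - %m%n

appender.rolling.type = RollingFile
appender.rolling.name = RollingFile
# Hypothetical log location, relative to the module directory
appender.rolling.fileName = target/itcase.log
appender.rolling.filePattern = target/itcase-%i.log
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d{HH:mm:ss,SSS} [%t] %-5p %c - %m%n
appender.rolling.policies.type = Policies
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.policies.size.size = 10MB
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.max = 5

# INFO for the components under investigation (package names assumed)
logger.hbase2.name = org.apache.flink.connector.hbase2
logger.hbase2.level = INFO
logger.minicluster.name = org.apache.flink.runtime.minicluster
logger.minicluster.level = INFO
```

Keeping the root logger at WARN and raising only the two named loggers keeps the files small enough to read from top to bottom.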
Find the root cause of issue FLINK-24077
It is recommended to set rootLogger.level = INFO for troubleshooting.
Make sure you have built Flink. Now go to the Flink root directory and run:
You will see that the log has been created:
In the log file you will find detailed information, such as the initialisation of one HBase MiniCluster and multiple Flink MiniClusters, how tasks were executed, etc.
When you walk through the log, you will find several thrown runtime exceptions of type java.lang.IllegalStateException, which tell us that the MiniCluster is not yet running or has already been shut down. As a result, the CollectResultFetcher failed and some data might have been lost.
The root cause is that the shutdown of the MiniCluster is called asynchronously. Depending on how the race plays out, the CollectResultFetcher sometimes loses data, and an unchecked java.lang.IllegalStateException is thrown without us being aware of it.
Please note that the Maven test run was successful even though unchecked exceptions had been thrown.
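The race described above can be pictured with a small toy model. The sketch below uses only the JDK. `ToyCluster` and `AsyncShutdownRaceDemo` are hypothetical names, not Flink classes, and the `join()` call deliberately forces the unlucky ordering in which the shutdown has completed before the fetch, something that in the real ITCase happens only sometimes.

```java
import java.util.concurrent.CompletableFuture;

/** Toy stand-in for a cluster with asynchronous shutdown. Not Flink's MiniCluster. */
final class ToyCluster {
    private volatile boolean running = true;

    /** Returns immediately; the cluster is stopped on another thread. */
    CompletableFuture<Void> closeAsync() {
        return CompletableFuture.runAsync(() -> running = false);
    }

    /** Mimics a result fetch that requires a running cluster. */
    String fetchResult() {
        if (!running) {
            throw new IllegalStateException(
                    "cluster is not yet running or has already been shut down");
        }
        return "row";
    }
}

final class AsyncShutdownRaceDemo {
    /** Fetching before the shutdown is triggered succeeds. */
    static boolean fetchSucceedsBeforeShutdown() {
        ToyCluster cluster = new ToyCluster();
        String row = cluster.fetchResult();
        cluster.closeAsync().join();
        return "row".equals(row);
    }

    /** Fetching after the asynchronous shutdown has completed fails. */
    static boolean fetchFailsAfterShutdown() {
        ToyCluster cluster = new ToyCluster();
        cluster.closeAsync().join(); // force the "shutdown wins the race" ordering
        try {
            cluster.fetchResult();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fetch before shutdown ok: " + fetchSucceedsBeforeShutdown());
        System.out.println("fetch after shutdown fails: " + fetchFailsAfterShutdown());
    }
}
```

Without the `join()`, the outcome would depend on thread scheduling, which is why the problem shows up only intermittently.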
MiniCluster has an impact on TableEnvironment for ITCase
Thanks to the detailed log information, it is easy to see that multiple Flink MiniClusters were started and stopped. Each start and stop of a MiniCluster takes about 4s.
When we use TableEnvironment like this:
each time we execute a SQL query like this:
a new MiniCluster is started and then, after the job has finished, stopped asynchronously in the background.
By default, running each query via TableEnvironment in an ITCase triggers the start and stop of a new MiniCluster in the background. This costs time and resources.
Control the lifecycle of the MiniCluster
The ideal case is to control the lifecycle of the MiniCluster manually for each ITCase. When we execute SQL via TableEnvironment, it checks whether a MiniCluster is available. If so, the available MiniCluster is used for job submission.
In this case, you can use JUnit @ClassRule and @Rule and the MiniClusterWithClientResource provided by Flink:
Using @ClassRule makes sure that a MiniCluster is initialised before any test method runs and stopped after all test methods have finished. In this way, the race conditions mentioned previously are avoided.
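To make the lifecycle difference concrete, the following JDK-only sketch counts how often a cluster is started under each strategy. `CountingCluster` and `ClusterLifecycleDemo` are hypothetical names used for illustration. They model the idea of a class-level resource and are not a substitute for JUnit's @ClassRule or Flink's MiniClusterWithClientResource.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a test cluster that counts its starts. Not a Flink class. */
final class CountingCluster implements AutoCloseable {
    static final AtomicInteger STARTS = new AtomicInteger();

    CountingCluster() {
        STARTS.incrementAndGet();
    }

    String run(String query) {
        return "result of " + query;
    }

    @Override
    public void close() {
        // A real cluster would release its resources here.
    }
}

final class ClusterLifecycleDemo {
    /** One cluster per query: the default behaviour described above. */
    static int startsWithClusterPerQuery(int queries) {
        CountingCluster.STARTS.set(0);
        for (int i = 0; i < queries; i++) {
            try (CountingCluster cluster = new CountingCluster()) {
                cluster.run("SELECT " + i);
            }
        }
        return CountingCluster.STARTS.get();
    }

    /** One cluster shared by all queries: what a class-level rule achieves. */
    static int startsWithSharedCluster(int queries) {
        CountingCluster.STARTS.set(0);
        try (CountingCluster cluster = new CountingCluster()) {
            for (int i = 0; i < queries; i++) {
                cluster.run("SELECT " + i);
            }
        }
        return CountingCluster.STARTS.get();
    }

    public static void main(String[] args) {
        System.out.println("starts, one cluster per query: " + startsWithClusterPerQuery(9));
        System.out.println("starts, one shared cluster:    " + startsWithSharedCluster(9));
    }
}
```

Because the shared cluster is closed only after the last query has run, no query can observe a cluster that is already shutting down.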
Write the ITCase efficiently
By default, each query triggers the start and stop of a new MiniCluster. You can imagine how much time and how many resources this costs when we run hundreds or even thousands of queries.
After using @ClassRule to control it, the Maven run takes 1:38 min:
Don't underestimate the improvement. Once the time spent on HBase initialisation, job submission, execution, etc. is taken into account, each test method may take only a few seconds to finish. Compared to the previous Maven run, we saved 15 seconds across 9 tests, which is a significant improvement.
Please be aware that most ITCases work with a MiniCluster, implicitly or explicitly.
It is generally recommended to control external resources like the MiniCluster at the class level for an ITCase, unless there is a technical reason for using extra individual MiniClusters for some test methods.
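The numbers quoted above can be put in proportion with a little arithmetic. The sketch below derives the earlier run time from the two stated figures (1:38 min after the change, 15 seconds saved, 9 tests). The class name `ItCaseTimingDemo` is hypothetical, and the calculation assumes that both figures refer to the same Maven run time.

```java
import java.time.Duration;
import java.util.Locale;

final class ItCaseTimingDemo {
    // Figures taken from the text above
    static final Duration WITH_CLASS_RULE = Duration.ofMinutes(1).plusSeconds(38);
    static final Duration SAVED = Duration.ofSeconds(15);
    static final int TESTS = 9;

    /** Run time before the change, derived as "after" plus "saved". */
    static Duration before() {
        return WITH_CLASS_RULE.plus(SAVED);
    }

    /** Average saving per test method, in seconds. */
    static double savedSecondsPerTest() {
        return (double) SAVED.getSeconds() / TESTS;
    }

    /** Saving relative to the earlier run time, in percent. */
    static double savedPercent() {
        return 100.0 * SAVED.getSeconds() / before().getSeconds();
    }

    public static void main(String[] args) {
        System.out.println("before: " + before().getSeconds() + " s");
        System.out.println(String.format(Locale.ROOT,
                "saved per test: %.2f s, relative saving: %.1f %%",
                savedSecondsPerTest(), savedPercent()));
    }
}
```

A saving of well over a second per test method is noticeable when each method itself takes only a few seconds.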