In the previous article, we discussed the logic behind selecting sub-chart indicators and the precautions to take in practice, and left two open questions at the end. This article picks up from there: how do we judge whether the selected indicators are correlated, and what do we do with indicators that turn out to be correlated? On that basis, it gives specific steps and criteria for selecting sub-chart indicators.
We are designing a Tongdaxin sub-chart indicator system whose role is to give quantitative prompts for ** management. Its core formula is ** = 2 * win rate - 1. The current task is to quantify "win rate" from the perspective of mathematical statistics.
Previously, we defined two indicators as correlated when their underlying formula (or logic) is the same, because the formula is the key factor that determines an indicator's win rate, that is, the probability that drives the expected return. This is a simple definition based on intuition, chosen to be easy to understand.
Let's use the KDJ indicator as an example. In the Tongdaxin ** interface, press CTRL+S to call up the "program transaction evaluation system", select KDJ trading, and click Next. Leave all settings at the system defaults, select Ping An of China (chosen at random) as the variety, and start the evaluation. The system shows a win rate of 71.43%.
Next, on the "Set Report" page, click "Optimization Parameters", tick all the boxes in the pop-up dialog, and click "Start Evaluation". This backtests every KDJ parameter combination under the same conditions, 360 combinations in total. The effect is equivalent to putting 360 highly correlated indicators together. The overall win rate now drops to 44.12%, a significant decrease.
This simple comparison gives an intuitive sense of how a group of highly correlated indicators affects the final result. However, Tongdaxin has hundreds of basic indicators and countless variants. Judging them manually one by one is inefficient and error-prone.
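To make the comparison concrete, here is a minimal sketch that plugs the two win rates into the core formula, 2 * win rate - 1. It assumes the reported figures read as 71.43% and 44.12%; the function name core_formula is made up for illustration.

```python
def core_formula(win_rate: float) -> float:
    """Core formula from this series: 2 * win rate - 1.

    win_rate is a fraction between 0 and 1, not a percentage.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return 2.0 * win_rate - 1.0


# Assumed readings of the two backtests described above.
single_kdj = core_formula(0.7143)      # default KDJ parameters only
all_360_kdj = core_formula(0.4412)     # all 360 parameter combinations

print(f"{single_kdj:.4f}")   # 0.4286
print(f"{all_360_kdj:.4f}")  # -0.1176
```

Under these assumptions, the formula's output flips from positive to negative once the 360 highly correlated variants are pooled, which is why the correlation question matters.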
So how can we judge the similarity of indicators scientifically? Let's change perspective and think of these indicators as factors in quantitative trading (which they are: technical indicator factors). Multi-factor strategies are a mature approach in price-and-volume trading, and they include a procedure for handling correlation between factors.
The following ** is the process of mining factors in a multi-factor strategy. If each factor is replaced by a technical indicator, isn't it the same logic we use to choose sub-chart indicators?
In detail:
1. Select the n effective factors:
In quantitative analysis, the measures used to evaluate the effectiveness of ** yield are the IC and IR values. A loose filter condition is IC > 0.02 and IR > 0.3, where:
1) IC is the information coefficient, which represents the ability of factor *** return. IC is the linear correlation coefficient between the ranking of all ** by factor value at the beginning of the rebalancing cycle and the ranking of their returns at the end of the rebalancing cycle. The greater the IC, the stronger the stock selection ability.
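As an illustration of the IC definition, here is a minimal sketch for a single rebalancing cycle: rank the factor values, rank the end-of-cycle returns, and take the linear correlation of the two rankings. The data and all function names are made up for the example.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_ic(factor_values, end_of_cycle_returns):
    """IC for one cycle: correlation between factor ranks and return ranks."""
    return pearson(ranks(factor_values), ranks(end_of_cycle_returns))


# Made-up factor values and returns for five instruments.
factor = [0.8, 0.1, 0.5, 0.3, 0.9]
returns = [0.04, -0.02, 0.03, 0.01, 0.02]

ic = rank_ic(factor, returns)
print(round(ic, 4))  # 0.7
```

Repeating this for every rebalancing cycle gives the IC series that the IR and the correlation analysis are built on.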
2) IR is the information ratio, which represents the stability of the factor's performance over history. IR = mean of IC / volatility (standard deviation) of IC. A factor's performance can vary greatly across historical periods, sometimes very good and sometimes very poor, and this shows up as high volatility in the IC. For a given mean IC, the smaller the volatility of the IC, the more stable the factor's performance and the greater the IR.
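The following sketch shows the IR calculation on two made-up IC series that share the same mean but differ in volatility. It assumes the sample standard deviation as the volatility measure; the series and names are illustrative only.

```python
from statistics import mean, stdev


def information_ratio(ic_series):
    """IR = mean of IC / volatility of IC (sample standard deviation assumed)."""
    volatility = stdev(ic_series)
    if volatility == 0:
        raise ValueError("IC series has zero volatility; IR is undefined")
    return mean(ic_series) / volatility


# Two made-up IC series with the same mean (0.05).
steady_ic = [0.04, 0.05, 0.06, 0.05, 0.05]
erratic_ic = [0.20, -0.10, 0.15, -0.05, 0.05]

ir_steady = information_ratio(steady_ic)
ir_erratic = information_ratio(erratic_ic)

# Same mean IC, but the steadier factor gets the larger IR.
print(ir_steady > ir_erratic)  # True
```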
2. Correlation analysis of factors:
The Spearman correlation coefficient is calculated on the factors' IC series. The calculation details are rather academic; interested readers can look them up online. In general, correlation within the same category is stronger, and correlation between different categories is weaker.
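For readers who would rather see it than look it up, here is a minimal sketch of the pairwise check: compute the Spearman correlation between each pair of IC series and flag pairs above a cutoff. The IC values, the factor names, and the 0.7 cutoff are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up IC series over six rebalancing cycles (hypothetical factor names).
ic_series = {
    "kdj_fast": [0.05, 0.02, -0.01, 0.04, 0.03, 0.00],
    "kdj_slow": [0.06, 0.03, -0.02, 0.05, 0.02, 0.01],
    "volume_based": [0.01, 0.04, 0.03, -0.02, 0.00, 0.05],
}

CUTOFF = 0.7  # assumed threshold for "highly correlated"

correlations = {
    (a, b): spearman(ic_series[a], ic_series[b])
    for a, b in combinations(ic_series, 2)
}
highly_correlated = [pair for pair, rho in correlations.items() if rho > CUTOFF]
print(highly_correlated)  # [('kdj_fast', 'kdj_slow')]
```

In this toy data the two KDJ variants move together while the volume-based factor does not, matching the same-category versus different-category pattern described above.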
Now let's return to the two questions at the beginning of the article. As the introduction above shows, quantitative multi-factor strategies already offer a scientific solution. However, the calculation is complex and requires Python, so it cannot be implemented directly in Tongdaxin.
That leaves two options. One is to write your own Python program: build an indicator evaluation framework, feed in every available indicator, let the system run, and output the result. This is complicated and has a high barrier to entry, but it is scientific and accurate. The other is to simplify the conditions, lowering the requirements and accuracy to fit the Tongdaxin environment.
For the initial version, the focus is on the framework, so I choose the second option. In Tongdaxin, we use the win rate in place of IC and the Sharpe ratio in place of IR. In practice, the specific steps and criteria for selecting sub-chart indicators are as follows:
1. Starting from the indicator tree that ships with Tongdaxin, backtest each indicator on 5 years of historical data and keep the backtest report;
2. From the backtest results, keep the indicators with a win rate > 50%;
3. From the backtest results, keep the indicators with a Sharpe ratio (= alpha yield / beta yield) > 1;
4. Sort the remaining indicators from largest to smallest by score = win rate * weight 1 + Sharpe ratio * weight 2;
5. Build a new indicator tree from the screened indicators, following the framework of Tongdaxin's built-in indicator tree;
6. In the new indicator tree, keep only the indicator with the highest score from step 4 in each category, and delete the rest from the new tree;
7. The indicators that remain after this second screening are the ones selected for the sub-chart;
8. Repeat the steps above for variants of the basic indicators found online, to update and iterate the sub-chart indicators.
The steps above cover both the selection criteria and the iterative update. Weight 1 and weight 2 can both start at 0.5 and then be adjusted flexibly toward whichever of the win rate and the Sharpe ratio you care about more, as long as weight 1 + weight 2 = 1.
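The screening logic can be sketched in a few lines once the backtest reports have been copied out of Tongdaxin by hand. This is only an illustration under stated assumptions: the thresholds are read as win rate > 50% and Sharpe ratio > 1, and the BacktestResult type, the category labels, and all numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class BacktestResult:
    name: str
    category: str    # category in the indicator tree (hypothetical labels)
    win_rate: float  # fraction between 0 and 1
    sharpe: float    # Sharpe ratio as reported by the backtest


def score(result, weight_1=0.5, weight_2=0.5):
    """Score = win rate * weight 1 + Sharpe ratio * weight 2."""
    return result.win_rate * weight_1 + result.sharpe * weight_2


def select_indicators(results, weight_1=0.5, weight_2=0.5):
    """Filter, rank by score, then keep the top-scoring indicator per category."""
    if abs(weight_1 + weight_2 - 1.0) > 1e-9:
        raise ValueError("weight 1 + weight 2 must equal 1")
    # Assumed thresholds: win rate > 50% and Sharpe ratio > 1.
    passed = [r for r in results if r.win_rate > 0.5 and r.sharpe > 1.0]
    ranked = sorted(passed, key=lambda r: score(r, weight_1, weight_2), reverse=True)
    best_per_category = {}
    for r in ranked:
        # The first indicator seen in a category has the highest score.
        best_per_category.setdefault(r.category, r)
    return list(best_per_category.values())


# Made-up backtest results for illustration only.
results = [
    BacktestResult("KDJ", "overbought/oversold", 0.62, 1.4),
    BacktestResult("RSI", "overbought/oversold", 0.58, 1.9),
    BacktestResult("MACD", "trend", 0.55, 1.2),
    BacktestResult("DMI", "trend", 0.48, 2.0),   # fails the win-rate filter
    BacktestResult("OBV", "volume", 0.60, 0.8),  # fails the Sharpe filter
]

selected = select_indicators(results)
print([r.name for r in selected])  # ['RSI', 'MACD']
```

Shifting the weights, for example select_indicators(results, 0.9, 0.1), changes which indicator wins within a category, which is the flexibility the weights are meant to provide.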
In the next section, we will follow these steps and criteria to screen Tongdaxin's basic indicators and see how well they work. Because there are many indicators, this will take time, and we will share results as we go.