    #11143 [FIX] Normalize node risk with sample weight sum · 24e2e0d3
    codingforfun authored
    In the case of regression trees, node risk is computed as the sum of
    squared errors. To obtain a meaningful value for comparison, it needs
    to be normalized by the number of samples in the node (or, more
    generally, by the sum of sample weights in the node). Otherwise the
    sum of squared errors depends strongly on the number of samples in
    the node, and comparison with the `regressionAccuracy` parameter is
    not very meaningful.
    
    After normalization, `node_risk` is in fact the sample variance of
    all samples in the node, which makes much more sense and seems to be
    what was originally intended by the code, given that node risk is
    later used in the split termination criterion
    ```
    sqrt(node.node_risk) < params.getRegressionAccuracy()
    ```
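    To illustrate the idea, here is a minimal C++ sketch of the
    normalized risk computation, not the actual OpenCV implementation;
    the names `nodeRisk`, `stopSplitting`, `sumWY`, `sumWY2`, and `sumW`
    are illustrative placeholders.
    ```cpp
    #include <cmath>

    // Weighted sum of squared errors around the weighted mean, divided
    // by the sum of sample weights -- i.e. the weighted sample variance.
    //   sumWY  = sum of w_i * y_i
    //   sumWY2 = sum of w_i * y_i * y_i
    //   sumW   = sum of w_i (sample weight sum for the node)
    double nodeRisk(double sumWY, double sumWY2, double sumW)
    {
        double mean = sumWY / sumW;
        double sse  = sumWY2 - mean * sumWY; // weighted SSE around the mean
        return sse / sumW;                   // normalize -> sample variance
    }

    // The termination test quoted above then compares the node's
    // standard deviation against the accuracy threshold:
    bool stopSplitting(double risk, double regressionAccuracy)
    {
        return std::sqrt(risk) < regressionAccuracy;
    }
    ```
    With this normalization the threshold is directly comparable to the
    spread of target values within a node, independent of node size.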