Files · 24e2e0d3f9bf18f23a995cee66d4685d07b8e440 · submodule / opencv

#11143 [FIX] Normalize node risk with sample weight sum · 24e2e0d3

codingforfun authored Mar 27, 2018

For regression trees, node risk is computed as the sum of squared
errors. To get a meaningful value for comparison, it needs to be
normalized by the number of samples in the node (or, more generally, by
the sum of sample weights in the node). Otherwise the sum of squared
errors depends heavily on the number of samples in the node, and
comparing it with the `regressionAccuracy` parameter is not very meaningful.

After normalization, `node_risk` is in fact the variance over all
samples in the node, which makes much more sense and seems to be what
the code originally intended, given that node risk is later used as
a split termination criterion by
```
sqrt(node.node_risk) < params.getRegressionAccuracy()
```
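The effect of the normalization can be illustrated with a small standalone sketch. This is not the OpenCV implementation: `NodeRisk`, `computeNodeRisk`, and `shouldStopSplitting` are hypothetical names chosen for illustration, and the code only assumes that risk is the weighted sum of squared deviations from the weighted mean response.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of OpenCV): both forms of node risk.
struct NodeRisk {
    double sumSquaredError;  // unnormalized: grows with the number of samples
    double normalizedRisk;   // divided by the sum of sample weights (variance)
};

// Computes the weighted sum of squared errors around the weighted mean,
// and the same value normalized by the sum of sample weights.
// Assumes responses and weights have the same length.
NodeRisk computeNodeRisk(const std::vector<double>& responses,
                         const std::vector<double>& weights)
{
    double sumW = 0.0, sumWY = 0.0;
    for (std::size_t i = 0; i < responses.size(); ++i) {
        sumW += weights[i];
        sumWY += weights[i] * responses[i];
    }
    if (sumW <= 0.0)
        return NodeRisk{0.0, 0.0};  // empty node: nothing to measure

    const double mean = sumWY / sumW;
    double sse = 0.0;
    for (std::size_t i = 0; i < responses.size(); ++i) {
        const double d = responses[i] - mean;
        sse += weights[i] * d * d;
    }
    return NodeRisk{sse, sse / sumW};
}

// Mirrors the termination test quoted above, with the risk passed in.
bool shouldStopSplitting(double nodeRisk, double regressionAccuracy)
{
    return std::sqrt(nodeRisk) < regressionAccuracy;
}
```

With responses `{1, 3}` and unit weights, the sum of squared errors is 2 and the normalized risk is 1. Duplicating the samples to `{1, 3, 1, 3}` doubles the sum of squared errors to 4 while the normalized risk stays at 1, so only the normalized value gives a threshold comparison that is independent of node size.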

Name
.github
3rdparty
apps
cmake
data
doc
include
modules
platforms
samples
.gitattributes
.gitignore
.tgitconfig
CMakeLists.txt
CONTRIBUTING.md
LICENSE
README.md

README.md