I recently made a graph showing error bars for a number of "experiments". In my algorithm I'm minimizing an objective function, so I would expect that increasing the sampling yields lower values of the objective function.
As you can see in the graph, the second value from the left (2.5 on the x-axis) contains only 2.5% of the configurations, so we wouldn't expect it to perform as well as with 100% of the configurations.
I think this is related to the asymmetry of the distributions. Is there an approach that can fix this problem, i.e., a method to compute confidence intervals for asymmetric, unknown distributions?
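For instance, would something like a percentile-based interval be more suitable here than mu ± 2 * sigma? A toy sketch of what I mean (the log-normal sample is only a stand-in for my skewed replicate values):

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.lognormal(mean=0.0, sigma=1.0, size=100)  # skewed stand-in for the z_i

mu, sigma = z.mean(), z.std(ddof=1)
print("mu +- 2*sigma:", (mu - 2 * sigma, mu + 2 * sigma))  # lower end can go negative

lo, hi = np.percentile(z, [2.5, 97.5])  # asymmetric around the mean by construction
print("percentile interval:", (lo, hi))
```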
The example below should help make the graph understandable!
%%%%%%%%%%%%%%%%%%%% EDIT %%%%%%%%%%%%%%%%%%%%%
i = number of replicates (each with a different seed, so the sampling differs between replicates)
z = objective function value
n = number of configurations
j = 1...n
Example: n = 1000, i = 100
- Step 1. Analyze all 1000 configurations and compute the minimum of z_j. Store it, and repeat this for each of the i replicates. Then compute mu and sigma of those z_i.
- Step 2. Analyze 50% of the initial 1000 configurations and compute the minimum of z_j. Store it, and repeat this for each of the i replicates. Then compute mu and sigma of those z_i.
- Step 3. Analyze 10% of the initial 1000 configurations, same as above.
- Step 4. Analyze 5% of the initial 1000 configurations, same as above.
- Step 5. Analyze 2.5% of the initial 1000 configurations, same as above.
- Step 6. Analyze 1% of the initial 1000 configurations, same as above.
So we will have mu_100, mu_50, mu_10, mu_5, mu_2.5, and mu_1, and likewise sigma_100, sigma_50, ...
Now I'm able to draw the error bars as mu_100 ± 2 * sigma_100, and so on.
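Putting the whole procedure into code, this is roughly what I do (a minimal Python sketch; `objective` is a made-up stand-in for my real objective function, with log-normal noise just to keep the script self-contained and the distribution skewed):

```python
import numpy as np

n = 1000   # number of configurations
i = 100    # number of replicates

def objective(config, rng):
    # Made-up stand-in for the real objective function:
    # a noisy, right-skewed score for one configuration.
    return config + rng.lognormal(mean=0.0, sigma=1.0)

configs = np.linspace(0.0, 10.0, n)
fractions = [1.0, 0.50, 0.10, 0.05, 0.025, 0.01]

for frac in fractions:
    minima = np.empty(i)
    for rep in range(i):
        rng = np.random.default_rng(rep)           # different seed per replicate
        subset = rng.choice(configs, size=int(frac * n), replace=False)
        z = np.array([objective(c, rng) for c in subset])  # z_j on the subset
        minima[rep] = z.min()                      # store the minimum
    mu, sigma = minima.mean(), minima.std(ddof=1)  # mu and sigma of the z_i
    print(f"{frac:6.1%}: mu = {mu:.3f}, "
          f"bar = [{mu - 2 * sigma:.3f}, {mu + 2 * sigma:.3f}]")
```

The printed bars are exactly the mu ± 2 * sigma intervals from the graph, so with a skewed distribution of minima the lower bar can dip below values the objective never actually reaches.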