java - Java Math DescriptiveStatistics remove value

Question

I am new to Java and have been using the Esper CEP engine. This question, however, has nothing to do with Esper; it is more of a Java question.

First, my class:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

import com.espertech.esper.epl.agg.AggregationSupport;
import com.espertech.esper.epl.agg.AggregationValidationContext;

public class CustomPercentiles extends AggregationSupport {
    private List<Double> numbers = new ArrayList<Double>();

    public CustomPercentiles(){
        super();
    }

    public void clear() {
        numbers.clear();
    }

    public void enter(Object arg0) {
        Double value = (Double) (double) (Integer) arg0;
        if (value > 0){
            //Not interested in < 1
            numbers.add(value);         
        }
    }

    public void leave(Object arg0) {
        Double value = (Double) (double) (Integer) arg0;
        if (value > 0){
            //Not interested in < 1
            numbers.remove(value);          
        }
    }

    public Object getValue() {
        DescriptiveStatistics stats = new DescriptiveStatistics();
        Map<String, Integer> result = new HashMap<String, Integer>();
        for (Double number:numbers.subList(0, numbers.size())){
            stats.addValue(number);     
        }
        result.put("median", (int) stats.getPercentile(50));
        result.put("pct90", (int) stats.getPercentile(90));
        result.put("pct10", (int) stats.getPercentile(10));
        result.put("mean", (int) stats.getMean());
        result.put("std", (int) stats.getStandardDeviation());

        return result ;
    }

    public Class getValueType() {
        return Object.class;
    }

    @Override
    public void validate(AggregationValidationContext arg0) {
        // TODO Auto-generated method stub
    }

}

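Basically, Esper calls enter(value) and leave(value) whenever it wants, based on logic that is irrelevant here. It calls getValue() to get the computed results.

Since I want to calculate percentiles, I need all the available numbers to work with. To do that, I store them in a class-level list called numbers, and in getValue() I put all the numbers into a DescriptiveStatistics instance and then compute the statistics I need.

My assumption is that every time I load the list into a new DescriptiveStatistics object, it has to sort the data again. Is there any way I can keep a DescriptiveStatistics-like object as my class-level object?

The only reason I use an ArrayList rather than a DescriptiveStatistics as my class-level object is that DescriptiveStatistics has no remove method, i.e. I cannot remove an element by value.

In practice, hundreds of instances of this class are running at any given time, and getValue() is called on each of them every 1 to 10 seconds. I don't have any performance problems at the moment, but I am looking for optimization help to avoid problems in the future.

Alternate explanation:

What I am doing here is maintaining a list of numbers. Esper calls the enter() and leave() methods many times to tell me which numbers should remain in the list. In my case this is a time-based aggregation: I have told Esper that I want the computation based on the numbers from the last 1 minute.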

So on 00:00:00 esper calls enter(10)
my numbers becomes [10]
So on 00:00:05 esper calls enter(15)
my numbers becomes [10, 15]
So on 00:00:55 esper calls enter(10)
my numbers becomes [10, 15, 10]
So on 00:01:00 esper calls leave(10)
my numbers becomes [15, 10]
So on 00:01:05 esper calls leave(15)
my numbers becomes [10]
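
The trace above can be replayed with a plain ArrayList. This is only a minimal sketch: the class name WindowTraceDemo is made up for illustration, and the calls are hard-coded rather than driven by Esper. Note that remove(Object) drops the first matching element, so after the final leave(15) only the 10 that entered at 00:00:55 is left.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it only replays the trace with a plain list.
class WindowTraceDemo {
    /** Replays the enter/leave calls from the trace and returns the final list. */
    static List<Double> replay() {
        List<Double> numbers = new ArrayList<>();
        numbers.add(10.0);                    // 00:00:00 enter(10) -> [10]
        numbers.add(15.0);                    // 00:00:05 enter(15) -> [10, 15]
        numbers.add(10.0);                    // 00:00:55 enter(10) -> [10, 15, 10]
        numbers.remove(Double.valueOf(10.0)); // 00:01:00 leave(10) -> [15, 10]
        numbers.remove(Double.valueOf(15.0)); // 00:01:05 leave(15) -> [10]
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [10.0]
    }
}
```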

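Now, getValue() may have been called several times during this period. Each time it is called, it is expected to return calculations based on the current contents of numbers.

getValue() calculates the 10th, 50th and 90th percentiles. To calculate percentiles, DescriptiveStatistics needs the numbers sorted. (The 10th percentile of 100 numbers would be the 10th number of the sorted list.)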
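
The "10th number of the sorted list" rule is the nearest-rank definition of a percentile, and it can be sketched as below. The class NearestRank and its method are hypothetical names for illustration. As far as I know, Commons Math's getPercentile() uses an interpolating estimate by default, so its results can differ slightly from this rule.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; illustrates the nearest-rank rule only.
class NearestRank {
    /** Returns the value at 1-based rank ceil(p * n / 100) of the sorted data. */
    static double percentile(List<Double> values, double p) {
        if (values.isEmpty() || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need data and 0 < p <= 100");
        }
        List<Double> sorted = new ArrayList<>(values); // copy, leave the input untouched
        Collections.sort(sorted);                      // this is the cost paid on every call
        int rank = (int) Math.ceil(p * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }
}
```

For the numbers 1 to 100, this returns 10 for p = 10, 50 for p = 50 and 90 for p = 90.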

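So I am looking for a way to take an arbitrary number out of a DescriptiveStatistics instance. Alternatively, I would welcome a recommendation for another library that can give me medians and percentiles while letting me remove a number from the list when I know its value.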
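
One possible shape for such a structure, without any extra library, is a list that is kept sorted on every insert and remove, so that a percentile lookup needs no sorting at query time. The following is a rough sketch under that assumption. SortedWindow is a hypothetical class, not something Esper or Commons Math provides, and it uses the nearest-rank percentile rule rather than Commons Math's estimate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Esper or Commons Math.
class SortedWindow {
    private final List<Double> sorted = new ArrayList<>();

    /** Inserts v at its sorted position (binary search, then an O(n) shift). */
    void add(double v) {
        int i = Collections.binarySearch(sorted, v);
        if (i < 0) {
            i = -i - 1; // binarySearch returns -(insertionPoint) - 1 when v is absent
        }
        sorted.add(i, v);
    }

    /** Removes one occurrence of v; returns false if v is not present. */
    boolean remove(double v) {
        int i = Collections.binarySearch(sorted, v);
        if (i < 0) {
            return false;
        }
        sorted.remove(i); // i is an int, so this calls remove(int index)
        return true;
    }

    int size() {
        return sorted.size();
    }

    /** Nearest-rank percentile for 0 < p <= 100, read directly from the sorted list. */
    double percentile(double p) {
        if (sorted.isEmpty() || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need data and 0 < p <= 100");
        }
        int rank = (int) Math.ceil(p * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }
}
```

In the aggregation class, enter() would call add() and leave() would call remove(). The trade-off is that each add or remove shifts elements in the list, while each percentile query becomes a single index lookup.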

DescriptiveStatistics has a removeMostRecentValue(), but that is not what I want to do.

score 0 · Accepted Answer

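As I understand it, you are asking for a way to use the DescriptiveStatistics class as the list instead of "numbers". That is, you want to dynamically add numbers to and remove numbers from a DescriptiveStatistics variable.

As far as I can see, there is no better way to do it than what you are doing now.

Are you sure you need the ability to remove specific numbers from the list before calculating the percentiles again? Wouldn't it always be new numbers?

It sounds a bit like you would want to learn some more of the basics of Java.

Anyway, since I can't really give you a qualified answer to your question, I figured I would at least help you correct some of your code so that it follows better practices: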

public class CustomPercentiles extends AggregationSupport {
    private List<Double> numbers = new ArrayList<Double>();

    //Methods that are inherited from super-classes and interfaces
    //should have the "@Override" annotation,
    //both for the compiler to check if it really is inherited,
    //but also to make it more clear which methods are new in this class.
    @Override    
    public void clear() {
        numbers.clear();
    }

    @Override
    public void enter(Object value) {
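        // The question casts the argument from Integer, so convert through
        // Number; casting an Integer held as Object straight to double throws.
        double v = ((Number) value).doubleValue();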
        if (v > 0){
            numbers.add(v);            
        }
    }

    @Override
    public void leave(Object value) {
        double v = ((Number) value).doubleValue();
        if (v > 0){
            // Box explicitly so it is clear that remove(Object) is called.
            numbers.remove(Double.valueOf(v));
        }
    }

    @Override
    public Object getValue() {
        DescriptiveStatistics stats = new DescriptiveStatistics();
        Map<String, Integer> result = new HashMap<String, Integer>();
        //It is unnecessary to call numbers.subList(0, numbers.size())
        //since it will just return the entire list.
        for (Double number : numbers){
            stats.addValue(number);        
        }
        result.put("median", (int) stats.getPercentile(50));
        result.put("pct90", (int) stats.getPercentile(90));
        result.put("pct10", (int) stats.getPercentile(10));
        result.put("mean", (int) stats.getMean());
        result.put("std", (int) stats.getStandardDeviation());

        return result ;
    }

    //Judging from the API of AggregationSupport,
    //I would say this method should return Double.class
    //(it basically seems like a bad way of implementing generics).
    //Are you sure it should return Object.class?
    public Class getValueType() {
        return Object.class;
    }

    @Override
    public void validate(AggregationValidationContext arg0) {
        // TODO Auto-generated method stub
    }

}
