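I may have found a C++ "implementation" of C5.0 (See5.0), but I have not yet been able to dig far enough into the source code to determine whether it really works as advertised.
To reiterate my original concern, the author of the port states the following about the C5.0 algorithm:
"Another drawback of See5Sam [C5.0] is that it is not possible to have more than one application tree at the same time. The application is read from files each time the executable is run, and it is stored in global variables."
I will update my answer once I have had time to look through the source code.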
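To make the quoted concern concrete, here is a minimal sketch of why a model held in global variables rules out several trees at once, while a model owned by an object does not. Everything in it (`ToyTree`, `g_tree`, `ToyEngine`, and the single-threshold "tree") is a hypothetical stand-in for illustration, not code from See5Sam or the port.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a decision tree: one threshold on one value.
struct ToyTree {
    float threshold;
    std::string below;
    std::string above;
};

// Style 1: the model lives in a global, as the quoted remark describes.
// Loading a second model overwrites the first one.
static ToyTree g_tree{0.0f, "", ""};

void loadGlobalTree(const ToyTree& t) { g_tree = t; }

std::string classifyGlobal(float x) {
    return x < g_tree.threshold ? g_tree.below : g_tree.above;
}

// Style 2: each engine object owns its model, so several can coexist.
class ToyEngine {
public:
    explicit ToyEngine(ToyTree t) : tree_(std::move(t)) {}

    std::string classify(float x) const {
        return x < tree_.threshold ? tree_.below : tree_.above;
    }

private:
    ToyTree tree_;
};

// Two engines answer independently for the same input.
void demoCoexistence() {
    ToyEngine a({1.0f, "low", "high"});
    ToyEngine b({10.0f, "small", "large"});
    assert(a.classify(5.0f) == "high");
    assert(b.classify(5.0f) == "small");
}
```

With the global style, only the most recently loaded tree is ever consulted. That is the limitation a class-based wrapper is meant to remove.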
Update
It looks good. Here is the C++ interface:
class CMee5
{
public:
/**
Create a See 5 engine from tree/rules files.
\param pcFileStem The stem of the See 5 file system. The engine
initialisation will look for the following files:
- pcFileStem.names Vanilla See 5 names file (mandatory)
- pcFileStem.tree or pcFileStem.rules Vanilla See 5 tree or rules
file (mandatory)
- pcFileStem.costs Vanilla See 5 costs file (mandatory)
*/
inline CMee5(const char* pcFileStem, bool bUseRules);
/**
Release allocated memory for this engine.
*/
inline ~CMee5();
/**
General classification routine accepting a data record.
*/
inline unsigned int classifyDataRec(DataRec Case, float* pOutConfidence);
/**
Show rules that were used to classify the last case.
Classify() will have set RulesUsed[] to
number of active rules for trial 0,
first active rule, second active rule, ..., last active rule,
number of active rules for trial 1,
first active rule, second active rule, ..., last active rule,
and so on.
*/
inline void showRules(int Spaces);
/**
Open file with given extension for read/write with the actual file stem.
*/
inline FILE* GetFile(String Extension, String RW);
/**
Read a raw case from file Df.
For each attribute, read the attribute value from the file.
If it is a discrete valued attribute, find the associated no.
of this attribute value (if the value is unknown this is 0).
Returns the array of attribute values.
*/
inline DataRec GetDataRec(FILE *Df, Boolean Train);
inline DataRec GetDataRecFromVec(float* pfVals, Boolean Train);
inline float TranslateStringField(int Att, const char* Name);
inline void Error(int ErrNo, String S1, String S2);
inline int getMaxClass() const;
inline int getClassAtt() const;
inline int getLabelAtt() const;
inline int getCWtAtt() const;
inline unsigned int getMaxAtt() const;
inline const char* getClassName(int nClassNo) const;
inline char* getIgnoredVals();
inline void FreeLastCase(void* DVec);
};
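Judging only from the method names in the interface, a plausible call order is: build a record from a vector of values, classify it, look up the class name, then free the record. The sketch below illustrates that order against a stub. `StubEngine`, `StubDataRec`, `classifyVector`, and the hard-coded split are all assumptions made for illustration. The stub simplifies the signatures (for example, it drops the `Boolean Train` parameter) and does not read any See5 files, so it is not the real `CMee5` behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type; the real DataRec definition is not shown above.
using StubDataRec = float*;

// Hypothetical stand-in mirroring a few CMee5 method names.
// The real class loads .names/.tree files; this stub hard-codes one split.
class StubEngine {
public:
    explicit StubEngine(std::size_t nAtts) : nAtts_(nAtts) {}

    // Copies nAtts raw values into a freshly allocated record.
    StubDataRec GetDataRecFromVec(const float* pfVals) const {
        StubDataRec rec = new float[nAtts_];
        for (std::size_t i = 0; i < nAtts_; ++i) rec[i] = pfVals[i];
        return rec;
    }

    // Returns a class number and writes a confidence to pOutConfidence.
    unsigned int classifyDataRec(StubDataRec rec, float* pOutConfidence) const {
        assert(rec != nullptr && pOutConfidence != nullptr);
        *pOutConfidence = 1.0f;  // placeholder; a real engine derives this from the tree
        return rec[0] < 0.5f ? 0u : 1u;
    }

    const char* getClassName(unsigned int nClassNo) const {
        return nClassNo == 0 ? "negative" : "positive";
    }

    // Releases a record returned by GetDataRecFromVec.
    void FreeLastCase(StubDataRec rec) const { delete[] rec; }

private:
    std::size_t nAtts_;
};

// Assumed call order: build record, classify, look up name, free record.
// `values` must hold at least as many entries as the engine has attributes.
std::string classifyVector(const StubEngine& engine, const std::vector<float>& values) {
    StubDataRec rec = engine.GetDataRecFromVec(values.data());
    float confidence = 0.0f;
    unsigned int cls = engine.classifyDataRec(rec, &confidence);
    std::string name = engine.getClassName(cls);
    engine.FreeLastCase(rec);
    return name;
}
```

Because the engine is an ordinary object, two instances built from different file stems could in principle be used side by side, which is what the class-based interface suggests.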
I would say this is the best option I have found so far.