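I'm trying to recognize random characters on Android using the tess-two API. I have a printed sheet of paper with the string "5XqaLB".
When I show parts of the string to the camera for recognition, I get results like these: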
original -> result
"5XqaLB" -> "5anLB"
"XqaLB" -> "anLB"
"qaLB" -> "qaLB"
"5Xq" -> "5Xq"
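I suspect this happens because Tesseract tries to guess a word from the characters it has recognized. I've searched a lot but couldn't find a solution. Does anyone have an idea how to prevent Tesseract from making these substitutions?
I've already tried whitelists, blacklists, and configuration variables such as: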
baseApi.setVariable("load_system_dawg", "0");
baseApi.setVariable("load_freq_dawg", "0");
baseApi.setVariable("load_punc_dawg", "0");
baseApi.setVariable("load_number_dawg", "0");
baseApi.setVariable("load_unambig_dawg", "0");
baseApi.setVariable("load_bigram_dawg", "0");
baseApi.setVariable("load_fixed_length_dawgs", "0");
baseApi.setVariable("segment_penalty_garbage", "0");
baseApi.setVariable("segment_penalty_dict_nonword", "0");
baseApi.setVariable("segment_penalty_dict_frequent_word", "0");
baseApi.setVariable("segment_penalty_dict_case_ok", "0");
baseApi.setVariable("segment_penalty_dict_case_bad", "0");
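One thing worth ruling out is whether these settings are accepted at all. The following is a minimal sketch, not part of my original code: it assumes that `setVariable(String, String)` returns a boolean indicating success (as I believe tess-two's `TessBaseAPI` does), and the class `TessVariables` and its methods are hypothetical helper names. It collects the variables listed above and reports any name the setter rejects. The setter is passed in as a function so the helper does not depend on tess-two itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical helper, not part of tess-two.
public class TessVariables {

    /** The variables from the question, each set to "0". */
    static Map<String, String> noDictionaryVariables() {
        String[] names = {
            "load_system_dawg", "load_freq_dawg", "load_punc_dawg",
            "load_number_dawg", "load_unambig_dawg", "load_bigram_dawg",
            "load_fixed_length_dawgs",
            "segment_penalty_garbage", "segment_penalty_dict_nonword",
            "segment_penalty_dict_frequent_word",
            "segment_penalty_dict_case_ok", "segment_penalty_dict_case_bad"
        };
        // LinkedHashMap keeps the order in which the variables are listed.
        Map<String, String> vars = new LinkedHashMap<>();
        for (String name : names) {
            vars.put(name, "0");
        }
        return vars;
    }

    /**
     * Passes every variable to the setter and returns the names for which
     * the setter returned false (assumed to mean "not accepted").
     */
    static List<String> applyAll(Map<String, String> vars,
                                 BiFunction<String, String, Boolean> setter) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> entry : vars.entrySet()) {
            if (!setter.apply(entry.getKey(), entry.getValue())) {
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }
}
```

With the `baseApi` object from the snippet above, the call would look like `TessVariables.applyAll(TessVariables.noDictionaryVariables(), baseApi::setVariable)`, and the returned list could be logged. Note that `java.util.function` on Android needs API level 24 or core library desugaring, and that an accepted variable does not prove it takes effect after initialization.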
Can anyone suggest how to make Tesseract recognize just the plain characters, without any word guessing?