solr - Indexing multiple values as one field in Solr

Question

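I want to query a field of doubles that actually groups four values together, and each document can have multiple instances of it. So I need a field that can store something like this: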

<doc>
   <field name="id">id</field>
   <field name="valueGroup">1 2 3 4</field>
   <field name="valueGroup">5 6 7 8</field>
</doc>

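and then run a range query like this: valueGroup:[0,0,0,0 to 3,8,8,8]. I can't index it as a single field with multiValued="true", because each group needs to be handled separately. I know there is a LatLon field type, but it only holds two values. How can I get a field with more than two dimensions?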

score 0 · Accepted Answer

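As I mentioned when replying to your comment on my SO question, I also had a fairly niche requirement for some complex filtering. In the end, I had to create a custom field class, which let me override the method responsible for returning the query object so that it returns a query containing my custom filtering logic. This approach should suit you well: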

public class MyCustomFieldType extends FieldType {
    /**
     * {@inheritDoc}
     */
    @Override
    protected void init(final IndexSchema schema, final Map<String, String> args) {
        trueProperties |= TOKENIZED;
        super.init(schema, args);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void write(final XMLWriter xmlWriter, final String name, final Fieldable fieldable)
        throws IOException
    {
        xmlWriter.writeStr(name, fieldable.stringValue());
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void write(final TextResponseWriter writer, final String name, final Fieldable fieldable)
        throws IOException
    {
        writer.writeStr(name, fieldable.stringValue(), true);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public SortField getSortField(final SchemaField field, final boolean reverse) {
        return getStringSort(field, reverse);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void setAnalyzer(final Analyzer analyzer) {
        this.analyzer = analyzer;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void setQueryAnalyzer(final Analyzer queryAnalyzer) {
        this.queryAnalyzer = queryAnalyzer;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public Query getFieldQuery(
        final QParser parser, final SchemaField field, final String externalVal)
    {
        // Do some parsing of the user's input (if necessary) from the query string (externalVal)
        final String parsedInput = ...

        // Instantiate your custom filter, taking note to wrap it in a caching wrapper!
        final Filter filter = new CachingWrapperFilter(
            new MyCustomFilter(field, parsedInput));

        // Return a query that runs your filter against all docs in the index
        // NOTE: depending on your needs, you may be able to do a more fine grained query here
        // instead of a MatchAllDocsQuery!!
        return new FilteredQuery(new MatchAllDocsQuery(), filter);
    }
}

Now you need a custom filter...

public class MyCustomFilter extends Filter {
    /**
     * The field that is being filtered.
     */
    private final SchemaField field;

    /**
     *  The value to filter against.
     */
    private final String filterBy;

    /**
     * Creates a filter for the given field and value.
     *
     * @param field     The field to perform filtering against.
     * @param filterBy  A value to filter by.
     */
    public MyCustomFilter(
        final SchemaField field,
        final String filterBy)
    {
        this.field = field;
        this.filterBy = filterBy;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {

        final FixedBitSet bitSet = new FixedBitSet(reader.maxDoc());

        // find all the docs you want to run the filter against
        final Weight weight = new IndexSearcher(reader).createNormalizedWeight(
            new SOME_QUERY_TYPE_HERE());

        final Scorer docIterator = weight.scorer(reader, true, false);

        if (docIterator == null) {
            return bitSet;
        }

        int docId;

        while ((docId = docIterator.nextDoc()) != Scorer.NO_MORE_DOCS) {

            final Document doc = reader.document(docId);

            for (final String indexFieldValue : doc.getValues(field.getName())) {
                // CUSTOM LOGIC GOES HERE

                // If your criteria are met, consider the doc a match
                bitSet.set(docId);
            }
        }

        return bitSet;
    }

    /**
     * {@inheritDoc}
     */
    @Override
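    public boolean equals(final Object other) {
        // NEEDED FOR CACHING: assumes two filters are equal when field name and value match
        if (this == other) {
            return true;
        }
        if (!(other instanceof MyCustomFilter)) {
            return false;
        }
        final MyCustomFilter that = (MyCustomFilter) other;
        return field.getName().equals(that.field.getName()) && filterBy.equals(that.filterBy);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public int hashCode() {
        // NEEDED FOR CACHING: must stay consistent with equals
        return 31 * field.getName().hashCode() + filterBy.hashCode();
    }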
}

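The example above is obviously very basic, but if you use it as a template, tune it for performance, and add your custom logic, you should get what you need. Also make sure to implement the hashCode and equals methods in your filter, since they are used for caching. In the query string, you can supply the fq parameter like this: `?q=some query&fq=myfield:[0,0,0,0 to 3,8,8,8]`.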
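
For the four-value groups in the question, the part marked CUSTOM LOGIC GOES HERE could be filled in roughly as follows. This is only a sketch under a few assumptions: the range is meant to be inclusive and checked dimension by dimension, stored groups are whitespace-separated as in the sample document, and `ValueGroupRange` with its methods is a hypothetical helper, not part of Solr or Lucene.

```java
/** Hypothetical helper: an inclusive, per-dimension range over fixed-size groups of doubles. */
final class ValueGroupRange {
    private final double[] lower;
    private final double[] upper;

    private ValueGroupRange(final double[] lower, final double[] upper) {
        if (lower.length != upper.length) {
            throw new IllegalArgumentException("Bounds must have the same number of dimensions");
        }
        this.lower = lower;
        this.upper = upper;
    }

    /** Parses a range such as "[0,0,0,0 to 3,8,8,8]" (brackets optional, "to" case-insensitive). */
    static ValueGroupRange parse(final String input) {
        String body = input.trim();
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        final String[] bounds = body.split("(?i)\\s+to\\s+");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected '<lower> to <upper>' but got: " + input);
        }
        return new ValueGroupRange(parseGroup(bounds[0]), parseGroup(bounds[1]));
    }

    /** Parses one group whose values are separated by commas and/or whitespace, e.g. "1 2 3 4". */
    static double[] parseGroup(final String group) {
        final String[] parts = group.trim().split("[,\\s]+");
        final double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i]);
        }
        return values;
    }

    /** Returns true only if every value lies within the bounds of its own dimension. */
    boolean contains(final double[] group) {
        if (group.length != lower.length) {
            return false;
        }
        for (int i = 0; i < group.length; i++) {
            if (group[i] < lower[i] || group[i] > upper[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Inside the filter's loop, the range would be parsed once from filterBy, and bitSet.set(docId) would be called only when range.contains(ValueGroupRange.parseGroup(indexFieldValue)) is true for at least one of the document's groups. That keeps each group evaluated on its own, which is what a plain multi-valued field cannot do.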

As I mentioned, this approach worked very well for me and my team, since we had very specific requirements for content filtering.

Good luck. :)
