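I am new to Solr and want to index an XML file exported by phpMyAdmin. However, the file is structured as tables and columns, and Solr does not index it when I put it in the example directory.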

==================================================== ====================

<?xml version="1.0" encoding="utf-8"?>
<!--
- phpMyAdmin XML Dump
- version 3.5.1
- http://www.phpmyadmin.net
-
- Host: localhost
- Generation Time: Nov 22, 2012 at 07:33 AM
- Server version: 5.5.24-log
- PHP Version: 5.3.13
-->

<pma_xml_export version="1.0" xmlns:pma="http://www.phpmyadmin.net/some_doc_url/">
    <!--
    - Structure schemas
    -->
    <pma:structure_schemas>
        <pma:database name="blog" collation="latin1_swedish_ci" charset="latin1">
            <pma:table name="post">
                CREATE TABLE `post` (
                  `post_id` int(11) NOT NULL,
                  `Title` varchar(50) NOT NULL,
                  `Author` varchar(50) NOT NULL,
                  `Status` varchar(15) NOT NULL,
                  `Date` date NOT NULL,
                  `Time` time NOT NULL,
                  `Text` varchar(1000) NOT NULL,
                  `Category` varchar(25) NOT NULL,
                  `Tags` varchar(10000) NOT NULL,
                  `Links` varchar(10000) NOT NULL,
                  `Ratings` int(11) NOT NULL,
                  PRIMARY KEY (`post_id`),
                  UNIQUE KEY `post_id` (`post_id`),
                  UNIQUE KEY `post_id_2` (`post_id`)
                ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
            </pma:table>
        </pma:database>
    </pma:structure_schemas>

    <!--
    - Database: 'blog'
    -->
    <database name="blog">
        <!-- Table post -->
        <table name="post">
            <column name="post_id">1</column>
            <column name="Title">Human Interface device to Com-port</column>
            <column name="Author">n72.241</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-03-19</column>
            <column name="Time">10:00:09</column>
            <column name="Text">Is there a way to force input from USB HID into specific com-port?</column>
            <column name="Category">Human Interface</column>
            <column name="Tags">interface, com-port, device, human interface</column>
            <column name="Links">www.something.com</column>
            <column name="Ratings">8</column>
        </table>
        <table name="post">
            <column name="post_id">2</column>
            <column name="Title">Human Interface device to Com-port</column>
            <column name="Author">Narmeen</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-03-19</column>
            <column name="Time">10:15:30</column>
            <column name="Text">What do you exactly mean? serial data throughput is thousands of time slower then usb</column>
            <column name="Category"></column>
            <column name="Tags"></column>
            <column name="Links"></column>
            <column name="Ratings">0</column>
        </table>
        <table name="post">
            <column name="post_id">3</column>
            <column name="Title">Human Interface device to Com-port</column>
            <column name="Author">orb</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-03-19</column>
            <column name="Time">10:25:30</column>
            <column name="Text">on hardware/firmware level or OS/driver level, and if OS/driver, then what OS?</column>
            <column name="Category"></column>
            <column name="Tags"></column>
            <column name="Links"></column>
            <column name="Ratings">0</column>
        </table>
        <table name="post">
            <column name="post_id">4</column>
            <column name="Title">Human Interface device to Com-port</column>
            <column name="Author">someone</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-03-19</column>
            <column name="Time">11:00:00</column>
            <column name="Text">Im putting some long text to see how its looks on the main site.A human interface device or HID is a type of computer device that interacts directly with, and most often takes input from, humans and may deliver output to humans. The term &quot;HID&quot; most commonly refers to the USB-HID specification. The term was coined by Mike Van Flandern of Microsoft when he proposed the USB committee create a Human Input Device class working group.[when?] The working group was renamed as the Human Interface Device class at the suggestion of Tom Schmidt of DEC because the proposed standard supported bi-directional communication.[when?]ww/</column>
            <column name="Category"></column>
            <column name="Tags"></column>
            <column name="Links"></column>
            <column name="Ratings">0</column>
        </table>
        <table name="post">
            <column name="post_id">5</column>
            <column name="Title">Human Interface device to Com-port</column>
            <column name="Author">n72.241</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-11-08</column>
            <column name="Time">11:15:00</column>
            <column name="Text">Human interface guidelines (HIG) are software development documents which offer application developers a set of recommendations. Their aim is to improve the experience for the users by making application interfaces more intuitive, learnable, and consistent. Most guides limit themselves to defining a common look and feel for applications in a particular desktop environment. The guides enumerate specific policies. Policies are sometimes based on studies of human-computer interaction (so called usability studies), but most are based on arbitrary conventions chosen by the platform developers. The central aim of a HIG is to create a consistent experience across the environment (generally an operating system or desktop environment), including the applications and other tools being used. This means both applying the same visual design and creating consistent access to and behaviour of common elements of the interface - from simple ones such as buttons and icons up to more complex construction</column>
            <column name="Category">Human interface</column>
            <column name="Tags">human cateogy text checking</column>
            <column name="Links">something.com</column>
            <column name="Ratings">8</column>
        </table>
        <table name="post">
            <column name="post_id">6</column>
            <column name="Title">some other things</column>
            <column name="Author">me</column>
            <column name="Status">not answered</column>
            <column name="Date">2012-11-30</column>
            <column name="Time">10:00:00</column>
            <column name="Text">Rommendations and advice meant to help developers create better applications. Developers sometimes intentionally choose to break them if they think that the guidelines do not fit their application, or usability testing reveals an advantage in doing so. But in turn, the organization publishing the HIG might withhold endorsement of the application. Mozilla Firefox's user interface, for example, goes against the GNOME project's HIG, which is one of the main arguments for</column>
            <column name="Category">Not right</column>
            <column name="Tags">here, there anywhere</column>
            <column name="Links">checking.com</column>
            <column name="Ratings">5</column>
        </table>
        <table name="post">
            <column name="post_id">7</column>
            <column name="Title">some other things again</column>
            <column name="Author">xyz</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-12-29</column>
            <column name="Time">12:00:00</column>
            <column name="Text">Human interface guidelines often describe the visual design rules, including icon and window design and style. Frequently they specify how user input and interaction mechanisms work. Aside from the detailed rules, guidelines sometimes also make broader suggestions about how to organize and design the application and write user-interface text.
HIGs are also done for applications. In this case the HIG will build on a platform HIG by adding the common semantics for a range of application functions</column>
            <column name="Category">nothing</column>
            <column name="Tags">sfksdjghsklgjlsgj</column>
            <column name="Links">something.com</column>
            <column name="Ratings">0</column>
        </table>
        <table name="post">
            <column name="post_id">8</column>
            <column name="Title">noting to say</column>
            <column name="Author">na</column>
            <column name="Status"></column>
            <column name="Date">2012-11-16</column>
            <column name="Time">00:00:00</column>
            <column name="Text">what if this dosent works then what will i do now here</column>
            <column name="Category">sdfsdfs</column>
            <column name="Tags"></column>
            <column name="Links"></column>
            <column name="Ratings">0</column>
        </table>
        <table name="post">
            <column name="post_id">9</column>
            <column name="Title">checkinf for time</column>
            <column name="Author">na</column>
            <column name="Status">Answered</column>
            <column name="Date">2012-10-10</column>
            <column name="Time">09:00:00</column>
            <column name="Text">hoping this works now</column>
            <column name="Category">nothing</column>
            <column name="Tags">afjalfjaf</column>
            <column name="Links">kdflsdfj</column>
            <column name="Ratings">8</column>
        </table>
    </database>
</pma_xml_export>

1 Answer

First, to index the XML file, you need to convert it to Solr's XML document format, as shown below.

<add>
    <doc>
        <field ...
        <field ...
    </doc>
    <doc>
        <field ...
        <field ...
    </doc>
</add>
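
As a rough illustration of that conversion, here is a minimal Python sketch that turns each `<table name="post">` row of the dump into a `<doc>`, and each non-empty `<column>` into a `<field>`. The function name `pma_to_solr` is made up for this example, and the sketch assumes the column names (`post_id`, `Title`, and so on) are usable as Solr field names. They would still have to be declared in your schema, and values such as `Date` may need reformatting depending on the field type you choose.

```python
import xml.etree.ElementTree as ET


def pma_to_solr(pma_xml, table_name="post"):
    """Convert a phpMyAdmin XML dump (as text) into Solr <add> XML.

    Hypothetical helper for illustration only, not part of Solr or phpMyAdmin.
    """
    root = ET.fromstring(pma_xml)
    add = ET.Element("add")
    # In the dump, every un-prefixed <table name="..."> element is one row.
    # The <pma:table> element holding CREATE TABLE lives in the pma namespace,
    # so it is not matched by the plain tag name "table".
    for row in root.iter("table"):
        if row.get("name") != table_name:
            continue
        doc = ET.SubElement(add, "doc")
        for col in row.findall("column"):
            value = (col.text or "").strip()
            if not value:
                continue  # leave empty columns out of the document
            field = ET.SubElement(doc, "field", name=col.get("name"))
            field.text = value
    return ET.tostring(add, encoding="unicode")
```

You would read the dump into a string, pass it to the function, and write the returned text to a new file, which can then be posted to Solr in the same way as the bundled example documents.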

Alternatively, you can use the Data Import Handler to index the data by fetching it directly from the relational database.

answered 2012-11-29T09:31:25.437