I have a modular application, which by nature means that any of its parts may be enabled or disabled at a given time, and modules can be added or removed at any time.
Looking at the Solr documentation, everything about datasources appears to be in XML files buried away in the Solr directories.
I have yet to find an obvious way of registering new data sources programmatically (without, say, modifying those original files). I need to be able to configure Solr to look for data sources in my enabled modules.
Presumably having it traverse my directory structure looking for them is not ideal. I'm guessing a sensible option would be to point Solr at, say, a .php file (or any other script) which would return a single formatted XML file containing the data sources for each module. I guess to do this I would do something similar to the below?
solrconfig.xml
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">/var/www/site.com/data-config.php</str>
</lst>
</requestHandler>
data-config.xml (generated programmatically, with 1-n documents pulled from each module)
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/collection" user="root" password="***" batchSize="1" />
<document name="module_name">
<entity name="module_entity" query="SELECT * FROM module_table">
<field column="id" name="id" />
<field column="name" name="name" />
<field column="age" name="age" />
<field column="description" name="description" />
</entity>
</document>
</dataConfig>
I'm assuming this will work, and I'll be trying it tomorrow when I'm back at a suitable computer. In the meantime, I thought I'd ask whether there is a better way that I've overlooked?
Edit: Someone has pointed out to me that if I point Solr at a PHP script it will just read the file, not execute it, and therefore will not get valid XML back. Would a more suitable way therefore be to have a cron job execute a script which builds the XML file?
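To make the cron idea concrete, here is a minimal sketch of the kind of generator script I have in mind. It is written in Python purely for illustration (the same thing could be done in PHP). The `ENABLED_MODULES` list, the helper names `build_data_config` and `write_data_config`, and the credentials are all hypothetical placeholders; in practice the module list would come from my application's own registry. I am also assuming that the config should contain a single `<document>` with one `<entity>` per module rather than several `<document>` elements, which I have not verified.

```python
import xml.etree.ElementTree as ET

# Hypothetical description of the enabled modules; the real list would be
# read from the application's module registry.
ENABLED_MODULES = [
    {
        "name": "module_entity",
        "table": "module_table",
        "fields": ["id", "name", "age", "description"],
    },
]


def build_data_config(modules, jdbc_url, user, password):
    """Return the text of a data-config.xml covering every given module."""
    root = ET.Element("dataConfig")
    ET.SubElement(root, "dataSource", {
        "type": "JdbcDataSource",
        "driver": "com.mysql.jdbc.Driver",
        "url": jdbc_url,
        "user": user,
        "password": password,
    })
    # Assumption: one <document> holding one <entity> per enabled module.
    document = ET.SubElement(root, "document")
    for module in modules:
        entity = ET.SubElement(document, "entity", {
            "name": module["name"],
            "query": "SELECT * FROM " + module["table"],
        })
        for column in module["fields"]:
            ET.SubElement(entity, "field", {"column": column, "name": column})
    return ET.tostring(root, encoding="unicode")


def write_data_config(path, xml_text):
    """Write the generated XML to the file the request handler points at."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
```

The cron job would call `build_data_config(...)` and then `write_data_config(...)` with whatever path the `config` entry of the `/dataimport` handler refers to, and the handler would then point at that generated .xml file instead of a .php file. Presumably the import handler would also need to be told to re-read its configuration after each regeneration.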