What is a fragment in MarkLogic Server? Please explain in detail how fragmentation works in MarkLogic and why we should avoid it.
1 Answer
Did you read the documentation? http://docs.marklogic.com/guide/admin/fragments#chapter
In a nutshell:
MarkLogic indexes are fragment-based, which means their granularity ends at the fragment level. Normally a document is stored as a single fragment (optionally with a separate one for its properties), but you can define fragment roots and fragment parents. These cause documents to be cut into pieces at storage time, but in such a way that if you serialize the root of the document to the output, all sub-parts are retrieved and joined together as if the document had never been cut into pieces.
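To make the cut-and-rejoin idea concrete, here is a toy Python sketch. It is only an illustration of the concept, not how MarkLogic stores data internally: the function names and the `fragment-ref` placeholder element are made up for this example, and it assumes fragment roots are not nested inside each other.

```python
import copy
import xml.etree.ElementTree as ET


def fragment(doc_xml, fragment_root):
    """Cut every <fragment_root> element out of the document.

    Returns (skeleton, fragments). The skeleton keeps a hypothetical
    <fragment-ref id="N"/> placeholder where each piece used to be.
    Assumes fragment roots are not nested.
    """
    root = ET.fromstring(doc_xml)
    fragments = []
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == fragment_root:
                placeholder = ET.Element("fragment-ref", {"id": str(len(fragments))})
                # Text following the element belongs to the parent, so keep it there.
                placeholder.tail = child.tail
                child.tail = None
                parent[index] = placeholder
                fragments.append(child)
    return root, fragments


def reassemble(skeleton, fragments):
    """Put every stored piece back in place of its placeholder."""
    root = copy.deepcopy(skeleton)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "fragment-ref":
                restored = copy.deepcopy(fragments[int(child.get("id"))])
                restored.tail = child.tail
                parent[index] = restored
    return root


doc = "<book><title>T</title><chapter>one</chapter><chapter>two</chapter></book>"
skeleton, pieces = fragment(doc, "chapter")
rejoined = ET.tostring(reassemble(skeleton, pieces), encoding="unicode")
```

In this sketch `pieces` holds the two `chapter` elements separately, while `rejoined` serializes back to the original document, which is the "as if it was never cut" behaviour described above.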
Why avoid?
Most importantly because cts queries (which the search library uses as well) normally don't cross fragment borders. You have to indicate explicitly that you want to cross those borders, for instance by using cts:document-fragment-query, of which a good example is given here: http://developer.marklogic.com/pubs/5.0/apidocs/cts-query.html#cts:document-fragment-query
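The effect of fragment borders on an "and" style query can be shown with a small Python toy model. This is a simplified illustration under my own assumptions, not MarkLogic's index implementation: the fragment ids, the sample texts, and all function names are hypothetical. The document-level variant is only loosely analogous in intent to crossing fragment borders explicitly.

```python
from collections import defaultdict


def build_index(fragments):
    """Map each lower-cased word to the set of fragment ids containing it."""
    index = defaultdict(set)
    for fragment_id, text in fragments.items():
        for word in text.lower().split():
            index[word].add(fragment_id)
    return index


def and_query(index, words):
    """Fragments that contain every word; matching stops at the fragment border."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()


def document_level_and_query(index, words, fragment_to_doc):
    """Documents in which every word occurs in at least one fragment."""
    per_word_docs = [
        {fragment_to_doc[f] for f in index.get(w.lower(), set())} for w in words
    ]
    return set.intersection(*per_word_docs) if per_word_docs else set()


# Hypothetical data: doc1 is split into two fragments, doc2 is a single fragment.
fragments = {
    "doc1#0": "marklogic stores fragments",
    "doc1#1": "queries use indexes",
    "doc2#0": "marklogic queries",
}
fragment_to_doc = {"doc1#0": "doc1", "doc1#1": "doc1", "doc2#0": "doc2"}

index = build_index(fragments)
same_fragment = and_query(index, ["marklogic", "queries"])
same_document = document_level_and_query(index, ["marklogic", "queries"], fragment_to_doc)
```

Here `same_fragment` contains only `doc2#0`, because in doc1 the two words sit in different fragments, while `same_document` contains both doc1 and doc2. That difference is the surprise you run into when a fragmented document is searched as if it were a single unit.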
Storing the pieces as separate documents instead makes the fragmentation explicit. You'd have to do multiple searches and combine the results, but MarkLogic is very quick, so doing a few searches instead of only one usually works almost as fast.
Some people have also observed performance issues when handling large documents consisting of many (tens of thousands of) fragments, something that won't happen with separate documents.
HTH!