When creating the mappings for an index that needs to search through multiple books, is it preferable to use nested mappings like the one below, or separate documents with a parent-child relationship?
book: {
  properties: {
    isbn: { //- ISBN of the book
      type: 'string' //- 9783791535661
    },
    title: { //- Title of the book
      type: 'string' //- Alice in Wonderland
    },
    author: { //- Author of the book (maybe should be array)
      type: 'string' //- Lewis Carroll
    },
    category: { //- Category of the book (maybe should be array)
      type: 'string' //- Fantasy
    },
    toc: { //- Array of the chapters in the book
      type: 'nested',
      properties: {
        html: { //- HTML content of a chapter
          type: 'string' //- <!DOCTYPE html><html>...</html>
        },
        title: { //- Title of the chapter
          type: 'string' //- Down the Rabbit Hole
        },
        fileName: { //- File name of this chapter
          type: 'string' //- chapter_1.html
        },
        firstPage: { //- The first page of this chapter
          type: 'integer' //- 3
        },
        numberOfPages: { //- How many pages are in this chapter
          type: 'integer' //- 27
        },
        sections: { //- An array of all of the sections within a chapter
          type: 'nested',
          properties: {
            html: { //- The HTML content of a section
              type: 'string' //- <section>...</section>
            },
            title: { //- The title of a section
              type: 'string' //- section number 2 or something
            },
            figures: { //- Array of the figures within a section
              type: 'nested',
              properties: {
                html: { //- HTML content of a figure
                  type: 'string' //- <figure>...</figure>
                },
                caption: { //- The name of a figure
                  type: 'string' //- Figure 1
                },
                id: { //- Id of a figure
                  type: 'string' //- figure4
                }
              }
            },
            paragraphs: { //- Array of the paragraphs within a section
              type: 'nested',
              properties: {
                html: { //- HTML content of a paragraph
                  type: 'string' //- <p>...</p>
                },
                id: { //- Id of a paragraph
                  type: 'string' //- paragraph3
                }
              }
            }
          }
        }
      }
    }
  }
}
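For comparison, the parent-child variant I have in mind would look roughly like the sketch below. This is only my assumption of how it would be laid out: the type names `chapter`, `section`, `figure` and `paragraph` and the variable name `parentChildMappings` are made up for illustration, and the `_parent` mapping syntax assumes an Elasticsearch version that still uses `'string'` fields and allows several types in one index.

```javascript
// Hypothetical parent-child layout: one mapping type per level of the book,
// each pointing at the type of its parent via _parent.
// As far as I understand, grandchildren (sections, figures, paragraphs) would
// also have to be indexed with routing set to the book's id so that the whole
// family ends up on the same shard.
const parentChildMappings = {
  book: {
    properties: {
      isbn: { type: 'string' },
      title: { type: 'string' },
      author: { type: 'string' },
      category: { type: 'string' }
    }
  },
  chapter: {
    _parent: { type: 'book' },
    properties: {
      html: { type: 'string' },
      title: { type: 'string' },
      fileName: { type: 'string' },
      firstPage: { type: 'integer' },
      numberOfPages: { type: 'integer' }
    }
  },
  section: {
    _parent: { type: 'chapter' },
    properties: {
      html: { type: 'string' },
      title: { type: 'string' }
    }
  },
  figure: {
    _parent: { type: 'section' },
    properties: {
      html: { type: 'string' },
      caption: { type: 'string' },
      id: { type: 'string' }
    }
  },
  paragraph: {
    _parent: { type: 'section' },
    properties: {
      html: { type: 'string' },
      id: { type: 'string' }
    }
  }
};
```

With this layout every chapter, section, figure and paragraph is its own document, so each can be updated or returned on its own, at the cost of join-style queries between the levels.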
The HTML of an entire book is approximately 250 kB. I would want to run queries such as:
- the best matching paragraph, including its nearest paragraphs on either side
- the best matching section from a single book including any child sections
- the best figure given it is inside a section with a matching title
- etc
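To make the third example concrete, here is roughly how I imagine it would be written against the nested mapping. It is an untested sketch: `buildFigureQuery` is a helper name I made up, I am matching the figure on its `caption`, and `inner_hits` assumes an Elasticsearch version that supports it on nested queries.

```javascript
// Sketch only: builds a search body for "the best figure inside a section
// with a matching title", using one nested query per level of nesting
// (toc -> toc.sections -> toc.sections.figures).
// buildFigureQuery is a hypothetical helper, not part of any client library.
function buildFigureQuery(sectionTitle, figureText) {
  return {
    query: {
      nested: {
        path: 'toc',
        query: {
          nested: {
            path: 'toc.sections',
            query: {
              bool: {
                must: [
                  // The section title has to match...
                  { match: { 'toc.sections.title': sectionTitle } },
                  // ...and the section has to contain a matching figure.
                  {
                    nested: {
                      path: 'toc.sections.figures',
                      query: {
                        match: { 'toc.sections.figures.caption': figureText }
                      },
                      // Assumption: asks for the matching figures themselves
                      // to be returned, not just the whole book document.
                      inner_hits: {}
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  };
}

// Print the body that would be sent as the search request.
console.log(JSON.stringify(buildFigureQuery('Down the Rabbit Hole', 'Figure 1'), null, 2));
```

Without something like `inner_hits`, my understanding is that a nested query returns the whole book document, which is part of why I am unsure about the nested approach.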
I don't yet know the specifics of the queries I will want to perform, but it is important to have enough flexibility to try out unusual ones without having to change my mappings too much.