Inspired by the next-word prediction in iOS 7's iMessage, I've decided to try writing a script that learns from user input which letters are most likely to complete the user's current word, or which word is most likely to come next.
To do this, I'm going to use a data structure very similar to a Radix Tree (AKA a Patricia Trie).
Take this user input for example:
I like icecream
From that, my goal is to generate the following data structure:
var speakData = {
  "I": { // the key
    value: "I", // the stored value for this unit of the combination
    count: 1, // the number of times that this combination has occurred
    followables: { // the next level of the tree; all combinations
                   // that might follow this one
      " ": {
        value: " ",
        count: 1,
        followables: {
          "l": {
            value: "l",
            count: 1,
            followables: {
              "i": {
                value: "i",
                count: 1,
                followables: {
                  "k": {
                    value: "k",
                    count: 1,
                    followables: {
                      "e": {
                        value: "e",
                        count: 1,
                        followables: {
                          // and so on
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
};
This is essentially a Radix Tree with one character per node (in other words, a plain trie) plus a count on each node, which lets me weigh the probability of each learned possibility for what the user might want to type next.
From the above extremely limited data set, when the user types an "I", our best (and only) guess is that the next character will be a " ".
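To make the "best guess" idea concrete, here is a minimal sketch of how a lookup could work once the tree exists. The `predictNext` helper and the `sample` data are hypothetical names made up for illustration; the sketch assumes every node has the `value` / `count` / `followables` shape shown above and simply picks the followable with the highest count.

```javascript
// Hypothetical helper (not part of the original script): walk the tree along
// the characters typed so far, then return the most frequent next character.
function predictNext(brain, typed) {
  let level = brain; // the map of nodes we are currently looking at
  for (const ch of typed) {
    if (!level[ch]) return null; // this prefix was never learned
    level = level[ch].followables;
  }
  let best = null;
  for (const key of Object.keys(level)) {
    // Keep the candidate with the highest count; ties keep the first one seen.
    if (best === null || level[key].count > best.count) best = level[key];
  }
  return best ? best.value : null;
}

// Tiny hand-built tree (assumed data): "I" was seen 3 times,
// followed by " " twice and by "'" once.
const sample = {
  "I": {
    value: "I",
    count: 3,
    followables: {
      " ": { value: " ", count: 2, followables: {} },
      "'": { value: "'", count: 1, followables: {} }
    }
  }
};

console.log(JSON.stringify(predictNext(sample, "I"))); // prints " "
```

Dividing a followable's `count` by its parent's `count` would give a rough probability instead of just a winner, if a ranked list of suggestions is wanted.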
So now that I've explained my goal and method, here's my question:
How can I build this data structure from any given user input?
function learn(message, brain){
  for(var i = 0; i < message.length; i++){
    brain[message[i]] = {};
    brain[message[i]].value = message[i];
    brain[message[i]].count++;
    brain[message[i]].followables =
  }
}
This is as far as I've gotten, but I'm not sure how to insert the next values at the proper positions recursively.
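One possible way to handle the nesting, sketched here rather than offered as the definitive answer, is to skip explicit recursion and keep a reference to the current level, moving it down into `followables` after each character. The sketch assumes the node shape from the example above and that a whole message is inserted as a single path from the root; it also creates a node only when it is missing, so existing counts are not overwritten.

```javascript
// Sketch: insert one message into the tree, one character per level.
// `learn` and `brain` follow the names used in the attempt above.
function learn(message, brain) {
  let level = brain; // the followables map we are currently inserting into
  for (const ch of message) {
    // Create the node only the first time this character appears here,
    // so that an existing count is never reset.
    if (!level[ch]) {
      level[ch] = { value: ch, count: 0, followables: {} };
    }
    level[ch].count++;
    // Descend: the next character belongs in this node's followables.
    level = level[ch].followables;
  }
  return brain;
}

const brain = {};
learn("I like icecream", brain);
learn("I love it", brain);

const afterL = brain["I"].followables[" "].followables["l"];
console.log(brain["I"].count);              // 2
console.log(Object.keys(afterL.followables)); // [ 'i', 'o' ]
```

Because `level` is just a pointer into the same nested object, each iteration inserts at the right depth without the function having to call itself. Learning word by word instead of message by message would need the input to be split first, which this sketch does not attempt.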