I have a Postgres array column that I want to index in Elasticsearch and then use in search. Here is an example:
phones = [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ]
I analyzed the field with the Analyze API, and Elasticsearch splits it into multiple tokens, as follows:
curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ]
}'
{
  "tokens": [
    { "token": "analyzer", "start_offset": 6, "end_offset": 14, "type": "<ALPHANUM>", "position": 1 },
    { "token": "standard", "start_offset": 19, "end_offset": 27, "type": "<ALPHANUM>", "position": 2 },
    { "token": "text", "start_offset": 33, "end_offset": 37, "type": "<ALPHANUM>", "position": 3 },
    { "token": "175", "start_offset": 45, "end_offset": 48, "type": "<NUM>", "position": 4 },
    { "token": "2", "start_offset": 50, "end_offset": 51, "type": "<NUM>", "position": 5 },
    { "token": "123", "start_offset": 53, "end_offset": 56, "type": "<NUM>", "position": 6 },
    { "token": "25", "start_offset": 57, "end_offset": 59, "type": "<NUM>", "position": 7 },
    { "token": "32", "start_offset": 60, "end_offset": 62, "type": "<NUM>", "position": 8 },
    { "token": "123456789", "start_offset": 66, "end_offset": 75, "type": "<NUM>", "position": 9 },
    { "token": "12", "start_offset": 80, "end_offset": 82, "type": "<NUM>", "position": 10 },
    { "token": "111", "start_offset": 83, "end_offset": 86, "type": "<NUM>", "position": 11 },
    { "token": "111", "start_offset": 87, "end_offset": 90, "type": "<NUM>", "position": 12 },
    { "token": "11", "start_offset": 91, "end_offset": 93, "type": "<NUM>", "position": 13 }
  ]
}
I want Elasticsearch either to skip tokenization and store the numbers without special characters (e.g. "+175 (2) 123-25-32" converted to "+17521232532"), or simply to index each number as is, so that it is available in search results.
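To make the first option concrete, here is a minimal sketch of the normalization I have in mind, done on the Ruby side before indexing. The helper name `normalize_phone` is hypothetical and not part of any library; it only assumes that a leading "+" should be kept and every other non-digit dropped.

```ruby
# Hypothetical helper: keeps a leading "+" (if present) and strips
# every other non-digit character from the phone number.
def normalize_phone(raw)
  prefix = raw.strip.start_with?("+") ? "+" : ""
  prefix + raw.gsub(/\D/, "")
end

phones = ["+175 (2) 123-25-32", "123456789", "+12 111-111-11"]

# Normalize every entry of the array before sending it to Elasticsearch.
normalized = phones.map { |phone| normalize_phone(phone) }
```

The same helper would have to be applied to the search terms as well, so that the indexed and queried values have the same form.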
My mapping is as follows:
{ :id => { :type => "string"}, :secondary_phones => { :type => "string" } }
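For the second option (indexing each number as is), my guess is that the field has to be excluded from analysis. The sketch below assumes an Elasticsearch version where the `string` type still accepts `index: "not_analyzed"` (1.x/2.x); the variable name `mapping` is only for illustration, and I have not confirmed that this solves the problem.

```ruby
# Sketch of a mapping that stores each phone number as a single,
# unanalyzed term (assumes pre-5.x Elasticsearch "string" fields).
mapping = {
  :id => { :type => "string" },
  # The field name must match the one used in the query.
  :secondary_phones => { :type => "string", :index => "not_analyzed" }
}
```

With such a mapping, a `terms` filter would have to be given the exact stored strings, including spaces and punctuation.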
Here is how I am trying to run the query:
settings = {
  query: {
    filtered: {
      filter: {
        bool: {
          should: [
            { terms: { phones: [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ] } },
          ]
        }
      }
    }
  },
  size: 100,
}
P.S. I have also tried removing the special characters from the search terms, but with no luck.
I am sure this is achievable and that I am missing something. Any suggestions?
Thanks.