TL;DR: Your expectation is incorrect. goodbye&goodnight has two word-separation points, one on each side of the ampersand. Whether the ampersand is encoded or not is irrelevant.
As far as I know, CSS doesn't fully specify what a "word" is, but there is a recommendation to use the Unicode standard word-separation algorithm, which is described in Unicode Standard Annex #29, "Unicode Text Segmentation" (UAX29).
An informal summary is that a word is a sequence of letters, numbers or "Connector_Punctuation" symbols (ties), possibly containing "MidLetter", "MidNum" or "MidNumLet" symbols (there's a list in the referenced document), depending on the immediate context of the symbol. & is not in any of those categories, so a UAX29-conformant word-separation algorithm should split words before and after an &.
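To make the informal summary concrete, here is a rough Python sketch of that rule. It is only an approximation, not a UAX29 implementation: the name split_words is made up for illustration, the symbol sets are small hand-picked subsets of the real MidLetter, MidNum and MidNumLet lists, and \w stands in for "letters, numbers and connector punctuation".

```python
import re

# Rough approximation of the informal summary above; NOT a full UAX29
# implementation. The symbol sets are small illustrative subsets.
LETTER = r"[^\W\d]"        # letters, plus "_" (a connector)
MID_LETTER = r"[:·.'’]"    # subset of MidLetter + MidNumLet
MID_NUM = r"[,;.'’]"       # subset of MidNum + MidNumLet

# A word is a run of word characters, optionally continued across a
# "mid" symbol when that symbol sits between two letters or two digits.
WORD = (
    r"\w+(?:(?:"
    rf"(?<={LETTER}){MID_LETTER}(?={LETTER})"
    r"|"
    rf"(?<=\d){MID_NUM}(?=\d)"
    r")\w+)*"
)

# Anything else that is not whitespace becomes a token of its own.
TOKEN_RE = re.compile(rf"{WORD}|\S")


def split_words(text: str) -> list[str]:
    """Return the non-whitespace tokens of text, in order."""
    return TOKEN_RE.findall(text)


print(split_words("goodbye&goodnight"))  # ['goodbye', '&', 'goodnight']
print(split_words("can't"))              # ["can't"]
print(split_words("3.14"))               # ['3.14']
```

The apostrophe and the decimal point stay inside their words because they sit between two letters or two digits. The ampersand is in none of the "mid" sets, so it ends one word and the next word starts after it.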
A word-separation algorithm may take language into account. Indeed, it may do just about anything, but it's supposed to be unsurprising for a native speaker of the language. Non-programmers would probably be surprised if word&word were considered one word.