This looks like a character set problem.
Have a look at the source page and see what character set it is encoded in. The character set may be declared in the Content-Type HTTP header, or in a <meta> tag near the start of the document. Then, when you handle the data, make sure that every step treats it as being in that same character set.
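As a rough sketch of that check, the snippet below looks for a charset declaration first in a Content-Type header value and then in a <meta> tag. The function name detect_charset and its parameters are hypothetical, and the regular expressions are a simplification rather than a full HTML parser.

```php
<?php
// Hypothetical helper: find the declared charset of a fetched page.
// $contentTypeHeader is the raw Content-Type header value (or null if absent),
// $html is the page body as received.
function detect_charset(?string $contentTypeHeader, string $html): ?string
{
    // 1. Prefer the HTTP header, e.g. "text/html; charset=ISO-8859-1".
    if ($contentTypeHeader !== null
        && preg_match('/charset\s*=\s*"?([\w.:-]+)/i', $contentTypeHeader, $m)) {
        return strtoupper($m[1]);
    }

    // 2. Fall back to a <meta> tag, either <meta charset="...">
    //    or the older http-equiv="Content-Type" form.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }

    // Nothing declared: the caller has to guess or assume a default.
    return null;
}
```

If both are present, the header is checked first here, which matches how browsers generally give the HTTP header precedence.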
You probably want to store the data in UTF-8. If you capture data in another character set, it is generally a good idea to convert it from that charset to UTF-8; this lets you capture from a wide range of sources and store everything in the same database. See iconv in the PHP manual for more on charset conversion.
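A minimal sketch of such a conversion with iconv might look like this. The helper name to_utf8 is hypothetical, and the source charset is assumed to be whatever the source page declares; ISO-8859-1 is used here only as an example.

```php
<?php
// Hypothetical helper: convert captured text to UTF-8.
// $sourceCharset is an assumption; pass whatever the source page declares.
function to_utf8(string $raw, string $sourceCharset): string
{
    // Nothing to convert if the source already declares UTF-8.
    if (strcasecmp($sourceCharset, 'UTF-8') === 0) {
        return $raw;
    }

    // iconv() returns false if the input is not valid in $sourceCharset.
    $converted = iconv($sourceCharset, 'UTF-8', $raw);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $sourceCharset to UTF-8");
    }
    return $converted;
}

// "café" encoded as ISO-8859-1: the é is the single byte 0xE9.
$latin1 = "caf\xE9";
$utf8 = to_utf8($latin1, 'ISO-8859-1');
// In UTF-8 the é becomes the two bytes 0xC3 0xA9.
```

Converting once, at the point of capture, means the rest of the code and the database only ever see UTF-8.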
Are you printing the output to a console or a browser? If the former, note that some consoles (old versions of Windows in particular) do not handle UTF-8 well at all. If you are echoing to a browser, make sure your own page declares UTF-8 as its character set.
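For the browser case, one way to declare UTF-8 is to send it in both the Content-Type header and a <meta> tag. This is only a sketch: render_page is a hypothetical helper, and the text passed to it is assumed to be UTF-8 already.

```php
<?php
// Hypothetical helper: wrap already-UTF-8 text in a minimal HTML page
// that declares its own character set.
function render_page(string $utf8Text): string
{
    // Tell htmlspecialchars() the input is UTF-8 so multibyte
    // characters pass through unchanged.
    $safe = htmlspecialchars($utf8Text, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"UTF-8\">\n</head>\n"
         . "<body>\n<p>$safe</p>\n</body>\n</html>\n";
}

// The header has to be sent before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$page = render_page("caf\xC3\xA9");
echo $page;
```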