Text encoding changed, content “gone”

I don’t know how it happened, but for some of my pages the text encoding of the content file changed from Unicode (UTF-8) to Western (Mac OS Roman), and all content of these pages (layouts and blocks) is gone from the Panel. It’s still in the file, but with umlauts and other special characters converted to weird character combinations (such as √º for ü).
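For what it’s worth, √º is exactly what the UTF-8 bytes of ü look like when an editor displays them as Mac OS Roman, so the bytes on disk may not have been re-encoded at all. A minimal Python sketch of that effect (the function names are made up for illustration):

```python
def misread_as_mac_roman(text: str) -> str:
    """Show how UTF-8 text looks when its bytes are displayed as Mac OS Roman."""
    return text.encode("utf-8").decode("mac_roman")


def repair_from_mac_roman(garbled: str) -> str:
    """Undo the misreading, assuming the garbled text came from UTF-8 bytes."""
    return garbled.encode("mac_roman").decode("utf-8")


print(misread_as_mac_roman("ü"))    # the two-character sequence from the post
print(repair_from_mac_roman("√º"))  # back to the original umlaut
```

If the round trip works on the affected text, the editor is only guessing the wrong encoding, and the real problem is more likely somewhere else in the file.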

Nothing changes in the Panel if I manually change the text encoding back to UTF-8. If I add new content through the Panel, the content in the file is overwritten (of course) and is shown as usual, but if I revert that change in the file, nothing is shown in the Panel again. Has anyone experienced such a thing? I think it must be some kind of illegal character, but I can’t find anything (it’s also a lot of content).

To expand on that: I opened a terminal and ran file -I [filename]. For the pages that are working it came back as “text/html; charset=utf-8”, and for those that aren’t it showed “application/octet-stream; charset=binary”. Very strange, as they are all text files with the extension “txt”. :thinking:
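As a rough cross-check of the file output, assuming Python 3 is available, a sketch like the following reports the byte offset at which a file stops being valid UTF-8. The function names and the idea of scanning a content folder are assumptions for illustration, not anything Kirby provides:

```python
from pathlib import Path
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def scan_content_files(content_dir: str) -> dict:
    """Map each .txt file below content_dir that is not valid UTF-8 to its first bad offset."""
    problems = {}
    for path in sorted(Path(content_dir).rglob("*.txt")):
        offset = first_invalid_utf8_offset(path.read_bytes())
        if offset is not None:
            problems[str(path)] = offset
    return problems
```

Calling scan_content_files with the path of the content folder would list only the files worth opening in a hex viewer, together with the position where the trouble starts.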

Did you edit these files outside of the Panel, in a text editor?

Not knowingly. I transferred them via FTP; perhaps the client misinterpreted something in these files and treated them as binary instead of text, I don’t know. However, after examining one of the problematic files more closely, I found that some text had been cut off and “replaced” by some hidden control characters.
I’m going to paste that excerpt literally:

…
</p></li></ul>"},"id"PúäPú䇫à ]â∏úäpúä@púä System der
…

It starts out as the regular JSON formatting of the content, then apparently gets screwed up at the ID of a block, and then randomly continues with the text of another block.
I’ll have to dig deeper to find out what content was cut.
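To narrow down where the damage starts, something like this Python sketch could list the hidden control bytes and their positions. It assumes that tabs and line breaks are the only control characters a healthy content file contains; the function name is made up for illustration:

```python
def find_control_bytes(data: bytes, allowed: bytes = b"\t\n\r"):
    """Return (offset, byte value) for every control byte that is not ordinary whitespace."""
    return [
        (offset, value)
        for offset, value in enumerate(data)
        # C0 control range (0x00-0x1F) plus DEL (0x7F), minus tab/LF/CR
        if (value < 0x20 or value == 0x7F) and value not in allowed
    ]


# Hypothetical sample: a JSON fragment with two stray control bytes after "id"
sample = b'{"id"\x00\x10P System der'
for offset, value in find_control_bytes(sample):
    print(f"offset {offset}: 0x{value:02x}")
```

Run against the raw bytes of a broken file, the first reported offset should point close to the spot where the JSON of the block was cut off.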

Weird. But I have no idea how something like that can happen.