Changing strings in files


On Tue, Nov 10, 2020 at 9:06 PM Manfred Lotz <ml_news at posteo.de> wrote:
> The reason I want to check if a file is a text file is that I don't
> want to try replacing patterns in binary files (executable binaries,
> archives, audio files, and so on).
>

I'd recommend two checks, then:

1) Can the file be decoded as UTF-8?
2) Does it contain any NULs?

The checks can be done in either order: you can check whether the raw
bytes contain b"\0", or whether the decoded text contains u"\0". The
two are equivalent, since UTF-8 encodes U+0000 as the single byte 0x00
and never uses that byte as part of any other character.

If both those checks pass, it's still possible that the file isn't one
you want to edit, but it is highly likely to be text.
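
Here's a minimal sketch of those two checks. It assumes the files are
small enough to read into memory in one go, and the function name
looks_like_text is made up for illustration:

```python
from pathlib import Path


def looks_like_text(path):
    """Return True if the file has no NUL bytes and decodes as UTF-8."""
    data = Path(path).read_bytes()

    # Check 2: a NUL byte almost always means binary data.
    if b"\0" in data:
        return False

    # Check 1: strict UTF-8 decoding raises on invalid byte sequences.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

You'd call looks_like_text(filename) before attempting any replacement
and skip the file if it returns False. Note that an empty file passes
both checks.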

ChrisA