[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Changing strings in files

On Tue, 10 Nov 2020 18:37:54 +1100
Cameron Simpson <cs at> wrote:

> On 10Nov2020 07:24, Manfred Lotz <ml_news at> wrote:
> >I have a situation where in a directory tree I want to change a
> >certain string in all files where that string occurs.
> >
> >My idea was to do
> >
> >- os.scandir and for each file  
> Use os.walk for trees. scandir does a single directory.

Perhaps better. I like to use os.scandir this way

def scantree(path: str) -> Iterator[os.DirEntry[str]]:
    """Recursively yield DirEntry objects (no directories)
          for a given directory.
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            yield from scantree(entry.path)

        yield entry

Worked fine so far. I think I coded it this way because I wanted the
full path of the file the easy way.

> >   - check if a file is a text file  
> This requires reading the entire file. You want to check that it 
> consists entirely of lines of text. In your expected text encoding - 
> these days UTF-8 is the common default, but getting this correct is 
> essential if you want to recognise text. So as a first cut, totally 
> untested:
> ...

The reason I want to check if a file is a text file is that I don't
want to try replacing patterns in binary files (executable binaries,
archives, audio files aso).

Of course, to make this nicely work some heuristic check would be the
right thing (this is what file command does). I am aware that an
heuristic check is not 100% but I think it is good enough.