Suggestions on mechanism or existing code - maintain persistence of file download history
On 2020-01-29 20:00, jkn wrote:
> Hi all
> I'm almost embarrassed to ask this as it's "so simple", but thought I'd give
> it a go...
> I want to be able to use a simple 'download manager' which I was going to write
> (in Python), but then wondered if there was something suitable already out there.
> I haven't found it, but thought people here might have some ideas for existing work, or approaches.
> The situation is this - I have a long list of file URLs and want to download these
> as a 'background task'. I want this process to be 'crudely persistent' - you
> can CTRL-C out, and next time you run things it will pick up where it left off.
> The download part is not difficult. It is the persistence bit I am thinking about.
> It is not easy to tell the name of the downloaded file from the URL.
> I could have a file with all the URLs listed and work through each line in turn.
> But then I would have to rewrite the file (say, with the previously-successful
> lines commented out) as I go.
Why comment out the lines yourself when the download manager could do it for you?
Load the list from disk.
For each uncommented line:
    Download the file.
    Comment out the line.
    Write the list back to disk.
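
In Python, the steps above might look something like the sketch below. It is only an illustration: the function names, the "# " comment prefix and the ".tmp" suffix are my own choices, and download is whatever callable you already have for fetching one URL. The list is rewritten via a temporary file and os.replace, so a CTRL-C in the middle of a write should not leave a half-written list behind.

```python
import os

def pending(lines):
    """Yield (index, url) for each line that is neither blank nor commented out."""
    for i, line in enumerate(lines):
        url = line.strip()
        if url and not url.startswith("#"):
            yield i, url

def save(path, lines):
    """Write the list to a temporary file, then swap it into place."""
    tmp = path + ".tmp"  # assumed scratch name next to the list
    with open(tmp, "w") as f:
        f.writelines(lines)
    os.replace(tmp, path)

def run(path, download):
    """Download every uncommented URL, marking each one done as soon as it succeeds."""
    with open(path) as f:
        lines = f.readlines()
    for i, url in list(pending(lines)):
        download(url)                # an exception or CTRL-C here leaves the line untouched
        lines[i] = "# " + lines[i]   # comment out the finished URL
        save(path, lines)            # persist progress after every file
```

Calling run("urls.txt", my_download) repeatedly then resumes at the first line that is still uncommented.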
> I also thought of having the actual URLs as filenames (of zero length) in a
> 'source' directory. The process would then look at each filename in turn, and
> download the appropriate URL. Then the 'filename file' would either be moved to
> a 'done' directory, or perhaps renamed to something that the process wouldn't
> subsequently pick up.
> But I would have thought that some utility to do this kind of thing exists already. Any pointers? Or any comments on the above suggested methods?
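
If you do go the marker-file route, a rough sketch could look like this. The names add_url and process_markers are made up for illustration. One thing to watch: a URL contains "/" and so cannot be used as a filename directly, so this sketch percent-encodes it with urllib.parse.quote and decodes it again before downloading. Very long URLs may also exceed the filesystem's filename length limit, which this sketch does not handle.

```python
import os
from urllib.parse import quote, unquote

def add_url(source_dir, url):
    """Create a zero-length marker file whose name is the percent-encoded URL."""
    os.makedirs(source_dir, exist_ok=True)
    name = quote(url, safe="")  # encode "/" and ":" so the name is a valid filename
    open(os.path.join(source_dir, name), "w").close()

def process_markers(source_dir, done_dir, download):
    """Download each pending URL, then move its marker to the 'done' directory."""
    os.makedirs(done_dir, exist_ok=True)
    for name in sorted(os.listdir(source_dir)):
        download(unquote(name))  # an exception or CTRL-C here leaves the marker in place
        os.replace(os.path.join(source_dir, name),
                   os.path.join(done_dir, name))
```

The move only happens after download returns, so an interrupted run leaves the unfinished markers in the source directory for next time.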