
Suggestions on mechanism or existing code - maintain persistence of file download history


On 2020-01-29 20:00, jkn wrote:
> Hi all
>      I'm almost embarrassed to ask this as it's "so simple", but thought I'd give
> it a go...
> 
> I want to be able to use a simple 'download manager' which I was going to write
> (in Python), but then wondered if there was something suitable already out there.
> I haven't found it, but thought people here might have some ideas for existing work, or approaches.
> 
> The situation is this - I have a long list of file URLs and want to download these
> as a 'background task'. I want this process to be 'crudely persistent' - you
> can CTRL-C out, and next time you run things it will pick up where it left off.
> 
> The download part is not difficult. It is the persistence bit I am thinking about.
> It is not easy to tell the name of the downloaded file from the URL.
> 
> I could have a file with all the URLs listed and work through each line in turn.
> But then I would have to rewrite the file (say, with the previously-successful
> lines commented out) as I go.
> 
Why comment out the lines yourself when the download manager could do it 
for you?

Load the list from disk.

For each uncommented line:

     Download the file.

     Comment out the line.

     Write the list back to disk.
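
As a rough illustration, the steps above might look something like this in Python. This is only a sketch under a few assumptions: the list file holds one URL per line, '#' marks a finished line, and the names save_list, fetch and process are made up for the example. The fetch helper simply guesses a filename from the last segment of the URL path, which will not suit every URL; pass in your own download function if you have a better one. The list is rewritten through a temporary file and os.replace so that a CTRL-C in the middle of a write should not leave a half-written list behind.

```python
import os
import tempfile
import urllib.parse
import urllib.request


def save_list(path, lines):
    """Write the list to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.writelines(lines)
    os.replace(tmp, path)


def fetch(url, dest_dir="."):
    """Hypothetical downloader: names the file after the last URL path segment."""
    name = os.path.basename(urllib.parse.urlsplit(url).path) or "index.html"
    urllib.request.urlretrieve(url, os.path.join(dest_dir, name))


def process(list_path, download=fetch):
    """Download every uncommented URL, commenting each one out once it is done."""
    with open(list_path) as f:
        lines = f.readlines()
    for i, line in enumerate(lines):
        url = line.strip()
        if not url or url.startswith("#"):
            continue  # blank line, or already downloaded on an earlier run
        download(url)  # an exception or CTRL-C here leaves the line untouched
        lines[i] = "# " + line
        save_list(list_path, lines)
```

Calling process("urls.txt") repeatedly would then pick up at the first line that is still uncommented. A download that was interrupted part-way is simply started again on the next run.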

> I also thought of having the actual URLs as filenames (of zero length) in a
> 'source' directory. The process would then look at each filename in turn, and
> download the appropriate URL. Then the 'filename file' would either be moved to
> a 'done' directory, or perhaps renamed to something that the process wouldn't
> subsequently pick up.
> 
> But I would have thought that some utility to do this kind of thing exists already. Any pointers? Or any comments on the above suggested methods?
>
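
For comparison, the 'source'/'done' directory idea quoted above could be sketched along these lines. One assumption is added here: URLs contain characters such as '/' that cannot appear in filenames, so the marker name is the percent-encoded URL (very long URLs may still exceed the filesystem's filename limit). The function names enqueue and process_markers are invented for the example, and download is whatever function you already have for fetching one URL.

```python
import os
import urllib.parse


def enqueue(url, source_dir):
    """Create a zero-length marker file named after the percent-encoded URL."""
    os.makedirs(source_dir, exist_ok=True)
    name = urllib.parse.quote(url, safe="")
    open(os.path.join(source_dir, name), "w").close()


def process_markers(source_dir, done_dir, download):
    """Download each pending URL, then move its marker into done_dir."""
    os.makedirs(done_dir, exist_ok=True)
    for name in sorted(os.listdir(source_dir)):
        url = urllib.parse.unquote(name)  # recover the original URL
        download(url)  # an exception or CTRL-C here leaves the marker in place
        os.replace(os.path.join(source_dir, name),
                   os.path.join(done_dir, name))
```

Here the move of the marker file is the only piece of state that changes, so there is no list file to rewrite after each download.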