
Suggestions on mechanism or existing code - maintain persistence of file download history

Hi all
    I'm almost embarrassed to ask this as it's "so simple", but thought I'd give
it a go...

I want to be able to use a simple 'download manager'. I was going to write one
(in Python), but then wondered if there was something suitable already out there.
I haven't found anything, but thought people here might have some ideas for
existing work, or approaches.

The situation is this - I have a long list of file URLs and want to download these
as a 'background task'. I want this process to be 'crudely persistent' - you
can CTRL-C out, and next time you run things it will pick up where it left off.

The download part is not difficult. It is the persistence bit I am thinking about.
It is not easy to tell the name of the downloaded file from the URL, so I can't
simply check which files are already on disk.

I could have a file with all the URLs listed and work through each line in turn.
But then I would have to rewrite the file (say, with the previously-successful
lines commented out) as I go.
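
For what it's worth, a rough sketch of this first approach might look like the
following. The function names and the idea of passing in a `fetch` callable (a
stand-in for whatever actually does the download) are made up for illustration.
It assumes one URL per line, with '#' at the start of a line meaning "already
done", and it rewrites the list via a temporary file so that a CTRL-C in the
middle of the rewrite should not leave a half-written list behind.

```python
import os
import tempfile
from pathlib import Path


def pending_urls(list_path):
    """Return the URLs in the list file that are not yet commented out."""
    lines = Path(list_path).read_text().splitlines()
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.lstrip().startswith("#")]


def mark_done(list_path, url):
    """Rewrite the list file with the given URL commented out."""
    list_path = Path(list_path)
    lines = list_path.read_text().splitlines()
    new_lines = ["# " + ln if ln.strip() == url else ln for ln in lines]
    # Write a temporary file next to the list, then swap it into place,
    # so the list is always either the old or the new version.
    fd, tmp = tempfile.mkstemp(dir=str(list_path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(new_lines) + "\n")
    os.replace(tmp, list_path)


def run_from_list(list_path, fetch):
    """Download every pending URL; fetch(url) is a hypothetical downloader."""
    for url in pending_urls(list_path):
        # If fetch raises (failure or CTRL-C), the line stays uncommented
        # and will be retried on the next run.
        fetch(url)
        mark_done(list_path, url)
```

Running `run_from_list("urls.txt", fetch)` again after an interruption would
skip the commented-out lines and carry on with the rest.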

I also thought of having the actual URLs as filenames (of zero length) in a
'source' directory. The process would then look at each filename in turn, and
download the appropriate URL. Then the 'filename file' would either be moved to
a 'done' directory, or perhaps renamed to something that the process wouldn't
subsequently pick up.
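
A rough sketch of this second approach is below. One wrinkle: a URL contains
'/' characters, which cannot appear in a filename, so the sketch percent-encodes
the URL to get the marker name and decodes it again when processing. That
encoding step, the function names and the `fetch` callable (again a stand-in for
the real download) are my own assumptions. Very long URLs could also exceed the
filesystem's filename length limit, which this does not handle.

```python
import os
from pathlib import Path
from urllib.parse import quote, unquote


def queue_url(source_dir, url):
    """Create a zero-length marker file whose name encodes the URL."""
    source = Path(source_dir)
    source.mkdir(parents=True, exist_ok=True)
    # safe="" makes quote() encode '/' as well, so the name is one path part.
    marker = source / quote(url, safe="")
    marker.touch()
    return marker


def run_from_markers(source_dir, done_dir, fetch):
    """Download each queued URL, moving its marker to done_dir on success."""
    done = Path(done_dir)
    done.mkdir(parents=True, exist_ok=True)
    for marker in sorted(Path(source_dir).iterdir()):
        url = unquote(marker.name)
        # If fetch raises (failure or CTRL-C), the marker stays in
        # source_dir and the URL is retried on the next run.
        fetch(url)
        # Moving the marker is the only record-keeping step needed.
        os.replace(marker, done / marker.name)
```

With both directories on the same filesystem the move is a single rename, so
there is no list file to rewrite; whatever is left in 'source' is what still
needs downloading.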

But I would have thought that some utility to do this kind of thing exists
already. Any pointers? Or any comments on the above suggested methods?