[nycbug-talk] Off topic: Best way to mirror large ftp
Marc Spitzer
mspitzer at gmail.com
Wed Nov 26 15:49:00 EST 2008
On Wed, Nov 26, 2008 at 11:12 AM, Matt Juszczak <matt at atopia.net> wrote:
> I have about 3 TB of data I need to mirror off of an FTP box. Using
> traditional methods, it would take me about 16+ days to get all of that
> information.
>
> I've looked at things like lftp, and a few other "scripts" out there, but
> ideally I would love to find something that can:
>
> 1) Index the entire FTP
mtree on server?
> 2) Split the downloads into multiple threads
how much bandwidth do you have to work with?
> 3) Update the index at any time (the FTP server changes) and download the
> differences (yes, this may be an expensive operation I know)
run mtree every so often on server?
>
> Any suggestions? Off topic I know, but I've been struggling for some time
> now on this issue and I'm hoping some of you fellow sysadmins have some
> suggestions.
run the following on the server:
1: run "find . -type d > dir_list"
2: run "find . -type f > file_list"
on the client:
3: download both files
4: cat dir_list | xargs -n 20 mkdir -p
5: run "split -l <pick a reasonable number> file_list"
6: run a bunch of shell scripts to do the fetch, one per output file from 5
....
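Spelled out end to end, those steps look roughly like this. The host, list contents, and chunk size are placeholders, and the actual fetch line is commented out since it needs a real FTP server (curl shown as one option):

```shell
# Sketch of steps 1-6. Lists are faked so the sketch runs standalone;
# the server host and fetch command are assumptions.
set -eu
# steps 1-2 (run on the server):
#   find . -type d > dir_list
#   find . -type f > file_list
# step 3: download both lists; faked here for illustration:
printf '%s\n' pub pub/iso pub/src > dir_list
printf '%s\n' pub/README pub/iso/disc1.iso pub/src/tree.tgz > file_list

# step 4: recreate the directory tree locally
xargs -n 20 mkdir -p < dir_list

# step 5: split the file list into chunks, one per parallel worker
split -l 2 file_list chunk.

# step 6: one fetcher per chunk, running in parallel
for chunk in chunk.*; do
    while read -r f; do
        : # curl -s -o "$f" "ftp://ftp.example.org/$f"
    done < "$chunk" &
done
wait
```

How many chunks to split into depends on the bandwidth question above — past a point, parallel fetchers just contend with each other.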
or just run rsync and let it do its job.
marc
--
Freedom is nothing but a chance to be better.
Albert Camus