[nycbug-talk] Off topic: Best way to mirror large ftp

Jesse Callaway bonsaime at gmail.com
Wed Nov 26 23:14:04 EST 2008


On Wed, Nov 26, 2008 at 11:12 AM, Matt Juszczak <matt at atopia.net> wrote:

> I have about 3 TB of data I need to mirror off of an FTP box.  Using
> traditional methods, it would take me about 16+ days to get all of that
> information.
>
> I've looked at things like lftp, and a few other "scripts" out there, but
> ideally I would love to find something that can:
>
> 1) Index the entire FTP
> 2) Split the downloads into multiple threads
> 3) Update the index at any time (the FTP server changes) and download the
> differences (yes, this may be an expensive operation I know)
>
> Any suggestions?  Off topic I know, but I've been struggling for some time
> now on this issue and I'm hoping some of you fellow sysadmins have some
> suggestions.
>
> Thanks!
>
> -Matt

Really, lftp and ncftp are the only decent clients for this sort of thing.
If you use lftp, make sure to tune the number of parallel transfers to
something realistic, but greater than, say, 3.
lftp is very fancy and about as feature-rich as wget, so you have to
experiment with it to make it behave the way you want.
ncftp works as well, but it has far fewer knobs for checking the "sameness"
of the remote vs. local version of a file. This makes it
slower, since you have to clobber if there's any doubt.
Both have some seriously cool batching options. If you use an ftp client
more than once a day, there are really no other options.
I use ncftp for everyday work, and lftp if I have to do heavy lifting.
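
To make the "sameness" point concrete, here is a rough Python sketch of the
kind of check a mirroring client can do with an index of the remote side. It
is not how lftp or ncftp are implemented; the Entry type, the
files_to_fetch name, and the sample paths are all made up for illustration,
and it assumes you already have size and mtime for every file on both sides.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    size: int   # file size in bytes
    mtime: int  # modification time, seconds since the epoch


def files_to_fetch(remote: Dict[str, Entry], local: Dict[str, Entry]) -> List[str]:
    """Return remote paths that are missing locally or look different."""
    stale = []
    for path, r in sorted(remote.items()):
        l = local.get(path)
        # Missing, size mismatch, or remote copy newer: fetch it again.
        if l is None or l.size != r.size or l.mtime < r.mtime:
            stale.append(path)
    return stale


# Hypothetical index data, standing in for real directory listings.
remote_index = {
    "pub/a.iso": Entry(700, 1227700000),
    "pub/b.tgz": Entry(120, 1227700500),
    "pub/c.txt": Entry(10, 1227700900),
}
local_index = {
    "pub/a.iso": Entry(700, 1227700000),  # identical, skipped
    "pub/b.tgz": Entry(100, 1227700500),  # size differs, fetched again
}

print(files_to_fetch(remote_index, local_index))  # ['pub/b.tgz', 'pub/c.txt']
```

The fewer attributes a client can compare, the more often it has to fall back
to re-downloading, which is the clobbering cost mentioned above.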

So, I say stick with lftp. It's a pain at first because it can sometimes
hang or skip files, but keep working with it.
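
If you end up wrapping a client in your own script, one way to cope with
skipped files is to run a small, bounded pool of workers and make extra
passes over whatever failed. This is only a sketch of that idea in Python:
fetch_all and flaky_fetch are hypothetical names, and flaky_fetch is a
stand-in that pretends one file fails on its first attempt instead of doing
any real FTP transfer.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def fetch_all(paths: Iterable[str], fetch: Callable[[str], bool],
              workers: int = 4, passes: int = 3) -> List[str]:
    """Fetch paths with a bounded thread pool; return paths that still failed."""
    pending = list(paths)
    for _ in range(passes):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(fetch, pending))
        # Keep only the paths whose fetch reported failure for the next pass.
        pending = [p for p, ok in zip(pending, results) if not ok]
    return pending


attempts: Dict[str, int] = {}
lock = threading.Lock()


def flaky_fetch(path: str) -> bool:
    """Stand-in for a real download: 'pub/b.tgz' fails on its first try."""
    with lock:
        attempts[path] = attempts.get(path, 0) + 1
        n = attempts[path]
    return not (path == "pub/b.tgz" and n == 1)


leftover = fetch_all(["pub/a.iso", "pub/b.tgz", "pub/c.txt"], flaky_fetch)
print(leftover)  # [] once the retry pass has picked up the skipped file
```

Keeping the worker count small is deliberate here, since many FTP servers
limit connections per client.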

-jesse