I received many responses. Some pointed at tools (which is what I was
looking for, honestly), but most had a common theme to them. :)

Original Post:

> We need to sync 10TB of data in small files from one North American
> coast to the other.
>
> Our tentative plans are to sneakernet the data and then use some form of
> sync to catch up the delta.
>
> Aside from bandwidth constraints, we found that rsync quickly craps out
> with large numbers of files.
>
> What tools have you used to do this?

Most popular question:

> Does all 10TB of it change daily?

No. The data comes in two flavors:

* Oracle DBF files (yes, changes daily), less than a TB here
* Small static files, the files themselves don't change, their count
  simply increases
  - side note: These files are about 16 subdirs deep and heavily
    scattered (er.. I mean.. "distributed")

Other common questions/comments:

> You didn't specify how rsync craps out, but i'm guessing

I forget the specifics, but it was basically "out of memory" due to the
number of files and subdirs it has to dig in.

> what version of rsync you're using

2.6.8 (looking at 2.6.9 now to see if it addresses any of the problems
we've had)

> but you can often throw ram at the issue.

Not in this case, unfortunately.

> In addition, you can fire off rsync on a subtree so it has less work to
> do.

That's certainly a consideration. It won't be easy (cf. "about 16
subdirs deep" above).

Then there were these:

> (Deborah Santomauro) Have you tried "rdist"?

and

> (Anthony D'Atri) rdist 6 from www.magnicomp.com with SSH as the
> transport works great for managing files.

Holy cow, now that's oldschool love! I'll look into that.

Brad Morrison mentioned:

> I think cpio has a flag to skip files with equal or newer mod dates,

Yeah, we've also considered something like:

"rsync -av `find . -newer <somefile> -print` dest:/path"

just to limit the volume of files that rsync has to consider.

There was mention of NetApp, zfs, VxFS/VxVM, which aren't options in
this situation.
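For what it's worth, here is a rough sketch of the find/-newer idea above. It feeds the file list to rsync on stdin via --files-from rather than through backticks, since a long list would likely overflow the command line. It assumes the rsync in use supports --files-from (which, as far as I know, 2.6.x does). The function names, stamp file and paths are made up for illustration, and file names containing newlines would break it.

```shell
#!/bin/sh
# Sketch only: function names, stamp file and paths are hypothetical.

# Print regular files under directory $1 that were modified more recently
# than the stamp file $2, one path per line, relative to $1.
# $2 must be an absolute path, because we cd into $1 first.
list_newer() {
    ( cd "$1" && find . -type f -newer "$2" -print )
}

# Hand that list to rsync on stdin so it only considers those files
# instead of walking the whole tree.
# Assumes the installed rsync supports --files-from.
sync_newer() {
    src=$1
    stamp=$2
    dest=$3
    list_newer "$src" "$stamp" | rsync -a --files-from=- "$src" "$dest"
}

# Example (not run here):
#   sync_newer /data/files /var/tmp/last-sync.stamp dest:/path
```

The stamp file would need to be touched at the start of each run (not the end), so that files arriving during a sync are picked up by the next one.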
As much as I tried to get to zfs, it wasn't available at the time we
upgraded the DB/file servers to Sol10.

Hutin Bertrand mentioned an app called "aide", which is an Intrusion
Detection tool (think tripwire) that you can use to spot files/subdirs
that have changed. interesting find.
(http://sourceforge.net/projects/aide)

Gedaliah Wolosh pointed me at http://www.openafs.org

Karl Rossing mentioned http://opensolaris.org/os/project/avs/

AFS is quite an endeavor, we're not quite prepared to go that route.
AVS looks interesting, we might be able to do something with that, if
heavy rsync-frobulation doesn't work out.

Thanks all!

Rob++
-- 
Internet: windsor@warthog.com                             __o
Life: Rob@Carrollton.Texas.USA.Earth                    _`\<,_
                                                       (_)/ (_)
"They couldn't hit an elephant at this distance."
                        -- Major General John Sedgwick
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Thu Feb 22 12:48:44 2007