[talk] Climate Mirror

John C. Vernaleo john at netpurgatory.com
Wed Dec 14 13:05:25 EST 2016

John C. Vernaleo, Ph.D.
john at netpurgatory.com

On Wed, 14 Dec 2016, Pete Wright wrote:

> On 12/14/16 5:52 AM, Brian Cully wrote:
>> On 14-Dec-2016, at 00:29, Isaac (.ike) Levy <ike at blackskyresearch.net> wrote:
>>>> Maybe torrents, IPFS, ...? Or a collaborative
>>>> distributed file system. Perhaps using QFS, MFS or LFS?
>>> While I just got pretty excited about NYC*BUG’s ability to take this on
>>> whole hog, I ABSOLUTELY would love to see this explored further.
>>> Could you propose something we could get involved in as a group from
>>> NYC*BUG, perhaps something people can run to donate a small chunk of their
>>> own smaller servers?
>> 	I like the idea of using torrents. There are lots of upsides: it’s
>> easy to get involved by sharing a smallish chunk of the set, the relatively
>> small tracker file can be separately copied around to ensure there’s no
>> single point of attack, perhaps even via git or something similar to ensure
>> it’s not tampered with (and made trivially available via github), and it’s
>> pretty fire-and-forget (just leave it running on a routable server).
>> 	The major downside I see is that unless the data has already been
>> made available via torrent, someone’s gotta seed the thing, which still means
>> you need at least one server with a lot of disk space to get the project
>> started. That’s something that we may want anyway, just to ensure the thing
>> can always be seeded (at least until the feds come knocking, but hopefully by
>> then there are many redundant copies of the data sitting around the world).
>> 	I know I’d certainly be willing to donate a few TB on my server to
>> hosting a portion of the data set, but there’s no way I could host the whole
>> thing, and I’d also be willing to throw some money into the hat to get the
>> seed up.
> I was thinking about using the torrent protocol last night and i think 
> there are two issues that would prevent this:
> - we'd have to generate checksums for every dataset that is stored, 
> then generate URIs for each of them.  I am pretty confident the data 
> here is not bt friendly...which leads to my second point
> - the academic/prof consumers of this data are probably not going to use 
> bt to download these files for research.  unfortunately ftp and http are 
> probably used very frequently in these arenas.
> having said this - i def feel that bt would be a *much* better method to 
> distribute and share the cost of hosting data...but i'm not sure if they 
> are ready for this or not :)

Just to add to that, it isn't even always a case of them being ready or 
not for bt, but their institution may block bittorrent.

Having been responsible for distributing satellite data in the past, http 
and ftp are definitely what the researchers want and need.
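
For what it's worth, the checksum step mentioned in the quoted text does not 
depend on the transport: the same manifest can sit next to the files on an 
http or ftp mirror. Below is a minimal sketch in Python, not something anyone 
in this thread has built. The function names (sha256_of_file, build_manifest, 
write_manifest) and the choice of SHA-256 are assumptions made purely for 
illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for very large data files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (relative POSIX path) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_manifest(manifest, out_path):
    """Write 'digest  path' lines, the layout that `sha256sum -c` reads."""
    with open(out_path, "w", encoding="utf-8") as out:
        for rel_path, digest in manifest.items():
            out.write(f"{digest}  {rel_path}\n")
```

Writing the manifest outside the data directory keeps it from being hashed 
into itself on the next run. Anyone holding a copy could then verify it with 
`sha256sum -c` from the top of their mirror, whichever protocol they used to 
fetch the files.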
