From george at ceetonetechnology.com Fri Dec 2 15:55:22 2016 From: george at ceetonetechnology.com (George Rosamond) Date: Fri, 2 Dec 2016 15:55:22 -0500 Subject: [talk] NYC*BUG Holiday Hang-out Message-ID: <1353718f-71c8-f003-a5f9-4932fc8f6df6@ceetonetechnology.com> Wednesday, December 7 645 PM Suspenders at 108 Greenwich Street. We'll meet at one end of the restaurant/bar, and keep it informal this year. We are sorting out meetings for the new year with space reserved for: January 4 February 1 (we need space for that date) March 1 April 5 Those who previously expressed interest in doing a meeting, please ping admin@ to sort out the details. You know who you are :) From jesse at emptysquare.net Fri Dec 2 16:43:00 2016 From: jesse at emptysquare.net (A. Jesse Jiryu Davis) Date: Fri, 2 Dec 2016 16:43:00 -0500 Subject: [talk] I improved CPython's DNS system with your help Message-ID: Hi list, I asked you some questions about getaddrinfo on BSD last year. With your help, I submitted some patches to CPython that allowed it to call getaddrinfo concurrently on BSD, and now I've written an article about it: https://engineering.mongodb.com/post/the-saga-of-concurrent-dns-in-python-and-the-defeat-of-the-wicked-mutex-troll/ If silly fantasy stories are your thing, enjoy! If not, you can just read the report on bugs.python.org: http://bugs.python.org/issue25924 Thanks especially to Christos for his help testing my theory about Mac OS X. Peace, Jesse -------------- next part -------------- An HTML attachment was scrubbed... URL: From kmsujit at gmail.com Sat Dec 3 06:28:34 2016 From: kmsujit at gmail.com (Sujit K M) Date: Sat, 3 Dec 2016 16:58:34 +0530 Subject: [talk] I improved CPython's DNS system with your help In-Reply-To: References: Message-ID: > http://bugs.python.org/issue25924 /* On systems on which getaddrinfo() is believed to not be thread-safe, (this includes the getaddrinfo emulation) protect access with a lock. 
*/ #if defined(WITH_THREAD) && (defined(__APPLE__) || \ (defined(__FreeBSD__) && __FreeBSD_version+0 < 503000) || \ defined(__OpenBSD__) || defined(__NetBSD__) || \ defined(__VMS) || !defined(HAVE_GETADDRINFO)) #define USE_GETADDRINFO_LOCK #endif Don't you think it would be better to give it an additional defined(__DONT_KNOW_OS)? I might be wrong, if USE_GETADDRINFO_LOCK is meant to be specific to the OS. From njt at ayvali.org Mon Dec 5 14:13:16 2016 From: njt at ayvali.org (N.J. Thomas) Date: Mon, 5 Dec 2016 11:13:16 -0800 Subject: [talk] I improved CPython's DNS system with your help In-Reply-To: References: Message-ID: <20161205191316.GH25387@ayvali.org> * A. Jesse Jiryu Davis [2016-12-02 16:43:00-0500]: > I submitted some patches to CPython that allowed it to call > getaddrinfo concurrently on BSD, and now I've written an article about > it: Very nice writeup. A shame that OS X (macOS?) doesn't have a bug tracker, or public API docs, or something similar so you don't have to pay $$$ and chase random people down to answer your fairly straightforward question. Thomas From edlinuxguru at gmail.com Mon Dec 5 14:21:14 2016 From: edlinuxguru at gmail.com (Edward Capriolo) Date: Mon, 5 Dec 2016 14:21:14 -0500 Subject: [talk] I improved CPython's DNS system with your help In-Reply-To: <20161205191316.GH25387@ayvali.org> References: <20161205191316.GH25387@ayvali.org> Message-ID: On Mon, Dec 5, 2016 at 2:13 PM, N.J. Thomas wrote: > * A. Jesse Jiryu Davis [2016-12-02 16:43:00-0500]: > > I submitted some patches to CPython that allowed it to call > > getaddrinfo concurrently on BSD, and now I've written an article about > > it: > > Very nice writeup. A shame that OS X (macOS?) doesn't have a bug > tracker, or public API docs, or something similar so you don't have to > pay $$$ and chase random people down to answer your fairly > straightforward question. 
> > Thomas > > _______________________________________________ > talk mailing list > talk at lists.nycbug.org > http://lists.nycbug.org/mailman/listinfo/talk > Apple is an amazingly closed door place. https://en.wikipedia.org/wiki/FoundationDB "A notice on the FoundationDB web site indicated that the company has "evolved" its mission and would no longer offer downloads of the software" The rumor mill says Apple employees have to jump through huge hoops to open source anything. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kmsujit at gmail.com Tue Dec 6 05:11:53 2016 From: kmsujit at gmail.com (Sujit K M) Date: Tue, 6 Dec 2016 15:41:53 +0530 Subject: [talk] I improved CPython's DNS system with your help In-Reply-To: <20161205191316.GH25387@ayvali.org> References: <20161205191316.GH25387@ayvali.org> Message-ID: On Tue, Dec 6, 2016 at 12:43 AM, N.J. Thomas wrote: > * A. Jesse Jiryu Davis [2016-12-02 16:43:00-0500]: >> I submitted some patches to CPython that allowed it to call >> getaddrinfo concurrently on BSD, and now I've written an article about >> it: > > Very nice writeup. A shame that OS X (macOS?) doesn't have a bug > tracker, or public API docs, or something similar so you don't have to > pay $$$ and chase random people down to answer your fairly > straightforward question. > Not just that, Even support is not good enough. The apps don't work even to what ever is guarantee(iTunes redeem feature is one that readily comes to mind). Even there seems to be no haste in fixing these. From george at ceetonetechnology.com Wed Dec 7 12:44:24 2016 From: george at ceetonetechnology.com (George Rosamond) Date: Wed, 7 Dec 2016 12:44:24 -0500 Subject: [talk] NYC*BUG Holiday Hangout Tonight Message-ID: <2cbb2bd3-c7bb-6ff7-3222-e0524663a22a@ceetonetechnology.com> Wednesday, December 7 645 PM Suspenders at 108 Greenwich Street. We'll meet at one end of the restaurant/bar, and keep it informal this year. 
We are sorting out meetings for the new year with space reserved for: January 4 February 1 (we need space for that date) March 1 April 5 Those who previously expressed interest in doing a meeting, please ping admin@ to sort out the details. From george at ceetonetechnology.com Wed Dec 7 13:29:21 2016 From: george at ceetonetechnology.com (George Rosamond) Date: Wed, 7 Dec 2016 13:29:21 -0500 Subject: [talk] network discovery application Message-ID: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> Outside of the obvious choice of nmap, I vaguely remember some open source network discovery tools that scanned a network and produced easy to read graphical outputs, maybe even with doing OUI look ups to identify devices. The intended audience is less-technical users, in order for them to get regular snapshots of a network. Anyone have any ideas? Preferably something in ports... g From john.joe.villa at gmail.com Wed Dec 7 13:39:43 2016 From: john.joe.villa at gmail.com (John Villa) Date: Wed, 7 Dec 2016 13:39:43 -0500 Subject: [talk] network discovery application In-Reply-To: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> References: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> Message-ID: Have you looked into this? https://nmap.org/zenmap/ On Wed, Dec 7, 2016 at 1:29 PM, George Rosamond < george at ceetonetechnology.com> wrote: > Outside of the obvious choice of nmap, I vaguely remember some open > source network discovery tools that scanned a network and produced easy > to read graphical outputs, maybe even with doing OUI look ups to > identify devices. > > The intended audience is less-technical users, in order for them to get > regular snapshots of a network. > > Anyone have any ideas? Preferably something in ports... 
> > g > > _______________________________________________ > talk mailing list > talk at lists.nycbug.org > http://lists.nycbug.org/mailman/listinfo/talk > From justin at shiningsilence.com Wed Dec 7 23:43:15 2016 From: justin at shiningsilence.com (Justin Sherrill) Date: Wed, 7 Dec 2016 23:43:15 -0500 Subject: [talk] network discovery application In-Reply-To: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> References: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> Message-ID: Zenoss and LibreNMS are ostensibly monitoring packages, but they try to auto-map your network for you. There's also The Dude from MikroTik, which I think is at least source-available. I have not really used these products, just heard them recommended, so they may or may not have the usability level you want. On Wed, Dec 7, 2016 at 1:29 PM, George Rosamond wrote: > Outside of the obvious choice of nmap, I vaguely remember some open > source network discovery tools that scanned a network and produced easy > to read graphical outputs, maybe even with doing OUI look ups to > identify devices. > > The intended audience is less-technical users, in order for them to get > regular snapshots of a network. > > Anyone have any ideas? Preferably something in ports... 
> > g > > _______________________________________________ > talk mailing list > talk at lists.nycbug.org > http://lists.nycbug.org/mailman/listinfo/talk From kmsujit at gmail.com Thu Dec 8 06:28:54 2016 From: kmsujit at gmail.com (Sujit K M) Date: Thu, 8 Dec 2016 16:58:54 +0530 Subject: [talk] network discovery application In-Reply-To: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> References: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> Message-ID: On Wed, Dec 7, 2016 at 11:59 PM, George Rosamond wrote: > Outside of the obvious choice of nmap, I vaguely remember some open > source network discovery tools that scanned a network and produced easy > to read graphical outputs, maybe even with doing OUI look ups to > identify devices. > > The intended audience is less-technical users, in order for them to get > regular snapshots of a network. > > Anyone have any ideas? Preferably something in ports... > I sort of got confused here. If you need to generate the network topology you need to take a dump of the network and then analyse that. If you are asking whether there is any way to analyse the dump, http://security.stackexchange.com/questions/122304/how-to-map-a-network-passively-with-wireshark-dumps. From njt at ayvali.org Thu Dec 8 12:39:03 2016 From: njt at ayvali.org (N.J. Thomas) Date: Thu, 8 Dec 2016 09:39:03 -0800 Subject: [talk] network discovery application In-Reply-To: References: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> Message-ID: <20161208173903.GQ25387@ayvali.org> * John Villa [2016-12-07 13:39:43-0500]: > On Wed, Dec 7, 2016 at 1:29 PM, George Rosamond wrote: > > I vaguely remember some open source network discovery tools that > > scanned a network and produced easy to read graphical outputs, > > > > Anyone have any ideas? Preferably something in ports... > > Have you looked into this? > https://nmap.org/zenmap/ I think Zenmap or lanmap should do what you want. Both are in FreeBSD ports. 
(You may have to use a bit of elbow grease to get it setup for non-tech users.) Also check if OpenNMS is right for you -- but it's a java program, so I don't think there's a port for that. Thomas From pete at nomadlogic.org Thu Dec 8 13:01:51 2016 From: pete at nomadlogic.org (Pete Wright) Date: Thu, 8 Dec 2016 10:01:51 -0800 Subject: [talk] network discovery application In-Reply-To: <20161208173903.GQ25387@ayvali.org> References: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> <20161208173903.GQ25387@ayvali.org> Message-ID: On 12/8/16 9:39 AM, N.J. Thomas wrote: > * John Villa [2016-12-07 13:39:43-0500]: >> On Wed, Dec 7, 2016 at 1:29 PM, George Rosamond wrote: >>> I vaguely remember some open source network discovery tools that >>> scanned a network and produced easy to read graphical outputs, >>> >>> Anyone have any ideas? Preferably something in ports... >> >> Have you looked into this? >> https://nmap.org/zenmap/ > > I think Zenmap or lanmap should do what you want. Both are in FreeBSD > ports. (You may have to use a bit of elbow grease to get it setup for > non-tech users.) > > Also check if OpenNMS is right for you -- but it's a java program, so I > don't think there's a port for that. > I can unequivocally say that no-one should touch OpenNMS unless they are secretly building infrastructure to take down a company from the inside - it combines the magic of Java with the insanity of SNMP slathered with a nice layer of XML for configuration. 
those battle wounds are still a bit fresh :) -pete -- Pete Wright pete at nomadlogic.org nomadlogicLA From venture37 at geeklan.co.uk Thu Dec 8 18:53:40 2016 From: venture37 at geeklan.co.uk (Sevan Janiyan) Date: Thu, 8 Dec 2016 23:53:40 +0000 Subject: [talk] network discovery application In-Reply-To: References: <2737936f-ae98-e75f-7bc4-b551d9b653b7@ceetonetechnology.com> <20161208173903.GQ25387@ayvali.org> Message-ID: <57014ecd-ec04-e99f-1e04-30a4971f1c4a@geeklan.co.uk> On 08/12/2016 18:01, Pete Wright wrote: > I can unequivocally say that no-one should touch OpenNMS unless they are > secretly building infrastructure to take down a company from the inside > - it combines the magic of Java with the insanity of SNMP slathered with > a nice layer of XML for configuration. > > > those battle wounds are still a bit fresh :) <3 I'll say that the auto discovery thing in OpenNMS is pretty cool. Point it at network ranges & it goes and probes the ranges and starts adding detected devices to the system. I gave up with having anything to do with it back in 2014 but someone else picked up the ball and suffered for a couple of years going back and forth after. Turns out there was issues both in OpenJDK 8 & OpenNMS which never showed up on Linux because it was a bit more permissive about mistakes than FreeBSD. I guess running it on FreeBSD is a possible thing again now (I never ran v8. sunjdk5, openjdk 6 & 7 where the versions I tried it on). Not sure if it'll ever make it into ports (data, settings & binaries are intertwined in the same directory (just like on Solaris where each application installs under a separate prefix)). The other thing is that it uses maven for the build system & has a hefty dependency list. 
It seemed the open source offering was very much the testing ground for the commercial offering (through the amount of bug reports filed & having to run bleeding edge to obtain fixes which is a lost effort because you want to use a stable version to build on/package, not get side tracked into navigating through a minefield). For all the things Java is supposed to do in terms of portability, it was pretty fragile as soon as you moved away from a Red Hat flavoured(?) Linux distro. Certainly the Solaris packages were broken for some time through the mixed use of /bin/sh & /bin/bash in scripts. Was broken on OmniOS if I recall too for other reasons. There's a bunch of really clumsy & painful stuff which did eventually get resolved before all that so it wasn't just all about maiming. Sevan From bcallah at devio.us Tue Dec 13 13:23:19 2016 From: bcallah at devio.us (Brian Callahan) Date: Tue, 13 Dec 2016 13:23:19 -0500 Subject: [talk] AsiaBSDCon and BSDCan 2017 CfP dates Message-ID: <68856880-8bed-6a64-cfcb-917bb051863f@devio.us> Hi all -- Looks like AsiaBSDCon 2017 papers are due December 31 and BSDCan 2017 abstracts are due January 19. I'm probably gonna write something for both conferences. Anyone else planning on attending or presenting? ~Brian From pete at nomadlogic.org Tue Dec 13 19:07:09 2016 From: pete at nomadlogic.org (Pete Wright) Date: Tue, 13 Dec 2016 16:07:09 -0800 Subject: [talk] Climate Mirror Message-ID: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> Not to go super political on this list - but I'm personally pretty keen to preserve climate science related data-sets for future generations (as well as our own). So in that light - wondering if anyone has taken a look at this: http://climatemirror.org/ I'm investigating now to see if there are any datasets I can help mirror - if anyone else on this list is interested let me know, maybe we can make a large scale effort? 
-pete -- Pete Wright pete at nomadlogic.org nomadlogicLA From mirimir at riseup.net Tue Dec 13 21:36:51 2016 From: mirimir at riseup.net (Mirimir) Date: Tue, 13 Dec 2016 19:36:51 -0700 Subject: [talk] Climate Mirror In-Reply-To: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> Message-ID: <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> On 12/13/2016 05:07 PM, Pete Wright wrote: > Not to go super political on this list - but I'm personally pretty keen > to preserve climate science related data-sets for future generations (as > well as our own). So in that light - wondering if anyone has taken a > look at this: > > http://climatemirror.org/ > > I'm investigating now to see if there are any datasets I can help mirror > - if anyone else on this list is interested let me know, maybe we can > make a large scale effort? Some of these datasets are humongous. Many TB. I wonder about alternatives to straight-up mirrors. I mean, a 100TB server is a nontrivial investment. Maybe torrents, IPFS, ...? Or a collaborative distributed file system. Perhaps using QFS, MFS or LFS? > -pete > From ike at blackskyresearch.net Wed Dec 14 00:29:53 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 00:29:53 -0500 Subject: [talk] Climate Mirror In-Reply-To: <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> Message-ID: <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> Word, > On Dec 13, 2016, at 9:36 PM, Mirimir wrote: > > On 12/13/2016 05:07 PM, Pete Wright wrote: >> Not to go super political on this list - but I'm personally pretty keen >> to preserve climate science related data-sets for future generations (as >> well as our own). 
So in that light - wondering if anyone has taken a >> look at this: >> >> http://climatemirror.org/ >> >> I'm investigating now to see if there are any datasets I can help mirror >> - if anyone else on this list is interested let me know, maybe we can >> make a large scale effort? I'm 100% in. Any job for a big ZFS server and a lot of disk makes me quite happy :) [x] Colo: If there is general consensus that this is worthwhile, NYC*BUG does have colo space for a big isolated storage box. [x] Admins: There is certainly enough admin experience around NYC*BUG to keep some storage/mirrors online, and I'd be quite enthused to take lead to get this one bootstrapped. [?] Server(s): Acquiring the hardware, and a box of 8tb disks, well- that may take a second to get a HW donation which works? > > Some of these datasets are humongous. Many TB. In-deedy :) > I wonder about > alternatives to straight-up mirrors. I mean, a 100TB server is a > nontrivial investment. Nontrivial, but not impossible. Shooting from the hip, based on a brand-new supermicro storage without many frills: - 3u supermicro with 24x SATA drive bays, 32g ram, a few cheap bottom-of-curve CPU's $5000 - 20 pcs 8TB sata disks, $220 * 20 $4400 So lets say 9500 bucks. I'm already thinking of vendors to ping for straight donations. (By December end, I know there will be some 3-year-old inventory that would serve our purposes) > Maybe torrents, IPFS, ...? Or a collaborative > distributed file system. Perhaps using QFS, MFS or LFS? While I just got pretty excited about NYC*BUG's ability to take this on whole hog, I ABSOLUTELY would love to see this explored further. Could you propose something we could get involved in as a group from NYC*BUG, perhaps something people can run to donate a small chunk of their own smaller servers? 
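As a rough check on the hardware estimate earlier in this message, the arithmetic can be sketched in Python. The prices and disk counts are the ones quoted in the message; the function name `estimate` and the `parity_disks` parameter are hypothetical, added only to show how much raw capacity redundancy would consume (the message does not specify a pool layout).

```python
# Back-of-envelope check of the hardware estimate quoted in the message.
CHASSIS_USD = 5000   # 3U chassis with 24 SATA bays (figure from the message)
DISK_USD = 220       # price per 8 TB SATA disk (figure from the message)
DISK_COUNT = 20
DISK_TB = 8


def estimate(parity_disks=0):
    """Return cost and capacity; parity_disks is an assumed redundancy overhead."""
    disks_usd = DISK_USD * DISK_COUNT
    return {
        "disks_usd": disks_usd,
        "total_usd": CHASSIS_USD + disks_usd,
        "raw_tb": DISK_COUNT * DISK_TB,
        "usable_tb": (DISK_COUNT - parity_disks) * DISK_TB,
    }


if __name__ == "__main__":
    print(estimate())                # no redundancy
    print(estimate(parity_disks=4))  # four disks' worth of parity (assumption)
```

With these inputs the total comes to $9,400, which the message rounds up to 9500, and the raw capacity is 160 TB, so the 100 TB figure raised earlier in the thread still fits after setting several disks aside for parity.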
Best, .ike > >> -pete >> > > _______________________________________________ > talk mailing list > talk at lists.nycbug.org > http://lists.nycbug.org/mailman/listinfo/talk From bcully at gmail.com Wed Dec 14 08:52:32 2016 From: bcully at gmail.com (Brian Cully) Date: Wed, 14 Dec 2016 08:52:32 -0500 Subject: [talk] Climate Mirror In-Reply-To: <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> Message-ID: <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> On 14-Dec-2016, at 00:29, Isaac (.ike) Levy wrote: > >> Maybe torrents, IPFS, ...? Or a collaborative >> distributed file system. Perhaps using QFS, MFS or LFS? > > While I just got pretty excited about NYC*BUG's ability to take this on whole hog, I ABSOLUTELY would love to see this explored further. > > Could you propose something we could get involved in as a group from NYC*BUG, perhaps something people can run to donate a small chunk of their own smaller servers? I like the idea of using torrents. There are lots of upsides: it's easy to get involved by sharing a smallish chunk of the set, the relatively small tracker file can be separately copied around to ensure there's no single point of attack, perhaps even via git or something similar to ensure it's not tampered with (and made trivially available via github), and it's pretty fire-and-forget (just leave it running on a routable server). The major downside I see is that unless the data has already been made available via torrent, someone's gotta seed the thing, which still means you need at least one server with a lot of disk space to get the project started. That's something that we may want anyway, just to ensure the thing can always be seeded (at least until the feds come knocking, but hopefully by then there are many redundant copies of the data sitting around the world). 
I know I'd certainly be willing to donate a few TB on my server to hosting a portion of the data set, but there's no way I could host the whole thing, and I'd also be willing to throw some money into the hat to get the seed up. -bjc From viewtiful.icchan at gmail.com Wed Dec 14 08:56:51 2016 From: viewtiful.icchan at gmail.com (Robert Menes) Date: Wed, 14 Dec 2016 08:56:51 -0500 Subject: [talk] Climate Mirror In-Reply-To: References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> Message-ID: I'm down to help cobble together the needed hardware and help get it all online. Would be good experience. :) --Robert From pete at nomadlogic.org Wed Dec 14 12:55:52 2016 From: pete at nomadlogic.org (Pete Wright) Date: Wed, 14 Dec 2016 09:55:52 -0800 Subject: [talk] Climate Mirror In-Reply-To: <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> Message-ID: <65ca3f49-bce9-388e-a36c-b68ab5af1039@nomadlogic.org> On 12/13/16 9:29 PM, Isaac (.ike) Levy wrote: >> I wonder about >> alternatives to straight-up mirrors. I mean, a 100TB server is a >> nontrivial investment. > > Nontrivial, but not impossible. 
Shooting from the hip, based on a brand-new supermicro storage without many frills: > > - 3u supermicro with 24x SATA drive bays, 32g ram, a few cheap bottom-of-curve CPU's > $5000 > - 20 pcs 8TB sata disks, $220 * 20 > $4400 > > So lets say 9500 bucks. I'm already thinking of vendors to ping for straight donations. > (By December end, I know there will be some 3-year-old inventory that would serve our purposes) > might be worth pinging the folks at iXsystems to see if they can cut a deal for a system like this. -- Pete Wright pete at nomadlogic.org nomadlogicLA From pete at nomadlogic.org Wed Dec 14 12:59:34 2016 From: pete at nomadlogic.org (Pete Wright) Date: Wed, 14 Dec 2016 09:59:34 -0800 Subject: [talk] Climate Mirror In-Reply-To: <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> Message-ID: <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> On 12/14/16 5:52 AM, Brian Cully wrote: > On 14-Dec-2016, at 00:29, Isaac (.ike) Levy wrote: > >> >>> Maybe torrents, IPFS, ...? Or a collaborative >>> distributed file system. Perhaps using QFS, MFS or LFS? >> >> While I just got pretty excited about NYC*BUG's ability to take this on whole hog, I ABSOLUTELY would love to see this explored further. >> >> Could you propose something we could get involved in as a group from NYC*BUG, perhaps something people can run to donate a small chunk of their own smaller servers? > > I like the idea of using torrents. 
There are lots of upsides: it's easy to get involved by sharing a smallish chunk of the set, the relatively small tracker file can be separately copied around to ensure there's no single point of attack, perhaps even via git or something similar to ensure it's not tampered with (and made trivially available via github), and it's pretty fire-and-forget (just leave it running on a routable server). > > The major downside I see is that unless the data has already been made available via torrent, someone's gotta seed the thing, which still means you need at least one server with a lot of disk space to get the project started. That's something that we may want anyway, just to ensure the thing can always be seeded (at least until the feds come knocking, but hopefully by then there are many redundant copies of the data sitting around the world). > > I know I'd certainly be willing to donate a few TB on my server to hosting a portion of the data set, but there's no way I could host the whole thing, and I'd also be willing to throw some money into the hat to get the seed up. > I was thinking about using the torrent protocol last night and i think there are two issues that would prevent this: - we'd have to generate checksums for every dataset that is stored, then generate URIs for each of them. I am pretty confident the data here is not bt friendly...which leads to my second point - the academic/prof consumers of this data are probably not going to use bt to download these files for research. unfortunately ftp and http are probably used very frequently in these arenas. 
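The first point above, generating checksums for every stored dataset, is mechanical enough to sketch. The following is a minimal Python example under stated assumptions: datasets are plain files under one directory, SHA-256 is an acceptable digest (the thread does not name one), and the helper names `sha256_of`, `build_manifest` and `verify` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose digest changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

A mirror operator could publish the manifest next to the data over plain http or ftp, so the same checksums would serve researchers who never touch bt.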
having said this - i def feel that bt would be a *much* better method to distribute and share the cost of hosting data...but i'm not sure if they are ready for this or not :) -pete -- Pete Wright pete at nomadlogic.org nomadlogicLA From ike at blackskyresearch.net Wed Dec 14 13:35:58 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 13:35:58 -0500 Subject: [talk] Climate Mirror In-Reply-To: <65ca3f49-bce9-388e-a36c-b68ab5af1039@nomadlogic.org> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <65ca3f49-bce9-388e-a36c-b68ab5af1039@nomadlogic.org> Message-ID: <20991C22-B545-4963-A1E4-962921484FB2@blackskyresearch.net> > On Dec 14, 2016, at 12:55 PM, Pete Wright wrote: > > might be worth pinging the folks at iXsystems to see if they can cut a deal for a system like this. Yep, they're certainly one in the queue of folks to ping- I'd hope they have some deadstock "last years model" storage chassis collecting dust in the CA warehouse? I've got my contacts there, but does anyone else on list want to take point on talking to iXsystems about this? 
Rocket- .ike From ike at blackskyresearch.net Wed Dec 14 13:40:09 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 13:40:09 -0500 Subject: [talk] Climate Mirror In-Reply-To: <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> Message-ID: <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> > On Dec 14, 2016, at 12:59 PM, Pete Wright wrote: > > > > On 12/14/16 5:52 AM, Brian Cully wrote: >> On 14-Dec-2016, at 00:29, Isaac (.ike) Levy wrote: >> >>> >>>> Maybe torrents, IPFS, ...? Or a collaborative >>>> distributed file system. Perhaps using QFS, MFS or LFS? >>> >>> While I just got pretty excited about NYC*BUG's ability to take this on whole hog, I ABSOLUTELY would love to see this explored further. >>> >>> Could you propose something we could get involved in as a group from NYC*BUG, perhaps something people can run to donate a small chunk of their own smaller servers? >> >> I like the idea of using torrents. There are lots of upsides: it's easy to get involved by sharing a smallish chunk of the set, the relatively small tracker file can be separately copied around to ensure there's no single point of attack, perhaps even via git or something similar to ensure it's not tampered with (and made trivially available via github), and it's pretty fire-and-forget (just leave it running on a routable server). >> >> The major downside I see is that unless the data has already been made available via torrent, someone's gotta seed the thing, which still means you need at least one server with a lot of disk space to get the project started. 
That's something that we may want anyway, just to ensure the thing can always be seeded (at least until the feds come knocking, but hopefully by then there are many redundant copies of the data sitting around the world). >> >> I know I'd certainly be willing to donate a few TB on my server to hosting a portion of the data set, but there's no way I could host the whole thing, and I'd also be willing to throw some money into the hat to get the seed up. >> > > > I was thinking about using the torrent protocol last night and i think there are two issues that would prevent this: > > > - we'd have to generate check-sums for every dataset that is stored, then generate URI's for each of them. I am pretty confident the data here is not bt friendly...which leads to my second point > > - the academic/prof consumers of this data are probably not going to use bt to download these files for research. unfortunately ftp and http are probably used very frequently in these arena's. > > having said this - i def feel that bt would be a *much* better method to distribute and share the cost of hosting data...but i'm not sure if they are ready for this or not :) > > -pete From my view, everything points back to needing some simple big disk online to have complete sets- even as a base to seed torrents/other. I'm personally going to focus on that end, but I'd really love to see more ideas for distributed data hit this list- particularly if someone has actionable ways to get involved, (e.g. how-to use this pkg, use this torrent, configure like so, etc...) Well worth the discussion and collaboration here, even if this gets messy or incomplete at first! 
Rocket- .ike From _ at thomaslevine.com Wed Dec 14 14:15:43 2016 From: _ at thomaslevine.com (Thomas Levine) Date: Wed, 14 Dec 2016 19:15:43 +0000 Subject: [talk] Climate Mirror In-Reply-To: <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> Message-ID: <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> The data don't need to be online; save them to a redundant bunch of cheap hard drives (or maybe tapes), and distribute them among lots of bookshelves. They can even be slow and small hard drives pulled from old computers; we need to write to each one only once, we might need to read from each one once, and we otherwise only need to turn them on once every couple years to make sure that they're still intact. Maintain a website with a list of the datasets, the datasets' checksums, and the contact information for the people with the hard drives on their bookshelves. Note that this is my opinion only on how this project could be implemented. I don't know enough about the datasets or the likely effects of geopolitics on their implementation in order to comment as to whether I think the project should be implemented.
From pete at nomadlogic.org Wed Dec 14 14:23:06 2016 From: pete at nomadlogic.org (Pete Wright) Date: Wed, 14 Dec 2016 11:23:06 -0800 Subject: [talk] Climate Mirror In-Reply-To: <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> Message-ID: <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> On 12/14/16 11:15 AM, Thomas Levine wrote: > The data don't need to be online; save them to a redundant bunch of > cheap hard drives (or maybe tapes), and distribute them among lots of > bookshelves.
They can even be slow and small hard drives pulled from old > computers; we need to write to each one only once, we might need to read > from each one once, and we otherwise only need to turn them on once > every couple years to make sure that they're still intact. Maintain a > website with a list of the datasets, the datasets' checksums, and the > contact information for the people with the hard drives on their > bookshelves. > > Note that this is my opinion only on how this project could be > implemented. I don't know enough about the datasets or the likely > effects of geopolitics on their implementation in order to comment as to > whether I think the project should be implemented. > not to nit-pick but i would strongly recommend *against* using HDD's in this manner (magnetic spinning ones, or SSD ones). Drives are not designed to reliably store data cold like this mechanically or electrically. This is why tapes are still in use to this day - they *are* designed for cold store. And if you do hit a bad sector it is quite possible to skip that sector and continue reading data. This is coming from quite a bit first-hand experience where I've lost data-sets which were in cold-storage on HDD's for about a year that were totally lost, versus data on tapes which were in cold-store for around 5-7years where we had few problems recovering our assets. 
-p -- Pete Wright pete at nomadlogic.org nomadlogicLA From ike at blackskyresearch.net Wed Dec 14 14:25:16 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 14:25:16 -0500 Subject: [talk] Climate Mirror In-Reply-To: <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> Message-ID: Word, > On Dec 14, 2016, at 2:15 PM, Thomas Levine <_ at thomaslevine.com> wrote: > > The data don't need to be online; Very creative- I like that premise. > save them to a redundant bunch of > cheap hard drives (or maybe tapes), and distribute them among lots of > bookshelves. They can even be slow and small hard drives pulled from old > computers; we need to write to each one only once, we might need to read > from each one once, and we otherwise only need to turn them on once > every couple years to make sure that they're still intact. Maintain a > website with a list of the datasets, the datasets' checksums, and the > contact information for the people with the hard drives on their > bookshelves. Actually, just to get in the weeds and get all BSD on this: ZFS mirrors of cheap/crappy old drives would likely go a long way toward "on-shelf" preservation. Basically, pulling down the data to a ZFS mirror could help mitigate bit-rot, and "checking" the data after a year could be literally plugging in the drives, and performing a ZFS scrub to look for dead/bad blocks, and repair from mirrored block. I haven't thought of ZFS in this offlined context, but this approach almost seems too easy.
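As a rough illustration of the scrub-and-repair idea in the paragraph above, here is a toy model in Python. It is a sketch only, not how ZFS is implemented: the two "drives" are plain lists of byte blocks, and the names `checksum` and `scrub` are made up for this example. Each block's checksum is recorded at write time; a later scrub re-hashes both copies and overwrites a bad copy from the good one.

```python
import hashlib


def checksum(block):
    """Checksum recorded for a block when it is first written."""
    return hashlib.sha256(block).hexdigest()


def scrub(mirror_a, mirror_b, checksums):
    """Check every block on both copies against its recorded checksum.

    A bad copy is overwritten from the good one. Returns (repaired, lost),
    where lost lists the block indexes that are bad on both copies.
    """
    repaired, lost = [], []
    for i, expected in enumerate(checksums):
        ok_a = checksum(mirror_a[i]) == expected
        ok_b = checksum(mirror_b[i]) == expected
        if ok_a and ok_b:
            continue  # both copies intact, nothing to do
        if ok_a:
            mirror_b[i] = mirror_a[i]  # repair B from A
            repaired.append(i)
        elif ok_b:
            mirror_a[i] = mirror_b[i]  # repair A from B
            repaired.append(i)
        else:
            lost.append(i)  # no good copy left to repair from
    return repaired, lost
```

The `lost` list is the case to worry about: a block that has rotted on both drives cannot be repaired from the mirror, and that is what additional copies are meant to make unlikely.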
Without going nuts on the relative merits of various ZFS block replication schemes, can anyone poke holes in why this may overcomplicate the idea? I'm all ears... (This incidentally is something I could start tonight from home, using a drawer full of flaky old 2TB drives...) > > Note that this is my opinion only on how this project could be > implemented. I don't know enough about the datasets or the likely > effects of geopolitics on their implementation in order to comment as to > whether I think the project should be implemented. I'm with you on that- I'm happy to trust the original project directors' lists of what's most important and most relevant to climate scientists. Lots of what's listed is NOAA and NASA; the value of the data seems self-evident to me just by the names of the data sets. Best, .ike From ike at blackskyresearch.net Wed Dec 14 14:28:18 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 14:28:18 -0500 Subject: [talk] Climate Mirror In-Reply-To: <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> Message-ID: <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> > On Dec 14, 2016, at 2:23 PM, Pete Wright wrote: > > > > On 12/14/16 11:15 AM, Thomas Levine wrote: >> The data don't need to be online; save them to a redundant bunch of >> cheap hard drives (or maybe tapes), and distribute them among lots of >> bookshelves.
They can even be slow and small hard drives pulled from old >> computers; we need to write to each one only once, we might need to read >> from each one once, and we otherwise only need to turn them on once >> every couple years to make sure that they're still intact. Maintain a >> website with a list of the datasets, the datasets' checksums, and the >> contact information for the people with the hard drives on their >> bookshelves. >> >> Note that this is my opinion only on how this project could be >> implemented. I don't know enough about the datasets or the likely >> effects of geopolitics on their implementation in order to comment as to >> whether I think the project should be implemented. >> > > not to nit-pick but i would strongly recommend *against* using HDD's in this manner (magnetic spinning ones, or SSD ones). Drives are not designed to reliably store data cold like this mechanically or electrically. This is why tapes are still in use to this day - they *are* designed for cold store. And if you do hit a bad sector it is quite possible to skip that sector and continue reading data. > > This is coming from quite a bit first-hand experience where I've lost data-sets which were in cold-storage on HDD's for about a year that were totally lost, versus data on tapes which were in cold-store for around 5-7years where we had few problems recovering our assets. Hrm. You may have put the silver bullet in my previous post. Pete: any thoughts on mitigating this effect by using ZFS mirrors? Perhaps even increasing the block mirror count across disks, so even on one of the mirrored disks there are 2 mirrored blocks? 
I mean, one crappy way to test this is to just do it and wait a year :P Best, .ike From pete at nomadlogic.org Wed Dec 14 14:31:48 2016 From: pete at nomadlogic.org (Pete Wright) Date: Wed, 14 Dec 2016 11:31:48 -0800 Subject: [talk] Climate Mirror In-Reply-To: <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> Message-ID: <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> On 12/14/16 11:28 AM, Isaac (.ike) Levy wrote: > >> On Dec 14, 2016, at 2:23 PM, Pete Wright wrote: >> >> >> >> On 12/14/16 11:15 AM, Thomas Levine wrote: >>> The data don't need to be online; save them to a redundant bunch of >>> cheap hard drives (or maybe tapes), and distribute them among lots of >>> bookshelves. They can even be slow and small hard drives pulled from old >>> computers; we need to write to each one only once, we might need to read >>> from each one once, and we otherwise only need to turn them on once >>> every couple years to make sure that they're still intact. Maintain a >>> website with a list of the datasets, the datasets' checksums, and the >>> contact information for the people with the hard drives on their >>> bookshelves. >>> >>> Note that this is my opinion only on how this project could be >>> implemented. I don't know enough about the datasets or the likely >>> effects of geopolitics on their implementation in order to comment as to >>> whether I think the project should be implemented. 
>>> >> >> not to nit-pick but i would strongly recommend *against* using HDD's in this manner (magnetic spinning ones, or SSD ones). Drives are not designed to reliably store data cold like this mechanically or electrically. This is why tapes are still in use to this day - they *are* designed for cold store. And if you do hit a bad sector it is quite possible to skip that sector and continue reading data. >> >> This is coming from quite a bit first-hand experience where I've lost data-sets which were in cold-storage on HDD's for about a year that were totally lost, versus data on tapes which were in cold-store for around 5-7years where we had few problems recovering our assets. > > Hrm. You may have put the silver bullet in my previous post. > > Pete: any thoughts on mitigating this effect by using ZFS mirrors? Perhaps even increasing the block mirror count across disks, so even on one of the mirrored disks there are 2 mirrored blocks? > > I mean, one crappy way to test this is to just do it and wait a year :P tbh - i'd be most concerned about mechanical issues on magnetic HDD's. Unlike a tape where I can physically forward the tape to a new sector if I run into a problem (something i've had to do!) I have seen my fair share of drives sit idle for a period of time only to refuse to spin-up when i tried to revive/recycle them. 
-p -- Pete Wright pete at nomadlogic.org nomadlogicLA From _ at thomaslevine.com Wed Dec 14 14:44:02 2016 From: _ at thomaslevine.com (Thomas Levine) Date: Wed, 14 Dec 2016 19:44:02 +0000 Subject: [talk] Climate Mirror In-Reply-To: <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> Message-ID: <1481744642.2292918.819159321.35FA8504@webmail.messagingengine.com> Any form of cold storage would be great; we don't need to read from them quickly, so the choice is mostly about cost. I thought twelve redundant cheap/free hard drives might be cheaper or easier than like three redundant tapes. Since there will be like one file per disk, it could be neat to not use a filesystem and instead develop some custom compression and error correction method for the specific file formats. But I don't trust anyone to implement that perfectly or to remember ten years from now how that works, so I don't recommend it. Regarding my speculation about the utility of this project: I will be very disturbed if people haven't already been mirroring the data, so I am puzzled as to why a potentially enormous change in government policy should make the mirroring into a pressing issue. 
From bcully at gmail.com Wed Dec 14 14:51:24 2016 From: bcully at gmail.com (Brian Cully) Date: Wed, 14 Dec 2016 14:51:24 -0500 Subject: [talk] Climate Mirror In-Reply-To: <1481744642.2292918.819159321.35FA8504@webmail.messagingengine.com> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> <1481744642.2292918.819159321.35FA8504@webmail.messagingengine.com> Message-ID: <76E5E6C0-8E3A-48FF-B0E8-8045D01A7FBA@gmail.com> > On 14-Dec-2016, at 14:44, Thomas Levine <_ at thomaslevine.com> wrote: > > Regarding my speculation about the utility of this project: I will be > very disturbed if people haven't already been mirroring the data, so I > am puzzled as to why a potentially enormous change in government policy > should make the mirroring into a pressing issue. > Welcome to academia.
Backups for large datasets are often nonexistent for all kinds of reasons, and tend to be low on the list of concerns for harried researchers who have to spend most of their time running a lab, teaching, conferences, and applying for grants in order to keep their facilities afloat. -bjc From assaf at eml.cc Wed Dec 14 14:46:33 2016 From: assaf at eml.cc (Assaf Rutenberg) Date: Wed, 14 Dec 2016 14:46:33 -0500 Subject: [talk] Climate Mirror In-Reply-To: <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> Message-ID: As far as maintaining a non-US mirror or storage space, i am in the midst of moving to Ecuador and can reliably maintain a NAS/Server down there, albeit with the slower connections afforded in South America, if someone can help me set one up. I think i even have a box of 4TB drives packed away that i could use. I'll be back in the States early in January for a month or two and will make it to the next nyc-bug meetup to see if there is any way i can be of help. -assaf
From ike at blackskyresearch.net Wed Dec 14 15:26:58 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 15:26:58 -0500 Subject: [talk] Climate Mirror In-Reply-To: <76E5E6C0-8E3A-48FF-B0E8-8045D01A7FBA@gmail.com> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> <1481744642.2292918.819159321.35FA8504@webmail.messagingengine.com> <76E5E6C0-8E3A-48FF-B0E8-8045D01A7FBA@gmail.com> Message-ID: <07F5CC64-3F73-4419-A4EE-F2C11E781A30@blackskyresearch.net> > On Dec 14, 2016, at 2:51 PM, Brian Cully wrote: > > >> On 14-Dec-2016, at 14:44, Thomas Levine <_ at thomaslevine.com> wrote: >> >> Regarding my speculation about the utility of this project: I will be >> very disturbed if people haven't already been mirroring the data, so I >> am puzzled as to why a potentially enormous change in government policy >> should make the mirroring into a pressing issue. >> > > Welcome to academia.
Backups for large datasets are often nonexistent for all kinds of reasons, and tend to be low on the list of concerns for harried researchers who have to spend most of their time running a lab, teaching, conferences, and applying for grants in order to keep their facilities afloat. > > -bjc +1 from the private sector here too, backing up even critical stuff is just not a priority. I mean, modern civilization did *lose* the original moon landing tapes, right? Even when mirrors/backups are in place, here's some knee-jerk reading on loftier technical ends of the problem, http://www.bbc.com/news/science-environment-31450389 http://queue.acm.org/detail.cfm?id=1866298 I do believe that a culture of distributed, heterogeneous backups spread across many groups and technical cultures is the best way for important data to survive long term. Best, .ike From ike at blackskyresearch.net Wed Dec 14 15:28:57 2016 From: ike at blackskyresearch.net (Isaac (.ike) Levy) Date: Wed, 14 Dec 2016 15:28:57 -0500 Subject: [talk] Climate Mirror In-Reply-To: References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org> <6CA16479-D360-42DB-AD77-6CBFE552033F@blackskyresearch.net> <4dbe3813-3059-c87d-47c8-1abb1d757219@nomadlogic.org> Message-ID: <8E2C015E-98E8-487E-9B91-D61C7D65051F@blackskyresearch.net> > On Dec 14, 2016, at 2:46 PM, Assaf Rutenberg wrote: > > As far as maintaining a non US mirror or storage space, i am in the midst of moving to Ecuador and can reliably maintain a NAS/Server down there, albeit with the slower connections afforded in South America, if someone can help me set one up.
I think i even have a box of 4TB drives packed away that i could use. I'll be back in the States early in January for a month or two and will make it to the next nyc-bug meetup to see if there is any way i can be of help. > > -assaf That is spectacular. If you already have the gear down there, I say just set it up and start siphoning down data sets! Hit list if you hit snags? Rocket- .ike
From mirimir at riseup.net Wed Dec 14 17:23:12 2016 From: mirimir at riseup.net (Mirimir) Date: Wed, 14 Dec 2016 15:23:12 -0700 Subject: [talk] Climate Mirror In-Reply-To: <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> Message-ID: On 12/14/2016 11:40 AM, Isaac (.ike) Levy wrote: > >> On Dec 14, 2016, at 12:59 PM, Pete Wright >> wrote: >> >> >> >> On 12/14/16 5:52 AM, Brian Cully wrote: >>> On 14-Dec-2016, at 00:29, Isaac (.ike) Levy >>> wrote: >>> >>>> >>>>> Maybe torrents, IPFS, ...? Or a collaborative distributed >>>>> file system. Perhaps using QFS, MFS or LFS?
>>>>
>>>> While I just got pretty excited about NYC*BUG's ability to take
>>>> this on whole hog, I ABSOLUTELY would love to see this explored
>>>> further.
>>>>
>>>> Could you propose something we could get involved in as a group
>>>> from NYC*BUG, perhaps something people can run to donate a
>>>> small chunk of their own smaller servers?
>>>
>>> I like the idea of using torrents. There are lots of upsides:
>>> it's easy to get involved by sharing a smallish chunk of the set,
>>> the relatively small tracker file can be separately copied around
>>> to ensure there's no single point of attack, perhaps even via git
>>> or something similar to ensure it's not tampered with (and made
>>> trivially available via github), and it's pretty fire-and-forget
>>> (just leave it running on a routable server).
>>>
>>> The major downside I see is that unless the data has already been
>>> made available via torrent, someone's gotta seed the thing, which
>>> still means you need at least one server with a lot of disk space
>>> to get the project started. That's something that we may want
>>> anyway, just to ensure the thing can always be seeded (at least
>>> until the feds come knocking, but hopefully by then there are
>>> many redundant copies of the data sitting around the world).

Right. But how many servers, and how much storage? Taking a quick look at that spreadsheet, I saw one 100TB dataset. And I wouldn't be surprised if there were several PB overall.

But the initial box wouldn't need to seed 100TB or whatever in one go. Maybe 1TB chunks. That's pretty common for HD video. Once there were other seeders, the initial box could start seeding another 1TB chunk.

>>> I know I'd certainly be willing to donate a few TB on my server
>>> to hosting a portion of the data set, but there's no way I could
>>> host the whole thing, and I'd also be willing to throw some money
>>> into the hat to get the seed up.
>>>
>>
>>
>> I was thinking about using the torrent protocol last night and i
>> think there are two issues that would prevent this:
>>
>>
>> - we'd have to generate check-sums for every dataset that is
>> stored, then generate URI's for each of them. I am pretty
>> confident the data here is not bt friendly...which leads to my
>> second point

Yes, seedboxes need decent CPU and RAM for checksums.

>> - the academic/prof consumers of this data are probably not going
>> to use bt to download these files for research. unfortunately ftp
>> and http are probably used very frequently in these arena's.

Well, just about every Linux distro comes with a BT client. And there's a webGUI for Transmission, which simplifies running remote servers. I'm sure that there's comparable stuff on *BSD, Mac and Windows.

>> having said this - i def feel that bt would be a *much* better
>> method to distribute and share the cost of hosting data...but i'm
>> not sure if they are ready for this or not :)
>>
>> -pete
>
> From my view, everything points back to needing some simple big disk
> online to have complete sets- even as a base to seed torrents/other.

Yes. Plural, I think. There's a _lot_ of data. And it needs replication.

> I'm personally going to focus on that end, but I'd really love to see
> more ideas for distributed data hit this list- particularly if
> someone has actionable ways to get involved, (e.g. how-to use this
> pkg, use this torrent, configure like so, etc...)

I've been playing with distributed file systems. Mainly LizardFS and Quantcast QFS. My primary focus has been running nodes as Tor onion services, with IPv6 OnionCat links. I've also been playing with MPTCP. With six IPv6 OnionCat links, I get 30-50Mbps between onions. With real multihomed servers, sans Tor, I suspect that Tbps is doable.

Arguably, Tor-level privacy is unnecessary for this effort. I'll shift focus to that, for now.
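For a rough sense of scale, the figures in this message (1TB chunks, a 100TB dataset, 30-50Mbps links) can be turned into transfer times. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes), a fully sustained link rate and no protocol overhead, and `transfer_days` is a made-up helper name rather than part of any tool mentioned here.

```python
def transfer_days(size_tb, rate_mbps):
    """Days needed to move size_tb terabytes at rate_mbps megabits per second.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Mbps = 1e6 bit/s), a fully
    sustained rate and zero protocol overhead, so real transfers are slower.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (rate_mbps * 1e6)
    return seconds / 86400


if __name__ == "__main__":
    for size_tb in (1, 100):        # one 1TB chunk vs. the 100TB dataset
        for rate_mbps in (30, 50):  # the 30-50Mbps range quoted above
            days = transfer_days(size_tb, rate_mbps)
            print(f"{size_tb:>4} TB at {rate_mbps} Mbps: {days:7.1f} days")
```

Under those assumptions a single 1TB chunk moves in a few days, while the full 100TB dataset takes on the order of months over the same link, which is one way to read the argument for chunked seeding and for faster non-Tor links.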
> Well worth the discussion and collaboration here, even if this gets
> messy or incomplete at first!
>
> Rocket- .ike
>
>
>
> _______________________________________________ talk mailing list
> talk at lists.nycbug.org http://lists.nycbug.org/mailman/listinfo/talk
>

From mirimir at riseup.net Wed Dec 14 17:39:16 2016
From: mirimir at riseup.net (Mirimir)
Date: Wed, 14 Dec 2016 15:39:16 -0700
Subject: [talk] Climate Mirror
In-Reply-To:
References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com>
Message-ID:

On 12/14/2016 12:25 PM, Isaac (.ike) Levy wrote:
> Word,
>
>> On Dec 14, 2016, at 2:15 PM, Thomas Levine <_ at thomaslevine.com>
>> wrote:
>>
>> The data don't need to be online;
>
> Very creative- I like that premise.
>
>> save them to a redundant bunch of cheap hard drives (or maybe
>> tapes), and distribute them among lots of bookshelves. They can
>> even be slow and small hard drives pulled from old computers; we
>> need to write to each one only once, we might need to read from
>> each one once, and we otherwise only need to turn them on once
>> every couple years to make sure that they're still intact. Maintain
>> a website with a list of the datasets, the datasets' checksums, and
>> the contact information for the people with the hard drives on
>> their bookshelves.
>
> Actually, just to get in the weeds and get all BSD on this:
>
> ZFS mirrors of cheap/crappy old drives would likely go a long way
> toward "on-shelf" preservation. Basically, pulling down the data to a
> ZFS mirror could help mitigate bit-rot, and "checking"
the data after
> a year could be literally plugging in the drives, and performing a
> ZFS scrub to look for dead/bad blocks, and repair from mirrored
> block.
>
> I haven't thought of ZFS in this offlined context, but this approach
> almost seems too easy.
>
> Without going nuts on the relative merits of various ZFS block
> replication schemes, can anyone poke holes in why this may
> overcomplicate the idea? I'm all ears...

Well, it is my understanding that ZFS mitigates bit-rot. But I actually know nothing about it.

However, offline storage will be essential, I think. There's a lot of data to replicate. As I argued in another subthread, BT seedboxes could seed data in 1TB chunks. And once there are other seeders, they could take their copy offline, and seed the next chunk.

Also, with huge datasets, you can get ~20Gbps by shipping a bunch of disks or tape.

> (This incidentally is something I could start tonight from home,
> using a drawer full of flaky old 2Tb drives...)
>
>>
>> Note that this is my opinion only on how this project could be
>> implemented. I don't know enough about the datasets or the likely
>> effects of geopolitics on their implementation in order to comment
>> as to whether I think the project should be implemented.
>
> I'm with you on that- I'm happy to trust the original project
> directors lists of what's most important and most relevant to climate
> scientists. Lots of what's listed is NOAA and NASA, the value of the
> data seems self-evident to me just by the names of the data sets.
>
> Best, .ike
>

From george at ceetonetechnology.com Wed Dec 14 18:22:00 2016
From: george at ceetonetechnology.com (George Rosamond)
Date: Wed, 14 Dec 2016 23:22:00 +0000
Subject: [talk] meeting idea
Message-ID: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>

Anyone see this?

http://www.openrandom.org/

They ship out of Brooklyn... so might be worth pinging them to do a meeting?

I know it's a product, but it's totally open-sourced from what I can see, and a meeting on hardware RNG is certainly worth having...

thoughts?

g

From ike at blackskyresearch.net Wed Dec 14 18:58:01 2016
From: ike at blackskyresearch.net (Isaac (.ike) Levy)
Date: Wed, 14 Dec 2016 18:58:01 -0500
Subject: [talk] meeting idea
In-Reply-To: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>
References: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>
Message-ID:

> On Dec 14, 2016, at 6:22 PM, George Rosamond wrote:
>
> Anyone see this?
>
> http://www.openrandom.org/
>
> They ship out of Brooklyn... so might be worth pinging them to do a meeting?
>
> I know it's a product, but it's totally open-sourced from what I can
> see, and a meeting on hardware RNG is certainly worth having...
>
> thoughts?
>
> g

Astounding. I'd be super excited to see someone walk through both internals/implementation, and practicals for that gear, at a NYC*BUG meeting.

Best,
.ike

From jim at netgate.com Wed Dec 14 20:59:56 2016
From: jim at netgate.com (Jim Thompson)
Date: Wed, 14 Dec 2016 19:59:56 -0600
Subject: [talk] meeting idea
In-Reply-To:
References: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>
Message-ID:

I was involved with something like this a long time ago. (The AtomAge RNG mentioned here: https://seifried.org/security/cryptography/20000126-random-numbers.html)

Our architecture was quite similar to the Z1FFER <= 0.2.x. It's very difficult to eliminate the 60Hz (50Hz in many places outside the USA) 'hum', which will show up in your stream of (now not quite so) random numbers. You'll also see periodic power supply 'noise' that can destroy the randomness of the circuit. We ended up battery powered, IIRC, 6 or 8 'D' cells.
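
One way to check a captured sample stream for that kind of hum is to look at the spectral power at the mains frequency. The following is a minimal sketch, assuming the raw samples are available as numbers at a known sample rate; the 6000 Hz rate, the synthetic noise and the hum amplitude are all invented for illustration and do not come from the AtomAge or Z1FFER hardware. It uses the Goertzel algorithm to evaluate a single DFT bin.

```python
import math
import random


def goertzel_power(samples, sample_rate_hz, target_hz):
    """Squared DFT magnitude of `samples` at the bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def synthetic_noise(n, sample_rate_hz, hum_hz=0.0, hum_amplitude=0.0, seed=1):
    """Uniform noise in [-1, 1], optionally with a sine 'hum' mixed in.

    This is made-up test data, not output from any real noise source.
    """
    rng = random.Random(seed)
    return [
        rng.uniform(-1.0, 1.0)
        + hum_amplitude * math.sin(2.0 * math.pi * hum_hz * i / sample_rate_hz)
        for i in range(n)
    ]


if __name__ == "__main__":
    rate = 6000  # assumed sample rate in Hz (hypothetical)
    clean = synthetic_noise(rate, rate)
    hummy = synthetic_noise(rate, rate, hum_hz=60.0, hum_amplitude=0.2)
    print("60 Hz power, clean:", goertzel_power(clean, rate, 60.0))
    print("60 Hz power, hum  :", goertzel_power(hummy, rate, 60.0))
```

A 60 Hz bin that stands well above its neighbours hints that mains pickup is leaking into the samples; for 50 Hz mains, change the target frequency accordingly.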

The modular entropy multiplier architecture (which wasn't invented until 1999, several years after the AtomAge) solves the issues related to foreign signal injection.

More reading on similar (and still open source) designs to the Z1FFER > 0.2.x (and the advantages of MEM) here:

https://github.com/waywardgeek/infnoise
https://github.com/alwynallan/redoubler

Jim

On Wed, Dec 14, 2016 at 5:58 PM, Isaac (.ike) Levy wrote:

>
> > On Dec 14, 2016, at 6:22 PM, George Rosamond <george at ceetonetechnology.com> wrote:
> >
> > Anyone see this?
> >
> > http://www.openrandom.org/
> >
> > They ship out of Brooklyn... so might be worth pinging them to do a meeting?
> >
> > I know it's a product, but it's totally open-sourced from what I can
> > see, and a meeting on hardware RNG is certainly worth having...
> >
> > thoughts?
> >
> > g
>
> Astounding. I'd be super excited to see someone walk through both
> internals/implementation, and practicals for that gear, at a NYC*BUG
> meeting.
>
> Best,
> .ike
>

From raulcuza at gmail.com Wed Dec 14 23:13:29 2016
From: raulcuza at gmail.com (Raul Cuza)
Date: Wed, 14 Dec 2016 23:13:29 -0500
Subject: [talk] meeting idea
In-Reply-To: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>
References: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>
Message-ID:

On Dec 14, 2016 18:23, "George Rosamond" wrote:

Anyone see this?

http://www.openrandom.org/

They ship out of Brooklyn... so might be worth pinging them to do a meeting?

I know it's a product, but it's totally open-sourced from what I can see, and a meeting on hardware RNG is certainly worth having...

thoughts?

g
______________________________

Yes! Let's invite them. Can we bookend it with a crypto talk too?
I don't know anyone but will do the asking if you have suggestions.

From bcallah at devio.us Wed Dec 14 23:51:49 2016
From: bcallah at devio.us (Brian Callahan)
Date: Wed, 14 Dec 2016 23:51:49 -0500
Subject: [talk] meeting idea
In-Reply-To:
References: <5eb1aa7c-f084-1bde-d9e0-87954987c6d7@ceetonetechnology.com>
Message-ID: <1ccd9a11-7b52-63c1-fc8d-11bedfa512cb@devio.us>

On 12/14/2016 11:13 PM, Raul Cuza wrote:
>
>
> On Dec 14, 2016 18:23, "George Rosamond"
> wrote:
>
> Anyone see this?
>
> http://www.openrandom.org/
>
> They ship out of Brooklyn... so might be worth pinging them to do
> a meeting?
>
> I know it's a product, but it's totally open-sourced from what I can
> see, and a meeting on hardware RNG is certainly worth having...
>
> thoughts?
>
> g
>
> ______________________________
>
>
> Yes! Let's invite them. Can we bookend it with a crypto talk too? I
> don't know anyone but will do the asking if you have suggestions.
>
>

I noticed it's for Arduino but maybe such a talk can help open the door for some USB-based (or internal?) product that we can help write *BSD drivers for.

So yes, I'm for this.

~Brian

From john at netpurgatory.com Wed Dec 14 13:05:25 2016
From: john at netpurgatory.com (John C. Vernaleo)
Date: Wed, 14 Dec 2016 13:05:25 -0500 (EST)
Subject: [talk] Climate Mirror
In-Reply-To: <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org>
References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org>
Message-ID:

-------------------------------------------------------
John C. Vernaleo, Ph.D.
www.netpurgatory.com
john at netpurgatory.com
-------------------------------------------------------

On Wed, 14 Dec 2016, Pete Wright wrote:

>
>
> On 12/14/16 5:52 AM, Brian Cully wrote:
>> On 14-Dec-2016, at 00:29, Isaac (.ike) Levy wrote:
>>
>>>
>>>> Maybe torrents, IPFS, ...? Or a collaborative
>>>> distributed file system. Perhaps using QFS, MFS or LFS?
>>>
>>> While I just got pretty excited about NYC*BUG's ability to take this on
>>> whole hog, I ABSOLUTELY would love to see this explored further.
>>>
>>> Could you propose something we could get involved in as a group from
>>> NYC*BUG, perhaps something people can run to donate a small chunk of their
>>> own smaller servers?
>>
>> I like the idea of using torrents. There are lots of upsides: it's
>> easy to get involved by sharing a smallish chunk of the set, the relatively
>> small tracker file can be separately copied around to ensure there's no
>> single point of attack, perhaps even via git or something similar to ensure
>> it's not tampered with (and made trivially available via github), and it's
>> pretty fire-and-forget (just leave it running on a routable server).
>>
>> The major downside I see is that unless the data has already been
>> made available via torrent, someone's gotta seed the thing, which still means
>> you need at least one server with a lot of disk space to get the project
>> started. That's something that we may want anyway, just to ensure the thing
>> can always be seeded (at least until the feds come knocking, but hopefully by
>> then there are many redundant copies of the data sitting around the world).
>>
>> I know I'd certainly be willing to donate a few TB on my server to
>> hosting a portion of the data set, but there's no way I could host the whole
>> thing, and I'd also be willing to throw some money into the hat to get the
>> seed up.
>>
>
>
> I was thinking about using the torrent protocol last night and i think
> there are two issues that would prevent this:
>
>
> - we'd have to generate check-sums for every dataset that is stored,
> then generate URI's for each of them. I am pretty confident the data
> here is not bt friendly...which leads to my second point
>
> - the academic/prof consumers of this data are probably not going to use
> bt to download these files for research. unfortunately ftp and http are
> probably used very frequently in these arena's.
>
> having said this - i def feel that bt would be a *much* better method to
> distribute and share the cost of hosting data...but i'm not sure if they
> are ready for this or not :)
>

Just to add to that, it isn't even always a case of them being ready or not for bt, but their institution may block bittorrent.

Having been responsible for distributing satellite data in the past, http and ftp are definitely what the researchers want and need.

From john at netpurgatory.com Wed Dec 14 14:25:29 2016
From: john at netpurgatory.com (John C. Vernaleo)
Date: Wed, 14 Dec 2016 14:25:29 -0500 (EST)
Subject: [talk] Climate Mirror
In-Reply-To: <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org>
References: <2fbc36e6-688a-5e9b-0cee-a8aa88251532@nomadlogic.org> <8293bf21-947c-a089-87fa-bce41d6f0215@riseup.net> <258F5AE2-29E9-4C8E-AA50-A11907CCB14B@blackskyresearch.net> <4D642D5D-B750-4975-9C0F-742B9361AD7A@gmail.com> <2a1a8120-dd92-490f-8d76-3ef48f9876fa@nomadlogic.org> <9FF99B52-2E42-4EF1-ADB0-C961DC1B3660@blackskyresearch.net> <1481742943.2286944.819128841.177F674A@webmail.messagingengine.com> <98248fa0-86b2-655c-194f-88d5c16f169e@nomadlogic.org>
Message-ID:

On Wed, 14 Dec 2016, Pete Wright wrote:

>
>
> On 12/14/16 11:15 AM, Thomas Levine wrote:
>> The data don't need to be online; save them to a redundant bunch of
>> cheap hard drives (or maybe tapes), and distribute them among lots of
>> bookshelves.
They can even be slow and small hard drives pulled from old >> computers; we need to write to each one only once, we might need to read >> from each one once, and we otherwise only need to turn them on once >> every couple years to make sure that they're still intact. Maintain a >> website with a list of the datasets, the datasets' checksums, and the >> contact information for the people with the hard drives on their >> bookshelves. >> >> Note that this is my opinion only on how this project could be >> implemented. I don't know enough about the datasets or the likely >> effects of geopolitics on their implementation in order to comment as to >> whether I think the project should be implemented. >> > > not to nit-pick but i would strongly recommend *against* using HDD's in this > manner (magnetic spinning ones, or SSD ones). Drives are not designed to > reliably store data cold like this mechanically or electrically. This is why > tapes are still in use to this day - they *are* designed for cold store. And > if you do hit a bad sector it is quite possible to skip that sector and > continue reading data. > > This is coming from quite a bit first-hand experience where I've lost > data-sets which were in cold-storage on HDD's for about a year that were > totally lost, versus data on tapes which were in cold-store for around > 5-7years where we had few problems recovering our assets. > > -p > Yep. That's why the high-e astrophysics archive at goddard (the only archive I know well) still uses tape for long term storage. 
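
Whatever the cold-storage medium ends up being, the "list of the datasets' checksums" idea quoted above is straightforward to prototype. A minimal sketch follows, assuming SHA-256 as the hash (the thread does not name one) and a plain directory tree per dataset; `build_manifest` and `verify_manifest` are hypothetical helper names, not part of any existing tool.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large datasets need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of_file(full)
    return manifest


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or whose digest changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of_file(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

The manifest would be built once when the data is written and published alongside the dataset list; re-running `verify_manifest` after a drive or tape has been read back then shows which files, if any, no longer match.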

From mark.saad at ymail.com Thu Dec 15 10:48:26 2016
From: mark.saad at ymail.com (Mark Saad)
Date: Thu, 15 Dec 2016 15:48:26 +0000 (UTC)
Subject: [talk] Server controll panel
References: <1817344556.3339905.1481816906570.ref@mail.yahoo.com>
Message-ID: <1817344556.3339905.1481816906570@mail.yahoo.com>

All,

I stumbled on this the other day; it's a Python-based server control panel in the vein of easyadmin, cpanel, etc.:

Ajenti Server Admin Panel
http://ajenti.org/

I was wondering if anyone out there has used it?

--
Mark Saad
mark.saad at ymail.com

From pete at nomadlogic.org Thu Dec 15 19:21:30 2016
From: pete at nomadlogic.org (Pete Wright)
Date: Thu, 15 Dec 2016 16:21:30 -0800
Subject: [talk] SAN SSL Certificates
Message-ID: <3945b9a9-33ce-f76d-427e-0ba636e4cb31@nomadlogic.org>

So I've recently started working for a company that manages quite a few websites around the states and noticed that one practice they have here is to use SAN (Subject Alternative Name) SSL certificates. This allows them to purchase a single SSL certificate that is valid for I think up to 20 domains. I had heard of this certificate type before, but thought its primary intent was to be used for internal AD domain management. I've been looking around at some other publishers and have noticed a fair amount of other sites using this certificate type.

My initial thought is that this seems like an interesting attack vector. For example if www.foobar.com has a SAN cert for a bunch of other domains, then I know they are all under one umbrella. But also...I've found that most use-cases involve domains that appear to be middleware or backend services under these certs. For example www.bloomberg.com's cert also is valid for: fonts.gotraffic.net, bbg-img.bwbx.io etc...

Anyone thought about this with a more clear head than me?
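
To make the enumeration concern concrete: anyone who can open a TLS connection can read the SAN list out of the certificate the server presents. A minimal sketch using only Python's standard `ssl` and `socket` modules follows; the sample dictionary is hand-written in the layout that `getpeercert()` returns, using the names from the message above, and is not a real certificate dump. The function names are made up for this example.

```python
import socket
import ssl


def dns_names_from_cert(cert):
    """Extract DNS entries from a cert dict as returned by SSLSocket.getpeercert()."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def fetch_san_names(host, port=443, timeout=10.0):
    """Connect to host:port over TLS and return the DNS names in its certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return dns_names_from_cert(tls.getpeercert())


if __name__ == "__main__":
    # Hand-written sample in getpeercert() layout; NOT a real certificate dump.
    sample = {
        "subjectAltName": (
            ("DNS", "www.bloomberg.com"),
            ("DNS", "fonts.gotraffic.net"),
            ("DNS", "bbg-img.bwbx.io"),
        )
    }
    print(dns_names_from_cert(sample))
```

Calling `fetch_san_names("www.example.org")` against a live host would need network access; the point is only that the list is public by design, so any backend hostname placed in a shared certificate is visible to every visitor.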

-p

--
Pete Wright
pete at nomadlogic.org
nomadlogicLA

From ike at blackskyresearch.net Sun Dec 18 14:19:34 2016
From: ike at blackskyresearch.net (Isaac (.ike) Levy)
Date: Sun, 18 Dec 2016 14:19:34 -0500
Subject: [talk] NYC*BUG Climate Mirror and roll call
Message-ID: <9B8FC473-5926-4935-90E0-B58CB96F8C15@blackskyresearch.net>

Hi All,

Sorry for the cross-post, I won't make a habit of it.

For the new Climate Mirror list, I wanted to brief folks as to what NYC*BUG (the NYC *BSD Users Group) is independently working on.

--
NYC*BUG is working to get a big NYC mirror in place for Climate Data:

[x] Colocation:
NYC*BUG has run a colo cabinet in NYC, donated by New York Internet, with many BSD/UNIX/Open Source projects in and out over the last 12 years. We've currently got space, power, and excellent/flexible internet connectivity in place. Power will obviously become a practical issue but we'll cross that bridge when we get to it, and for now we have 2 redundant circuits to get us lit solid.

[x] Skilled Large Storage Operators:
NYC*BUG is filled with folks possessing advanced and practical experience building/maintaining large scale storage systems, with practical expertise in ZFS on FreeBSD. For this effort, we already have a small group of capable admins scheming to get a large storage system up just for this climate data. Additionally, I personally used to work at a public data company, and from that I am well versed in the practical challenges of mirroring data from many government agencies. In short, we certainly know what we're getting ourselves into :)

[-] Server(s) Hardware:
We're currently in the process of reaching out to various hardware vendors we have professional relationships with, trying to acquire at least one large server (24-48 SATA bays), as well as disks.
This is our last hurdle- and getting something simply "not too junky" donated is important, as our experience with volunteerism in this area is that too-flaky or too-hackish hardware is extremely costly for volunteers' time. We are even collaborating with a vendor on the idea of arranging for individual contributors to donate 8Tb drives by purchasing one, just organizational work.

With that, by nature of our organization, we're trying to keep our mirroring independent- plenty of cooks in our kitchen for sure- but if folks come across anyone who has hardware to donate, (or would buy hardware to donate), we have more than a decade of community experience to put it into practice and keep it online.

--
Additionally, some breakout discussions have come up around NYC*BUG to address various distributed techniques, so that many individuals can contribute slices of what they have, but we're still all just hashing out different ideas, (no pun intended), and even Torrent based ideas require a complete base to get distributed as "shards" so to speak. So we're focusing on that.

Please feel free to contribute ideas or suggestions to what we have above, and if you like any of our ideas, by all means take them and run with them in your environments!

Also, some of our operational progress/chatter as we continue will remain on NYC*BUG lists, to keep the noise down here.

Best,
.ike

From ike at blackskyresearch.net Fri Dec 23 20:03:14 2016
From: ike at blackskyresearch.net (Isaac (.ike) Levy)
Date: Sat, 24 Dec 2016 10:03:14 +0900
Subject: [talk] Gitlabs going post-cloud with Metal
Message-ID:

Hey all,

So, a breath of fresh air:

Gitlabs, (revenue for Github), is apparently leaving the cloud for metal.

Their open-ish community input process is pretty cool, and their numbers and projections line up *exactly* with what I've presented before- (one order of magnitude cost savings over 3 years, with radically increased computing/network horsepower).

As if that isn't cool enough, NYI is one of the two datacenters they are considering, (w00t!)

Their full breakdown and request for comments is here,
https://about.gitlab.com/2016/12/11/proposed-server-purchase-for-gitlab-com/

They posted a cost spreadsheet breakdown here,
https://docs.google.com/spreadsheets/d/1XG9VXdDxNd8ipgPlEr7Nb7Eg22twXPuzgDwsOhtdYKQ/edit#gid=894825456

And for the adventurous, a Hacker News thread,
https://news.ycombinator.com/item?id=13153031

--
Absolutely worth a read for those interested in the topic- even the comments, it's interesting to me to see so much good old marketing FUD being spouted by cloud fanboys. (Sounds just like Microsoft vs *NIX in the early 2000's).

And regarding their actual spec, it's pretty interesting too- straightforward, if not a bit too many bells and whistles going nuts with gear IMHO :) Even with the bells and whistles, their cost comes down to reasonable/sustainable levels- good stuff... After growing something as big as Gitlabs in AWS, I certainly guess they *deserve* to rock some bells and whistles.

Rocket-
.ike

From kmsujit at gmail.com Sat Dec 24 01:52:24 2016
From: kmsujit at gmail.com (Sujit K M)
Date: Sat, 24 Dec 2016 12:22:24 +0530
Subject: [talk] Gitlabs going post-cloud with Metal
In-Reply-To:
References:
Message-ID:

On Sat, Dec 24, 2016 at 6:33 AM, Isaac (.ike) Levy wrote:
> Hey all,
>
> So, a breath of fresh air:
>
> Gitlabs, (revenue for Github), is apparently leaving the cloud for metal.

I find Gitlabs to be on the wrong foot all the time. I think that is a part of their attitude. If you look at GitHub, I have a paid account and I don't think it is not worth it. So it is the way to look at it.

From edlinuxguru at gmail.com Mon Dec 26 10:02:09 2016
From: edlinuxguru at gmail.com (Edward Capriolo)
Date: Mon, 26 Dec 2016 10:02:09 -0500
Subject: [talk] Gitlabs going post-cloud with Metal
In-Reply-To:
References:
Message-ID:

On Fri, Dec 23, 2016 at 8:03 PM, Isaac (.ike) Levy wrote:

> Hey all,
>
> So, a breath of fresh air:
>
> Gitlabs, (revenue for Github), is apparently leaving the cloud for metal.
>
> Their open-ish community input process is pretty cool, and their numbers
> and projections line up *exactly* with what I've presented before- (one
> order of magnitude cost savings over 3 years, with radically increased
> computing/network horsepower).
>
> As if that isn't cool enough, NYI is one of the two datacenters they are
> considering, (w00t!)
>
> Their full breakdown and request for comments is here,
> https://about.gitlab.com/2016/12/11/proposed-server-purchase-for-gitlab-com/
>
> They posted a cost spreadsheet breakdown here,
> https://docs.google.com/spreadsheets/d/1XG9VXdDxNd8ipgPlEr7Nb7Eg22twXPuzgDwsOhtdYKQ/edit#gid=894825456
>
> And for the adventurous, a Hacker News thread,
> https://news.ycombinator.com/item?id=13153031
>
> --
> Absolutely worth a read for those interested in the topic- even the
> comments, it's interesting to me to see so much good old marketing FUD
> being spouted by cloud fanboys. (Sounds just like Microsoft vs *NIX in the
> early 2000's).
>
> And regarding their actual spec, it's pretty interesting too-
> straightforward, if not a bit too many bells and whistles going nuts with
> gear IMHO :) Even with the bells and whistles, their cost comes down to
> reasonable/sustainable levels- good stuff... After growing something as big
> as Gitlabs in AWS, I certainly guess they *deserve* to rock some bells and
> whistles.
>
> Rocket-
> .ike
>

One thing I have noticed is that in the recent past, vendors and developers were quick to cite the advantages of the cloud for the "cost of system admin". However, cloud providers have begun to target higher up the food chain. They are starting to produce software to replace vendor software. Suddenly the software vendors find themselves in a situation similar to "the hangman". They were more or less complacent initially but now find themselves in the cross hairs. They shrink into smaller corners of work behind the logic of "can't amazon do that better than we can?"

From pete at nomadlogic.org Tue Dec 27 13:03:37 2016
From: pete at nomadlogic.org (Pete Wright)
Date: Tue, 27 Dec 2016 10:03:37 -0800
Subject: [talk] Gitlabs going post-cloud with Metal
In-Reply-To:
References:
Message-ID: <9b8ba724-1712-0845-60a9-5518e67447f6@nomadlogic.org>

On 12/23/16 5:03 PM, Isaac (.ike) Levy wrote:
> Hey all,
>
> So, a breath of fresh air:
>
> Gitlabs, (revenue for Github), is apparently leaving the cloud for metal.
>
> Their open-ish community input process is pretty cool, and their numbers and projections line up *exactly* with what I've presented before- (one order of magnitude cost savings over 3 years, with radically increased computing/network horsepower).
>
> As if that isn't cool enough, NYI is one of the two datacenters they are considering, (w00t!)
>
> Their full breakdown and request for comments is here,
> https://about.gitlab.com/2016/12/11/proposed-server-purchase-for-gitlab-com/
>
> They posted a cost spreadsheet breakdown here,
> https://docs.google.com/spreadsheets/d/1XG9VXdDxNd8ipgPlEr7Nb7Eg22twXPuzgDwsOhtdYKQ/edit#gid=894825456
>
> And for the adventurous, a Hacker News thread,
> https://news.ycombinator.com/item?id=13153031
>
> --
> Absolutely worth a read for those interested in the topic- even the comments, it's interesting to me to see so much good old marketing FUD being spouted by cloud fanboys. (Sounds just like Microsoft vs *NIX in the early 2000's).
>
> And regarding their actual spec, it's pretty interesting too- straightforward, if not a bit too many bells and whistles going nuts with gear IMHO :) Even with the bells and whistles, their cost comes down to reasonable/sustainable levels- good stuff... After growing something as big as Gitlabs in AWS, I certainly guess they *deserve* to rock some bells and whistles.
>

Thanks .ike this is interesting. I just left a Github shop where we had a github enterprise VM; it was "ok" aside from the fact it was an ubuntu VM, so security updates were pretty disruptive.

at my new gig we are using gitlab and i'm actually pretty happy with it. one thing i've been able to do was use their native gitlab-runner for C.I. (and side-stepping a whole bunch of jenkins servers i don't have control over):

https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/blob/master/docs/install/freebsd.md

the runner is a golang program, and it "just works". having official builds from them is great though as it's allowing me to use this for my freebsd dev/prod infrastructure seamlessly, while not having to spend cycles porting their code.

...i should get off my ass and get my blog online and start documenting this stuff ;)

-pete

--
Pete Wright
pete at nomadlogic.org
nomadlogicLA