Did I wreck the site? Was not my intention.
10-03-2017, 11:37 PM (This post was last modified: 10-04-2017 12:24 AM by Don Shepherd.)
Post: #6
RE: Did I wreck the site? Was not my intention.
(10-03-2017 10:52 PM)pier4r Wrote:  
(10-03-2017 10:17 PM)rprosperi Wrote:  Also, I can't help but notice that keeping it as-is, is a natural obstacle for people trying to grab the entire site - as you discovered. I've no idea at all if this is part of the reason for keeping it as-is, but it seems a useful side-effect.

That's fine, but not if the site collapses. A 503 server error usually means that the process rendering the pages (likely Perl) has crashed or is overloaded; if the whole site goes down, it means memory or the process limit has been exhausted.

One can still enforce limits on the web server itself while serving static pages. That certainly saves RAM, CPU and maintenance.
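A minimal sketch of what such a per-client limit could look like, assuming a token-bucket scheme. The class name `TokenBucket` and its parameters are hypothetical, and a real web server would normally use its own built-in rate-limiting facility rather than hand-written code like this:

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum number of stored tokens
        self.clock = clock          # injectable clock, handy for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be refused."""
        now = self.clock()
        # Refill according to the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A refused request would typically be answered with a cheap error response instead of being passed to the page renderer, so an aggressive client costs the server very little.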

Also, just to clarify:

1st attempt (bandwidth limited) failed after 2 minutes - site still fine
2nd attempt (5-second wait between requests) failed after ca. 10 minutes - site still fine
3rd attempt (10-second wait) failed after ca. 30 minutes - site still fine
4th attempt (20-second wait) failed after ca. 45 minutes - site fine except for the archive I was trying to download
5th attempt (30-second wait) failed after ca. 60 minutes - site down

Now, although scraping a site may not be that nice, a single request (just one connection) every 30 seconds is next to nothing, and if that is enough to bring down the site, something is misconfigured somewhere. In any case I would avoid it, wget or not.
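For illustration, a throttled single-connection download of the kind described above could be sketched as follows. This is not the actual wget invocation that was used; the function names `fetch_status` and `polite_download` are hypothetical, and stopping at the first 5xx response is an assumption about what a considerate downloader should do:

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=30):
    """Perform one GET request and return (status_code, body_bytes)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTP error responses such as 404 or 503 end up here.
        return err.code, b""


def polite_download(urls, delay=30.0, fetch=fetch_status, sleep=time.sleep):
    """Fetch URLs one at a time, waiting `delay` seconds between requests.

    Returns (pages, failed_url). Stops at the first 5xx response so that a
    struggling server is not pushed any further.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # only one request in flight, with a fixed pause
        status, body = fetch(url)
        if status >= 500:
            return pages, url
        if status == 200:
            pages[url] = body
    return pages, None
```

With a 30-second delay this comes to at most 120 requests per hour.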

So I say: all the more reason to go for static pages, especially if those pages are already on the USB/DVD (so the work is already done).

As for the p2p part: sure, in theory one should encourage those purchases, I agree. The point is not to avoid purchasing something, but to make the content resilient by having a shared backup. I have learned that digital content, especially niche content, disappears too quickly if one takes it for granted.

edit: and yes, I should have asked for permission. I simply assumed there would be no problem, because I knew I was going to throttle the download, and a single downloader requesting pages very slowly is normally handled fine (as I already did for the c2.wiki).

I could not get on the site earlier today. If whatever you were doing caused that, cut it out. Buy the museum thumb drive, like Bob and Massimo suggested.




