Techrights-sec2recent files are now hardlinked:Feb 22 03:55
Techrights-sec2$ stat --printf "%i\t%n\n" /home/gemini/gemini/tr_text_version/irc-log-techrights-210221.txt ~glr/tr_text_version/irc-log-techrights-210221.txtFeb 22 03:55
Techrights-sec2309272  /home/gemini/gemini/tr_text_version/irc-log-techrights-210221.txtFeb 22 03:55
Techrights-sec2309272  /home/glr/tr_text_version/irc-log-techrights-210221.txtFeb 22 03:55
Techrights-sec2but the old ones are not.  perhaps the directory should be wiped and theFeb 22 03:55
Techrights-sec2full 'copy' be re-run manually?Feb 22 03:55
schestowitz__I noticed the same when I checked after the cron job had runFeb 22 03:55
schestowitz__On a positive note, the gemini links were all fine, got it right the first time aroundFeb 22 03:55
schestowitz__Shall we wipe tr-archives with care not to also delete the originals? I dread losing something in case hard links exist somewhere?Feb 22 03:56
Techrights-sec2~gemini/gemini/tr_text_version/Feb 22 03:58
Techrights-sec2has the redundant files so if it is wiped and the copy script is then run,Feb 22 03:58
Techrights-sec2it will get populated by hardlinks.Feb 22 03:58
schestowitz__as long as that being done won't have any effect on the original files (depending on what was done prior to this, maybe tests included)Feb 22 03:58
schestowitz__the script is run with sudo by "pi", but I can run it manuallyFeb 22 03:59
Techrights-sec2okFeb 22 04:00
Techrights-sec2IF run while files exist in the target directory, no changes are madeFeb 22 04:00
Techrights-sec2the target has to be emptyFeb 22 04:00
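
The "only act on an empty target" behaviour described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual copy script: the function name `link_if_empty` and its arguments are assumptions, and hardlinks only work when source and target sit on the same filesystem.

```shell
#!/bin/sh
# Hypothetical sketch: fill DEST with hardlinks to the regular files in SRC,
# but only when DEST is empty. This is not the real copy script.
link_if_empty() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Refuse to touch a target that already holds anything.
    if [ -n "$(ls -A "$dest")" ]; then
        echo "target not empty, nothing done: $dest" >&2
        return 1
    fi
    # ln adds a second directory entry for the same inode; no data is copied.
    for f in "$src"/*; do
        [ -f "$f" ] && ln "$f" "$dest/"
    done
    return 0
}
```

Calling it a second time on the same target returns non-zero and changes nothing, which matches the "has to be empty" rule.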
schestowitz__Yeah, I just want to be sure the emptying, however done, won't drain out anything ipfs uses as its files pool, about 1,000 filesFeb 22 04:01
Techrights-sec2If ~gemini/gemini/tr_text_version/ is a copy, then nothing is lost.Feb 22 04:03
Techrights-sec2IF ~gemini/gemini/tr_text_version/ contains hardlinks, then onlyFeb 22 04:03
Techrights-sec2the directory entry gets removed, any other hardlinks to the originalFeb 22 04:03
Techrights-sec2including the original, remain.Feb 22 04:03
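
The point about hardlinks can be checked in a throwaway directory. A minimal demonstration, with made-up file names, assuming a filesystem that supports hardlinks:

```shell
#!/bin/sh
# Demonstration only; the directory and file names are invented.
demo=$(mktemp -d)
echo "log contents" > "$demo/original.txt"
ln "$demo/original.txt" "$demo/link.txt"   # second name for the same inode

ls -i "$demo"                              # both names list the same inode number

rm "$demo/link.txt"                        # removes only that directory entry
cat "$demo/original.txt"                   # the data is still reachable by the other name
```

The data blocks are freed only once the last remaining name is removed, so wiping a directory of hardlinks leaves the originals intact.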
schestowitz__do you have the privs to remove or to empty it safely? BTW, I think gemini is in sudoers alreadyFeb 22 04:04
Techrights-sec2yesFeb 22 04:10
Techrights-sec2all clearFeb 22 04:10
schestowitz__I will rerun now.... then count filesFeb 22 04:10
schestowitz__oh you beat me to itFeb 22 04:11
Techrights-sec2all copiedFeb 22 04:11
schestowitz__Maybe I will try to better partition the page to avoid it getting so giganticFeb 22 04:12
-TechrightsBN/ | Techrights Full IPFS IndexFeb 22 04:12
schestowitz__gemini@raspberrypi:~/gemini/not.tr_text_version $ ls -la | wc -lFeb 22 04:13
schestowitz__997Feb 22 04:13
schestowitz__gemini@raspberrypi:~/gemini/not.tr_text_version $ ls -la ../tr_text_version/ | wc -lFeb 22 04:13
schestowitz__1006Feb 22 04:13
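
Note that `ls -la | wc -l` also counts the "total" line and the `.` and `..` entries, so each figure above is three higher than the number of directory entries. A sketch of a stricter count, assuming only regular files directly inside the directory are of interest (the helper name `count_files` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: count regular files directly inside a directory.
count_files() {
    # -maxdepth 1 stays in the directory itself; -type f skips subdirectories.
    # File names containing newlines would be miscounted by wc -l.
    find "$1" -maxdepth 1 -type f | wc -l
}
```

For example, `count_files ../tr_text_version` and `count_files .` could be compared directly.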
schestowitz__regenerating the index to test linking to irc and bulletinsFeb 22 04:14
schestowitz__the linking will save us hundreds of MBs over time, and seeing we have almost 4GB spare and almost all the capsule is now in place (I think 2020 misses some bits) we should be OK for at least a yearFeb 22 04:15
schestowitz__ipfs does not scale well when the number of files grows and the operations (e.g. "add")  grow linearly in duration of runFeb 22 04:16
Techrights-sec2yes there are months before there is an issue again regarding spaceFeb 22 04:17
Techrights-sec2is there a way to subdivide the IPFS collection so that the Feb 22 04:19
Techrights-sec2time and other resources it requires remain manageable?Feb 22 04:19
schestowitz__I asked some months ago and do not recall the exact answer, but it's not a dead endFeb 22 04:19
schestowitz__The linkage to the new linked objects seems to work correctly, I've tested about 20 semi-randomlyFeb 22 04:22
Techrights-sec2I've put the 'copy' script in /usr/local/sbin/copy-tr-to-gemini.shFeb 22 04:23
Techrights-sec2it can be called from cron from thereFeb 22 04:23
schestowitz__diff says a comment line is the only difference to the original in the homedir for ipfsFeb 22 04:24
schestowitz__to avoid confusion of conflicting changes I've set pi crontab to point to the shared location and imported that file for backup purposesFeb 22 04:26
Techrights-sec2yes the clarification seems useful.  Feb 22 04:26
schestowitz__Maybe I should split into 5 subpages?Feb 22 04:29
Techrights-sec2At least two pages, perhaps make a new one each quarter?Feb 22 04:30
schestowitz__working on it now, should not take long, testing will take longer and over the long run...Feb 22 04:35
Techrights-sec2Is it the number of files that bothers IPFS or the combined size?Feb 22 04:36
schestowitz__mostly size because it goes about scanning them for any changes, I believe (hashing)Feb 22 04:36
schestowitz__OK, seems to be properly split up nowFeb 22 05:24
schestowitz__do we still need gemini/not.tr_text_version for anything or should I move it to /tmp as tentative for deletion (if that partition... maybe not even its own... is large enough?)Feb 22 05:25
schestowitz__OK, you did that alreadyFeb 22 05:28
*rianne has quit (Quit: Konversation terminated!)Feb 22 08:08
Techrights-sec22007 is in placeFeb 22 10:09
Techrights-sec22006 is also in placeFeb 22 10:09
Techrights-sec2I will redo 2021 later in the day, but for now 2006 through 2015 all haveFeb 22 10:09
Techrights-sec2the latest conversion process.  Let me know if you spot any major roomFeb 22 10:09
Techrights-sec2for improvement.Feb 22 10:09
Techrights-sec22021 is now improvedFeb 22 10:09
Techrights-sec2there is a local tarball of the articles: gemini-pages.2006-2021.tar.gzFeb 22 10:09
Techrights-sec22016 - 2020 still have the old style block quotes however.Feb 22 10:09
schestowitz__Excellent, maybe we can announce this later today. I keep struggling in recommending a client/browser as many are jailed in **ithubFeb 22 10:10
Techrights-sec2I have not explored the  clients, amfora was the first one that worked.Feb 22 10:11
Techrights-sec2None are in any repositories for convenient (and safe) download and Feb 22 10:11
Techrights-sec2automated maintenance.Feb 22 10:11
Techrights-sec2I sent the start URL for 2006 to GUS today.  Hopefully the old pages willFeb 22 10:15
Techrights-sec2get indexed. Feb 22 10:15
Techrights-sec2$ find /home/gemini/gemini/2* -mindepth 3 -type f -name '*.gmi' -print | wc -lFeb 22 10:15
Techrights-sec232458Feb 22 10:15
Techrights-sec2just over 32k articlesFeb 22 10:15
schestowitz__Biggest capsule in a matter of less than a fortnight :-)Feb 22 10:15
Techrights-sec2yes, it took a week to write the conversion code (about 1 week at 1 FTE)Feb 22 10:16
Techrights-sec2thereafter it took a week of waiting for the downloads (about 2 hours at 1 FTE)Feb 22 10:16
schestowitz__Once it's done it's done as I very rarely change anything old (by rarely I mean almost never)Feb 22 10:17
Techrights-sec2the scripts are in our Git repositoryFeb 22 10:18
Techrights-sec2and mirrored on the RPiFeb 22 10:18
Techrights-sec2I figure the old articles will remain static, but the scripts are thereFeb 22 10:18
Techrights-sec2in case anything needs updating.Feb 22 10:18
Techrights-sec2If the layout / structure changes, then the parser will need adjustment.Feb 22 10:18
schestowitz__Maybe they can be generalised to make a toolset of wordpress->gmi conversions. Can help 'recruit' many more sites for the space...Feb 22 10:19
Techrights-sec2They're too specific to be of much use outside of TR.  However theyFeb 22 10:22
Techrights-sec2can serve as examples and the approximate workflow might be of use to manyFeb 22 10:22
Techrights-sec2others.  Feb 22 10:22
Techrights-sec2The wordpress part can contain too much variation, it is all custom HTML there.Feb 22 10:22
Techrights-sec2Fortunately you have been very consistent in use of HTML within the articlesFeb 22 10:22
Techrights-sec2so it was possible to parse.  The daily links needed their own subroutine, butFeb 22 10:22
Techrights-sec2everything else seems to fit into the same set of rules.Feb 22 10:22
schestowitz__Maybe I can do some blog posts explaining various aspects of the conversion of the code, if I can comprehend Perl well enough  (I have not tried). This way people can search and find useful code files or at least code samples they can reuse.Feb 22 10:24
Techrights-sec2gemini-scripts-README.txt has the internal write upFeb 22 10:24
[...] can get a year at time of the back articlesFeb 22 10:24
schestowitz__Most people's wordpress sites are vastly smaller, so we sort of stress-tested it 'at scale', I suppose...Feb 22 10:25
Techrights-sec2The scale worked in our favor and made it more worth it to script.Feb 22 11:12
Techrights-sec2The initial scripts took only maybe one day (at 1 FTE) but then tweaksFeb 22 11:12
Techrights-sec2etc and working with Git for the first time added to that.Feb 22 11:12
Techrights-sec2The scripts can be shown but our Git repository has not been put into Feb 22 11:12
Techrights-sec2the HTTP server yet.  I have not read up on that yet, and wonder about Feb 22 11:12
Techrights-sec2a lot of the features.  I was talking with my xxxxxxxxxxxxxxxxx about Git recently Feb 22 11:12
Techrights-sec2and got a lot of tips but now have to learn.Feb 22 11:12
Techrights-sec2As for the scripts, if people have been consistent in their document structure,Feb 22 11:12
Techrights-sec2then the XPath approach will work for them too.Feb 22 11:12
schestowitz__Thanks for all the hard work. I think we need to do what we can to give back to gemini and help it grow.  Later on I'll examine the code to see if I personally can make sense of it, though I suspect publishing anything about it must be done after git goes public. vis a vis git, mind the ongoing TR series about github etc.Feb 22 11:14
Techrights-sec2No problem.  Gemini is a worthy project.  So contributing a decent capsuleFeb 22 11:15
Techrights-sec2helps.  I am writing up some notes for the Capsule.  Feb 22 11:15
schestowitz__Maybe at a later point I will convert my personal blog (>2000 posts) to gemini as it is also wordpress and the domain is managed similarly. I can try it on the pi and use another server software to test searching.Feb 22 11:16
schestowitz__thought: if we contract the right people and the code is well documented, gemini core people and pages will point to our repo and increase use among wordpress users (there's one for hugo that I saw)Feb 22 11:26
Techrights-sec2We should have some internal review of the scripts first, to ensure that Feb 22 11:28
Techrights-sec2nothing obvious is wrong.  Feb 22 11:28
schestowitz__MS-PL licence, definitely!Feb 22 11:28
Techrights-sec2Should the scripts be AGPL?Feb 22 11:28
Techrights-sec2gemini:// 22 11:51
Techrights-sec2I suppose it is ready for internal review.  Comments on where comments are needed in the code are important now, as would beFeb 22 11:51
Techrights-sec2comments on what to change or redact.Feb 22 11:51
Techrights-sec2back in a bitFeb 22 11:51
schestowitz__I have just read the whole page and did not spot typos. It is written concisely and clearly.Feb 22 12:01
schestowitz__about 1000 requests so far this morningFeb 22 12:05
schestowitz__(though I cannot distinguish what types, I just check for communications in/out, over port 1965)Feb 22 12:06
Techrights-sec2 22 16:58
-TechrightsBN/ | ~hsanjuan/gemini-ipfs-gateway - sourcehut gitFeb 22 16:58
schestowitz__interesting!Feb 22 16:58
schestowitz__so we could serve over ipfs what we already have in the "tr" directory anywayFeb 22 16:59
schestowitz__so we could serve over ipfs and gemini what we already have in the "tr" directory anyway (for both ipfs and gemini)Feb 22 17:00
Techrights-sec2yes, but would it be of use for TR?Feb 22 17:00
Techrights-sec2I think so.Feb 22 17:01
Techrights-sec2I'm not up on IPFS thoughFeb 22 17:01
schestowitz__They solve very different problemsFeb 22 17:01
schestowitz__I struggle to think of a practical real-world scenario where you want to combine bothFeb 22 17:03
Techrights-sec2The bulletins are already mirrored via the filesystem too.Feb 22 17:06
schestowitz__yes, and if both the pi dies and the server get seized or something, ipfs will still be able to serve a copy (not that such a scenario ought to ever arise)Feb 22 17:07
schestowitz__I think of it as deterrent (against SLAPP or takedown demands with deadline)Feb 22 17:08
schestowitz__It's easier for them when there's a third party like Google that buckles for its own business reasons, or even Twitter without incentive to fight for youFeb 22 17:08
*schestowitz__ has quit (Quit: Konversation term)Feb 22 17:09
-NickServ-schestowitz__! has just authenticated as you (schestowitz)Feb 22 17:09
*schestowitz__ (~schestowi@unaffiliated/schestowitz) has joined #boycottnovellFeb 22 17:09
*ChanServ gives channel operator status to schestowitz__Feb 22 17:09
Techrights-sec2$new_status%7D_%7B$post-%3Epost_type%7D                                      Feb 22 19:27
Techrights-sec2can that be used to have WordPress trigger updates in Gemini and IPFS?  Feb 22 19:27
-TechrightsBN/ | {$new_status}_{$post->post_type} Wordpress hook details -- Adam Brown, BYU Political ScienceFeb 22 19:27
schestowitz__I am not sure; wasting lots of time fighting off  a major ddos attack on tm at the moment :(Feb 22 19:28
Techrights-sec2I notice that there have been TM outages    Feb 22 19:33
Techrights-sec2Is there anything that can be done upstream to mitigate the attacks?Feb 22 19:33
schestowitz__not serve CSS files, but then the attack pattern would just shiftFeb 22 19:33
Techrights-sec2Can a Varnish cache be placed in front?Feb 22 19:34
schestowitz__that might help only to some degree, depending on patterns. 20k reqs per 30 sec is still a lotFeb 22 19:35
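
To put the quoted figure in per-second terms, here is a small calculation using only the two numbers from the message above (the variable names are illustrative):

```shell
#!/bin/sh
# Figures as quoted in the chat: 20k requests per 30-second window.
reqs=20000
window=30
# awk handles the division and rounds to the nearest whole request.
rate=$(awk -v r="$reqs" -v w="$window" 'BEGIN { printf "%.0f", r / w }')
echo "$rate requests per second"
```

That works out to roughly 667 requests per second sustained.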

