Theo Todman's Web Page - Notes Pages
Status: Web-Tools (2020 - September)
(Text as at 04/10/2020 00:27:22)
(For earlier versions of this Note, see the table at the end)
Rationale for this Project
- This Project was alluded to briefly in a footnote on research methodology in my original Research Proposal1 under the head Research - Internet Technology2. When last at Birkbeck, I wrote a more extensive paper3 defending the Project and describing its rationale. Now that my PhD is in suspense, I have decided to take this Project further. There’s a lot to do: still quite a few items on the “wish list”. It is fairly critical as an enabler for my research, so I need to get a move on as I want it all out of the way before I re-start4 formal research.
- For documentation on my website (currently password protected) follow the links below:-
- Functional5 Documentation6.
- Technical7 Documentation.
- Other Websites8
- I’ve created and continue to maintain a small website for a music group Julie and I attend – the Enigma Ensemble.
- I established the Hutton Bridge Club Website in 11Q4 using the standard Bridgewebs service, but with a couple of competitions using my own routines. This was handed over in 15Q3, but I’ve taken it on again as of March 2020, not that there’s currently much to do. It needs a spring clean, but I’m waiting to see whether the club (and its members) survive the coronavirus pandemic.
- In 16Q3 I created the Mountnessing Bridge Club Archive website, using the vast bulk of the pages from their legacy site, as the club had moved to Bridgewebs and lost its historical data. As of March 2020 I’ve taken over the aforementioned Mountnessing Bridge Club website itself.
- Sometime around 2005, I created a website for Dr. Sophie Botros, one of my supervisors at Birkbeck, but we then lost touch and it got maintained (very badly) by some desktop support outfit. In 15Q2 I took it back on again and spruced it up a bit, and maintained it periodically until 19Q3, when it was taken on by a professional outfit, Bookswarm. The “Sophie Botros” link in this bullet is now to their version of the site.
- I created and / or ran a multitude of other bridge websites, but as of January 2018 I have either handed them over or mothballed them9:-
- In 15Q1, I took over the support and development of the Essex Contract Bridge Association (ECBA) website, which also uses Bridgewebs, but is very much larger. I wrote a lot of code10 to make this job less tedious. The site was handed over in 17Q4.
- For several years, I collected data11 on bridge activity in the Billericay/Brentwood area (initially needed for a project to set up a new consolidated club) by “scraping” data off web pages, consolidating it into a database and modelling it in various ways.
- I used this data to generate websites with a multitude of ladders for small clubs (Essex Bridge Results). These are now mothballed.
- I created and maintained a new website for the First Class Bridge Academy, giving it “small clubs” ladders (Bernie's Ladders Archive) as these were easy to maintain with little intervention.
- In 16Q3 I created the Mountnessing Bridge Club Archive website, using the vast bulk of the pages from their legacy site, as the club had moved to Bridgewebs and lost its historical data.
- I created a website12 for displaying the textual and grammatical analyses and appendices of Pete’s PhD on the Acts of the Apostles: Acts: Test Site.
Summary of Progress during July - September 2020
Website (Total Hours = 140.75)
- I spent 143.75 hours in 20Q3 on this Project, or related work (476.25 hours YTD, where by "YTD" (Year to Date) I mean the academic year that commenced in October 2019). That's 110.3% of the planned effort (114.2% YTD). Overall, 20.6% of my Project effort in the Quarter was directed towards this project (making 17.1% YTD) - as against 18.5% planned (14.1% YTD).
- In 20Q3 I made quite a lot of further progress over a wide area of highly relevant items, as listed below, with a particular emphasis on Auto_Reference_Notes.
- Consequently, I overspent my time budget by 10%. Time well spent, though.
- Completed items included:-
- Own Website:
- Created Cross_Reference_Changes_Prune to keep Cross_Reference_Changes under control pending completion of the Xref project.
- BookCitings references not sequenced correctly. Corrected.
- BooksToNotes Link Pages: Re-engineered - used CreatePapersToNotesWebPages to generate pages
- As revealed by Spider: BooksToNotes and PapersToNotes links failing (45 items). Pages not being regenerated due to wrong joins in Book_Note_Link_List and Paper_Note_Link_List.
- CreateBookPaperAbstractsWebPages not translating +RR+ References for Notes Write-ups of associated Papers
- Added option in Auto_Reference_Notes to only confirm new items (leaving previously-flagged items untouched)
- Added Aeon Abstract (ie. Footnote) link to Aeon Webref items in Summary task List reports for 'read' items
- Determined why the run of Auto_Reference_Notes_Regen took 2.5 hours on 02/09/2020 (and climbing). It was down to the Aeon Note13 being searched; in particular Check_If_In_Container, checking for items in Footnotes and / or Quoted Text. As this Note is regenerated, there's no point auto-linking it. Time reduced to 23 minutes once this Note - and other Notes of inappropriate Note Groups - was removed.
- Corrected Issues with Nested Functors: Added Functor_ID to End-tag
- Added option in Auto_Reference_Notes to - for Live (ie. not TEMP) Notes - update the latest Archived Note to equal the updated Live Note, hence avoiding superfluous archives.
- Upgraded Auto_Reference_Notes to avoid key-words within +AA+-style Authors names (using Check_If_In_Container).
- Upgraded Auto_Reference_Notes to allow a parameterised 'termination or continue' for very long update runs
- PaperstoNotes Link Pages: Re-engineered and corrected CreatePapersToNotesWebPages pages
- Determined why Recalculation (cmdRecalculate_Click) took so long: it varied depending on how many Notes were regenerated, but seemed to take 17.5 minutes even when there were none. I think it was down to a problem with nested Functors leading to Notes growing inordinately: in particular the development logs. It now takes around 5 minutes.
- The size of the main database was bloating to over 1.5Gb during the spider run, so was approaching the 2Gb limit.
- Used Check_Database_Size, with a parameter, to monitor the size of the main database and added a message similar to those reporting the compact / repair of the Slave database each time a status message is printed.
- Based on the diagnostics produced, it was found that the bloating occurred at two stages:-
→ In Spider_Scurry: during the creation of Raw_Links_Temp_Temp.
→ In Spider_Copy: during the creation of Raw_Links_Temp.
→ Put check in Check_Database_Size to STOP if the database is over 1.5 Gb (parameterised via Max_Database_Size).
→ Moved Raw_Links_Temp_Temp to the Slave database.
- System Resources Exceeded - "Run Time Error 3035"; query Full_Link_Same_Directory_Updt in Spider_Copy. Fixed by using SQL rather than a query.
- Quarterly Project Reports: Corrected Functor_08. The Project Planned YTD % kept having to be bodged!
- Added Weekly Project Plans to Priority Task List
- Created web-page (using Functor_21) showing oboe practice hours by work played
- Reformatted WebLinks_Tester.htm, WebLinks_Tester_Map.htm, WebLinks_Tester_Full.htm & WebLinks_Tester_Full_Map.htm: The 'As Above' lines wasted space, so they are now consolidated onto a single second line.
- Reformatted WebLinks_Tester_Brief, WebLinks_Tester.htm, WebLinks_Tester_Map.htm, WebLinks_Tester_Full.htm & WebLinks_Tester_Full_Map.htm: Allowed more space for 'link returned', 'issue' and 'display text' and added Explanation column
- Improved WebRefs checker (Webrefs_Update) further to check for Error 403 "Forbidden". This involved finding a way of checking pdfs where the returned page is in fact HTML or XML (see DevLog Ref 379).
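The 403 check and the PDF-versus-HTML check described above can be illustrated with a minimal sketch. It is written in Python rather than in the MS Access environment the generator runs in, and every name in it (classify_link and the issue strings) is hypothetical, not taken from Webrefs_Update. It assumes the HTTP status code and the first few bytes of the response body have already been fetched by some other means.

```python
from typing import Optional


def classify_link(url: str, status: int, body_start: bytes) -> Optional[str]:
    """Return a short issue description, or None if the link looks fine.

    Hypothetical helper: the issue strings are illustrative only.
    """
    if status == 403:
        return "403 Forbidden"
    if status == 404:
        return "404 Not Found"
    if status >= 400:
        return f"HTTP error {status}"
    # A genuine PDF file starts with the bytes "%PDF". If a ".pdf" URL
    # returns anything else, the server has probably substituted an
    # HTML or XML page (for example an error or log-in page).
    path = url.lower().split("?")[0]
    if path.endswith(".pdf") and not body_start.lstrip().startswith(b"%PDF"):
        return "PDF link returned non-PDF content"
    return None


if __name__ == "__main__":
    print(classify_link("http://example.com/paper.pdf", 200, b"<!DOCTYPE html>"))
    print(classify_link("http://example.com/paper.pdf", 200, b"%PDF-1.4"))
```

Checking the leading bytes rather than trusting the file extension is the design choice here: it catches the case where a server answers "200 OK" for a pdf URL but actually delivers a web page.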
- Other Websites:
- Full details for 20Q3 are given below14:-
Website Others (Total Hours = 3)
- Website - Development (Total Hours = 108.75)
- Todman (Theo) - Tottering Towers & Listing Buildings: Incorporate RICS involvement (1.25 hours)
→ See "Todman (Theo) - Tottering Towers & Listing Buildings" (1.25 hours)
- Website - Generator - Page for Oboe practice15 details (2.75 hours)
- Website - Generator - CreateBookPaperAbstractsWebPages not translating +RR+ References for Notes Write-ups of associated Papers (0.25 hours)
- Website - Generator - Regen_Note_Links failing for Note 117016 (1.25 hours)
- Website - Generator - Webrefs_Update failing because IE loops with Aeon / Psyche pages (3.25 hours)
- Website - Generator - Add Aeon Abstract link to Aeon Webref items in Summary task List reports (1 hour)
- Website - Generator - Add Weekly Project Plans to Priority Task List (0.75 hours)
- Website - Generator - As revealed by Spider: BooksToNotes and PapersToNotes links failing (1 hour)
- Website - Generator - Auto-Reference Notes - Review & Update Documentation & Processing (28.5 hours)
- Website - Generator - BookCitings references not sequenced correctly (1 hour)
- Website - Generator - Correct Functor_08. The Project Planned YTD % keeps having to be bodged! (1 hour)
- Website - Generator - Correct Paper_Note_Link_List_New for CreatePapersToNotesWebPages pages (12.5 hours)
- Website - Generator - Correct WebRef page links from WebLinks_Test pages (add 'Off-Page_Link_') (0.5 hours)
- Website - Generator - Document Auto-Referencing Notes (7.5 hours)
- Website - Generator - Document Cross-Referencing (4 hours)
- Website - Generator - Document Cross-Referencing: Create Cross_Reference_Changes_Prune (1 hour)
- Website - Generator - Document Spider (0.25 hours)
- Website - Generator - Highlight Archived Notes as not the latest (0.5 hours)
- Website - Generator - Improve WebLinks Tester suite of pages (17.5 hours)
- Website - Generator - Investigate & Fix WebRefs_Update checker for 404 check not working (6 hours)
- Website - Generator - Issues with Nested Functors (3.5 hours)
- Website - Generator - Revise Cross-Referencing Documentation (1.25 hours)
- Website - Generator - Site Map: Document17 and update (0.25 hours)
- Website - Generator - Spider - Monitor performance & Main database size (4.75 hours)
- Website - Generator - Spider - System Resources Exceeded - "Run Time Error 3035"; query "Full_Link_Same_Directory_Updt" (1.5 hours)
- Website - Generator - Update Development Log in the light of recent activities (3.25 hours)
- Website - Generator - Wrote ZapFiles - to clear out WebLinks Tester, Documentation and other page-sets prior to regeneration (2.5 hours)
→ See "Software Development - Website - Development" (107.5 hours)
- Website - Education (Total Hours = 3)
- Website - Infrastructure (Total Hours = 9.25)
- Buy & Commission replacement iPhone protective case (0.25 hours)
- Buy & Commission replacement Mouse (0.75 hours)
- EE Broadband - Router & Line Issues (4 hours)
- Microsoft Windows 10 / MS Office - Releases, Bugs & Periodic Re-boots (2.25 hours)
- PC Backups / OneDrive (0.75 hours)
- Printer - Problems with USB Drivers (0.25 hours)
- Try out old TV as Monitor (1 hour)
→ See "Admin - Website - Admin & Maintenance" (9.25 hours)
- Website - Maintenance (Total Hours = 19.75)
- 20Q2 Status Reports (2.75 hours)
- Website - Generator - WebRefs - Manual / Automatic URL Checks & Fixes (11 hours)
- Website - Periodic Full Regeneration (3.5 hours)
- Website - Run Web Spider (2.5 hours)
→ See "Admin - Website - Admin & Maintenance" (19.75 hours)
- Website Others - Hutton DBC Maintenance
- Website Others - Mountnessing DBC Maintenance
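As a cross-check on the figures in this section, the category totals can be added up in a few lines of Python. The hours are copied from the report above; the "implied planned hours" figure is my own inference from the quoted 110.3%, not a number stated in the report, and the variable names are purely illustrative.

```python
# Category totals for 20Q3, copied from the report above (hours).
website_hours = {
    "Development": 108.75,
    "Education": 3.0,
    "Infrastructure": 9.25,
    "Maintenance": 19.75,
}
website_others_hours = 3.0

# "Website" total, then the Project total including "Website Others".
website_total = sum(website_hours.values())
project_total = website_total + website_others_hours

# The Project total is reported as 110.3% of plan, so the implied plan
# is actual / 1.103 (an inference, not a figure given in the report).
implied_planned_hours = project_total / 1.103

print(f"Website total: {website_total}")
print(f"Project total: {project_total}")
print(f"Implied planned hours: {implied_planned_hours:.1f}")
```

The two totals reproduce the 140.75 hours shown against "Website" and the 143.75 hours quoted for the Project as a whole.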
Plans for the Near Future
The Plan below is taken automatically from the Priority 1 items on my Development Log, as published in my Outstanding Developments18 Report. I’ve maintained the weekly allocation at 10 hours. This is to allow further work on my Cross-Referencing project.
- Own Website: Priority 1 Items By Category:-
- Compact and Repair Problems
- On compacting and repairing my main database I sometimes get the error "The query cannot be completed. Either the size of the query result is larger than the maximum size of a database (2 GB), or there is not enough temporary storage space on the disk to store the query result".
- It happens 3 times while the database is re-opening.
- There is lots of space, and the database is only 600Mb (and the error started when it was under 500Mb).
- This mostly happens after I've run long processes, so I usually close the database, re-open it and then try the compact and repair. Usually this works, but not always. But I then try again and the message disappears.
- I strongly suspect that this is MS Access itself re-indexing tables, and blowing up a temporary database, but I can’t find any evidence for this on-line. Nor can I find any help, other than suggestions to split databases and do other sensible things. That the error occurs when the database is re-opening, with no temporary file visible, is very strange.
- 17/04/20 - set MaxLocksPerFile to 1,000,000 (from the default 9,500). Sadly, it doesn't seem to have made a difference.
- Complete XRef-re-engineering project:-
- Ensure all links and link-pages use the new XRef table, and pension off the old tables.
- Look into writing out specific object-identifiers, and linking thereto for Citations, rather than paragraph references. An issue is multiple instances of the same object in a document.
- Check all link-types still work and fix any errors.
- Complete the auto-triggering of regeneration of “associated” link pages.
- Fix update bug in Convert_Webrefs.
- Fix Bug whereby PaperSummary pages seem to have “Works-” and “Books/Papers-” Citings that refer to the same link-pages.
- Document the process!
- Document19, Repair & update my Website site-map
- Review effectiveness of hyperlinking method in the light of PhD and Philosophy of Religion experience.
- Where possible, use ID rather than NAME for in-page hyperlinks
- Investigate Record-count discrepancies:-
- How do website files work as far as counts are concerned?
- Why aren't they recorded in Backup_History, nor the fact that the website was backed up?
- Different counts depending on whether new or old laptop is backed up. Investigate 63k discrepancy - lower on new laptop.
- Review architecture to improve performance; Need to document first
- Further improve the time to regenerate Book Summaries. Now takes about 27 minutes, but should be under 5 minutes!
- Change CreateBookPaperAbstractsWebPages so that - while a full re-gen uses the new method - re-gen for a particular book uses the old method (without the materialised view). This is so cmdRecalculate_Click doesn't need to be run beforehand.
- Investigate whether multiple Subject/Topic/Subtopic usage leads anywhere (ie. is just the first (of 3) actually used?). Fix anything amiss.
- Reformat the BookCitings and PaperCitings pages:-
- Detail PaperCitings Pages: Include only useful information on the detail pages; but if there are multiple links from the same object, include them on the same line as 'extra links' as in BookCitings (copy the code: or, better, combine the two subs).
- Summary (Author Letter) pages: Include counts (as in Authors' Citations).
- Ensure these pages use the Cross_Reference table.
- Develop auto-reconciliation routines vs EBU results download
- Investigate the error reports from the Documenter, especially unused variables & queries.
- Provide Functional Documentation for Website Generator (using Notes)
- "Sitepoint (Learnable) - Sitepoint Learnable Web Development Courses": Membership cancelled, but plan what to do with the eBooks in my possession.
- Read "PC Pro - Computing in the Real World".
- Read "White (Ron) & Downs (Timothy Edward) - How Computers Work: The Evolution of Technology".
- iCloud for Windows: Re-install & solve 'The upload folder for iCloud Photos is missing' problem. Try on new Laptop.
- Add "Note Alternates" to Note pages.
- Add option in Auto_Reference_Notes to allow an updating run restricted to 'Read' Books / Papers only (useful for very long lists)
- Add option in Auto_Reference_Notes to automatically ignore words containing certain strings that include the key-word (eg. ignore 'grace' and 'trace' when indexing 'race')
- Allow the option to concatenate Notes in the Printed version (ie. linearly embed them essay-style), rather than treating the hyperlinks as footnotes – but still keep the hyperlink & cross-referencing in place.
- For use as "disclaimers" - eg. for "Plug Notes".
- For Thesis / essays: the difficulty here is the need for linking passages to make the text run smoothly.
- As part of the Cross-Referencing project, check out the consistent treatment of Note 87520, which should be universally ignored. Recently, links to it appeared on Book-Summaries, Book_Paper_Abstracts and Note_Book_Links, as a Note referencing a Book. The critical item was a row on the Note_Book_Links table.
- Determine why very long printable notes (eg. Level 3+ for Note 17021) are being truncated. Probably suppress them in any case, as they take far too long to load.
- Enhance Functors to work for selected non-temp Notes so that up-to-date stats can be incorporated. This is complex as I want to avoid production of an Archived Note each time a non-Temp Note is regenerated. I also want to ensure that Notes whose variable text consists entirely of Functors get archived correctly (which they might not if I removed Functor-generated text from Notes before saving them to the Notes table).
- Fix bugs in multi-level footnoting in Printable Notes – the referencing is going wrong.
- Investigate Note_Links: Section references seem to be incorrect
- Printable Notes: fix the bug whereby the “private” flag is round the wrong way.
- Split Aeon Page22 into multiple sub-pages (either by topic or by priority)
- Suppress the publication of the Printable versions of Temp Notes
- Upgrade Auto_Reference_Notes to reference Sub-Notes: Currently only affects one note - Somerset Maugham - so not yet urgent
- Upgrade Auto_Reference_Notes to save Notes_To_Regen to a new table prior to the run, add new rows to this table, and copy it back after the run. Finally, allow the option of regenerating these Notes. In the interim, use Notes_To_Regen to create Note 87423, then clear it before the copy-back.
- Upgraded Auto_Reference_Notes to log its actions to a new table (Auto_Reference_Notes_Actions) so that any errors can be investigated and improvements made.
- The monthly regeneration process for Paper Abstracts was still taking just over 5 hours. The problem is with Cross_Reference_Deletions and Cross_Reference_Additions. I thought it could not be fixed until the cross-referencing project is fully complete and documented. However, it spontaneously improved to 1.6 hours in the August 2020 run. Monitor!
- Develop software & procedure to make adding more content to the photos pages easier to undertake.
- Timeline software: Add photos for Holidays & Family History
- Determine why Recalculation & Changed Book/Papers produce unneeded regeneration.
- Analyse the results of the data collection exercise and design a plan of campaign to fix broken Internal links and prevent recurrence.
- Correct the code so the problems discovered by the Spider don’t recur.
- Delete 'orphan pages' that are never linked to, ie. Use the Spider to prune redundant pages24 automatically where possible.
- Fix the historical data where errors are uncovered by the Spider. An easier task now the site has a full-regen function.
- Look into Sistrix Smart25. Errors and warnings itemised are:-
- Duplicate content: seems to be variants on theotodman.com
- Title Tags: Empty, too long, identical
- Page Not Found
- Filesize in excess of 1Mb
- Meta-Description: Empty
- Few words on Page
- H1: Not used, used multiple times per page, identical across pages
- Pictures: Alt attribute missing
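Several of the items Sistrix itemises (empty title tags, H1 usage, empty meta-descriptions, missing alt attributes) can be tested for locally. The following is a minimal sketch using only the Python standard library; the class and function names and the wording of the issue strings are hypothetical, and it makes no claim to reproduce how Sistrix itself performs these checks.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect the facts needed for a few of the checks listed above."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.images_missing_alt = 0
        self.has_meta_description = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and "alt" not in attr:
            self.images_missing_alt += 1
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            # Count the description only if it has some content.
            self.has_meta_description = bool((attr.get("content") or "").strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html: str) -> list:
    """Return a list of issue strings for one page (wording is illustrative)."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    issues = []
    if not parser.title.strip():
        issues.append("Title tag: empty")
    if parser.h1_count == 0:
        issues.append("H1: not used")
    elif parser.h1_count > 1:
        issues.append("H1: used multiple times")
    if not parser.has_meta_description:
        issues.append("Meta-description: empty")
    if parser.images_missing_alt:
        issues.append(f"Pictures: alt attribute missing ({parser.images_missing_alt})")
    return issues
```

Run over every generated page, something like this would not be subject to the 1,000-page restriction of the free Sistrix service, though checks that compare pages with one another (identical titles, duplicate content) would need an extra pass.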
Other Websites: Priority 1 Items By Category:-
- Webrefs_Update failing because IE loops with Aeon / Psyche pages. Currently doing manual checking - try to find an automated solution.
- Documentation & Bug-fixes: Phase 2
- Re-document the procedures in the light of recent changes.
- Resolve issues generated / revealed by the spider.
- Investigate - and fix where possible - broken links.
- Find a way of recording Missing Webrefs other than debug.print: create table, then suppress message for known problems
- Investigate items flagged as defunct. Populate Defunct_Explanation in WebRefs_Table. Consider use of FairUse (Link (Fair Use)) for documents no longer available that I'd downloaded.
- Investigate WebRefs with Issue = 'URL Translated OK': does the translation really work? How?
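The idea of recording Missing Webrefs in a table, and suppressing the message for known problems, might look something like the sketch below. It uses Python with an SQLite database purely for illustration; the table name Missing_Webrefs, its columns and the function names are all hypothetical and do not correspond to anything in the existing database.

```python
import sqlite3


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) Missing_Webrefs table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS Missing_Webrefs (
               webref_id     INTEGER PRIMARY KEY,
               detail        TEXT NOT NULL,
               times_seen    INTEGER NOT NULL DEFAULT 1,
               known_problem INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def record_missing(conn: sqlite3.Connection, webref_id: int, detail: str) -> bool:
    """Log a missing WebRef; return True if a message should still be shown."""
    row = conn.execute(
        "SELECT known_problem FROM Missing_Webrefs WHERE webref_id = ?",
        (webref_id,),
    ).fetchone()
    if row is None:
        # First sighting: record it and report it.
        conn.execute(
            "INSERT INTO Missing_Webrefs (webref_id, detail) VALUES (?, ?)",
            (webref_id, detail),
        )
        conn.commit()
        return True
    # Seen before: bump the counter, and report only if not marked as known.
    conn.execute(
        "UPDATE Missing_Webrefs SET times_seen = times_seen + 1, detail = ? "
        "WHERE webref_id = ?",
        (detail, webref_id),
    )
    conn.commit()
    return row[0] == 0


def mark_known(conn: sqlite3.Connection, webref_id: int) -> None:
    """Flag a problem as known so that later runs stay silent about it."""
    conn.execute(
        "UPDATE Missing_Webrefs SET known_problem = 1 WHERE webref_id = ?",
        (webref_id,),
    )
    conn.commit()
```

Every occurrence is still counted in the table even when the message is suppressed, so a known problem that suddenly becomes more frequent can still be spotted in a report.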
Summary of Progress to Date
This is hived off to various separate documents, which have now been harmonised and / or consolidated:-
- Summary of Progress to Date26.
- Outstanding Developments27,
- Functional Documentation28,
- A summary of time expended across the years developing my website29 is at "Software Development - Website - Development".
In-Page Footnotes:
Footnote 4:
- Well, in a sense, I’ve missed the boat as I’m now putting effort into my research, though on an informal basis, so will need to continue with both projects in parallel.
- This was always likely to be necessary, as new features will always arise in use. It’s a prototype methodology, after all.
- This is very tedious to produce and consequently is both incomplete and out of date.
- This is much more fun, as it’s a purely technical task.
- I’ve written a vastly-improved general-purpose technical documenter for MS Access.
- It’s a shame to abandon the “mini websites” with all their ladders, as it’s rather well done.
- However, I couldn’t waste time on these after I’d abandoned bridge.
- In particular, for the ECBA “Victor Ludorum” competition.
- I cannot hand any of this code over, so the tedium will return, though not to me!
- I had agreed to share this data sometime early in 2018
- But will wait until asked again, as I doubt it’ll be of any real use to anyone.
- It used to exist in two versions, live and test.
- Pete decided not to renew the license for the live site, now it has achieved its purpose, so only the test site remains.
- Note that where fixes or small enhancements are made to a previously “completed” development, I don’t announce it again against the list of “completed” items above, though the work appears in the full list for the quarter.
- Note that Backup_Prune_Ctrl deletes (relevant) pages that weren't regenerated in the last full site-regen, but this isn't the same thing.
- See Sistrix
- This used to be called Optimizr, see Optimizr (Defunct) (which now auto-forwards to Sistrix).
- A quick look doesn’t show it to be an obvious scam, but I need to double-check.
- An unsolicited analysis of my site turned up monthly from Optimizr from January 2015 to October 2017, listing a large number of “problems” that I think I know about, but which are in the queue to address.
- It restarted in February 2018, under the Sistrix name (this seems to have been associated with Optimizr since November 2015).
- The free version of this software is restricted to 1,000 pages, which is a very small proportion of my Site, though I may be able to point it to different base-URLs.
- But I do need to address the problems validly itemised, and a sub-set is still useful.
- As distinct from developing other people’s websites – time which is also recorded against this project, but not against this task.
Table of the Previous 12 Versions of this Note: (of 78)
Authors, Books & Papers Citing this Note
- Website - Development
Text Colour Conventions
- Black: Printable Text by me; © Theo Todman, 2020
- Blue: Text by me; © Theo Todman, 2020