Theo Todman's Web Page - Notes Pages


Website Documentation

Website Generator Documentation - Cross-Referencing

(Text as at 11/04/2022 00:01:26)



Introduction


The Cross_Reference Table Itself
Detailed Processing
  1. Cross_Reference Deletions
  2. Cross_Reference Additions
    • Cross_Reference_Add is a simple routine that just adds rows to the table, based on a string of parameters supplied. However, it doesn’t add Cross References for Note 8744, which is the temporary Note used for the generation of +LL+ “Links” pages.
    • The functions that directly call Cross_Reference_Add are those that convert the “+ΧΧ+” references in text of whatever sort to hyperlinks, ie:-
    • Both Notes and Archived Notes are covered by the same routines.
    • The Name reference is of the format Xnnnn_i, where:-
      • X = A, B, N or P (Images don’t have Name references and WebRefs have their own method).
      • nnnn is the object ID
      • i is an incremental counter for the number of times this object has appeared in the calling object (determined by a query on the Cross_Reference table itself).
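As an illustration of the Xnnnn_i format, here is a minimal Python sketch. The function name, the tuple layout of the in-memory rows, the absence of zero-padding on the ID and the choice to start the counter at 1 are all assumptions made for the example; the real generator derives the counter from a query on the Cross_Reference table.

```python
from typing import Iterable, Tuple

# Hypothetical row layout: (calling_type, calling_id, called_type, called_id)
XRef = Tuple[str, int, str, int]

NAMED_TYPES = {"A", "B", "N", "P"}  # Images and WebRefs get no Name reference


def name_reference(existing: Iterable[XRef], calling: Tuple[str, int],
                   called_type: str, called_id: int) -> str:
    """Build an 'Xnnnn_i' name for the next reference to the called object."""
    if called_type not in NAMED_TYPES:
        raise ValueError(f"no Name reference for type {called_type!r}")
    # Count how often this object already appears in the calling object.
    seen = sum(1 for (ct, cid, dt, did) in existing
               if (ct, cid) == calling and (dt, did) == (called_type, called_id))
    # Starting the counter at 1 is an assumption of this sketch.
    return f"{called_type}{called_id}_{seen + 1}"
```

For example, if Note 1300 already refers twice to a (hypothetical) Paper 42, the next reference would be named P42_3 under these assumptions.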
  3. Cross_Reference_Changes
    • Following the first round of investigation and documentation, I’ve decided to delete all rows from this table that are more than 40 days old, or that pre-date the last Website Regen (as determined by query Website_Regen_Last_Run_Start) if this is earlier. This is done by the Sub Cross_Reference_Changes_Prune (which uses Cross_Reference_Zapper). It is a temporary expedient until I introduce changes for non-Notes (Notes are already fully implemented). I’ve done this to see if it improves performance, which does seem to be the case.
    • The record-counts now appear in the following table (provided by Functor_21 using query Cross_Reference_Changes_By_Type):-
       
                    Type_Called →
      Type_Calling ↓        A        B        N        P        W    TOTAL
      B                             45                                  45
      N                22,723    5,110        4   14,558   14,487   56,882
      P                 4,406      172               473    6,544   11,595
      TOTAL            27,129    5,327        4   15,031   21,031   68,522

    • Key:-
      • A = Author
      • B = Book
      • I = Image
      • N = Note
      • N_A6 = Archived Note
      • P = Paper
      • W = WebRef
      • Calling types are in the first column, called types are the other column headings
      • Note that Images and WebRefs, by their nature, can be called, but cannot call.
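The totals in the table can be reproduced from the individual cells. The sketch below is a hedged Python illustration of the kind of cross-tabulation that a query such as Cross_Reference_Changes_By_Type might produce; the dictionary layout and the function name are assumptions, and the cell values are simply those quoted in the table above.

```python
from collections import defaultdict

# Cell counts copied from the table above: (calling type, called type) -> rows
CELLS = {
    ("B", "B"): 45,
    ("N", "A"): 22723, ("N", "B"): 5110, ("N", "N"): 4,
    ("N", "P"): 14558, ("N", "W"): 14487,
    ("P", "A"): 4406, ("P", "B"): 172, ("P", "P"): 473, ("P", "W"): 6544,
}


def totals(cells):
    """Return (row totals by calling type, column totals by called type, grand total)."""
    by_calling, by_called = defaultdict(int), defaultdict(int)
    for (calling, called), n in cells.items():
        by_calling[calling] += n
        by_called[called] += n
    return dict(by_calling), dict(by_called), sum(cells.values())


rows, cols, grand = totals(CELLS)
```

Here rows gives 45, 56,882 and 11,595 for B, N and P, and grand gives 68,522, matching the TOTAL column and the row count quoted for the table.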
    • This table (according to Functor_23, option 3) has 68,522 rows, as of 01/03/2022, split by month (using Functor_22, Cross_Reference_Changes_By_Month):-
      • 2022_01: 15,545
      • 2022_02: 52,977
    • Rows are added using two complex queries, but before describing them it’s worth describing what’s been going on. The table Cross_Reference_Zapper is populated with all the cross-references from the changed calling objects held in Cross_Reference, prior to the new ones being added in. They are removed from the Cross_Reference table ready for these new cross-references to be loaded. By the time we get to adding rows to Cross_Reference_Changes, the changes to Cross_Reference have already been applied, but comparison with Cross_Reference_Zapper tells us which pages to regenerate based on both deleted and added cross-references.
    • So, the queries are:-
      1. Cross_Reference_Changes_Deletions_Add is run first. If anything that was deleted hasn’t been replaced, the called pages have to be regenerated.
      2. Cross_Reference_Changes_Additions_Add is run second. It is slow because of an inner join to the query Cross_Reference_Latest (which is a summation query on Cross_Reference_Zapper) and an outer join to the table Cross_Reference_Zapper (for which, see below).
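The effect of the two queries can be pictured as a comparison between the old cross-references (the copy held in Cross_Reference_Zapper) and the new ones (now in Cross_Reference). The Python sketch below uses plain set differences in place of the actual joins; the row layout and the function name are assumptions for illustration only, and it does not model the conflict avoidance.

```python
from typing import Set, Tuple

# Hypothetical row layout: (calling_type, calling_id, called_type, called_id)
XRef = Tuple[str, int, str, int]


def pages_to_regenerate(old_refs: Set[XRef], new_refs: Set[XRef]) -> Set[Tuple[str, int]]:
    """Called objects whose pages are affected by a change in the calling objects."""
    deleted = old_refs - new_refs   # in the zapper copy, but not replaced
    added = new_refs - old_refs     # newly loaded, not in the zapper copy
    return {(called_type, called_id)
            for (_, _, called_type, called_id) in deleted | added}
```

If a Note used to reference a Paper and an Author, and now references the same Paper and a Book, only the Author and the Book come back: the unchanged Paper reference triggers nothing.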
    • Something very cunning is going on here! Pages have to be regenerated whenever objects that call them have references either added or deleted, hence the two queries. Also, there needs to be some conflict avoidance.
    • In order to improve the run-times of a full website regeneration (where variable Full_Regen is set to True), I’ve removed the updates of Cross_Reference_Changes (but not – of course – of Cross_Reference) from all places where they are invoked. Improvements (as determined by Functor_23, options 4 – 8) have been:-
      1. CreateAbstractWebPages (Paper Abstracts: run time has reduced from 8.17 hours to 1.68 hours on 06/02/2022)
      2. CreateAuthorsWebPages (Authors: Had already reduced to 16 minutes; now 10 minutes on 06/02/2022).
      3. CreateBookPaperAbstractsWebPages (Book/Paper Abstracts: run time reduced from 72 minutes to 13 minutes on 06/02/2022).
      4. Notes_Text_Format
        → Notes: run time reduced from 3.62 hours to 50 minutes on 06/02/2022.
        → Notes Archived: run time reduced from 2.32 hours to 1.62 hours on 06/02/2022.
      This is a sensible move because – on a full re-gen – all pages are being regenerated in any case.
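The Full_Regen switch amounts to a simple guard around the change-logging step. The following Python sketch shows the idea with in-memory lists standing in for the two tables; the function name, parameter names and row layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical row layout: (calling_type, calling_id, called_type, called_id)
XRef = Tuple[str, int, str, int]


def apply_cross_references(new_refs: List[XRef], cross_reference: List[XRef],
                           changes: List[XRef], full_regen: bool) -> None:
    """Always update the main table; log changes only on a partial regeneration."""
    cross_reference.extend(new_refs)
    if not full_regen:
        # On a full regen every page is rebuilt anyway, so the change log is skipped.
        changes.extend(new_refs)
```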
    • Rows are deleted by cmdRecalculate_Click using SQL driven by table Page_Regen, but only for Called_Type of “N”. So, the table only contains a few very recent rows of this type, but multitudes of rows for others, as is shown in the table above. I need to explain why this is the case: it looks like deletions may just have been forgotten.
    • So, what is the table actually used for? Most usages are either diagnostic or maintenance, and the only serious one seems to be Page_Regen_GEN, also invoked by cmdRecalculate_Click.
    • I suspect a fault in that this function regenerates the wrong pages. So, we might be on to something here! However, most pages – ie. authors, book and paper summaries – are regenerated by the badly-named cmdPaperSummaries_Click.
    • On investigation, using query Page_Regen_GEN_Test, a non-updating version of Page_Regen_GEN, there were (before Cross_Reference was truncated to the latest 40 days) 21.1k rows output to Page_Regen, including 4 to Author ID=0 and 2 to Image ID=0 (but these represented over 100k and 3k rows, respectively). I’m not sure of the purpose of including Images, since they don’t have pages to regenerate (WebRefs are already excluded for that reason).
    • Table Page_Regen is then used 4 times in cmdRecalculate_Click:-
      • to warn how many Notes will be regenerated
      • to delete all its rows
      • to regenerate all its rows, as above
      • to regenerate all “called” Notes based on the rows just created.
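The four uses listed above can be sketched as a single work-list routine. This Python version is only an illustration: the order of the steps, the restriction to called type “N”, the row layout and all names are assumptions, and a set of Note IDs stands in for the Page_Regen table.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical row layout: (calling_type, calling_id, called_type, called_id)
Change = Tuple[str, int, str, int]


def recalculate(changes: List[Change], page_regen: Set[int],
                regenerate_note: Callable[[int], None]) -> int:
    """Rebuild the work list from the change log, then regenerate those Notes."""
    page_regen.clear()                              # delete all existing rows
    page_regen.update(called_id                     # repopulate from the changes
                      for (_, _, called_type, called_id) in changes
                      if called_type == "N")
    print(f"{len(page_regen)} Notes will be regenerated")  # warning to the user
    for note_id in sorted(page_regen):              # regenerate each called Note
        regenerate_note(note_id)
    return len(page_regen)
```

Using a set means that a Note called from several changed objects is regenerated only once.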
    • No queries use the table other than in the circumstances just listed. So, it seems that the table is not used other than to regenerate Notes implicated in changes to other objects (including Notes).
    • Hence, it looks like the functions envisaged for the Cross_Reference_Changes table have not been fully implemented, and that it can be truncated until they have been!
    • Note that it’s not straightforward to fully implement regeneration of the “impacted” pages, as some are cross-references … more on this later.
    • I now delete all rows more than 40 days old in cmdRecalculate_Click.
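The pruning rule described in this section (rows more than 40 days old, or prior to the last Website Regen if that is earlier) can be read as a single cut-off date. The Python sketch below follows that reading, which is an interpretation rather than a transcription of Cross_Reference_Changes_Prune; the row layout and function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Row = Tuple[datetime, str]  # hypothetical (timestamp, description) row


def prune_cutoff(now: datetime, last_regen_start: datetime, days: int = 40) -> datetime:
    """Rows older than this are deleted: the earlier of 'days' ago and the last regen."""
    return min(now - timedelta(days=days), last_regen_start)


def prune(rows: List[Row], now: datetime, last_regen_start: datetime) -> List[Row]:
    """Return only the rows that survive the prune."""
    cutoff = prune_cutoff(now, last_regen_start)
    return [row for row in rows if row[0] >= cutoff]
```

On this reading, if the last regeneration was more than 40 days ago, rows are kept back to that regeneration rather than being cut at 40 days.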
  4. Cross_Reference_Zapper

Use of Links in Cross-Reference Pages
Improvements and Rationalisation Required
Performance Improvements





Table of the Previous 4 Versions of this Note:

Date Length Title
01/10/2021 13:17:46 20474 Website Generator Documentation - Cross-Referencing
04/10/2020 00:27:22 20400 Website Generator Documentation - Cross-Referencing
03/07/2020 22:09:07 17661 Website Generator Documentation - Cross-Referencing
27/06/2020 00:15:50 15964 Website Generator Documentation - Cross-Referencing



Note last updated Reference for this Topic Parent Topic
11/04/2022 00:01:26 1300 (Website Generator Documentation - Cross-Referencing) None










© Theo Todman, June 2007 - Sept 2023. Please address any comments on this page to theo@theotodman.com.