Documentation

This site lets you explore a lightweight knowledge graph built by running automated named-entity recognition (NER) over the corpus.

How to read the site

  • Entities are normalized names/terms detected in text (people, places, organizations, etc.).
  • Edges connect two entities when they co-occur in the same document.
  • Evidence shows the underlying documents and the exact mention spans that produced an edge.
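
The relationship between mentions, edges, and evidence can be sketched in a few lines of Python. This is only an illustration, not the site's actual pipeline: the Mention record, its field names, and the build_edges helper are hypothetical, and the sketch assumes edges are undirected.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Mention(NamedTuple):
    """Hypothetical mention record; field names are assumptions."""
    doc_id: str
    entity: str  # normalized entity name
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention


def build_edges(mentions):
    """Map each co-occurring entity pair to its evidence, grouped by document."""
    by_doc = defaultdict(lambda: defaultdict(list))
    for m in mentions:
        by_doc[m.doc_id][m.entity].append((m.start, m.end))

    edges = defaultdict(dict)
    for doc_id, spans in by_doc.items():
        # Sorting the names makes (a, b) and (b, a) the same undirected edge.
        for a, b in combinations(sorted(spans), 2):
            edges[(a, b)][doc_id] = {a: spans[a], b: spans[b]}
    return dict(edges)


mentions = [
    Mention("doc1", "Plato", 0, 5),
    Mention("doc1", "Athens", 20, 26),
    Mention("doc1", "Plato", 40, 45),
    Mention("doc2", "Plato", 3, 8),  # no second entity in doc2, so no edge
]
edges = build_edges(mentions)
print(edges[("Athens", "Plato")])
# {'doc1': {'Athens': [(20, 26)], 'Plato': [(0, 5), (40, 45)]}}
```

Here doc2 contributes nothing because only one entity is mentioned in it; the evidence for the single edge is the set of spans found in doc1.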

What counts mean

docs is the number of documents in which both entities appear.

pairs is a strength score that reflects how often the two entities are mentioned together within those documents.

These are discovery signals; they are not proof of a historical or philosophical relationship.
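
As a rough illustration of how the two counts differ, the sketch below computes both from per-document mention lists. The exact formula behind pairs is not specified here, so the code assumes one plausible definition (the product of the two entities' mention counts in each shared document, summed over documents). The edge_counts function and its input shape are hypothetical.

```python
from collections import Counter
from itertools import combinations


def edge_counts(doc_mentions):
    """doc_mentions maps doc_id -> list of normalized entity names, one per mention."""
    docs = Counter()
    pairs = Counter()
    for names in doc_mentions.values():
        freq = Counter(names)
        for a, b in combinations(sorted(freq), 2):
            # docs: one per document that contains both entities.
            docs[(a, b)] += 1
            # ASSUMPTION: pairs adds one per (mention of a, mention of b) combination.
            pairs[(a, b)] += freq[a] * freq[b]
    return docs, pairs


docs, pairs = edge_counts({
    "doc1": ["Plato", "Athens", "Plato"],  # 2 x 1 mention combinations
    "doc2": ["Plato", "Athens"],           # 1 x 1 mention combination
    "doc3": ["Plato"],                     # no co-occurrence
})
print(docs[("Athens", "Plato")], pairs[("Athens", "Plato")])  # 2 3
```

Under this assumed definition, an edge with a high pairs value but a low docs value points to a few documents that mention both entities many times, while the reverse points to many documents with passing co-mentions.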

Database objects (high level)

The site queries a Postgres knowledge-graph schema (typically named kg). The core objects are:

  • kg.entities: one row per normalized entity (text + label)
  • kg.edges: entity co-mention edges with aggregate counts
  • kg.documents: document metadata
  • kg.doc_edges: edge strength broken down by document
  • kg.mentions: raw mention spans with offsets and labels

Views such as kg.vw_top_edges_filtered filter out obvious boilerplate and surface the most informative edges first.
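
A minimal sketch of reading from such a view with a Python DB-API connection follows. The top_edges helper is hypothetical, the view's columns are not documented here (hence SELECT *), and the %s placeholder assumes a driver that uses that parameter style, as psycopg does.

```python
def top_edges(conn, limit=10):
    """Return up to `limit` rows from kg.vw_top_edges_filtered.

    `conn` is any DB-API connection to the Postgres database.
    ASSUMPTION: the view already returns rows in its intended order;
    add an explicit ORDER BY if a guaranteed ordering is required.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT * FROM kg.vw_top_edges_filtered LIMIT %s", (limit,))
        return cur.fetchall()
    finally:
        cur.close()
```

If the schema is not named kg in a given deployment, the schema prefix in the query would need to change accordingly.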

If you spot a systematic labeling issue (e.g., repeated false positives), the usual fix is to adjust the NER model or the normalization rules and re-run the extraction.