Ceci n’est pas un journal: Notes on the Canonical Identity of a Press-Ready PDF
I caught myself hesitating over a filename at about 1:40 in the morning, somewhere between export dialogs and preflight checks, because 20260506-apxt-press-page-view-text-as-curves.pdf felt wrong in a way that was difficult to explain to anyone who has not spent an unhealthy amount of time thinking about records, archives, and institutional continuity.
The file itself was created on May 6, 2026. That is objectively true. The export happened on May 6. The email to the printer went out on May 6. The workstation metadata says May 6. The backup logs say May 6. Somewhere deep in the PDF internals, buried beneath compressed object streams and font tables, Adobe probably says May 6 too.
But the object was not a “May 6 file.”
It was the May 8, 2026 edition of the Appomattox Times.
And those are not the same thing.
This is one of those strange little divides between the software world and the archival world that most people never consciously notice. Modern computing culture tends to think in terms of process chronology: when something was generated, compiled, exported, modified, or committed. Build dates. Creation timestamps. Pipelines. CI/CD. The artifact is subordinate to the workflow that produced it.
But newspapers — real newspapers — do not think that way. Libraries do not think that way. Archivists certainly do not think that way.
Nobody walks into an archive and says:
“May I see the edition exported Wednesday evening around 9:42 PM?”
They ask for:
“the May 8 issue.”
That is the canonical identity of the object.
The more I thought about it, the more I realized the filename was not merely descriptive metadata. It was making an ontological claim about what the object was.
20260506-... implied:
this file intrinsically belongs to May 6.
But it does not.
It belongs to the publication cycle, issue chronology, citation structure, and historical record of May 8, 2026. If somebody pulls that file in thirty years — from a hard drive, a microfilm scan, a cold archive, an OCR corpus, a Library of Virginia ingest, an Archive.org mirror, or some poor future intern’s SQL query — they will not care when I clicked “Export PDF.”
They will care what edition it was.
That instinct, I think, comes from spending too much time around institutional records and not enough time accepting the disposable assumptions of modern software culture. I do not really see the PDF as “a file.” I see it as a preservation master for a serial publication. A newspaper issue is much closer to a bound volume or a reel of microfilm than it is to a transient desktop export.
The export timestamp already exists elsewhere anyway. Filesystems preserve it. Email headers preserve it. Backup systems preserve it. Printers preserve it. Logs preserve it. There was no need for the filename itself to redundantly encode workflow chronology when its more important purpose was to establish stable public identity.
In archival description, particularly under the thinking behind DACS (Describing Archives: A Content Standard), dates are often attached not merely to the instant a thing came into existence, but to the period or context the thing represents. The record belongs to the intellectual and institutional arrangement of the collection. In simpler terms: the newspaper issue belongs to Friday, even if the press operator and publisher were awake cursing at font rendering on Wednesday night.
So I renamed it: 20260508-apxt-press-page-view-text-as-curves.pdf
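The rule behind that rename is simple enough to sketch in a few lines of Python. This is only an illustration, not a tool I actually ran: the function name `issue_filename` and its parameters are hypothetical, and the `apxt` code and descriptor string are lifted straight from the filename above. The one thing it is meant to show is that the date prefix is derived from the issue date, while the export date never enters the name at all.

```python
from datetime import date


def issue_filename(issue_date: date, title_code: str, descriptor: str, ext: str = "pdf") -> str:
    """Build a filename keyed to the issue date, not the export date.

    Hypothetical helper: the pattern YYYYMMDD-code-descriptor.ext simply
    mirrors the filename discussed in the text.
    """
    return f"{issue_date:%Y%m%d}-{title_code}-{descriptor}.{ext}"


# The edition's cover date: Friday, May 8, 2026.
issue_date = date(2026, 5, 8)

# The day the PDF was exported. Deliberately unused in the filename;
# filesystem metadata, email headers, and logs already record it.
export_date = date(2026, 5, 6)

name = issue_filename(issue_date, "apxt", "press-page-view-text-as-curves")
print(name)
```

Run as written, this prints the renamed file's name with the 20260508 prefix.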
And immediately the world felt back in alignment.
Not because the filename became “more accurate” in a narrow technical sense, but because it became truthful about the object’s identity.
Which, ultimately, is what archives are for.