Hello everyone,
I’d like to introduce myself properly, having been visible around the edges of this community for a little while now.
My name is Łukasz Diener. I’m an independent researcher based in Kraków, Poland, working on Perceptual Control Theory (PCT) and its intersections with AI alignment, robotics, and the analysis of undeciphered ancient scripts. Over the past few weeks I’ve been in close contact with Dag Forssell and Bruce Nevin, who have been generous with their time and have helped me sharpen the arguments before publication.
On 19 May 2026 I published a paper on Zenodo:
Perceptual Control as the Epistemological Antidote to RLHF Reward Hacking: Seven Frontier Models Diagnose Their Own Architecture
DOI: 10.5281/zenodo.20277919
The core argument: many of the documented failure modes of large language models trained with RLHF — reward hacking, sycophancy, confident confabulation, verbosity bias — are not bugs to be patched, but predictable consequences of optimising outputs rather than controlling perceptions. The paper treats RLHF as an output-optimisation architecture and contrasts it with PCT’s input-control architecture (the familiar e = r − p loop), arguing that the latter offers a structural rather than cosmetic path forward.
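For anyone less familiar with the loop, here is a toy sketch of a single control unit built around e = r − p. It is an illustration only: the function name, the additive environment, the integrating output, and the gain of 0.5 are all assumptions chosen for this post, and it makes no claim about how RLHF or any real model is implemented.

```python
def run_control_loop(reference, disturbances, gain=0.5):
    """Simulate one PCT-style control unit (toy model, assumed for illustration).

    Each step:
      perception p = output + disturbance   (additive environment, an assumption)
      error      e = r - p
      output    += gain * e                 (simple integrating output function)
    Returns a list of (perception, error, output) tuples, one per step.
    """
    output = 0.0
    history = []
    for disturbance in disturbances:
        perception = output + disturbance
        error = reference - perception
        output += gain * error
        history.append((perception, error, output))
    return history


if __name__ == "__main__":
    # Hold the perception at 10.0 against a constant disturbance of 3.0.
    history = run_control_loop(reference=10.0, disturbances=[3.0] * 50)
    perception, error, output = history[-1]
    print(f"perception={perception:.6f} error={error:.6f} output={output:.6f}")
```

The point of the sketch is that the unit never targets a particular output. The output settles wherever it must (here, near 7.0) so that the perception matches the reference despite the disturbance.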
I’d be very interested in the community’s reactions — particularly from those of you who have thought carefully about the relationship between PCT and machine learning. Where does the argument hold? Where does it strain? What would you push back on?
The paper is part of a broader portal I maintain at perceptualcontroltheory.org, an independent, non-commercial knowledge base. Two more papers in the Excel in Clay series (on Proto-Elamite and Linear A as administrative systems, analysed through PCT-derived structural audits) are out or forthcoming, and I’ll share those separately if there’s interest.
Looking forward to the discussion.
Warm regards,
Łukasz Diener
ORCID: 0009-0006-6103-8514