Hensel Load-Store Architecture: Distributed Register Design for Exact 2-adic Arithmetic
SciSci Inventions, No. 3
September 2025

James Douglas Boyd
Founder, CEO
Invention
Load-store architecture for the Hensel CPU
Institutions
▣ SciSci Research, Inc. (サイサイ・リサーチ)
▣ Future Computing (未来コン), SciSci's high-performance computing project
Discipline
High-performance computing (HPC)
Topics
LSU combinational logic, circuit tree design, relationship between χ-modification of operands by LSU loaders and Φ-perturbations of circuit tree paths, relationship to hop calculus and π-sequence passing, parallelization and scaling.
Efficient Operand Handling for Exact Computing
▣ The Hensel CPU architecture performs exact arithmetic on 2-adic numbers. Because the coefficients of 2-adic expansions are either 0 or 1, operands can be encoded in bits as finite lists of 2-adic expansion coefficients, and computations can be performed with standard MOSFET technology. As previously discussed in the Virtual Hensel report, the Hensel CPU is equipped with a distributed cluster of registers whose addresses and arrangement are designed to handle 2-adic operands efficiently.
▣ The distributed register cluster allows operands to be broken down and assigned to address-matching registers via efficient transport. In this way, many operands can be loaded and stored with a modest number of registers. Distributed transport is parallelized, which, when paired with the parallelization of 2AALU arithmetic, bodes well for Hensel CPU scaling in the high-performance domain.
The Combinational Logic of Load-Store
▣ The LSU circuit has a tree-like structure, and can be thought of as a tree of "loader" devices. Each loader is simple in design. It consists of no more than a XOR gate, an XNOR gate, and a Φ-input.
▣ Each loader at a given level of the circuit tree receives one input from its parent vertex and one from its own Φ. These two inputs are fed to both its XOR gate and its XNOR gate, and the two gate outputs are then fed to the two loaders that are its child vertices in the circuit tree. The sequence of loaders that feed an output of 1 downstream, all the way down to a sink vertex, determines the path in the circuit tree.
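The behaviour of a single loader can be sketched in a few lines of Python. This is a minimal illustration rather than the circuit specification: the function name `loader`, the representation of signals as the integers 0 and 1, and the assumption that the XOR output goes to one child and the XNOR output to the other are all choices made here for the example.

```python
from typing import Tuple


def loader(parent_in: int, phi: int) -> Tuple[int, int]:
    """Model one loader: return (xor_out, xnor_out), one value per child loader."""
    xor_out = parent_in ^ phi   # XOR gate
    xnor_out = 1 - xor_out      # XNOR gate is the complement of XOR
    return xor_out, xnor_out


# Print the truth table over all input combinations.
for parent_in in (0, 1):
    for phi in (0, 1):
        print(parent_in, phi, loader(parent_in, phi))
```

Because XNOR is the complement of XOR, exactly one of the two outputs is 1 for any pair of inputs, which is what allows a single downstream path of 1s to be singled out.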
▣ Just as cluster processor registers (CPRs) reside at level ℓ=1 within the processor cluster, so too do they reside at the end of each distinct path in the circuit tree. Moreover, the address of each CPR (e.g., (1,0,1)) is the same as the list of XOR-gate values in the circuit tree path terminating at that CPR. The Hensel load-store architecture dictates that χ-IDs be loaded to CPRs with matching addresses, and the circuit design actually gives one χ-address matching for free: one can load a χ-ID to the address-matching CPR simply by sending a π-sequence to the CPR at the end of the relevant circuit tree path.
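The correspondence between a circuit tree path and a CPR address can be illustrated with a small Python sketch. It rests on several assumptions that are not stated in this report: the root loader is assumed to receive an input of 1, each loader is identified by the tuple of XOR values leading to it, unspecified Φ-inputs default to 0, and one address bit is produced per loader on the path. The name `trace_path` is hypothetical.

```python
from typing import Dict, Tuple

Prefix = Tuple[int, ...]


def trace_path(phi: Dict[Prefix, int], depth: int) -> Prefix:
    """Follow the child that is fed a 1 and return the XOR values along the path."""
    xor_values: Prefix = ()
    signal = 1                           # assumption: the root loader receives a 1
    for _ in range(depth):
        p = phi.get(xor_values, 0)       # Φ of the loader reached so far (default 0)
        xor_values += (signal ^ p,)      # record this loader's XOR-gate value
        signal = 1                       # the next loader on the path was fed a 1
    return xor_values


# Φ-inputs for the three loaders on one path, keyed by the XOR values leading to them.
phi_inputs = {(): 0, (1,): 1, (1, 0): 0}
address = trace_path(phi_inputs, depth=3)
print(address)
```

Under these assumptions the path ends at the CPR with address (1, 0, 1), the example address used above.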
▣ Given an operand with χ-IDs loaded to various CPRs, χ-modification is performed by changing Φ-input values. This changes the XOR and XNOR gate outputs and yields a new path in the circuit tree, terminating at a new CPR; this is termed Φ-perturbation. Φ-inputs are encoded by FC-2-2025, the encoding standard for arithmetic instructions. Thus, instructions for exact arithmetic in the Hensel processor are no more than specifications of Φ-inputs that perturb the circuit tree path so that it terminates at the CPR whose address matches the χ-ID of the arithmetic output.
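A Φ-perturbation can be sketched as the difference between the Φ-values needed to reach the current CPR and those needed to reach the target CPR. The Python below is an illustration under stated assumptions, not the FC-2-2025 encoding itself (which is not reproduced here): every loader on a path is assumed to receive a 1 from its parent, so that its XOR value is 1 XOR Φ, and the function names `phi_for_address` and `perturbation` are hypothetical.

```python
from typing import List, Sequence


def phi_for_address(address: Sequence[int]) -> List[int]:
    """Φ-value needed at each level so that the XOR output equals the address bit."""
    # Assumption: on-path loaders receive 1 from the parent, so XOR = 1 ^ Φ.
    return [1 ^ bit for bit in address]


def perturbation(current: Sequence[int], target: Sequence[int]) -> List[int]:
    """Levels at which the Φ-values seen along the path must differ to reach target."""
    if len(current) != len(target):
        raise ValueError("addresses must have the same length")
    old, new = phi_for_address(current), phi_for_address(target)
    return [level for level, (a, b) in enumerate(zip(old, new)) if a != b]


print(phi_for_address((1, 0, 1)))          # Φ-values along the path to CPR (1, 0, 1)
print(perturbation((1, 0, 1), (1, 1, 0)))  # levels that change to reach CPR (1, 1, 0)
```

Since each loader has its own Φ, the path passes through different loaders after the first level that changes; the levels reported here therefore refer to the Φ-values encountered along the path, not necessarily to the same physical inputs.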
▣ Loader carriers are packaged within one another recurrently, with a given loader packaged within the carrier of the loader that is its parent vertex in the circuit tree (alongside the other loader sharing the same parent vertex), with all loaders still being directly surface mounted. This gives the nested structure often discussed in Hensel reports.
Computational Advantages
▣ Matching between χ-IDs and CPRs is automatic, due to the positioning of CPRs in the circuit tree.
▣ Given any pair of FC-3-2025 encoded input operands and an arithmetic operation, one can generate an FC-2-2025 encoding of the arithmetic instructions. This encoding is simply a list of Φ-inputs, which can be introduced in parallel. The only remaining computation is the traversal of the path in the circuit tree, which contributes a mild O(n) term, where n is one less than the nest depth. (In the case of the Virtual Hensel, n=6.)
▣ The circuit tree, in which loaders feed outputs to their child vertices, lends itself readily to nested packaging, and thus gives rise to the nested structure described in previous reports. This structure is highly compatible with the nature of 2-adic distance, and thus realizes the many efficiency and optimality properties for 2-adic computing discussed in the original report.
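To give a rough sense of why an O(n) path term is mild, the following Python sketch compares path length with the number of reachable sinks. It assumes, purely for illustration, a full binary circuit tree in which every path passes through n loaders and ends at one sink; the report does not state these counts, and the name `tree_size` is hypothetical.

```python
from typing import Dict


def tree_size(n: int) -> Dict[str, int]:
    """Counts for an assumed full binary tree with n loader levels."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return {
        "path_steps": n,         # loaders traversed on any one path: O(n)
        "sinks": 2 ** n,         # one sink per distinct path
        "loaders": 2 ** n - 1,   # internal vertices of a full binary tree
    }


print(tree_size(6))  # n = 6, as quoted for the Virtual Hensel
```

Under these assumptions, n = 6 gives 6 path steps, 64 sinks, and 63 loaders: the number of addressable endpoints grows exponentially in n while the traversal cost grows only linearly.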
▣ Because all FC encodings are in bits, the circuit design employs standard transistor technology; no exotic hardware is required. Moreover, because the Hensel CPU performs arithmetic as χ-modification according to stored instructions, loader operations are very simple, and thus loader design is elementary. It is expected that the light cost footprint of loaders and registers will allow for high-nest-depth designs with broad operand capacity and arithmetic reach in order to realize an exact computing capability for high-performance computing.
Executive Summary
In hindsight, the Hensel CPU project, which began in May 2025, was made possible by several academic visits to international mathematics research institutions, during which Boyd acquired some knowledge of p-adic analysis in the context of arithmetic geometry, analytic number theory, and representation theory. These institutions include the Research Institute for Mathematical Sciences (RIMS; 数理解析研究所) in Kyoto, Japan; the Institute for Pure and Applied Mathematics (IMPA; Instituto Nacional de Matemática Pura e Aplicada) in Rio de Janeiro, Brazil; and the Nesin Mathematics Village (Nesin Matematik Köyü) in Şirince, Türkiye.

