A scheme for building generative models using tensor trees — Enabling the determination of causal structures such as phylogenetic trees

Building on previous work [1], we proposed [2] a novel approach to building a generative model from a nonnegative tensor tree and demonstrated its ability to capture the hidden structure of various synthetic and real-world datasets. Machine learning models based on tensor networks typically adopt a Born machine ansatz: the wavefunction is modeled directly, and the probability of a given sample is obtained by squaring the wavefunction (the Born rule). However, because such a model is inherently quantum, its elements cannot be directly interpreted as probabilities.

In our work [2], we considered an adaptive tensor tree constrained to have nonnegative elements. We showed that there is an equivalence between our proposed model and hidden Markov models, which establishes a basis for the probabilistic interpretation of our proposed model, which we call a “nonnegative adaptive tensor tree” (NATT). The tensor tree is adaptive in the sense that it can change its structure according to the training data: during training, we perform local rearrangements of the network structure, favoring connections that reduce the correlations carried across each bond, so that the tree geometry adapts to the correlation structure of the data.
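
The probabilistic interpretation can be made concrete with a toy example. The following minimal sketch (not the actual NATT implementation; the tensor shapes and names are illustrative) shows how a tiny tensor tree with nonnegative elements defines a valid probability distribution over samples, just as a tree-structured hidden Markov model would:

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 3, 2          # hidden (bond) dimension, visible dimension

# Nonnegative cores of a tiny tensor tree with 4 leaves:
# a root tensor R[a, b] joins two branch tensors M1[a, x1, x2], M2[b, x3, x4].
R  = rng.random((D, D))
M1 = rng.random((D, d, d))
M2 = rng.random((D, d, d))

def weight(x):
    """Unnormalized weight of a sample x = (x1, x2, x3, x4):
    contract the tree with the leaves fixed to the observed symbols."""
    x1, x2, x3, x4 = x
    return np.einsum("ab,a,b->", R, M1[:, x1, x2], M2[:, x3, x4])

# Normalize over all d**4 samples so the weights form a distribution.
samples = [(i, j, k, l) for i in range(d) for j in range(d)
                        for k in range(d) for l in range(d)]
Z = sum(weight(x) for x in samples)
probs = {x: weight(x) / Z for x in samples}
```

Because every tensor element is nonnegative, every weight is nonnegative, and after dividing by the contraction of the full tree the weights sum to one: the network itself is a probabilistic model, with hidden states summed over internal bonds.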

To demonstrate a concrete application of our method, we consider real data drawn from bioinformatics: we reconstruct a cladogram of species in the taxonomic family Carnivora (dogs, cats, and related species) using mitochondrial DNA sequence data from a gene common to all the considered organisms. We found that the resulting network showed a generally correct hierarchical clustering of species, even though the only input was unlabeled DNA sequence alignment data. The figure shows an example network structure generated using our method, which correctly uncovers the hidden structure of the data by clustering similar species based entirely on genetic information. The advantage of our method lies in the fact that the network structure and weights correspond to a probabilistic model describing the mutation of nucleotides, which means that the generated tensor tree is fully interpretable as a model of a stochastic process. We expect the NATT to be a useful tool in exploratory data science and transparent AI, where developing interpretable generative models for large datasets containing hidden correlation structures is of particular interest.

[1] Kenji Harada, Tsuyoshi Okubo and Naoki Kawashima, “Tensor tree learns hidden relational structures in data to construct generative models”, Mach. Learn.: Sci. Technol. 6 (2025) 025002.
[2] Katsuya O. Akamatsu, Kenji Harada, Tsuyoshi Okubo and Naoki Kawashima, “Plastic tensor networks for interpretable generative modeling”, Mach. Learn.: Sci. Technol. 7 (2026) 015014.

(https://doi.org/10.1088/2632-2153/ae3048)

Press release “Nonnegative adaptive tensor trees for generative modeling”

A press release has been issued on a joint research project with Kenji Harada of Kyoto University and Tsuyoshi Okubo of the Department of Physics at the University of Tokyo, entitled “A scheme for building generative models using tensor trees: Enabling the determination of causal structures such as phylogenetic trees.”

https://www.issp.u-tokyo.ac.jp/maincontents/news2.html?pid=29696

Does the majority rule transformation bring us to the correct fixed point?

Fig. 1: A block-spin transformation according to the majority rule performed on an Ising model configuration.

Block-spin transformations (Fig. 1) are a simple way to implement the real-space renormalization group (RSRG) for spin models. They work by dividing the lattice into blocks and assigning a representative spin to each block according to some rule, or map. Two common choices for this map are decimation, where the spin at a fixed site in each block is chosen as the representative, and the majority rule, where the most frequent spin value in the block is chosen. Repeatedly applying each of these two maps to a large critical configuration reveals a qualitative difference in their effects (Fig. 2). Decimation is relatively well understood, as it can be treated analytically, but despite the popularity and accessibility of the majority rule, its behavior has not been thoroughly studied analytically or even numerically.
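
Both maps are easy to state in code. The following minimal sketch (illustrative only; a random configuration stands in for a properly sampled critical one) applies one 2×2 decimation step and one 2×2 majority-rule step, with ties broken randomly, to an Ising spin configuration:

```python
import numpy as np

rng = np.random.default_rng(1)
L = 8
spins = rng.choice([-1, 1], size=(L, L))   # toy Ising configuration

def decimation(s):
    """Keep the top-left spin of every 2x2 block."""
    return s[::2, ::2]

def majority(s):
    """Majority rule on 2x2 blocks; ties broken by a random spin."""
    blocks = s.reshape(s.shape[0] // 2, 2, s.shape[1] // 2, 2)
    total = blocks.sum(axis=(1, 3))          # block sums in {-4, -2, 0, 2, 4}
    tie = rng.choice([-1, 1], size=total.shape)
    return np.where(total != 0, np.sign(total), tie).astype(int)

coarse_dec = decimation(spins)   # renormalized lattice, shape (L//2, L//2)
coarse_maj = majority(spins)     # renormalized lattice, shape (L//2, L//2)
```

Each application halves the linear lattice size; iterating the map generates the RG flow of configurations illustrated in Fig. 2.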

Fig. 2: The effect of repeatedly applying the decimation map (top) and majority rule map (bottom) to the same initial critical configuration. The renormalized configuration generated by the majority rule appears to be “cleaner” than its counterpart.

In our paper [1], we obtained numerical data to answer the question of whether the majority rule map brings us to the correct fixed point. We simulated large critical \(q=2,3,4\) Potts model configurations and iteratively applied \(2\times2\) RSRG maps to them, examining the spin and energy correlation functions across different choices of the map, including decimation and the majority rule. We found that when the renormalized lattice size \(L_g\) is held fixed while the initial lattice size \(L_0\) and the number of iterations \(g\) vary, the RG flow of these correlation functions under the majority rule on critical Potts configurations converges to a nontrivial curve with the correct scaling behavior (Fig. 3). We call this property “faithfulness”, since the RSRG map correctly preserves the critical behavior of the model under repeated iteration. We also provided an explanation in [1] for why the majority rule correctly reproduces the scaling of the magnetization.
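
The fixed-\(L_g\) bookkeeping can be sketched as follows. In this illustrative snippet, random configurations stand in for Monte Carlo samples of critical Potts configurations, and the chosen sizes are toy values: \(L_0\) and \(g\) vary together so that the renormalized size \(L_g = L_0 / 2^g\) stays fixed, and a spin correlation function is then measured on the renormalized lattice:

```python
import numpy as np

rng = np.random.default_rng(2)

def majority_step(s):
    """One 2x2 majority-rule iteration; ties broken by a random spin."""
    b = s.reshape(s.shape[0] // 2, 2, s.shape[1] // 2, 2).sum(axis=(1, 3))
    tie = rng.choice([-1, 1], size=b.shape)
    return np.where(b != 0, np.sign(b), tie).astype(int)

def spin_correlation(s, r):
    """Row-direction spin correlation <s(x) s(x+r)> on a periodic lattice."""
    return float(np.mean(s * np.roll(s, r, axis=1)))

L_g = 8                        # fixed renormalized lattice size
for g in (1, 2, 3):            # vary g, hence L_0 = L_g * 2**g
    L0 = L_g * 2 ** g
    s = rng.choice([-1, 1], size=(L0, L0))   # placeholder configuration
    for _ in range(g):
        s = majority_step(s)
    print(g, s.shape, spin_correlation(s, 1))
```

With genuine critical configurations in place of the random ones, comparing these correlation functions across \(g\) at fixed \(L_g\) is precisely the convergence test behind the curves in Fig. 3.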

Fig. 3: Plots of the spin (left) and energy (right) correlation functions for the majority rule map on the critical \(q=2\) Potts (Ising) model, fixing a renormalized lattice size of \(L_g=256\) and allowing \(L_0\) and \(g\) to vary appropriately.

References

  1. K. O. Akamatsu and N. Kawashima, J. Stat. Phys. 191, 109 (2024).