Constructor Theory of Probability
Chiara Marletto1
Department of Materials, University of Oxford
April 2016
Unitary quantum theory, having no Born Rule, is non-probabilistic. Hence the notorious problem of reconciling it with the unpredictability and appearance of stochasticity in quantum measurements. Generalising and improving upon the so-called ‘decision-theoretic approach’ (Deutsch, 1999; Wallace, 2003, 2007, 2012), I shall recast that problem in the recently proposed constructor theory of information – where quantum theory is represented as one of a class of superinformation theories, which are local, non-probabilistic theories conforming to certain constructor-theoretic conditions. I prove that the unpredictability of measurement outcomes (to which I give an exact meaning via constructor theory), necessarily arises in superinformation theories. Then I explain how the appearance of stochasticity in (finitely many) repeated measurements can arise under superinformation theories. And I establish sufficient conditions for a superinformation theory to inform decisions (made under it) as if it were probabilistic, via a Deutsch–Wallace-type argument – thus defining a class of decision-supporting superinformation theories. This broadens the domain of applicability of that argument to cover constructor-theory compliant theories. In addition, in this version some of the argument’s assumptions, previously construed as merely decision-theoretic, follow from physical properties expressed by constructor-theoretic principles.
1 Address for correspondence: chiara.marletto@gmail.com
## 1. Introduction
Quantum theory without the Born Rule (hereinafter: unitary quantum theory) is a deterministic theory (Wallace, 2012), whose viability as a universal physical theory has long been debated (Deutsch, 1985; Albert, 2010; Kent, 2010; Wallace, 2012). A contentious issue is how to reconcile its determinism with the unpredictability and appearance of stochasticity in quantum measurements (Albert, 2010; Saunders, 2010; Kent, 2010). Specifically, two problems emerge: (i) how unpredictability can occur in unitary quantum theory, since, absent the Born rule and ‘collapse’-like processes, single measurements do not deliver single observed outcomes (see section 1.2); and (ii) how unitary quantum theory, despite its being non-probabilistic, can adequately account for the appearance of stochasticity observed in certain repeated measurements (see section 1.3).
Problems (i) and (ii) also arise in the recently proposed constructor theory of information (Deutsch & Marletto, 2015) (see sections 2 and 3). In that context unitary quantum theory is regarded as one of a class of theories, superinformation theories, most of them yet to be discovered, which are elegantly characterised by a simple, exact, constructor-theoretic condition. Specifically, certain physical systems permitted under such theories – called superinformation media – exhibit all the most distinctive properties of quantum systems. Like all theories conforming to the principles (Deutsch, 2013) of constructor theory, superinformation theories are expressed solely via statements about possible and impossible tasks, and are necessarily non-probabilistic. So a task being ‘possible’ in this sense means that it could be performed with arbitrarily high accuracy – not that it will happen with non-zero probability. Just as for unitary quantum theory, therefore, an explanation is required for how superinformation theories could account for unpredictable measurement outcomes and apparently stochastic processes.
To provide this, I shall first give an exact criterion for unpredictability in constructor theory (section 4); then I shall show that unpredictability necessarily arises in superinformation theories (and hence in quantum theory) as a result of the impossibility of cloning certain states – thereby addressing problem (i). Then, I shall generalise and improve upon an existing class of proposed solutions to problem (ii) in quantum theory – known as the decision-theoretic approach (Deutsch, 1999; Wallace, 2003, 2007, 2012) – by recasting them in constructor theory. This will entail expressing a number of physical conditions on superinformation theories for them to support the decision-theoretic approach – thus defining a class of decision-supporting superinformation theories, which include unitary quantum theory (sections 5, 6, 7). As I shall outline in section 1.4, switching to constructor theory widens the domain of applicability of such approaches, to cover potential generalisations of quantum theory that may be discovered amongst decision-supporting superinformation theories; it also clarifies the assumptions on which such approaches are based, by revealing that most of them are not decision-theoretic, as previously thought (Deutsch, 1999; Wallace, 2012), but physical.
1.1 The status of constructor theory
Constructor theory is a proposed fundamental theory of physics (Deutsch, 2013), consisting of principles aiming to underlie other physical theories (such as laws of motion of elementary particles, etc.), called subsidiary theories in this context. Its mode of explanation requires physical laws to be expressed exclusively via statements about which physical transformations (more precisely, tasks – section 2) are possible, which are impossible, and why. This is a radical departure from the prevailing conception of fundamental physics, which instead expresses laws as predictions about what happens, given dynamical equations and boundary conditions in space-time.
Constructor theory is not just a framework (such as, e.g., resource theory (Coecke, Fritz & Spekkens, 2014) or category theory (Abramsky & Coecke, 2008)) for reformulating existing theories: its principles are proposed physical laws. They express regularities among subsidiary theories, including new regularities that the prevailing conception cannot adequately capture. Constructor theory thereby addresses some of those theories' unsolved problems. Its principles supplement subsidiary theories, illuminating their underlying meaning and informing the development of successors, just as, for instance, the principle of energy conservation does.
In this work I appeal to the principles of the constructor theory of information (Deutsch & Marletto, 2015). They express the regularities in physical laws that are implicitly required by theories of information (e.g. Shannon's), via exact statements about possible and impossible tasks, thus giving a fully physical meaning to the hitherto fuzzily defined notion of information. Notions such as measurement and distinguishability, which are notoriously problematic to express in quantum theory yet constitute the foundation of the decision-theoretic approach, can be exactly expressed in constructor theory.
1.2 Unpredictability
Unpredictability is a property that must be clearly distinguished from randomness. The distinction is difficult to pin down in quantum theory, especially unitary quantum theory, but can be naturally expressed in the more general context of constructor theory. Unpredictability occurs in quantum systems even given perfect knowledge of dynamical laws and initial conditions2. When a perfect measurer of a quantum observable $\hat{X}$ – say the $x$-component of the spin of a spin-$\frac{1}{2}$ particle – is presented with the particle in a superposition (or mixture) of eigenvectors of $\hat{X}$, say $|0\rangle$ and $|1\rangle$, it is impossible to predict reliably which outcome (0 or 1) will be observed. But in unitary quantum theory, a perfect measurement of $\hat{X}$ is merely a unitary transformation on the source $S_a$ (the system to be measured) and the target $S_b$ (the 'tape' of the measurer):
$$|0\rangle_a|0\rangle_b \rightarrow |0\rangle_a|0\rangle_b, \qquad |1\rangle_a|0\rangle_b \rightarrow |1\rangle_a|1\rangle_b,$$
which implies, given the linearity of quantum theory,
$$(\alpha|0\rangle_a + \beta|1\rangle_a)|0\rangle_b \rightarrow \alpha|0\rangle_a|0\rangle_b + \beta|1\rangle_a|1\rangle_b$$
for arbitrary complex amplitudes $\alpha, \beta$. Since no wave-function collapse is assumed, there is in reality no single “observed outcome” when the superposition is input. All possible outcomes occur simultaneously: in what sense are they unpredictable?
2 Thus it is sharply distinct from phenomena such as classical chaos.
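The linearity step can be checked numerically. The sketch below is only an illustration under stated assumptions: the perfect measurer is modelled as a CNOT gate with the source $S_a$ as control and the target $S_b$ prepared in $|0\rangle$, and the amplitudes, helper names (`kron`, `apply`, `measure`) and basis ordering are hypothetical choices, not taken from the paper.

```python
# Basis order for the pair (S_a, S_b): |00>, |01>, |10>, |11>.
# Assumed model of the perfect measurer: a CNOT with S_a as control.
U = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

def kron(u, v):
    """Tensor product of two state vectors given as lists."""
    return [a * b for a in u for b in v]

def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def measure(source):
    """Run the measurer on `source` with the target prepared in |0>."""
    return apply(U, kron(source, [1, 0]))

alpha, beta = 0.6, 0.8j        # hypothetical amplitudes
out = measure([alpha, beta])   # source in alpha|0> + beta|1>

# Both the |00> and the |11> components are non-zero: the unitary
# evolution alone does not single out one "observed outcome".
print(out)
```

On the two eigenstates the target simply copies the value of the source; on the superposition the output keeps both terms, which is the situation the question at the end of the paragraph refers to.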
Additional explanations are needed – e.g. Everett’s (1957) is that the observer differentiates into multiple instances, each observing a different outcome, whence the impossibility of predicting which one (Saunders, 2010; Wallace, 2012). Such accounts, however, can only ever be approximate in quantum theory, as they rely on emergent notions such as observed outcomes and ‘universes’ (Wallace, 2012). Also, unpredictability is a counterfactual property: it is not about what will happen, but what cannot be made to happen. So, while the prevailing conception struggles to accommodate it, constructor theory does so naturally. Just as the impossibility of cloning a set of non-orthogonal quantum states is an exact property (Wootters & Zurek, 1982), in this paper I shall express unpredictability exactly as a consequence of the impossibility of cloning certain sets of states under superinformation theories (section 4). My exact, qualitative characterisation of unpredictability distinguishes it from (apparent) randomness, which, as I shall explain, requires a quantitative explanation.
1.3 The appearance of stochasticity
Another key finding of this paper is a sufficient set of conditions for superinformation theories to support a generalisation of the decision-theory approach to probability (Deutsch, 1999; Wallace, 2012), thereby explaining the appearance of stochasticity. This is the property that repeated identical measurements not only have different unpredictable outcomes, but are also, to all appearances, random. Specifically, consider the frequencies of each observed outcome3 $x$ in multiple measurements of a quantum observable $\hat{X}$ on $N$ systems each prepared in a superposition or mixture $\rho$ of $\hat{X}$-eigenstates $|x\rangle$. The appearance of stochasticity is that for sufficiently large $N$ the frequencies do not differ significantly (according to some a-priori fixed statistical test) from the numbers $\text{Tr}\{\rho|x\rangle\langle x|\}$ (and equality occurs in the limiting case of an ensemble (see section 6)).
3 In the relative-state sense (Everett, 1957).
To account for this, the Born Rule states that the probability that $x$ is the outcome of any individual $\hat{X}$-measurement is $\text{Tr}\{\rho|x\rangle\langle x|\}$, thus linking, by fiat, $\text{Tr}\{\rho|x\rangle\langle x|\}$ with the frequencies in finite sequences of experiments. In unitary quantum theory no such link appears, prima facie, to exist, since all possible outcomes occur in reality. How can that theory be used to form an expectation about finite sequences of experiments, as its Born-rule-endowed counterpart can?
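For concreteness, the numbers $\text{Tr}\{\rho|x\rangle\langle x|\}$ can be evaluated for a single qubit. The following is a minimal sketch, assuming a pure state with hypothetical amplitudes and projectors written in the eigenbasis of $\hat{X}$; the helper names are illustrative. It only computes the numbers and says nothing about why they should inform expectations, which is the question at issue.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

def projector(x, dim=2):
    """|x><x| in the eigenbasis of the measured observable."""
    return [[1 if i == j == x else 0 for j in range(dim)] for i in range(dim)]

def density(amplitudes):
    """Pure-state density operator rho = |psi><psi|."""
    return [[a * b.conjugate() for b in amplitudes] for a in amplitudes]

rho = density([0.6, 0.8j])  # hypothetical state 0.6|0> + 0.8i|1>
weights = [trace(matmul(rho, projector(x))).real for x in (0, 1)]
print(weights)  # approximately [0.36, 0.64]
```

Under the Born Rule these two numbers would be read as outcome probabilities; in unitary quantum theory they are, so far, just functions of the state.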
The decision-theoretic approach claims to solve that problem (Deutsch, 1999; Wallace, 2003, 2007, 2012; Greaves & Myrvold, 2010; Saunders, 2004, 2005). It models measurements as deterministic games of chance: $\hat{X}$ is measured on a superposition or mixture $\rho$ of $\hat{X}$-eigenstates; the reward is equal (in some currency) to the observed outcome. Thus the above problem is recast as that of how unitary quantum theory can inform decisions of a hypothetical rational player of that game, satisfying only non-probabilistic axioms of rationality. The decision-theory argument shows that the player, knowing unitary quantum theory (with no Born Rule) and the state $\rho$, reaches the same decision, in situations where the Born Rule would apply, as if they were informed by a stochastic theory with Born-Rule probabilities $\text{Tr}\{\rho|x\rangle\langle x|\}$. This explains how $\text{Tr}\{\rho|x\rangle\langle x|\}$ can inform expectations in single measurements under unitary quantum theory. One must additionally prove, from this, that unitary quantum theory is as testable as its Born-rule-endowed counterpart (Wallace, 2012; Greaves & Myrvold, 2010; Deutsch, 2015).
Thus the decision-theoretic approach claims to explain the appearance of stochasticity in unitary quantum theory without invoking stochastic laws, rather as Darwin's theory of evolution explains the appearance of design in biological adaptations without invoking a designer. It has been challenged, especially in regard to testability (Kent, 2010; Dawid & Thébault, 2014), and defended in e.g. (Wallace, 2010; Greaves & Myrvold, 2010; Deutsch, 2015). Note that this work is not a defence of that approach; rather, it aims at clarifying and illuminating its assumptions (showing that most of them are physical), and at broadening its domain of applicability to more general theories than quantum theory. However, a possible application of this result may be in investigating the physical meaning of the decision-theory argument in the context of the Everett interpretation.
In my generalised version of the decision-theoretic approach, I shall define a game of chance under superinformation theories (section 7) and then identify a sufficient set of conditions for them to support decisions (under that approach) in the presence of unpredictability (section 6). These conditions define the class of decision-supporting superinformation theories (including unitary quantum theory). Specifically, they include conditions for superinformation theories to support the generalisation $f_x$ of the numbers $\text{Tr}\{\rho|x\rangle\langle x|\}$ (section 5) corresponding to Born-rule probabilities. That is to say, my version of the decision-theory argument explains how the numbers $f_x$ can inform decisions of a player satisfying non-probabilistic rationality axioms under decision-supporting superinformation theories (section 7). Thus, decision-supporting superinformation theories would account for the appearance of stochasticity at least as adequately as unitary quantum theory.
1.4 Summary of the main results
Switching to constructor theory yields three interrelated results:
- The unpredictability of measurements in superinformation theories is exactly distinguished from the appearance of stochasticity, and proved to follow from the constructor-theory generalisation of the quantum no-cloning theorem (section 4).
- A sufficient set of conditions for superinformation theories to support the decision-theoretic argument (see sections 5, 6, 7) is provided, defining a class of decision-supporting superinformation theories, including unitary quantum theory. Constructor theory emancipates the argument from formalisms and concepts specific to (Everettian) quantum theory – such as ‘observed outcomes’ or ‘relative states’.
- Most premises of the decision-theory argument are no longer controversial decision-theoretic axioms, as in existing formulations, but follow from physical properties implied by exact principles of constructor theory.

In sections 2 and 3 I summarise as much of constructor theory as is needed; in section 4 I present the criterion for unpredictability; in sections 5 and 6 I give the condition for superinformation theories to permit the constructor-theoretic generalisation of the numbers $f_x = \text{Tr}\{\rho|x\rangle\langle x|\}$; in section 7, I present the decision-theory argument in constructor theory.
2. Constructor Theory
In constructor-theoretic physics the primitive notion of a ‘physical system’ is replaced by the slightly different notion of a substrate – a physical system some of whose properties can be changed by a physical transformation. Constructor theory’s primitive elements are tasks (as defined below), which intuitively can be thought of as the specifications of physical transformations affecting substrates. Since tasks involving only individual states are rarely fundamental, more general descriptors for substrates are convenient:
Attributes and variables. The subsidiary theory must provide a collection of states, attributes and variables, for any given substrate. These are physical properties of the substrate, and can be represented in several interrelated ways. For example, a traffic light is a substrate, each of whose 8 states (of three lamps, each of which can be on or off) is labelled by a binary string $(\sigma_r, \sigma_a, \sigma_g): \sigma_i \in \{0, 1\}, \forall i \in \{r, a, g\}$, where, say, $\sigma_r = 0$ indicates the red lamp is off, and $\sigma_r = 1$ that it is on. Similarly for $i = a$ (amber) and $i = g$ (green). Thus for instance the state where the red lamp is on and the others switched off is $(1, 0, 0)$.
An attribute is any property of a substrate that can be formally defined as a set of all the states in which the substrate has that property. So for example the attribute red of the traffic light, denoted by $r$, is the set of all states in which the red lamp is on: $r = \{(1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)\}$.
An intrinsic attribute is one that can be specified without referring to any other specific system. For example, ‘having the same colour lamp on’ is an intrinsic attribute of a pair of traffic lights, but ‘having the same colour lamp on as the other one in the pair’ is not an intrinsic attribute of either of them. In quantum theory, ‘being entangled with each other’ is, likewise, an intrinsic attribute of a qubit pair; ‘having a particular density operator’ is an intrinsic attribute of a qubit, while the rest of its quantum state describes entanglement with other systems and so is non-intrinsic.
A physical variable is defined in a slightly unfamiliar way as any set of disjoint attributes of the same substrate. In quantum theory, this includes not only all observables, but many other constructs, such as any set $\{x, y\}$ where $x$ and $y$ are the attributes of being in distinct non-orthogonal states $|x\rangle$ and $|y\rangle$ of a quantum system. Whenever a substrate is in a state in an attribute $x \in X$, where $X$ is a variable, we say that $X$ is sharp (on that system), with the value $x$ – where the $x$ are members of the set $X$ of labels4 of the attributes in $X$. As a shorthand, “$X$ is sharp in $a$” shall mean that the attribute $a$ is a subset of some attribute in $X$. In the case of the traffic light, ‘whether some lamp is on’ is the variable $P = \{\text{off}, \text{on}\}$, where I have introduced the attributes $\text{off} = \{(0,0,0)\}$ and $\text{on}$, which contains all the states where at least one lamp is on. So, when the traffic light is, say, in the state $(1,0,0)$ where only the red lamp is on, we say that “$P$ is sharp with value $\text{on}$”. Also, we say that $P$ is sharp in the attribute $r$ (red, defined above), with value $\text{on}$ – which means that $r \subseteq \text{on}$. In quantum theory, the $z$-component-of-spin variable of a spin-$\frac{1}{2}$ particle is the set of two attributes: that of the $z$-component of the spin being $\frac{1}{2}$, and $-\frac{1}{2}$. That variable is sharp when the qubit is in a pure state with spin $\frac{1}{2}$ or $-\frac{1}{2}$ in the $z$-direction, and is non-sharp otherwise.
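The traffic-light definitions translate directly into sets. The sketch below is an illustration only: the names `STATES`, `is_variable` and `sharp_value` are hypothetical, and attributes are represented as Python sets of state tuples $(\sigma_r, \sigma_a, \sigma_g)$.

```python
from itertools import product

# The 8 states (sigma_r, sigma_a, sigma_g) of the traffic light.
STATES = set(product((0, 1), repeat=3))

# Attributes are sets of states.
r = {s for s in STATES if s[0] == 1}   # red lamp on
off = {(0, 0, 0)}                      # no lamp on
on = STATES - off                      # at least one lamp on

# A variable is a set of disjoint attributes, keyed here by label.
P = {"off": off, "on": on}

def is_variable(variable):
    """Check that the attributes are pairwise disjoint."""
    attrs = list(variable.values())
    return all(a.isdisjoint(b)
               for i, a in enumerate(attrs) for b in attrs[i + 1:])

def sharp_value(variable, attribute):
    """Label of the attribute of `variable` that contains `attribute`,
    or None if the variable is not sharp in it."""
    for label, attr in variable.items():
        if attribute <= attr:
            return label
    return None

print(sharp_value(P, {(1, 0, 0)}))  # on
print(sharp_value(P, r))            # on, since r is a subset of on
print(sharp_value(P, STATES))       # None: P is not sharp here
```

The pair of attributes ‘red lamp on’ and ‘amber lamp on’ would fail `is_variable`, because the state $(1,1,0)$ belongs to both.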
4 I shall always define symbols explicitly in their contexts, but for added clarity I use the convention: Small Greek letters (γράμματα) denote states; small italic boldface denotes attributes; CAPITAL ITALIC BOLDFACE denotes variables; small italic denotes labels; CAPITAL ITALIC denotes sets of labels; CAPITAL BOLDFACE denotes physical systems; and capital letters with arrow above (e.g. $\vec{C}$) denote constructors.
Tasks. A task is the abstract specification of a physical transformation on a substrate, which is transformed from having some physical attribute to having another. It is expressed as a set of ordered pairs of input/output attributes $x_i \rightarrow y_i$ of the substrates. I shall represent it as5:
$$\mathfrak{A} = \{x_1 \rightarrow y_1, x_2 \rightarrow y_2, \ldots\}.$$
The $\{x_i\}$ are the legitimate input attributes, the $\{y_i\}$ are the output attributes. A constructor for the task $\mathfrak{A}$ is defined as a physical system that would cause $\mathfrak{A}$ to occur on the substrates and would remain unchanged in its ability to cause that again. Schematically:
$$\text{Input attribute of substrates} \xrightarrow{\text{Constructor}} \text{Output attribute of substrates}$$
where constructor and substrates jointly are isolated. This scheme draws upon two primitive notions that must be given physical meanings by the subsidiary theories, namely: the substrates with the input attribute are presented to the constructor, which delivers the substrates with the output attribute. A constructor is capable of performing $\mathfrak{A}$ if, whenever presented with the substrates with a legitimate input attribute of $\mathfrak{A}$ (i.e., in any state in that attribute) it delivers them in some state in one of the corresponding output attributes, regardless of how it acts on the substrate with any other attribute. A task on the traffic light substrate is $\{\text{on} \rightarrow \text{off}\}$; and a constructor for it is a device that must switch off all its lamps whenever presented with any of the states in on. In the case of the task $\{\text{off} \rightarrow \text{on}\}$ it is enough that, when the traffic light as a whole is switched off (in the state $(0,0,0)$), it delivers some state in the attribute on – say by switching on the red lamp only, delivering the state $(1,0,0)$ – not necessarily all of them.
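The capability condition for the two traffic-light tasks can be sketched as a check over states. This is a toy model under explicit assumptions: a constructor is stood in for by a deterministic state-to-state Python function (so that it trivially ‘remains unchanged’), and the names `performs`, `all_off` and `red_on_if_dark` are hypothetical.

```python
off = frozenset({(0, 0, 0)})
on = frozenset((r, a, g)
               for r in (0, 1) for a in (0, 1) for g in (0, 1)) - off

# A task is a set of ordered pairs (input attribute, output attribute).
switch_off = {(on, off)}
switch_on = {(off, on)}

def performs(device, task):
    """True if `device` maps every state of each legitimate input
    attribute to some state in the corresponding output attribute.
    Its behaviour on any other state is not examined."""
    return all(device(s) in out for inp, out in task for s in inp)

def all_off(state):
    """Switch every lamp off, whatever the input state."""
    return (0, 0, 0)

def red_on_if_dark(state):
    """Switch the red lamp on if all lamps are off; otherwise do nothing."""
    return (1, 0, 0) if state == (0, 0, 0) else state

print(performs(all_off, switch_off))         # True
print(performs(red_on_if_dark, switch_on))   # True
print(performs(red_on_if_dark, switch_off))  # False
```

Note that `red_on_if_dark` delivers only one state of the attribute on, which is all the task $\{\text{off} \rightarrow \text{on}\}$ requires.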
5 This ‘$\rightarrow$’ notation for ordered pairs is intended to bring out the notion of transformation inherent in a task.
The fundamental principle. A task $\mathfrak{T}$ is impossible if there is a law of physics that forbids its being carried out with arbitrary accuracy and reliability by a constructor. Otherwise, $\mathfrak{T}$ is possible, which I shall denote by $\mathfrak{T}'$. This means that a constructor capable of performing $\mathfrak{T}$ can be physically realised with arbitrary accuracy and reliability (short of perfection). Catalysts and computers are familiar examples of approximations to constructors. So, ‘$\mathfrak{T}$ is possible’ means that it can be brought about with arbitrary accuracy, but it does not imply that it will happen, since it does not imply that a constructor for it will ever be built and presented with the right substrate. Conversely, a prediction that $\mathfrak{T}$ will happen with some probability would not imply $\mathfrak{T}$’s possibility: that ‘rolling a seven’ sometimes happens when shooting dice does not imply that the task ‘roll a seven under the rules of that game’ can be performed with arbitrarily high accuracy.
Non-probabilistic, counterfactual properties – i.e. about what does not happen, but could – are the centrepiece of constructor theory’s mode of explanation, as expressed by its fundamental principle:
- I. All (other) laws of physics are expressible solely in terms of statements about which tasks are possible, which are impossible, and why.
The radically different mode of explanation employed by this principle permits the formulation of new laws of physics (e.g. constructor information theory’s ones). Thus constructor theory differs in motivation and content from existing operational frameworks, such as resource theory (Coecke, Fritz & Spekkens, 2014). The latter aims at proving theorems following from subsidiary theories, allowing their formal properties to be expressed in a unified resource-theoretic formalism. Constructor theory, in contrast, proposes new principles, not derivable from subsidiary theories, to supplement them, elucidate their physical meaning, and impose severe restrictions ruling out some of them.
As remarked, a constructor is closely related to the notion of a chemical catalyst, as recently formalised, e.g. in resource theory (Fritz, 2015). A constructor is distinguished among generic catalyst-type objects in that it is required to be capable of performing the task reliably, repeatedly and to arbitrarily high accuracy. So it is not itself a physical object, but a manner of speaking about an infinite sequence of possible physical objects that would perform the task approximately.
Hence principle I requires subsidiary theories to have two crucial properties (holding in unitary quantum theory): (i) They must support a topology over the set of physical processes they apply to, which gives a meaning to a sequence of approximate constructions, converging to an exact performance of $\mathfrak{T}$; (ii) They must be non-probabilistic – since they must be expressed exclusively as statements about possible/impossible tasks. For instance, the Born-Rule-endowed versions of quantum theory, being probabilistic, do not obey the principle.
Principle of Locality. I shall denote the combination of two substrates $\mathbf{S}_1$ and $\mathbf{S}_2$ by $\mathbf{S}_1 \oplus \mathbf{S}_2$ . Constructor theory requires all subsidiary theories to provide the following support for the concept of such a combination. First, $\mathbf{S}_1 \oplus \mathbf{S}_2$ is a substrate. Second, if subsidiary theories designate any task as possible which has $\mathbf{S}_1 \oplus \mathbf{S}_2$ as input substrate, they must provide a meaning for presenting $\mathbf{S}_1$ and $\mathbf{S}_2$ to the relevant constructor as the substrate $\mathbf{S}_1 \oplus \mathbf{S}_2$ . Third, and most importantly, they must conform to Einstein's (1949) principle of locality in the form:
II. There exists a mode of description such that the state of $\mathbf{S}_1 \oplus \mathbf{S}_2$ is the pair $(\xi, \zeta)$ of the states6 $\xi$ of $\mathbf{S}_1$ and $\zeta$ of $\mathbf{S}_2$ , and any construction undergone by $\mathbf{S}_1$ and not $\mathbf{S}_2$ can change only $\xi$ and not $\zeta$ .
Unitary quantum theory satisfies II, as is explicit in the Heisenberg picture (Deutsch & Hayden, 2000; Raymond-Robichaud & Brassard 2016). In that picture, the state of a quantum system is, at any one time, a minimal set of generators for the algebra of observables of that system, plus the Heisenberg state (Horsman & Vedral, 2007; Gottesman, 1999). Since the latter never changes, it can be abstracted away when specifying tasks: any residual 'non-locality' in that state (Wallace & Timpson, 2007) does not prevent quantum theory from satisfying principle II.
6 In which case the same must hold for intrinsic attributes.
The parallel composition $\mathfrak{A} \otimes \mathfrak{B}$ of two tasks $\mathfrak{A}$ and $\mathfrak{B}$ is the task whose net effect on a substrate $\mathbf{M} \oplus \mathbf{N}$ is that of performing $\mathfrak{A}$ on $\mathbf{M}$ and $\mathfrak{B}$ on $\mathbf{N}$. When $\mathfrak{A} \otimes \mathfrak{T}$ is possible for some task $\mathfrak{T}$ on some generic, naturally occurring substrate (as defined in Deutsch & Marletto, 2015), $\mathfrak{A}$ is possible with side-effects, which is written $\mathfrak{A}^{\mathfrak{T}}$. ($\mathfrak{T}$ represents the side-effect.)
3. Constructor theory of information
I shall now summarise the principles of the constructor theory of information (Deutsch & Marletto, 2015). The principles express exactly the properties required of physical laws by theories of (classical) information, computation and communication – such as the possibility of copying – as well as the exact relation between what has been called informally ‘quantum information’ and ‘classical information’.7
First, one defines computation media.8 A computation medium with computation variable $V$ (at least two of whose attributes have labels in a set $V$) is a substrate on which the task $\Pi(V)$ of performing a permutation $\Pi$ defined via the labels $V$,
$$\Pi(V) \doteq \bigcup_{x \in V} \{x \rightarrow \Pi(x)\},$$
is possible (with or without side effects), for all $\Pi$. $\Pi(V)$ is a reversible computation.9
7 Thus, even though it is not itself an attempt to axiomatise quantum theory, it could provide physical foundations for information-based axiomatisations of quantum theory (e.g. Clifton, Bub & Halvorson, 2003; Chiribella, D’Ariano & Perinotti, 2011), wherein ‘information’ is merely assumed to be a primitive, never explained.
8 This is just a label for the physical systems with the given definition. Crucially, it entails no reliance on any a-priori notion of computation (such as Turing-computability).
9 This is a logically reversible (i.e., one-to-one) task, while the process implementing it may be physically irreversible, because of side-effects.
Information media are computation media on which additional tasks are possible. Specifically, a variable $X$ is clonable if for some attribute $x_0$ of $\mathbf{S}$ the computation on the composite system $\mathbf{S} \oplus \mathbf{S}$
$$\bigcup_{x \in X} \{(x, x_0) \rightarrow (x, x)\}, \qquad (1)$$
namely cloning $X$, is possible (with or without side-effects)10. An information medium is a substrate with at least one clonable computation variable, called an information variable (whose attributes are called information attributes). For instance, a qubit is a computation medium with any set of two pure states, even if they are not orthogonal (Deutsch & Marletto, 2015); with a set of two orthogonal states it is an information medium. Information media must also obey the principles of constructor information theory, which I shall now recall.
Interoperability. Let $X_1$ and $X_2$ be variables of substrates $S_1$ and $S_2$ respectively, and $X_1 \times X_2$ be the variable of the composite substrate $S_1 \oplus S_2$ whose attributes are labelled by the ordered pair $(x, x') \in X_1 \times X_2$ , where $X_1$ and $X_2$ are the sets of labels of $X_1$ and $X_2$ respectively, and $\times$ denotes the Cartesian product of sets. The interoperability principle is elegantly expressed as a constraint on the composite system of information media (and on their information variables):
III. The combination of two information media with information variables $X_1$ and $X_2$ is an information medium with information variable $X_1 \times X_2$ .
10 The usual notion of cloning, as in the no-cloning theorem (Wootters & Zurek, 1982), is (1) with $X$ as the set of all attributes of $S$.
Distinguishing and measuring are expressed exactly in constructor theory as tasks involving information variables – without reference to any a priori notion of information. A variable $X$ of a substrate $S$ is distinguishable if the task
$$\bigcup_{x \in X} \{x \rightarrow i_x\} \qquad (2)$$
is possible (with or without side-effects), where $\{i_x\}$ is an information variable (whereby $i_x \cap i_{x'} = \emptyset$ if $x \neq x'$). I write $x \perp y$ if $\{x, y\}$ is a distinguishable variable. Information variables themselves are distinguishable, by the interoperability principle III.11
A variable $X$ is measurable if a special case of the distinguishing task (2) is possible (with or without side-effects) – namely, when the original source substrate continues to exist12 in some attribute $y_x$ and the result is stored in a target substrate:
$$\bigcup_{x \in X} \{(x, x_0) \rightarrow (y_x, \text{‘}x\text{’})\}, \qquad (3)$$
where $x_0$ is a generic, ‘receptive’ attribute and ‘$X$’ $= \{\text{‘}x\text{’} : x \in X\}$ is an information variable of the target substrate, which I shall call the output variable (which may, but need not, contain $x_0$). When $X$ is sharp on the source with any value $x$, the target is changed to having the information attribute ‘$x$’, meaning ‘$\langle S \text{ had attribute } x \rangle$’.
A measurer of $X$ is any constructor capable of performing the task (3) for some choice of its output variable, labelling, and receptive state.13 Thus, it is also a measurer of other variables: for example, it measures any subset of $X$, or any coarsening of $X$ (a variable whose members are unions of attributes in $X$). Two notable coarsenings of $X_1 \times X_2$ are: $X_1 + X_2$, where the attributes $(x_1, x_2)$ are re-labelled with numbers $x_1 + x_2$ (and combined accordingly), and $X_1 X_2$, where the attributes $(x_1, x_2)$ are re-labelled with numbers $x_1 x_2$ (and likewise combined). I shall consider only non-perturbing measurements, i.e., $y_x \subseteq x$ in (3). Whenever the output variable is guaranteed to be sharp with a value ‘$x$’, I shall say, with a slight abuse of terminology, that the measurer of $X$ “delivers a sharp output ‘$x$’”.
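The two coarsenings can be made concrete by grouping label pairs. The sketch below is illustrative only: the function `coarsen` and the two-valued label sets are hypothetical, and each attribute of $X_1 \times X_2$ is represented by its label pair rather than by the underlying set of states.

```python
from itertools import product

def coarsen(labels1, labels2, relabel):
    """Group the attributes (x1, x2) of X1 x X2 by relabel(x1, x2).

    Each key labels one attribute of the coarsened variable; its value
    is the set of product labels whose attributes are merged into it."""
    merged = {}
    for x1, x2 in product(labels1, labels2):
        merged.setdefault(relabel(x1, x2), set()).add((x1, x2))
    return merged

X1 = X2 = (0, 1)  # hypothetical label sets
total = coarsen(X1, X2, lambda a, b: a + b)      # X1 + X2
multiple = coarsen(X1, X2, lambda a, b: a * b)   # X1 X2

print(total)     # {0: {(0, 0)}, 1: two pairs, 2: {(1, 1)}}
print(multiple)  # {0: three pairs, 1: {(1, 1)}}
```

A measurer of $X_1 \times X_2$ followed by the relabelling is then a measurer of the coarsened variable: it reports, for instance, the sum 1 without revealing which of $(0,1)$ and $(1,0)$ was the case.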
11 The set of any two non-orthogonal quantum states is not a distinguishable variable. In quantum theory two such states on ensembles are asymptotically distinguishable. This is generalised in constructor theory via the ensemble-distinguishability principle, which, in short, requires any two disjoint information attributes to be ensemble-distinguishable (Deutsch & Marletto, 2015).
12 Quantum observables that are usually measured destructively, e.g. polarisation, qualify as measurable in this sense, provided that measuring them non-destructively is possible in principle.
13 This differs from laboratory practice, where a measurer of $X$ is assigned some convenient, fixed labelling of its output states.
The ‘bar’ operation. Given an information attribute $x$, define the attribute $\bar{x}$ (‘$x$-bar’) as the union $\bigcup_{a : a \perp x} a$ of all attributes that are distinguishable from $x$. With this useful tool one can construct a Boolean information variable, defined as $\{x, \bar{x}\}$ (which, as explained below, allows one to generalise quantum projectors). Also, for any variable $X$, define the attribute $\mathbf{u}_X \doteq \bigcup_{x \in X} x$. The attribute $\bar{\bar{\mathbf{u}}}_X$ is the constructor-theoretic generalisation of the quantum notion of the subspace spanned by a set of states. For example, consider an information variable $X = \{0, 1\}$ where $0$ and $1$ are the attributes of being in particular eigenstates of a non-degenerate quantum observable $\hat{X}$ (which also has other eigenstates). Then, $\bar{\bar{\mathbf{u}}}_X$ is the attribute of being in any of the possible superpositions and mixtures (prepared by any possible preparation14) of those two eigenstates of $\hat{X}$.
Consistency of measurement. In quantum theory repeated measurements of physical properties are consistent in the following sense. Consider the variable $X = \{0, 1\}$ defined above. Let $2$ be the attribute of being in a particular eigenstate of $\hat{X}$ orthogonal to both $0$ and $1$. All measurers of the variable $Z = \{\mathbf{u}_X, 2\}$ are then also measurers of the variable $Z' = \{\bar{\bar{\mathbf{u}}}_X, 2\}$, so that all measurers of the former, when given any attribute $a \subseteq \bar{\bar{\mathbf{u}}}_X$, will give the same sharp output ‘$\mathbf{u}_X$’. The principle of consistency of measurement requires all subsidiary theories to have this property:
IV. Whenever a measurer of a variable $Z$ would deliver a sharp output when presented with an attribute $a \subseteq \bar{\bar{\mathbf{u}}}_Z$ , all other measurers of $Z$ would too.
It follows (Deutsch & Marletto 2015) that they would all deliver the same sharp output.
Observables. Since (from the definition of ‘bar’) $\bar{\bar{\bar{x}}} \equiv \bar{x}$, attributes $x$ with $x = \bar{\bar{x}}$ have a useful property, whence the following constructor-theoretic generalisation of quantum information observables: An information observable $X$ is an information variable such that whenever a measurer of $X$ delivers a sharp output ‘$x$’ the input substrate really has the attribute $x$.15 A necessary and sufficient condition for a variable to be an observable is that $x = \bar{\bar{x}}$ for all its attributes $x$ (Deutsch & Marletto, 2015). For example, the above-defined variable $Z = \{\mathbf{u}_X, 2\}$ is not an observable (a $Z$-measurer delivers a sharp output ‘$\mathbf{u}_X$’ even when presented with a state $\xi \in \bar{\bar{\mathbf{u}}}_X \setminus \mathbf{u}_X$, where ‘$\setminus$’ denotes set exclusion), but $\{\bar{\bar{\mathbf{u}}}_X, 2\}$ is.
14 In any local formalism for quantum theory, such as the Heisenberg picture, there are many states, differently prepared, with different local descriptors, that have the same density matrix.
Superinformation media. A superinformation medium $S$ is an information medium with at least two information observables, $X$ and $Y$ , that contain only mutually disjoint attributes and whose union is not an information observable. $Y$ and $X$ are called complementary observables. For example, in quantum physics any set of two orthogonal qubit states constitutes an information observable, but no union of two or more such sets does: its members are not all distinguishable. Superinformation theories are subsidiary theories obeying constructor theory and permitting superinformation media.
From that simple property it follows that superinformation media exhibit all the most distinctive properties of quantum systems (Deutsch & Marletto, 2015). In particular, the attributes $y$ in $Y$ are the constructor-theoretic generalisations of what in quantum theory is called “being in a superposition or mixture” of states in the complementary observable $X$ .
Generalised mixtures. Consider an attribute $y \in Y$ and define the observable $X_y \doteq \{x \in X : x \not\perp y\}$. (In quantum theory $X$ could be the photon-number observable in some cavity, $|1\rangle\langle 1| + 2|2\rangle\langle 2| + \dots$, and $y$ the attribute of being in some superposition or mixture of some of its eigenstates, e.g. $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. In that case $X_y$ would contain two attributes, namely those of being in the states $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$ respectively.) One proves (Deutsch & Marletto, 2015) that:
15 “Delivering a sharp output ‘$x$’” means that, according to the subsidiary theory, the output variable ‘$X$’ will be, objectively, sharp with value ‘$x$’; not (say) that the outcome ‘$x$’ is ‘observed in some universe’.
1) $X$ is non-sharp in $y$, since $x \cap y = \emptyset$, $\forall x \in X$ (where ‘$\emptyset$’ denotes the empty set), and $X_y$ contains at least two attributes.
2) Some coarsenings of $X$ are sharp in $y$, just as in quantum theory – where the state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ is in the $+1$-eigenspace of the projector $|0\rangle\langle 0| + |1\rangle\langle 1|$. The observable $\{\bar{\bar{\mathbf{u}}}_{X_y}, \bar{\mathbf{u}}_{X_y}\}$, with $\mathbf{u}_{X_y} = \bigcup_{x \in X_y} x$, is the constructor-theoretic generalisation of such a projector, and it is shown to be sharp in $y$, with value $\bar{\bar{\mathbf{u}}}_{X_y}$. Like in quantum theory, any measurer of $X$ presented with $y$, followed by a computation whose output is «whether the outcome was one of the ‘$x$’ with $x \not\perp y$», will provide a sharp output ‘$\bar{\bar{\mathbf{u}}}_{X_y}$’, corresponding to «yes». (Here and throughout, I adopt the convention that a ‘quoted’ attribute is the one that would be delivered by a measurement of the unquoted one, with suitable labelling – and likewise for variables.) In summary:
| Quantum Theory | Constructor Theory |
|---|---|
| is an eigenstate of an observable with | is complementary to |
For any observable $H = \{h_1, \dots, h_n\}$, I call an information attribute $z$ a generalised mixture of (the attributes in) $H$ if either $z$ is in $H$ (then it is a trivial mixture) or $(\forall h_i \in H)(z \cap h_i = \emptyset \ \& \ z \not\perp h_i)$ and $\{\bar{\bar{\mathbf{u}}}_H, \bar{\mathbf{u}}_H\}$ is sharp in $z$ with value $\bar{\bar{\mathbf{u}}}_H$. (In quantum theory, the $h_i$ could be attributes of being eigenstates of some non-degenerate quantum observable, and a generalised mixture of those attributes would be a quantum superposition or mixture of those eigenstates.)
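The photon-number example can be checked with a minimal sketch in ordinary pure-state quantum theory. It assumes, for illustration only, that a state is stored as a dictionary from eigenvalue labels to amplitudes, and that the quantum counterpart of $x \not\perp y$ is a non-zero overlap $\langle x|y\rangle$; the function names are hypothetical.

```python
import math


def support(state, tol=1e-12):
    """Quantum analogue of X_y: labels x whose eigenstate |x> overlaps the state."""
    return {x for x, amp in state.items() if abs(amp) > tol}


def projector_value(state, labels):
    """Expectation value of the projector onto span{|x> : x in labels}."""
    return sum(abs(amp) ** 2 for x, amp in state.items() if x in labels)


# y = (|0> + |1>)/sqrt(2), written in the photon-number basis.
y = {0: 1 / math.sqrt(2), 1: 1 / math.sqrt(2)}

X_y = support(y)
print(X_y)                        # the labels of the two overlapping eigenstates
print(projector_value(y, X_y))    # equals 1 up to rounding: the projector is sharp in y
print(projector_value(y, {0}))    # strictly between 0 and 1: |0><0| is not sharp in y
```

In this toy picture the state is a non-trivial generalised mixture of the attributes in $X_y$: it coincides with neither eigenstate, overlaps both, and lies in the $+1$-eigenspace of the projector onto their span.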
3) Let $(a_y, b_y)$ be the attribute16 delivered by an $X$-measurer (with substrates $S_a \oplus S_b$), when presented with $y$ (figure 1). (In quantum theory, $(a_y, b_y)$ could be an entangled state resulting from a measurement of $X$, where $y$ was the state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$.) One can show that each of the local descriptors $a_y$ and $b_y$ has the same properties 1) and 2) as $y$ does, as follows:
16 It is a pair of attributes by locality: in quantum theory, for a fixed Heisenberg state, that is a pair of sets of local descriptors, generators of the algebra of observables for $S_a$ and $S_b$ (Gottesman, 1999; Horsman & Vedral, 2010).
Figure 1
Let $'X'_y \doteq \{'x' \in {'X'} : x \in X_y\}$. (i) $'X'_y$ is not sharp in $b_y$. (If it were, with value $'x'$, that would imply, via the property of observables, that $y \subseteq x$, contrary to the defining property that $x \cap y = \emptyset$.) (ii) Also, $b_y \not\perp 'x'$, $\forall\, 'x' \in 'X'_y$ (for if $b_y \perp 'x'$, then $y$ could be distinguished from $x$, contrary to assumption). (iii) $b_y \subseteq \bar{\bar{\mathbf{u}}}_{'X'_y}$. For a measurer of $\{\bar{\bar{\mathbf{u}}}_{'X'_y}, \bar{\mathbf{u}}_{'X'_y}\}$ applied to the target substrate of an $X$-measurer is also a measurer of $\{\bar{\bar{\mathbf{u}}}_{X_y}, \bar{\mathbf{u}}_{X_y}\}$; hence, when presented with $y$, it must deliver a sharp output $'\bar{\bar{\mathbf{u}}}_{X_y}'$. By the property of observables, $\{\bar{\bar{\mathbf{u}}}_{'X'_y}, \bar{\mathbf{u}}_{'X'_y}\}$ must be sharp in $b_y$, with value $\bar{\bar{\mathbf{u}}}_{'X'_y}$. By the same argument, one shows that $X_y$ is not sharp in $a_y$, i.e. $a_y \cap x = \emptyset$; that $a_y \not\perp x$, $\forall x \in X_y$; and that $\{\bar{\bar{\mathbf{u}}}_{X_y}, \bar{\mathbf{u}}_{X_y}\}$ is sharp in $a_y$, with value $\bar{\bar{\mathbf{u}}}_{X_y}$.
Intrinsic parts of attributes. The attributes $a_y$ and $b_y$ are not intrinsic, for each depends on the history of interactions with other systems. (In quantum theory, $S_a$ and $S_b$ are entangled.) However, because of the principle of locality, given an information observable $X$, one can define the $X$-intrinsic part $[a_y]_X$ of the attribute $a_y$ as follows. Consider the attribute $(a_y, b_y)$ prepared by measuring $X$ on system $S_a$ using some particular substrate $S_b$ as the target substrate. In each such preparation, $S_a$ will have the same intrinsic attribute $[a_y]_X$, which I shall call the $X$-intrinsic part of $a_y$, which is therefore the union of all the attributes preparable in that way. The same construction defines the ‘$X$’-intrinsic part $[b_y]_{'X'}$ of $b_y$.
It follows, from the corresponding properties of $a_y$, that $\{\bar{\bar{\mathbf{u}}}_{X_y}, \bar{\mathbf{u}}_{X_y}\}$ is sharp in $[a_y]_X$ with value $\bar{\bar{\mathbf{u}}}_{X_y}$; that $[a_y]_X \cap x = \emptyset$; and that $[a_y]_X \not\perp x$, $\forall x \in X_y$. Similarly for the ‘quoted’ variables and attributes. In quantum theory, $[a_y]_X$ and $[b_y]_{'X'}$ are attributes of having the reduced density matrices on $S_a$ and $S_b$. Unlike in (Zurek, 2005), they are not given any probabilistic interpretation, but only used as local descriptors of locally accessible information (defined deterministically in constructor theory (Deutsch & Marletto, 2015)).
Successive measurements. In unitary quantum theory the consistency of measurement (see above) is the feature that when successive measurers of $\hat{X}$ are applied to the same source initially in the state $(\alpha|0\rangle + \beta|1\rangle)$ , with two systems $S_b$ and $S_{b'}$ as targets:
the projector for «whether the two target substrates hold the same value» is sharp with value 1. In constructor theory the generalisation of that property is required to hold. Define a useful device, the $X$ -comparer17 $\vec{C}_X$ . It is a constructor for the task of comparing two instances of a substrate in regard to an observable $X$ defined on each:
where $c_{x, x'} = \text{«yes»}$ if $x = x'$ (i.e. if the first two substrates (sources) hold attributes with the same label) and $c_{x, x'} = \text{«no»}$ otherwise. $\{\text{«yes»}, \text{«no»}\}$ is an information observable of a third substrate (target). In quantum theory, if $\hat{X}$ has eigenstates $\{|x\rangle\}$, $\vec{C}_X$ is realised by a unitary that delivers «yes» (respectively «no») whenever the state of the sources is in $\text{Span}_{x \in X} \{|x\rangle|x\rangle\}$ or a mixture thereof (respectively $\text{Span}_{x, x' \in X : x \neq x'} \{|x\rangle|x'\rangle\}$). The equivalent holds under constructor theory, by the principle of consistency of measurement IV: $\vec{C}_X$ delivers the output «yes» if and only if the sources hold an attribute in $\overline{\overline{\bigcup_{x \in X} \{(x, x)\}}}$. Thus, in particular, if one of the sources has the attribute $x$ then $\vec{C}_X$ is a measurer of $\{\bar{\bar{x}}, \bar{x}\}$, i.e. of whether the other source also has the attribute $x$ (a property used in section 4).
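The quantum realisation of the comparer can be sketched for pure states as follows. The representation is an illustrative assumption: a two-source state is a dictionary from label pairs $(x, x')$ to amplitudes in the product basis $|x\rangle|x'\rangle$, and `comparer_output` is a hypothetical helper that reports which sharp output, if any, the comparer would deliver.

```python
def comparer_output(state, tol=1e-12):
    """Return 'yes', 'no', or None (output not sharp) for a two-source state.

    `state` maps label pairs (x, x') to amplitudes in the basis |x>|x'>.
    'yes' requires support only on Span{|x>|x>}; 'no' requires support
    only on Span{|x>|x'> : x != x'}.
    """
    same = sum(abs(a) ** 2 for (x, xp), a in state.items() if x == xp)
    diff = sum(abs(a) ** 2 for (x, xp), a in state.items() if x != xp)
    if same > tol and diff <= tol:
        return "yes"
    if diff > tol and same <= tol:
        return "no"
    return None


r = 2 ** -0.5
# Targets of two successive measurements on a superposed source:
# (|0>|0> + |1>|1>)/sqrt(2) lies in Span{|x>|x>}.
correlated = {(0, 0): r, (1, 1): r}
mismatched = {(0, 1): 1.0}
unsharp = {(0, 0): r, (0, 1): r}

print(comparer_output(correlated), comparer_output(mismatched), comparer_output(unsharp))
```

The first case is the one relevant to successive measurements: the comparer's output is sharp even though neither source is sharp in $X$.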
17 $X$ is non-bold (denoting a set of labels) in order to stress that its task is not invariant under re-labelling.
The fact that a quantum $\vec{C}_X$ would deliver a sharp «yes» when presented with the target substrates of successive measurements of an observable on the same source, is what makes ‘relative states’ and ‘universes’ meaningful in Everettian quantum theory, because it makes the notion of ‘observed outcome in a universe’ meaningful even when the input variable $X$ of the measurer is not sharp. The same holds in superinformation theories (figure 2).
Figure 2 Consistency of repeated measurements.
## 4. Unpredictability in superinformation media
Having presented the necessary parts of constructor information theory, I can now discuss unpredictability. I shall define it exactly in constructor theory, and show how it arises in superinformation media.
X-predictor. An X-predictor for the output of an X-measurer whose input attribute $z$ is drawn from some variable $Z$ (in short: ‘X-predictor for Z’), is a constructor for the task:
Figure 3 The scheme defining an X-predictor $\vec{P}_X$
where $P = \{p_z\}$ is an information observable whose attributes $p_z$ – each representing the prediction «the outcome of the $X$-measurer will be ‘$x$’ given the attribute $z$ as input» – are required to satisfy the network of constructions in figure 3. $\vec{B}$ first prepares $S_a$ with the information attribute $z \in Z$ specified by some information attribute $s_z$; then the $X$-measurer $\vec{M}_X$ is applied to $S_a$; and then its target $S_b$ and the output of the predictor, $p_z$, are presented to an ‘$X$’-comparer $\vec{C}_{'X'}$. If that delivers a sharp «yes», the prediction $p_z$ is confirmed. If it would be confirmed for all $z \in Z$, then $\vec{P}_X$ is an $X$-predictor for $Z$.
The exact definition of unpredictability is then:
A substrate exhibits unpredictability if, for some observable $X$, there is a variable $Z$ such that an $X$-predictor for $Z$ is impossible.
Hence unpredictability is the impossibility of an $X$-predictor for a variable $Z$. Note the similarity to ‘no-cloning’, i.e., the impossibility of a constructor for the cloning task (1) on the variable $Z$.
No-cloning implies unpredictability. Indeed, I shall now show that superinformation theories (and thus unitary quantum theory) exhibit unpredictability as a consequence of the impossibility of cloning certain sets of attributes.
Consider two complementary observables $X$ and $Y$ of a superinformation medium and define the variable $Z = X_y \cup \{y\}$. I show that there cannot be an $X$-predictor for $Z$.
For suppose there were. The predictor’s output information observable $P$ would have to include the observable $'X'_y$. For, if $z = x$ for some $x \in X_y$, $'X'_y$ has to be sharp on the target $S_b$ of the measurer with value $'x'$; so the ‘$X$’-comparer yielding a sharp «yes» would require $p_x = 'x'$ (as explained in section 3, $\vec{C}_{'X'}$ is a measurer of $\{\overline{\overline{'x'}}, \overline{'x'}\}$ when ‘$X$’ is sharp on one of its sources with value ‘$x$’).
When $z = y$, the $X$-predictor’s output attribute $p_y$ must still cause the ‘$X$’-comparer to output the sharp outcome «yes»; also, $P = {'X'_y} \cup \{p_y\}$ is required to be an information variable: hence either $p_y = 'x'$ for some $'x' \in 'X'_y$; or $p_y \subseteq \bar{\mathbf{u}}_{'X'_y}$. In the former case, again by considering $\vec{C}_{'X'}$ as a measurer of $\{\overline{\overline{'x'}}, \overline{'x'}\}$, $'X'_y$ would have to be sharp on the target $S_b$ of the $X$-measurer, with the value $'x'$; whence $y \subseteq x$, contrary to the definition of superinformation. In the latter, since $y \subseteq \bar{\bar{\mathbf{u}}}_{X_y}$, $S_b$ would have the attribute $\bar{\bar{\mathbf{u}}}_{'X'_y}$ (section 3), so that the ‘$X$’-comparer would have to output a sharp «no». This again contradicts the assumptions. So, there cannot exist an $X$-predictor for $Z$, just as there cannot be a cloner for $Z$, because $Z$ is not an information variable.
Thus, unpredictability is predicted by the superinformation theory’s deterministic18 laws. Its physical explanation is given by the subsidiary theory. In Everettian quantum theory it is that there are different ‘observed outcomes’ across the multiverse. But constructor theory has emancipated unpredictability from ‘observers’, ‘relative states’ and ‘universes’, stating it as a qualitative information-theoretic property, just as no-cloning is.
## 5. X-indistinguishability equivalence classes
Quantum systems exhibit the appearance of stochasticity, which is more than mere unpredictability. Consider a quantum observable $\hat{X}$ of a $d$-dimensional system $\mathbf{S}$, with eigenstates $|x\rangle$ and eigenvalues $x$. Successive measurements of $\hat{X}$ on $N$ instances of $\mathbf{S}$, each identically prepared in a superposition or mixture $\rho$ of $\hat{X}$-eigenstates, display the following convergence property: i) For large $N$, the fraction of replicas delivering the observed outcome19 ‘$x$’ when $\hat{X}$ is measured can be expected not to differ significantly (according to some a-priori fixed statistical test) from the number $\text{Tr}\{\rho|x\rangle\langle x|\}$; ii) in an ensemble (infinite collection) of such replicas, each prepared in state $\rho$ (a “$\rho$-ensemble”, for brevity), the fraction of instances that would give rise to an observed outcome ‘$x$’ equals $\text{Tr}\{\rho|x\rangle\langle x|\}$ (DeWitt, 1970).
But what justifies the expectation in i)? A frequentist approach to probability would simply postulate that the number $\text{Tr}\{\rho|x\rangle\langle x|\}$ from ii) is the ‘probability’ of the outcome $x$ when $\hat{X}$ is measured on $\rho$ – which would imply (via ad-hoc methodological rules, see e.g. (Papineau, 2006; and Deutsch, 2015)) that $\text{Tr}\{\rho|x\rangle\langle x|\}$ could inform decisions about finitely many measurements. In contrast, in unitary quantum theory, it is the decision-theoretic approach that establishes the same conclusion – with no ad-hoc probabilistic assumption. Absent that argument, the numbers $\text{Tr}\{\rho|x\rangle\langle x|\}$ are just labels of equivalence classes within the set of superpositions and mixtures of the states $\{|x\rangle\}$. Each class, labelled by the $d$-tuple $[f_x]_{x \in X}$, $0 \leq f_x \leq 1$, $\sum_{x \in X} f_x = 1$, is the set of all states $\rho$ with $\text{Tr}\{\rho|x\rangle\langle x|\} = f_x$. For instance, $c_0|0\rangle + e^{i\phi}c_1|1\rangle$ belongs to the class labelled by $[|c_0|^2, |c_1|^2]$.
18 This is so even if the initial conditions are specified with perfect accuracy, in contrast with classical ‘chaos’.
19 Constructor theory’s consistency of successive measurements (section 3), i.e. relative states in quantum theory, makes the concept of observed outcome accurate for some purposes despite the lack of a single observed outcome.
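The closing example of the paragraph above can be checked numerically with a minimal sketch. It assumes a pure qubit state given as a list of amplitudes; `class_label` is a hypothetical helper returning the tuple $[\text{Tr}\{\rho|x\rangle\langle x|\}]_x$, which for a pure state reduces to $[|c_x|^2]_x$. The numbers are used here only as labels, with no probabilistic reading.

```python
import cmath
import math


def class_label(amplitudes):
    """The d-tuple [f_x] with f_x = |c_x|^2 for a pure state sum_x c_x |x>."""
    return [abs(c) ** 2 for c in amplitudes]


# Illustrative choice of moduli (an assumption): |c_0|^2 = 0.3, |c_1|^2 = 0.7.
c0, c1 = math.sqrt(0.3), math.sqrt(0.7)

# States c_0|0> + e^{i phi} c_1|1> for several relative phases phi.
labels = [class_label([c0, cmath.exp(1j * phi) * c1]) for phi in (0.0, 1.0, math.pi)]

for label in labels:
    print([round(f, 6) for f in label])
```

All three states receive the same label up to rounding, so they belong to the same equivalence class although they are different states.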
I shall now give sufficient conditions on superinformation theories for a generalisation of those equivalence classes, which I shall call ‘X-indistinguishability classes’, to exist on the set of all generalised mixtures of attributes of a given observable $X$. One of the conditions for a superinformation theory to support the decision-theoretic argument will be that it allows for such classes (section 6).
In quantum theory, the equivalence classes are labelled by the $d$-tuple $[f_x]_{x \in X}$, where $f_x = \text{Tr}\{\rho|x\rangle\langle x|\}$. Since the ‘trace’ operator need not be available in superinformation theories, to the end of constructing such equivalence classes I shall deploy a construction on fictitious ensembles. This is a novel mathematical construction, where properties of ensembles will be used to define properties of single systems without, of course, any probabilistic or frequentist interpretation.
At this stage, the $f_x$ are only labels of equivalence classes. Additional conditions will therefore be needed for the $f_x$ to inform decisions in the way that probabilities are assumed to do in stochastic theories (including traditional quantum theory via the Born Rule). I shall give these in section 7 via the decision-theory argument. No conclusion about decisions could possibly follow merely from what the results of measurements on an infinite ensemble would be, which is what my formal definition of the $f_x$ is about.
X-indistinguishability classes. I denote by $\mathbf{S}^{(N)}$ a substrate $\overbrace{\mathbf{S} \oplus \mathbf{S} \oplus \dots \oplus \mathbf{S}}^{N \text{ instances}}$, consisting of $N$ replicas of a substrate $\mathbf{S}$. Let us fix an observable $\mathbf{X}$ of $\mathbf{S}$, whose attributes I suppose with no relevant loss of generality to be labelled by integers: $\mathbf{X} = \{\mathbf{x} : x \in X\}$, where $X = \{0, 1, \dots, d-1\}$. Let $X^{(N)} \doteq \{(x_1, x_2, x_3, \dots, x_N) : x_i \in X\}$ be the set of strings of length $N$ whose digits can take values in $X$, each denoted by $\underline{s} \doteq (s_1, s_2, s_3, \dots, s_N)$, $s_i \in X$. $\mathbf{X}^{(N)} = \{\underline{\mathbf{s}} : \underline{s} \in X^{(N)}\}$ is an observable of $\mathbf{S}^{(N)}$. In quantum theory, supposing that $\hat{X}$ is an observable of a $d$-dimensional system $\mathbf{S}$, $\mathbf{X}^{(N)}$ might be $\hat{X}^{(N)} = \hat{X}_1 + d\hat{X}_2 + d^2\hat{X}_3 + \dots + d^{N-1}\hat{X}_N$, whose non-degenerate eigenstates are the strings of length $N$: $|\underline{s}\rangle \doteq |s_1\rangle|s_2\rangle\dots|s_N\rangle$, $s_i \in X$.
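The non-degeneracy of $\hat{X}^{(N)}$ amounts to the uniqueness of base-$d$ representations, which a minimal sketch can confirm by enumeration. The function names and the choice $d = 3$, $N = 4$ are illustrative assumptions.

```python
from itertools import product


def eigenvalue(s, d):
    """Eigenvalue of X_1 + d X_2 + ... + d^(N-1) X_N on |s_1>|s_2>...|s_N>."""
    return sum(digit * d ** i for i, digit in enumerate(s))


def string_from_eigenvalue(value, d, n):
    """Recover the string (s_1, ..., s_N) from its eigenvalue."""
    digits = []
    for _ in range(n):
        value, digit = divmod(value, d)
        digits.append(digit)
    return tuple(digits)


d, n = 3, 4
strings = list(product(range(d), repeat=n))
eigenvalues = [eigenvalue(s, d) for s in strings]

# Every string gets a distinct eigenvalue, so the spectrum is non-degenerate.
print(len(strings), len(set(eigenvalues)))
```

Because the map between strings and eigenvalues is a bijection, a sharp value of $\hat{X}^{(N)}$ identifies the whole string $\underline{s}$.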
Fix an $N$ . For any attribute $x$ in $X$ , I define a constructor $\vec{D}_x^{(N)}$ for the task of counting the number of replicas that hold a sharp value $x$ of $X$ :
where the numbers $f(x; \underline{s}) \doteq \frac{1}{N} \sum_{s_i \in \underline{s}} \delta_{x, s_i}$ label the attributes of the output information variable $O^{(N)} = \{o^{(N)} : o \in \Phi^{(N)}\}$, with $\Phi^{(N)} = \{f_i^{(N)}\}$ denoting the set of fractions with denominator $N$: $f_i^{(N)} \doteq \frac{i}{N}$. Thus, whenever presented with the substrate $S^{(N)}$ on which $X^{(N)}$ is sharp with value $\underline{s}$ (i.e., $S^{(N)}$ is in a state $\xi \in \underline{s}$), $\vec{D}_x^{(N)}$ outputs the number of instances of $S$ on which $X$ is sharp with value $x$. It could be realised, for instance, by measuring the observable ‘$X$’ on each of the $N$ substrates in $S^{(N)}$, and then by adding one unit to the output substrate, initially at 0, for each ‘$x$’ detected (figure 4).
Figure 4. The constructor $\vec{D}_x^{(N)}$.
In quantum theory it effects a unitary operation defined by:
I shall now use $\vec{D}_x^{(N)}$ to define attributes of the substrate $S^{(N)}$ , whose limiting case for $N \rightarrow \infty$ will be used to define the $X$ -indistinguishability classes.
Consider first the observable $\mathbf{X}_{x, f_i^{(N)}} \doteq \{\underline{s} \in X^{(N)} : f(x; \underline{s}) = f_i^{(N)}\}$ containing the strings $\underline{s}$ where a fraction $f_i^{(N)} \in \Phi^{(N)}$ of the replicas of $S$ hold a sharp attribute $x$. They have the property that when presented to $\vec{D}_x^{(N)}$ they make the observable $O^{(N)}$ sharp in output, with value $f_i^{(N)}$. For example, for $N=3$ and $x=0$, $\Phi^{(3)} = \{0, \frac{1}{3}, \frac{2}{3}, 1\}$ and $\mathbf{X}_{0, 2/3}$ contains the quantum states $\{|001\rangle, |010\rangle, |100\rangle\}$. Now define the attribute $\underline{\mathbf{x}}_{f_i^{(N)}}$ as the union of all the attributes $z^{(N)}$ of $S^{(N)}$ that when presented to $\vec{D}_x^{(N)}$ make the observable $O^{(N)}$ sharp in output, with value $f_i^{(N)}$. Next, consider the observable $F(\mathbf{x})^{(N)} \doteq \{\underline{\mathbf{x}}_{f_i^{(N)}} : f_i^{(N)} \in \Phi^{(N)}\}$. It follows from the consistency of measurement (section 3):
Therefore, crucially, for a given $\mathbf{x}$, $F(\mathbf{x})^{(N)}$ can be sharp even if the observable $\mathbf{X}^{(N)}$ is not. In the example above, for $N=3$ and $x=0$, $\underline{\mathbf{0}}_{2/3} \in F(\mathbf{0})^{(3)}$ is the set of all superpositions and mixtures of the eigenstates of $\hat{X}^{(3)}$ contained in $\mathbf{X}_{0,2/3}$, $\{|001\rangle, |010\rangle, |100\rangle\}$: $\hat{X}^{(3)}$ is not sharp in most such mixtures and superpositions.
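The counting function $f(x; \underline{s})$ and the $N=3$, $x=0$ example can be reproduced with a minimal sketch. It treats strings as tuples of integer labels and uses exact fractions; the function names are hypothetical, and the grouping only covers the sharp strings $\underline{s}$, not their superpositions.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def fraction_of(x, s):
    """f(x; s): the fraction of digits of the string s that equal x."""
    return Fraction(sum(1 for digit in s if digit == x), len(s))


def strings_by_fraction(x, d, n):
    """Group all length-n strings over {0, ..., d-1} by the value of f(x; s)."""
    groups = defaultdict(set)
    for s in product(range(d), repeat=n):
        groups[fraction_of(x, s)].add(s)
    return dict(groups)


# The qubit example from the text: N = 3, x = 0.
groups = strings_by_fraction(x=0, d=2, n=3)

print(sorted(groups))                   # the labels in Phi^(3)
print(sorted(groups[Fraction(2, 3)]))   # the strings with two zeros out of three
```

Any superposition of the three strings in the class labelled $2/3$ would still make the counter's output sharp with that label, which is the sense in which $F(\mathbf{x})^{(N)}$ can be sharp when $\mathbf{X}^{(N)}$ is not.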
The observable $F(\mathbf{x})^{(N)}$ is key to generalising quantum theory's convergence property, for the latter is due to the fact that there exists the limit of the sequence of attributes $\underline{\mathbf{x}}_{f_i^{(N)}}$ for $N \rightarrow \infty$ . Let me now recall the formal expression of the convergence property in quantum theory (DeWitt, 1970)20:
Consider a state $|z\rangle = \sum c_x |x\rangle$ with the property that $|z\rangle^{\otimes N} = \sum c_{s_1} c_{s_2} \dots c_{s_N} |\underline{s}\rangle$ is a superposition of states $|\underline{s}\rangle = |s_1\rangle |s_2\rangle \dots |s_N\rangle$, $s_i \in X$, each having a different $f(x; \underline{s}) = f_i^{(N)}$. The convergence property is that for any positive, arbitrarily small $\varepsilon$:
where $\delta(\underline{s}) = \sum_{x \in X} (f(x; \underline{s}) - |c_x|^2)^2$. In other words, for any arbitrarily small $\varepsilon$, there exists an $N$ such that the quantum implementation of $\vec{D}_x^{(N)}$, when asked whether the fraction $f(x; \underline{s})$ of observed outcomes ‘$x$’ obtained when measuring $\hat{X}^{(N)}$ on $|z\rangle^{\otimes N}$ is within $\varepsilon$ of the value $|c_x|^2$, is in a state as close (in the natural Hilbert space norm provided by quantum theory) as desired to one in which it answers «yes». Thus the proportion of instances delivering the observed outcome ‘$x$’ when $\hat{X}$ is measured on the ensemble state $|z\rangle^\infty = \lim_{N \rightarrow \infty} |z\rangle^{\otimes N}$ is equal to $\text{Tr}\{\rho_z |x\rangle\langle x|\} = |c_x|^2$ (where $\rho_z \doteq |z\rangle\langle z|$). Therefore, all states $|z\rangle^{\otimes N}$ with the property $\text{Tr}\{\rho_z |x\rangle\langle x|\} = |c_x|^2$ will be grouped by $\vec{D}_x^{(N)}$ in the same set, as $N$ tends to infinity, which can thus be labelled by $\text{Tr}\{\rho_z |x\rangle\langle x|\}$. This set is an attribute of a single system, containing all quantum states with the property that $\text{Tr}\{\rho_z |x\rangle\langle x|\} = |c_x|^2 = f_x$. The set of all superpositions and mixtures of eigenstates of $X$ is thus partitioned into equivalence classes, labelled by the $d$-tuple $[f_x]_{x \in X}$, $0 \leq f_x \leq 1$, $\sum_{x \in X} f_x = 1$.
20 My appealing to that property is not to conclude, using DeWitt’s own words, that the “conventional statistical interpretation of quantum mechanics thus emerges from the formalism itself” (which would be circular); I shall only use it formally to construct certain attributes of the single substrate.
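The convergence property can be illustrated numerically in the simplest case. The sketch below assumes a qubit state $|z\rangle = c_0|0\rangle + c_1|1\rangle$ with $|c_0|^2 = p$, and computes the squared Hilbert-space norm of the component of $|z\rangle^{\otimes N}$ lying on strings with $\delta(\underline{s}) > \varepsilon$. For a string with $k$ zeros, $\delta(\underline{s}) = 2(k/N - p)^2$, and there are $\binom{N}{k}$ such strings, each with squared amplitude $p^k(1-p)^{N-k}$. The function name and the values of $p$ and $\varepsilon$ are illustrative assumptions, and the norm is used formally, not as a probability.

```python
from fractions import Fraction
from math import comb


def weight_outside(p, n, eps):
    """Squared norm of the part of |z>^{(x)n} on strings s with delta(s) > eps.

    Qubit case with |c_0|^2 = p: strings are grouped by their number k of
    zeros, since delta(s) = 2 * (k/n - p)^2 depends on s only through k.
    """
    q = 1 - p
    total = Fraction(0)
    for k in range(n + 1):
        delta = 2 * (Fraction(k, n) - p) ** 2
        if delta > eps:
            total += comb(n, k) * p ** k * q ** (n - k)
    return total


p, eps = Fraction(3, 10), Fraction(1, 100)
weights = {n: float(weight_outside(p, n, eps)) for n in (10, 100, 1000)}

for n, w in weights.items():
    print(n, w)
```

For these values the computed norm shrinks as $N$ grows, in line with the statement that the state approaches one in which the counter answers «yes».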
A sufficient condition on a superinformation theory for a generalisation of these ‘X-indistinguishability classes’ to exist under it, on the set of all generalised mixtures of attributes of a given observable $X$ , is that it satisfy the following requirements:
E1) For each $\underline{x}_{f_i^{(N)}} \in F(x)^{(N)}$ , there exists the attribute of the ensemble of replicas of $S$ defined as
where $f^\infty = \lim_{N \rightarrow \infty} f_i^{(N)} \in \Phi$ ; $\Phi$ is the limiting set of $\Phi^{(N)}$ - which must exist, and its elements, which are real numbers, must have the property $\sum_{f^\infty \in \Phi} f^\infty = 1$ , $0 \leq f^\infty \leq 1$ .
The existence of the limit implies that those attributes do not intersect, i.e., the set $F(\mathbf{x})^{(\infty)} = \{\underline{\mathbf{x}}_{f^\infty} : f^\infty \in \Phi\}$ is a (formal) variable of the ensemble – a limiting case of $F(\mathbf{x})^{(N)}$, generalising its quantum analogue.
Given an attribute $z$ of $S$, define $z^{(N)} \doteq \overbrace{(z, z, \dots, z)}^{N \text{ terms}}$ (in quantum theory, this is the attribute of being in the quantum state $|z\rangle^{\otimes N}$); introduce the auxiliary variable $X_{f^\infty} \doteq \{z : \lim_{N \rightarrow \infty} z^{(N)} \subseteq \underline{\mathbf{x}}_{f^\infty}\}$ and let $\mathbf{x}_{f^\infty} \doteq \bigcup_{z \in X_{f^\infty}} z$ – which, unlike its ensemble counterpart $\underline{\mathbf{x}}_{f^\infty}$, is an attribute of a single substrate $S$.
E2) For any generalised mixture $z$ of the attributes in $X$, there exists a $d$-tuple
call $[f(z)_x]_{x \in X}$ the X-partition of unity for the attribute $z$.21
If $z$ has such a partition of unity, it must be unique because of E1). An X-indistinguishability equivalence class is defined as the set of all attributes with the same X-partition of unity: any two attributes within that class cannot be distinguished by measuring only the observable $X$ on each individual substrate, even in the limit of an infinite ensemble. In quantum theory, $\mathbf{x}_f$ contains all states $\rho$ with $\text{Tr}\{\rho|x\rangle\langle x|\} = f$. A superinformation theory “admits X-partitions of unity” (on the set of generalised mixtures of attributes in $X$) if conditions E1) and E2) are satisfied.
A key innovation of this paper is showing how the mathematical construction of an abstract infinite ensemble (culminating in the property E1)) can define structure on individual systems (via property E2)) – the attributes $x_f$ and the X-partition of unity – without recourse to the frequency interpretation of probability or any other probabilistic assumption.
X-partition of unity of the X-intrinsic part. Consider now the attribute of being in the quantum state $|z\rangle = c_0|0\rangle + e^{i\phi}c_1|1\rangle$ whose X-partition of unity is $[|c_0|^2, |c_1|^2]$. In quantum theory, the reduced density matrices of the source and target substrate as delivered by an X-measurer acting on $|z\rangle$ still have the same partition of unity. The same holds in constructor theory. Consider the X-intrinsic part $[a_z]_X$ (section 3) of the attribute $a_z$ generated by measuring $X$ on the attribute $z$. By definition of $\vec{D}_x^{(N)}$, prepending a measurer of $X$ to each of the input substrates of $\vec{D}_x^{(N)}$ will still give a $\vec{D}_x^{(N)}$, with the same labellings. Thus the construction that would classify $z$ as being in a certain $X$-partition of unity can be reinterpreted as providing a classification of $[a_z]_X$, under the same labellings: the two classifications must coincide. If $z$ has a given $X$-partition of unity, the $X$-intrinsic part $[a_z]_X$ of $a_z$ must have the same one. Likewise for the intrinsic part $[b_z]_{'X'}$ of $b_z$ (obtained as output attribute on the target of the $X$-measurement applied to $z$): the ‘$X$’-partition of unity of $[b_z]_{'X'}$ is numerically the same as the $X$-partition of unity of $z$.
21 The existence of $\mathbf{x}_{f^\infty}$ does not require there to be any corresponding attribute of the single system for finite $N$: $\mathbf{x}_{f^\infty}$ is constructed as a limit of a sequence of attributes on the ensemble. For example in quantum theory most sets $X_{f_i^{(N)}} \doteq \{z : z^{(N)} \subseteq \underline{\mathbf{x}}_{f_i^{(N)}}\}$ are empty, for any $N$, except for $f_i^{(N)} = 1$, containing $x^{(N)}$; and $f_i^{(N)} = 0$, containing all attributes $\tilde{x}^{(N)} : \tilde{x} \in X, \tilde{x} \neq x$.
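The quantum statement, that the reduced density matrices of source and target carry the same partition of unity as $|z\rangle$, can be checked with a minimal sketch. It assumes an ideal measurement that maps $\sum_x c_x|x\rangle$ (with the target in its receptive state) to $\sum_x c_x|x\rangle|x\rangle$, and stores states as dictionaries of amplitudes; the function names and numerical values are hypothetical.

```python
import cmath
import math


def measure(amplitudes):
    """Ideal X-measurement: sum_x c_x |x>  ->  sum_x c_x |x>|x> (source, target)."""
    return {(x, x): c for x, c in amplitudes.items()}


def reduced_diagonal(joint, subsystem):
    """Diagonal, in the X basis, of the reduced density matrix of one subsystem.

    For a joint state sum c_{x,x'} |x>|x'>, the diagonal entry at label x of
    subsystem 0 is the sum over x' of |c_{x,x'}|^2 (and symmetrically for 1).
    """
    diagonal = {}
    for labels, c in joint.items():
        x = labels[subsystem]
        diagonal[x] = diagonal.get(x, 0.0) + abs(c) ** 2
    return diagonal


# |z> = c_0|0> + e^{i phi} c_1|1> with illustrative values (an assumption).
z = {0: math.sqrt(0.3), 1: cmath.exp(0.4j) * math.sqrt(0.7)}
joint = measure(z)

source_part = reduced_diagonal(joint, 0)
target_part = reduced_diagonal(joint, 1)
print(source_part, target_part)
```

Both diagonals reproduce $[|c_0|^2, |c_1|^2]$ up to rounding, matching the partition of unity of $|z\rangle$ itself.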
## 6. Conditions for decision-supporting superinformation theories
For a given observable $X$ and a generalised mixture $z$ of attributes in $X$, the labels $[f(z)_x]_{x \in X}$ of the $X$-partition of unity defined in section 5 are not probabilities. Even though they are numbers between 0 and 1, and sum to unity, they need not satisfy other axioms of the probability calculus: for instance, in quantum interference experiments they do not obey the axiom of additivity of probabilities of mutually exclusive events (Deutsch, Ekert & Lupacchini, 2000). It is the decision-theory argument (section 7) that explains under what circumstances the numbers $[f(z)_x]_{x \in X}$ can be used to inform decisions in experiments on finitely many instances as if they were probabilities, without assuming them to be so. I shall now establish sufficient conditions on superinformation theories to support the decision-theory argument, thus characterising decision-supporting superinformation theories.
I shall introduce one of the conditions via the special case of quantum theory. Consider the $x$- and $y$-components of a qubit spin, $\hat{X}$ and $\hat{Y}$. There exist eigenstates of $\hat{X}$, i.e. $|x_1\rangle, |x_2\rangle$, and of $\hat{Y}$, i.e. $|y_{\pm}\rangle = \frac{1}{\sqrt{2}}[|x_1\rangle \pm |x_2\rangle]$, with the property that they are ‘equally weighted’, respectively, in the $x$- and $y$-basis – in other words, $|x_1\rangle, |x_2\rangle$ are invariant under the action of a unitary that swaps $|y_+\rangle$ with $|y_-\rangle$; and $|y_{\pm}\rangle$ are invariant under a unitary that swaps $|x_1\rangle$ with $|x_2\rangle$. Moreover, there exist quantum states on the composite system of two qubits, which are likewise ‘equally weighted’ and have the special property that:
$$\frac{1}{\sqrt{2}}[|x_1\rangle|x_1\rangle + |x_2\rangle|x_2\rangle] = \frac{1}{\sqrt{2}}[|y_+\rangle|y_+\rangle + |y_-\rangle|y_-\rangle] \quad (5)$$
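Identity (5), taken with the plus sign on both sides, and the swap symmetry of $|y_\pm\rangle$ can be verified with a minimal sketch. It assumes the representation $|x_1\rangle = (1, 0)$, $|x_2\rangle = (0, 1)$, with vectors stored as plain lists; the helper names are hypothetical.

```python
import math


def kron(u, v):
    """Tensor product of two vectors given as lists of amplitudes."""
    return [a * b for a in u for b in v]


def add(u, v, sign=1):
    """Componentwise u + sign * v."""
    return [a + sign * b for a, b in zip(u, v)]


def scale(k, u):
    """Multiply every component of u by k."""
    return [k * a for a in u]


r = 1 / math.sqrt(2)
x1, x2 = [1.0, 0.0], [0.0, 1.0]
y_plus = scale(r, add(x1, x2))
y_minus = scale(r, add(x1, x2, sign=-1))

# Left- and right-hand sides of (5), with the plus sign.
lhs = scale(r, add(kron(x1, x1), kron(x2, x2)))
rhs = scale(r, add(kron(y_plus, y_plus), kron(y_minus, y_minus)))

# Swapping |x1> and |x2> reverses the component order in this basis.
swapped_plus = y_plus[::-1]     # equal to y_plus
swapped_minus = y_minus[::-1]   # equal to -y_minus: the same state up to a phase

print(lhs)
print(rhs)
```

The two sides agree component by component up to rounding. Under the swap, $|y_-\rangle$ acquires only an overall sign, so it represents the same physical state.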
I shall now require that the analogous property holds in superinformation theories. While in quantum theory it is straightforward to express this property via the powerful tools of linear superpositions, in constructor theory expressing the same conditions will require careful definition in terms of ‘generalised mixtures’.
The conditions for decision-supporting superinformation theories are that there exist two complementary information observables $X$ and $Y$ such that:
T1) The theory admits $X$ -partitions of unity (on the information attributes of $\mathbf{S}$ that are generalised mixtures of attributes of $X$ ) and $X_a + X_b$ -partitions of unity (on the information attributes of the substrate $\mathbf{S}_a \oplus \mathbf{S}_b$ that are generalised mixtures of the attributes in the observable $X_a + X_b$ ).
T2) There exist observables $\tilde{X} \doteq \{x_1, x_2\} \subseteq X$, $\tilde{Y} \doteq \{y_+, y_-\} \subseteq Y$ satisfying the following symmetry requirements:
R1. $x_1, x_2$ are generalised mixtures of attributes of $\tilde{Y}$ and $y_+, y_-$ are generalised mixtures of attributes of $\tilde{X}$.
As a consequence, since $\{u_{\tilde{X}}, \bar{u}_{\tilde{X}}\}$ is sharp in both $y_+$ and $y_-$ with value $u_{\tilde{X}}$, it follows that $f(y_+)_x = 0 = f(y_-)_x$ for all $x$ other than $x_1$ and $x_2$. Similarly, $f(x_1)_y = 0 = f(x_2)_y, \forall y \neq y_+, y_-$. Note also that by definition of complementary observables, $x_i \cap y_{\pm} = \emptyset$ – i.e., the attributes are non-trivial generalised mixtures.
Defining the computation $S_{a,b} \doteq \{a \rightarrow b, b \rightarrow a\}$ which swaps the attributes $a, b$ of $\mathbf{S}$:
R2. $S_{x_1, x_2}(y_{\pm}) \subseteq y_{\pm}; S_{y_+, y_-}(x_i) \subseteq x_i, i = 1, 2$ .
In quantum theory, $y_{\pm}$ correspond to two distinguishable equally-weighted quantum superpositions or mixtures of the attributes in $X_y$ , such as $|y_{\pm}\rangle$ . Similarly for $|x_1\rangle, |x_2\rangle$ . The principle of consistency of measurement (section 3) implies that if an attribute $y$ has an $X$ -partition of unity with element $f(y)_x$ , for any permutation $\Pi$