On the Role of Coherence in Shor's Algorithm

Shor's factoring algorithm provides a super-polynomial speed-up over all known classical factoring algorithms. Here, we address the question of which quantum properties fuel this advantage. We investigate a sequential variant of Shor's algorithm with a fixed overall structure and identify the role of coherence for this algorithm quantitatively. We analyze this protocol in the framework of dynamical resource theories, which capture the resource character of operations that can create and detect coherence. This allows us to derive a lower and an upper bound on the success probability of the protocol, which depend on rigorously defined measures of coherence as a dynamical resource. We compare these bounds with the classical limit of the protocol and conclude that within the fixed structure that we consider, coherence is the quantum resource that determines its performance by bounding the success probability from below and above. Therefore, we shed new light on the fundamental role of coherence in quantum computation.


I. Introduction
Factoring large integers is considered to be a notoriously hard problem on a classical device. No classical factoring algorithm with polynomial run time is known, and the assumption that none exists lies at the heart of the widely used RSA cryptosystem [1]. Therefore, Shor's discovery of a quantum algorithm capable of factoring in polynomial time [2] attracted interest not only in this algorithm itself but in the field of quantum computation in general: It is an example of a quantum algorithm that provides a super-polynomial computational speed-up over its best known classical counterpart (see also Refs. [3][4][5][6]). Since quantum devices are governed by laws different from those of classical physics, it might not seem too surprising that they can, in principle, outperform classical devices in certain applications. But which properties of quantum mechanics not present in classical physics fuel the speed-up in Shor's algorithm? And can they be used to explain speed-ups for the solution of other problems too? It is known that the presence of an unbounded amount of multi-partite entanglement is necessary for exponential speed-ups in circuit-based pure-state quantum computation because every protocol that does not exhibit this property can be simulated efficiently on a classical device [7]. This result therefore describes a necessary condition for exponential speed-ups in arbitrary protocols, but not a sufficient one, as the presence of unbounded entanglement does not guarantee efficient quantum computation. This, and the lack of a connection between entanglement and classical simulability in the case of mixed states, might hint that deeper concepts underpin the computational speed-up.
Here, we go one step further: instead of asking whether a resource is necessary to obtain speed-ups or describing its evolution during a protocol [7][8][9][10][11], we explore the speed-up that it actually grants. To start this exploration, we retreat from the general computational setting and focus on a specific algorithm with a fixed overall structure, namely a variant of Shor's algorithm introduced by Parker and Plenio [12]. This focus allows us to present lower and upper bounds on the performance of this algorithm that hold for mixed states too and are expressed in terms of coherence measures, which are derived in the framework of quantum resource theories [13].
Quantum resource theories, see for example Refs. [13][14][15][16][17][18][19][20][21][22][23][24][25], are mathematical tools developed to describe the resource character of quantum properties in a mathematically and operationally well-defined manner. Their central idea is to impose additional restrictions on the laws of quantum mechanics, which single out certain properties as precious resources. The resource content of various physical objects such as states, operations, or measurements can then be quantified rigorously via resource measures that cannot increase under actions that are still permitted in the presence of the constraints. Furthermore, one can study which physical operations may still be implemented in the presence of these restrictions and at what cost they can be overcome when supplied with resourceful quantum objects. This allows for an investigation of which resource is responsible for what quantum advantage. Besides the insights that our results give on Shor's algorithm, they also show that (dynamical) resource theories are applicable to problems of practical relevance. Whilst this is often used as a motivation to study resource theories, quantitative relations between coherence and performance beyond variations of discrimination, exclusion, and detection games [26][27][28][29][30][31][32][33][34][35][36][37] are surprisingly rare [8,38,39,40].
After a short introduction to the relevant aspects of resource theories and Shor's algorithm, we present our main results. First, we carefully motivate and describe the algorithm we are considering and how it allows us to investigate the role of coherence. This is crucial because we need to fix the overall structure of our algorithm: The most general approach to investigating the speed-up that quantum resources grant in factoring would be to compare an ideal quantum algorithm (given fixed resources) to an ideal classical algorithm. Since it is unknown what such algorithms are, this is, however, out of reach. Instead, we will focus on the quantum part of Shor's algorithm, namely the order-finding protocol, and fix its core, the modular exponentiation, whilst varying the remainder. This approach provides enough freedom while giving enough structure to observe interesting quantitative connections. We conclude with a discussion and outlook and refer to the SM for proofs and further details.

II. Quantum resource theories
In this section, we give a brief introduction to the resource-theoretical notions that we will use in this work. We restrict ourselves to finite-dimensional quantum systems, which we label by capital Latin letters. Quantum states will be denoted by small Greek letters and quantum channels, i.e., linear, completely positive, and trace-preserving (CPTP) maps that transform quantum states, by capital Greek letters. Additionally, if clear from the context, super-channels, i.e., linear maps that transform channels into channels, will be labeled by capital Latin letters as well.
Generally, resource theories emerge from restrictions that are frequently motivated experimentally. Here we focus on constraints concerning the ability to create and detect coherence, but the concepts can be applied analogously to other restrictions. We begin by fixing the incoherent basis {|i⟩}_i, i.e., the basis with respect to which we are going to describe coherence. Since we are considering circuit-based quantum computation, the computational basis in which we encode and extract our classical information is the natural choice: If we never create coherence with respect to the computational basis, we are essentially reduced to the (classical) application of stochastic matrices to probability vectors.
A quantum state σ is now considered incoherent and equivalent to a probability vector iff it is diagonal in the incoherent basis, i.e., iff ∆(σ) = σ, where ∆ denotes total dephasing in the incoherent (computational) basis {|i⟩}_i. We denote the set of incoherent states by I and call the maximal set of channels Φ that cannot create coherence, i.e., turn an incoherent state into a coherent one, the maximally incoherent channels, denoted by MIO [17,41,42,43]. This set consists of all channels Φ that satisfy Φ∆ = ∆Φ∆. To exploit coherence, we need not only to create it, but also to use it. Using coherence is only possible if we have access to measurements that can detect it, in the sense that its presence makes a difference in measurement statistics [43][44][45].
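The defining condition Φ∆ = ∆Φ∆ (and the analogous condition ∆Φ = ∆Φ∆ for the detection-incoherent channels introduced below) is easy to check numerically once channels are written as superoperators acting on vectorized density matrices. The following Python sketch is an illustration of ours, not part of the original analysis; the helper names `in_MIO` and `in_DI` are hypothetical. It confirms that a classical bit flip can neither create nor detect coherence, while the Hadamard unitary can do both:

```python
import numpy as np

d = 2
# Total dephasing in the computational basis, as a superoperator acting on
# vec(rho) with column-stacking convention: vec(A rho B) = (B.T kron A) vec(rho).
# Entry rho_ij sits at position i + d*j, so Delta keeps exactly the i == j entries.
Delta = np.diag([1.0 if i == j else 0.0 for j in range(d) for i in range(d)])

def unitary_super(U):
    """Superoperator of the unitary channel rho -> U rho U^dagger."""
    return np.kron(U.conj(), U)

def in_MIO(Phi, tol=1e-10):
    """Phi cannot create coherence iff Phi Delta = Delta Phi Delta."""
    return np.allclose(Phi @ Delta, Delta @ Phi @ Delta, atol=tol)

def in_DI(Phi, tol=1e-10):
    """Phi cannot detect coherence iff Delta Phi = Delta Phi Delta."""
    return np.allclose(Delta @ Phi, Delta @ Phi @ Delta, atol=tol)

X = np.array([[0.0, 1.0], [1.0, 0.0]])          # bit flip: a classical permutation
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # Hadamard

assert in_MIO(unitary_super(X)) and in_DI(unitary_super(X))
assert not in_MIO(unitary_super(H)) and not in_DI(unitary_super(H))
```

The same two-line conditions extend verbatim to any dimension d and to non-unitary channels given as superoperator matrices.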
As detailed in Ref. [43], it is possible to identify all instruments that cannot detect coherence with the detection-incoherent channels DI, i.e., all channels Φ satisfying ∆Φ = ∆Φ∆ [41,43,46,47]. Having defined the sets of channels that cannot create/detect coherence and are thus considered free for the respective task, we can use them to build dynamical resource theories [29,43,48-61], introduced only relatively recently, to quantify how well non-free channels can create/detect coherence. The missing pieces are the super-channels that map quantum channels to quantum channels. A super-channel S can be represented by two quantum channels Θ_1, Θ_2 that are used as pre- and post-processing, i.e., S[Λ] = Θ_2 (Λ ⊗ id) Θ_1. This definition is natural in the context of circuit quantum computation, but can also be shown to be the most general one consistent with an operational interpretation [62]. We now also divide the super-channels into free and non-free: A minimal requirement is that a free super-channel maps free channels to free channels; otherwise it would be possible to create non-free operations from free ones at no cost. Since both DI and MIO are closed under sequential and parallel concatenation, we take the circuit-based approach and define a super-channel as free iff it can be represented by a free pre- and post-processing [63,64]. The set of free super-channels in the resource theory concerning the creation/detection of coherence will be labeled by MIOS/DIS. This concept allows us to compare the value of channels: A channel Θ is at least as valuable as a channel Λ if there exists a free super-channel S such that S[Θ] = Λ. In general, it is difficult to decide whether such an S exists, which is why one considers (dynamical) resource measures. These are functionals M that map quantum operations to the non-negative numbers and satisfy (i) monotonicity: M(S[Θ]) ≤ M(Θ) for all free super-channels S, i.e., they respect the preorder that the free super-channels impose on the channels and therefore their relative value, (ii) faithfulness: M(Θ) = 0 iff Θ is free, and (iii) convexity.
We will now define the two measures that we use to quantify the connection between coherence and the performance of Shor's algorithm. The robustness of coherence [26] is defined as C_R(ρ) = min{ s ≥ 0 | ∃ state τ such that (ρ + sτ)/(1 + s) ∈ I }. From this, we can define a dynamical measure C(Θ) that describes how well a channel can create coherence (i.e., with respect to MIO), namely the maximal robustness of coherence that Θ can generate from an incoherent input; this is a resource generation capacity [65][66][67][68][69][70]. For the detection-incoherent setting, the NSID measure M_⋄ [43] (nonstochasticity in detection) is a dynamical measure that describes how well a channel can detect coherence (i.e., with respect to DI). Furthermore, we show in the SM that an intuitive candidate for a measure, namely D(Λ) = max_ρ ‖∆Λ(1 − ∆)ρ‖_1, fails to form a measure in the DI setting, as it violates monotonicity.
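For a single qubit, the robustness of coherence is known to coincide with the ℓ1-norm of coherence [26], so it can be evaluated in closed form rather than by the semidefinite program of its definition. A minimal sketch of ours (the function name is hypothetical), assuming this qubit identity:

```python
import numpy as np

def robustness_qubit(rho):
    """Robustness of coherence of a qubit state (computational basis).
    For qubits it equals the l1-norm of coherence, C_R(rho) = 2*|rho_01|."""
    return 2.0 * abs(rho[0, 1])

plus = np.full((2, 2), 0.5)        # maximally coherent state |+><+|
incoh = np.diag([0.3, 0.7])        # incoherent (diagonal) state

assert np.isclose(robustness_qubit(plus), 1.0)   # C_R = 1, the qubit maximum
assert robustness_qubit(incoh) == 0.0            # faithfulness on free states
```

This makes the normalization quoted below concrete: C(Θ) ≤ 1 for qubit channels because no qubit state has robustness above that of |+⟩.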
Figure 1a: The quantum subroutine of Shor's order-finding algorithm.

Figure 1b: A sequential variant of the order-finding algorithm with classical control and classical post-processing, where the R′_n denote phase gates that depend on the outcomes of the previous measurements. See the main text and the SM for further details.
III. Shor's algorithm
Let N denote an integer to factor and assume N to be large. The factorization problem can be reduced to the order-finding problem: given integers N and x with x < N and x coprime to N, the order r is defined as the smallest positive integer such that x^r = 1 (mod N) (see Ref. [2] and the SM for more information). Solving order-finding for a randomly chosen x with the above properties allows one to solve factoring with high probability, and it is exactly what the quantum parts of the various versions of Shor's algorithm accomplish efficiently.
For the standard quantum order-finding protocol, one uses two quantum systems A and B of dimension q and N respectively, where system A consists of L qubits with N^2 < q = 2^L < 2N^2. Furthermore, one defines a unitary acting on system B by U_B |n⟩ = |xn (mod N)⟩ and the modular exponentiation via U_c |k⟩_A |n⟩_B = |k⟩_A U_B^k |n⟩_B. Important from a resource-theoretical perspective, U_c is both in DI and in MIO, i.e., it can neither produce nor detect coherence and is thus considered free in both resource theories. As shown in Fig. 1a, an order-finding protocol then works as follows: Initialize system AB in the state |0⟩^{⊗L}_A |1⟩_B, first apply Hadamard gates to each qubit of A, apply U_c, followed by an inverse Fourier transform F^† on A and then a measurement in the computational basis. This allows us to estimate a randomly chosen eigenvalue of U_B with sufficiently high probability from the measurement outcome via the continued fraction algorithm and thus deduce r. Since both the quantum part (and in particular the modular exponentiation and the inverse Fourier transform) as well as the classical pre- and post-processing can be implemented efficiently, this allows one to factor in polynomial runtime [2,12,71,72]. A particular implementation of the Fourier transform given by sequentially applied controlled phase gates and Hadamard gates [73] allows one to derive an equally efficient variant of the order-finding protocol that requires only a single control qubit, which is being recycled [12], see Fig. 1b.

IV. Results
We now describe the setup to which our results apply, namely the order-finding protocol depicted in Fig. 1b, and connect its performance to coherence. The quantum advantage in this protocol obviously does not emerge from the classical control and post-processing, so we keep this part fixed. Looking at a single block, we recall that the controlled unitary U_l = U_B^{2^{L−l}} as well as the phase gate R′_l (see the SM for details) can neither create nor detect coherence and are thus free in both resource theories. Therefore, we keep them fixed as well and treat them as a black box that we can probe. The remaining ingredients of each block become the main focus of study: If we replaced the initial state H|0⟩ = |+⟩ of the control qubit, which is a maximally coherent state [23], with an incoherent one, the block would be seriously flawed, in the sense that it would not encode information about r, since the black box only affects the coherences of the control qubit (see the SM for more information). Incoherent and maximally coherent states are extreme cases, and to connect the performance of the algorithm quantitatively to coherence, we investigate the impact on efficiency if we replace the initial control state with a partially coherent state. Since every quantum state can be identified with its replacement channel, we replace it with a fixed qubit channel Θ_l that is used to create an initial (partially coherent) control-qubit state from an incoherent state σ_l. We further allow Θ_l to be transformed by arbitrary super-channels S_1^{(l)} ∈ MIOS, since this is free from a resource perspective and ensures that we use the resource at hand appropriately. In this spirit, S_1^{(l)} allows for a fair comparison of different resourceful operations. Note that another approach would be to optimize over different U_l (see Refs. [37,39] for related approaches in different settings). Furthermore, after the application of U_l, we must extract the desired information, which is encoded exclusively in the coherences of the control qubit; hence we must detect coherence, exactly in the sense that it makes a difference in the measurement statistics. The application of a Hadamard gate, which maximizes the NSID measure among all qubit channels, is thus an extremal case too [43].
In contrast, a channel that cannot detect coherence would not be able to recover any of the available information on the prime factors. The ability to detect coherence therefore plays a vital role after the application of U_l, and to investigate its precise contribution, we replace H with a fixed channel Λ_l that interpolates between the optimal H and a completely incoherent measurement. We then allow the application of an arbitrary super-channel S_2^{(l)} ∈ DIS that is unitality-preserving (we comment on this requirement in the SM), for the same reasoning as for S_1^{(l)}. The resulting block is depicted in Fig. 2. To simplify our analysis for the main text, we further assume here that in each block we use the same channel Θ/Λ for the creation/detection of coherence (see the SM for the more general version). For fixed Θ and Λ, we then define P_succ(Θ, Λ) to be the probability (maximized over the S_1^{(l)}, S_2^{(l)}, and σ_l) that a single run of this order-finding protocol leads to the correct order, and bound it by the following theorem.

Figure 2: A single modified block of the sequential order-finding algorithm with super-channels to make optimal use of the resources in the protocol.
Theorem 1. The success probability of the order-finding protocol as described above with qubit operations Θ and unital Λ for creation and detection, respectively, is lower bounded by where ϕ(r) denotes Euler's totient function.
The presence of C(Θ) is intuitive, as it quantifies the ability of Θ to create coherence in the control qubit [68][69][70], which is exactly what we use the channel Θ for. We note that for any qubit channel Θ, we have C(Θ) ≤ 1, with equality if and only if Θ can create a maximally coherent qubit state [70]. Moreover, for qubit operations Λ, M_⋄(Λ) ≤ 1, and the bound is saturated for a Hadamard gate [43]. The measures enter the bound on an equal footing, which indicates that the abilities to create and detect coherence are equally important, as one would intuitively expect. In case both Θ and Λ are Hadamard gates, we thus recover the bound presented in Refs. [2,12], which is used to prove the polynomial runtime of the algorithm. If the abilities to create and detect coherence decrease, this influences our bound exponentially in L. This suggests that the polynomial runtime of the fully coherent protocol is degraded exponentially in L by the lack of coherence and of the ability to detect it. However, one needs to ask whether the performance actually decreases exponentially with less coherence, or whether only our bound does so. To address this question, we now present a sufficiently general upper bound.
Theorem 2. The success probability of the order-finding protocol as described above with qubit operations Θ and unital Λ for creation and detection respectively is upper bounded by where ϕ(r) denotes Euler's totient function.
We notice that this bound depends on both the problem and the employed coherence. The bound becomes trivial if the first term exceeds unit probability, which depends sensitively on the ratio of 2^L and r. Nevertheless, this is a rather gentle restriction on our upper bound, which can be justified by comparing the bounds on the resourceful success probability with the classical limit of the algorithm. We define the classical limit as the corresponding protocol in which we are only allowed to use operations that cannot detect or create coherence, i.e., in which both Θ and Λ are free in their respective resource theories. In this case, we are in a classical regime and all states and operations can be reduced to probability vectors and stochastic matrices. The success probability is then determined by the uniform measurement statistics and the probability that the post-processing works, as we show in the SM. If we compare the bounds on the classical limit of the success probability with the one in Thm. 2, we see that the same prefactor occurs. In this sense, the slightly limited upper bound in Thm. 2 can be regarded as an artifact of the problem-dependence. If the fixed protocol does not perform well in the classical limit (which is the case of interest), we conclude that coherence is the quantum resource that determines the success probability by bounding it from below and above.

V. Discussion and outlook
In our work, we have used resource theories to derive quantitative upper and lower bounds on the success probability of the quantum part of a sequential version of Shor's algorithm in terms of measures of (dynamical) coherence. Since the full algorithm repeats the quantum part until it succeeds, this also quantifies the total run time and speed-up in terms of the available resources. It is a novelty of our approach that we do not only observe how a resource evolves or depletes during an algorithm [8][9][10][11] but determine quantitatively the performance advantage that it grants. Moreover, our approach differs from Ref. [7], where a necessary condition for the presence of a resource (there, entanglement) to admit a speed-up in pure-state quantum computing was derived. The argumentation of Ref. [7] is based on the observation that a quantum protocol with limited multi-partite entanglement operating on pure states can be simulated efficiently on a classical device. As already pointed out in Ref. [7], this approach is incapable of establishing a (quantitative) sufficient condition for the contribution of entanglement as a resource, as the presence of certain forms of large-scale multi-partite entanglement can permit efficient classical simulation when employing a suitable mathematical data structure such as the stabilizer formalism [74].
In contrast, we derive bounds that hold even for mixed states and show quantitatively that coherence is necessary and sufficient to achieve an advantage over the classical limit of the investigated algorithm with a fixed overall structure. This, however, comes at the price that, at present, these quantitative connections are tied to a specific family of factoring algorithms. Furthermore, we remark that whilst the way we fixed the overall structure of the protocol and our choice of the free operations is natural, well-motivated, and models the operations that are available to a classical computer, other choices may be considered too. Indeed, introducing restrictions that model the capabilities of a classical computer more accurately is an open problem that would lead to different (and potentially more involved) resource theories. As an example, one can additionally restrict the ability to preserve coherence [56], or more generally states [75]. It is an interesting open question whether other restrictions and the corresponding resources would lead to relations comparable to those we found; see for instance the discussion in the SM of why we did not choose operations that can neither create nor detect coherence as free.
A closely related question is to what extent the overall structure of the protocol can be generalized whilst still obtaining meaningful bounds. As we discuss in the SM in more detail, our findings hold for the standard parallel version of Shor's algorithm as depicted in Fig. 1a too, if the first register is in a product state and if the inverse Fourier transform is implemented in a way that leads to the sequential version. Indeed, generalizing the structure and choosing other free operations may reveal additional resources that underpin the efficiency of the quantum processor. One may, for example, argue that the implementation of the modular exponentiation, which is assumed to be free in our framework, does carry a cost.
Relaxing this assumption may establish entanglement as a resource that bounds the efficiency of the protocol. However, as incoherent operations such as the modular exponentiation can convert coherence to entanglement [76][77][78], it may also be possible to reduce the resource entanglement to coherence when it comes to computation. In summary, our results depend on the choice of free operations and the overall structure, and we do not claim that coherence alone is the quantum resource for factoring, but we show that it is a quantum resource that bounds the performance from below and above. In fact, it might well be that other resources not captured in our framework contribute (in other factoring algorithms) too. Exploring this is an interesting starting point for future work.
Furthermore, using our technique of fixing the structure of a protocol and defining a free limit, one can investigate the role of quantum resources in other quantum algorithms too. Since general statements about the role of quantum resources in computation are often out of reach, such an algorithm- and implementation-specific approach might lead to further insights into the value of quantum resources in computation, which might help to understand the separation between classical and quantum computing.

On the Role of Coherence in Shor's Algorithm Supplemental Material
In this Supplemental Material, we give the proofs of the results presented in the main text and some further information. This includes additional dynamical resource measures and their properties.

A. On the appearing measures
In this section, we present properties of the resource measures employed in our analysis of the performance of Shor's algorithm. To begin with, we discuss the functional D(Λ) = max_ρ ‖∆Λ(1 − ∆)ρ‖_1, which is interesting from a resource-theoretical perspective; we will see later that D(Λ) appears naturally when connecting the success probability of the investigated order-finding protocol with the ability to detect coherence. Moreover, it seems to be a natural candidate for a resource measure under detection-incoherent operations. However, it is not monotonic under DIS, as we show here. To relate the performance of Shor's algorithm to a rigorous dynamical measure in Thm. 16 and Thm. 18, we make use of the fact that the functional D shares sufficient similarities with the NSID measure M_⋄; in particular, D provides an upper bound on M_⋄, and the two functionals coincide on qubit channels.
Proof. Let us begin by pointing out that convexity in the argument follows from the convexity of the trace norm itself. Notice that for any Λ ∈ DI, i.e., ∆Λ = ∆Λ∆, the functional vanishes. Furthermore, for any detecting channel Λ there exists some state ρ such that ∆Λ(1 − ∆)ρ ≠ 0, which proves faithfulness since ‖·‖_1 is a norm. The functional behaves monotonically under post-processing with free channels, even with an identity channel attached in parallel. Using [43, Lem. 14] for the inequality, we see that the result coincides with D(Λ_{C←A}) due to the convexity of the trace norm. The reverse inequality follows from restricting ρ_{AB} to product states in the first line.
Additionally, the properties of the trace norm allow us to rewrite the functional accordingly. To show that D is a resource measure, we would need in addition that D(ΛΦ) ≤ D(Λ). However, this condition is violated in general for Φ ∈ DI. To prove this, we first describe how we can evaluate D numerically.

Proposition 4. Consider a quantum channel Θ_{C←B} and let N = dim(C). Let further (s_{m,n})_{m,n} be the matrix of dimension 2^N × N that contains as rows all N-dimensional vectors s_m whose entries are ±1. The numerical value of D(Θ_{C←B}) is then given by the maximum of the solutions of the following 2^N semidefinite programs (each for a fixed m).

Proof. We use that the absolute value of a real number can be written as a maximization over the signs s_m introduced in the statement of the Proposition. This method of evaluating D is certainly not the most efficient. However, with the help of the following Proposition, it allows us to disprove monotonicity.

Proposition 5. Let Θ_{C←B} be a quantum channel and A a third and fixed quantum system. Denote by S the set of all diagonal matrices of dimension dim(C) with diagonal elements ±1, and by M_A the matrix on system A with all entries equal to one. The solution of the maximization problem is then given by the maximum of the solutions of a finite number of semidefinite programs, to be evaluated for all fixed X ∈ X.

Proof. We write quantum states as ρ = Σ_{i,j} ρ_{ij} |i⟩⟨j| and the action of quantum channels as Φ(|i⟩⟨j|) = Σ_{k,l} Φ^{ij}_{kl} |k⟩⟨l|. Using [37, Lem. 12], we arrive at an optimization over a set Y that is characterized by semidefinite constraints. Using the absolute-value technique from Prop.
4, we can solve this optimization problem via a set of SDPs with the definitions from the Proposition. Due to this Proposition, for every fixed system A, we can evaluate D_A(Θ_{C←B}) numerically by solving a collection of SDPs. We now move to the equivalence on qubit channels, which we will use to connect the performance of Shor's algorithm to the ability to detect coherence. We first consider the inner optimization problem, i.e., we fix Φ. The first sum always evaluates to a real number, and the phase φ only appears in the second sum. Let us assume that the first sum is positive. The optimum over φ is then obviously achieved for φ = −λ. If the first sum is negative, the optimum is φ = π − λ. In both cases, we obtain the same value. Since Λ∆ ∈ DI, the reverse bound also holds, and the claimed equality follows for qubit channels Λ.
Furthermore, this allows us to prove that D(Λ) = max_ρ ‖∆Λ(1 − ∆)ρ‖_1 fails to form a measure as defined in the main text.
Proof. Let Θ_{B←B} be defined via two Kraus operators with |B| = 2 (it is straightforward to check that this indeed defines a CPTP map). With the help of Lem. 6, we find D(Θ) = 1.
Choosing |A| = 3, we can use Prop. 5 to evaluate D_A(Θ_{B←B}) numerically. Moreover, it is possible to extract optimal Φ_{B←A} and σ_A from the solution of the semidefinite program. An optimal choice consists of a maximally coherent state σ_A and a quantum operation Φ_{C←A} given by its Choi state J_Φ. It is straightforward to check that J_Φ is hermitian, has eigenvalues (0, 0, 0, 1, 1, 1), and that Φ_{C←A} is CPTP; moreover, Φ ∈ DI. We are not going to prove optimality of Φ and σ, for example by deriving the dual program, but rather note that the resulting value exceeds D(Θ), which finishes the proof by giving an explicit example.
Whilst this is not the purpose of this Letter, we note that it is straightforward to show that the family of functionals D_A itself defines resource measures in the detection-incoherent setting. We notice similarities to the measures in Ref. [37], but leave further investigations of these measures, for example regarding an operational interpretation, to future work.

B. Shor's factorization algorithm
In this section, we review Shor's algorithm, beginning with the basic prerequisites in number theory and moving on to Shor's protocol and a sequential version introduced in Ref. [12]. Additionally, the fine-tuned interplay of the quantum part and the classical post-processing via the continued fraction algorithm is discussed in detail, paving the way for a discussion of the protocols investigated in this work. Some notable examples of further reading on Shor's algorithm are the articles [2,12] and the textbook [72], on which the following brief review is based.

Reduction to order-finding
The first step in Shor's algorithm is the reduction of the integer factorization problem to the so-called order-finding problem [2]. Let N denote the integer to be factorized, which consists of m distinct prime factors and can be represented in an n-bit string. Furthermore, let x be an integer with 1 ≤ x < N and x coprime to N, i.e., x and N share no common factor. The order-finding problem is then to find the smallest integer r such that x^r = 1 mod N. This integer r is called the order of x modulo N. The reduction of factoring to order-finding results from the following two statements. We omit the proofs at this point; for further reading see for example Ref. [72].
Lemma 8. Given a composite (with more than one distinct prime factor), odd integer N and an integer solution a with 1 ≤ a < N to the equation a^2 = 1 mod N that is non-trivial, i.e., a ≠ ±1 mod N, then at least one of gcd(a ± 1, N) is a non-trivial factor of N.

Lemma 9. For a uniformly chosen x in the range 1 ≤ x < N and coprime to N, the probability that the order r of x modulo N is even and non-trivial is bounded by P(r even and x^{r/2} ≠ −1 mod N) ≥ 1 − 1/2^m, where m is the number of distinct prime factors of N.

With this at hand, a factorization algorithm is given by the following procedure: In a first step, catch exceptions like N having two as a (multiple) prime factor and check whether N is a composite integer, i.e., has more than one distinct prime factor. This can be done efficiently on a classical device, see Ref. [72]. These two steps guarantee that the prerequisites of Lem. 8 and Lem. 9 are satisfied. In the next step, choose a random x and check whether it is coprime to N; otherwise, repeat until it is. The bottleneck of the algorithm is the order-finding, but assuming we can solve this in polynomial time, determine the order r and subsequently check whether it is even and non-trivial (which has sufficiently high probability due to Lem. 9). If so, compute a = x^{r/2} (note that x^{r/2} cannot be 1 mod N due to the definition of the order) and use Lem. 8 to find a factor of N; otherwise, repeat. The algorithm is run until all prime factors have been found. Since the greatest common divisor can be computed efficiently in polynomial time in the input length n (for example using Euclid's algorithm), having a polynomial-time algorithm for order-finding results in a polynomial-time algorithm for factorization.
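The classical reduction just described can be sketched in a few lines, with a brute-force `order()` standing in for the quantum order-finding subroutine (the function names are ours, chosen for illustration):

```python
import random
from math import gcd

def order(x, N):
    """Brute-force order-finding: smallest r with x^r = 1 mod N.
    This stands in for the quantum order-finding subroutine."""
    r, y = 1, x % N
    while y != 1:
        y = (y * x) % N
        r += 1
    return r

def find_factor(N, seed=0):
    """One non-trivial factor of an odd, composite N via Lem. 8 and Lem. 9."""
    rng = random.Random(seed)
    while True:
        x = rng.randrange(2, N)
        g = gcd(x, N)
        if g > 1:
            return g              # lucky draw: x already shares a factor with N
        r = order(x, N)
        if r % 2 == 1 or pow(x, r // 2, N) == N - 1:
            continue              # trivial case; pick a new x (rare, by Lem. 9)
        a = pow(x, r // 2, N)     # non-trivial square root of 1 mod N
        return max(gcd(a - 1, N), gcd(a + 1, N))

print(find_factor(21))   # prints 3 or 7
```

The whole quantum content of Shor's algorithm is confined to replacing `order()` with an efficient subroutine; everything else above runs in polynomial time classically.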

Order-finding à la Shor
Shor's coup of an efficient order-finding protocol, depicted schematically in Fig. 4, is at the heart of the factorization algorithm. This standard protocol for order-finding utilizes two quantum systems A and B of dimension q and N respectively, where system A consists of L qubits such that N^2 < q = 2^L < 2N^2, with N being the number to factor. Along with the classical post-processing via the continued fraction algorithm, the quantum part of the protocol can be separated into three essential ingredients: preparation of an initial state, the so-called modular exponentiation, and a measurement. The modular exponentiation is defined by the controlled-like unitary |k⟩_A |n⟩_B ↦ |k⟩_A U_B^k |n⟩_B, where U_B |n⟩_B = |xn mod N⟩_B. Note that the modular exponentiation can be implemented in polynomial time [2,83-85]. It encodes information about the order r into the state of system A, requiring only knowledge of x and the number N to be factored. The encoding of this information depends on the initial state of the auxiliary system B, and a convenient choice is the state |1⟩_B. Let us emphasize that other incoherent states can be used as well. For instance, in Ref. [12] it is shown that choosing the normalized maximally mixed initial state 1_B/N will increase the runtime of the algorithm at most polynomially. In fact, for factorization problems of the form N = pq, where p and q are primes, the increase is asymptotically negligible. After performing the modular exponentiation, the auxiliary system is discarded. For our purposes, the action of the modular exponentiation on system A will be fixed and labeled E. This channel E admits the following simple structure.
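The polynomial-time implementability of the modular exponentiation rests on repeated squaring: x^k mod N needs only O(log k) modular multiplications, and the quantum circuit implements the controlled version of this arithmetic. A minimal classical sketch (Python's built-in `pow(x, k, N)` uses the same idea):

```python
def mod_exp(x, k, N):
    """Square-and-multiply: x**k mod N with O(log k) modular multiplications."""
    result, base = 1, x % N
    while k > 0:
        if k & 1:                     # include the current bit of k
            result = (result * base) % N
        base = (base * base) % N      # square for the next bit
        k >>= 1
    return result
```

For example, `mod_exp(7, 4, 15)` returns 1, reflecting that 7 has order 4 modulo 15.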
Lemma 10. If system B is in the state |1⟩_B, then the effect of the modular exponentiation on system A is given by E(ρ) = (1/r) Σ_{j=0}^{r−1} R_{j/r} ρ R_{j/r}^†, where the R_{j/r} denote rotations around multiples of the fraction j/r, i.e., R_{j/r} = Σ_n e^{2πi(j/r)n} |n⟩⟨n|.
Proof. Notice that by definition of the order-finding problem x^r = 1 mod N. It follows that U_B^r = 1_B, since for all n we find U_B^r |n⟩_B = |x^r n mod N⟩_B = |n⟩_B. Hence, orthonormal eigenstates |ψ_j⟩_B of U_B are given by |ψ_j⟩_B = (1/√r) Σ_{s=0}^{r−1} e^{−2πi js/r} |x^s mod N⟩_B, with corresponding eigenvalues e^{2πi j/r}. This allows us to expand the auxiliary state as |1⟩_B = (1/√r) Σ_{j=0}^{r−1} |ψ_j⟩_B. With this at hand, it is straightforward to calculate the reduced action on system A, which yields the claimed uniform mixture of rotations. Let us emphasize the resemblance of E to a symmetry operation that gives rise to the resource theory of asymmetry [86-88]. In this particular case, the symmetry group elements are simple rotations, uniformly weighted to define the symmetry operation E. This symmetry group gives rise to the resource theory of coherence as a special case [26,27,89]. Any incoherent state is left invariant under the action of E, i.e., an incoherent state is symmetric with respect to the symmetry group, thereby naturally selecting a set of free states. On the contrary, any coherent state will encode information about r, thus being useful at least in principle for the task of order-finding. Analyzing the protocol in the framework of coherence theory is therefore a natural step. Concretely, in this work the performance of the protocol will be quantitatively linked to the ability to create and detect coherence.
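The eigenstate construction in the proof can be checked numerically for a small instance. The sketch below (a sanity check, not part of the protocol; names are ours) takes N = 15 and x = 7, for which r = 4, represents U_B as a permutation of the computational basis, and verifies U_B |ψ_j⟩ = e^{2πi j/r} |ψ_j⟩.

```python
import cmath

N, x = 15, 7
# determine the order r of x modulo N
r, y = 1, x % N
while y != 1:
    y = (y * x) % N
    r += 1

def apply_U(vec):
    """U_B |n> = |x*n mod N>, acting on a length-N complex amplitude vector."""
    out = [0j] * N
    for n, amp in enumerate(vec):
        out[(x * n) % N] += amp
    return out

for j in range(r):
    # |psi_j> = r**-0.5 * sum_s exp(-2 pi i j s / r) |x**s mod N>
    psi = [0j] * N
    for s in range(r):
        psi[pow(x, s, N)] += cmath.exp(-2j * cmath.pi * j * s / r) / r ** 0.5
    Upsi = apply_U(psi)
    phase = cmath.exp(2j * cmath.pi * j / r)
    # eigenvalue equation U|psi_j> = e^{2 pi i j / r} |psi_j>
    assert all(abs(Upsi[n] - phase * psi[n]) < 1e-12 for n in range(N))
```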
Furthermore, it has to be noted that not every single rotation E_j encodes the order r the way we wish. In fact, any rotation E_j with gcd(j, r) > 1 is equivalent to a rotation around an angle depending on a factor of r rather than r itself. Fortunately, this case is sufficiently rare to still allow for a post-processing strategy that estimates r from the measurement statistics efficiently. After the modular exponentiation, a measurement of system A in the Fourier basis produces an outcome k that is forwarded to the continued fraction algorithm (CFA), which computes a continued fraction decomposition of k/q.
The continued fraction algorithm computes the decomposition of a number x in the following iterative form: the sum of its integer part and the reciprocal of another number, which is then written as the sum of its integer part and another reciprocal, and so on; see for example Ref. [72]. This decomposition is typically denoted as x = [a_0, a_1, a_2, . . .], where the list is finite for rational x, i.e., x = [a_0, a_1, . . ., a_n], and infinite otherwise. The so-called convergents, or specifically the m-th convergent of x, is defined by [a_0, a_1, . . ., a_m]. The post-processing of measurement results is done by computing the convergents of k/q. Some measurement results give sufficiently good approximations to some j/r that allow recovering the latter fraction from k/q by using the CFA to compute the convergents, one of which matches j/r. To clarify which measurement outcomes do so, we continue with the following result from number theory, belonging to the study of Diophantine approximation, i.e., the approximation of irrational numbers by rational ones. The statement can be found in various textbooks on number theory, see for example Ref. [90]. The first part is also treated in the textbook [72], and for completeness, we give a short proof based on Ref. [90].
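A minimal implementation of the decomposition and its convergents, using the standard recurrence p_m = a_m p_{m−1} + p_{m−2}, q_m = a_m q_{m−1} + q_{m−2} (function names are ours):

```python
from fractions import Fraction

def continued_fraction(p, q):
    """Return [a0, a1, ...] with p/q = a0 + 1/(a1 + 1/(...))."""
    coeffs = []
    while q:
        coeffs.append(p // q)
        p, q = q, p % q
    return coeffs

def convergents(coeffs):
    """Yield the convergents p_m/q_m of a continued fraction expansion."""
    p_prev, p_cur, q_prev, q_cur = 1, coeffs[0], 0, 1
    yield Fraction(p_cur, q_cur)
    for a in coeffs[1:]:
        p_prev, p_cur = p_cur, a * p_cur + p_prev
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        yield Fraction(p_cur, q_cur)
```

For example, the outcome k = 192 with q = 256 gives k/q = 3/4: `continued_fraction(192, 256)` returns [0, 1, 3], whose convergents 0, 1, 3/4 contain the fraction j/r = 3/4.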
Theorem 11. Let x be a positive number and p/q a positive rational number. If |x − p/q| < 1/(2q²), then p/q is a convergent of x. Conversely, if p/q is a convergent of x, then |x − p/q| ≤ 1/q².

Proof. Let p_n/q_n denote the convergents of the continued fraction decomposition of x. The sequence (q_n)_n is increasing [90]; thus there exists some integer n such that q_n ≤ q < q_{n+1}. Now assume that p/q satisfies the inequality (B6) but is not a convergent of the continued fraction, i.e., p/q ≠ p_n/q_n for all n. The convergents p_n/q_n are precisely the best approximations to x in the second sense; thus, |qx − p| < |q_n x − p_n| implies q > q_{n+1} [90]. Therefore, if p/q is not a convergent with q < q_{n+1} (if q = q_{n+1} there is nothing to show), we find |q_n x − p_n| ≤ |qx − p| < 1/(2q), since p/q satisfies Eq. (B6) by assumption. This yields 1/(q q_n) ≤ |p/q − p_n/q_n| ≤ |x − p/q| + |x − p_n/q_n| < 1/(2q²) + 1/(2q q_n), and thus q_n > q, which is a contradiction to q_n ≤ q < q_{n+1}. Therefore, we find that q = q_n and consequently p = p_n, which concludes the first part of the statement.
For the second part, we can make use of the so-called complete quotients a'_i, see for example Ref. [90], defined as a'_i = [a_i, a_{i+1}, . . .], which allow us to express an arbitrary x in terms of an arbitrary convergent p_i/q_i as x = (a'_{i+1} p_i + p_{i−1})/(a'_{i+1} q_i + q_{i−1}). It then follows that |x − p_i/q_i| = 1/(q_i (a'_{i+1} q_i + q_{i−1})) ≤ 1/(q_i q_{i+1}), where we used in the last line that a'_{i+1} q_i + q_{i−1} ≥ q_{i+1}. Lastly, since q_{i+1} ≥ q_i, every convergent, and thus also the particular convergent p/q, satisfies the inequality |x − p/q| ≤ 1/q². Recall that in the case of a rational x, i.e., a simple finite continued fraction expansion x = [a_0, a_1, . . ., a_n], we define the denominator of the (n+1)-th convergent simply as q_n, such that the proof also holds for rational x.
This theorem provides a sufficient and a necessary condition on the absolute difference between the number x and a rational approximation p/q for said approximation to be a convergent of x in the continued fraction decomposition. Coming back to the question of which measurement outcomes are useful, we employ the following corollary.
Corollary 12. Let k be an integer with 0 ≤ k < q, where N² < q = 2^L < 2N², that satisfies the inequality |j/r − k/q| ≤ β/(2q) for some coprime pair (j, r) with 0 < j < r and β = (q−1)/r². Then the continued fraction expansion of k/q will yield j/r, and thereby r, as a convergent.
Proof. According to the first part of Thm. 11, any integer k that satisfies |j/r − k/q| < 1/(2r²) will yield j/r as a convergent. Obviously, β/(2q) = (q−1)/(2qr²) < 1/(2r²), so any k as in the statement satisfies this condition. In particular, since β > 1, all integers k that obey the inequality |j/r − k/q| < 1/(2q) will yield j/r as a convergent.
This justifies the choice of the dimension of quantum system A with dim(A) = q at the beginning of the discussion. Let us emphasize that extending the margin of error as in Cor. 12 to a β > 1 has made it possible to sharpen Shor's original bound (which essentially utilizes the weaker bound with β = 1) on the coherent protocol, see for example Refs. [91,92]. The following result clarifies why the post-processing via the CFA works well for a measurement result as in Cor. 12.
Lemma 13. Consider fixed integers N and q > N². i) Assume you have a fixed integer 0 ≤ k < q. Then there exists at most one pair of integers (j, r) with 1 ≤ r < N, 0 ≤ j < r, and gcd(j, r) = 1 such that |j/r − k/q| < 1/(2q). ii) Assume that you have a pair of integers (j, r) with 1 ≤ r < N and 0 ≤ j < r. Then there exists an integer 0 ≤ k < q such that |j/r − k/q| < 1/(2q) is satisfied.
Proof. We begin with i). Assume that there exist two distinct fractions j'/r' ≠ j/r that both satisfy |j/r − k/q| < 1/(2q) and |j'/r' − k/q| < 1/(2q). By the triangle inequality, |j/r − j'/r'| < 1/q. On the other hand, |j/r − j'/r'| = |j'r − jr'|/(rr') ≥ 1/(rr') > 1/N² > 1/q, since r, r' < N and j'r − jr' is a non-zero integer, so |j'r − jr'| ≥ 1. By contradiction, the two fractions are identical.
For ii), we note that the distance between neighboring fractions k/q and (k+1)/q is 1/q. Therefore, there always exists a k' such that |j/r − k'/q| ≤ 1/(2q). Equality can only hold if r is a power of 2, in which case there even exists a k such that k/q equals j/r exactly.
Combining the results of Cor. 12 and Lem. 13 tells us that, given a single rotation E_j as defined in Lem. 10 with j coprime to r, there always exists a measurement outcome k that will yield r via the continued fraction algorithm. With this at hand, we define the following two sets for a fixed j coprime to r: K_1^j = {k : |j/r − k/q| ≤ β/(2q)} and K_2^j = {k : |j/r − k/q| ≤ 1/r²}. (B13) Additionally, we define the sets K_1, K_2 as K_i = ∪_j K_i^j, where the union is formed over all j smaller than and coprime to r. The set K_1 contains all measurement outcomes that will yield the correct order r when fed into the continued fraction algorithm. The second set K_2 consists of all outcomes that obey the necessary condition of Thm. 11 to be a convergent of the CFA, i.e., it contains all outcomes that will yield the correct r via the CFA but potentially also outcomes that do not. Let us conclude this preliminary discussion by noting what happens for an unknown and randomly chosen j (or equivalently a uniformly weighted E_j, as we have here) during the post-processing. Suppose that for the sampled E_j, j and r share a common factor. The post-processing will then at best yield a factor of r and thus fail. This case is however rare: the probability that a randomly chosen j is coprime to r is given by ϕ(r)/r, where ϕ(r) denotes Euler's totient function. This ratio is bounded by ϕ(r)/r > δ/log log r > δ/log log N for some positive constant δ, according to a well-known result by Hardy and Wright [71, Theorem 328]. In fact, the latter inequality is asymptotically tight for infinitely many values of r.
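These sets can be enumerated explicitly for a toy instance. The sketch below (names are ours) uses N = 15 with r = 4 and q = 256, the simpler 1/(2q) margin of Lem. 13 instead of the larger β/(2q) margin, and `Fraction.limit_denominator`, which internally walks the convergents of the continued fraction, as the classical post-processing step.

```python
from fractions import Fraction
from math import gcd

N, r = 15, 4          # e.g. x = 7 has order r = 4 modulo N = 15
q = 2 ** 8            # N**2 = 225 < q = 256 < 2 * N**2 = 450

def recover_denominator(k, q, N):
    """Best rational approximation to k/q with denominator below N."""
    return Fraction(k, q).limit_denominator(N - 1).denominator

# outcomes within the 1/(2q) margin of some j/r with gcd(j, r) = 1
K1 = {k for k in range(q) for j in range(1, r)
      if gcd(j, r) == 1
      and abs(Fraction(j, r) - Fraction(k, q)) < Fraction(1, 2 * q)}

# every such outcome lets the classical post-processing recover r
assert all(recover_denominator(k, q, N) == r for k in K1)
```

For this instance K_1 = {64, 192}, one outcome per j ∈ {1, 3} coprime to r, matching part ii) of Lem. 13 (here r divides q, so the approximation is even exact).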

Sequential order-finding
We now discuss a sequential version of Shor's original order-finding protocol that drastically reduces the number of qubits for large factorization problems. The protocol is based on a semi-classical implementation of the combination of an inverse quantum Fourier transform and a measurement in the computational basis following directly afterward (see Refs. [12,73]): Assume the inverse Fourier transform is implemented via its standard decomposition into Hadamard gates and controlled rotations as depicted in Fig. 3, see also Ref. [72]. Fig. 4 therefore shows an implementation of Shor's algorithm in which the measurement outcome has to be read in reverse order. As explained in detail in Ref. [73], it is then possible to measure the first qubit directly after the first Hadamard gate belonging to the inverse Fourier transform has been implemented (the second Hadamard gate in the figure) and use its outcome to classically control all the following rotations that depend on this qubit. A similar argument holds for the other qubits as well: after the respective Hadamard gate in their line, one can directly measure them and control all following rotations classically, depending on the outcome. Since all the controlled rotations in one line combine into an effective rotation, one can in this way replace them with a single effective classically controlled rotation R'_l that depends on the previous measurement outcomes. This is shown in Fig. 5. Thereby, the gates and measurements on the individual qubits are performed sequentially, which allows splitting Shor's protocol into blocks, see Fig. 6, that each utilize only a single control qubit on which the Hadamard gates and the classically controlled rotations R'_l are performed. The single control qubit can be recycled after each block, such that the total number of qubits required decreases to log N + 1.
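The sequential protocol is easy to simulate for a single rotation E_j, since each block then acts on a pure single-qubit state. The sketch below samples one L-bit outcome for the eigenphase φ = j/r; the feedback phase `omega` plays the role of the classically controlled rotation R'_l (our indexing convention, measuring the least-significant bit first, is one of several equivalent ones, and all names are ours).

```python
import math
import random

def sample_outcome(phi, L, rng):
    """Sample one L-bit outcome of the semi-classical protocol for a single
    eigenphase phi (i.e. the rotation E_j with phi = j/r)."""
    bits = []                                    # k_0 (LSB) is measured first
    for m in range(L):
        # phase accumulated on the control qubit by controlled-U^(2^(L-1-m))
        theta = (2 ** (L - 1 - m)) * phi
        # classically controlled feedback rotation built from earlier outcomes
        omega = sum(bits[m - a] / 2 ** (a + 1) for a in range(1, m + 1))
        # Hadamard + measurement: P(bit = 0) = cos^2(pi * (theta - omega))
        p0 = math.cos(math.pi * (theta - omega)) ** 2
        bits.append(0 if rng.random() < p0 else 1)
    return sum(b << i for i, b in enumerate(bits))

# For an exactly representable phase, e.g. phi = 3/8 with L = 3, the outcome
# k = 3 occurs with certainty, reproducing k / 2**L = phi.
rng = random.Random(0)
assert all(sample_outcome(3 / 8, 3, rng) == 3 for _ in range(20))
```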
Due to this decomposition, Shor's original protocol and the sequential version lead to identical measurement statistics if the auxiliary systems are initialized in the same state.

C. Choosing the free operations
As discussed in the main text, we fix the overall protocol that we investigate and vary only parts of it. Here, we explain our choices in more detail. First, we assume that the post-processing of measurement results after a single round is achieved by the continued fraction algorithm. If this fails, we restart the algorithm and perform the post-processing without accounting for the previous outcomes, thereby ignoring possible correlations between the results of failed trials. In general, this is not the best possible post-processing strategy; an example of a more involved strategy can be found in Ref. [72]. Nevertheless, for simplicity, we assume this fixed post-processing involves only the outcome of individual trials, since we are not optimizing over post-processing strategies anyway. The ability to create, then utilize, and finally detect coherence is a key feature of the protocol. Imposing constraints on these abilities can be done naturally within the framework of dynamical resource theories of coherence.
Figure 6: Sequential order-finding protocol using the semi-classical version of the Fourier transform. The modular exponentiation factors into single-qubit controlled operations given by U_l = U_B^{2^{L−l}}, and the classically controlled rotations are R'_l = Σ_n e^{−2πinφ'_l} |n⟩⟨n|, where the phases φ'_l depend on the previous measurement outcomes k_l via φ'_l = Σ_{a=2}^{l} k_{l−a}/2^a, see Refs. [12,73].
Notice that expressing the protocol in the form of Fig. 6 makes it clear that, except for the Hadamard gates, the protocol utilizes only incoherent input states, channels U_l and R'_l that can neither detect nor create coherence, and measurements in the incoherent basis. Replacing the Hadamard gates by quantum channels S_1^{(l)}[Θ_l] and S_2^{(l)}[Λ_l], respectively, results in the protocol depicted in Fig. 7. If no particular block is considered, we omit the label l and refer to the channels for creation and detection simply as Θ and Λ.

Figure 7: Circuit representation of the order-finding protocol using only channels Θ_l and Λ_l to create and detect coherence. The outcomes of an (incoherent) projective measurement in the computational basis are forwarded to the classical control and post-processing unit, which re-initializes the single control qubit, classically controls the rotations R'_l to implement the inverse Fourier transform, and lastly computes the continued fraction decomposition to yield an estimate of the order r.
Let us now explain why the symmetry of the fully coherent protocol, which uses the same channel (the Hadamard gate) to create and detect coherence, has to be broken in the more general case: The ability to create and the ability to detect coherence are two fundamentally different properties a quantum channel can possess, which in turn give rise to two different resources that are generally not interconvertible (e.g., a channel Γ(σ) = ρ Tr(σ) can prepare coherence if ρ is chosen suitably, but cannot detect it, whilst a destructive measurement in the Fourier basis can detect but not prepare coherence). The Hadamard gate, however, can create a maximally coherent state (by applying it to |0⟩) and also maximizes the NSID measure [43]. Therefore, it plays a dual role, i.e., it both creates and detects coherence.
As mentioned in the main text, the choice of free channels follows naturally. If Θ is incapable of creating coherence, no information about the order can be encoded. If Λ cannot detect coherence, none of this information can influence the measurement statistics. Thus, the lack of either ingredient renders the protocol practically "useless" by reducing it to a random number generator independent of the order that it is supposed to estimate; moreover, it becomes classically simulable. Therefore, the free channels are chosen as the maximally incoherent channels MIO [17] for creation and the detection-incoherent channels DI [43], also known as non-activating [41], for detection. Let us mention that this random number generator gives rise to different probability distributions depending on the structure of the free channels Θ_free and Λ_free. Details follow in the next section.
It is tempting to choose the set of creation-detection-incoherent channels CDI as the set of free channels, i.e., the channels that can neither create nor detect coherence, also known as dephasing-covariant [93-95], classical [46], or commuting [41] channels. This would keep the symmetry of the protocol and seems an intuitive choice, as it leads to a "fully classical" protocol. However, it does not lead to a consistent connection between operational advantages and deployed resources: Imagine we used a channel Λ ∈ DI with Λ ∉ MIO for detection. Although it grants no operational advantage, this channel would have to be considered resourceful. In contrast, our choice of different sets of free channels naturally leads to an operationally meaningful use of resources.
Furthermore, the channel Λ utilized in the detection scheme is assumed to be a unital map. This assumption is physically motivated: The measurement statistics of the incoherent measurement are uniquely determined by the pre-measurement populations. To be maximally sensitive to information about r, we want the deviation of the measurement statistics from a flat distribution to depend purely on the coherences that Λ maps to populations, and not on a reshuffling of populations that carries no information about r. Without knowing r, we can choose the free super-channels S_1 and S_2 such that this is the case iff Λ is unital. Since the state before U_l is still independent of r, we can always choose S_1 such that its populations equal those of a maximally mixed state without affecting the coherences (because the phases of the coherences are still independent of r and therefore known). After U_l, however, the phases of the coherences depend on r, and we thus cannot alter the populations without varying the coherences (or knowing r). Hence, if Λ were not unital but could detect coherence, the following might happen: The measurement statistics depend more strongly on the population reshuffling than on the detected coherences. In this case, we would perform worse than with a free channel that leads to uniformly distributed random numbers, which on average produce better guesses of r than random numbers weighted in a way that does not depend on r. To avoid this, we must choose the Λ_l to be unital and, similarly, choose the super-channels S_2^{(l)} to be unitality-preserving.
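Unitality is a simple algebraic condition on the Kraus operators K_i of a channel: the channel is trace-preserving iff Σ_i K_i† K_i = 1 and unital iff Σ_i K_i K_i† = 1. The following check uses two standard textbook single-qubit channels as examples (they are illustrations of the condition only, not channels appearing in the protocol):

```python
import math

def mul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dag(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def kraus_sum(kraus, adjoint_first):
    """Sum K^dag K (trace preservation) or K K^dag (unitality) over Kraus ops."""
    S = [[0j, 0j], [0j, 0j]]
    for K in kraus:
        P = mul(dag(K), K) if adjoint_first else mul(K, dag(K))
        S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return S

def is_identity(A, eps=1e-12):
    return all(abs(A[i][j] - (1 if i == j else 0)) < eps
               for i in range(2) for j in range(2))

g = 0.3  # damping strength
amplitude_damping = [[[1, 0], [0, math.sqrt(1 - g)]],
                     [[0, math.sqrt(g)], [0, 0]]]
p = 0.2  # dephasing probability
dephasing = [[[math.sqrt(1 - p), 0], [0, math.sqrt(1 - p)]],
             [[math.sqrt(p), 0], [0, -math.sqrt(p)]]]

for ch in (amplitude_damping, dephasing):
    assert is_identity(kraus_sum(ch, adjoint_first=True))   # trace-preserving
assert not is_identity(kraus_sum(amplitude_damping, adjoint_first=False))  # not unital
assert is_identity(kraus_sum(dephasing, adjoint_first=False))              # unital
```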

D. Success probability
The success probability of the order-finding protocol, consisting of the quantum part combined with the continued fraction algorithm, can now be expressed. To ease the notation, we make use of the equivalence between Shor's original version and the sequential version. That way, there is no need to laboriously track the back-action of the measurements in each block on the auxiliary system, which allows us to express the success probability compactly. Recall that the detection part, i.e., the standard implementation of the inverse Fourier transform (see Fig. 3), was altered only by replacing the Hadamard gates with channels S_2^{(l)}[Λ_l], where E denotes the uniformly weighted rotations described in Lem. 10. After completing all blocks, the measurement outcome k is forwarded to the CFA, which returns the order r with probability P(k → r | CFA). Therefore, the probability that the order-finding protocol in Fig. 7 succeeds is given by P_succ(S_1, S_2, Θ, Λ). Since the super-channels S_1 and S_2 are available at no cost, we choose them optimally (but without knowledge of r and in a way that is implementable efficiently), which ensures that the available resources are used adequately. The resulting success probability is then bounded as stated below.

Proof. Since a_l ≥ 0, the upper bound holds trivially. For the lower bound, notice that the term 0 ≤ cos(π/2^l) < 1 rapidly converges to one for increasing l; thereby, it is reasonable that the deviation from the simple upper bound is small. First rewrite the product accordingly. Now remember that up to here, we assumed that we replaced E with E_j. This is of course not possible, since it would require knowledge of r. To get back to the original protocol, we note that applying E corresponds to applying E_j with j ∈ {0, . . ., r − 1} chosen uniformly at random. The number of such j with gcd(j, r) = 1 is given by ϕ(r), where ϕ(r) denotes Euler's totient function, which satisfies ϕ(r) > δr/log log N for some δ > 0, where δ ≈ e^{−γ} with γ being the Euler-Mascheroni constant, see for instance
Ref. [71, Theorem 328], which connects this bound to the original bound derived by Shor [2,12]. For a perfectly coherent protocol, this bound would take the form P_succ > (4/π²) δ/log log r, which equals the bound originally obtained by Shor [2]. In subsequent works, see for example Refs. [91,92], it has been shown that for the fully coherent protocol, the factor 4/π² ≈ 0.4 can be pushed to about 0.9 (at least in an average case) by a more careful, yet tedious, analysis. The basic idea behind these proofs is to consider as useful outcomes not only the set K_1 but to stretch the definition of said set as outlined in Cor. 12. Since continuity in the dynamical measures C(Θ) and M_⋄(Λ) is to be expected, it would not be surprising if the bound in Eq. (E9) could be sharpened analogously. For now, we leave this to future work.
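The prefactor 4/π² appearing in these bounds is consistent with the square of Viète's infinite product Π_{l≥2} cos(π/2^l) = 2/π, which is exactly the type of cos(π/2^l) factor collected over the blocks in the proof above. A quick numerical check:

```python
import math

prod = 1.0
for l in range(2, 60):                 # the product converges extremely fast
    prod *= math.cos(math.pi / 2 ** l)

# Viete's formula: the infinite product equals 2/pi, so its square is 4/pi**2
assert abs(prod - 2 / math.pi) < 1e-12
assert abs(prod ** 2 - 4 / math.pi ** 2) < 1e-12
```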

Classical limit
As already pointed out, the classical limit of the protocol uses only free channels Θ_free and Λ_free and corresponds to a random number generator: it returns a number in the range 0 ≤ k ≤ 2^L − 1 with a probability distribution that depends on the structure of the free channels but is independent of the order r.

Figure 4: Decomposition of Shor's algorithm, with the inverse Fourier transform decomposed into Hadamard gates and controlled rotations R_l. A controlled rotation R_l applies the phase factor e^{−2πi/2^l} to |1⟩ and leaves |0⟩ unchanged. This leads to a total outcome k = Σ_{i=0}^{L−1} 2^i k_i.

P_succ(Θ_l, Λ_l) ≥ (4/π²)(ϕ(r)/r) Π_l C(Θ_l) M_⋄(Λ_l). (E24) In particular, if the same channels are utilized in each block, the bound simplifies to P_succ(Θ, Λ) ≥ (4/π²)(ϕ(r)/r)[C(Θ) M_⋄(Λ)]^L. Recall that Euler's totient function grows almost linearly in its argument and is strictly bounded by ϕ(r) > δr/log log r > δr/log log N.

Figure 3: Standard decomposition of the inverse Fourier transform into Hadamard gates and controlled rotations R_l. The controlled rotation R_l applies the phase factor e^{−2πi/2^l} to |1⟩ and leaves |0⟩ unchanged. An additional initial reordering of the qubits in reverse order is not shown.