# Computing Backends

To run a simulation, Perceval is integrating different computing backends implemented from state of the art algorithms. Each of these backends have different specificities and features that we describe in that section.

## Features

### Sampling or Weak Simulation

Sampling is the simulation task closest to the actual running of a physical circuit. Given known input states, the sampling will produce output states one at a time as it would be observed by ideal detectors. Sampling is considered as a weak simulation of a circuit, since it does not give explicit output distribution, nor the nature of the mixture states generated by the circuit but a mere observation of individual outputs.

Sampling has been studied in length (see , ), in the context of Boson Sampling, as a classical computing challenge pushing further the limit of the quantum supremacy.

### Strong Simulation

Strong(er) Simulation (see ) of a circuit should provide an access to complete distribution and nature of the output state. Compared to Sampling, circuit designer are interested in the actual probabilistic distribution of the outputs and their exact characteristic.

In particular, for a $$m$$ port circuit, one would like to know the exact expected probability of detecting photons on a given port, and not a mere estimation based on sampling observations. We also differentiate here the ability of getting the probability (or probability amplitude) of a single output, and the possibilities of getting probabilities of all the different outputs for a give input state.

Also, even for a simple circuit showing an equiprobable probability of detecting $$\ket{0,1}$$ and $$\ket{1,0}$$, we would want to know if the output is $$\frac{1}{\sqrt 2}(\ket{0,1}+\ket{1,0})$$ or $$\frac{1}{\sqrt 2}(\ket{0,1}-\ket{1,0})$$ which are very distinct states.

Finally, a fine-grained simulation would need not only to give output state probability but also probability amplitude. Indeed, probability amplitude is required for further evolution of the output states, but also analysis of polarization for circuit with polarization support, etc.

### Strongest Simulation

Beyond simulation of perfect circuit describes by unitary matrix, goal of Perceval is also to model non linear phenomenon like loss of photons, noise, time delays, and more. Ideal simulators should take these phenomenon into accounts.

## The Backends

Perceval has built-in 4 different backends with the support of optimized C-library documented here.

Perceval also integrates some connectors with 3rd-Party framework for compatibility purpose.

### Comparison Table

Features Name

CliffordClifford2017

SLOS

Naive

Stepper

Sampling Efficiency

$$\mathrm{O}(n2^n+poly(m,n))$$

$$\mathrm{O}(mC_n^{n+m-1})$$

N/A 1

N/A 1

Single output Efficiency

N/A

N/A

$$\mathrm{O}(n2^n)$$

$$\mathrm{o}(N_cC_n^{n+m-1})$$

Full Distribution Efficiency

N/A

$$\mathrm{O}(nC_n^{n+m-1})$$

$$\mathrm{O}(n2^nC_n^{n+m-1})$$

$$\mathrm{o}(N_cC_n^{n+m-1})$$

Probability Amplitude

No

Yes

Yes

Yes

Support Symbolic Computation

No

Yes

No

Yes

Support of Time-Circuit

No

No

No

Yes

Practical Limits

$$n\approx30$$

$$n,m<20$$

$$n\approx30$$

where:

• $$n$$ is the number of photons

• $$m$$ is the number of modes

• $$N_c$$ is the number of elementary circuits

### CliffordClifford2017

This backend is the implementation of the algorithm introduced in . The algorithm, applied to Boson Sampling, aims to produce provably correct random samples from a particular quantum mechanical distribution. Its time and space complexity are respectively $$\mathrm{n2^n+mn^2}$$ and $$\mathrm{m}$$ (in addition to matrix storing). The algorithm has been implemented in C++, and uses an adapted Glynn algorithm to efficiently compute $$n$$ simultaneous sub-permanents.

Recently, the same authors have proposed a faster algorithm in with an average time complexity of $$\mathrm{n\rho_\theta^n}$$ for a number of modes $$m=\theta n$$ which is linear in the number of photons $$n$$, where:

$\rho_\theta = \frac{(2\theta+1)^{2\theta+1}}{(4\theta)^{4\theta}(\theta+1)^{\theta+1}}$

For example, if we were to work with dual rail path encoding (ignoring for now the number of auxiliary modes required), we would typically work with $$\theta=2$$, and the average performance is then $$\mathrm{n(\frac{5^5}{8^23^3})^n} \approx \mathrm{n1.8^n}$$.

### SLOS

The Strong Linear Optical Simulation SLOS algorithm developed by a subset of the present authors is introduced in . It unfolds the full computation path in memory, leading to a remarkable time complexity of $$\mathrm{nC_n^{n+m-1}}$$ for computing the full distribution. The current implementation also allows restrictive sets of outputs, with average computing time in $$\mathrm{n\rho_\theta^n}$$ for single output computation. As discussed in , Boson Sampling with SLOS is possible with the time complexity of , though it has not yet been implemented in the current version of Perceval.

The tradeoff in this approach is a huge memory usage in $$\mathrm{nC^{n+m-1}_n}$$ that limits usage on personal computers to circuits with $$\approx 20$$ photons and to $$\approx 24$$ photons on super-computers.

### Naive

This backend implements direct permanent calculation and is therefore suited for single output probability computation with small memory cost. Both Ryser’s and Glynn’s algorithms have been implemented. Extra-care has been taken on the implementation of these algorithms, with usage of different optimisation techniques including native multithreading and SIMD vectorisation primitives. Benchmarking of these algorithms and comparison with the implementation present in the The Walrus library is provided in following figure: Comparison of the average time 2 to calculate a permanent of an $$n\times n$$ Haar random matrix. The processor is a 32 core, 3.1GHz Intel Haswell. For The Walrus, version 0.19 is used and installed from pypi. The Ryser implementation is run on 4 or 32 threads. The Glynn implementation is run on a single thread. What is interesting to note is that all implementations have convergence to the theoretical performance but the factor between optimised and less optimised implementation still makes a perceptible time difference for the end-user.

### Stepper

This backend takes a totally different approach. Without computing the circuit’s overall unitary matrix first, it applies the unitary matrix associated with the components in each layer of the circuit one-by-one, simulating the evolution of the statevector. The complexity of this backend is therefore proportional to the number of components. It has the nice features that:

• it supports non linear optical components like Time Delay;

• it is very flexible with simulating noise in the circuit, like photon loss;

• it enables simpler debugging of circuits by exposing intermediate states.

Footnotes

1(1,2)

Those backends technically support sampling, but to do so, they need to compute the full output distribution which is totally inefficient.

2

Following the methodology presented at https://the-walrus.readthedocs.io/en/latest/gallery/permanent_tutorial.html.