Differential equation resolution

Introduction

We present here a Perceval implementation of a Quantum Machine Learning algorithm for solving differential equations. Its aims is to approximate the solution to the differential equation considered in [1]:

\[\frac{d f}{d x}+\lambda f(x)(\kappa+\tan (\lambda x))=0\]

with boundary condition \(f(0)=f_{0}\). The analytical solution is \(f(x)=\exp (-\kappa \lambda x) \cos (\lambda x)+ f_0 - 1\).

QML Loss Function Definition

In order to use QML to solve this differential equation, we first need to derive from it a loss function whose minimum is associated to its analytical solution.

Let \(F\left[\left\{d^{m} f / d x^{m}\right\}_{m},f, x\right]=0\) be a general differential equation verified by \(f(x)\), where \(F[.]\) is an operator acting on \(f(x)\), its derivatives and \(x\). For the solving of a differential equation, the loss function described in [1] consists of two terms

\[\mathcal{L}_{\boldsymbol{\theta}}\left[\left\{d^{m} g / d x^{m}\right\}_{m},g, x\right]:=\mathcal{L}_{\boldsymbol{\theta}}^{(\mathrm{diff})}\left[\left\{d^{m} g / d x^{m}\right\}_{m},g, x\right]+\mathcal{L}_{\boldsymbol{\theta}}^{(\text {boundary})}[g, x].\]

The first term \(\mathcal{L}_{\boldsymbol{\theta}}^{(\mathrm{diff})}\) corresponds to the differential equation which has been discretised over a fixed regular grid of \(M\) points noted \(x_i\):

\[\mathcal{L}_{\boldsymbol{\theta}}^{(\mathrm{diff})}\left[\left\{d^{m} g / d x^{m}\right\}_{m},g, x\right]:=\frac{1}{M} \sum_{i=1}^{M} L\left(F\left[d_{x}^m g\left(x_{i}\right), g\left(x_{i}\right), x_{i}\right], 0\right),\]

where \(L(a,b) := (a - b)^2\) is the squared distance between two arguments. The second term \(\mathcal{L}_{\boldsymbol{\theta}}^{(\text {boundary })}\) is associated to the initial conditions of our desired solution. It is defined as:

\[\mathcal{L}_{\boldsymbol{\theta}}^{\text {(boundary) }}[g, x]:=\eta L\left(g(x_0), f_{0}\right),\]

where \(\eta\) is the weight granted to the boundary condition and \(f_{0}\) is given by \(f(x_0) = f_0\).

Given a function approximator \(f^{(n)}(x, \boldsymbol{\theta}, \boldsymbol{\lambda})\), the loss function above will be minimised using a classical algorithm, updating the parameters \(\boldsymbol{\theta}\) based on samples obtained using a quantum device.

Quantum circuit architecture

The feature map used is presented in [2,3,4]. The quantum circuit architecture from [4] is expressed as \(\mathcal{U}(x, \boldsymbol{\theta}):=\mathcal{W}^{(2)}\left(\boldsymbol{\theta}_{2}\right) \mathcal{S}(x) \mathcal{W}^{(1)}\left(\boldsymbol{\theta}_{1}\right).\) The phase-shift operator \(\mathcal{S}(x)\) incorporates the \(x\) dependency of the function we wish to approximate. It is sandwiched between two universal interferometers \(\mathcal{W}^{(1)}(\boldsymbol{\theta_1})\) and \(\mathcal{W}^{(2)}(\boldsymbol{\theta_2})\), where the beam-splitter parameters \(\boldsymbol{\theta_1}\) and \(\boldsymbol{\theta_2}\) of this mesh architecture are tunable to enable training of the circuit. The output measurement operator, noted \(\mathcal{M}(\boldsymbol{\lambda})\), is the projection on the Fock states obtained using photon-number resolving detectors, multiplied by some coefficients \(\boldsymbol{\lambda}\) which can also be tunable. Formally, we have:

\[\mathcal{M}(\boldsymbol{\lambda}) = \sum_{\mathbf{\left | n^{(f)}\right \rangle}}\lambda_{\mathbf{\left | n^{(f)}\right \rangle}}\mathbf{\left | n^{(f)}\right \rangle}\mathbf{\left \langle n^{(f)}\right |},\]

where the sum is taken over all \(\binom{n+m-1}{n}\) possible Fock states considering \(n\) photons in \(m\) modes. Let \(\mathbf{\left | n^{(i)}\right \rangle} = \left |n^{(i)}_1,n^{(i)}_2,\dots,n^{(i)}_m\right \rangle\) be the input state consisting of \(n\) photons where \(n^{(i)}_j\) is the number of photons in input mode \(j\). Given these elements, the circuit’s output \(f^{(n)}(x, \boldsymbol{\theta}, \boldsymbol{\lambda})\) is given by the following expectation value:

\[f^{(n)}(x, \boldsymbol{\theta}, \boldsymbol{\lambda})=\left\langle\mathbf{n}^{(i)}\left|\mathcal{U}^{\dagger}(x, \boldsymbol{\theta}) \mathcal{M}(\boldsymbol{\lambda}) \mathcal{U}(x, \boldsymbol{\theta})\right| \mathbf{n}^{(i)}\right\rangle.\]

This expression can be rewritten as the following Fourier series [4]

\[f^{(n)}(x, \boldsymbol{\theta}, \boldsymbol{\lambda})=\sum_{\omega \in \Omega_{n}} c_{\omega}(\boldsymbol{\theta}, \boldsymbol{\lambda}) e^{i \omega x},\]

where \(\Omega_n = \{-n, -n+1, \dots, n-1, n \}\) is the frequency spectrum one can reach with \(n\) incoming photons and \(\{c_\omega(\boldsymbol{\theta}, \boldsymbol{\lambda})\}\) are the Fourier coefficients. The \(\boldsymbol{\lambda}\) parameters are sampled randomly in the interval \([-a;a]\), with \(a\) a randomly chosen integer. \(f^{(n)}(x, \boldsymbol{\theta}, \boldsymbol{\lambda})\) will serve as a function approximator for this chosen differential equation. Differentiation in the loss function is discretised as \(\frac{df}{dx} \simeq \frac{f(x+\Delta x) - f(x-\Delta x)}{2\Delta x}\).

\(n, m,\) and \(\boldsymbol{\lambda}\) are variable parameters defined below. \(\Delta x\) is the mesh size.

Perceval Simulation

Initialisation

[18]:
import perceval as pcvl
import perceval.lib.phys as phys
import numpy as np
from math import comb
from scipy.optimize import minimize
import time
import matplotlib.pyplot as plt
import matplotlib as mpl
import tqdm as tqdm

We will run this notebook with 6 photons

[2]:
nphotons = 6

Differential equation parameters

We define here the value of the differential equation parameters and boundary condition \(\lambda, \kappa, f_0\).

[3]:
# Differential equation parameters
lambd = 8
kappa = 0.1

def F(u_prime, u, x):       # DE, works with numpy arrays
    return u_prime + lambd * u * (kappa + np.tan(lambd * x))
[4]:
# Boundary condition (f(x_0)=f_0)
x_0 = 0
f_0 = 1
[5]:
# Modeling parameters
n_grid = 50    # number of grid points of the discretized differential equation
range_min = 0  # minimum of the interval on which we wish to approximate our function
range_max = 1  # maximum of the interval on which we wish to approximate our function
X = np.linspace(range_min, range_max-range_min, n_grid)  # Optimisation grid
[6]:
# Differential equation's exact solution - for comparison
def u(x):
    return np.exp(- kappa*lambd*x)*np.cos(lambd*x)
[7]:
# Parameters of the quantum machine learning procedure
N = nphotons              # Number of photons
m = nphotons              # Number of modes
eta = 5                   # weight granted to the initial condition
a = 200                   # Approximate boundaries of the interval that the image of the trial function can cover
fock_dim = comb(N + m - 1, N)
# lambda coefficients for all the possible outputs
lambda_random = 2 * a * np.random.rand(fock_dim) - a
# dx serves for the numerical differentiation of f
dx = (range_max-range_min) / n_grid
[8]:
# Input state with N photons and m modes
input_state = pcvl.BasicState([1]*N+[0]*(m-N))
print(input_state)
|1,1,1,1,1,1>

Definition of the circuit

We will generate a Haar-random initial unitary using QR decomposition built in Perceval Matrix.random_unitary, the circuit is defined by the combination of 3 sub-circuits - the intermediate phase is a parameter.

[9]:
"Haar unitary parameters"
# number of parameters used for the two universal interferometers (2*m**2 per interferometer)
parameters = np.random.normal(size=4*m**2)

simulator_backend = pcvl.BackendFactory().get_backend("SLOS")

px = pcvl.P("px")
c = phys.Unitary(pcvl.Matrix.random_unitary(m, parameters[:2 * m ** 2]), name="W1")\
     // (0, phys.PS(px))\
     // phys.Unitary(pcvl.Matrix.random_unitary(m, parameters[2 * m ** 2:]), name="W2")
pcvl.pdisplay(c)
s1 = simulator_backend(pcvl.Matrix.random_unitary(m))
s1.compile(input_state)

0 1 2 3 4 5 W1 Φ=px W2 0 1 2 3 4 5
[9]:
True

Expectation value and loss function computation

The expectation value of the measurement operator \(\mathcal{M}(\boldsymbol{\lambda})\) is obtained directly from Fock state probabilities computed by Perceval. Given this expectation value, the code snippet below computes the loss function defined in the Introduction.

Note the use of the all_prob simulator method giving directly access to the probabilities of all possible output states. This calculation is optimized in SLOS backend.

[10]:
def computation(params):
    global current_loss
    global computation_count
    "compute the loss function of a given differential equation in order for it to be optimized"
    computation_count += 1
    f_theta_0 = 0  # boundary condition
    coefs = lambda_random  # coefficients of the M observable
    # initial condition with the two universal interferometers and the phase shift in the middle
    U_1 = pcvl.Matrix.random_unitary(m, params[:2 * m ** 2])
    U_2 = pcvl.Matrix.random_unitary(m, params[2 * m ** 2:])

    px = pcvl.P("x")
    c = phys.Unitary(U_2) // (0, phys.PS(px)) // phys.Unitary(U_1)

    px.set_value(x_0)
    U = c.compute_unitary(use_symbolic=False)
    s1.U = U
    f_theta_0 = np.sum(np.multiply(s1.all_prob(input_state), coefs))

    # boundary condition given a weight eta
    loss = eta * (f_theta_0 - f_0) ** 2 * len(X)

    # Y[0] is before the domain we are interested in (used for differentiation), x_0 is at Y[1]
    Y = np.zeros(n_grid + 2)

    # x_0 is at the beginning of the domain, already calculated
    Y[1] = f_theta_0

    px.set_value(range_min - dx)
    s1.U = c.compute_unitary(use_symbolic=False)
    Y[0] = np.sum(np.multiply(s1.all_prob(input_state), coefs))


    for i in range(1, n_grid):
        x = X[i]
        px.set_value(x)
        s1.U = c.compute_unitary(use_symbolic=False)
        Y[i + 1] = np.sum(np.multiply(s1.all_prob(input_state), coefs))

    px.set_value(range_max + dx)
    s1.U = c.compute_unitary(use_symbolic=False)
    Y[n_grid + 1] = np.sum(np.multiply(s1.all_prob(input_state), coefs))

    # Differentiation
    Y_prime = (Y[2:] - Y[:-2])/(2*dx)

    loss += np.sum((F(Y_prime, Y[1:-1], X))**2)

    current_loss = loss / len(X)
    return current_loss

Classical optimisation

Finally the code below performs the optimisation procedure using the loss function defined in the previous section. To this end, we use a Broyden–Fletcher–Goldfarb–Shanno (BFGS) optimiser [5] from the SciPy library.

[11]:
def callbackF(parameters):
    """callback function called by scipy.optimize.minimize allowing to monitor progress"""
    global current_loss
    global computation_count
    global loss_evolution
    global start_time
    now = time.time()
    pbar.set_description("M= %d Loss: %0.5f #computations: %d elapsed: %0.5f" %
                         (m, current_loss, computation_count, now-start_time))
    pbar.update(1)
    loss_evolution.append((current_loss, now-start_time))
    computation_count = 0
    start_time = now
[12]:
computation_count = 0
current_loss = 0
start_time = time.time()
loss_evolution = []

pbar = tqdm.tqdm()
res = minimize(computation, parameters, callback=callbackF, method='BFGS', options={'gtol': 1E-2})
M= 6 Loss: 0.01929 #computations: 435 elapsed: 3.72517: : 264it [07:16,  3.46s/it]

After the optimisation procedure has been completed, the optimal unitary parameters (in res.x) can be used to determine the quantum circuit beam-splitter and phase-shifter angles for an experimental realisation.

[13]:
print("Unitary parameters", res.x)
Unitary parameters [ 0.53891548  0.21569128 -0.22353646  0.28563813 -0.29147866  0.43921577
 -1.42398067  0.89122694  1.7983834   0.60008989 -1.65376412 -0.38094132
  0.29041105 -0.21317347  1.45050872  1.46926834 -1.25998162 -1.56496857
  0.86145319 -1.41503119  2.13940914 -1.87598386 -1.21801322  0.30972512
 -0.27351561 -0.4282766  -1.00645191  1.73981227  0.74053531 -0.42923668
  0.54527143 -1.01678203  0.71667792 -0.59972183 -2.90448644  1.50140223
  2.64934115 -0.05122437 -0.35208727 -2.24785128 -0.3136343   0.85498015
 -0.78692954  0.20216458 -0.05461981 -0.39944246  2.30232439 -1.66634158
 -0.76181137  0.45161744 -0.13022815  2.37213915  2.59868001 -0.60661216
  2.47405142  0.07558246 -0.22223953 -1.60395241 -1.62073662  2.78193488
  0.8807037   1.79411624 -0.28278379  1.40664947 -2.2676453   3.28390565
  1.70265253 -0.7578531  -0.86833146  0.49135553 -1.71873575  0.24549672
 -0.49010474 -0.22684666  0.53083833 -0.48257756  1.07001376  1.69976603
 -0.27101161  1.60044371 -0.97041346 -0.89852207 -1.22383548 -0.98609283
 -0.34705425 -1.24259372 -0.6082258  -0.60734625 -1.92371456  0.71529154
 -0.60812648  0.86638546 -0.13906038 -1.92310185 -1.19006334 -0.10496104
  0.75299518  0.44190065 -1.21985268 -2.00148534 -3.96393817  1.7585545
 -0.50994779  0.32836584  0.01673392  1.14178273  1.30651786  2.11460988
 -1.104944    1.39375128 -0.5343951  -1.67729652 -1.29758049 -0.77721965
  0.09478565 -0.35168055  0.35954347 -1.62900678 -1.74163867  0.10919559
  1.62430792 -1.31252367  1.37714924  0.28078232  0.01823145 -1.02650567
 -1.4851842  -0.33864834 -0.08074981 -1.08737797 -0.6167941   0.00810604
  0.34166759 -0.76807854  1.30148629  0.62279576 -0.55109847 -1.40018476
 -1.29953122 -0.24036318 -1.84388996 -1.76089538 -1.31287697 -1.20853267]

Plotting the approximation

We now plot the result of our optimisation in order to compare the QML algorithm’s output and the analytical solution.

[14]:
def plot_solution(m, N, X, optim_params, lambda_random):
    Y = []
    U_1 = pcvl.Matrix.random_unitary(m, optim_params[:2 * m ** 2])
    U_2 = pcvl.Matrix.random_unitary(m, optim_params[2 * m ** 2:])
    simulator_backend = pcvl.BackendFactory().get_backend("SLOS")
    px = pcvl.P("x")
    c = phys.Unitary(U_2) // (0, phys.PS(px)) // phys.Unitary(U_1)

    for x in X:
        px.set_value(x)
        U = c.compute_unitary(use_symbolic=False)
        s1.U = U
        f_theta = np.sum(np.multiply(s1.all_prob(input_state), lambda_random))
        Y.append(f_theta)
    exact = u(X)
    plt.plot(X, Y, label="Approximation with {} photons".format(N))
[19]:
X = np.linspace(range_min, range_max, 200)

# Change the plot size
default_figsize = mpl.rcParamsDefault['figure.figsize']
mpl.rcParams['figure.figsize'] = [2 * value for value in default_figsize]

plot_solution(m, N, X, res.x, lambda_random)

plt.plot(X, u(X), 'r', label='Analytical solution')
plt.legend()
plt.show()
../_images/notebooks_Differential_equation_solving_23_0.png
[20]:
plt.plot([v[0] for v in loss_evolution])
plt.yscale("log")
plt.xlabel("Number of epochs")
plt.ylabel("Loss function value")
[20]:
Text(0, 0.5, 'Loss function value')
../_images/notebooks_Differential_equation_solving_24_1.png

References

[1] : O. Kyriienko, A. E. Paine, and V. E. Elfving, “Solving nonlinear differential equations with differentiable quantum circuits”, Physical Review A 103, 052416 (2021). https://journals.aps.org/pra/abstract/10.1103/PhysRevA.103.052416

[2] A. Pérez-Salinas, A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre, “Data re-uploading for a universal quantum classifier”, Quantum 4, 226 (2020). https://quantum-journal.org/papers/q-2020-02-06-226/

[3] M. Schuld, R. Sweke, and J. J. Meyer, “Effect of data encoding on the expressive power of variational quantum-machine-learning models”, Physical Review A 103, 032430 (2021). https://journals.aps.org/pra/abstract/10.1103/PhysRevA.103.032430

[4] B. Y. Gan, D. Leykam, D. G. Angelakis, and D. G. Angelakis, “Fock State-enhanced Expressivity of Quantum Machine Learning Models”, in Conference on Lasers andElectro-Optics (2021), paper JW1A.73. Optica Publishing Group, (2021). https://opg.optica.org/abstract.cfm?uri=CLEO_AT-2021-JW1A.73.

[5] R. Fletcher, Practical methods of optimization. John Wiley & Sons. (2013) https://onlinelibrary.wiley.com/doi/book/10.1002/9781118723203