Question on async job interface

SwgAnno · October 18, 2024, 10:30am

Hi there,

I’m writing a script that allows me to segment a long experiment into smaller batches so I can run experiments without hoarding on the QPU.
In particular It’s mandatory for me to keep track of the calibration data of the QPU in each segmented job I launch.

This last requirement (which I assume is not that rare) interacts very wierdly with how async jobs are created and resumed and I wanted to share my thoughts on that.

Async jobs are meant to be resumed in a second time with the job ID, possibly by a different script
Remote async jobs require an active RemoteProcessor to be resumed
The RemoteProcessor that resumes the job could be connected to a different backend than the one with which the job was originally created. The identity between these two objects could be an important requirement in some application, are check of this kind performed anywhere?
The identity between the processor that creates the job and the one that resumes it could be technically guaranteed by noting down locally the backend used, but in my case this is not enough as the calibration data reported when I resume the job will likely differ from the one reported when the job was created (the latter are also more likely to be close in time to when the job is actually executed).
In case I decide to further note down locally the calibration data, this renders any kind of identity between the creator and resumer Processor useless: the resumer RemoteProcessor just becomes a mean to retrieve the raw counts data from the cloud, regardless of the backend.

Do you have any thoughts on this problem?
Is it really an inconsistency?
Am I handling things the right way?

Eric · October 22, 2024, 10:14am

Hello,

First of all, we’re currently working on a job management system that will allow you to easily group jobs, run them, get their data back easily. This feature will be released in Perceval 0.12 and will address most of your issues.

Answers to your points:

The RemoteProcessor that resumes the job could be connected to a different backend than the one with which the job was originally created. The identity between these two objects could be an important requirement in some application, are check of this kind performed anywhere?

Getting the data back from a job requires an authenticated access to the Cloud (we need to check if you have the rights to retrieve a job’s status or results before handing them to you). The easiest way of creating this authenticated access is through the RemoteProcessor class. As you said, you can instantiate it with a different platform name from the one you used to launch a given job (e.g. you create a remote processor targetting “qpu:altair” to run the job, then you create a remote processor targetting “sim:altair” to retrieve its results). Although this can be misleading, it’s also a quality of life feature.
There is no check between both platform names. However, when you execute a job on a given platform, you are assured it’ll have run on this platform and no other, even if you retrieve the results through the connexion to another platform.

The identity between the processor that creates the job and the one that resumes it could be technically guaranteed by noting down locally the backend used, but in my case this is not enough as the calibration data reported when I resume the job will likely differ from the one reported when the job was created (the latter are also more likely to be close in time to when the job is actually executed).

You’re right: you have to save the source characterisation data of the moment you run the job manually. There’s no better way in the current Perceval version. We’ll consider adding such a feature in the job management system we’re implementing.

In case I decide to further note down locally the calibration data, this renders any kind of identity between the creator and resumer Processor useless: the resumer RemoteProcessor just becomes a mean to retrieve the raw counts data from the cloud, regardless of the backend.

Yes, the platform you use when retrieving status/results is actually meaningless. The only important data here, are the Cloud URL (which defaults to Quandela Cloud) + your token to authenticate you.

Am I handling things the right way?

Given the tools you have, I guess so. We’re definitely aiming at making the Cloud users life better with the upcoming job management feature. Please, stay tuned and tell me would you have specific needs on that matter that you’d like to see in Perceval.