JobGroup

The JobGroup class is designed to help manage jobs client-side by storing them in named groups. Large experiments can easily be split into chunks and run over multiple days and multiple Python sessions; the job group ensures all data can be retrieved from a single location.

Usage example

Here’s an example creating a job group with two jobs, running the same acquisition on a post-processed vs. a heralded CNOT gate:

>>> import perceval as pcvl
>>> from perceval.algorithm import Sampler
>>>
>>> p_ralph = pcvl.RemoteProcessor("sim:altair")
>>> p_ralph.add(0, pcvl.catalog["postprocessed cnot"].build_processor())
>>> p_ralph.min_detected_photons_filter(2)
>>> p_ralph.with_input(pcvl.BasicState([0, 1, 0, 1]))
>>> sampler_ralph = Sampler(p_ralph, max_shots_per_call=1_000_000)
>>>
>>> p_knill = pcvl.RemoteProcessor("sim:altair")
>>> p_knill.add(0, pcvl.catalog["heralded cnot"].build_processor())
>>> p_knill.min_detected_photons_filter(4)
>>> p_knill.with_input(pcvl.BasicState([0, 1, 0, 1]))
>>> sampler_knill = Sampler(p_knill, max_shots_per_call=1_000_000)
>>>
>>> jg = pcvl.JobGroup("compare_knill_and_ralph_cnot")
>>> jg.add(sampler_ralph.sample_count, max_samples=10_000)
>>> jg.add(sampler_knill.sample_count, max_samples=10_000)

This first script only prepares the experiment; nothing is executed remotely. Before going on, users should know the details of their Cloud plan, as it determines the number of jobs they can run concurrently.

The job group supports executing jobs sequentially or in parallel and includes the ability to rerun failed jobs, if needed.

The second script may be used exclusively to run jobs. It includes a built-in tqdm progress bar to provide real-time updates on job execution. To run jobs sequentially with a given delay:

>>> import perceval as pcvl
>>>
>>> jg = pcvl.JobGroup("compare_knill_and_ralph_cnot")  # Loads prepared experiment data
>>> jg.run_sequential(0)  # Will send the 2nd job to the Cloud as soon as the first one is complete

Other available methods are jg.run_parallel(), jg.rerun_failed_parallel(), and jg.rerun_failed_sequential(delay).

A third script can then be prepared to analyze the results:

>>> import perceval as pcvl
>>>
>>> jg = pcvl.JobGroup("compare_knill_and_ralph_cnot")
>>> results = jg.get_results()
>>> ralph_res = results[0]
>>> knill_res = results[1]
>>> perf_ratio = (ralph_res['physical_perf'] * ralph_res['logical_perf']) / (knill_res['physical_perf'] * knill_res['logical_perf'])
>>> print(f"Ralph CNOT is {perf_ratio} times better than Knill CNOT, but needs a measurement to work")
Ralph CNOT is 490.01059 times better than Knill CNOT, but needs a measurement to work

Note

If the connection token you use in a JobGroup expires or gets revoked, that JobGroup will no longer be usable. Stay tuned for further improvements to this feature that will fix this issue.

Class reference

class perceval.runtime.job_group.JobGroup(name)

JobGroup handles a collection of remote jobs. A job group is named and persistent (job metadata will be written on disk). Job results will never be stored and will be retrieved every time from the Cloud.

The JobGroup class can perform various tasks such as:
  • Saving information for a collection of jobs, whether they have been sent to the cloud or not.

  • Running jobs within the group either in parallel or sequentially.

  • Rerunning failed jobs within the group.

  • Retrieving all results at once.

Parameters:

name (str) – A name uniquely identifying the group (also, the filename used to save data on disk). If the same name is used more than once, jobs can be appended to the same group.

add(job_to_add, **kwargs)

Adds a new RemoteJob to an existing group, saving its data in chronological order (each entry is a dictionary of the necessary information: status, id, body, metadata).

Parameters:

job_to_add (RemoteJob) – a remote job to add to the existing job group

count_never_sent_jobs()

Returns the number of RemoteJobs in the group that were never sent to the cloud.

Return type:

int

property created_date: datetime

Date time of JobGroup creation

static delete_all_job_groups()

Delete all existing groups on disk

static delete_job_group(name)

Delete a single group by name

Parameters:

name (str) – name of the JobGroup to delete

static delete_job_groups_date(del_before_date)

Delete all saved groups created before a date.

Parameters:

del_before_date (datetime) – datetime of the oldest job group to keep. Anterior groups will be deleted.
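For instance, a cleanup script might keep only recent groups. A minimal sketch using the standard library to build the cutoff (the 30-day window is an arbitrary example, and the delete call is left commented out to avoid accidental data loss):

```python
from datetime import datetime, timedelta

# Job groups created before this cutoff would be deleted.
cutoff = datetime.now() - timedelta(days=30)

# pcvl.JobGroup.delete_job_groups_date(cutoff)  # uncomment to actually delete
```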

get_results()

Retrieves results for all jobs in the group, aggregating them by calling the get_results() method of each job that has completed successfully.

Return type:

list[dict]

list_active_jobs()

Returns a list of all RemoteJobs in the group that are currently active on the cloud - those with a Running or Waiting status.

Return type:

list[RemoteJob]

static list_existing()

Returns a list of filenames of all JobGroups saved to disk

Return type:

list[str]

property list_remote_jobs: list[perceval.runtime.remote_job.RemoteJob]

Returns a chronologically ordered list of RemoteJobs in the group. Jobs never sent to the cloud will be represented by None.
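Since jobs never sent to the cloud appear as None, code consuming this list should filter them out. A minimal sketch of that pattern, using string stand-ins instead of real RemoteJob objects:

```python
# Stand-ins: a real list would hold RemoteJob instances, with None for unsent jobs.
jobs = ["job_0", None, "job_2"]

# Keep only jobs that were actually sent to the cloud.
sent_jobs = [job for job in jobs if job is not None]
never_sent_count = jobs.count(None)
```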

list_successful_jobs()

Returns a list of all RemoteJobs in the group that have run successfully on the cloud.

Return type:

list[RemoteJob]

list_unfinished_jobs()

Returns a list of all RemoteJobs in the group that ran unsuccessfully on the cloud (errored or canceled).

Return type:

list[RemoteJob]

property modified_date: datetime

Date time of the latest JobGroup change

property name: str

Name of the job group

progress()

Iterates over all jobs in the group to build a dictionary of their current statuses. Jobs are categorized as follows (depending on their RunningStatus on the Cloud):

  • Finished
    • successful {‘SUCCESS’}

    • unsuccessful {‘CANCELED’, ‘ERROR’, ‘UNKNOWN’, ‘SUSPENDED’}

  • Unfinished
    • sent {‘WAITING’, ‘RUNNING’, ‘CANCEL_REQUESTED’}

    • not sent {None}

Return type:

dict

Returns:

dictionary of the current status of jobs
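Assuming a nested dictionary shaped like the categories above (the exact key names here are illustrative, not the library's guaranteed schema), a caller might summarize it like this:

```python
# Illustrative progress dict; real key names and value types may differ.
progress = {
    "Finished": {"successful": 3, "unsuccessful": 1},
    "Unfinished": {"sent": 2, "not sent": 4},
}

# Tally jobs per top-level category.
finished = sum(progress["Finished"].values())
unfinished = sum(progress["Unfinished"].values())
total = finished + unfinished
```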

rerun_failed_parallel()

Restart all failed jobs in the group on the Cloud, running them in parallel.

If the user lacks authorization to send multiple jobs at once or exceeds the maximum allowed limit, an exception is raised, terminating the launch process. Any remaining jobs in the group will not be sent.

rerun_failed_sequential(delay)

Reruns failed jobs in the group on the Cloud sequentially, with a user-specified delay between the completion of one job and the start of the next.

Parameters:

delay (int) – number of seconds to wait between re-launching jobs on cloud

run_parallel()

Launches all unsent jobs in the group on the Cloud, running them in parallel.

If the user lacks authorization to send multiple jobs to the cloud or exceeds the maximum allowed limit, an exception is raised, terminating the launch process. Any remaining jobs in the group will not be sent.

run_sequential(delay)

Launches the unsent jobs in the group on the Cloud sequentially, with a user-specified delay between the completion of one job and the start of the next.

Parameters:

delay (int) – number of seconds to wait between launching jobs on cloud

track_progress()

Displays the status and progress of each job in the group using tqdm progress bars. Jobs are categorized into “Successful,” “Running/Active on Cloud,” and “Inactive/Unsuccessful.” The method iterates over the list of jobs, continuously refreshing their statuses and updating the progress bars to provide real-time feedback until no “Running/Waiting” jobs remain on the Cloud.