JobGroup
The JobGroup class is designed to help manage jobs client-side by storing them in named groups.
Large experiments can easily be cut into chunks and even run over multiple days and multiple Python sessions; the job group makes sure all data can be retrieved from a single location.
Usage example
Here’s an example creating a job group with two jobs, running the same acquisition on a post-processed vs. a heralded CNOT gate:
>>> import perceval as pcvl
>>> from perceval.algorithm import Sampler
>>>
>>> p_ralph = pcvl.RemoteProcessor("sim:altair")
>>> p_ralph.add(0, pcvl.catalog["postprocessed cnot"].build_processor())
>>> p_ralph.min_detected_photons_filter(2)
>>> p_ralph.with_input(pcvl.BasicState([0, 1, 0, 1]))
>>> sampler_ralph = Sampler(p_ralph, max_shots_per_call=1_000_000)
>>>
>>> p_knill = pcvl.RemoteProcessor("sim:altair")
>>> p_knill.add(0, pcvl.catalog["heralded cnot"].build_processor())
>>> p_knill.min_detected_photons_filter(4)
>>> p_knill.with_input(pcvl.BasicState([0, 1, 0, 1]))
>>> sampler_knill = Sampler(p_knill, max_shots_per_call=1_000_000)
>>>
>>> jg = pcvl.JobGroup("compare_knill_and_ralph_cnot")
>>> jg.add(sampler_ralph.sample_count, max_samples=10_000)
>>> jg.add(sampler_knill.sample_count, max_samples=10_000)
This first script only prepared the experiment; nothing was executed remotely. Before going on, it is important for users to know the details of their Cloud plan, as it determines the number of jobs they can run concurrently.
The job group supports executing jobs sequentially or in parallel and includes the ability to rerun failed jobs, if needed.
The second script may be used exclusively to run jobs. It includes a built-in tqdm progress bar to provide real-time updates on job execution. To run jobs sequentially with a given delay:
>>> import perceval as pcvl
>>>
>>> jg = pcvl.JobGroup("compare_knill_and_ralph_cnot") # Loads prepared experiment data
>>> jg.run_sequential(0) # Sends the second job to the Cloud as soon as the first one completes
Other methods are available: jg.run_parallel(), jg.rerun_failed_parallel(), and jg.rerun_failed_sequential(delay).
A third script can then be prepared to analyze the results:
>>> import perceval as pcvl
>>>
>>> jg = pcvl.JobGroup("compare_knill_and_ralph_cnot")
>>> results = jg.get_results()
>>> ralph_res = results[0]
>>> knill_res = results[1]
>>> perf_ratio = (ralph_res['physical_perf'] * ralph_res['logical_perf']) / (knill_res['physical_perf'] * knill_res['logical_perf'])
>>> print(f"Ralph CNOT is {perf_ratio} times better than Knill CNOT, but needs a measurement to work")
Ralph CNOT is 490.01059 times better than Knill CNOT, but needs a measurement to work
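The comparison above multiplies each gate's physical and logical performance to get an overall figure of merit. Here is a minimal, self-contained sketch of that computation; the performance values below are made up for illustration, not real benchmark results (real ones come from the result dictionaries returned by get_results()):

```python
# A gate's overall performance is the product of its physical and
# logical performance (both probabilities between 0 and 1).
def overall_perf(physical_perf, logical_perf):
    return physical_perf * logical_perf

# Hypothetical values, for illustration only
ralph = {"physical_perf": 0.5, "logical_perf": 0.2}
knill = {"physical_perf": 0.01, "logical_perf": 0.05}

ratio = overall_perf(**ralph) / overall_perf(**knill)
print(f"ratio = {ratio:.1f}")
```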
Note
If the connection token you use in a JobGroup expires or is revoked, that JobGroup will no longer be usable. Stay tuned for further improvements on this feature, fixing this issue.
Class reference
- class perceval.runtime.job_group.JobGroup(name)
JobGroup handles a collection of remote jobs. A job group is named and persistent (job metadata is written to disk). Job results are never stored locally; they are retrieved from the Cloud each time.
- The JobGroup class can perform various tasks such as:
Saving information for a collection of jobs, whether they have been sent to the cloud or not.
Running jobs within the group either in parallel or sequentially.
Rerunning failed jobs within the group.
Retrieving all results at once.
- Parameters:
name (str) – A name uniquely identifying the group (also the filename used to save data on disk). If the same name is used more than once, jobs can be appended to the same group.
- add(job_to_add, **kwargs)
Adds the information of a new RemoteJob to an existing group, saving the data in chronological order (each entry is a dictionary of the necessary information: status, id, body, metadata).
- Parameters:
job_to_add (Job) – the remote job to add to the group
- count_never_sent_jobs()
Returns the number of RemoteJobs in the group that were never sent to the cloud.
- Return type:
int
- property created_date: datetime
Date time of JobGroup creation
- static delete_all_job_groups()
Delete all existing groups on disk
- static delete_job_group(name)
Delete a single group by name
- Parameters:
name (str) – name of the JobGroup to delete
- static delete_job_groups_date(del_before_date)
Delete all saved groups created before a date.
- Parameters:
del_before_date (datetime) – datetime of the oldest job group to keep; anterior groups will be deleted.
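For instance, to delete every group created more than 30 days ago, the cutoff can be built with the standard library. The deleting call itself is commented out in this sketch, since it would remove saved groups from disk:

```python
from datetime import datetime, timedelta

# Cutoff: keep groups created within the last 30 days
cutoff = datetime.now() - timedelta(days=30)

# pcvl.JobGroup.delete_job_groups_date(cutoff)  # would delete older groups
```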
- get_results()
Retrieves results for all jobs in the group, aggregating them by calling the get_results() method of each job that completed successfully.
- Return type:
list[dict]
- list_active_jobs()
Returns a list of all RemoteJobs in the group that are currently active on the cloud - those with a Running or Waiting status.
- Return type:
list[RemoteJob]
- static list_existing()
Returns a list of filenames of all JobGroups saved to disk
- Return type:
list[str]
- property list_remote_jobs: list[perceval.runtime.remote_job.RemoteJob]
Returns a chronologically ordered list of RemoteJobs in the group. Jobs never sent to the cloud will be represented by None.
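Since never-sent jobs are represented by None, a caller should filter them out before accessing job attributes. A sketch with a stand-in list (a real one would come from jg.list_remote_jobs):

```python
# Stand-in for jg.list_remote_jobs: two sent jobs and one never-sent entry
remote_jobs = ["job_a", None, "job_b"]

# Skip never-sent jobs (None) before touching any job attribute
sent_jobs = [job for job in remote_jobs if job is not None]
print(sent_jobs)
```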
- list_successful_jobs()
Returns a list of all RemoteJobs in the group that have run successfully on the cloud.
- Return type:
list[RemoteJob]
- list_unfinished_jobs()
Returns a list of all RemoteJobs in the group that ran unsuccessfully on the cloud (errored or canceled).
- Return type:
list[RemoteJob]
- property modified_date: datetime
Date time of the latest JobGroup change
- property name: str
Name of the job group
- progress()
Iterates over all jobs in the group to build a dictionary of their current statuses. Jobs in the group are categorized as follows (depending on their RunningStatus on the Cloud):
- Finished
successful {‘SUCCESS’}
unsuccessful {‘CANCELED’, ‘ERROR’, ‘UNKNOWN’, ‘SUSPENDED’}
- Unfinished
sent {‘WAITING’, ‘RUNNING’, ‘CANCEL_REQUESTED’}
not sent {None}
- Return type:
dict
- Returns:
dictionary of the current status of jobs
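The categorization above can be sketched as a plain function mapping a status string (None for never-sent jobs) to its category; the real method reads each job's RunningStatus from the Cloud:

```python
def categorize(status):
    """Map a job status to the (top-level, sub) category used by progress()."""
    if status is None:
        return ("Unfinished", "not sent")
    if status == "SUCCESS":
        return ("Finished", "successful")
    if status in {"CANCELED", "ERROR", "UNKNOWN", "SUSPENDED"}:
        return ("Finished", "unsuccessful")
    if status in {"WAITING", "RUNNING", "CANCEL_REQUESTED"}:
        return ("Unfinished", "sent")
    raise ValueError(f"Unexpected status: {status}")

print(categorize("SUCCESS"))
print(categorize(None))
```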
- rerun_failed_parallel()
Restart all failed jobs in the group on the Cloud, running them in parallel.
If the user lacks authorization to send multiple jobs at once or exceeds the maximum allowed limit, an exception is raised, terminating the launch process. Any remaining jobs in the group will not be sent.
- rerun_failed_sequential(delay)
Reruns failed jobs in the group on the Cloud sequentially, with a user-specified delay between the completion of one job and the start of the next.
- Parameters:
delay (int) – number of seconds to wait between re-launching jobs on the cloud
- run_parallel()
Launches all the unsent jobs in the group on the Cloud, running them in parallel.
If the user lacks authorization to send multiple jobs to the cloud or exceeds the maximum allowed limit, an exception is raised, terminating the launch process. Any remaining jobs in the group will not be sent.
- run_sequential(delay)
Launches the unsent jobs in the group on the Cloud sequentially, with a user-specified delay between the completion of one job and the start of the next.
- Parameters:
delay (int) – number of seconds to wait between launching jobs on the cloud
- track_progress()
Displays the status and progress of each job in the group using tqdm progress bars. Jobs are categorized into “Successful,” “Running/Active on Cloud,” and “Inactive/Unsuccessful.” The method iterates over the list of jobs, continuously refreshing their statuses and updating the progress bars to provide real-time feedback until no “Running/Waiting” jobs remain on the Cloud.