Issue with finding the dependencies among the sub-tasks of a primary calculation using pymatgen API. #3884
Dear pymatgen Development Team,

I want to retrieve task dependencies and their corresponding directories using the pymatgen API. However, I am encountering an issue where the

Here is a snippet of the code I am using:

from mp_api.client import MPRester

material_id = "mp-1183063"

# Get all task IDs related to the given material ID
with MPRester() as mpr:
    tasks = mpr.materials.search(material_ids=[material_id], fields=["task_ids"])

# Print all task IDs
print("All related task IDs:")
task_ids = tasks[0].task_ids
for task_id in task_ids:
    print(task_id)

# Analyze the dependencies between different task IDs
print("\nDependencies between task IDs:")
path_to_mp_id = {}
for task_id in task_ids:
    with MPRester() as mpr:
        selected_task = mpr.materials.tasks.search(task_ids=[task_id], fields=["task_id", "calcs_reversed"])
    if selected_task:
        task = selected_task[0]
        calcs_reversed = getattr(task, "calcs_reversed", [])
        if calcs_reversed:
            print(f"\nTask ID {task_id} reverse dependencies:")
            for calc in calcs_reversed:
                reverse_task_dir = calc["dir_name"] if "dir_name" in calc else "dir_name field not found"
                print(f" - {reverse_task_dir}")
                path_to_mp_id[reverse_task_dir] = None  # Initialize dictionary
        else:
            print(f"\nTask ID {task_id} has no reverse dependencies.")
    else:
        print(f"\nNo data found for task_id: {task_id}")

# Get detailed information for all related tasks
all_task_details = []
with MPRester() as mpr:
    for task_id in task_ids:
        task_details = mpr.materials.tasks.search(task_ids=[task_id], fields=["*"])
        all_task_details.extend(task_details)

# Print detailed information for all tasks
print("\nDetailed information for all tasks:")
for task_detail in all_task_details:
    print(task_detail)  # Directly print MPDataDoc object

# Find the mp-id corresponding to each path
print("\nFinding mp-id for each path:")
for path in path_to_mp_id.keys():
    found = False
    for task_detail in all_task_details:
        if hasattr(task_detail, "dir_name") and task_detail.dir_name == path:
            path_to_mp_id[path] = task_detail.task_id
            found = True
            break
    if not found:
        print(f"Path not found: {path}")

# Print the mapping from path to mp-id
print("\nMapping from path to mp-id:")
for path, mp_id in path_to_mp_id.items():
    print(f"Path: {path} -> mp-id: {mp_id}")

The output I receive indicates that the
Could you please help me understand why the

Thank you for your time and assistance.

Best regards,
Replies: 1 comment 12 replies
Dear @hongyi-zhao, when you print the

for task_detail in all_task_details:
    print(task_detail.dir_name)

You will get

Alternatively, you can also print all the entries:

for task_detail in all_task_details:
    for item in task_detail:
        print(item)

which will get you

('builder_meta', None)
('nsites', None)
('elements', None)
('nelements', None)
('composition', None)
('composition_reduced', None)
('formula_pretty', None)
('formula_anonymous', None)
('chemsys', None)
('volume', None)
('density', None)
('density_atomic', None)
('symmetry', None)
('tags', None)
('dir_name', None)
('state', None)
('calcs_reversed', None)
('structure', None)
('task_type', None)
('task_id', None)
('orig_inputs', None)
('input', None)
('output', None)
('included_objects', None)
('vasp_objects', None)
('entry', None)
('task_label', None)
('author', None)
('icsd_id', None)
('transformations', None)
('additional_json', None)
('custodian', None)
('analysis', None)
('last_updated', None)
('fields_not_requested', ['builder_meta', 'nsites', 'elements', 'nelements', 'composition', 'composition_reduced', 'formula_pretty', 'formula_anonymous', 'chemsys', 'volume', 'density', 'density_atomic', 'symmetry', 'tags', 'dir_name', 'state', 'calcs_reversed', 'structure', 'task_type', 'task_id', 'orig_inputs', 'input', 'output', 'included_objects', 'vasp_objects', 'entry', 'task_label', 'author', 'icsd_id', 'transformations', 'additional_json', 'custodian', 'analysis', 'last_updated'])

Again, every entry is
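To make a listing like the one above easier to scan, one option is to keep only the fields that actually hold a value. The sketch below does not call the MP API: `FakeTaskDoc` and `populated_fields` are hypothetical names introduced here, and the class is only a stand-in that yields `(name, value)` pairs the same way the loop over `task_detail` does.

```python
from dataclasses import dataclass, fields
from typing import Any, Iterator, Optional, Tuple


@dataclass
class FakeTaskDoc:
    """Hypothetical stand-in for a returned task document (not the real MP class)."""

    task_id: Optional[str] = None
    dir_name: Optional[str] = None
    calcs_reversed: Optional[list] = None

    def __iter__(self) -> Iterator[Tuple[str, Any]]:
        # Yield (field_name, value) pairs, mirroring `for item in task_detail`.
        for f in fields(self):
            yield f.name, getattr(self, f.name)


def populated_fields(doc) -> dict:
    """Return only the fields whose value is not None."""
    return {name: value for name, value in doc if value is not None}


# Made-up documents: one with nothing filled in, one with only a task ID.
empty_doc = FakeTaskDoc()
partial_doc = FakeTaskDoc(task_id="task-0001")

print(populated_fields(empty_doc))    # {}
print(populated_fields(partial_doc))  # {'task_id': 'task-0001'}
```

An empty result for a document means none of its fields came back populated, which is quicker to spot than reading through every `(name, None)` pair.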
hmm.... you basically produced the same result.
At this point, I'm not quite sure if you can extract the sequential order of the jobs using the MP API.
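If each task record carries a `last_updated` timestamp, a rough proxy is to sort on it. This is only a sketch on made-up records: the task IDs and timestamps are hypothetical, `sort_by_last_updated` is a name introduced here, and no API call is made. Note that `last_updated` says when a record was last modified, which need not match the order in which the jobs ran.

```python
from datetime import datetime

# Made-up records: these task IDs and timestamps are hypothetical, not MP data.
records = [
    {"task_id": "task-c", "last_updated": "2021-03-05T10:00:00"},
    {"task_id": "task-a", "last_updated": "2019-07-01T08:30:00"},
    {"task_id": "task-b", "last_updated": "2020-11-20T16:45:00"},
]


def sort_by_last_updated(recs):
    """Return the records sorted from oldest to newest `last_updated`."""
    return sorted(recs, key=lambda r: datetime.fromisoformat(r["last_updated"]))


ordered_ids = [r["task_id"] for r in sort_by_last_updated(records)]
print(ordered_ids)  # ['task-a', 'task-b', 'task-c']
```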
The entries of the MP IDs are ordered by "last_updated":
with the output: