Identify TENDL 2017 files for GROUPR processing and data extraction #68
This looks like it needs a merge/rebase to resolve a conflict.
This is another nice, well-contained addition.
I'm not sure that file-handling needs to be its own file, but that's fine. It's all pretty specific to TENDL files, so it might make sense to put those methods in that file.
    isomer_id = ''

    upper_case_letters = [char for char in stem if char.isupper()]
    lower_case_letters = [char for char in stem if char.islower()]
    numbers = [str(char) for char in stem if char.isdigit()]

    if len(lower_case_letters) == 0:
        lower_case_letters = ['']
    elif len(lower_case_letters) > 1:
        isomer_id = lower_case_letters[-1]
        lower_case_letters = lower_case_letters[:-1]

    element = upper_case_letters[0] + lower_case_letters[0]
    A = ''.join(numbers) + isomer_id

    return element, A
It seems like there must be a simpler way to do this. The files are always named n-Z[z]AAA[m], right? Where:
- Z is an uppercase letter
- [z] is an optional lowercase letter
- A is a digit
- [m] is an optional lowercase letter 'm'
So why not something like:
    Z_start = 2
    Z_end = Z_start + 1
    if not stem[Z_end].isdigit():
        Z_end += 1
    element = stem[Z_start:Z_end]
    A_start = Z_end
    A_end = A_start + 3
    if not stem[-1].isdigit():
        A_end += 1
    A = stem[A_start:A_end]
Regular expressions can make this even simpler in terms of number of lines, although maybe with a higher cognitive burden.
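A regular-expression version of this parsing might look like the minimal sketch below. It assumes the n-Z[z]AAA[m] naming convention described in this thread; the pattern, the function name `parse_stem`, and the sample stems are illustrative and are not taken from the PR.

```python
import re

# Assumed naming convention from this thread: n-Z[z]AAA[m]
#   Z    one uppercase letter
#   [z]  optional lowercase letter
#   AAA  three digits
#   [m]  optional isomer flag 'm'
STEM_PATTERN = re.compile(r"n-([A-Z][a-z]?)(\d{3}m?)")


def parse_stem(stem):
    """Return (element, A) for a stem such as 'n-Fe056' (hypothetical helper)."""
    match = STEM_PATTERN.fullmatch(stem)
    if match is None:
        raise ValueError(f"unexpected file stem: {stem!r}")
    return match.group(1), match.group(2)


print(parse_stem("n-Fe056"))   # ('Fe', '056')
print(parse_stem("n-Ag108m"))  # ('Ag', '108m')
```

Using `fullmatch` means a stem that does not follow the assumed convention raises an error instead of silently producing a wrong element or mass number.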
    directory = Path(directory)

    file_info = {}
    for file in directory.iterdir():
Did you consider using directory.glob("*.endf")?
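As a rough illustration of the glob approach, the sketch below visits only `*.endf` files and keeps those with a sibling `.pendf` file. The helper name `find_endf_pendf_pairs` and the file names in the demonstration are made up for this example; the PR's own code may pair files differently.

```python
import tempfile
from pathlib import Path


def find_endf_pendf_pairs(directory):
    """Map each ENDF file stem to its (ENDF, PENDF) path pair.

    Hypothetical helper: only files matching '*.endf' are visited, and a
    file is kept only if a sibling with the '.pendf' suffix exists.
    """
    directory = Path(directory)
    pairs = {}
    for endf_path in sorted(directory.glob("*.endf")):
        pendf_path = endf_path.with_suffix(".pendf")
        if pendf_path.is_file():
            pairs[endf_path.stem] = (endf_path, pendf_path)
    return pairs


# Demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    for name in ("n-Fe056.endf", "n-Fe056.pendf", "n-H001.endf", "notes.txt"):
        (tmp_dir / name).touch()
    found = find_endf_pendf_pairs(tmp_dir)
    found_stems = sorted(found)

print(found_stems)  # ['n-Fe056']
```

Compared with `iterdir()`, the glob pattern filters out unrelated files before any per-file checks run.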
    endf_path.rename(TAPE20)
    pendf_path.rename(TAPE21)
Won't this result in gradually overwriting all the files in the directory, since rename moves each file onto the same tape name? Don't we want to copy the files to these names for processing?
    endf_path.rename(TAPE20)
    pendf_path.rename(TAPE21)
How did our prior testing work without a PENDF file? Do we really need a PENDF file?
We need both an ENDF and a PENDF file to run GROUPR.
    gendf_data = tp.iterate_MTs(MTs, endftk_file_obj, mt_dict, pKZA)
    cumulative_data = concat([cumulative_data, gendf_data],
                             ignore_index=True)
Seeing how this is used across multiple nuclides, I'm not convinced that a pandas dataframe offers much more than a list of dictionaries. I think gendf_data could look like:
    [ {'Parent KZA' : pkza, 'Daughter KZA' : dkza, etc.... } ,
      {'Parent KZA' : pkza, 'Daughter KZA' : dkza, etc.... } , ... ]
and cumulative_data could be a local variable that gets appended with the new data. It will require a change to iterate_MTs() that I think will make that a little simpler too.
At the end, this list could be used to create a new dataframe and export to CSV:
    pd.DataFrame(cumulative_data).to_csv(filename)
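The accumulation pattern could be sketched roughly as follows. Here `collect_reactions` is a hypothetical stand-in for a reworked iterate_MTs(), the KZA numbers are placeholders, and the standard-library `csv` module is swapped in for the final export so the sketch runs without pandas.

```python
import csv
import io


def collect_reactions(parent_kza, daughter_kzas):
    """Hypothetical stand-in for iterate_MTs(): one dict per reaction."""
    return [
        {"Parent KZA": parent_kza, "Daughter KZA": dkza}
        for dkza in daughter_kzas
    ]


# Placeholder KZA values, for illustration only.
nuclides = {260560: [260570, 250560], 10010: [10020]}

cumulative_data = []  # plain local list, extended once per nuclide
for pkza, dkzas in nuclides.items():
    cumulative_data.extend(collect_reactions(pkza, dkzas))

# Write all rows at once. With pandas this step would be
# pd.DataFrame(cumulative_data).to_csv(filename)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(cumulative_data[0]))
writer.writeheader()
writer.writerows(cumulative_data)
csv_text = buffer.getvalue()
print(csv_text)
```

Extending a list inside the loop avoids rebuilding a dataframe with `concat` on every nuclide; the tabular structure is only created once, at export time.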
Two last things...
    for file in (p for p in dir.glob('*') if p.suffix in {'.tendl', '.endf'}):
        if file.is_file() and file.with_suffix('.pendf').is_file():
            element, A = get_isotope(file.stem)
            file_info[f'{element}{A}'] = {
                'Element' : element,
                'Mass Number' : A,
                'File Paths' : (file, file.with_suffix('.pendf'))
            }
I was thinking that the glob could allow you to be more selective about which files you are even looking for:
Suggested change:
-    for file in (p for p in dir.glob('*') if p.suffix in {'.tendl', '.endf'}):
-        if file.is_file() and file.with_suffix('.pendf').is_file():
-            element, A = get_isotope(file.stem)
-            file_info[f'{element}{A}'] = {
-                'Element' : element,
-                'Mass Number' : A,
-                'File Paths' : (file, file.with_suffix('.pendf'))
-            }
+    for suffix in ['tendl', 'endf']:
+        for file in dir.glob(f'*.{suffix}'):
+            if file.with_suffix('.pendf').is_file():
+                element, A = get_isotope(file.stem)
+                file_info[f'{element}{A}'] = {
+                    'Element' : element,
+                    'Mass Number' : A,
+                    'File Paths' : (file, file.with_suffix('.pendf'))
+                }
    shutil.copy(endf_path, TAPE20)
    shutil.copy(pendf_path, TAPE21)
I think this should work with pathlib alone (trying to reduce the number of dependencies, even if they're built in; I also think it's a more modern solution?):
Suggested change:
-    shutil.copy(endf_path, TAPE20)
-    shutil.copy(pendf_path, TAPE21)
+    Path(TAPE20).write_bytes(endf_path.read_bytes())
+    Path(TAPE21).write_bytes(pendf_path.read_bytes())
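To make the copy-versus-rename difference raised in this thread concrete, here is a small sketch run in a temporary directory. The file names and the `tape20` path are stand-ins invented for the example, not the PR's actual TAPE20 constant.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)
    endf_path = work_dir / "n-Fe056.endf"  # made-up file name
    endf_path.write_bytes(b"placeholder ENDF contents\n")

    # Stand-in for the TAPE20 name used in the PR.
    tape20 = work_dir / "tape20"

    # Copy: the whole file is read into memory and written to the new name.
    tape20.write_bytes(endf_path.read_bytes())

    source_still_exists = endf_path.is_file()
    contents_match = tape20.read_bytes() == endf_path.read_bytes()

    # Rename, by contrast, moves the file: the original name disappears.
    endf_path.rename(work_dir / "tape20_renamed")
    source_exists_after_rename = endf_path.exists()

print(source_still_exists, contents_match, source_exists_after_rename)
# True True False
```

One trade-off to keep in mind: `read_bytes()` loads the entire file into memory, and unlike `shutil.copy` this approach does not carry over the source file's permission bits.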
Fixing indentation.
Closes #51 and #52.
Adds methods to tendl_processing.py that search through a directory for pairs of ENDF and PENDF files corresponding to the same isotope.
Modifies the example case in process_fendl3.2.py so that it is generalized to work with all of the files found by the search methods in the new script.