An application we made uses a YAML file called query-definition.yml to keep track of the possible filter values that users can search by in the app. Normally the application loads query-definition.yml and only edits filters["datasets"], so formatting is not usually a concern.
Starting point: How query-definition.yml looks:
enums:
  # content for this key isn't relevant
tabs:
  # content for this key isn't relevant
filter_list:
  - ethnicity
  - primary care department
  - datasets
  - ..
  - An array of strings. # content is not relevant
filters:
  ethnicity:
    label: Ethnicity
    enum:
      inline:
        - value: Hispanic or Latino
        - value: I want a blank line after this.
  primary-care-department:
    label: Seen in Primary Care
    enum:
      inline:
        - value: BROOMALL CARE NTWK
          label: I want a blank line after this.
  datasets:
    # a filter that we normally can modify without issue because formatting isn't as important
I want a blank line between every first-level child of filters (in other words, at the end of each first-level child of filters). Currently there is no blank line, which makes the filters section hard for people to read.
Loading query-definition.yml
The query definition file is being loaded (in main())
Where query_definition_orig is the text contents of the query-definition.yml file (read from a github repo)
Modifying query-definition.yml
We have to modify this YAML programmatically in Python by going through our Elasticsearch database and making sure that all the filter values that appear there also appear in query-definition.yml.
This means appending elements to or removing elements from filters[name]["enum"]["inline"] for each name in filter_list. I use a class called MissingValuesFinder that has all the methods, properties, and private variables needed to do that.
The code for doing this is shown here:
def process_changes_to_qdefn(self):
    for qdefn_field in self.qdefn_es_results:
        # list of strings found from elasticsearch
        # that need to be added to query-definition.yml
        es_values = self.qdefn_es_results[qdefn_field]
        # list of dictionaries read from query-definition.yml
        qdefn_inline_list = copy_list(
            self.qdefn_contents['filters'][qdefn_field]['enum']['inline'])
        qdefn_values = [ff['value'] for ff in qdefn_inline_list]
        # add all the strings from es_values to qdefn_inline_list
        for es_val in es_values:
            if es_val not in qdefn_values:
                if not self.field_has_labels(self.qdefn_contents['filters'][qdefn_field]['enum']['inline']):
                    new_entry = {'value': es_val}
                    qdefn_inline_list.append(new_entry)
                else:
                    placeholder_label = "New value in database: populate or delete this placeholder label"
                    new_entry = {
                        'value': str(es_val).title(), 'label': placeholder_label,
                        'note': "Accept or edit the auto-generated label; then delete this note"}
                    qdefn_inline_list.append(new_entry)
        self.qdefn_contents['filters'][qdefn_field]['enum']['inline'] = qdefn_inline_list
def copy_list(old_list):
    # if we tried setting qdefn_inline_list directly,
    # it would list all old values, a newline, and then the elasticsearch ones.
    # this helper function is the only way we could prevent a newline
    # from being sandwiched between the old and the new values
    new_list = []
    for each in old_list:
        # could also use a ruyaml.comments.CommentedMap() instead of dict()
        new_list.append(dict(each))
    return new_list
Important private variables.
MissingValuesFinder.qdefn_contents is a dictionary identical to the qdefn_contents variable.
MissingValuesFinder.qdefn_es_results is a dictionary. Each key is a string (one of the filters from filter_list) and each value is a list of strings. Each string in the list was found in Elasticsearch and has to be added to the query definition file for the filter it is mapped to (i.e. to that filter's ["enum"]["inline"] array).
Saving the new query-definition.yml file
After process_changes_to_qdefn() runs, main() uses the following to save the file:
from ruyaml import YAML
from io import StringIO

# dumpyaml.py, round trip dumper based on
# https://ruyaml.readthedocs.io/en/latest/example.html?highlight=inefficient%20%3D%20False#output-of-dump-as-a-string
class DumpYAML(YAML):
    def dump(self, data, stream=None, **kw):
        # when no stream is given, dump into a buffer and return the text
        inefficient = False
        if stream is None:
            inefficient = True
            stream = StringIO()
        YAML.dump(self, data, stream, **kw)
        if inefficient:
            return stream.getvalue()

dumpyaml = DumpYAML()

# in main()
import ruyaml
from dumpyaml import dumpyaml
yaml.indent(mapping=2, sequence=4, offset=2)
yaml.width = float("Infinity")
# multi-line strings become literal blocks; strings containing ':' get single-quoted
tree_map = ruyaml.compat.ordereddict()
tree_map['\n'] = ruyaml.scalarstring.preserve_literal
tree_map[':'] = ruyaml.scalarstring.SingleQuotedScalarString
ruyaml.scalarstring.walk_tree(
    missing_values_finder.qdefn_contents, tree_map)
query_definition_reencoded = dumpyaml.dump(
    missing_values_finder.qdefn_contents)
# the with block closes the file, so no explicit f.close() is needed
with open(file_to_write, "w") as f:
    f.write(query_definition_reencoded)
This is almost good, but there is no blank line between the keys within filters.
Goal: How query-definition.yml should look
filters:
  ethnicity:
    label: Ethnicity
    enum:
      inline:
        - value: Hispanic or Latino
        - value: I want a blank line after this.

  primary-care-department:
    label: Seen in Primary Care
    enum:
      inline:
        - value: BROOMALL CARE NTWK
          label: Broomall Care Ntwk
        - value: CAPE MAY CARE NTWK
          label: I want a blank line after this.

My issue is similar to
And if this is a use case for transform, I'm just not able to figure out how.
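For illustration, here is a minimal sketch of what such a transform might look like. It is plain string post-processing on the dumped text, not a ruyaml feature, and the function name add_blank_lines_between_children and its parameters are hypothetical. It assumes the parent key (filters) sits at the top level of the document and that its direct children are indented by exactly child_indent spaces, which matches the mapping=2 setting used above.

```python
def add_blank_lines_between_children(text, parent_key="filters", child_indent=2):
    """Insert one blank line before each direct child of a top-level key, except the first.

    Hypothetical helper: works on the dumped YAML text only and assumes the
    direct children of `parent_key` are indented by exactly `child_indent` spaces.
    """
    out = []
    inside = False       # True while we are within the parent_key block
    seen_child = False   # True once the first direct child has been emitted
    prefix = " " * child_indent
    for line in text.splitlines():
        stripped = line.strip()
        if line and not line[0].isspace() and not line.startswith("#"):
            # A new top-level key: we either enter or leave the block.
            inside = line.rstrip() == f"{parent_key}:"
            seen_child = False
        elif inside and stripped and not stripped.startswith("#"):
            is_direct_child = (
                line.startswith(prefix) and not line[child_indent].isspace()
            )
            if is_direct_child:
                # Skip the first child, and do not stack blank lines.
                if seen_child and out and out[-1].strip():
                    out.append("")
                seen_child = True
        out.append(line)
    return "\n".join(out) + "\n"


sample = (
    "filter_list:\n"
    "  - ethnicity\n"
    "filters:\n"
    "  ethnicity:\n"
    "    label: Ethnicity\n"
    "  primary-care-department:\n"
    "    label: Seen in Primary Care\n"
    "  datasets:\n"
    "    label: Datasets\n"
)
result = add_blank_lines_between_children(sample)
print(result)
```

Since DumpYAML.dump forwards **kw to YAML.dump, a function like this could presumably be passed as dumpyaml.dump(missing_values_finder.qdefn_contents, transform=add_blank_lines_between_children), assuming ruyaml keeps the transform keyword that ruamel.yaml documents. It could equally be applied to query_definition_reencoded just before writing the file. Note that this sketch adds blank lines between children but not after the last one, and it would misfire on a multi-line scalar whose content happens to start at the same indentation.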