Adding blank lines in between top level objects for a given key #93

Open
bbiney1 opened this issue Jul 7, 2022 · 0 comments


An application we built uses a YAML file called query-definition.yml to keep track of the filter values users can search by in the app. Normally the application loads query-definition.yml and only edits filters["datasets"], where formatting is not usually a concern.

Starting point: How query-definition.yml looks:

enums:
  # content for this key isn't relevant

tabs:
  # content for this key isn't relevant

filter_list:
- ethnicity
- primary care department
- datasets
- .. 
- An array of strings. # content is not relevant

filters:
  ethnicity:
    label: Ethnicity

    enum:
      inline:
      - value: Hispanic or Latino
      - value: I want a blank line after this.
  primary-care-department:
    label: Seen in Primary Care

    enum:
      inline:
      - value: BROOMALL CARE NTWK
        label: I want a blank line after this.
  datasets:
     # a filter that we normally can modify without issue because formatting isn't as important

I want a blank line between every first-level child of filters (equivalently, at the end of each first-level child). Currently there is no blank line, which makes the filters section hard for people to read.

Loading query-definition.yml

The query definition file is loaded in main():

yaml = ruyaml.YAML()
yaml.preserve_quotes = True
qdefn_contents = yaml.load(
        query_definition_orig)
MissingValuesFinder.qdefn_contents = qdefn_contents # for editing

Here, query_definition_orig is the text content of the query-definition.yml file (read from a GitHub repo).

Modifying query-definition.yml

We have to modify this YAML programmatically in Python: we go through our Elasticsearch database and make sure every filter value that appears there also appears in query-definition.yml.
This means appending or removing elements from filters[name]["enum"]["inline"] for each name in filter_list. I use a class called MissingValuesFinder that has all the methods, properties, and private variables to do that easily.

The code for doing this is shown here:

    def process_changes_to_qdefn(self):
        for qdefn_field in self.qdefn_es_results:
            es_values = self.qdefn_es_results[qdefn_field] # list of strings found from elasticsearch 
            # that need to be added to query-definition.yml
            qdefn_inline_list = copy_list(
                self.qdefn_contents['filters'][qdefn_field]['enum']['inline']) # list of dictionaries read from query-definition.yml
            qdefn_values = [ff['value'] for ff in qdefn_inline_list]
            # add all the strings from es_values to qdefn_inline_list
            for es_val in es_values:
                if es_val not in qdefn_values:
                    if not self.field_has_labels(self.qdefn_contents['filters'][qdefn_field]['enum']['inline']):
                        new_entry = {'value': es_val}
                        qdefn_inline_list.append(new_entry)
                    else:
                        placeholder_label = "New value in database: populate or delete this placeholder label"
                        new_entry = {
                            'value': str(es_val).title(), 'label': placeholder_label, 
                            'note': "Accept or edit the auto-generated label; then delete this note"}
                        qdefn_inline_list.append(new_entry)
            self.qdefn_contents['filters'][qdefn_field]['enum']['inline'] = qdefn_inline_list

def copy_list(old_list):
    # If we assigned to qdefn_inline_list directly, the output would list
    # all old values, a newline, and then the Elasticsearch ones.
    # Copying each entry into a plain dict was the only way we found to keep
    # that newline from being sandwiched between the old and new values.
    new_list = []
    for each in old_list:
        # ruyaml.comments.CommentedMap(each) can be used instead of dict(each)
        new_list.append(dict(each))
    return new_list

Important private variables.

  • MissingValuesFinder.qdefn_contents is a dictionary identical to the qdefn_contents variable.
  • MissingValuesFinder.qdefn_es_results is a dictionary. Each key is a string (one of the filters from filter_list) and each value is a list of strings. Each string in the list was found in Elasticsearch and has to be added to the query definition file for the filter it is mapped to (i.e. in that filter's ["enum"]["inline"] array).
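
To make the shape of these two variables concrete, here is a minimal sketch using plain dictionaries. The sample values and the helper name find_missing_values are hypothetical and only for illustration; the real data comes from Elasticsearch and from the loaded YAML file.

```python
# Hypothetical sample data that mirrors the structure described above.
qdefn_contents = {
    "filters": {
        "ethnicity": {
            "label": "Ethnicity",
            "enum": {"inline": [{"value": "Hispanic or Latino"}]},
        },
    },
}
qdefn_es_results = {
    "ethnicity": ["Hispanic or Latino", "Not Hispanic or Latino"],
}


def find_missing_values(contents, es_results):
    """Return {filter_name: [values in es_results that the file lacks]}."""
    missing = {}
    for field, es_values in es_results.items():
        inline = contents["filters"][field]["enum"]["inline"]
        known = {entry["value"] for entry in inline}
        # Keep the Elasticsearch order so new entries are appended predictably.
        missing[field] = [value for value in es_values if value not in known]
    return missing
```

With this sample data, only "Not Hispanic or Latino" would be reported as missing for the ethnicity filter.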

Saving the new query-definition.yml file

After process_changes_to_qdefn() runs, main() uses the following to save the file:


from ruyaml import YAML
from io import StringIO

# dumpyaml.py, round trip dumper based on
# https://ruyaml.readthedocs.io/en/latest/example.html?highlight=inefficient%20%3D%20False#output-of-dump-as-a-string
class DumpYAML(YAML):
    def dump(self, data, stream=None, **kw):
        inefficient = False
        if stream is None:
            inefficient = True
            stream = StringIO()
        YAML.dump(self, data, stream, **kw)
        if inefficient:
            return stream.getvalue()

dumpyaml = DumpYAML()

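# in main()
    from dumpyaml import dumpyaml
    # Note: as written, these two settings are applied to `yaml`, while the
    # dump below goes through the separate `dumpyaml` instance.
    yaml.indent(mapping=2, sequence=4, offset=2)
    yaml.width = float("Infinity")
    # Convert multi-line strings to literal blocks and single-quote strings
    # that contain a colon.
    tree_map = ruyaml.compat.ordereddict()
    tree_map['\n'] = ruyaml.scalarstring.preserve_literal
    tree_map[':'] = ruyaml.scalarstring.SingleQuotedScalarString
    ruyaml.scalarstring.walk_tree(
        missing_values_finder.qdefn_contents, tree_map)
    query_definition_reencoded = dumpyaml.dump(
        missing_values_finder.qdefn_contents)

    # The with block closes the file, so no explicit f.close() is needed.
    with open(file_to_write, "w") as f:
        f.write(query_definition_reencoded)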

This is almost right, but there is no blank line between the keys within filters.

Goal: How query-definition.yml should look

filters:
  ethnicity:
    label: Ethnicity

    enum:
      inline:
      - value: Hispanic or Latino
      - value: I want a blank line after this.
  
  primary-care-department:
    label: Seen in Primary Care

    enum:
      inline:
      - value: BROOMALL CARE NTWK
        label: Broomall Care Ntwk
      - value: CAPE MAY CARE NTWK
        label: I want a blank line after this.

My issue is similar to

And if this is a use case for transform, I'm just not able to figure out how.
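
For reference, one possible direction is to post-process the dumped text. The sketch below is not ruyaml API: add_blank_lines_between_children is a hypothetical helper that takes the dumped YAML as a string and returns it with a blank line before every direct child of filters except the first. It assumes that direct children are indented by exactly two spaces and that no multi-line scalar continues at that same indentation.

```python
def add_blank_lines_between_children(yaml_text, parent_key="filters", indent=2):
    """Insert a blank line before each direct child of parent_key except the first."""
    out = []
    inside_parent = False
    seen_first_child = False
    for line in yaml_text.splitlines():
        stripped = line.lstrip(" ")
        depth = len(line) - len(stripped)
        if stripped and depth == 0 and not stripped.startswith("#"):
            # Any top-level line either opens the parent block or leaves it.
            inside_parent = line.rstrip() == parent_key + ":"
            seen_first_child = False
        elif (inside_parent and depth == indent and stripped
              and not stripped.startswith("#")):
            # A direct child key: separate it from the previous child,
            # unless a blank line is already there.
            if seen_first_child and out and out[-1].strip():
                out.append("")
            seen_first_child = True
        out.append(line)
    return "\n".join(out) + "\n"


sample = (
    "filters:\n"
    "  ethnicity:\n"
    "    label: Ethnicity\n"
    "  primary-care-department:\n"
    "    label: Seen in Primary Care\n"
    "tabs:\n"
    "  first: 1\n"
    "  second: 2\n"
)
result = add_blank_lines_between_children(sample)
```

The helper could be applied to the string returned by dumpyaml.dump(...) before writing the file. In ruamel.yaml, dump() also accepts a transform callable that receives the dumped text; assuming ruyaml behaves the same way, the helper might be passed as transform=add_blank_lines_between_children instead.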
