This repository has been archived by the owner on Jul 25, 2024. It is now read-only.

Planned Upgrade: Optional flag to output SQL Compliant Column Names #29

Open
nickaustinlee opened this issue Feb 25, 2022 · 0 comments

Prior versions of the Labelbox Connector for Databricks preserved column names exactly as they appear in the JSON output of Labelbox. For instance, "Labeled Data" was exposed as a column named "Labeled Data".

Downstream workflows sometimes need to access these columns in ways where a space in the name is impractical. Additionally, spaces must be removed before the table can be saved as a Delta Lake table. For now, developers can work around both issues by renaming the columns themselves.

To make this easier for downstream developers without breaking existing code that may reference column names containing spaces, we are exploring the addition of an optional flag, "SQL_friendly_columns", which will output dataframes with the following characteristics:

  • All spaces in column names will be replaced with underscores
  • The dots we currently use to express nesting will be replaced with underscores
  • Character case will be preserved to match the Labelbox JSON

Examples:

"Labeled Data" --> "Labeled_Data"
"Label.objects.title" --> "Label_objects_title"
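
The renaming rules above can be sketched in plain Python. This is only an illustration of the proposed behavior, not the connector's implementation: the helper names `sql_friendly_column_name` and `sql_friendly_column_names` are hypothetical and are not part of the connector's API.

```python
from typing import Iterable, List


def sql_friendly_column_name(name: str) -> str:
    """Hypothetical helper: replace spaces and dots with underscores.

    Character case is left untouched, matching the proposal.
    """
    return name.replace(" ", "_").replace(".", "_")


def sql_friendly_column_names(names: Iterable[str]) -> List[str]:
    """Apply the renaming rule to every column name, keeping the order."""
    return [sql_friendly_column_name(n) for n in names]


if __name__ == "__main__":
    columns = ["Labeled Data", "Label.objects.title"]
    print(sql_friendly_column_names(columns))
    # ['Labeled_Data', 'Label_objects_title']
```

With a Spark dataframe `df`, the resulting list could be applied with `df.toDF(*sql_friendly_column_names(df.columns))`. Note that under these rules two distinct source names, such as "a b" and "a.b", would map to the same output name, so a real implementation may need to detect collisions.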
