Merge pull request #158 from openstates/fix-os-us-updates
Clean up phone
alexobaseki authored Dec 27, 2024
2 parents 2979278 + 9a67e1e commit 7504d87
Showing 3 changed files with 21 additions and 1 deletion.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
# Changelog

## 6.20.13 - Dec 27, 2024

* Sanitize phone number for US people scrape.

## 6.20.12 - Nov 22, 2024

* Use transformers to trim incoming strings at import that are too long for DB columns:
16 changes: 16 additions & 0 deletions openstates/cli/convert_us.py
@@ -1,3 +1,4 @@
import re
import typing
import uuid
from collections import defaultdict
@@ -28,7 +29,22 @@ def make_org_id(id_: str) -> str:
    return "ocd-organization/" + str(uuid.uuid5(US_UUID_NAMESPACE, id_))


def sanitize_phone(phone: str) -> str:
    """Remove trailing text, toll-free phone numbers, or N/A."""
    if phone.lower() in ["n/a", "same as above"]:
        return ""
    # Some phone numbers appear as (123) 456-7890, possibly followed by other text
    pattern = r"\((\d{3})\)\s*(\d{3})-(\d{4})"
    match = re.search(pattern, phone)
    if match:
        # Format the first matched number as XXX-XXX-XXXX
        formatted_number = f"{match.group(1)}-{match.group(2)}-{match.group(3)}"
        return formatted_number
    return phone


def _fix_bad_dashes(phone: str) -> str:
    phone = sanitize_phone(phone)
    return phone.replace("–", "-")


2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "openstates"
-version = "6.20.12"
+version = "6.20.13"
description = "core infrastructure for the openstates project"
authors = ["James Turk <[email protected]>"]
license = "MIT"