Clean up phone #158

Merged
merged 3 commits into from
Dec 27, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
# Changelog

## 6.20.13 - Dec 27, 2024

* Sanitize phone number for US people scrape.

## 6.20.12 - Nov 22, 2024

* Use transformers to trim incoming strings at import that are too long for DB columns:
16 changes: 16 additions & 0 deletions openstates/cli/convert_us.py
@@ -1,3 +1,4 @@
import re
import typing
import uuid
from collections import defaultdict
@@ -28,7 +29,22 @@ def make_org_id(id_: str) -> str:
    return "ocd-organization/" + str(uuid.uuid5(US_UUID_NAMESPACE, id_))


def sanitize_phone(phone: str) -> str:
    """Remove trailing text, a toll-free phone number, or N/A."""
    if phone.lower() in ["n/a", "same as above"]:
        return ""
    # Some phone numbers appear as (123) 456-7890
    pattern = r"\((\d{3})\)\s*(\d{3})-(\d{4})"
    match = re.search(pattern, phone)
    if match:
        # Format the first matched number as XXX-XXX-XXXX
        formatted_number = f"{match.group(1)}-{match.group(2)}-{match.group(3)}"
        return formatted_number
    return phone


def _fix_bad_dashes(phone: str) -> str:
    phone = sanitize_phone(phone)
    return phone.replace("–", "-")
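
To illustrate how the two helpers in this hunk interact, here is a minimal sketch that copies them from the diff and runs them on a few inputs. The sample strings are hypothetical and are not taken from the scrape data; only `sanitize_phone` and `_fix_bad_dashes` come from the change itself.

```python
import re


def sanitize_phone(phone: str) -> str:
    """Copy of the helper from this diff, reproduced so the sketch is self-contained."""
    if phone.lower() in ["n/a", "same as above"]:
        return ""
    pattern = r"\((\d{3})\)\s*(\d{3})-(\d{4})"
    match = re.search(pattern, phone)
    if match:
        # Only the first "(XXX) XXX-XXXX" occurrence is kept.
        return f"{match.group(1)}-{match.group(2)}-{match.group(3)}"
    return phone


def _fix_bad_dashes(phone: str) -> str:
    """Copy of the caller from this diff: sanitize first, then replace en dashes."""
    phone = sanitize_phone(phone)
    return phone.replace("–", "-")


# Hypothetical inputs (assumptions for illustration only).
samples = [
    "N/A",
    "(202) 555-0100 or toll-free (800) 555-0199",
    "202–555–0100",
    "(202) 555–0100 x5",
    "(202) 555 0100",
]

results = {raw: _fix_bad_dashes(raw) for raw in samples}
for raw, cleaned in results.items():
    print(f"{raw!r} -> {cleaned!r}")
```

With these inputs, the placeholder becomes an empty string and the second sample is reduced to its first number, `202-555-0100`. Because the pattern requires an ASCII hyphen before the last four digits, the en-dash sample `(202) 555–0100 x5` is not matched by `sanitize_phone` and only has its dash replaced, so its trailing text survives; the space-separated sample `(202) 555 0100` passes through unchanged.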


2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "openstates"
-version = "6.20.12"
+version = "6.20.13"
description = "core infrastructure for the openstates project"
authors = ["James Turk <[email protected]>"]
license = "MIT"