fix(agents-api): Fix search stuff (#695)
Signed-off-by: Diwank Singh Tomer <[email protected]>

<!-- ELLIPSIS_HIDDEN -->

----

> [!IMPORTANT]
> Add `clean` option to `extract_keywords` and filter empty queries in `nlp.py`; update imports and defaults in `utils.py`.
>
>   - **Behavior**:
>     - Add `clean` parameter (default `True`) to `extract_keywords()` in `nlp.py` to optionally strip characters that are not word characters, whitespace, or hyphens.
>     - Filter out empty queries in `paragraph_to_custom_queries()` in `nlp.py`.
>   - **Imports**:
>     - Add `debug` to imports in `utils.py`.
>   - **Function Defaults**:
>     - Change default `debug` from `None` to the imported `debug` value in `cozo_query()` in `utils.py`.
>     - Change default `only_on_error` to `True` in `cozo_query()` in `utils.py`.
> 
> <sup>This description was created by </sup>[<img alt="Ellipsis"
src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=julep-ai%2Fjulep&utm_source=github&utm_medium=referral)<sup>
for ca38891. It will automatically
update as commits are pushed.</sup>

<!-- ELLIPSIS_HIDDEN -->

creatorrr authored Oct 18, 2024
1 parent e2f1b49 commit 8720d46
Showing 2 changed files with 9 additions and 4 deletions.
7 changes: 6 additions & 1 deletion agents-api/agents_api/common/nlp.py
@@ -8,13 +8,14 @@
 nlp = spacy.load("en_core_web_sm")


-def extract_keywords(text: str, top_n: int = 10) -> list[str]:
+def extract_keywords(text: str, top_n: int = 10, clean: bool = True) -> list[str]:
     """
     Extracts significant keywords and phrases from the text.

     Args:
         text (str): The input text to process.
         top_n (int): Number of top keywords to extract based on frequency.
+        clean (bool): Strip non-alphanumeric characters from keywords.

     Returns:
         List[str]: A list of extracted keywords/phrases.
@@ -46,6 +47,9 @@ def extract_keywords(text: str, top_n: int = 10) -> list[str]:
     # Get top_n keywords
     keywords = [item for item, count in freq.most_common(top_n)]

+    if clean:
+        keywords = [re.sub(r"[^\w\s\-_]+", "", kw) for kw in keywords]
+
     return keywords
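
To illustrate what the new `clean` step does to individual keywords, here is a minimal sketch that applies the same regular expression outside of spaCy. The helper name `clean_keywords` and the sample strings are hypothetical and are not part of the commit; only the pattern is taken from the diff.

```python
import re

# Same pattern as the line added to extract_keywords(): remove every run of
# characters that is not a word character, whitespace, hyphen, or underscore.
CLEAN_PATTERN = r"[^\w\s\-_]+"


def clean_keywords(keywords: list[str]) -> list[str]:
    """Hypothetical helper: apply the commit's substitution to each keyword."""
    return [re.sub(CLEAN_PATTERN, "", kw) for kw in keywords]


sample = ["hello, world!", "state-of-the-art", "snake_case", "C++", "???"]
cleaned = clean_keywords(sample)
print(cleaned)
```

Two things stand out in this sketch. `\w` already matches the underscore, so the explicit `_` in the character class is redundant but harmless. A keyword made up entirely of stripped characters (such as `"???"`) becomes an empty string and stays in the list, so callers may want to drop empty entries afterwards.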


@@ -212,5 +216,6 @@ def paragraph_to_custom_queries(paragraph: str) -> list[str]:
     """

     queries = [text_to_custom_query(sentence.text) for sentence in nlp(paragraph).sents]
+    queries = [q for q in queries if q]

     return queries
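
The added filter can be sketched without spaCy as follows. `fake_text_to_custom_query` is a hypothetical stand-in for `text_to_custom_query` (whose body is not shown in this diff), and the `" OR "` query syntax is an assumption made only for illustration. The point is that a sentence yielding no keywords produces an empty string, which the new list comprehension drops.

```python
def fake_text_to_custom_query(sentence: str) -> str:
    """Hypothetical stand-in: build a query from words longer than 3 letters.

    Returns an empty string when no word qualifies.
    """
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    keywords = [w for w in words if len(w) > 3]
    return " OR ".join(keywords)


sentences = ["Search engines index documents.", "Oh.", "It is so."]

# Before the commit: empty strings are passed on to the search layer.
queries = [fake_text_to_custom_query(s) for s in sentences]

# After the commit: falsy (empty) queries are filtered out.
filtered = [q for q in queries if q]
print(filtered)
```
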
6 changes: 3 additions & 3 deletions agents-api/agents_api/models/utils.py
@@ -8,7 +8,7 @@
 from pydantic import BaseModel

 from ..common.utils.cozo import uuid_int_list_to_uuid4
-from ..env import do_verify_developer, do_verify_developer_owns_resource
+from ..env import debug, do_verify_developer, do_verify_developer_owns_resource

 P = ParamSpec("P")
 T = TypeVar("T")
@@ -185,8 +185,8 @@ def make_cozo_json_query(fields):

 def cozo_query(
     func: Callable[P, tuple[str | list[str | None], dict]] | None = None,
-    debug: bool | None = None,
-    only_on_error: bool = False,
+    debug: bool | None = debug,
+    only_on_error: bool = True,
 ):
     def cozo_query_dec(func: Callable[P, tuple[str | list[Any], dict]]):
         """
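
The signature change relies on a Python detail: in `debug: bool | None = debug`, the default expression is evaluated once, when the `def` statement runs, against the module-level name, while inside the function body `debug` refers to the parameter. The sketch below shows that behavior with a hypothetical decorator, `logged_query`, that mirrors the shape of the `cozo_query` signature. How the real `cozo_query` uses `debug` and `only_on_error` is not visible in this diff, so the logging logic here is an assumption.

```python
from functools import wraps
from typing import Any, Callable, Optional

# Stand-in for the value imported from ..env in the diff (assumed to be a bool).
debug = True


def logged_query(
    func: Optional[Callable[..., Any]] = None,
    debug: Optional[bool] = debug,  # module-level value, captured at definition time
    only_on_error: bool = True,
):
    """Hypothetical decorator with the same parameter shape as cozo_query."""
    log: list[str] = []

    def decorator(f: Callable[..., Any]):
        @wraps(f)
        def wrapper(*args, **kwargs):
            try:
                result = f(*args, **kwargs)
            except Exception as exc:
                if debug:  # the parameter, not the module-level name
                    log.append(f"error in {f.__name__}: {exc!r}")
                raise
            # Assumed meaning: successful calls are logged only when
            # only_on_error is switched off.
            if debug and not only_on_error:
                log.append(f"ok {f.__name__}: {result!r}")
            return result

        wrapper.log = log
        return wrapper

    # Support both @logged_query and @logged_query(...).
    if func is not None:
        return decorator(func)
    return decorator


# Rebinding the module-level name later does not change the stored default.
debug = False


@logged_query
def ok_query():
    return "rows"


@logged_query(only_on_error=False)
def verbose_query():
    return "rows"


@logged_query
def bad_query():
    raise ValueError("boom")


ok_query()
verbose_query()
try:
    bad_query()
except ValueError:
    pass
```

Under these assumptions, `ok_query` logs nothing, `verbose_query` logs its result, and `bad_query` logs the error, even though the module-level `debug` was rebound to `False` after the decorator was defined.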