docs: add langchain dappier retriever integration notebooks (langchain-ai#28931)

Add a retriever to interact with Dappier APIs with an example notebook.

The retriever can be invoked with:

```python
from langchain_dappier import DappierRetriever

retriever = DappierRetriever(
    data_model_id="dm_01jagy9nqaeer9hxx8z1sk1jx6",
    k=5
)

retriever.invoke("latest tech news")
```

This retrieves 5 documents related to the latest news in the tech sector.
The included notebook also covers the available filters in more detail:
selecting a data model, the number of documents to return, the site
domain reference, the minimum number of articles from the reference
domain, and the search algorithm. It also shows how to include the
retriever in a chain.
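
The filter options mentioned above could be combined roughly as follows. This is a minimal sketch, not the package's own code: the parameter names (`data_model_id`, `k`, `ref`, `num_articles_ref`, `search_algorithm`) and the four algorithm values come from the notebook, while the helpers `retriever_kwargs` and `make_retriever` and the `example.com` domain are hypothetical and used only for illustration.

```python
# Values listed for search_algorithm in the example notebook.
SEARCH_ALGORITHMS = ("most_recent", "most_recent_semantic", "semantic", "trending")


def retriever_kwargs(data_model_id, k, search_algorithm, ref=None, num_articles_ref=None):
    """Validate and collect keyword arguments for DappierRetriever.

    Hypothetical helper for illustration; not part of langchain-dappier.
    """
    if not data_model_id.startswith("dm_"):
        raise ValueError("data_model_id should start with 'dm_'")
    if k < 1:
        raise ValueError("k should be a positive number of documents")
    if search_algorithm not in SEARCH_ALGORITHMS:
        raise ValueError(f"search_algorithm should be one of {SEARCH_ALGORITHMS}")
    kwargs = {"data_model_id": data_model_id, "k": k, "search_algorithm": search_algorithm}
    # Only pass the reference-domain options when the caller supplies them,
    # so the package's own defaults apply otherwise.
    if ref is not None:
        kwargs["ref"] = ref
    if num_articles_ref is not None:
        kwargs["num_articles_ref"] = num_articles_ref
    return kwargs


def make_retriever(**options):
    """Create the retriever. The import is local so the sketch loads without the package."""
    from langchain_dappier import DappierRetriever

    return DappierRetriever(**retriever_kwargs(**options))


# Example call (needs langchain-dappier and DAPPIER_API_KEY; example.com is a placeholder):
# retriever = make_retriever(
#     data_model_id="dm_01jagy9nqaeer9hxx8z1sk1jx6",
#     k=3,
#     search_algorithm="semantic",
#     ref="example.com",
#     num_articles_ref=1,
# )
# retriever.invoke("latest tech news")
```

Validating the arguments before constructing the retriever is a design choice of this sketch; it surfaces a mistyped algorithm name locally instead of at request time.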

The integration package is available at
https://github.com/DappierAI/langchain-dappier
amaan-ai20 authored Jan 3, 2025
1 parent 0185010 commit 8d7daa5
Showing 4 changed files with 371 additions and 18 deletions.
48 changes: 48 additions & 0 deletions docs/docs/integrations/providers/dappier.mdx
@@ -0,0 +1,48 @@
# Dappier

[Dappier](https://dappier.com) connects any LLM or your Agentic AI to
real-time, rights-cleared, proprietary data from trusted sources,
making your AI an expert in anything. Our specialized models include
Real-Time Web Search, News, Sports, Financial Stock Market Data,
Crypto Data, and exclusive content from premium publishers. Explore a
wide range of data models in our marketplace at
[marketplace.dappier.com](https://marketplace.dappier.com).

[Dappier](https://dappier.com) delivers enriched, prompt-ready, and
contextually relevant data strings, optimized for seamless integration
with LangChain. Whether you're building conversational AI, recommendation
engines, or intelligent search, Dappier's LLM-agnostic RAG models ensure
your AI has access to verified, up-to-date data—without the complexity of
building and managing your own retrieval pipeline.

## Installation and Setup

Install ``langchain-dappier`` and set the ``DAPPIER_API_KEY`` environment
variable.

```bash
pip install -U langchain-dappier
export DAPPIER_API_KEY="your-api-key"
```

The API key can be generated on the
[Dappier site](https://platform.dappier.com/profile/api-keys).

The supported data models are listed in the
[Dappier marketplace](https://platform.dappier.com/marketplace).
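
If the key is not exported in the shell, it can also be set from Python before the retriever is created. The following is a minimal sketch using only the standard library. The helper name `ensure_dappier_api_key` is hypothetical and not part of `langchain-dappier`; it only assumes that the key is read from the `DAPPIER_API_KEY` environment variable, as in the setup step above.

```python
import getpass
import os


def ensure_dappier_api_key(env=os.environ, prompt=getpass.getpass):
    """Return the Dappier API key, prompting for it only if it is not set.

    Hypothetical helper for illustration; not part of langchain-dappier.
    """
    key = env.get("DAPPIER_API_KEY", "").strip()
    if not key:
        # Ask interactively without echoing the key to the terminal.
        key = prompt("Enter your Dappier API key: ").strip()
        if not key:
            raise ValueError("DAPPIER_API_KEY is required")
        env["DAPPIER_API_KEY"] = key
    return key
```

Calling `ensure_dappier_api_key()` once at the top of a script or notebook keeps the key out of the source file.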

## Chat models

See a [usage example](/docs/integrations/chat/dappier).

```python
from langchain_community.chat_models import ChatDappierAI
```

## Retriever

See a [usage example](/docs/integrations/retrievers/dappier).

```python
from langchain_dappier import DappierRetriever
```
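
The retriever returns LangChain documents, each with `page_content` and `metadata`. As a rough sketch of how the results might be turned into a context string with attribution, the snippet below uses the `title` and `source_url` metadata keys that appear in the sample output of the example notebook; other data models may return different keys, so both are treated as optional. The function name `format_docs_with_sources` is hypothetical, and a stand-in object replaces a real retrieved document so the sketch runs without an API key.

```python
from types import SimpleNamespace


def format_docs_with_sources(docs):
    """Join document texts, prefixing each with its title and source URL if present."""
    blocks = []
    for doc in docs:
        meta = doc.metadata or {}
        title = meta.get("title", "Untitled")
        source = meta.get("source_url", "unknown source")
        blocks.append(f"{title} ({source})\n{doc.page_content}")
    return "\n\n".join(blocks)


# Stand-in for a retrieved document; real results come from retriever.invoke(...).
sample = SimpleNamespace(
    page_content="Example article summary.",
    metadata={"title": "Example title", "source_url": "https://example.com/a"},
)
print(format_docs_with_sources([sample]))
```

A function like this could take the place of a plain `format_docs` step when the prompt should cite where each passage came from.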
18 changes: 0 additions & 18 deletions docs/docs/integrations/providers/dappierai.mdx

This file was deleted.

319 changes: 319 additions & 0 deletions docs/docs/integrations/retrievers/dappier.ipynb
@@ -0,0 +1,319 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e6e1e5d5",
"metadata": {
"id": "e6e1e5d5"
},
"source": [
"# Dappier\n",
"\n",
"[Dappier](https://dappier.com) connects any LLM or your Agentic AI to real-time, rights-cleared, proprietary data from trusted sources, making your AI an expert in anything. Our specialized models include Real-Time Web Search, News, Sports, Financial Stock Market Data, Crypto Data, and exclusive content from premium publishers. Explore a wide range of data models in our marketplace at [marketplace.dappier.com](https://marketplace.dappier.com).\n",
"\n",
"[Dappier](https://dappier.com) delivers enriched, prompt-ready, and contextually relevant data strings, optimized for seamless integration with LangChain. Whether you're building conversational AI, recommendation engines, or intelligent search, Dappier's LLM-agnostic RAG models ensure your AI has access to verified, up-to-date data—without the complexity of building and managing your own retrieval pipeline."
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {
"id": "e49f1e0d"
},
"source": [
"# DappierRetriever\n",
"\n",
"This will help you get started with the Dappier [retriever](https://python.langchain.com/docs/concepts/retrievers/). For detailed documentation of all DappierRetriever features and configurations, head to the [API reference](https://python.langchain.com/en/latest/retrievers/langchain_dappier.retrievers.Dappier.DappierRetriever.html).\n",
"\n",
"### Integration details\n",
"\n",
"Bring-your-own data (i.e., index and search a custom corpus of documents):\n",
"\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"| [DappierRetriever](https://python.langchain.com/en/latest/retrievers/langchain_dappier.retrievers.Dappier.DappierRetriever.html) | ❌ | ❌ | langchain-dappier |\n",
"\n",
"### Setup\n",
"\n",
"Install ``langchain-dappier`` and set the ``DAPPIER_API_KEY`` environment variable.\n",
"\n",
"```bash\n",
"pip install -U langchain-dappier\n",
"export DAPPIER_API_KEY=\"your-api-key\"\n",
"```\n",
"\n",
"The API key can be generated on the [Dappier site](https://platform.dappier.com/profile/api-keys).\n",
"\n",
"The supported data models are listed in the [Dappier marketplace](https://platform.dappier.com/marketplace)."
]
},
{
"cell_type": "markdown",
"id": "72ee0c4b-9764-423a-9dbf-95129e185210",
"metadata": {
"id": "72ee0c4b-9764-423a-9dbf-95129e185210"
},
"source": [
"If you want to get automated tracing from individual queries, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a15d341e-3e26-4ca3-830b-5aab30ed66de",
"metadata": {
"id": "a15d341e-3e26-4ca3-830b-5aab30ed66de"
},
"outputs": [],
"source": [
"# import getpass\n",
"# import os\n",
"# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n",
"# os.environ[\"LANGSMITH_TRACING\"] = \"true\""
]
},
{
"cell_type": "markdown",
"id": "0730d6a1-c893-4840-9817-5e5251676d5d",
"metadata": {
"id": "0730d6a1-c893-4840-9817-5e5251676d5d"
},
"source": [
"### Installation\n",
"\n",
"This retriever lives in the `langchain-dappier` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"outputId": "d1bb87dd-860d-4255-d5a8-b3f42da1d76e"
},
"outputs": [],
"source": [
"%pip install -qU langchain-dappier"
]
},
{
"cell_type": "markdown",
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
"metadata": {
"id": "a38cde65-254d-4219-a441-068766c0d4b5"
},
"source": [
"## Instantiation\n",
"\n",
"- data_model_id: str\n",
"  Data model ID, starting with dm_.\n",
"  You can find the available data model IDs in the [Dappier marketplace](https://platform.dappier.com/marketplace).\n",
"- k: int\n",
"  Number of documents to return.\n",
"- ref: Optional[str]\n",
"  Site domain where AI recommendations are displayed.\n",
"- num_articles_ref: int\n",
"  Minimum number of articles from the ref domain specified.\n",
"  The rest will come from other sites within the RAG model.\n",
"- search_algorithm: Literal[\"most_recent\", \"most_recent_semantic\", \"semantic\", \"trending\"]\n",
"  Search algorithm for retrieving articles.\n",
"- api_key: Optional[str]\n",
"  The API key used to interact with the Dappier APIs."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "70cc8e65-2a02-408a-bbc6-8ef649057d82",
"metadata": {
"id": "70cc8e65-2a02-408a-bbc6-8ef649057d82"
},
"outputs": [],
"source": [
"from langchain_dappier import DappierRetriever\n",
"\n",
"retriever = DappierRetriever(data_model_id=\"dm_01jagy9nqaeer9hxx8z1sk1jx6\")"
]
},
{
"cell_type": "markdown",
"id": "5c5f2839-4020-424e-9fc9-07777eede442",
"metadata": {
"id": "5c5f2839-4020-424e-9fc9-07777eede442"
},
"source": [
"## Usage"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
"outputId": "f85bcf8e-4b51-4f82-8e48-582a9643fa0a"
},
"outputs": [
{
"data": {
"text/plain": [
"[Document(metadata={'title': 'Man shot and killed on Wells Street near downtown Fort Wayne', 'author': 'Gregg Montgomery', 'source_url': 'https://www.wishtv.com/news/indiana-news/man-shot-dies-fort-wayne-december-25-2024/', 'image_url': 'https://images.dappier.com/dm_01jagy9nqaeer9hxx8z1sk1jx6/fort-wayne-police-department-vehicle-via-Flickr_.jpg?width=428&height=321', 'pubdata': 'Thu, 26 Dec 2024 01:00:33 +0000'}, page_content='A man was shot and killed on December 25, 2024, in Fort Wayne, Indiana, near West Fourth and Wells streets. Police arrived shortly after 6:30 p.m. following reports of gunfire and found the victim in the 1600 block of Wells Street, where he was pronounced dead. The area features a mix of businesses, including a daycare and restaurants.\\n\\nAs of the latest updates, police have not provided details on the safety of the area, potential suspects, or the motive for the shooting. Authorities are encouraging anyone with information to reach out to the Fort Wayne Police Department or Crime Stoppers.'),\n",
" Document(metadata={'title': 'House cat dies from bird flu in pet food, prompting recall', 'author': 'Associated Press', 'source_url': 'https://www.wishtv.com/news/business/house-cat-bird-flu-pet-food-recall/', 'image_url': 'https://images.dappier.com/dm_01jagy9nqaeer9hxx8z1sk1jx6/BACKGROUND-Northwest-Naturals-cat-food_.jpg?width=428&height=321', 'pubdata': 'Wed, 25 Dec 2024 23:12:41 +0000'}, page_content='An Oregon house cat has died after eating pet food contaminated with the H5N1 bird flu virus, prompting a nationwide recall of Northwest Naturals\\' 2-pound Feline Turkey Recipe raw frozen pet food. The Oregon Department of Agriculture confirmed that the strictly indoor cat contracted the virus solely from the food, which has \"best if used by\" dates of May 21, 2026, and June 23, 2026. \\n\\nThe affected product was distributed across several states, including Arizona, California, and Florida, as well as British Columbia, Canada. Consumers are urged to dispose of the recalled food and seek refunds. This incident raises concerns about the spread of bird flu and its potential impact on domestic animals, particularly as California has declared a state of emergency due to the outbreak affecting various bird species.'),\n",
" Document(metadata={'title': '20 big cats die from bird flu at Washington sanctuary', 'author': 'Nic F. Anderson, CNN', 'source_url': 'https://www.wishtv.com/news/national/bird-flu-outbreak-wild-felid-center-2024/', 'image_url': 'https://images.dappier.com/dm_01jagy9nqaeer9hxx8z1sk1jx6/BACKGROUND-Amur-Bengal-tiger-at-Wild-Felid-Advocacy-Center-of-Washington-FB-post_.jpg?width=428&height=321', 'pubdata': 'Wed, 25 Dec 2024 23:04:34 +0000'}, page_content='The Wild Felid Advocacy Center in Washington state has experienced a devastating bird flu outbreak, resulting in the deaths of 20 big cats, over half of its population. The first death was reported around Thanksgiving, affecting various species, including cougars and a tiger mix. The sanctuary is currently under quarantine, closed to the public, and working with animal health officials to disinfect enclosures and implement prevention strategies.\\n\\nAs the situation unfolds, the Washington Department of Fish and Wildlife has noted an increase in bird flu cases statewide, including infections in cougars. While human infections from bird flu through contact with mammals are rare, the CDC acknowledges the potential risk. The sanctuary hopes to reopen in the new year, focusing on the recovery of the remaining animals and taking measures to prevent further outbreaks, marking an unprecedented challenge in its 20-year history.')]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"latest tech news\"\n",
"\n",
"retriever.invoke(query)"
]
},
{
"cell_type": "markdown",
"id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e",
"metadata": {
"id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e"
},
"source": [
"## Use within a chain\n",
"\n",
"Like other retrievers, DappierRetriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).\n",
"\n",
"We will need an LLM or chat model:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "25b647a3-f8f2-4541-a289-7a241e43f9df",
"metadata": {
"id": "25b647a3-f8f2-4541-a289-7a241e43f9df"
},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae",
"metadata": {
"id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae"
},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\n",
"    \"\"\"Answer the question based only on the context provided.\n",
"\n",
"Context: {context}\n",
"\n",
"Question: {question}\"\"\"\n",
")\n",
"\n",
"\n",
"def format_docs(docs):\n",
"    return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"\n",
"chain = (\n",
"    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | llm\n",
"    | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 143
},
"id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8",
"outputId": "f23f18a9-d138-4684-cb5b-b92e0895b5f2"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"\"The key highlights and outcomes from the latest events covered in the article include:\\n\\n1. An Israeli airstrike in Gaza killed five journalists from Al-Quds Today Television, leading to condemnation from their outlet and raising concerns about violence against media professionals in the region.\\n2. The Committee to Protect Journalists reported that since October 7, 2023, at least 141 journalists have been killed in the region, marking the deadliest period for journalists since 1992, with the majority being Palestinians in Gaza.\\n3. A man was shot and killed in Fort Wayne, Indiana, with police not providing details on suspects, motive, or the safety of the area.\\n4. An Oregon house cat died after eating pet food contaminated with the H5N1 bird flu virus, leading to a nationwide recall of Northwest Naturals' Feline Turkey Recipe raw frozen pet food and raising concerns about the spread of bird flu among domestic animals.\""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(\n",
" \"What are the key highlights and outcomes from the latest events covered in the article?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
"metadata": {
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3"
},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all DappierRetriever features and configurations, head to the [API reference](https://python.langchain.com/en/latest/retrievers/langchain_dappier.retrievers.Dappier.DappierRetriever.html)."
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
4 changes: 4 additions & 0 deletions libs/packages.yml
@@ -317,3 +317,7 @@ packages:
    repo: kingtroga/langchain-falkordb
    downloads: 610
    downloads_updated_at: '2025-01-02T20:23:02.544257+00:00'
  - name: langchain-dappier
    path: .
    repo: DappierAI/langchain-dappier
    downloads: 0
