From ea17ebe18105bee2eaed42b54bbc3953e22b5b31 Mon Sep 17 00:00:00 2001 From: DARREN OBERST Date: Fri, 17 May 2024 11:05:13 -0400 Subject: [PATCH] updating docs --- docs/Gemfile.lock | 2 +- docs/architecture.md | 2 - docs/fast_start.md | 142 +++++++++++++++++++++++++++++++++ docs/index.md | 172 +--------------------------------------- docs/platforms.md | 156 ++++++++++++++++++++++++++++++++++++ docs/release_history.md | 147 ++++++++++++++++++++++++++++++++++ docs/troubleshooting.md | 147 ++++++++++++++++++++++++++++++++++ docs/use_cases.md | 133 +++++++++++++++++++++++++++++++ docs/videos.md | 116 +++++++++++++++++++++++++++ 9 files changed, 843 insertions(+), 174 deletions(-) create mode 100644 docs/fast_start.md create mode 100644 docs/platforms.md create mode 100644 docs/release_history.md create mode 100644 docs/troubleshooting.md create mode 100644 docs/use_cases.md create mode 100644 docs/videos.md diff --git a/docs/Gemfile.lock b/docs/Gemfile.lock index e9c0e246..0b8cd92f 100644 --- a/docs/Gemfile.lock +++ b/docs/Gemfile.lock @@ -57,7 +57,7 @@ GEM jekyll-seo-tag (>= 2.0) rake (>= 12.3.1) kramdown (2.4.0) - rexml + rexml (>= 3.2.7) kramdown-parser-gfm (1.1.0) kramdown (~> 2.0) liquid (4.0.4) diff --git a/docs/architecture.md b/docs/architecture.md index ef57683c..bf093b4a 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -5,8 +5,6 @@ nav_order: 1 description: overview of the major modules and classes of LLMWare permalink: /architecture --- - - # LLMWare Architecture =============== diff --git a/docs/fast_start.md b/docs/fast_start.md new file mode 100644 index 00000000..9720f8e7 --- /dev/null +++ b/docs/fast_start.md @@ -0,0 +1,142 @@ +--- +layout: default +title: Fast Start Series | llmware +nav_order: 1 +description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. 
+permalink: /fast_start +--- +Fast Start: Learning RAG with llmware through 6 examples +=============== + +**Welcome to llmware!** + +Fast Start is a structured series of 6 self-contained examples and accompanying videos that walk through the core foundational components of RAG with LLMWare. + +**Set up** + +`pip3 install llmware` or, if you prefer, clone the GitHub repo locally, e.g., `git clone git@github.com:llmware-ai/llmware.git`. + +Platforms: +- Mac M1/M2/M3, Windows, Linux (Ubuntu 20 or Ubuntu 22 preferred) +- RAM: 16 GB minimum +- Python 3.9, 3.10, 3.11, 3.12 +- Pull the latest version of llmware == 0.2.13 (as of mid-May 2024) +- Please note that we have updated the examples from the original versions to use new features in llmware, so there may be minor differences from the videos; these are annotated in the comments in each example. + +There are 6 examples, designed to be used step-by-step, but each is self-contained, +so feel free to jump into any example, in any order. + +Each example has been designed to be "copy-paste" and RUN, with lots of helpful comments and explanations embedded in the code samples. + +Please check out our [Fast Start YouTube tutorials](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB) that walk through each example below. + +Examples: + +**Section I - Learning the Main Components** +1. **Library** - parse, text chunk, and index to convert a "pile of files" into an AI-ready knowledge base. [Video](https://youtu.be/2xDefZ4oBOM?si=8vRCvqj0-HG3zc4c) + +2. **Embeddings** - apply an embedding model to the Library, store vectors, and start enabling natural language queries. [Video](https://youtu.be/xQEk6ohvfV0?si=B3X25ZsAZfW4AR_3) + +3. **Prompts** & **Model Catalog** - start running inferences and building prompts. [Video](https://youtu.be/swiu4oBVfbA?si=0IVmLhiiYS3-pMIg) + +**Section II - Connecting Knowledge with Prompts - 3 scenarios** + +4. 
**RAG with Text Query** - start integrating documents into prompts. [Video](https://youtu.be/6oALi67HP7U?si=pAbvio4ULXTIXKdL) + +5. **RAG with Semantic Query** - use natural language queries on documents and integrate with prompts. [Video](https://youtu.be/XT4kIXA9H3Q?si=EBCAxVXBt5vgYY8s) + +6. **RAG with more complex retrieval** - start integrating more complex retrieval patterns. [Video](https://youtu.be/G1Q6Ar8THbo?si=vIVAv35uXAcnaUJy) + +After completing these 6 examples, you should have a good foundation and set of recipes to start +exploring the other 100+ examples in the /examples folder, and build more sophisticated +LLM-based applications. + +**Models** + - All of these examples are optimized for using local CPU-based models, primarily BLING and DRAGON. + - If you want to substitute any other model in the catalog, it is generally as easy as + switching the model_name. If the model requires API keys, we show in the examples how to pass those keys as an + environment variable. + +**Collection Databases** + - Our parsers are optimized to index text chunks directly into a persistent data store. + - For Fast Start, we will use "sqlite", which is an embedded database, requiring no install. + - For more scalable deployment, we would recommend either "mongo" or "postgres". + - Install instructions for "mongo" and "postgres" are provided in docker-compose files in the repository. + +**Vector Databases** + - For Fast Start, we will use "chromadb" in persistent 'file' mode, requiring no install. + - Note: if you are using Python < 3.12, then feel free to substitute faiss (which was used in the videos). + - Note: depending upon how and when you installed llmware, you may need to `pip install chromadb`. + - For more scalable deployment, we would recommend installing one of 9 supported vector databases, + including Milvus, PGVector (Postgres), Redis, Qdrant, Neo4j, Mongo-Atlas, Chroma, LanceDB, or Pinecone. 
- Install instructions are provided in "examples/Embedding" for each specific db, as well as docker-compose scripts. + +**Local and Private** + - All of the processing will take place locally on your laptop. + +*This is an ongoing initiative to provide easy-to-get-started tutorials - we welcome and encourage feedback, as well +as contributions with examples and other tips for helping others on their LLM application journeys!* + +**Let's get started!** + + + +# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. +[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). 
+ + + +--- + +--- diff --git a/docs/index.md b/docs/index.md index a983753d..b1f8b7b6 100644 --- a/docs/index.md +++ b/docs/index.md @@ -28,7 +28,7 @@ Our specific focus is on making it easy to integrate open source small specializ 1. Install llmware - `pip3 install llmware` -2. Make sure that you are running on a [supported platform](#platform-support). +2. Make sure that you are running on a [supported platform](platforms.md/#platform-support). 3. Learn by example: @@ -53,7 +53,6 @@ Our specific focus is on making it easy to integrate open source small specializ [Install llmware](#install-llmware){: .btn .btn-primary .fs-5 .mb-4 .mb-md-0 .mr-2 } [Common Setup & Configuration Items](#platform-support){: .btn .fs-5 .mb-4 .mb-md-0 } -[Troubleshooting](#common-troubleshooting-issues){: .btn .fs-5 .mb-4 .mb-md-0 } [Architecture](architecture.md/#llmware-architecture){: .btn .fs-5 .mb-4 .mb-md-0 } [View llmware on GitHub](https://www.github.com/llmware-ai/llmware/tree/main){: .btn .fs-5 .mb-4 .mb-md-0 } [Open an Issue on GitHub](https://www.github.com/llmware-ai/llmware/issues){: .btn .fs-5 .mb-4 .mb-md-0 } @@ -91,175 +90,6 @@ git clone git@github.com:llmware-ai/llmware.git - Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform. -___ -# Platform Support - -**Platform Supported** - -- **Python 3.9+** (note that we just added support for 3.12 starting in llmware version 0.2.12) - - -- **System RAM**: recommended 16 GB RAM minimum (to run most local models on CPU) - - -- **OS Supported**: Mac OS M1/M2/M3, Windows, Linux Ubuntu 20/22. We regularly build and test on Windows and Linux platforms with and without CUDA drivers. - - -- **Deprecated OS**: Linux Aarch64 (0.2.6) and Mac x86 (0.2.10) - most features of llmware should work on these platforms, but new features integrated since those versions will not be available. 
If you have a particular need to work on one of these platforms, please raise an Issue, and we can work with you to try to find a solution. - - -- **Linux**: we build to GLIBC 2.31+ - so Linux versions with older GLIBC drivers will generally not work (e.g., Ubuntu 18). To check the GLIBC version, you can use the command `ldd --version`. If it is 2.31 or any higher version, it should work. - -___ - -___ -**Database** - -- LLMWare is an enterprise-grade data pipeline designed for persistent storage of key artifacts throughout the pipeline. We provide several options to parse 'in-memory' and write to jsonl files, but most of the functionality of LLMWare assumes that a persistent scalable data store will be used. - - -- There are three different types of data storage used in LLMWare: - - 1. **Text Collection database** - all of the LLMWare parsers, by default, parse and text chunk unstructured content (and associated metadata) into one of three databases used for text collections, organized in Libraries - **MongoDB**, **Postgres** and **SQLite**. - - 2. **Vector database** - for storing and retrieving semantic embedding vectors, LLMWare supports the following vector databases - Milvus, PG Vector / Postgres, Qdrant, ChromaDB, Redis, Neo4J, Lance DB, Mongo-Atlas, Pinecone and FAISS. - - 3. **SQL Tables database** - for easily integrating table-based data into LLM workflows through the CustomTable class and for using in conjunction with a Text-2-SQL workflow - supported on Postgres and SQLite. - - -- **Fast Start** option: you can start using SQLite locally without any separate installation by setting `LLMWareConfig.set_active_db("sqlite")` as shown in [configure_db_example](https://www.github.com/llmware-ai/llmware/blob/main/examples/Getting_Started/configure_db.py). For vector embedding examples, you can use ChromaDB, LanceDB or FAISS - all of which provide no-install options - just start using. 
- - -- **Install DB dependencies**: we provide a number of Docker-Compose scripts which can be used, or follow install instructions provided by the database - generally easiest to install locally with Docker. - - -**LLMWare File Storage** - -- llmware stores a variety of artifacts during its operation locally in the /llmware_data path, which can be found as follows: - -```python -from llmware.configs import LLMWareConfig -llmware_fp = LLMWareConfig().get_llmware_path() -print("llmware_data path: ", llmware_fp) -``` - -- to change the llmware path, we can change both the 'home' path, which is the main filepath, and the 'llmware_data' path name -as follows: - -```python - -from llmware.configs import LLMWareConfig - -# changing the llmware home path - change home + llmware_path_name -LLMWareConfig().set_home("/my/new/local/home/path") -LLMWareConfig().set_llmware_path_name("llmware_data2") - -# check the new llmware home path -llmware_fp = LLMWareConfig().get_llmware_path() -print("updated llmware path: ", llmware_fp) - - -``` - -___ - -___ -**Local Models** - -- LLMWare treats open source and locally deployed models as "first class citizens" with all classes, methods and examples designed to work first with smaller, specialized, locally-deployed models. -- By default, most models are pulled from public HuggingFace repositories, and cached locally. LLMWare will store all models locally at the /llmware_data/model_repo path, with all assets found in a folder tree with the models name. -- If a Pytorch model is pulled from HuggingFace, then it will appear in the default HuggingFace /.cache path. -- To view the local model path: - -```python -from llmware.configs import LLMWareConfig - -model_fp = LLMWareConfig().get_model_repo_path() -print("model repo path: ", model_fp) - -``` -___ - -# Common Troubleshooting Issues -___ - - -1. **Can not install the pip package** - - -- Check your Python version. If using Python 3.9-3.11, then almost any version of llmware should work. 
If using an older Python (before 3.9), then it is likely that dependencies will fail in the pip process. If you are using Python 3.12, then you need to use llmware>=0.2.12. - - -- Dependency constraint error. If you receive a specific error around a dependency version constraint, then please raise an issue and include details about your OS, Python version, any unique elements in your virtual environment, and specific error. - - -2. **Parser module not found** - - -- Check your OS and confirm that you are using a [supported platform](#platform-support). - -- If you cloned the repository, please confirm that the /lib folder has been copied into your local path. - - -3. **Pytorch Model not loading** - - -- Confirm the obvious stuff - correct model name, model exists in Huggingface repository, connected to the Internet with open ports for HTTPS connection, etc. - - -- Check Pytorch version - update Pytorch to >2.0, which is required for many recent models released in the last 6 months, and in some cases, may require other dependencies not included in the llmware package. - - -4. **GGUF Model not loading** - - -- Confirm that you are using llmware>=0.2.11 for the latest GGUF support. - - -- Confirm that you are using a [supported platform](#platform-support). 
We provide pre-built binaries for llama.cpp as a back-end GGUF engine on the following platforms: - - - Mac M1/M2/M3 - OS version 14 - "with accelerate framework" - - Mac M1/M2/M3 - OS older versions - "without accelerate framework" - - Windows - x86 - - Windows with CUDA - - Linux - x86 (Ubuntu 20+) - - Linux with CUDA (Ubuntu 20+) - -If you are using a different OS platform, you have the option to "bring your own llama.cpp" lib as follows: - -```python -from llmware.gguf_configs import GGUFConfigs -GGUFConfigs().set_config("custom_lib_path", "/path/to/your/libllama_binary") -``` - -If you have any trouble, feel free to raise an Issue and we can provide you with instructions and/or help compiling llama.cpp for your platform. - - -- Specific GGUF model - if you are successfully using other GGUF models, and only having problems with a specific model, then please raise an Issue, and share the specific model and architecture. - - -5. **Example not working as expected** - please raise an issue, so we can evaluate and fix any bugs in the example code. Also, pull requests are always especially welcomed with a fix or improvement in an example. - - -6. **Model not leveraging CUDA available in environment.** - - -- **Check CUDA drivers installed correctly** - easy check of the NVIDIA CUDA drivers is to use `nvidia-smi` and `nvcc --version` from the command line. Both commands should respond positively with details on the versions and implementations. Any errors indicates that either the driver or CUDA toolkit are not installed or recognized. It can be complicated at times to debug the environment, usually with some trial and error. See extensive [Nvidia Developer documentation](https://docs.nvidia.com) for trouble-shooting steps, specific to your environment. - - -- **Check CUDA drivers are up to date** - we build to CUDA 12.1, which translates to a minimum of 525.60 on Linux, and 528.33 on Windows. 
- - -- **Pytorch model** - check that Pytorch is finding CUDA, e.g., `torch.cuda.is_available()` == True. We have seen issues on Windows, in particular, to confirm that your Pytorch version has been compiled with CUDA drivers. For Windows, in particular, we have found that you may need to compile a CUDA-specific version of Pytorch, using the following command: - - ```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121``` - - -- **GGUF model** - logs will be displayed on the screen confirming that CUDA is being used, or whether 'fall-back' to CPU drivers. We run a custom CUDA install check, which you can run on your system with: - ```gpu_status = ModelCatalog().gpu_available``` - - If you are confirming CUDA present, but fall-back to CPU is being used, you can set the GGUFConfigs to force to CUDA: - ```GGUFConfigs().set_config("force_gpu", True)``` - - If you are looking to use specific optimizations, you can bring your own llama.cpp lib as follows: - ```GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")``` - - -- If you can not debug after these steps, then please raise an Issue. We are happy to dig in and work with you to run FAST local inference. - - -7. **Model result inconsistent** - - -- when loading the model, set `temperature=0.0` and `sample=False` -> this will give a deterministic output for better testing and debugging. - - -- usually the issue will be related to the retrieval step and formation of the Prompt, and as always, good pipelines and a little experimentation usually help ! 
- # More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) diff --git a/docs/platforms.md b/docs/platforms.md new file mode 100644 index 00000000..fd72f883 --- /dev/null +++ b/docs/platforms.md @@ -0,0 +1,156 @@ +--- +layout: default +title: Platform Support | llmware +nav_order: 1 +description: llmware is an integrated framework with 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. +permalink: /platform_support +--- +___ +# Platform Support +___ + +**Platforms Supported** + +- **Python 3.9+** (note that we just added support for 3.12 starting in llmware version 0.2.12) + + +- **System RAM**: recommended 16 GB RAM minimum (to run most local models on CPU) + + +- **OS Supported**: Mac OS M1/M2/M3, Windows, Linux Ubuntu 20/22. We regularly build and test on Windows and Linux platforms with and without CUDA drivers. + + +- **Deprecated OS**: Linux Aarch64 (0.2.6) and Mac x86 (0.2.10) - most features of llmware should work on these platforms, but new features integrated since those versions will not be available. If you have a particular need to work on one of these platforms, please raise an Issue, and we can work with you to try to find a solution. + + +- **Linux**: we build to GLIBC 2.31+ - so Linux versions with older GLIBC drivers will generally not work (e.g., Ubuntu 18). To check the GLIBC version, you can use the command `ldd --version`. If it is 2.31 or any higher version, it should work. + +___ + +___ +**Database** + +- LLMWare is an enterprise-grade data pipeline designed for persistent storage of key artifacts throughout the pipeline. We provide several options to parse 'in-memory' and write to jsonl files, but most of the functionality of LLMWare assumes that a persistent, scalable data store will be used. + + +- There are three different types of data storage used in LLMWare: + + 1. 
**Text Collection database** - all of the LLMWare parsers, by default, parse and text chunk unstructured content (and associated metadata) into one of three databases used for text collections, organized in Libraries - **MongoDB**, **Postgres** and **SQLite**. + + 2. **Vector database** - for storing and retrieving semantic embedding vectors, LLMWare supports the following vector databases - Milvus, PG Vector / Postgres, Qdrant, ChromaDB, Redis, Neo4J, Lance DB, Mongo-Atlas, Pinecone and FAISS. + + 3. **SQL Tables database** - for easily integrating table-based data into LLM workflows through the CustomTable class and for using in conjunction with a Text-2-SQL workflow - supported on Postgres and SQLite. + + +- **Fast Start** option: you can start using SQLite locally without any separate installation by setting `LLMWareConfig.set_active_db("sqlite")` as shown in [configure_db_example](https://www.github.com/llmware-ai/llmware/blob/main/examples/Getting_Started/configure_db.py). For vector embedding examples, you can use ChromaDB, LanceDB or FAISS - all of which provide no-install options - just start using. + + +- **Install DB dependencies**: we provide a number of Docker-Compose scripts which can be used, or follow install instructions provided by the database - generally easiest to install locally with Docker. 
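Putting the "Fast Start" option above into a minimal sketch - this assumes llmware has been installed (`pip3 install llmware`), and simply points the text collection store at the embedded SQLite database:

```python
# minimal sketch of the no-install "Fast Start" database configuration above -
# assumes llmware is installed; selects the embedded SQLite text collection db
from llmware.configs import LLMWareConfig

# use SQLite locally - no separate database install or docker-compose needed
LLMWareConfig().set_active_db("sqlite")
```

For a scalable deployment, the same call can select "mongo" or "postgres" once those databases have been installed (e.g., via the provided docker-compose scripts).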
+ + +**LLMWare File Storage** + +- llmware stores a variety of artifacts during its operation locally in the /llmware_data path, which can be found as follows: + +```python +from llmware.configs import LLMWareConfig +llmware_fp = LLMWareConfig().get_llmware_path() +print("llmware_data path: ", llmware_fp) +``` + +- to change the llmware path, we can change both the 'home' path, which is the main filepath, and the 'llmware_data' path name +as follows: + +```python + +from llmware.configs import LLMWareConfig + +# changing the llmware home path - change home + llmware_path_name +LLMWareConfig().set_home("/my/new/local/home/path") +LLMWareConfig().set_llmware_path_name("llmware_data2") + +# check the new llmware home path +llmware_fp = LLMWareConfig().get_llmware_path() +print("updated llmware path: ", llmware_fp) + + +``` + +___ + +___ +**Local Models** + +- LLMWare treats open source and locally deployed models as "first class citizens", with all classes, methods and examples designed to work first with smaller, specialized, locally-deployed models. +- By default, most models are pulled from public HuggingFace repositories, and cached locally. LLMWare will store all models locally at the /llmware_data/model_repo path, with all assets found in a folder tree with the model's name. +- If a Pytorch model is pulled from HuggingFace, then it will appear in the default HuggingFace /.cache path. +- To view the local model path: + +```python +from llmware.configs import LLMWareConfig + +model_fp = LLMWareConfig().get_model_repo_path() +print("model repo path: ", model_fp) + +``` + + +# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). 
+ +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. +[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! + + + +--- + +--- diff --git a/docs/release_history.md b/docs/release_history.md new file mode 100644 index 00000000..8aa8cf6a --- /dev/null +++ b/docs/release_history.md @@ -0,0 +1,147 @@ +--- +layout: default +title: Release History | llmware +nav_order: 1 +description: llmware is an integrated framework with 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. 
+permalink: /release_history +--- + +Release History +=============== + +- For Specific Wheels: [Wheel Archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) +- For Feature Details: [Main README-'Release notes and Change Log'](https://www.github.com/llmware-ai/llmware/tree/main/) + +New wheels are generally built and published to PyPI on a weekly basis, with the PyPI versioning updated accordingly. The development repo is updated +and current at all times, but may have updates that are not yet in the PyPI wheel. + +All wheels are built and tested on: + +1. Mac Metal +2. Windows x86 (+ with CUDA) +3. Linux x86 (+ with CUDA) - most testing on Ubuntu 22 and Ubuntu 20 - which are recommended. +4. Mac x86 (see 0.2.11 note below) +5. Linux aarch64* (see 0.2.7 note below) + +**Release Notes** + +--**0.2.14** released in the week of May 19, 2024 - continued clean-up and updating of dependencies - changes to model downloads from HuggingFace Hub, associated with huggingface_hub API changes - there are some 'future warnings' that are coming from within the HuggingFace code. If any problems, please raise an issue. + +--**0.2.13** released in the week of May 12, 2024 - clean-up of dependencies in both requirements.txt and Setup (PyPI) - install of a vector db python sdk (e.g., pymilvus, chromadb, etc.) is now required as a separate step outside of `pip3 install llmware` - the intent is to keep the dependency matrix as simple as possible and avoid potential dependency conflicts on install, especially for packages which in turn have a large number of dependencies. If you run into any issues with install dependencies, please raise an issue. + +--**0.2.12** released in the week of May 5, 2024 - added Python 3.12 support, and deprecated the use of faiss for v3.12+. We have changed the "Fast Start" no-install option to use chromadb or lancedb rather than faiss. Refactoring of code, especially with Datasets, Graph and Web Services as separate modules. 
+ +--**0.2.11** released in the week of April 29, 2024 - updated GGUF libs for Phi-3 and Llama-3 support, and added new prebuilt shared libraries to support WhisperCPP. We are also deprecating support for Mac x86 - we will continue to support most major components, but not all new features will be built specifically for Mac x86 (which Apple stopped shipping in 2022). Our intent is to keep narrowing our testing matrix to provide better support on key platforms. We have also added better safety checks for older versions of Mac OS running on M1/M2/M3 (no_acc option in GGUF and Whisper libs), as well as a custom check to find CUDA drivers on Windows (independent of Pytorch). + +--**0.2.9** released in the week of April 15, 2024 - minor continued improvements to the parsers, plus roll-out of the new CustomTable class for rapidly integrating structured information into LLM-based workflows and data pipelines, including converting JSON/JSONL files and CSV files into structured DB tables. + +--**0.2.8** released in the week of April 8, 2024 - significant improvements to the Office parser with new libs on all platforms. Conforming changes with the PDF parser in terms of exposing more options for text chunking strategies, encoding, and range of capture options (e.g., tables, images, header text, etc.). Linux aarch64 libs deprecated and kept at 0.2.6 - some new features will not be available on Linux aarch64 - we recommend using Ubuntu 20+ on x86_64 (with and without CUDA). + +--**0.2.7** released in the week of April 1, 2024 - significant improvements to the PDF parser with new libs on all platforms. Important note: we are keeping Linux aarch64 at the 0.2.6 libs - and will be deprecating support going forward. For Linux, we recommend Ubuntu 20+ and x86_64 (with and without CUDA). 
+ +--**0.2.5** released in the week of March 12, 2024 - continued enhancements of the GGUF implementation, especially for CUDA support, and re-compiling of all binaries to support Ubuntu 20 and Ubuntu 22. Ubuntu requirements are: CUDA 12.1 (to use GPU), and GLIBC 2.31+. + +--**GGUF on Windows CUDA**: useful notes and debugging tips - + + 1. Requirement: Nvidia CUDA 12.1+ + + -- how to check: `nvcc --version` and `nvidia-smi` - if not found, then drivers are either not installed or not in $PATH, and need to be configured + -- if you have older drivers (e.g., v11), then you will need to update them. + + 2. Requirement: CUDA-enabled Pytorch (pre-0.2.11) + + -- starting with 0.2.11, we have implemented a custom check to evaluate if CUDA is present, independent of Pytorch. + -- for pre-0.2.11, we use Pytorch to check for CUDA drivers, e.g., `torch.cuda.is_available()` and `torch.version.cuda` + + 3. Installing a CUDA-enabled Pytorch - useful install script (not required post-0.2.11 for GGUF on Windows): + + -- `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` + + 4. Fall-back to CPU - if llmware cannot load the CUDA-enabled drivers, it will automatically try to fall back to the CPU version of the drivers. + + -- you can also set `GGUFConfigs().set_config("use_gpu", False)` - and then it will automatically go to the CPU drivers. + + 5. Custom GGUF libraries - if you have a unique system requirement, you can build llama_cpp from source and apply custom build settings - or find in the community a prebuilt llama_cpp library that matches your platform. Happy to help if you share the requirements. + + -- to "bring your own GGUF": `GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")` - and llmware will try to load that library. + + 6. Issues? - please raise an Issue on GitHub, or on Discord - and we can work with you to get you up and running! 
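The two GGUF configuration settings from tips 4 and 5 above can be sketched together as follows - note that this is a sketch, and the custom lib path is a placeholder for your own compiled llama.cpp binary, not a real file:

```python
# sketch of the GGUF config settings from the debugging tips above -
# assumes llmware is installed; the custom lib path is a placeholder
from llmware.gguf_configs import GGUFConfigs

# tip 4: skip the CUDA drivers and go straight to the CPU version
GGUFConfigs().set_config("use_gpu", False)

# tip 5: "bring your own GGUF" - point llmware at your own llama.cpp build
GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")
```

Both settings take effect the next time a GGUF model is loaded, so set them before calling the model.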
+ +--**0.2.4** released in the week of February 26, 2024 - major upgrade of the GGUF implementation to support more options, including CUDA support - which is the main source of growth in the size of the wheel package. + + -- Note: We will look at making some of the CUDA builds 'optional' or 'bring your own' over time. + -- Note: We will also start to 'prune' the list of wheels kept in the archive to keep the total repo size manageable for cloning. + +--**0.2.2** introduced SLIM models and the new LLMfx class, and the capabilities for multi-model, multi-step Agent-based processes. + +--**0.2.0** released in the week of January 22, 2024 - significant enhancements, including integration of Postgres and SQLite drivers into the C lib parsers. + +--New examples involving Postgres or SQLite support (including 'Fast Start' examples) will require a fresh pip install of 0.2.0 or a clone of the repo. + +--If cloning the repo, please be especially careful to pick up the newly updated /lib dependencies for your platform. + +--New libs have new dependencies on Linux in particular - most extensive testing on Ubuntu 22. If any issues on a specific version of Linux, please raise a ticket. + + + + + +# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git) + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. 
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
new file mode 100644
index 00000000..78715cf2
--- /dev/null
+++ b/docs/troubleshooting.md
@@ -0,0 +1,147 @@
+---
+layout: default
+title: Common Troubleshooting Tips | llmware
+nav_order: 1
+description: llmware is an integrated framework with over 50 models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
+permalink: /troubleshooting
+---
+# Common Troubleshooting Issues
+___
+
+
+1. **Cannot install the pip package**
+
+    -- Check your Python version. If using Python 3.9-3.11, then almost any version of llmware should work. If using an older Python (before 3.9), then it is likely that dependencies will fail in the pip process. If you are using Python 3.12, then you need to use llmware>=0.2.12.
+
+    -- Dependency constraint error. If you receive a specific error around a dependency version constraint, then please raise an issue and include details about your OS, Python version, any unique elements in your virtual environment, and the specific error.
+
+
+2.
**Parser module not found**
+
+    -- Check your OS and confirm that you are using a [supported platform](platforms.md/#platform-support).
+    -- If you cloned the repository, please confirm that the /lib folder has been copied into your local path.
+
+
+3. **Pytorch Model not loading**
+
+    -- Confirm the obvious stuff - correct model name, model exists in the Huggingface repository, connected to the Internet with open ports for an HTTPS connection, etc.
+
+    -- Check your Pytorch version - update Pytorch to >2.0, which is required for many recent models released in the last 6 months, and which may in some cases require other dependencies not included in the llmware package.
+
+    -- Note: we have seen some compatibility issues with Pytorch==2.3 on Wintel platforms - if you run into these issues, we recommend using a back-level Pytorch==2.1, which we have seen fix the issue.
+
+4. **GGUF Model not loading**
+
+    -- Confirm that you are using llmware>=0.2.11 for the latest GGUF support.
+
+    -- Confirm that you are using a [supported platform](platforms.md/#platform-support). We provide pre-built binaries for llama.cpp as a back-end GGUF engine on the following platforms:
+
+      - Mac M1/M2/M3 - OS version 14 - "with accelerate framework"
+      - Mac M1/M2/M3 - OS older versions - "without accelerate framework"
+      - Windows - x86
+      - Windows with CUDA
+      - Linux - x86 (Ubuntu 20+)
+      - Linux with CUDA (Ubuntu 20+)
+
+If you are using a different OS platform, you have the option to "bring your own llama.cpp" lib as follows:
+
+```python
+from llmware.gguf_configs import GGUFConfigs
+GGUFConfigs().set_config("custom_lib_path", "/path/to/your/libllama_binary")
+```
+
+If you have any trouble, feel free to raise an Issue and we can provide you with instructions and/or help compiling llama.cpp for your platform.
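Before pointing llmware at a custom build, it can save time to confirm that the shared library file exists and actually loads on your platform. A minimal stdlib sketch (the helper name `backend_lib_loads` is our own illustration, not part of the llmware API):

```python
import ctypes
import os

def backend_lib_loads(lib_path: str) -> bool:
    """Sanity-check a candidate llama.cpp shared library before handing
    its path to GGUFConfigs 'custom_lib_path' - illustration only."""
    if not os.path.isfile(lib_path):
        return False
    try:
        # CDLL raises OSError on an architecture/ABI mismatch or missing deps
        ctypes.CDLL(lib_path)
        return True
    except OSError:
        return False
```

For example, `backend_lib_loads("/path/to/your/libllama_binary")` returning False points to a build or architecture problem, rather than an llmware configuration issue.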
+
+    -- Specific GGUF model - if you are successfully using other GGUF models, and are only having problems with a specific model, then please raise an Issue, and share the specific model and architecture.
+
+
+5. **Example not working as expected** - please raise an issue, so we can evaluate and fix any bugs in the example code. Also, pull requests are always welcome, especially with a fix or improvement in an example.
+
+
+6. **Model not leveraging CUDA available in environment.**
+
+    -- **Check CUDA drivers installed correctly** - an easy check of the Nvidia CUDA drivers is to run `nvidia-smi` and `nvcc --version` from the command line. Both commands should respond positively with details on the versions and implementations. Any error indicates that either the driver or the CUDA toolkit is not installed or recognized. It can be complicated at times to debug the environment, usually with some trial and error. See the extensive [Nvidia Developer documentation](https://docs.nvidia.com) for troubleshooting steps specific to your environment.
+
+    -- **Check CUDA drivers are up to date** - we build to CUDA 12.1, which translates to a minimum driver version of 525.60 on Linux, and 528.33 on Windows.
+
+    -- **Pytorch model** - check that Pytorch is finding CUDA, e.g., `torch.cuda.is_available()` == True. We have seen issues on Windows in particular, so confirm that your Pytorch version has been compiled with CUDA support. On Windows, you may need to install a CUDA-specific build of Pytorch, using the following command:
+
+    ```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121```
+
+    -- **GGUF model** - logs will be displayed on the screen confirming whether CUDA is being used, or whether there has been a 'fall-back' to the CPU drivers.
We run a custom CUDA install check, which you can run on your system with:
+    ```gpu_status = ModelCatalog().gpu_available```
+
+    If you have confirmed that CUDA is present, but the CPU fall-back is being used, you can set the GGUFConfigs to force CUDA:
+    ```GGUFConfigs().set_config("force_gpu", True)```
+
+    If you are looking to use specific optimizations, you can bring your own llama.cpp lib as follows:
+    ```GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")```
+
+    -- If you cannot debug after these steps, then please raise an Issue. We are happy to dig in and work with you to run FAST local inference.
+
+
+7. **Model result inconsistent**
+
+    -- when loading the model, set `temperature=0.0` and `sample=False` -> this will give a deterministic output for better testing and debugging.
+
+    -- usually the issue will be related to the retrieval step and the formation of the Prompt, and as always, good pipelines and a little experimentation usually help!
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
diff --git a/docs/use_cases.md b/docs/use_cases.md
new file mode 100644
index 00000000..cf6ec915
--- /dev/null
+++ b/docs/use_cases.md
@@ -0,0 +1,133 @@
+---
+layout: default
+title: Use Cases | llmware
+nav_order: 1
+description: llmware is an integrated framework with over 50 models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
+permalink: /use_cases
+---
+🚀 Use Cases Examples 🚀
+===============
+
+**End-to-End Scenarios**
+
+We provide several 'end-to-end' examples that show how to use LLMWare in a complex recipe combining different elements to accomplish a specific objective. While each example is high-level, it is shared in the spirit of providing a framework 'starting point' that can be developed in more detail for a variety of common use cases. All of these examples use small, specialized models, running locally - 'Small, but Mighty'!
+
+
+1.
[**Research Automation with Agents and Web Services**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/web_services_slim_fx.py)
+
+    - Prepare a 30-key research analysis on a company
+    - Extract key lookup and other information from an earnings press release
+    - Automatically use the lookup data for real-time stock information from YFinance
+    - Automatically use the lookup data for background company history information in Wikipedia
+    - Run LLM prompts to ask key questions of the Wikipedia sources
+    - Aggregate into a consolidated research analysis
+    - All with local open source models
+
+
+2. [**Invoice Processing**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/invoice_processing.py)
+
+    - Parse a batch of invoices (provided as sample files)
+    - Extract key information from the invoices
+    - Save the prompt state for follow-up review and analysis
+
+
+3. [**Analyzing and Extracting Voice Transcripts**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py)
+
+    - Voice transcription of 50+ WAV files of great speeches of the 20th century
+    - Run text queries against the transcribed WAV files
+    - Execute LLM agent inferences to extract and identify key elements of interest
+    - Prepare a 'bibliography' with the key extracted points, including time-stamps
+
+
+4. [**MSA Processing**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/msa_processing.py)
+
+    - Identify the termination provisions in Master Service Agreements among a larger batch of contracts
+    - Parse and query a large batch of contracts and identify the agreements with "Master Service Agreement" on the first page
+    - Find the termination provisions in each MSA
+    - Prompt the LLM to read the termination provisions and answer a key question
+    - Run a fact-check and source-check on the LLM response
+    - Save all of the responses in CSV and JSON for follow-up review
+
+
+5.
[**Querying a CSV**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/agent_with_custom_tables.py)
+
+    - Start running natural language queries on CSVs with Postgres and slim-sql-tool
+    - Load a sample 'customer_table.csv' into Postgres
+    - Start running natural language queries that get converted into SQL and query the DB
+
+
+6. [**Contract Analysis**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/contract_analysis_on_laptop_with_bling_models.py)
+
+    - Extract key information from a set of employment agreements
+    - Use a simple retrieval strategy with keyword search to identify key provisions and topic areas
+    - Prompt the LLM to read the key provisions and answer questions based on those source materials
+
+7. [**Slicing and Dicing Office Docs**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/slicing_and_dicing_office_docs.py)
+
+    - Shows a variety of advanced parsing techniques with Office document formats packaged in ZIP archives
+    - Extracts tables and images, runs OCR against the embedded images, exports the whole library, and creates a dataset
+
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
diff --git a/docs/videos.md b/docs/videos.md
new file mode 100644
index 00000000..50f0cd3d
--- /dev/null
+++ b/docs/videos.md
@@ -0,0 +1,116 @@
+---
+layout: default
+title: Videos | llmware
+nav_order: 1
+description: llmware is an integrated framework with over 50 models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
+permalink: /videos
+---
+llmware YouTube Video Channel
+===============
+
+**Tutorial Videos** - check out our YouTube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+ +🎬 **Some of our most recent videos** +- [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz) +- [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2) +- [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG) +- [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx) +- [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY) + +🎬 **Using Agents, Function Calls and SLIM models** +- [SLIMS Playlist](https://youtube.com/playlist?list=PL1-dn33KwsmAHWCWK6YjZrzicQ2yR6W8T&si=TSFGqQ3ObOO5vDde) +- [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe) +- [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C) +- [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY) +- [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP) +- [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s) +- [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK) +- [Extract Information from Earnings Releases](https://youtu.be/d6HFfyDk4YE?si=VmnIiWFmgBtR4DxS) +- [Summary Function Calls](https://youtu.be/yNg_KH5cPSk?si=Yl94tp_vKA8e7eT7) +- [Boolean Yes-No Function Calls](https://youtu.be/jZQZMMqAJXs?si=lU4YVI0H0tfc9k6e) +- [Autogenerate Topics, Tags and NER](https://youtu.be/N6oOxuyDsC4?si=vo2Fd8VG5xTbH4SD) + +🎬 **Using GGUF Models** +- [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE) +- [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0) +- [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow) +- [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E) +- 
[Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s)
+
+🎬 **Core RAG Scenarios Running Locally**
+- [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU)
+- [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s)
+- [Use small LLMs for RAG for Contract Analysis (feat. LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0)
+- [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s)
+- [Evaluate LLMs for RAG with LLMWare](https://www.youtube.com/watch?v=s0KWqYg5Buk&t=105s)
+- [Fast Start to RAG with LLMWare Open Source Library](https://www.youtube.com/watch?v=0naqpH93eEU)
+- [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw)
+
+
+🎬 **Parsing, Embedding, Data Pipelines and Extraction**
+- [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s)
+- [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q)
+- [Intro to Parsing and Text Chunking](https://youtu.be/2xDefZ4oBOM?si=YZzBUjDfQ0839EVF)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under an [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---