diff --git a/docs/about-me/projects/48-rag-presentation.md b/docs/about-me/projects/48-rag-presentation.md index 87ab4040def..01ef2d07730 100644 --- a/docs/about-me/projects/48-rag-presentation.md +++ b/docs/about-me/projects/48-rag-presentation.md @@ -32,6 +32,15 @@ Share knowledge, expertise, and experiences using TensorFlow or any open-source 16. Final Reflections 17. Thanks +## Hackathon + +- [GitHub - google-gemini/cookbook: Examples and guides for using the Gemini API.](https://github.com/google-gemini/cookbook) +- Go to https://aistudio.google.com/app/ +- Get your own API Key +- Create API Key +- Create API Key in new project +- https://github.com/google-gemini/cookbook/blob/main/quickstarts/Prompting.ipynb + ## Others - rag architecture diff --git a/docs/about-me/projects/58-aws-gen-ai-hackathon.md b/docs/about-me/projects/58-aws-gen-ai-hackathon.md index 04f5e248aff..b5b8ecd7bfb 100644 --- a/docs/about-me/projects/58-aws-gen-ai-hackathon.md +++ b/docs/about-me/projects/58-aws-gen-ai-hackathon.md @@ -2,6 +2,8 @@ [Virtual Recruiter | GenAI - RAG - Google Slides](https://docs.google.com/presentation/d/1PL-uccbMAo21G0YkorF-RGwG5tqr-Sxm72lLsLggF_M/edit?usp=sharing) +[RAG Hackathon Questions](ai/llm/rag-hackathon-questions.md) + ### Links - [Visualizing Amazon SageMaker machine learning predictions with Amazon QuickSight AWS Machine Learning Blog](https://aws.amazon.com/blogs/machine-learning/making-machine-learning-predictions-in-amazon-quicksight-and-amazon-sagemaker/) diff --git a/docs/ai/libraries/mlops-model-deployment.md b/docs/ai/libraries/mlops-model-deployment.md index 813f2c735cc..4250ae13b3d 100755 --- a/docs/ai/libraries/mlops-model-deployment.md +++ b/docs/ai/libraries/mlops-model-deployment.md @@ -140,6 +140,7 @@ https://www.seldon.io ## Links +- [Ray | Faster Python](python/advanced/faster-python.md#ray) - [Home - MLOps Community](https://mlops.community/) - [GitHub - visenger/awesome-mlops: A curated list of references for 
MLOps](https://github.com/visenger/awesome-mlops) - [GitHub - kelvins/awesome-mlops: :sunglasses: A curated list of awesome MLOps tools](https://github.com/kelvins/awesome-mlops) diff --git a/docs/ai/libraries/tools.md b/docs/ai/libraries/tools.md index 8a878e40220..13807aa68b7 100755 --- a/docs/ai/libraries/tools.md +++ b/docs/ai/libraries/tools.md @@ -65,13 +65,11 @@ https://github.com/horovod/horovod The fastest way to build custom ML tools -https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace - -https://github.com/streamlit/streamlit - -https://www.freecodecamp.org/news/build-12-data-science-apps-with-python-and-streamlit - -[Generative AI and Streamlit: A perfect match](https://blog.streamlit.io/generative-ai-and-streamlit-a-perfect-match/) +- [Streamlit • A faster way to build and share data apps](https://streamlit.io/cloud) +- https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace +- https://github.com/streamlit/streamlit +- https://www.freecodecamp.org/news/build-12-data-science-apps-with-python-and-streamlit +- [Generative AI and Streamlit: A perfect match](https://blog.streamlit.io/generative-ai-and-streamlit-a-perfect-match/) ## Metaflow diff --git a/docs/ai/llm/libraries.md b/docs/ai/llm/libraries.md index 591e6fc296f..9780070f05e 100644 --- a/docs/ai/llm/libraries.md +++ b/docs/ai/llm/libraries.md @@ -202,6 +202,7 @@ - [**LPython**](https://github.com/lcompilers/lpython) - compiler that aggressively optimizes type-annotated Python code. It has several backends, including LLVM, C, C++, and WASM. LPython’s primary tenet is speed. [Launch blog post](https://lpython.org/blog/2023/07/lpython-novel-fast-retargetable-python-compiler/). - [**Petals**](https://github.com/bigscience-workshop/petals) - Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading. 
- [**TokenMonster**](https://github.com/alasdairforsythe/tokenmonster) - Determine the tokens that optimally represents a dataset at any specific vocabulary size +- [GitHub - microsoft/LLMLingua: To speed up LLMs' inference and enhance LLM's perceive of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.](https://github.com/microsoft/LLMLingua) ### Python Programming diff --git a/docs/ai/llm/llm-building.md b/docs/ai/llm/llm-building.md index dc5d38f3f3b..748b7c5acb1 100644 --- a/docs/ai/llm/llm-building.md +++ b/docs/ai/llm/llm-building.md @@ -26,22 +26,22 @@ [Let’s Architect! Discovering Generative AI on AWS | AWS Architecture Blog](https://aws.amazon.com/blogs/architecture/lets-architect-generative-ai/) -## Others +## Building - [GitHub - karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.](https://github.com/karpathy/nanoGPT) ![LLM Working](../../media/llm-working.jpg) -### How to train your ChatGPT +## How to train your ChatGPT -#### Stage 1: Pretraining +### Stage 1: Pretraining 1. Download ~10TB of text 2. Get a cluster of ~6,000 GPUs 3. Compress the text into a neural network, pay ~$2M, wait ~12 days 4. Obtain base model -#### Stage 2: Finetuning +### Stage 2: Finetuning 1. Write labeling instructions 2. Hire people (or use scale.ai!), collect 100K high quality ideal Q&A responses, and/or comparisons @@ -51,7 +51,7 @@ 6. Deploy 7. 
Monitor, collect misbehaviors, go to step 1 -### LLM Security +## LLM Security - Jailbreaking - Prompt injection diff --git a/docs/ai/llm/rag-hackathon-questions.md b/docs/ai/llm/rag-hackathon-questions.md new file mode 100644 index 00000000000..082ec53ff90 --- /dev/null +++ b/docs/ai/llm/rag-hackathon-questions.md @@ -0,0 +1,114 @@ +# RAG Hackathon Questions + +## Use case 1: Virtual recruiter + +The virtual recruiter, powered by generative AI, promises to revolutionize the recruitment process by automating tasks and personalizing interactions for both candidates and recruiters. Here's how it could work: + +### Key features + +**Automated Resume Matching and Ranking:** + +- **Semantic Matching:** Beyond word matching, the AI identifies semantic links between the candidate's profile and the job requirements, considering transferable skills and relevant experiences. + +**Personalized Interview Question Generation:** + +- **Adaptive Questionnaires:** Leveraging information from resumes and job descriptions, the AI generates tailored interview questions specific to each candidate. + +Overall, a virtual recruiter powered by generative AI can make the recruitment process more efficient, personalized, and equitable. By automating routine tasks, providing data-driven insights, and tailoring interactions, it can benefit both candidates and recruiters, ultimately leading to better hiring outcomes. + +Data: Sample resumes + +## Use case 2: Intelligent Insurance Claims Processing with Large Language Models + +Build an AI-powered insurance claims processing application that leverages large language models (LLMs) to automate key tasks, improve accuracy, and expedite claim resolution. The application should: + +1. Intelligent Data Extraction + 1. Use LLMs to intelligently extract relevant information from claims forms (PDFs, images, text) + 2. Handle variations in form formats, languages, and terminology +2. Policy Verification + 1. 
Ingest and comprehend insurance policy documents + 2. Cross-reference extracted claim details against policy terms and coverage +3. Initial Claim Decision + 1. Leverage the LLM's natural language understanding to analyze claim validity + 2. Generate initial claim decisions (approve, deny, request more info) with rationale + +Data: Historical claims data, insurance policies, and claims forms. + +## Use case 3: Financial Analysis and Trading Chatbot + +Develop a chatbot that can ingest and analyze large amounts of financial data in multiple formats (e.g., PDF, documents, images, videos) using a multimodal model, and provide detailed insights into market analysis, trends, company performance, and investment decisions. + +1. Multimodal Data Ingestion + 1. Use state-of-the-art multimodal models to process text, images, PDFs, videos, and other data formats + 2. Create vector embeddings for efficient retrieval and analysis +2. Information Retrieval and Extraction + 1. Leverage retrieval-augmented generation (RAG) models to scan through large volumes of data + 2. Extract relevant information, insights, and sentiment related to companies, industries, and markets +3. Natural Language Interface + 1. Provide a conversational interface for users to ask questions and receive detailed analysis + 2. Support queries on market trends, company performance, investment opportunities, and more + +Data: Financial data in multiple formats, including news articles, market reports, and social media data. + +## Use case 4: Intelligent Product Image Generation and Manipulation + +Develop an AI-powered application that can understand product images, generate multiple relevant backgrounds, and manipulate the images based on customer prompts or requirements. The application should: + +1. Product Image Understanding + 1. Use computer vision and multimodal models to analyze and comprehend the product in a given image + 2. 
Identify the product category, features, and relevant contextual information +2. Background Generation/Replacement + 1. Leverage generative AI models to create multiple background variations for the product image + 2. Ensure the generated backgrounds are contextually relevant and visually appealing + 3. For product images with existing backgrounds, provide the ability to replace the background with a new, relevant one +3. Customer Prompt-based Image Manipulation + 1. Allow customers to provide natural language prompts or instructions for modifying the product image + 2. Use multimodal models to understand the prompts and make the requested changes (e.g., change product color, add accessories, adjust lighting) + +Data: Product images with and without backgrounds, and customer prompts for image updates. + +## Use case 5: Legal assistant application + +Scenario: Imagine a large corporation with mountains of legal documents, contracts, policies, and historical case information. Navigating this vast internal document dump can be time-consuming and frustrating for legal professionals seeking answers to specific questions. + +**Enter the Legal Assistant application.** This AI-powered tool empowers legal staff by transforming the document pile into an easily searchable and insightful knowledge base. + +Key features: +- Data ingestion and processing +- Question and answering +- Sensitive information tagging while giving answers +- Contextualized results + +Overall, the Legal Assistant application empowers legal professionals by transforming internal documents into a valuable knowledge resource, streamlining workflows, and ensuring sensitive information remains protected. + +## Use case 6: GenAI personalized recommender + +Scenario: Sarah wants to buy a birthday gift for her husband, David, who loves sports. Instead of browsing through endless product categories, she uses a GenAI-powered product search feature. 
+ +Here's how GenAI personalizes the search: + +- **Contextual understanding:** Based on the search intent, identify the right set of products +- **Personalised recommendations:** Based on the available product catalogue and user intent, make the list of recommendations +- Have an interactive session to further refine the list of recommended products. + +## Use case 7: Intelligent Review Summarization for E-commerce Products + +Develop an AI-powered application that can intelligently summarize online product reviews, providing users with a concise and insightful overview. The application should: + +1. Review Data Ingestion + 1. Collect and ingest product reviews from various e-commerce platforms (Amazon, Walmart, Best Buy, etc.) + 2. Handle reviews in different languages and formats (text, images, videos) +2. Sentiment Analysis + 1. Use natural language processing (NLP) and multimodal models to analyze the sentiment of each review (positive, negative, neutral) + 2. Identify and categorize reviews based on sentiment scores +3. Key Aspect Extraction + 1. Extract and categorize key aspects or topics discussed in the reviews (e.g., product quality, features, customer service, value for money) + 2. Identify the most frequently mentioned aspects and their associated sentiments + +Data: Sample online reviews like the one below. 
+ +Customers say - https://www.amazon.in/Duracell-Slimmest-Charging-Portable-Simultaneously/dp/B0BJV4L36G/ref=sr_1_1_sspa?crid=AGVPNZMYB98H&dib=eyJ2IjoiMSJ9.afXD21uOAWJpu5Vn7hGg9AMThE6sYsI8X_sV-ZHXm0g.GdugyvGsCPBTi9rx7eV4Q4NymOs27X9onUWQzLgf7yg&dib_tag=se&keywords=Duracell-Slimmest-Charging-Portable-Simultaneously&qid=1723463554&sprefix=duracell-slimmest-charging-portable-simultaneously%2Caps%2C206&sr=8-1-spons&sp_csd=d2lkZ2V0TmFtZT1zcF9hdGY&th=1 + +## Links + +- [AWS Gen AI Hackathon](about-me/projects/58-aws-gen-ai-hackathon.md) diff --git a/docs/ai/llm/readme.md b/docs/ai/llm/readme.md index 428dd706f58..d49690abf65 100644 --- a/docs/ai/llm/readme.md +++ b/docs/ai/llm/readme.md @@ -13,3 +13,4 @@ - [Libraries](ai/llm/libraries.md) - [Tools](ai/llm/tools.md) - [LLM FinTech Use Cases](ai/llm/fintech-use-cases.md) +- [RAG Hackathon Questions](ai/llm/rag-hackathon-questions.md) diff --git a/docs/computer-science/security/concepts.md b/docs/computer-science/security/concepts.md index 97c1a6ecb8f..d241338556e 100755 --- a/docs/computer-science/security/concepts.md +++ b/docs/computer-science/security/concepts.md @@ -60,6 +60,8 @@ OrBAC model allows the policy designer to define a security policy independently RBAC allows access based on the job title. RBAC largely eliminates discretion when providing access to objects. For example, a human resources specialist should not have permissions to create network accounts; this should be a role reserved for network administrators. 
+- RABAC: Role-Centric Attribute-Based Access Control + https://en.wikipedia.org/wiki/Role-based_access_control ### [Rule-Based Access Control](https://en.wikipedia.org/w/index.php?title=Rule-based_access_control&action=edit&redlink=1)(RAC) diff --git a/docs/management/marketing.md b/docs/management/marketing.md index 69f1b4bbf56..0f9bc9b939c 100755 --- a/docs/management/marketing.md +++ b/docs/management/marketing.md @@ -401,6 +401,8 @@ Interstitial ads are full-screen ads that cover the interface of their host app. https://developers.google.com/admob/android/interstitial +An **interstitial page** is ==a web page that appears before or after a desired content page, often for advertising or regulatory reasons==. Interstitial pages can be interactive pop-ups or full-page ads that float on a webpage or fill a mobile device's screen. They can appear when a user navigates to a page, unhides a tab or window, or clicks the browser's navigation bar. + [20 Years Of Marketing - 7 Most Important Lessons Learned](https://www.youtube.com/watch?v=VS4ECrG_0uM) 1. Start small, but look out for scale, okay? So here's what I mean by that. When we're thinking about scale, typically, I start off small. When I mean small, I'm talking not 5,000, not 10,000, I usually start off less than $1,000, even at our size, and I try to see what works. It doesn't mean I won't ramp up the next day to 10,000 or a 100,000, but I really try to start off small to try to figure out what works. Now, if I'm paying for services or hiring an agency, it's a little bit different because someone's creating a plan for me, and then executing on it. But if I'm doing it myself, I try to start off small because just because a channel or a tactic work for a competitor, it doesn't mean it works for me, so I try to start off small. 
diff --git a/docs/python/advanced/faster-python.md b/docs/python/advanced/faster-python.md index 9896bbd5c15..a735d8b58f4 100755 --- a/docs/python/advanced/faster-python.md +++ b/docs/python/advanced/faster-python.md @@ -38,3 +38,15 @@ https://towardsdatascience.com/memory-management-in-python-6bea0c8aecc9 https://strangemachines.io/articles/performant-python https://blog.esciencecenter.nl/parallel-programming-in-python-7fd62c90217d + +## Ray + +A fast and simple framework for building and running distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library. + +- https://ray.io +- https://github.com/ray-project/ray +- https://towardsdatascience.com/10x-faster-parallel-python-without-python-multiprocessing-e5017c93cce1 +- https://towardsdatascience.com/modern-parallel-and-distributed-python-a-quick-tutorial-on-ray-99f8d70369b8 +- [Ray Serve: Scalable and Programmable Serving — Ray 2.34.0](https://docs.ray.io/en/latest/serve/index.html) +- [Anyscale | Scalable Compute for AI and Python](https://www.anyscale.com/) +- [Scalable and Cost Efficient AI Workloads with AWS and Anyscale - YouTube](https://www.youtube.com/watch?v=pRiKZPk_-98) diff --git a/docs/python/documentation/17-concurrent-execution.md b/docs/python/documentation/17-concurrent-execution.md index 5c4b24ada99..7ac579a984f 100755 --- a/docs/python/documentation/17-concurrent-execution.md +++ b/docs/python/documentation/17-concurrent-execution.md @@ -107,18 +107,6 @@ https://www.machinelearningplus.com/python/parallel-processing-python https://blog.floydhub.com/multiprocessing-vs-threading-in-python-what-every-data-scientist-needs-to-know -## Ray - -A fast and simple framework for building and running distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library. 
- -https://ray.io - -https://github.com/ray-project/ray - -https://towardsdatascience.com/10x-faster-parallel-python-without-python-multiprocessing-e5017c93cce1 - -https://towardsdatascience.com/modern-parallel-and-distributed-python-a-quick-tutorial-on-ray-99f8d70369b8 - ## Subprocess #### Can be used to call other compiled programs @@ -141,3 +129,7 @@ https://zacs.site/blog/linear-python.html The [sched](https://docs.python.org/3/library/sched.html#module-sched) module defines a class which implements a general purpose event scheduler https://docs.python.org/3/library/sched.html + +## Links + +[Ray | Faster Python](python/advanced/faster-python.md#ray)