Commit

updated docs
deepaksood619 committed Aug 12, 2024
1 parent 4d302e8 commit c431f71
Showing 12 changed files with 158 additions and 24 deletions.
9 changes: 9 additions & 0 deletions docs/about-me/projects/48-rag-presentation.md
@@ -32,6 +32,15 @@ Share knowledge, expertise, and experiences using TensorFlow or any open-source
16. Final Reflections
17. Thanks

## Hackathon

- [GitHub - google-gemini/cookbook: Examples and guides for using the Gemini API.](https://github.com/google-gemini/cookbook)
- Go to https://aistudio.google.com/app/
- Get your own API Key
- Create API Key
- Create API Key in new project
- https://github.com/google-gemini/cookbook/blob/main/quickstarts/Prompting.ipynb

## Others

- rag architecture
2 changes: 2 additions & 0 deletions docs/about-me/projects/58-aws-gen-ai-hackathon.md
@@ -2,6 +2,8 @@

[Virtual Recruiter | GenAI - RAG - Google Slides](https://docs.google.com/presentation/d/1PL-uccbMAo21G0YkorF-RGwG5tqr-Sxm72lLsLggF_M/edit?usp=sharing)

[RAG Hackathon Questions](ai/llm/rag-hackathon-questions.md)

### Links

- [Visualizing Amazon SageMaker machine learning predictions with Amazon QuickSight AWS Machine Learning Blog](https://aws.amazon.com/blogs/machine-learning/making-machine-learning-predictions-in-amazon-quicksight-and-amazon-sagemaker/)
1 change: 1 addition & 0 deletions docs/ai/libraries/mlops-model-deployment.md
@@ -140,6 +140,7 @@ https://www.seldon.io

## Links

- [Ray | Faster Python](python/advanced/faster-python.md#ray)
- [Home - MLOps Community](https://mlops.community/)
- [GitHub - visenger/awesome-mlops: A curated list of references for MLOps](https://github.com/visenger/awesome-mlops)
- [GitHub - kelvins/awesome-mlops: :sunglasses: A curated list of awesome MLOps tools](https://github.com/kelvins/awesome-mlops)
12 changes: 5 additions & 7 deletions docs/ai/libraries/tools.md
@@ -65,13 +65,11 @@ https://github.com/horovod/horovod

The fastest way to build custom ML tools

https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace

https://github.com/streamlit/streamlit

https://www.freecodecamp.org/news/build-12-data-science-apps-with-python-and-streamlit

[Generative AI and Streamlit: A perfect match](https://blog.streamlit.io/generative-ai-and-streamlit-a-perfect-match/)
- [Streamlit • A faster way to build and share data apps](https://streamlit.io/cloud)
- https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace
- https://github.com/streamlit/streamlit
- https://www.freecodecamp.org/news/build-12-data-science-apps-with-python-and-streamlit
- [Generative AI and Streamlit: A perfect match](https://blog.streamlit.io/generative-ai-and-streamlit-a-perfect-match/)

## Metaflow

1 change: 1 addition & 0 deletions docs/ai/llm/libraries.md
@@ -202,6 +202,7 @@
- [**LPython**](https://github.com/lcompilers/lpython) - compiler that aggressively optimizes type-annotated Python code. It has several backends, including LLVM, C, C++, and WASM. LPython’s primary tenet is speed. [Launch blog post](https://lpython.org/blog/2023/07/lpython-novel-fast-retargetable-python-compiler/).
- [**Petals**](https://github.com/bigscience-workshop/petals) - Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading.
- [**TokenMonster**](https://github.com/alasdairforsythe/tokenmonster) - Determine the tokens that optimally represents a dataset at any specific vocabulary size
- [GitHub - microsoft/LLMLingua: To speed up LLMs' inference and enhance LLM's perceive of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.](https://github.com/microsoft/LLMLingua)

### Python Programming

10 changes: 5 additions & 5 deletions docs/ai/llm/llm-building.md
@@ -26,22 +26,22 @@

[Let’s Architect! Discovering Generative AI on AWS | AWS Architecture Blog](https://aws.amazon.com/blogs/architecture/lets-architect-generative-ai/)

## Others
## Building

- [GitHub - karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.](https://github.com/karpathy/nanoGPT)

![LLM Working](../../media/llm-working.jpg)

### How to train your ChatGPT
## How to train your ChatGPT

#### Stage 1: Pretraining
### Stage 1: Pretraining

1. Download ~10TB of text
2. Get a cluster of ~6,000 GPUs
3. Compress the text into a neural network, pay ~$2M, wait ~12 days
4. Obtain base model
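
The Stage 1 figures imply a rough price per GPU-hour. The sketch below is a back-of-the-envelope check only; it assumes every GPU runs 24 hours a day for the full 12 days and that the ~$2M covers compute alone, neither of which the notes state. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the Stage 1 figures quoted above.
# Assumption: all GPUs run 24 hours a day for the whole run.
NUM_GPUS = 6_000
DAYS = 12
TOTAL_COST_USD = 2_000_000


def gpu_hours(num_gpus: int, days: int) -> int:
    """Total GPU-hours consumed by the run."""
    return num_gpus * days * 24


def cost_per_gpu_hour(total_cost: float, hours: int) -> float:
    """Implied price of a single GPU-hour."""
    return total_cost / hours


if __name__ == "__main__":
    hours = gpu_hours(NUM_GPUS, DAYS)
    print(f"{hours:,} GPU-hours")
    print(f"${cost_per_gpu_hour(TOTAL_COST_USD, hours):.2f} per GPU-hour")
```

Under these assumptions the run works out to roughly 1.7 million GPU-hours, or a little over one dollar per GPU-hour.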

#### Stage 2: Finetuning
### Stage 2: Finetuning

1. Write labeling instructions
2. Hire people (or use scale.ai!), collect 100K high quality ideal Q&A responses, and/or comparisons
@@ -51,7 +51,7 @@
6. Deploy
7. Monitor, collect misbehaviors, go to step 1

### LLM Security
## LLM Security

- Jailbreaking
- Prompt injection
114 changes: 114 additions & 0 deletions docs/ai/llm/rag-hackathon-questions.md
@@ -0,0 +1,114 @@
# RAG Hackathon Questions

## Use case 1: Virtual recruiter

The virtual recruiter, powered by generative AI, promises to revolutionize the recruitment process by automating tasks and personalizing interactions for both candidates and recruiters. Here's how it could work:

### Key features

**Automated Resume Matching and Ranking:**

- **Semantic Matching:** Beyond word matching, the AI identifies semantic links between the candidate's profile and the job requirements, considering transferable skills and relevant experiences.
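
The ranking step can be sketched as cosine similarity between a job-description vector and each resume vector. This is only an illustration: `embed` is a hypothetical word-count stand-in, so it captures word overlap alone, and swapping in a real embedding model is what would make the match semantic. All names and sample texts below are invented.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_resumes(job_description: str, resumes: dict) -> list:
    """Return (candidate, score) pairs, best match first."""
    job_vec = embed(job_description)
    scores = {name: cosine(job_vec, embed(text)) for name, text in resumes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    job = "Python developer with machine learning, AWS experience"
    sample = {
        "candidate_a": "Built machine learning pipelines in Python on AWS",
        "candidate_b": "Managed restaurant staff and inventory",
    }
    for name, score in rank_resumes(job, sample):
        print(f"{name}: {score:.2f}")
```

Keeping `embed` as a separate function means the ranking code does not change when the stand-in is replaced by a model-backed embedding.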

**Personalized Interview Question Generation:**

- **Adaptive Questionnaires:** Leveraging information from resumes and job descriptions, the AI generates tailored interview questions specific to each candidate.

Overall, a virtual recruiter powered by generative AI can make the recruitment process more efficient, personalized, and equitable. By automating routine tasks, providing data-driven insights, and tailoring interactions, it can benefit both candidates and recruiters, ultimately leading to better hiring outcomes.

Data: Sample resumes

## Use case 2: Intelligent Insurance Claims Processing with Large Language Models

Build an AI-powered insurance claims processing application that leverages large language models (LLMs) to automate key tasks, improve accuracy, and expedite claim resolution. The application should:

1. Intelligent Data Extraction
    1. Use LLMs to intelligently extract relevant information from claims forms (PDFs, images, text)
    2. Handle variations in form formats, languages, and terminology
2. Policy Verification
    1. Ingest and comprehend insurance policy documents
    2. Cross-reference extracted claim details against policy terms and coverage
3. Initial Claim Decision
    1. Leverage LLM's natural language understanding to analyze claim validity
    2. Generate initial claim decisions (approve, deny, request more info) with rationale
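
The three stages above can be sketched as a small pipeline. This is a rough outline under stated assumptions, not a reference design: the regex extraction and the hard-coded rules are stand-ins for the LLM calls, and the `Claim` and `Policy` shapes, the form layout, and the decision rules are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    policy_id: str
    amount: float
    cause: str


@dataclass
class Policy:
    policy_id: str
    coverage_limit: float
    covered_causes: frozenset


def extract_claim(form_text: str) -> Optional[Claim]:
    """Stage 1 stand-in: regex extraction where a real system would call an LLM."""
    policy = re.search(r"Policy:\s*(\S+)", form_text)
    amount = re.search(r"Amount:\s*(\d+(?:\.\d+)?)", form_text)
    cause = re.search(r"Cause:\s*(\w+)", form_text)
    if not (policy and amount and cause):
        return None
    return Claim(policy.group(1), float(amount.group(1)), cause.group(1).lower())


def decide(claim: Optional[Claim], policies: dict) -> tuple:
    """Stages 2 and 3: check the claim against its policy, return (decision, rationale)."""
    if claim is None:
        return ("request more info", "required fields missing from the form")
    policy = policies.get(claim.policy_id)
    if policy is None:
        return ("deny", f"no policy found with id {claim.policy_id}")
    if claim.cause not in policy.covered_causes:
        return ("deny", f"cause '{claim.cause}' is not covered")
    if claim.amount > policy.coverage_limit:
        return ("request more info", "amount exceeds the coverage limit")
    return ("approve", "claim is within policy terms")


if __name__ == "__main__":
    policies = {"P-100": Policy("P-100", 5000.0, frozenset({"fire", "theft"}))}
    form = "Policy: P-100\nAmount: 1200.50\nCause: Fire"
    print(decide(extract_claim(form), policies))
```

Returning a rationale string alongside each decision mirrors the requirement that initial decisions come with an explanation.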

Data: Historical claims data, insurance policies, and claims forms.

## Use case 3: Financial Analysis and Trading Chatbot

Develop a chatbot that can ingest and analyze large amounts of financial data in multiple formats (e.g., PDFs, documents, images, videos) using a multimodal model, and provide detailed insights into market analysis, trends, company performance, and investment decisions.

1. Multimodal Data Ingestion
    1. Use state-of-the-art multimodal models to process text, images, PDFs, videos, and other data formats
    2. Create vector embeddings for efficient retrieval and analysis
2. Information Retrieval and Extraction
    1. Leverage retrieval-augmented generation (RAG) models to scan through large volumes of data
    2. Extract relevant information, insights, and sentiment related to companies, industries, and markets
3. Natural Language Interface
    1. Provide a conversational interface for users to ask questions and receive detailed analysis
    2. Support queries on market trends, company performance, investment opportunities, and more
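
The embed, retrieve, and prompt flow behind RAG can be sketched with an in-memory index. This is a minimal text-only illustration: `embed` is a hypothetical hashed bag-of-words stand-in for a real (multimodal) embedding model, `VectorIndex` does brute-force search instead of using a vector database, and the prompt wording is invented.

```python
import hashlib
import math
import re

DIM = 256  # size of the toy embedding vector


def embed(text: str) -> list:
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class VectorIndex:
    """In-memory index that stores (chunk, vector) pairs and searches by brute force."""

    def __init__(self) -> None:
        self.items = []

    def add(self, chunk: str) -> None:
        self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list:
        """Return the k chunks whose vectors have the highest dot product with the query."""
        q = embed(query)
        scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


def build_prompt(question: str, context: list) -> str:
    """Assemble the retrieved chunks and the question into one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


if __name__ == "__main__":
    index = VectorIndex()
    index.add("ACME reported second quarter revenue of ten million dollars")
    index.add("The weather in Paris was sunny all week")
    question = "What revenue did ACME report?"
    print(build_prompt(question, index.search(question, k=1)))
```

The resulting prompt would then be sent to the language model; that call is left out because it depends on the provider chosen.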

Data: Financial data in multiple formats, including news articles, market reports, and social media data.

## Use case 4: Intelligent Product Image Generation and Manipulation

Develop an AI-powered application that can understand product images, generate multiple relevant backgrounds, and manipulate the images based on customer prompts or requirements. The application should:

1. Product Image Understanding
    1. Use computer vision and multimodal models to analyze and comprehend the product in a given image
    2. Identify the product category, features, and relevant contextual information
2. Background Generation/Replacement
    1. Leverage generative AI models to create multiple background variations for the product image
    2. Ensure the generated backgrounds are contextually relevant and visually appealing
    3. For product images with existing backgrounds, provide the ability to replace the background with a new, relevant one
3. Customer Prompt-based Image Manipulation
    1. Allow customers to provide natural language prompts or instructions for modifying the product image
    2. Use multimodal models to understand the prompts and make the requested changes (e.g., change product color, add accessories, adjust lighting)

Data: Product images with and without backgrounds, and customer prompts for image updates.

## Use case 5: Legal assistant application

Scenario: Imagine a large corporation with mountains of legal documents, contracts, policies, and historical case information. Navigating this vast internal document dump can be time consuming and frustrating for legal professionals seeking answers to specific questions.

**Enter the Legal Assistant application.** This AI-powered tool empowers legal staff by transforming the document pile into an easily searchable and insightful knowledge base.

Key features:
- Data ingestion and processing
- Question and answering
- Sensitive information tagging while giving answers
- Contextualized results
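
Sensitive information tagging can be sketched as a post-processing pass over each generated answer. The two regex patterns and the `[LABEL]` tag format below are hypothetical examples only; a real deployment would need a vetted detector covering the categories that matter for its legal documents.

```python
import re

# Hypothetical patterns for illustration; not an exhaustive or vetted detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tag_sensitive(answer: str) -> tuple:
    """Replace sensitive spans with [LABEL] tags and report which labels fired."""
    found = []
    for label, pattern in PATTERNS.items():
        answer, count = pattern.subn(f"[{label}]", answer)
        if count:
            found.append(label)
    return answer, found


if __name__ == "__main__":
    text = "Contact jane.doe@example.com, SSN 123-45-6789."
    print(tag_sensitive(text))
```

Returning the list of labels alongside the tagged text lets the caller log or escalate answers that touched sensitive data.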

Overall, the Legal Assistant application empowers legal professionals by transforming internal documents into a valuable knowledge resource, streamlining workflows, and ensuring sensitive information remains protected.

## Use case 6: GenAI personalized recommender

Scenario: Sarah wants to buy a birthday gift for her husband, David, who loves sports. Instead of browsing through endless product categories, she uses a GenAI-powered product search feature.

Here's how GenAI personalizes the search:

- **Contextual understanding:** Based on the search intent, identify the right set of products
- **Personalised recommendations:** Based on the available product catalogue and the user's intent, build the list of recommendations
- **Interactive refinement:** Hold an interactive session to further refine the list of recommended products

## Use case 7: Intelligent Review Summarization for E-commerce Products

Develop an AI-powered application that can intelligently summarize online product reviews, providing users with a concise and insightful overview. The application should:

1. Review Data Ingestion
    1. Collect and ingest product reviews from various e-commerce platforms (Amazon, Walmart, Best Buy, etc.)
    2. Handle reviews in different languages and formats (text, images, videos)
2. Sentiment Analysis
    1. Use natural language processing (NLP) and multimodal models to analyze the sentiment of each review (positive, negative, neutral)
    2. Identify and categorize reviews based on sentiment scores
3. Key Aspect Extraction
    1. Extract and categorize key aspects or topics discussed in the reviews (e.g., product quality, features, customer service, value for money)
    2. Identify the most frequently mentioned aspects and their associated sentiments
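
The sentiment and aspect steps can be sketched with toy keyword lexicons. This is an illustration of the data flow only: the word lists, aspect names, and the rule that a review's overall sentiment is attributed to every aspect it mentions are assumptions, and a real system would use an NLP model or LLM for both steps.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicons; stand-ins for a real sentiment/aspect model.
POSITIVE = {"great", "good", "excellent", "fast", "love"}
NEGATIVE = {"bad", "poor", "slow", "broken", "hate"}
ASPECTS = {
    "quality": {"quality", "build", "durable"},
    "value": {"price", "value", "cheap", "expensive"},
    "service": {"service", "support", "delivery"},
}


def tokens(text: str) -> list:
    """Lowercase alphabetic tokens of a review."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment(review: str) -> str:
    """Label a review by counting positive minus negative lexicon hits."""
    words = tokens(review)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def aspect_summary(reviews: list) -> dict:
    """Count sentiment labels per aspect across all reviews."""
    summary = defaultdict(Counter)
    for review in reviews:
        label = sentiment(review)
        words = set(tokens(review))
        for aspect, keywords in ASPECTS.items():
            if words & keywords:
                summary[aspect][label] += 1
    return dict(summary)


if __name__ == "__main__":
    sample = [
        "Great build quality and fast delivery",
        "Poor support, slow delivery",
        "Price is okay",
    ]
    for aspect, counts in aspect_summary(sample).items():
        print(aspect, dict(counts))
```

The per-aspect counts are the structured input a summarization prompt could then turn into a short "Customers say" paragraph.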

Data: Sample online reviews like the one below.

Customers say - https://www.amazon.in/Duracell-Slimmest-Charging-Portable-Simultaneously/dp/B0BJV4L36G/ref=sr_1_1_sspa?crid=AGVPNZMYB98H&dib=eyJ2IjoiMSJ9.afXD21uOAWJpu5Vn7hGg9AMThE6sYsI8X_sV-ZHXm0g.GdugyvGsCPBTi9rx7eV4Q4NymOs27X9onUWQzLgf7yg&dib_tag=se&keywords=Duracell-Slimmest-Charging-Portable-Simultaneously&qid=1723463554&sprefix=duracell-slimmest-charging-portable-simultaneously%2Caps%2C206&sr=8-1-spons&sp_csd=d2lkZ2V0TmFtZT1zcF9hdGY&th=1

## Links

- [AWS Gen AI Hackathon](about-me/projects/58-aws-gen-ai-hackathon.md)
1 change: 1 addition & 0 deletions docs/ai/llm/readme.md
@@ -13,3 +13,4 @@
- [Libraries](ai/llm/libraries.md)
- [Tools](ai/llm/tools.md)
- [LLM FinTech Use Cases](ai/llm/fintech-use-cases.md)
- [RAG Hackathon Questions](ai/llm/rag-hackathon-questions.md)
2 changes: 2 additions & 0 deletions docs/computer-science/security/concepts.md
@@ -60,6 +60,8 @@ OrBAC model allows the policy designer to define a security policy independently

RBAC allows access based on the job title. RBAC largely eliminates discretion when providing access to objects. For example, a human resources specialist should not have permissions to create network accounts; this should be a role reserved for network administrators.

- RABAC: Role-Centric Attribute-Based Access Control

https://en.wikipedia.org/wiki/Role-based_access_control

### [Rule-Based Access Control](https://en.wikipedia.org/w/index.php?title=Rule-based_access_control&action=edit&redlink=1)(RAC)
2 changes: 2 additions & 0 deletions docs/management/marketing.md
@@ -401,6 +401,8 @@ Interstitial ads are full-screen ads that cover the interface of their host app.

https://developers.google.com/admob/android/interstitial

An **interstitial page** is ==a web page that appears before or after a desired content page, often for advertising or regulatory reasons==. Interstitial pages can be interactive pop-ups or full-page ads that float on a webpage or fill a mobile device's screen. They can appear when a user navigates to a page, unhides a tab or window, or clicks the browser's navigation bar.

[20 Years Of Marketing - 7 Most Important Lessons Learned](https://www.youtube.com/watch?v=VS4ECrG_0uM)

1. Start small, but look out for scale, okay? So here's what I mean by that. When we're thinking about scale, typically, I start off small. When I mean small, I'm talking not 5,000, not 10,000, I usually start off less than $1,000, even at our size, and I try to see what works. It doesn't mean I won't ramp up the next day to 10,000 or a 100,000, but I really try to start off small to try to figure out what works. Now, if I'm paying for services or hiring an agency, it's a little bit different because someone's creating a plan for me, and then executing on it. But if I'm doing it myself, I try to start off small because just because a channel or a tactic work for a competitor, it doesn't mean it works for me, so I try to start off small.
12 changes: 12 additions & 0 deletions docs/python/advanced/faster-python.md
@@ -38,3 +38,15 @@ https://towardsdatascience.com/memory-management-in-python-6bea0c8aecc9
https://strangemachines.io/articles/performant-python

https://blog.esciencecenter.nl/parallel-programming-in-python-7fd62c90217d

## Ray

A fast and simple framework for building and running distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

- https://ray.io
- https://github.com/ray-project/ray
- https://towardsdatascience.com/10x-faster-parallel-python-without-python-multiprocessing-e5017c93cce1
- https://towardsdatascience.com/modern-parallel-and-distributed-python-a-quick-tutorial-on-ray-99f8d70369b8
- [Ray Serve: Scalable and Programmable Serving — Ray 2.34.0](https://docs.ray.io/en/latest/serve/index.html)
- [Anyscale | Scalable Compute for AI and Python](https://www.anyscale.com/)
- [Scalable and Cost Efficient AI Workloads with AWS and Anyscale - YouTube](https://www.youtube.com/watch?v=pRiKZPk_-98)
16 changes: 4 additions & 12 deletions docs/python/documentation/17-concurrent-execution.md
@@ -107,18 +107,6 @@ https://www.machinelearningplus.com/python/parallel-processing-python

https://blog.floydhub.com/multiprocessing-vs-threading-in-python-what-every-data-scientist-needs-to-know

## Ray

A fast and simple framework for building and running distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

https://ray.io

https://github.com/ray-project/ray

https://towardsdatascience.com/10x-faster-parallel-python-without-python-multiprocessing-e5017c93cce1

https://towardsdatascience.com/modern-parallel-and-distributed-python-a-quick-tutorial-on-ray-99f8d70369b8

## Subprocess

#### Can be used to call other compiled programs
@@ -141,3 +129,7 @@ https://zacs.site/blog/linear-python.html
The [sched](https://docs.python.org/3/library/sched.html#module-sched) module defines a class which implements a general purpose event scheduler

https://docs.python.org/3/library/sched.html

## Links

[Ray | Faster Python](python/advanced/faster-python.md#ray)
