Add proxy support and enhance gnewsdecoder functionality

SSujitX · Jan 18, 2025 · d7bc07d
1 parent 836bac1
Showing 11 changed files with 383 additions and 95 deletions.
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
@@ -1,39 +1,41 @@
-# This workflow will upload a Python Package using Twine when a release is created
-# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
-
-# This workflow uses actions that are not certified by GitHub.
-# They are provided by a third-party and are governed by
-# separate terms of service, privacy policy, and support
-# documentation.
-
 name: Upload Python Package
 
 on:
+  push:
+    tags:
+      - "[0-9]+.[0-9]+.[0-9]+"
+      - "[0-9]+.[0-9]+.[0-9]+a[0-9]+"
+      - "[0-9]+.[0-9]+.[0-9]+b[0-9]+"
+      - "[0-9]+.[0-9]+.[0-9]+rc[0-9]+"
+    branches:
+      - main
   release:
     types: [published]
-
 permissions:
   contents: read
 
 jobs:
   deploy:
-
     runs-on: ubuntu-latest
 
     steps:
-    - uses: actions/checkout@v4
-    - name: Set up Python
-      uses: actions/setup-python@v3
-      with:
-        python-version: '3.x'
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install build
-    - name: Build package
-      run: python -m build
-    - name: Publish package
-      uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
-      with:
-        user: __token__
-        password: ${{ secrets.PYPI_API_TOKEN }}
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v3
+        with:
+          python-version: "3.12"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build
+
+      - name: Build package
+        run: python -m build
+
+      - name: Publish package
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          user: __token__
+          password: ${{ secrets.PYPI_API_TOKEN }}
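
The four tag filters added under `on.push.tags` can be checked locally before a tag is pushed. The sketch below approximates them with Python regular expressions. It assumes that in GitHub's filter syntax `.` is a literal dot and `+` repeats the preceding character class, so each filter maps onto an anchored regex. The helper name `tag_triggers_publish` is hypothetical and not part of the repository, and the sketch covers only the tag filters, not the `branches` or `release` triggers.

```python
import re

# Regex approximations of the tag filters in python-publish.yml.
# Assumption: "." is literal and "+" means "one or more of the preceding
# character class" in the workflow's filter syntax.
TAG_PATTERNS = [
    r"[0-9]+\.[0-9]+\.[0-9]+",          # final release, e.g. 0.1.7
    r"[0-9]+\.[0-9]+\.[0-9]+a[0-9]+",   # alpha, e.g. 0.1.7a1
    r"[0-9]+\.[0-9]+\.[0-9]+b[0-9]+",   # beta, e.g. 0.1.7b2
    r"[0-9]+\.[0-9]+\.[0-9]+rc[0-9]+",  # release candidate, e.g. 0.1.7rc1
]


def tag_triggers_publish(tag: str) -> bool:
    """Return True if the tag fully matches one of the approximated filters."""
    return any(re.fullmatch(pattern, tag) for pattern in TAG_PATTERNS)


if __name__ == "__main__":
    for candidate in ["0.1.7", "0.1.7rc1", "v0.1.7", "0.1"]:
        print(candidate, tag_triggers_publish(candidate))
```

Under these assumptions a tag with a `v` prefix, such as `v0.1.7`, would not match any of the filters.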
diff --git a/.gitignore b/.gitignore
@@ -1,18 +1,16 @@
-# Ignore Python bytecode
 __pycache__/
 *.pyc
 *.pyo
 
-# Ignore macOS system files
 .DS_Store
 .venv
 .ignore
 .backup_readme.md
 .backup_readme2.md
 
-# Ignore build artifacts
 dist/
 build/
 
-# Ignore egg info
 *.egg-info/
+
+test.py
diff --git a/README.md b/README.md
@@ -1,32 +1,29 @@
 [![PyPI version](https://badge.fury.io/py/googlenewsdecoder.svg)](https://badge.fury.io/py/googlenewsdecoder)
-[![Python Versions](https://img.shields.io/badge/python-3.9-blue)](https://pypi.org/project/facebook-pages-scraper/)
+[![Python Versions](https://img.shields.io/badge/python-3.9%20|%203.10%20|%203.11%20|%203.12%20|%203.13-blue)](https://pypi.org/project/googlenewsdecoder/)
 [![Downloads](https://static.pepy.tech/badge/googlenewsdecoder)](https://pepy.tech/project/googlenewsdecoder)
-[![Downloads](https://static.pepy.tech/badge/googlenewsdecoder/month)](https://pepy.tech/project/googlenewsdecoder)
 [![Downloads](https://static.pepy.tech/badge/googlenewsdecoder/week)](https://pepy.tech/project/googlenewsdecoder)
 
 # Google News Decoder
 
 Google News Decoder is a Python package that can decode Google News links or Google News URLs to their original URLs. It is a simple tool that saves you time and effort. If you find it useful, please support the package by hitting the star on GitHub. Your support helps keep the project going!
 
-[Pypi Package](https://pypi.org/project/googlenewsdecoder/)
-
 ## Update
 
-- Version 0.1.6:
-
-  - Improved: Enhanced error handling with a fallback mechanism for decoding parameters.
-  - Refined: Optimized get_decoding_params to try decoding via https://news.google.com/articles first, falling back to https://news.google.com/rss/articles if needed
-  - Updated: Reduced occurrences of HTTP 429 (Too Many Requests).
-  - Removed: Logging functionality for a cleaner codebase.
-  - Fixed: Resolved time delay issue between requests.
+- **Version 0.1.7**:
+  - **New Feature**: Added **proxy support** to handle rate limiting and bypass restrictions.
+  - **Improved**: Enhanced error handling with a fallback mechanism for decoding parameters.
+  - **Refined**: Optimized `get_decoding_params` to try decoding via `https://news.google.com/articles` first, falling back to `https://news.google.com/rss/articles` if needed.
+  - **Updated**: Reduced occurrences of HTTP 429 (Too Many Requests).
+  - **Removed**: Logging functionality for a cleaner codebase.
+  - **Fixed**: Resolved time delay issue between requests.
 
 ## Demo
 
 ![Google News Decoder](https://github.com/user-attachments/assets/3a3c3279-1c54-4e19-96cb-6f22f889aa2a)
 
 ## Installation
 
-- You can install this package using pip:
+You can install this package using pip:
 
 ```sh
 pip install googlenewsdecoder
@@ -38,31 +35,70 @@ pip install googlenewsdecoder
 pip install googlenewsdecoder --upgrade
 ```
 
+## Supported Proxy Formats
+
+- **HTTP/HTTPS Proxy**:
+
+  - **With authentication**: `http://user:pass@host:port` or `https://user:pass@host:port`
+  - **Without authentication**: `http://host:port` or `https://host:port`
+
+- **SOCKS5 Proxy**:
+
+  - **With authentication**: `socks5://user:pass@host:port`
+  - **Without authentication**: `socks5://host:port`
+
+- **IP and Port Only**:
+  - **HTTP**: `http://127.0.0.1:8080`
+  - **SOCKS5**: `socks5://127.0.0.1:1080`
+
 ## Usage
 
 Here is an example of how to use this package with different decoders:
 
-### Using new_decoderv1
+### Using gnewsdecoder
 
 ```python
-from googlenewsdecoder import new_decoderv1
+from googlenewsdecoder import gnewsdecoder
 
 def main():
-
-    interval_time = 5 # default interval is 1 sec, if not specified
+    interval_time = 1  # interval is optional, default is None
 
     source_url = "https://news.google.com/read/CBMi2AFBVV95cUxPd1ZCc1loODVVNHpnbFFTVHFkTG94eWh1NWhTeE9yT1RyNTRXMVV2S1VIUFM3ZlVkVjl6UHh3RkJ0bXdaTVRlcHBjMWFWTkhvZWVuM3pBMEtEdlllRDBveGdIUm9GUnJ4ajd1YWR5cWs3VFA5V2dsZnY1RDZhVDdORHRSSE9EalF2TndWdlh4bkJOWU5UMTdIV2RCc285Q2p3MFA4WnpodUNqN1RNREMwa3d5T2ZHS0JlX0MySGZLc01kWDNtUEkzemtkbWhTZXdQTmdfU1JJaXY?hl=en-US&gl=US&ceid=US%3Aen"
 
     try:
-        decoded_url = new_decoderv1(source_url, interval=interval_time)
+        decoded_url = gnewsdecoder(source_url, interval=interval_time)
+
         if decoded_url.get("status"):
             print("Decoded URL:", decoded_url["decoded_url"])
         else:
             print("Error:", decoded_url["message"])
     except Exception as e:
         print(f"Error occurred: {e}")
 
-    # Output: decoded_urls - [{'status': True, 'decoded_url': 'https://healthdatamanagement.com/articles/empowering-the-quintuple-aim-embracing-an-essential-architecture/'}]
+if __name__ == "__main__":
+    main()
+```
+
+### Using gnewsdecoder with proxy
+
+```python
+from googlenewsdecoder import gnewsdecoder
+
+def main():
+    interval_time = 1  # interval is optional, default is None
+    proxy = "http://user:pass@localhost:8080" # proxy is optional, default is None
+
+    source_url = "https://news.google.com/read/CBMi2AFBVV95cUxPd1ZCc1loODVVNHpnbFFTVHFkTG94eWh1NWhTeE9yT1RyNTRXMVV2S1VIUFM3ZlVkVjl6UHh3RkJ0bXdaTVRlcHBjMWFWTkhvZWVuM3pBMEtEdlllRDBveGdIUm9GUnJ4ajd1YWR5cWs3VFA5V2dsZnY1RDZhVDdORHRSSE9EalF2TndWdlh4bkJOWU5UMTdIV2RCc285Q2p3MFA4WnpodUNqN1RNREMwa3d5T2ZHS0JlX0MySGZLc01kWDNtUEkzemtkbWhTZXdQTmdfU1JJaXY?hl=en-US&gl=US&ceid=US%3Aen"
+
+    try:
+        decoded_url = gnewsdecoder(source_url, interval=interval_time, proxy=proxy)
+
+        if decoded_url.get("status"):
+            print("Decoded URL:", decoded_url["decoded_url"])
+        else:
+            print("Error:", decoded_url["message"])
+    except Exception as e:
+        print(f"Error occurred: {e}")
 
 if __name__ == "__main__":
     main()
@@ -71,54 +107,53 @@ if __name__ == "__main__":
 ### Using a for loop to decode multiple URLs
 
 ```python
-from googlenewsdecoder import new_decoderv1
+from googlenewsdecoder import gnewsdecoder
 
 def main():
+    interval_time = 1  # interval is optional, default is None
 
-    interval_time = 5 # default interval is None, if not specified
-
-    source_urls = ["https://news.google.com/read/CBMilgFBVV95cUxOM0JJaFRwV2dqRDk5dEFpWmF1cC1IVml5WmVtbHZBRXBjZHBfaUsyalRpa1I3a2lKM1ZnZUI4MHhPU2sydi1nX3JrYU0xWjhLaHNfU0N6cEhOYVE2TEptRnRoZGVTU3kzZGJNQzc2aDZqYjJOR0xleTdsemdRVnJGLTVYTEhzWGw4Z19lR3AwR0F1bXlyZ0HSAYwBQVVfeXFMTXlLRDRJUFN5WHg3ZTI0X1F4SjN6bmFIck1IaGxFVVZyOFQxdk1JT3JUbl91SEhsU0NpQzkzRFdHSEtjVGhJNzY4ZTl6eXhESUQ3XzdWVTBGOGgwSmlXaVRmU3BsQlhPVjV4VWxET3FQVzJNbm5CUDlUOHJUTExaME5YbjZCX1NqOU9Ta3U?hl=en-US&gl=US&ceid=US%3Aen","https://news.google.com/read/CBMiiAFBVV95cUxQOXZLdC1hSzFqQVVLWGJVZzlPaDYyNjdWTURScV9BbVp0SWhFNzZpSWZxSzdhc0tKbVlHMU13NmZVOFdidFFkajZPTm9SRnlZMWFRZ01CVHh0dXU0TjNVMUxZNk9Ibk5DV3hrYlRiZ20zYkIzSFhMQVVpcTFPc00xQjhhcGV1aXM00gF_QVVfeXFMTmtFQXMwMlY1el9WY0VRWEh5YkxXbHF0SjFLQVByNk1xS3hpdnBuUDVxOGZCQXl1QVFXaUVpbk5lUGgwRVVVT25tZlVUVWZqQzc4cm5MSVlfYmVlclFTOUFmTHF4eTlfemhTa2JKeG14bmNabENkSmZaeHB4WnZ5dw?hl=en-US&gl=US&ceid=US%3Aen"]
+    source_urls = [
+        "https://news.google.com/read/CBMilgFBVV95cUxOM0JJaFRwV2dqRDk5dEFpWmF1cC1IVml5WmVtbHZBRXBjZHBfaUsyalRpa1I3a2lKM1ZnZUI4MHhPU2sydi1nX3JrYU0xWjhLaHNfU0N6cEhOYVE2TEptRnRoZGVTU3kzZGJNQzc2aDZqYjJOR0xleTdsemdRVnJGLTVYTEhzWGw4Z19lR3AwR0F1bXlyZ0HSAYwBQVVfeXFMTXlLRDRJUFN5WHg3ZTI0X1F4SjN6bmFIck1IaGxFVVZyOFQxdk1JT3JUbl91SEhsU0NpQzkzRFdHSEtjVGhJNzY4ZTl6eXhESUQ3XzdWVTBGOGgwSmlXaVRmU3BsQlhPVjV4VWxET3FQVzJNbm5CUDlUOHJUTExaME5YbjZCX1NqOU9Ta3U?hl=en-US&gl=US&ceid=US%3Aen",
+        "https://news.google.com/read/CBMiiAFBVV95cUxQOXZLdC1hSzFqQVVLWGJVZzlPaDYyNjdWTURScV9BbVp0SWhFNzZpSWZxSzdhc0tKbVlHMU13NmZVOFdidFFkajZPTm9SRnlZMWFRZ01CVHh0dXU0TjNVMUxZNk9Ibk5DV3hrYlRiZ20zYkIzSFhMQVVpcTFPc00xQjhhcGV1aXM00gF_QVVfeXFMTmtFQXMwMlY1el9WY0VRWEh5YkxXbHF0SjFLQVByNk1xS3hpdnBuUDVxOGZCQXl1QVFXaUVpbk5lUGgwRVVVT25tZlVUVWZqQzc4cm5MSVlfYmVlclFTOUFmTHF4eTlfemhTa2JKeG14bmNabENkSmZaeHB4WnZ5dw?hl=en-US&gl=US&ceid=US%3Aen"
+    ]
 
     for url in source_urls:
         try:
-            decoded_url = new_decoderv1(url, interval=interval_time)
+            decoded_url = gnewsdecoder(url, interval=interval_time)
             if decoded_url.get("status"):
                 print("Decoded URL:", decoded_url["decoded_url"])
             else:
                 print("Error:", decoded_url["message"])
         except Exception as e:
             print(f"Error occurred: {e}")
 
-    # Output: decoded_url - {'status': True, 'decoded_url': 'https://healthdatamanagement.com/articles/empowering-the-quintuple-aim-embracing-an-essential-architecture/'}
-
-
 if __name__ == "__main__":
     main()
 ```
 
-
-
-### Using a proxy to deal with rate limiting
+### Using a for loop to decode multiple URLs with Proxy
 
 ```python
-from googlenewsdecoder import new_decoderv1
+from googlenewsdecoder import gnewsdecoder
 
 def main():
+    interval_time = 1  # interval is optional, default is None
+    proxy = "http://user:pass@localhost:8080" # proxy is optional, default is None
 
-    interval_time = 5 # default interval is 1 sec, if not specified
-
-    source_url = "https://news.google.com/read/CBMi2AFBVV95cUxPd1ZCc1loODVVNHpnbFFTVHFkTG94eWh1NWhTeE9yT1RyNTRXMVV2S1VIUFM3ZlVkVjl6UHh3RkJ0bXdaTVRlcHBjMWFWTkhvZWVuM3pBMEtEdlllRDBveGdIUm9GUnJ4ajd1YWR5cWs3VFA5V2dsZnY1RDZhVDdORHRSSE9EalF2TndWdlh4bkJOWU5UMTdIV2RCc285Q2p3MFA4WnpodUNqN1RNREMwa3d5T2ZHS0JlX0MySGZLc01kWDNtUEkzemtkbWhTZXdQTmdfU1JJaXY?hl=en-US&gl=US&ceid=US%3Aen"
-
-    try:
-        decoded_url = new_decoderv1(source_url, proxy="http://user:pass@localhost:8080")
-        if decoded_url.get("status"):
-            print("Decoded URL:", decoded_url["decoded_url"])
-        else:
-            print("Error:", decoded_url["message"])
-    except Exception as e:
-        print(f"Error occurred: {e}")
+    source_urls = [
+        "https://news.google.com/read/CBMilgFBVV95cUxOM0JJaFRwV2dqRDk5dEFpWmF1cC1IVml5WmVtbHZBRXBjZHBfaUsyalRpa1I3a2lKM1ZnZUI4MHhPU2sydi1nX3JrYU0xWjhLaHNfU0N6cEhOYVE2TEptRnRoZGVTU3kzZGJNQzc2aDZqYjJOR0xleTdsemdRVnJGLTVYTEhzWGw4Z19lR3AwR0F1bXlyZ0HSAYwBQVVfeXFMTXlLRDRJUFN5WHg3ZTI0X1F4SjN6bmFIck1IaGxFVVZyOFQxdk1JT3JUbl91SEhsU0NpQzkzRFdHSEtjVGhJNzY4ZTl6eXhESUQ3XzdWVTBGOGgwSmlXaVRmU3BsQlhPVjV4VWxET3FQVzJNbm5CUDlUOHJUTExaME5YbjZCX1NqOU9Ta3U?hl=en-US&gl=US&ceid=US%3Aen",
+        "https://news.google.com/read/CBMiiAFBVV95cUxQOXZLdC1hSzFqQVVLWGJVZzlPaDYyNjdWTURScV9BbVp0SWhFNzZpSWZxSzdhc0tKbVlHMU13NmZVOFdidFFkajZPTm9SRnlZMWFRZ01CVHh0dXU0TjNVMUxZNk9Ibk5DV3hrYlRiZ20zYkIzSFhMQVVpcTFPc00xQjhhcGV1aXM00gF_QVVfeXFMTmtFQXMwMlY1el9WY0VRWEh5YkxXbHF0SjFLQVByNk1xS3hpdnBuUDVxOGZCQXl1QVFXaUVpbk5lUGgwRVVVT25tZlVUVWZqQzc4cm5MSVlfYmVlclFTOUFmTHF4eTlfemhTa2JKeG14bmNabENkSmZaeHB4WnZ5dw?hl=en-US&gl=US&ceid=US%3Aen"
+    ]
 
-    # Output: decoded_urls - [{'status': True, 'decoded_url': 'https://healthdatamanagement.com/articles/empowering-the-quintuple-aim-embracing-an-essential-architecture/'}]
+    for url in source_urls:
+        try:
+            decoded_url = gnewsdecoder(url, interval=interval_time, proxy=proxy)
+            if decoded_url.get("status"):
+                print("Decoded URL:", decoded_url["decoded_url"])
+            else:
+                print("Error:", decoded_url["message"])
+        except Exception as e:
+            print(f"Error occurred: {e}")
 
 if __name__ == "__main__":
     main()
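
The proxy formats listed in the README section above can be checked before a proxy string is handed to `gnewsdecoder`. The following is a minimal sketch using only the standard library. The helper `check_proxy_url` is hypothetical and not part of googlenewsdecoder, and the diff does not show how the library itself parses or validates the proxy string.

```python
from urllib.parse import urlsplit

# Schemes named in the README's "Supported Proxy Formats" section.
SUPPORTED_SCHEMES = {"http", "https", "socks5"}


def check_proxy_url(proxy: str) -> dict:
    """Hypothetical helper: split a proxy URL and check scheme, host and port."""
    parts = urlsplit(proxy)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname or parts.port is None:
        raise ValueError("proxy must include both host and port")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # True when the URL carries a user name (user:pass@host:port form).
        "authenticated": parts.username is not None,
    }


if __name__ == "__main__":
    print(check_proxy_url("http://user:pass@localhost:8080"))
    print(check_proxy_url("socks5://127.0.0.1:1080"))
```

A check like this only confirms that the string has one of the documented shapes. It does not test whether the proxy is reachable.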

diff --git a/googlenewsdecoder/__init__.py b/googlenewsdecoder/__init__.py
@@ -1,5 +1,36 @@
-from .new_decoderv1 import decode_google_news_url as new_decoderv1
 from .decoderv1 import decode_google_news_url as decoderv1
 from .decoderv2 import decode_google_news_url as decoderv2
 from .decoderv3 import decode_google_news_url as decoderv3
 from .decoderv4 import decode_google_news_url as decoderv4
+from .new_decoderv1 import decode_google_news_url as new_decoderv1
+from .new_decoderv2 import GoogleDecoder
+from .__version__ import __version__
+
+
+def gnewsdecoder(source_url, interval=None, proxy=None):
+    """
+    Decodes a Google News article URL into its original source URL.
+    This is a convenience function that uses the GoogleDecoder class internally.
+
+    Parameters:
+        source_url (str): The Google News article URL.
+        interval (int, optional): Delay time in seconds before decoding to avoid rate limits.
+        proxy (str, optional): Proxy to be used for all requests.
+
+    Returns:
+        dict: A dictionary containing 'status' and 'decoded_url' if successful,
+              otherwise 'status' and 'message'.
+    """
+    decoder = GoogleDecoder(proxy=proxy)
+    return decoder.decode_google_news_url(source_url, interval=interval)
+
+
+__all__ = [
+    "decoderv1",
+    "decoderv2",
+    "decoderv3",
+    "decoderv4",
+    "new_decoderv1",
+    "GoogleDecoder",
+    "gnewsdecoder",
+]
diff --git a/googlenewsdecoder/__version__.py b/googlenewsdecoder/__version__.py
@@ -0,0 +1 @@
+__version__ = "0.1.7"
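
The `gnewsdecoder` wrapper in `__init__.py` builds a new `GoogleDecoder` on every call. For a batch of URLs, one alternative is to create a single decoder and reuse it, as sketched below. `StubDecoder` is a hypothetical stand-in that only mirrors the call shape visible in this diff (`GoogleDecoder(proxy=...)` and `decode_google_news_url(source_url, interval=...)`), and `decode_many` is likewise hypothetical. Whether reusing a real `GoogleDecoder` instance brings any benefit is an assumption the diff does not confirm.

```python
class StubDecoder:
    """Hypothetical stand-in with the same call shape as GoogleDecoder."""

    def __init__(self, proxy=None):
        self.proxy = proxy

    def decode_google_news_url(self, source_url, interval=None):
        # The stub ignores `interval` and returns canned results in the
        # dictionary shape described by the gnewsdecoder docstring.
        if not source_url.startswith("https://news.google.com/"):
            return {"status": False, "message": "not a Google News URL"}
        return {"status": True, "decoded_url": "https://example.com/article"}


def decode_many(decoder, source_urls, interval=None):
    """Decode each URL with one shared decoder and collect results per URL."""
    results = {}
    for url in source_urls:
        try:
            results[url] = decoder.decode_google_news_url(url, interval=interval)
        except Exception as exc:
            # Report failures in the same {'status', 'message'} shape.
            results[url] = {"status": False, "message": str(exc)}
    return results


if __name__ == "__main__":
    decoder = StubDecoder(proxy=None)
    urls = ["https://news.google.com/read/abc", "https://example.org/x"]
    for url, result in decode_many(decoder, urls).items():
        print(url, result)
```

With the real package installed, `StubDecoder` could be swapped for `GoogleDecoder` imported from `googlenewsdecoder`, since both are called the same way in this sketch.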