A self-hosted Cloudflare worker for SearXNG which allows you to run your own favicon grabber service.
- Install
- About
- Usage
- Methods Utilized:
- Step 1: Install Dependencies
- Step 2: Deploy Test Server
- Step 3: Customizing Worker
- Step 4: Publish Worker to Cloudflare
- Step 5: Adding Your Favicon Worker to SearXNG
- Cloudflare Loadbalancing
- Developer Notes
- Contributors ✨
To automatically deploy this Cloudflare worker with minimal setup, click the link below:
If you would like to manually set up the Cloudflare worker and install everything yourself, review the section below:
This repository contains the source code you will need to host your own Favicon grabber utilizing a Cloudflare worker (free).
Originally this project was developed around the use of the popular privacy search engine SearXNG, however, the worker can be used on its own, or can be integrated with any other application which makes use of a favicon grabber service simply by providing the absolute URL to where your worker is hosted.
When you deploy this worker to Cloudflare, you can host it either on your own domain name or on a Cloudflare workers.dev domain, which makes the worker available on the web via a browser.
This worker includes the following features:
- Favicon override using a Github repository (self-hostable)
- Favicon override using locally provided image URL table
- Favicon override using locally provided SVG path
- Works with Google, Yandex, Duckduckgo, FaviconKit, Allesedv
- Site code scanning for favicon tags, both link and svg
- CORS Security Headers
- Ability to set API rate limits (disabled by default)
  - Daily limits OR limit X per milliseconds
- Aggressive throttling mode (disabled by default)
  - Adds an incremental punishment onto the client's cooldown each time they attempt to grab a favicon when their original cooldown period has not yet expired.
- IP blacklisting / banning
- Supports sub-routes for users who want to add on get, post routes
- Supports Cloudflare worker logs (beta)
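The aggressive throttling mode in the feature list above can be pictured with the short sketch below. It is only an illustration of the idea, not the worker's actual code: the class `Cooldown` and its method `allow` are hypothetical names, and the parameters loosely mirror the THROTTLE_DELAY_MS, THROTTLE_AGGRESSIVE and THROTTLE_AGGRESSIVE_PUNISH_MS variables that Wrangler prints when the worker starts.

```python
class Cooldown:
    """Hypothetical per-client cooldown with an optional incremental punishment."""

    def __init__(self, delay_ms, aggressive=False, punish_ms=5000):
        self.delay_ms = delay_ms      # assumed to play the role of THROTTLE_DELAY_MS
        self.aggressive = aggressive  # assumed to play the role of THROTTLE_AGGRESSIVE
        self.punish_ms = punish_ms    # assumed to play the role of THROTTLE_AGGRESSIVE_PUNISH_MS
        self.blocked_until = {}       # client id -> time (ms) when requests are allowed again

    def allow(self, client, now_ms):
        """Return True if the client may grab a favicon at time now_ms."""
        until = self.blocked_until.get(client, 0)
        if now_ms >= until:
            # Cooldown has expired: serve the request and start a new cooldown.
            self.blocked_until[client] = now_ms + self.delay_ms
            return True
        if self.aggressive:
            # Early retry: push the end of the cooldown further out.
            self.blocked_until[client] = until + self.punish_ms
        return False
```

With aggressive mode off, an early retry is simply rejected and the original cooldown keeps running. With it on, every early retry extends the wait by the punishment amount.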
The worker contains a variety of methods it uses for finding a favicon for a specified domain. If you would like to view the methods available in this worker, view the section below Methods Utilized.
SearXNG is not required. This worker was made for SearXNG, however, the favicon worker can be used for any service that makes use of a favicon grabber.
The usage of this worker is rather simple. Deploy it by clicking the button above. Once the worker is configured, you will be able to access it within your web browser via the URL Cloudflare assigns you. This is usually searxico.YourCloudflareUsername.workers.dev.
Once you access the domain name for your worker, you can start searching for favicons by providing a domain name. As an example, to find a favicon using the online demo worker, you should search using the url:
- https://searxico.aetherinox.workers.dev/reddit.com
- https://searxico.aetherinox.workers.dev/reddit.com/64
The icon image size on the end of the URL is optional. Review a list of available parameters below:
Parameter | Description | Status |
---|---|---|
DOMAIN | Website to grab favicon for. Does not need http, https or www | Required |
ICON_SIZE | Size of the icon to return | Optional. Default: 32 |
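As a rough illustration of the URL format described above, the following sketch builds a request URL from a worker address, a domain and an optional icon size. The helper name `build_favicon_url` is hypothetical and not part of the worker. Stripping the scheme and a leading www on the client side is an assumption made here to match the parameter table.

```python
from urllib.parse import urlsplit


def build_favicon_url(worker_base, domain, icon_size=None):
    """Return the worker URL for a domain, optionally with an icon size."""
    # Accept inputs such as "https://www.reddit.com/" and reduce them to "reddit.com".
    host = urlsplit(domain if "//" in domain else "//" + domain).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    url = worker_base.rstrip("/") + "/" + host
    if icon_size is not None:
        # ICON_SIZE is optional; the worker falls back to its default when omitted.
        url += "/" + str(int(icon_size))
    return url


print(build_favicon_url("https://searxico.aetherinox.workers.dev", "https://www.reddit.com", 64))
# prints https://searxico.aetherinox.workers.dev/reddit.com/64
```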
This worker tries several methods in order to obtain a favicon from a website. These methods are listed below, in the order of priority in which the worker runs them:
Priority: 1
When you request a favicon using this self-hosted worker, it will first check to see if the specified domain has an icon hosted on our Searxico Favicon CDN Repository. This is a repository that you can host on your own Github account. If you decide to upload your own icon for Google, or Microsoft and place it within the repository, any time you request the Google or Microsoft favicon, it will first scan your own CDN repository and use that icon before it will go fetch the actual icon from their website.
This allows you to override any favicons for any websites.
If you want to see an example of how a Cloudflare hosted repository should be set up, see our Searxico CDN example repository:
Priority: 2
If an icon for a domain is not found within the Self-hosted CDN Repository listed above, it will then check the local worker index.js for an override table:
const iconsOverrideIco = {
's/searxng.org': `https://raw.githubusercontent.com/searxng/searxng/master/searx/static/themes/simple/img/favicon.png`
};
The override table shown above is a table available within the Cloudflare worker index.js source code which allows you to force a domain to use a specific favicon. To add a new domain to the list, maintain the format shown above. The entry name should be the first letter of the domain, followed by a forward slash / and then the domain.extension. Then for the value, you will provide a direct URL to the favicon you wish for the domain to use.
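The key format described above (first letter, forward slash, then the domain) can be generated with a small helper such as the one below. This is only a sketch: `override_key` is a hypothetical name, and lowercasing the domain is an assumption rather than something the worker is known to do.

```python
def override_key(domain):
    """Build an override-table key such as 's/searxng.org' from a domain."""
    domain = domain.strip().lower()  # assumption: keys are stored in lowercase
    if not domain:
        raise ValueError("domain must not be empty")
    # First letter of the domain, a forward slash, then the full domain.extension.
    return f"{domain[0]}/{domain}"


print(override_key("searxng.org"))  # prints s/searxng.org
```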
Priority: 3
The next source that is prioritized when you search for a favicon is the localized override table with SVG paths. This is similar to the previous method above Localized Override Table (URLs), except this table uses SVG paths, and can be found inside your Cloudflare worker index.js source code file.
const iconsOverrideSvg = {
's/searxng.org': `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512" fill="#1F85DE" width="32px" height="32px"><path class="fa-primary" d=""></path><path class="fa-secondary" d="M0 256a160 160 0 1 1 320 0A160 160 0 1 1 0 256z"></path></svg>`
};
To add your own entry, the key must be the first letter of the website domain you are searching for, followed by a forward slash /, and then the domain.extension for the domain. The value must be a full SVG path containing the icon you wish to use.
Priority: 4
If a favicon cannot be found using any of the methods listed above, the next step which has priority is for the favicon grabber to use an external API such as:
- Yandex
- Duckduckgo
- FaviconKit
- Allesedv
The service unavatar is also available, however, this API service seems to have a rate limit, so it is not enabled by default.
Priority: 5
The next step that the favicon grabber uses is a physical search of the domain you are requesting the favicon for. The Cloudflare worker will scan through the HTML code of the domain, and check for specific tags within the HTML code, including link[rel*="icon"], mask-icon, etc. An example of HTML being searched for is shown below:
<link rel="shortcut icon" href="https://cdn.sstatic.net/Sites/stackoverflow/Img/favicon.ico?v=ec617d715196">
The worker will also search for any <svg> icons that may appear in the HTML code to try and figure out if those icons are a logo for the website being searched.
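The tag scan described above can be sketched with Python's standard html.parser module. This is an illustration of the technique only, not the worker's own implementation: `IconLinkParser` and `find_icon_links` are hypothetical names, and the sketch covers link tags only, not the svg logo detection.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect href values of <link> tags whose rel attribute mentions an icon."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.icons = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower()
        href = attr.get("href")
        # Matches rel="icon", rel="shortcut icon", rel="mask-icon" and similar values.
        if "icon" in rel and href:
            # Resolve relative paths such as "/favicon.ico" against the page URL.
            self.icons.append(urljoin(self.base_url, href))


def find_icon_links(html, base_url):
    """Return absolute URLs of all icon links found in the HTML, in document order."""
    parser = IconLinkParser(base_url)
    parser.feed(html)
    return parser.icons
```

Calling `find_icon_links(page_html, "https://stackoverflow.com")` on a page containing the link tag shown above would return a list with that favicon URL.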
Priority: 6
If all of the above attempts fail to retrieve a favicon for a website, the favicon worker will then return a default icon to display. The default SVG icon is defined within the worker index.js as the following code:
const favicoDefaultSvg = `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" fill="#1F85DE" width="32px" height="32px">
<path d="M352 256c0 22.2-1.2 43.6-3.3 64H163.3c-2.2-20.4-3.3-41.8-3.3-64s1.2-43.6 3.3-64H348.7c2.2 20.4 3.3 41.8 3.3 64zm28.8-64H503.9c5.3 20.5 8.1 41.9 8.1 64s-2.8 43.5-8.1 64H380.8c2.1-20.6 3.2-42 3.2-64s-1.1-43.4-3.2-64zm112.6-32H376.7c-10-63.9-29.8-117.4-55.3-151.6c78.3 20.7 142 77.5 171.9 151.6zm-149.1 0H167.7c6.1-36.4 15.5-68.6 27-94.7c10.5-23.6 22.2-40.7 33.5-51.5C239.4 3.2 248.7 0 256 0s16.6 3.2 27.8 13.8c11.3 10.8 23 27.9 33.5 51.5c11.6 26 20.9 58.2 27 94.7zm-209 0H18.6C48.6 85.9 112.2 29.1 190.6 8.4C165.1 42.6 145.3 96.1 135.3 160zM8.1 192H131.2c-2.1 20.6-3.2 42-3.2 64s1.1 43.4 3.2 64H8.1C2.8 299.5 0 278.1 0 256s2.8-43.5 8.1-64zM194.7 446.6c-11.6-26-20.9-58.2-27-94.6H344.3c-6.1 36.4-15.5 68.6-27 94.6c-10.5 23.6-22.2 40.7-33.5 51.5C272.6 508.8 263.3 512 256 512s-16.6-3.2-27.8-13.8c-11.3-10.8-23-27.9-33.5-51.5zM135.3 352c10 63.9 29.8 117.4 55.3 151.6C112.2 482.9 48.6 426.1 18.6 352H135.3zm358.1 0c-30 74.1-93.6 130.9-171.9 151.6c25.5-34.2 45.2-87.7 55.3-151.6H493.4z"></path>
</svg>`;
It is worth noting that a test was conducted with over 1,000 domains. Out of all of the domains we tried, the default icon was only shown twice. It is highly unlikely for this step to be used, as a favicon can almost always be found somewhere. But we can't say never.
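Taken together, the priorities above form a fallback chain: each method is tried in order, and the default icon is returned only when every method fails. The sketch below shows that general pattern under stated assumptions. `resolve_favicon` and the resolver callables are hypothetical stand-ins, not functions taken from the worker.

```python
def resolve_favicon(domain, resolvers, default_icon):
    """Try each (name, resolver) pair in priority order and return the first hit."""
    for name, resolver in resolvers:
        try:
            icon = resolver(domain)
        except Exception:
            # A failing method must not stop the chain; move on to the next one.
            icon = None
        if icon is not None:
            return name, icon
    # Nothing matched: fall back to the default icon (priority 6).
    return "default", default_icon
```

The list passed as `resolvers` would hold one callable per method, ordered from priority 1 to priority 5.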
You will need to register for a Cloudflare account if you have not already. First, we need to grab the files from this repo. Create a new project folder where everything will be stored.
git clone https://github.com/Aetherinox/searxico-worker.git ./searxico
You must have npm installed. If you don't, you'll need to install it first. If you are on Windows, follow the Installation Guide here.
If you are on Linux, you can install with:
sudo apt install npm
Next, open your terminal / command prompt for Windows / Linux, change directories over to the folder where you downloaded Searxico and install the Node dependencies by running the commands:
cd searxico
npm install
Next, confirm that Wrangler is installed by running the command:
npx wrangler -v
You should receive:
⛅️ wrangler 3.80.0
-------------------
Next, you need to sign into Cloudflare using Wrangler so that the app knows where to upload your Favicon worker to:
npx wrangler login
Your operating system web browser should open. Sign in to your Cloudflare account, and a permission box should appear asking you to confirm that Wrangler should be able to access your Cloudflare account.
After you sign in and approve the permissions, you should see the following in your terminal:
$ npx wrangler login
Attempting to login via OAuth...
Opening a link in your default browser: https://dash.cloudflare.com/oauth2/auth?response_type=code&client_id=xxxxx
Successfully logged in.
▲ [WARNING] Processing wrangler.toml configuration:
To confirm it worked, type the command:
npx wrangler whoami
You should see:
⛅️ wrangler 3.80.0
-------------------
Getting User settings...
👋 You are logged in with an OAuth Token, associated with the email [email protected].
┌─────────────────────────────────┬──────────────────────────────────┐
│ Account Name │ Account ID │
├─────────────────────────────────┼──────────────────────────────────┤
│ [email protected]'s Account │ abcdefg123456789a1b2c3d4c5e6f7ab │
└─────────────────────────────────┴──────────────────────────────────┘
🔓 Token Permissions: If scopes are missing, you may need to logout and re-login.
Scope (Access)
You now have everything set up and can either begin making edits to the source code within /src/index.js, or move on to the next step of the guide, which explains how to launch a dev server or deploy the worker to Cloudflare.
Now that you have finished the section Install Dependencies above, you can launch a development server to test the worker locally. Back in your terminal, run the command:
npx wrangler dev -e dev
You should see the following in terminal:
⛅️ wrangler 3.80.0
-------------------
Your worker has access to the following bindings:
- Unsafe:
- ratelimit: searxico
- Vars:
- THROTTLE_DELAY_MS: 0
- THROTTLE_AGGRESSIVE: false
- THROTTLE_AGGRESSIVE_PUNISH_MS: 5000
- THROTTLE_DAILY_ENABLED: false
- THROTTLE_DAILY_LIMIT: 2000
⎔ Starting local server...
[wrangler:inf] Ready on http://localhost:8787
╭──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ [b] open a browser, [d] open devtools, [l] turn off local mode, [c] clear console, [x] to exit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
As the instructions say, open your operating system web browser and navigate to the url:
http://localhost:8787
Note
Add the word /get to the end of the URL above, as that is the end-point for the favicon grabber.
I am currently working on an additional setting which will allow you to specify if you want the favicon grabber to reside in the base domain without a sub-route.
You should now see the favicon homepage:
Searxico Favicon Grabber 1.0.0
@usage ...... GET localhost:8787/domain.com
GET localhost:8787/domain.com/ICON_SIZE
@repo: ...... https://github.com/Aetherinox/searxico-worker
@cdn: ....... https://github.com/Aetherinox/searxico-cdn
@author: ... github.com/aetherinox
If you want to test out getting an icon, pick a domain and add it to the end of the URL:
http://localhost:8787/searxng.org
You should see the official SearXNG.org favicon, which confirms that this is working. If you wish to stop the development server, go back to your terminal and press X. Your terminal should list all of the available options you can pick from:
╭──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ [b] open a browser, [d] open devtools, [l] turn off local mode, [c] clear console, [x] to exit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Now we can proceed to the part of this documentation which explains how to publish your worker to Cloudflare. Proceed to the section Publish Worker to Cloudflare.
This Cloudflare worker includes a few settings you can adjust. To edit these settings, open the source file /src/index.js in an editor and read the sections below:
This worker includes the ability to host your favicon worker within a sub-route of your subdomain. You can find this setting at the top of src/index.js as the following settings:
let bSubRoute = false;
const subroute = 'get';
This setting is useful for users who want to expand on this worker and add multiple routes that can be queried such as GET, POST, and treat it more like an API.
If enabled, this means that you must search for favicons using the URL:
https://favicons.domain.com/get/yourdomain.com
^ Sub-route
If you set bSubRoute = false, this means that you can search for favicons from domains without any type of additional route being specified. You'll notice in the example below, /get/ is not being added to the URL:
https://favicons.domain.com/yourdomain.com
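To make the two URL shapes concrete, here is a minimal sketch of how a request path could be split into a domain and an icon size with and without the sub-route. It is written in Python for illustration only. `parse_request_path` is a hypothetical helper, its parameters merely echo the bSubRoute and subroute settings, and the worker's real parsing may differ.

```python
def parse_request_path(path, sub_route_enabled=False, subroute="get", default_size=32):
    """Return (domain, icon_size) for a request path, or None if it does not match."""
    parts = [p for p in path.split("/") if p]
    if sub_route_enabled:
        # With the sub-route on, the first segment must be the route name, e.g. /get/...
        if not parts or parts[0] != subroute:
            return None
        parts = parts[1:]
    if not parts:
        return None
    domain = parts[0]
    size = default_size
    if len(parts) > 1 and parts[1].isdigit():
        size = int(parts[1])
    return domain, size


print(parse_request_path("/get/yourdomain.com", sub_route_enabled=True))  # prints ('yourdomain.com', 32)
print(parse_request_path("/yourdomain.com/64"))                           # prints ('yourdomain.com', 64)
```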
The last part of this guide explains how to publish your worker to Cloudflare.
When you build a worker with Wrangler and deploy it to Cloudflare, a file with the extension .js is created, and that file reveals the folder where Wrangler was installed. By default, this will show as
C:\Users\USERNAME\AppData\Roaming\npm\node_modules\wrangler.
You can see this by going to Cloudflare, clicking Workers & Pages, and clicking View Code at the top right.
In order to hide your user path in the code, you must do one of the following:
- Change where NPM is installed so that your user path is removed.
- Deploy using --minify
To change the installation path, execute:
npm config --global set cache "X:\NodeJS\cache"
npm config --global set prefix "X:\NodeJS\npm"
You may need to re-install wrangler after changing the paths:
npm uninstall wrangler --save-dev
npm install wrangler --save-dev
If you do not want to reinstall wrangler, you can also keep the user path from showing in your source code by deploying your project with --minify:
wrangler deploy --minify
Go back to your Terminal, and execute the command:
npx wrangler deploy -e production
You will see a large amount of text in your terminal appear:
Total Upload: 65.15 KiB / gzip: 14.78 KiB
Your worker has access to the following bindings:
- Unsafe:
- ratelimit: searxico
- Vars:
- THROTTLE_DELAY_MS: 0
- THROTTLE_AGGRESSIVE: false
- THROTTLE_AGGRESSIVE_PUNISH_MS: 5000
- THROTTLE_DAILY_ENABLED: false
- THROTTLE_DAILY_LIMIT: 2000
Uploaded searxico (2.57 sec)
Deployed searxico triggers (0.31 sec)
https://searxico.aetherinox.workers.dev
Current Version ID: afe1c468-416e-1ff7-1ce6-42aa7490ef5c
Note
If you have multiple accounts attached to Cloudflare, you will be asked to pick which account you want to upload your worker to.
√ Select an account »
» 1. Brad
» 2. Domain.lan Organization
If you want to switch accounts, you must execute:
npx wrangler login
If you look at the second to last line, it will tell you what URL you can use to view the actual project online:
https://searxico.aetherinox.workers.dev
You can use the domain listed above for any service you wish to use your Favicon grabber for. Cloudflare also supports adding your own custom domain name to the worker so that you can access it using a URL such as https://icons.mydomain.com.
This concludes the basics of getting your worker up. There are a few things to remember.
For users who have a Free Cloudflare account, be aware that Cloudflare does place limits on how much traffic your worker can have. The limits are generous and if you are using this Cloudflare worker for your own personal site, you should not be surpassing them.
Feature | Limit |
---|---|
Request | 100,000 requests/day 1000 requests/min |
Memory | 128MB |
CPU Time | 10ms |
You can check your request limit by signing into Cloudflare, and on the left-side menu, clicking Workers & Pages -> Overview.
Select your worker from the Overview page.
You should get a very detailed graph and hard numbers showing what your usage is for the day. You can also modify the search criteria to see how the usage has been for the month.
To use your new Favicon grabber service with SearXNG, we need to create a new file within SearXNG.
searxng/favicons.toml
You should create the file above in the same folder where your other SearXNG configs are, such as:
limiter.toml
settings.yml
uwsgi.ini
Open the new favicons.toml file and add the following:
[favicons]
cfg_schema = 1 # config's schema version no.
[favicons.proxy.resolver_map]
"searxico" = "searx.plugins.searxico.searxico"
# "duckduckgo" = "searx.favicons.resolvers.duckduckgo"
# "searxico" = "searx.favicons.resolvers.searxico"
# "yandex" = "searx.favicons.resolvers.yandex"
If you want multiple favicon services enabled, uncomment the lines above by removing the # for whatever services you want to enable.
You can also open your settings.yml and set the default favicon service you want to use:
search:
# backend for the favicon near URL in search results.
# Available resolvers: "allesedv", "duckduckgo", "google", "yandex" - leave blank to turn it off by default.
favicon_resolver: "searxico"
Finally, we need to add the plugin file to /searxng/plugins/, so create a new file called searxico.py and add the following code to it:
"""Adds custom favicon grabber

@plugin : searxico
@url : https://github.com/Aetherinox/searxico-worker
@url-cdn : https://github.com/Aetherinox/searxico-cdn
"""

from __future__ import annotations
from typing import Callable

from searx import network
from searx.plugins import logger
from flask_babel import gettext

DEFAULT_RESOLVER_MAP: dict[str, Callable]
logger = logger.getChild('favicons.resolvers')

name = "Searxico"
description = gettext("Fetch favicons using Searxico favicon grabber")
default_on = True
plugin_id = 'searxico'
logger = logger.getChild(plugin_id)


def _req_args(**kwargs):
    # Do not raise on HTTP errors; the resolver checks the status code itself.
    d = {"raise_for_httperror": False}
    d.update(kwargs)
    return d


def searxico(domain: str, timeout: int) -> tuple[None | bytes, None | str]:
    """Favicon Resolver from searxico"""
    data, mime = (None, None)
    url = f"https://searxico.aetherinox.workers.dev/{domain}/32"
    logger.debug("fetch favicon from: %s", url)

    response = network.get(url, **_req_args(timeout=timeout))
    # Accept only successful responses with a non-trivial body.
    if response and response.status_code == 200 and len(response.content) > 70:
        mime = response.headers['Content-Type']
        data = response.content
    return data, mime
In the code above, change the URL to your custom domain, or your Cloudflare worker:
url = f"https://searxico.aetherinox.workers.dev/{domain}/32"
You should now have everything required for your favicon service to work. Head over to your SearXNG website and click on Preferences. Under the General tab, find the setting Favicon Resolver and change it to:
- Searxico
In a previous section, Publish Worker to Cloudflare, we discussed the fact that Cloudflare limits each account to 100,000 requests per day. If you host a public instance of SearXNG, you can set up load balancing and split the workload between multiple Cloudflare accounts, provided you have a team of people working with you.
SearXNG gives you the ability to select more than one favicon resolver. This means that you can call a second Cloudflare account into service, and add both of these workers into your SearXNG settings. Then when a user performs a search within your search engine, the requests for favicons will be split between both workers instead of them all being sent to one.
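As a back-of-the-envelope aid, the sketch below estimates how many Cloudflare accounts would be needed for a given number of favicon requests per day. It assumes the 100,000 requests/day figure quoted above and a perfectly even split between workers, which real traffic will not match exactly. The function name `accounts_needed` is hypothetical.

```python
import math


def accounts_needed(daily_requests, per_account_limit=100_000):
    """Estimate the number of accounts needed, assuming an even split of requests."""
    if daily_requests <= 0:
        return 0
    # Round up: a partially used account still counts as a whole account.
    return math.ceil(daily_requests / per_account_limit)


print(accounts_needed(250_000))  # prints 3
```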
Within your favicons.toml file, you can list the different workers you have performing favicon queries:
[favicons]
cfg_schema = 1 # config's schema version no.
[favicons.proxy.resolver_map]
"Searxico Server 1" = "searx.plugins.searxico.searxico1"
"Searxico Server 2" = "searx.plugins.searxico.searxico2"
With these settings in place, the other step is to take the code provided in the section Adding Your Favicon Worker to SearXNG, and create two plugin files instead of one, ensuring each plugin is slightly modified with the updated name.
name = "Searxico 1"
plugin_id = 'searxico1'
logger = logger.getChild(plugin_id)
def _req_args(**kwargs):
    d = {"raise_for_httperror": False}
    d.update(kwargs)
    return d
def searxico(domain: str, timeout: int) -> tuple[None | bytes, None | str]:
Then simply save the plugin file as /plugins/searxico1.py.
These are notes you should keep in mind if you plan on modifying this favicon Cloudflare worker.
We recommend treating your wrangler.toml file as the source of truth for your Worker configuration, and to avoid making changes to your Worker via the Cloudflare dashboard if you are using Wrangler.
If you need to make changes to your Worker from the Cloudflare dashboard, the dashboard will generate a TOML snippet for you to copy into your wrangler.toml file, which will help ensure your wrangler.toml file is always up to date.
If you change your environment variables in the Cloudflare dashboard, Wrangler will override them the next time you deploy. If you want to disable this behavior, add keep_vars = true to your wrangler.toml.
If you change your routes in the dashboard, Wrangler will override them in the next deploy with the routes you have set in your wrangler.toml. To manage routes via the Cloudflare dashboard only, remove any route and routes keys from your wrangler.toml file. Then add workers_dev = false to your wrangler.toml file. For more information, refer to Deprecations.
Wrangler will not delete your secrets (encrypted environment variables) unless you run wrangler secret delete <key>.
Note
Experimental Config
Wrangler currently supports an --experimental-json-config flag, which will read your configuration from a wrangler.json file, rather than wrangler.toml. The format of this file is exactly the same as the wrangler.toml configuration file, except that the syntax is JSON rather than TOML.
This is experimental, and is not recommended for production use.
This section provides a reference for Wrangler commands.
npx wrangler <COMMAND> <SUBCOMMAND> [PARAMETERS] [OPTIONS]
Since Cloudflare recommends installing Wrangler locally in your project (rather than globally), the way to run Wrangler will depend on your specific setup and package manager.
After you have access to wrangler globally, you can switch over from using npx wrangler to just wrangler:
wrangler <COMMAND> <SUBCOMMAND> [PARAMETERS] [OPTIONS]
Full list of commands available at:
To update the version of Wrangler used in your project, run:
npm install wrangler@latest
Launches your local Wrangler / Cloudflare dev project in a test environment.
npx wrangler dev -e dev
Authorize Wrangler with your Cloudflare account using OAuth. Wrangler will attempt to automatically open your web browser to login with your Cloudflare account. If you prefer to use API tokens for authentication, such as in headless or continuous integration environments, refer to Running Wrangler in CI/CD.
If Wrangler fails to open a browser, you can copy and paste the URL generated by wrangler login in your terminal into a browser and log in.
npx wrangler login [OPTIONS]
Lists all accounts associated with your Cloudflare account
npx wrangler whoami
Check where wrangler (and other global packages) are installed at:
npm list -g --depth=0
Deploy your Worker to Cloudflare.
npx wrangler deploy [<SCRIPT>] [OPTIONS]
npx wrangler deploy --minify -e production
Note
None of the options for this command are required. Also, many can be set in your wrangler.toml file. Refer to the wrangler.toml configuration documentation for more information.
The following command will build a dry-run compiled version of your index.js file which will be placed in the dist/ folder:
npx wrangler deploy --dry-run --outdir dist -e production
Delete your Worker and all associated Cloudflare developer platform resources.
npx wrangler delete [<SCRIPT>] [OPTIONS]
We are always looking for contributors. If you feel that you can provide something useful to Searxico, then we'd love to review your suggestion. Before submitting your contribution, please review the following resources:
Want to help but can't write code?
- Review active questions by our community and answer the ones you know.
The following people have helped get this project going: