- use non-capturing groups for performance reasons
- avoid greedy matching: use lazy matching and avoid "any" matchers
- lookbehinds are only available since the ECMAScript 2018 standard; at the moment, browser support is very limited
- Optional port:
(?::\d{1,5})?
- Optional http scheme:
(?:http(?:s)?:\/\/)?
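The two fragments above are meant to be combined with a host pattern. A minimal JavaScript sketch of that composition follows. The names `OPTIONAL_SCHEME`, `OPTIONAL_PORT` and `buildHostRegex`, and the host `example.test`, are hypothetical and used only for illustration:

```javascript
// Fragments copied from the list above; String.raw keeps the backslashes intact.
const OPTIONAL_SCHEME = String.raw`(?:http(?:s)?:\/\/)?`;
const OPTIONAL_PORT = String.raw`(?::\d{1,5})?`;

// Hypothetical helper: wraps a host fragment with the optional scheme and port,
// anchored at both ends so that partial matches are rejected.
function buildHostRegex(hostFragment) {
  return new RegExp(`^${OPTIONAL_SCHEME}${hostFragment}${OPTIONAL_PORT}$`);
}

const exampleHost = buildHostRegex(String.raw`example\.test`);

console.log(exampleHost.test("https://example.test:8443")); // true
console.log(exampleHost.test("example.test"));              // true
console.log(exampleHost.test("ftp://example.test"));        // false
```

Because every group is non-capturing, a successful `match()` result carries only the full match and no numbered groups.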
^(?:http(?:s)?:\/\/)?(?:localhost|127\.0\.0\.1|0\.0\.0\.0)(?::\d{1,5})?$
https://regex101.com/r/accq7h/4/tests
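As a quick sanity check, the localhost expression can be run against a few inputs in JavaScript. The constant name `LOCALHOST_REGEX` and the sample strings are assumptions made for this sketch:

```javascript
// The localhost pattern from above, written as a regex literal.
const LOCALHOST_REGEX =
  /^(?:http(?:s)?:\/\/)?(?:localhost|127\.0\.0\.1|0\.0\.0\.0)(?::\d{1,5})?$/;

console.log(LOCALHOST_REGEX.test("http://localhost:3000")); // true
console.log(LOCALHOST_REGEX.test("127.0.0.1"));             // true
console.log(LOCALHOST_REGEX.test("https://0.0.0.0:8080"));  // true
console.log(LOCALHOST_REGEX.test("localhost:123456"));      // false (port has 6 digits)
console.log(LOCALHOST_REGEX.test("http://localhost/path")); // false (paths are not allowed)
console.log(LOCALHOST_REGEX.test("example.com"));           // false
```

Note that `\d{1,5}` only limits the number of digits, so an out-of-range port such as `99999` still matches.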
^(?:http(?:s)?:\/\/)?[a-zA-Z0-9-]{1,63}\.gitlab\.io$
^(?:http(?:s)?:\/\/)?[a-zA-Z0-9-]{1,63}\.github\.io$
^(?:http(?:s)?:\/\/)?[a-zA-Z0-9-]{1,63}\.herokuapp\.com$
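The three hosted-page patterns share the same shape: one label of up to 63 characters followed by a fixed suffix. A short JavaScript sketch of how they might be applied together; the `HOSTED_PAGE_REGEXES` array and the `isHostedPage` helper are hypothetical names:

```javascript
// The three patterns from above, collected in one list.
const HOSTED_PAGE_REGEXES = [
  /^(?:http(?:s)?:\/\/)?[a-zA-Z0-9-]{1,63}\.gitlab\.io$/,
  /^(?:http(?:s)?:\/\/)?[a-zA-Z0-9-]{1,63}\.github\.io$/,
  /^(?:http(?:s)?:\/\/)?[a-zA-Z0-9-]{1,63}\.herokuapp\.com$/,
];

// Hypothetical helper: true if the input matches any of the patterns.
function isHostedPage(input) {
  return HOSTED_PAGE_REGEXES.some((regex) => regex.test(input));
}

console.log(isHostedPage("https://my-project.gitlab.io"));     // true
console.log(isHostedPage("my-project.github.io"));             // true
console.log(isHostedPage("my-app.herokuapp.com"));             // true
console.log(isHostedPage("https://sub.my-project.github.io")); // false (only one label allowed)
console.log(isHostedPage("https://my-app.herokuapp.com/"));    // false (trailing slash)
```

The character class `[a-zA-Z0-9-]` does not stop a label from starting or ending with a hyphen, so `-foo.github.io` is accepted by these patterns.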
There are two approaches to choose from when validating domains.
By-the-books FQDN matching (theoretical definition, rarely encountered in practice):
- max 253 characters long (as per RFC-1035/3.1, RFC-2181/11)
- max 63 characters long per label (as per RFC-1035/3.1, RFC-2181/11)
- any characters are allowed (as per RFC-2181/11)
- TLDs cannot be all-numeric (as per RFC-3696/2)
- FQDNs can be written in a complete form, which includes the root zone (the trailing dot)
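The by-the-books rules listed above can also be checked without a regex. The following JavaScript sketch uses plain string operations instead of a pattern. The function name `isByTheBooksFqdn` is hypothetical, and two choices are assumptions of this sketch rather than part of the list: lengths are counted in JavaScript string characters, and empty labels (as in `a..b`) are rejected.

```javascript
// Hypothetical helper implementing the by-the-books rules with string checks.
function isByTheBooksFqdn(name) {
  // The complete form may end with the root zone dot; drop it before measuring.
  const trimmed = name.endsWith(".") ? name.slice(0, -1) : name;

  // Max 253 characters in total (assumption: counted without the trailing dot).
  if (trimmed.length === 0 || trimmed.length > 253) return false;

  // Max 63 characters per label; empty labels are rejected (assumption).
  const labels = trimmed.split(".");
  if (labels.some((label) => label.length === 0 || label.length > 63)) {
    return false;
  }

  // The TLD (last label) cannot be all-numeric.
  return !/^\d+$/.test(labels[labels.length - 1]);
}

console.log(isByTheBooksFqdn("example.com."));              // true
console.log(isByTheBooksFqdn("under_score.example"));       // true (any characters allowed)
console.log(isByTheBooksFqdn("example.123"));               // false (numeric TLD)
console.log(isByTheBooksFqdn("a".repeat(64) + ".example")); // false (label too long)
```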
Practical / conservative FQDN matching (practical definition, expected and supported in practice):
- by-the-books matching with the following exceptions/additions
- valid characters:
[a-zA-Z0-9.-]
- labels cannot start or end with hyphens (as per RFC-952 and RFC-1123/2.1)
- TLD min length is 2 characters, max length is 24 characters, as per currently existing records
- don't match trailing dot
Regex for the practical use case:
^(?!.*?_.*?)(?!(?:[\w]+?\.)?\-[\w\.\-]*?)(?![\w]+?\-\.(?:[\w\.\-]+?))(?=[\w])(?=[\w\.\-]*?\.+[\w\.\-]*?)(?![\w\.\-]{254})(?!(?:\.?[\w\-\.]*?[\w\-]{64,}\.)+?)[\w\.\-]+?(?<![\w\-\.]*?\.[\d]+?)(?<=[\w\-]{2,})(?<![\w\-]{25})$
https://regex101.com/r/FLA9Bv/41 (Note: currently only works in Chrome because the regex uses lookbehinds, which are only supported since ECMAScript 2018)
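To illustrate the practical rules, the regex can be exercised in a JavaScript engine that supports lookbehinds (for example a recent Node.js). The constant name `PRACTICAL_FQDN_REGEX` and the sample domains are assumptions made for this sketch:

```javascript
// The practical FQDN pattern from above, copied verbatim into a regex literal.
const PRACTICAL_FQDN_REGEX =
  /^(?!.*?_.*?)(?!(?:[\w]+?\.)?\-[\w\.\-]*?)(?![\w]+?\-\.(?:[\w\.\-]+?))(?=[\w])(?=[\w\.\-]*?\.+[\w\.\-]*?)(?![\w\.\-]{254})(?!(?:\.?[\w\-\.]*?[\w\-]{64,}\.)+?)[\w\.\-]+?(?<![\w\-\.]*?\.[\d]+?)(?<=[\w\-]{2,})(?<![\w\-]{25})$/;

console.log(PRACTICAL_FQDN_REGEX.test("example.com"));            // true
console.log(PRACTICAL_FQDN_REGEX.test("sub-domain.example.org")); // true
console.log(PRACTICAL_FQDN_REGEX.test("-example.com"));           // false (label starts with a hyphen)
console.log(PRACTICAL_FQDN_REGEX.test("my_site.com"));            // false (underscore)
console.log(PRACTICAL_FQDN_REGEX.test("example.123"));            // false (all-numeric TLD)
console.log(PRACTICAL_FQDN_REGEX.test("example.com."));           // false (trailing dot)
console.log(PRACTICAL_FQDN_REGEX.test("localhost"));              // false (no dot)
```

The underscore check is needed because `\w` includes `_`; the first negative lookahead removes it again.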
See also: TLD limitations, domain limitations, list of TLDs, by-the-books regex and explanation
^v[0-9]+\.[0-9]+\.[0-9]+$
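A brief JavaScript check of the version pattern above; the constant name `VERSION_TAG_REGEX` and the sample tags are assumptions of this sketch:

```javascript
// Matches a "v" followed by exactly three dot-separated numeric parts.
const VERSION_TAG_REGEX = /^v[0-9]+\.[0-9]+\.[0-9]+$/;

console.log(VERSION_TAG_REGEX.test("v1.2.3"));     // true
console.log(VERSION_TAG_REGEX.test("v10.0.12"));   // true
console.log(VERSION_TAG_REGEX.test("1.2.3"));      // false (missing "v" prefix)
console.log(VERSION_TAG_REGEX.test("v1.2"));       // false (only two parts)
console.log(VERSION_TAG_REGEX.test("v1.2.3-rc1")); // false (suffix not allowed)
```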
Extracts valid arguments from a comma-separated argument list, supporting double-quoted arguments
(?<=")[^"]+?(?="(?:\s*?,|\s*?$))|(?<=(?:^|,)\s*?)(?:[^,"\s][^,"]*[^,"\s])|(?:[^,"\s])(?![^"]*?"(?:\s*?,|\s*?$))(?=\s*?(?:,|$))
https://regex101.com/r/UL8kyy/3/tests (Note: currently only works in Chrome because the regex uses lookbehinds, which are only supported since ECMAScript 2018)
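A possible way to apply the argument pattern in JavaScript is with the global flag and `String.prototype.match`, which returns all matches at once. The names `ARGUMENT_REGEX` and `parseArguments` are hypothetical, and the engine must support lookbehinds:

```javascript
// The argument pattern from above, with the "g" flag so every argument is returned.
const ARGUMENT_REGEX =
  /(?<=")[^"]+?(?="(?:\s*?,|\s*?$))|(?<=(?:^|,)\s*?)(?:[^,"\s][^,"]*[^,"\s])|(?:[^,"\s])(?![^"]*?"(?:\s*?,|\s*?$))(?=\s*?(?:,|$))/g;

// Hypothetical helper: returns the extracted arguments, or an empty array if none match.
function parseArguments(input) {
  return input.match(ARGUMENT_REGEX) || [];
}

console.log(parseArguments('foo, "bar baz", qux')); // [ 'foo', 'bar baz', 'qux' ]
console.log(parseArguments('x, "hello, world"'));   // [ 'x', 'hello, world' ]
```

The three alternatives handle, in order, quoted arguments (without their quotes), unquoted arguments of two or more characters, and unquoted single-character arguments.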