Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Problems linking to generated targets #58

Open
pekkaklarck opened this issue Oct 19, 2024 · 6 comments
Open

bug: Problems linking to generated targets #58

pekkaklarck opened this issue Oct 19, 2024 · 6 comments
Assignees
Labels
feature New feature or request fund Issue priority can be boosted

Comments

@pekkaklarck
Copy link

pekkaklarck commented Oct 19, 2024

I might be submitting this to a wrong project, but because I don't know any better place, and this is at least related to autorefs, I decided to submit it here. Feel free to close this if the place is wrong. If you know a better place, please let me know.


Let's first assume I have the following content:

# Welcome

## Just testing

When I have autorefs installed, I can very conveniently link to the Welcome section using this syntax:

This is a link to the [welcome][] section.

This is very similar, and equally convenient, than the syntax I'm used to with reStructuredText and Sphinx:

This is a link to the `welcome`_ section.

The problem is that I'd typically like to refer to sections so that the name starts with a capital letter as below, but in this case auto-linking doesn't work:

This is a link to the [Welcome][] section.

There's also a problem with headings having spaces. For example, this doesn't work:

I'm [just testing][].

I know I can get both of the above to work by including the actual link target as below:

This is a link to the [Welcome][welcome] section. I'm [just testing][just-testing].

The problem with adding those link targets is that it requires writing redundant content, makes document source less readable and is inconsistent. Inconsistency also means you can easily forget to add the target when in some cases it is not needed. Notice that with reStructuredText and Sphinx these work out-of-the-box:

This is a link to the `Welcome`_ section. I'm `just testing`_.

My first problem that [Welcome][] isn't handled case-insensitively feels like a bug to me, because according to the original Markdown spec, reference targets are case-insensitive. There is, for example, an example like this:

I get 10 times more traffic from [Google][] than from
[Yahoo][] or [MSN][].

  [google]: http://google.com/        "Google"
  [yahoo]:  http://search.yahoo.com/  "Yahoo Search"
  [msn]:    http://search.msn.com/    "MSN Search"

The odd thing is that the above example is handled fine by MkDocs. It's strange that [Google][] works when the references target is [google]:, but [Welcome][] doesn't work when the reference target ought to be something similar.

My second problem that [just testing][] is not linked to the generated just-testing target is something where my wish is against the Markdown spec. This example from the spec wouldn't work if spaces would be automatically replaces with hyphens:

Visit [Daring Fireball][] for more information.

[Daring Fireball]: http://daringfireball.net/

Although I'm fighting against the spec, I feel that this syntax would be really convenient. Is it something autorefs could support? If there are backwards compatibility concerns, it didn't need to be enabled by default, but having an option to enable this behavior would be awesome. If this is something autorefs cannot do or it doesn't belong to its scope, do you have ideas how I could enable the behavior otherwise? I'm willing to implement a plugin if needed and would also be ready to provide a PR to autorefs if this feature could be added to it.

Boost priority

  • Boost priority in our backlog through Polar.sh. Higher pledge, higher priority.
  • Minimum pledge by user/organization is $5, minimum amount for boost is $30.
  • View all issues with pledges.
  • We receive the funds once the issue is completed and confirmed by you.
  • Features with the insiders label are released to sponsors first, and tied to a funding goal.
Fund with Polar
@pekkaklarck pekkaklarck added the unconfirmed This bug was not reproduced yet label Oct 19, 2024
@pawamoy
Copy link
Member

pawamoy commented Oct 19, 2024

Thank you for the bug report / feature request @pekkaklarck, I think it makes a lot of sense. We can definitely consider "slugifying" titles when no identifier is provided (and after having tried with the actual title as id, probably). At first glance, this shouldn't cause any issue if we do this by default, but I'll have a deeper look 🙂

@pawamoy pawamoy added bug Something isn't working feature New feature or request and removed unconfirmed This bug was not reproduced yet labels Oct 19, 2024
@pekkaklarck
Copy link
Author

pekkaklarck commented Oct 19, 2024

That would be awesome! And yes, should first check is there a match with the original title before sluggifying.

@pawamoy pawamoy added the fund Issue priority can be boosted label Oct 22, 2024
@pawamoy
Copy link
Member

pawamoy commented Nov 26, 2024

Looking into this. When referencing API docs, it's important to retain case-sensitivity, since object names themselves are most likely case-sensitive (that's the case for Python for example). So how to distinguish when we would like to try and slugify (in case of title not matching as is), and when we wouldn't want that? I think there's a simple answer: if the title is wrapped in backticks, maintain case-sensitivity (and more generally, don't try to slugify the title at all).

All the following examples assume there's no direct match with the title unchanged:

  • [Welcome][] -> slugify
  • [just testing][] -> slugify
  • [`Constant`][] -> do not slugify
  • [`whatever-the *characters*!`][] -> do not slugify
  • [Note about `Thing`][] -> slugify
  • [`a` -> `b`][] -> erm, more complex to get right, but slugify
  • [``valid`identifier``][] -> even more complex, but do not slugify

We basically want to match a single pair of equally long series of backticks, wrapping the whole string. Might be possible with a simple regex actually? Something like ^(`+)((?!\1).)*\1$.

With other features coming up, we could also decide to drop these implicit rules in favor of waiting for a proper, more elaborated syntax, that would allow more flexibility. Such a syntax could be based on attr_list, since it lets us add HTML classes to the generated (custom) autoref HTML elements, that autorefs could then use as configuration.

Just quick examples of what it could look like

  • [Welcome][]{.I} (like re.I for case-insentivity)
  • [just testing][]{.S} ("Slugify"?)
  • [`Constant`][]{.lang-rust} (only try to link to an object loaded through a rust handler)
  • [`whatever-the *characters*!`][]{.S}
  • [Note about `Thing`][]{.E} ("Exact", do not slugify)
  • [`a` -> `b`][]{.E}
  • etc.

Or we could first implement the implicit slugification, and allow overriding it with custom syntax (attr lists) later. Seems like the best way forward to me. I really don't want to force users to write tons of metadata in cross-references if we can get 95% of all uses right with implicit/default rules. The custom syntax must also not look too magic/obscure, and not be too verbose.

@pawamoy
Copy link
Member

pawamoy commented Nov 26, 2024

RE: RE (pun intended)

regex-are-cool

🎉

EDIT: ``he```llo`` doesn't match, but I don't think anyone would do such a thing (more backticks in the middle than at the extremities.

@pekkaklarck
Copy link
Author

Does autorefs know that link targets are available? If it does, then it could work like this:

  1. Check is there an exact match. If there is, use it.
  2. Go through all targets and compare the used text with them in normalized manner (e.g. lower case, all special characters removed, ...). If there's a single normalized match, use it. If there's none or multiple, emit a warning or error.

@pawamoy
Copy link
Member

pawamoy commented Nov 27, 2024

It knows. The issue with always falling back is that it's unwanted in some cases. Consider this:

class Bicycle:
    ...


bicycle = Bicycle()
"""This is my bicycle."""


def compare(b1: Bicycle, b2: Bicycle) -> bool:
    """Compare instances of [`Bicycle`][]."""
    ...

Now imagine that Bicycle is not included in the docs (because the dev adds each object manually and forgot about it, or because they auto-generate the docs and Bicycle doesn't have a docstring, which would exclude it). The Bicycle cross-ref in compare's docstring would therefore fail, and fallback to slugification, resulting in bicycle, but that's not the right object! Instead, we want to fail hard and emit a warning, without falling back. The wrapping backticks here indicate "this exact identifier and nothing else".

@pawamoy pawamoy removed the bug Something isn't working label Dec 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request fund Issue priority can be boosted
Projects
None yet
Development

No branches or pull requests

2 participants