Support fetching all files and/or with filename pattern matching to use as documents #37

Open
galligan opened this issue Apr 20, 2022 · 10 comments
Assignees
Labels
Feature New feature or request

Comments

@galligan

galligan commented Apr 20, 2022

Problem

I want to populate a Docusaurus instance with content from multiple distinct repos. The content in those “remote” repos (remote relative to the Docusaurus instance) may change, so the list of files can’t be explicitly defined ahead of time.

Proposed solution

It would be hugely beneficial if I could simply say “all” files within a repo are downloaded, or if I could use filename pattern matching to filter a subset of documents to download.

In practice, perhaps this means that within documents I could pass a non-array string such as:

  • all: this would fetch all files in a given directory
  • faq*: this would fetch all files that begin with faq
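To make the two proposed strings concrete, here is a minimal sketch of how they could be matched against a list of file names. The helper names `makeMatcher` and `selectDocuments` are hypothetical and not part of the plugin; only `all` and a `*` wildcard are handled.

```javascript
// Hypothetical helper (not plugin API): turn "all" or a simple "*" glob
// into a function that tests a single file name.
function makeMatcher(pattern) {
  if (pattern === "all") {
    return () => true;
  }
  // Escape regex metacharacters except "*", then let "*" match any run of characters.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
  return (fileName) => regex.test(fileName);
}

// Keep only the file names that match the pattern.
function selectDocuments(fileNames, pattern) {
  return fileNames.filter(makeMatcher(pattern));
}

const files = ["faq.md", "faq-billing.md", "intro.md"];
console.log(selectDocuments(files, "faq*")); // [ 'faq.md', 'faq-billing.md' ]
console.log(selectDocuments(files, "all")); // [ 'faq.md', 'faq-billing.md', 'intro.md' ]
```

The list of file names still has to come from somewhere, since a raw file host cannot be asked for a directory listing; that part depends on the provider.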

Other thoughts

What I’m not clear on is whether or not it would be useful to define specifics around subcategories. Perhaps this would mean that for a given directory, I could look for all files that begin with faq, traversing subdirectories, ignoring the files that don’t begin with faq, and then downloading the results, preserving the subdirectory structure along the way.
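The subdirectory idea could look roughly like the sketch below: match on the base name at any depth and keep each file's path relative to the chosen directory, so the structure can be recreated on download. `selectByBaseName` is a hypothetical name, and whether the plugin would write nested paths into matching subdirectories is an assumption here.

```javascript
// Hypothetical helper: keep files under `rootDir` whose base name starts
// with `prefix`, at any depth, and return paths relative to `rootDir`.
function selectByBaseName(paths, rootDir, prefix) {
  const root = rootDir.endsWith("/") ? rootDir : rootDir + "/";
  return paths
    .filter((p) => p.startsWith(root))
    .map((p) => p.slice(root.length))
    .filter((relativePath) => {
      // The base name is the last path segment.
      const baseName = relativePath.split("/").pop();
      return baseName.startsWith(prefix);
    });
}

const paths = [
  "docs/faq.md",
  "docs/billing/faq-refunds.md",
  "docs/billing/overview.md",
  "src/faq.js",
];
console.log(selectByBaseName(paths, "docs", "faq"));
// [ 'faq.md', 'billing/faq-refunds.md' ]
```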


Thanks for your consideration!

@RDIL
Member

RDIL commented Apr 20, 2022

Hey,

I totally understand your use-case, and it makes sense, but I don't really know how to implement this in a way that works nicely and easily.

The problem is that the plugin is currently designed to work on almost any platform or provider, meaning that GitHub is supported just as well as GitLab, and so on.

Where are you fetching your content from? Based on that, you can likely use the documents parameter to pass an async function that fetches the names of all the things needing to be downloaded.

I can see why this would be helpful, I just don't want to implement something without making sure it works for most users, won't break, etc.

@galligan
Author

Where are you fetching your content from?

So in this case, the content would be pulled from other GitHub repositories.

The reason for this is that we want to keep clean and distinct repositories that each serve a purpose, have their own contribution permissioning, and branch protection.

@RDIL
Member

RDIL commented Apr 20, 2022

That makes sense. I think that for your use-case, you can just use Octokit with the GitHub trees API to make a list of all the files, then filter it down to the ones you want, and return it from the documents parameter as a function. It requires a bit of scripting but ultimately it will very likely work.
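A rough sketch of that approach follows. To stay dependency-free it calls the GitHub trees REST endpoint with plain `fetch` rather than Octokit, which wraps the same endpoint. The names `pickMarkdown` and `listRemoteDocs`, and the `owner`/`repo`/`branch`/`dir` values, are placeholders.

```javascript
// Pure step: from a trees API response, keep Markdown files ("blob" entries)
// under `dir` and return their paths relative to `dir`.
function pickMarkdown(tree, dir) {
  const prefix = dir.endsWith("/") ? dir : dir + "/";
  return tree
    .filter((entry) => entry.type === "blob" && entry.path.startsWith(prefix))
    .filter((entry) => /\.mdx?$/.test(entry.path))
    .map((entry) => entry.path.slice(prefix.length));
}

// Network step: `fetchImpl` defaults to the global fetch (Node 18+ or a browser)
// and can be swapped out for testing.
async function listRemoteDocs({ owner, repo, branch, dir }, fetchImpl = fetch) {
  const url = `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=1`;
  const resp = await fetchImpl(url, {
    headers: { Accept: "application/vnd.github+json" },
  });
  if (!resp.ok) {
    throw new Error(`GitHub trees request failed: ${resp.status}`);
  }
  const body = await resp.json();
  return pickMarkdown(body.tree, dir);
}

// Assumed usage in the plugin options, following the suggestion above:
// documents: () => listRemoteDocs({ owner: "my-org", repo: "my-docs", branch: "main", dir: "docs" }),
```

Unauthenticated requests are rate limited and only see public repositories, so a private repo would need an `Authorization` header. Very large repositories can also come back with `truncated: true` in the response, which this sketch does not handle.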

@RDIL added the Feature (New feature or request) and question (Further information is requested) labels on Apr 21, 2022
@sw-tracker

I would also love to have this feature. How can I fetch all markdown files? I don't want to have to list every markdown file to fetch each time someone adds a new one to a remote repo.

@RDIL
Member

RDIL commented May 18, 2022

I can see that this use case is one that multiple people want to see as official functionality, so I will add it to my to-do list.

@galligan
Author

galligan commented May 18, 2022

Thanks for that @RDIL. More than happy to provide feedback as you get closer to putting this together.

In the meantime I've been working with @1amcode to build this support in using your plugin, so perhaps he might also have some feedback.

@1amcode
Contributor

1amcode commented May 18, 2022

yeah @RDIL, I got the support for that ready. Will share the code soon, so that you can integrate back into your plugin – as you like.

@1amcode
Contributor

1amcode commented May 19, 2022

@sw-tracker, @galligan I published the code under https://github.com/1amcode/docusaurus-lib-list-remote
@RDIL Take a look, and let me know if you have any idea on how to integrate it better with your plugin – whatever makes most sense to you. We can leave it as a standalone project, or integrate it closely into docusaurus-plugin-remote-content. Let me know.

@RDIL
Member

RDIL commented May 29, 2022

@1amcode I'm studying for final exams this week, so I can't write any code for a bit, but I think the best way to do it would be to integrate it as a GitHub "provider" in the plugin, probably with some shorthand config syntax to activate it. This would allow providers for other platforms to be added in the future, which I believe is important. Hopefully I can start working on this next week though.
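Purely as an illustration of what a "provider" with a shorthand syntax might look like: everything in this sketch is hypothetical, including the `github:owner/repo#branch` string format, the `providers` registry, and `parseShorthand`. None of it is existing plugin API.

```javascript
// Hypothetical registry: each provider knows how to build a listing URL
// and a base URL for raw file downloads.
const providers = {
  github: {
    listUrl: ({ owner, repo, branch }) =>
      `https://api.github.com/repos/${owner}/${repo}/git/trees/${branch}?recursive=1`,
    rawBaseUrl: ({ owner, repo, branch }) =>
      `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/`,
  },
};

// Hypothetical shorthand: "<provider>:<owner>/<repo>#<branch>".
function parseShorthand(source) {
  const match = /^([a-z]+):([^/]+)\/([^#]+)#(.+)$/.exec(source);
  if (!match) {
    throw new Error(`Unrecognized source: ${source}`);
  }
  const [, providerName, owner, repo, branch] = match;
  if (!Object.prototype.hasOwnProperty.call(providers, providerName)) {
    throw new Error(`Unknown provider: ${providerName}`);
  }
  const provider = providers[providerName];
  const target = { owner, repo, branch };
  return {
    listUrl: provider.listUrl(target),
    rawBaseUrl: provider.rawBaseUrl(target),
  };
}

console.log(parseShorthand("github:rajatbarman/tc-docs#main").rawBaseUrl);
// https://raw.githubusercontent.com/rajatbarman/tc-docs/main/
```

Adding another platform would then mean adding one more entry to the registry, which is the extensibility the comment above is after.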

@RDIL removed the question (Further information is requested) label on May 29, 2022
@RDIL self-assigned this on Jun 8, 2022
@rajatbarman

rajatbarman commented Dec 5, 2023

The function below can be used for this purpose; pass it the author, repo, and branch parameters.
It works for docs structured like this (example):

repo/
  docs/
    docs_folder_1/
      intro.md
      guide.mdx
    docs_folder_2/
      more_docs.md

This function then returns a configuration array -

[
    [
        "docusaurus-plugin-remote-content",
        {
            "name": "docs_folder_1_docs",
            "sourceBaseUrl": "https://raw.githubusercontent.com/rajatbarman/tc-docs/main/docs/docs_folder_1",
            "outDir": "docs/docs_folder_1",
            "documents": [
                "intro.md",
                "guide.mdx"
            ]
        }
    ],
    [
        "docusaurus-plugin-remote-content",
        {
            "name": "docs_folder_2_docs",
            "sourceBaseUrl": "https://raw.githubusercontent.com/rajatbarman/tc-docs/main/docs/docs_folder_2",
            "outDir": "docs/docs_folder_2",
            "documents": [
                "more_docs.md"
            ]
        }
    ]
]

The function -

// Builds one docusaurus-plugin-remote-content entry per folder found under docs/.
// Assumes the flat layout shown above (no folders nested inside those folders).
function fetchRemoteContentConfig(author = "rajatbarman", repo = "tc-docs", branch = "main") {
  return fetch(`https://api.github.com/repos/${author}/${repo}/git/trees/${branch}?recursive=1`, {
    headers: {
      // Replace the placeholder with your own personal access token (PAT).
      Authorization: "Bearer <your PAT>",
    },
  })
    .then((resp) => {
      // Fail early on API errors (bad token, rate limit, unknown repo or branch).
      if (!resp.ok) {
        throw new Error(`GitHub API request failed with status ${resp.status}`)
      }
      return resp.json()
    })
    .then((resp) => {
      const config = []
      // Everything under docs/: folders have type "tree", files have type "blob".
      const docs = resp.tree.filter((doc) => doc.path.startsWith("docs/"))
      const folders = docs.filter((doc) => doc.type === "tree")
      folders.forEach((folder) => {
        const folderName = folder.path.replace("docs/", "")
        config.push([
          "docusaurus-plugin-remote-content",
          {
            name: `${folderName}_docs`, // used by CLI, must be path safe
            sourceBaseUrl: `https://raw.githubusercontent.com/${author}/${repo}/${branch}/docs/${folderName}`, // the base url for the markdown (gets prepended to all of the documents when fetching)
            outDir: `docs/${folderName}`, // the base directory to output to.
            documents: docs
              .filter((doc) => {
                return (
                  doc.type === "blob" &&
                  // The trailing slash keeps docs/foo from also matching docs/foo_bar.
                  doc.path.startsWith(`docs/${folderName}/`) &&
                  (doc.path.endsWith(".md") || doc.path.endsWith(".mdx"))
                )
              })
              .map((doc) => {
                return encodeURIComponent(doc.path.replace(`docs/${folderName}/`, ""))
              }),
          },
        ])
      })
      return config
    })
    .catch((err) => {
      // Log and fall back to no remote plugins so spreading the result in the site config still works.
      console.error(err)
      return []
    })
}

You will have to write your Docusaurus config as an async function, invoke this script first, and use the config it returns.
Something like this -

export default async function createConfigAsync() {
  const remoteContentConfig = await fetchRemoteContentConfig()
  return {
    title: 'Docusaurus',
    url: 'https://docusaurus.io',
    // your site config ...
    plugins: [...remoteContentConfig, ...otherPlugins] // otherPlugins: an array holding your other plugin entries
  };
}
