feat(cli): add export type for options and adjust config param use #50

Open
wants to merge 3 commits into
base: feat/export-other-type
10 changes: 10 additions & 0 deletions .prettierrc
@@ -0,0 +1,10 @@
{
"bracketSpacing": true,
"jsxSingleQuote": false,
"proseWrap": "never",
"semi": false,
"singleQuote": true,
"tabWidth": 2,
"trailingComma": "all",
"endOfLine": "lf"
}
7 changes: 0 additions & 7 deletions .prettierrc.js

This file was deleted.

65 changes: 38 additions & 27 deletions README.md
@@ -1,6 +1,6 @@
# yuque-dl

Download a Yuque knowledge base as local markdown
Download a Yuque knowledge base as local markdown

![header](https://socialify.git.ci/gxr404/yuque-dl/image?description=1&descriptionEditable=%E8%AF%AD%E9%9B%80%E7%9F%A5%E8%AF%86%E5%BA%93%E4%B8%8B%E8%BD%BD&issues=1&logo=https%3A%2F%2Fraw.githubusercontent.com%2Fgxr404%2Fyuque-dl%2Fmain%2Fdocs%2Fassets%2Flogo.png&name=1&pattern=Circuit%20Board&pulls=1&stargazers=1&theme=Light)

@@ -31,13 +31,18 @@ $ yuque-dl --help
$ yuque-dl server --help

Options:
-d, --dist-dir <dir> Download directory, e.g. -d download (default: download)
-i, --ignore-img Skip downloading images (default: false)
-k, --key <key> Yuque cookie key; defaults to "_yuque_session", but the key differs in some enterprise editions
-t, --token <token> Value for the Yuque cookie key
--toc Whether to output the document TOC (default: false)
-h, --help Display this message
-v, --version Display version number
-d, --dist-dir <dir> Download directory, e.g. -d download (default: download)
-i, --ignore-img Skip downloading images (default: false)
-k, --key <key> Yuque cookie key; defaults to "_yuque_session", but the key differs in some enterprise editions
-t, --token <token> Value for the Yuque cookie key
--toc Whether to output the document TOC (default: false)
-h, --help Display this message
-v, --version Display version number
--docExportType <d> Document export type, one of: md, lake, pdf (default: md)
--boardExportType <b> Board export type, one of: lakeboard, jpg, png (default: lakeboard)
--sheetExportType <s> Sheet export type, one of: lakesheet, xlsx, md (default: lakesheet)
--tableExportType <t> Data table export type, one of: laketable, xlsx (default: laketable)
--ctoken <c> Value of the yuque_ctoken cookie, used for exporting document content; required when an export type is one of Yuque's own lake* formats (default: empty)
```
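
The allowed values listed in the help text above can be checked before any download starts. The following is a minimal sketch, not code from this PR: `validateExportOptions` and `EXPORT_TYPES` are hypothetical names, only explicitly passed values are checked, and it assumes that "Yuque's own formats" means the values starting with `lake`.

```typescript
// Allowed values per export option, copied from the help text.
const EXPORT_TYPES = {
  docExportType: ['md', 'lake', 'pdf'],
  boardExportType: ['lakeboard', 'jpg', 'png'],
  sheetExportType: ['lakesheet', 'xlsx', 'md'],
  tableExportType: ['laketable', 'xlsx'],
} as const

type ExportOptionName = keyof typeof EXPORT_TYPES

interface ExportOptions {
  docExportType?: string
  boardExportType?: string
  sheetExportType?: string
  tableExportType?: string
  ctoken?: string
}

// Hypothetical helper (not part of yuque-dl): returns a list of problems,
// which is empty when the explicitly passed options are valid.
function validateExportOptions(options: ExportOptions): string[] {
  const problems: string[] = []
  let needsCtoken = false
  for (const name of Object.keys(EXPORT_TYPES) as ExportOptionName[]) {
    const value = options[name]
    if (value === undefined) continue // option not passed, nothing to check
    const allowed: readonly string[] = EXPORT_TYPES[name]
    if (!allowed.includes(value)) {
      problems.push(`--${name} must be one of: ${allowed.join(', ')}`)
    }
    // Assumption: a value starting with "lake" is one of Yuque's own formats.
    if (value.startsWith('lake')) needsCtoken = true
  }
  if (needsCtoken && !options.ctoken) {
    problems.push('--ctoken is required when exporting a lake* format')
  }
  return problems
}
```

For example, `validateExportOptions({ docExportType: 'lake' })` reports the missing `--ctoken`, while `validateExportOptions({ docExportType: 'pdf' })` returns an empty list.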

### Start
@@ -55,43 +60,49 @@ yuque-dl "https://www.yuque.com/yuque/thyzgp"

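### Private knowledge bases

To download from a link shared from someone else's private knowledge base, pass a token with `-t`
To download from a link shared from someone else's private knowledge base, pass a token with `-t`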

```bash
yuque-dl "https://www.yuque.com/yuque/thyzgp" -t "abcd..."
```

[How to get the token](./docs/GET_TOEKN.md)
[How to get the token](./docs/GET_TOEKN.md)

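### Enterprise private deployments

Enterprise deployments use their own domain (yellow Yuque logo) that does not end in `yuque.com`, e.g. `https://yuque.antfin.com/r/zone`
Enterprise deployments use their own domain (yellow Yuque logo) that does not end in `yuque.com`, e.g. `https://yuque.antfin.com/r/zone`

In this case the token's cookie key varies and is not necessarily `_yuque_session`; use `-k` to specify the token key and `-t` to specify the token value
In this case the token's cookie key varies and is not necessarily `_yuque_session`; use `-k` to specify the token key and `-t` to specify the token value

As for what the `key` actually is, you have to find it yourself under `Browser DevTools -> Application -> Cookies` 🤔
As for what the `key` actually is, you have to find it yourself under `Browser DevTools -> Application -> Cookies` 🤔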
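
A minimal sketch of how the `-k` and `-t` values might end up in the request headers, modeled on the `getHeaders` logic in this PR's `src/api.ts`. `buildRequestHeaders` and `HeaderInput` are hypothetical names used only for illustration, and the user-agent header is left out.

```typescript
// Default cookie key named in this README.
const DEFAULT_COOKIE_KEY = '_yuque_session'

interface HeaderInput {
  token?: string
  key?: string
  ctoken?: string
  url?: string
}

// Hypothetical helper (not the project's actual function): builds the
// referer, CSRF, and cookie headers from the CLI values.
function buildRequestHeaders(input: HeaderInput): Record<string, string> {
  const { token, key = DEFAULT_COOKIE_KEY, ctoken, url } = input
  const headers: Record<string, string> = {}
  if (url) headers.referer = url
  if (ctoken) headers['x-csrf-token'] = ctoken
  // The cookie is only sent when a token was supplied.
  if (token) headers.cookie = `${key}=${token};`
  return headers
}
```

With `-k "verified_books" -t "abc"` this yields the cookie `verified_books=abc;`; without `-k`, the key falls back to `_yuque_session`.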

### Password-protected public knowledge bases

![public_pwd](https://github.com/gxr404/yuque-dl/assets/17134256/b546a9a3-68f0-4f76-b450-6b16f464db5d)

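⚠️ There are two cases for password-protected public knowledge bases:

- Logged in to Yuque: after entering the password for the knowledge base, use the `_yuque_session` cookie
- Logged in to Yuque: after entering the password for the knowledge base, use the `_yuque_session` cookie

```bash
yuque-dl "url" -t "value of _yuque_session"
```
```bash
yuque-dl "url" -t "value of _yuque_session"
```

- Not logged in to Yuque: after entering the password for the knowledge base, use the `verified_books`/`verified_docs` cookie
- Not logged in to Yuque: after entering the password for the knowledge base, use the `verified_books`/`verified_docs` cookie

```bash
yuque-dl "url" -k "verified_books" -t "value of verified_books"
```
```bash
yuque-dl "url" -k "verified_books" -t "value of verified_books"
```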

## Built-in web server for quick preview
### Exporting lake\* format documents

Use [`vitepress`](https://vitepress.dev/) to quickly start a web server for previewing the downloaded content
```bash
yuque-dl "url" --ctoken "value of yuque_ctoken"
```
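
A rough sketch of the export request that the `--ctoken` value feeds into, based on `getLakeFileExportUrl` in this PR's `src/api.ts`. `buildExportRequest` is a hypothetical helper that only assembles the request and sends nothing; the shape of the server's response is not shown in the PR and is not assumed here.

```typescript
interface ExportRequest {
  url: string
  body: { force: number; type: string }
  headers: Record<string, string>
}

// Hypothetical helper (not part of yuque-dl): assembles, but does not send,
// the POST request used to ask Yuque for an export URL.
function buildExportRequest(
  host: string,
  docId: number,
  type: string,
  ctoken: string,
): ExportRequest {
  if (!ctoken) {
    throw new Error('ctoken is required to export lake* formats')
  }
  return {
    // Endpoint and body fields as they appear in the PR's getLakeFileExportUrl.
    url: `${host}/api/docs/${docId}/export`,
    body: { force: 0, type },
    // The PR sends the ctoken as the x-csrf-token header.
    headers: { 'x-csrf-token': ctoken },
  }
}
```

Keeping the request assembly separate from the network call makes the required pieces (host, document id, export type, ctoken) easy to check in isolation.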

## Built-in web server for quick preview

Use [`vitepress`](https://vitepress.dev/) to quickly start a web server for previewing the downloaded content

```bash
yuque-dl server ./download/知识库/
@@ -108,15 +119,15 @@ yuque-dl server ./download/知识库/
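- [x] Download images locally
- [x] Download shared private knowledge bases
- [x] Convert table-type documents (note: charts embedded in tables are not yet supported)
- [x] Add TOC feature
- [x] Add TOC feature
- [x] Add tests
- [x] Add attachment download
- [ ] Support other document types? 🤔
- [ ] Package as a standalone executable 🤔

## Common errors

1. The token may contain special characters that cause argument parsing errors
1. The token may contain special characters that cause argument parsing errors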

```bash
yuque-dl "https://www.yuque.com/yuque/thyzgp" -t "-a123"
@@ -129,9 +140,9 @@ yuque-dl [ERROR]: Unknown option `-1`
yuque-dl "https://www.yuque.com/yuque/thyzgp" -t="-a123"
```

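2. Attachment download fails; a login token must be set
2. Attachment download fails; a login token must be set

Downloading attachments requires the user's login token; even for a fully public knowledge base, attachments may require it
Downloading attachments requires the user's login token; even for a fully public knowledge base, attachments may require it

Viewing an attachment in a fully public knowledge base while not logged in: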

93 changes: 64 additions & 29 deletions src/api.ts
@@ -1,110 +1,145 @@
import { env } from 'node:process'
import axios from 'axios'
import { randUserAgent } from './utils'
import { DEFAULT_COOKIE_KEY, DEFAULT_DOMAIN } from './constant'
import { DEFAULT_DOMAIN } from './constant'
import { getConfig } from './config'

import type {
ArticleResponse,
KnowledgeBase,
GetHeaderParams,
IReqHeader,
TGetKnowledgeBaseInfo,
TGetMdData
TGetMdData,
} from './types'
import type { AxiosRequestConfig } from 'axios'

function getHeaders(params: GetHeaderParams): IReqHeader {
const { key = DEFAULT_COOKIE_KEY, token } = params
function getHeaders(): IReqHeader {
const { key, token, url, ctoken } = getConfig()
const headers: IReqHeader = {
'user-agent': randUserAgent({
browser: 'chrome',
device: 'desktop'
})
device: 'desktop',
}),
referer: url,
}
if (ctoken) headers['x-csrf-token'] = ctoken
if (token) headers.cookie = `${key}=${token};`
return headers
}

export function genCommonOptions(params: GetHeaderParams): AxiosRequestConfig {
export function genCommonOptions(): AxiosRequestConfig {
const config: AxiosRequestConfig = {
headers: getHeaders(params),
headers: getHeaders(),
beforeRedirect: (options) => {
// Free, non-enterprise Yuque spaces redirect, e.g. www.yuque.com -> gxr404.yuque.com
// axios does not carry the cookie over when it follows the redirect automatically
options.headers = {
...(options?.headers || {}),
...getHeaders(params)
...getHeaders(),
}
}
},
}
if (env.NODE_ENV === 'test') {
config.proxy = false
}
return config
}


/** Fetch knowledge base info */
export const getKnowledgeBaseInfo: TGetKnowledgeBaseInfo = (url, headerParams) => {
export const getKnowledgeBaseInfo: TGetKnowledgeBaseInfo = (url) => {
const knowledgeBaseReg = /decodeURIComponent\("(.+)"\)\);/m
return axios.get<string>(url, genCommonOptions(headerParams))
.then(({data = '', status}) => {
return axios
.get<string>(url, genCommonOptions())
.then(({ data = '', status }) => {
if (status === 200) return data
return ''
})
.then(html => {
.then((html) => {
const data = knowledgeBaseReg.exec(html) ?? ''
if (!data[1]) return {}
const jsonData: KnowledgeBase.Response = JSON.parse(decodeURIComponent(data[1]))
const jsonData: KnowledgeBase.Response = JSON.parse(
decodeURIComponent(data[1]),
)
if (!jsonData.book) return {}
const info = {
bookId: jsonData.book.id,
bookSlug: jsonData.book.slug,
tocList: jsonData.book.toc || [],
bookName: jsonData.book.name || '',
bookDesc: jsonData.book.description || '',
host: jsonData.space?.host || DEFAULT_DOMAIN,
imageServiceDomains: jsonData.imageServiceDomains || []
host: jsonData.defaultSpaceHost || DEFAULT_DOMAIN,
imageServiceDomains: jsonData.imageServiceDomains || [],
}
return info
}).catch((e) => {
})
.catch((e) => {
// console.log(e.message)
const errMsg = e?.message ?? ''
if (!errMsg) throw new Error('unknown error')
const netErrInfoList = [
'getaddrinfo ENOTFOUND',
'read ECONNRESET',
'Client network socket disconnected before secure TLS connection was established'
'Client network socket disconnected before secure TLS connection was established',
]
const isNetError = netErrInfoList.some(netErrMsg => errMsg.startsWith(netErrMsg))
const isNetError = netErrInfoList.some((netErrMsg) =>
errMsg.startsWith(netErrMsg),
)
if (isNetError) {
throw new Error('请检查网络(是否正常联网/是否开启了代理软件)')
}
throw new Error(errMsg)
})
}


export const getDocsMdData: TGetMdData = (params, isMd = true) => {
const { articleUrl, bookId, token, key, host = DEFAULT_DOMAIN } = params
const { articleUrl, bookId } = params
const { host } = getConfig()
let apiUrl = `${host}/api/docs/${articleUrl}`
const queryParams: any = {
'book_id': String(bookId),
'merge_dynamic_data': String(false)
book_id: String(bookId),
merge_dynamic_data: String(false),
// plain=false
// linebreak=true
// anchor=true
}
if (isMd) queryParams.mode = 'markdown'
const query = new URLSearchParams(queryParams).toString()
apiUrl = `${apiUrl}?${query}`
return axios.get<ArticleResponse.RootObject>(apiUrl, genCommonOptions({token, key}))
.then(({data, status}) => {
return axios
.get<ArticleResponse.RootObject>(apiUrl, genCommonOptions())
.then(({ data, status }) => {
const res = {
apiUrl,
httpStatus: status,
response: data
response: data,
}
return res
})
}

export const getLakeFileExportUrl = ({
id,
type,
}: {
id: number
type: string
}) => {
const { host } = getConfig()
return axios
.post(
`${host}/api/docs/${id}/export`,
{
force: 0,
type,
},
{
...genCommonOptions(),
},
)
.then(({ data }) => {
return data.data
})
.catch((e) => {
throw new Error(`get file export url fail: ${e}`)
})
}
28 changes: 24 additions & 4 deletions src/cli.ts
@@ -1,4 +1,3 @@

import { readFileSync } from 'node:fs'
import { cac } from 'cac'
import semver from 'semver'
@@ -7,6 +6,7 @@ import { main } from './index'
import { logger } from './utils'
import type { ICliOptions } from './types'
import { runServer } from './server'
import { setConfig } from './config'

const cli = cac('yuque-dl')

@@ -33,15 +33,35 @@ cli
default: 'download',
})
.option('-i, --ignore-img', '忽略图片不下载', {
default: false
default: false,
})
.option('-k, --key <key>', '语雀的cookie key, 默认是 "_yuque_session", 在某些企业版本中 key 不一样')
.option(
'-k, --key <key>',
'语雀的cookie key, 默认是 "_yuque_session", 在某些企业版本中 key 不一样',
)
.option('-t, --token <token>', '语雀的cookie key 对应的值')
.option('--toc', '是否输出文档toc目录', {
default: false
default: false,
})
// These options take a value, so each name needs a <placeholder>;
// cac treats an option declared without brackets as a boolean flag.
.option('--docExportType <type>', '输出 doc 文档类型, 默认是 md', {
default: 'md',
})
.option('--sheetExportType <type>', '输出 sheet 文档类型, 默认是 lakesheet', {
default: 'lakesheet',
})
.option('--boardExportType <type>', '输出 board 文档类型, 默认是 lakeboard', {
default: 'lakeboard',
})
.option('--tableExportType <type>', '输出 table 文档类型, 默认是 laketable', {
default: 'laketable',
})
.option(
'--ctoken <ctoken>',
'语雀授权码,用于模拟用户进行导入导出操作,如果需要导出 lake* 类型文件,此参数必传',
)
.action(async (url: string, options: ICliOptions) => {
try {
setConfig({ ...options, url })
await main(url, options)
process.exit(0)
} catch (err) {