Bug: getCanonicalPageId does not support non-latin page titles #422

Closed
marharyta opened this issue Jan 21, 2023 · 1 comment

Comments

@marharyta

Description

Unfortunately, I could not create a PR: the repo seems to require permission for me to push a new branch. Nevertheless, here is the problem description and my proposed solution:
Screenshot 2023-01-21 at 18 40 14

Problem: getCanonicalPageId does not support non-latin page titles

Issue:

I am using [Notion.so](http://Notion.so) to run the [FinUA.org](http://FinUA.org) website, and it is currently deployed with [super.so](http://super.so). I have been using the nextjs-notion-starter-kit project for it (thank you).

After I deployed the project to Vercel, I noticed quite a few browser warnings about the page, caused by the generated page URLs (they looked broken).
Screenshot 2023-01-21 at 18 41 25

The page behind it:
Screenshot 2023-01-21 at 18 41 37

Moreover, this page also got the same generated URL, /-, despite being a separate page, and clicking on it led to the first page.

Screenshot 2023-01-21 at 18 42 13

I investigated, and the problem seems to be in the module https://github.com/transitive-bullshit/nextjs-notion-starter-kit/blob/main/lib/get-canonical-page-id.ts

import { ExtendedRecordMap } from 'notion-types'
import {
  getCanonicalPageId as getCanonicalPageIdImpl,
  parsePageId
} from 'notion-utils'

import { inversePageUrlOverrides } from './config'

export function getCanonicalPageId(
  pageId: string,
  recordMap: ExtendedRecordMap,
  { uuid = true }: { uuid?: boolean } = {}
): string | null {
  const cleanPageId = parsePageId(pageId, { uuid: false })
  if (!cleanPageId) {
    return null
  }

  const override = inversePageUrlOverrides[cleanPageId]
  if (override) {
    return override
  } else {
    // PROBLEM: this call seems to be the source of the issue
    return getCanonicalPageIdImpl(pageId, recordMap, {
      uuid
    })
  }
}

I went to the notion-utils package, https://github.com/NotionX/react-notion-x/tree/master/packages/notion-utils

and copied the https://github.com/NotionX/react-notion-x/blob/master/packages/notion-utils/src/get-canonical-page-id.ts module. The problem seems to be in the getCanonicalPageId function: the call normalizeTitle(getBlockTitle(block, recordMap)) only seems to work for Latin characters.

I pulled out the normalizeTitle function, and yes, that seems to be the case. Its character class keeps only ASCII letters, digits, hyphens, and the CJK range \u4e00-\u9fa5, so Cyrillic letters are stripped:

function normalizeTitle(title) {
  return (title || '')
    .replace(/ /g, '-')
    .replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, '')
    .replace(/--/g, '-')
    .replace(/-$/, '')
    .replace(/^-/, '')
    .trim()
    .toLowerCase()
}

const eng = normalizeTitle('Naapurin Maalaiskana (NMK), in Lieto, in Turku area');
const ukr = normalizeTitle('Робота помічника з обслуговування контейнерів');
const ukr1 = normalizeTitle('Ищем литейщиков в Карккила, Финляндия, для обработки изделий в металлургической промышленности');
console.log('test', eng, ukr, ukr1)

// "test"
// "naapurin-maalaiskana-nmk-in-lieto-in-turku-area"
// ""
// "---"
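
To make the empty result easier to follow, here is a minimal sketch that records the string after each stage of the pipeline quoted above. The helper name traceNormalizeTitle is hypothetical and exists only for this illustration; the regular expressions are copied from normalizeTitle as shown in this issue.

```typescript
// Hypothetical tracing helper: applies the same replacements as the
// normalizeTitle function quoted above and records every intermediate string.
function traceNormalizeTitle(title: string): string[] {
  const steps: Array<(s: string) => string> = [
    (s) => s.replace(/ /g, '-'), // spaces become hyphens
    (s) => s.replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, ''), // Cyrillic is removed here
    (s) => s.replace(/--/g, '-'), // each pair of hyphens collapses once
    (s) => s.replace(/-$/, ''), // one trailing hyphen removed
    (s) => s.replace(/^-/, ''), // one leading hyphen removed
    (s) => s.trim().toLowerCase()
  ]

  const stages: string[] = []
  let current = title || ''
  for (const step of steps) {
    current = step(current)
    stages.push(current)
  }
  return stages
}

const stages = traceNormalizeTitle('Робота помічника з обслуговування контейнерів')
console.log(stages)
// [ 'Робота-помічника-з-обслуговування-контейнерів', '----', '--', '-', '', '' ]
```

After the second stage only the hyphens that replaced the spaces are left, so the final slug depends on the number of words in the title rather than on its text. That would explain why unrelated Cyrillic pages can end up with the same hyphen-only URL.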

Solution:

The fix that worked for me was simply replacing normalizeTitle(getBlockTitle(block, recordMap)) with slugify from the transliteration npm package.
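
For comparison, a dependency-free alternative would be to keep non-Latin letters in the slug instead of transliterating them. The sketch below is not the fix described above and not the notion-utils implementation; normalizeTitleUnicode and DISALLOWED are hypothetical names, and the approach assumes a runtime that supports Unicode property escapes in regular expressions.

```typescript
// Hypothetical Unicode-aware variant of normalizeTitle (illustration only).
// The pattern is built with the RegExp constructor so that the \p{...}
// escapes are interpreted at runtime: keep letters, combining marks,
// digits, and hyphens from any script, and drop everything else.
const DISALLOWED = new RegExp('[^\\p{L}\\p{M}\\p{N}-]', 'gu')

function normalizeTitleUnicode(title?: string | null): string {
  return (title || '')
    .trim()
    .replace(/\s+/g, '-') // runs of whitespace become one hyphen
    .replace(DISALLOWED, '') // strip punctuation and symbols
    .replace(/-{2,}/g, '-') // collapse any run of hyphens
    .replace(/^-+|-+$/g, '') // trim hyphens at both ends
    .toLowerCase()
}

console.log(normalizeTitleUnicode('Робота помічника з обслуговування контейнерів'))
// робота-помічника-з-обслуговування-контейнерів
console.log(normalizeTitleUnicode('Naapurin Maalaiskana (NMK), in Lieto, in Turku area'))
// naapurin-maalaiskana-nmk-in-lieto-in-turku-area
```

The trade-off is that these slugs contain non-ASCII characters, which get percent-encoded in URLs, whereas a transliterating slugify produces ASCII-only slugs.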

Notion Test Page ID

701245d6db8c413689d180e87269ee56

@marharyta

Created a PR, #423; closing the issue.
