Markdown heading ID containing Unicode `(` is inconsistent with GitHub.
For `#### test(1)`, the ID is `user-content-test-1` in Gitea and `user-content-test1` in GitHub.
The Markdown below supports jumping to the heading on GitHub, but not on Gitea.
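To make the mismatch concrete, here is a minimal sketch of the two behaviors for the single heading reported above. Both function names are hypothetical. `drop_punctuation_id` follows the GitHub-like result reported in this issue. `hyphenate_punctuation_id` is only an assumed rule that happens to reproduce the reported Gitea ID for `test(1)`; it is not taken from Gitea's source.

```python
import re


def drop_punctuation_id(heading: str) -> str:
    """GitHub-like result as reported in this issue: punctuation is dropped."""
    out = []
    for c in heading.strip():
        if c.isalpha() or c.isdigit() or c in "_-":
            out.append(c.lower())
        elif c == " ":
            out.append("-")
    return "".join(out)


def hyphenate_punctuation_id(heading: str) -> str:
    """Assumed rule (hypothetical): each run of other characters becomes one hyphen."""
    return re.sub(r"[^0-9a-z_]+", "-", heading.strip().lower()).strip("-")


heading = "test(1)"
print("user-content-" + drop_punctuation_id(heading))       # user-content-test1
print("user-content-" + hyphenate_punctuation_id(heading))  # user-content-test-1
```

A link written as `#test1` resolves only where the first rule is used, which matches the behavior described in this report.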
changed the title from "Markdown Heading ID contains Unicode “(” is inconsistent with Github" to "anchors don't work when contains unicode `(`" on Aug 8, 2023
changed the title from "anchors don't work when contains unicode `(`" to "anchors don't work when contains punctuation marks just like `(` or `(`" on Aug 8, 2023
I've located the code for this problem. It has to do with the `user-content-*` generation rules, but I'm not sure how GitHub handles this. Can you give me some more examples to help me refine the code?
def cheanValue(anchor_name):
    # Build a heading ID: keep letters, digits, '_' and '-' (lowercased),
    # turn each space into '-', and drop every other character.
    anchor_name = anchor_name.strip()
    ret = []
    for c in anchor_name:
        if c.isalpha() or c.isdigit() or c == '_' or c == '-':
            ret.append(c.lower())
        elif c == ' ':
            ret.append('-')
    return ''.join(ret)


def test():
    # [heading text, expected ID] pairs.
    cases = [
        ["", ""],
        ["test(0)", "test0"],
        ["test!1", "test1"],
        ["test:2", "test2"],
        ["test*3", "test3"],
        ["test!4", "test4"],
        ["test:5", "test5"],
        ["test*6", "test6"],
        ["test:6 a", "test6-a"],
        ["test:6 !b", "test6-b"],
        ["test:ad # df", "testad--df"],
        ["test:ad #23 df 2*/*", "testad-23-df-2"],
        ["test:ad 23 df 2*/*", "testad-23-df-2"],
        ["test:ad # 23 df 2*/*", "testad--23-df-2"],
        ["Anchors in Markdown", "anchors-in-markdown"],
        ["a_b_c", "a_b_c"],
        ["a-b-c", "a-b-c"],
        ["a-b-c----", "a-b-c----"],
        ["test:6a", "test6a"],
        ["test:a6", "testa6"],
        # Repeated spaces are kept as repeated hyphens.
        ["tes a a   a  a", "tes-a-a---a--a"],
        # Leading and trailing spaces are stripped first.
        [" tes a a   a  a ", "tes-a-a---a--a"]]
    for parm, expect in cases:
        actual = cheanValue(parm)
        if actual != expect:
            print("error: parm: %s, expect: %s, actual: %s" % (parm, expect, actual))


test()
Can you help me write some test cases from GitHub to verify that the logic of the cheanValue function is consistent with GitHub? @lazyky @bioinformatist
Activity
CaiCandong commented on Aug 8, 2023
What's the impact of this problem?
bioinformatist commented on Aug 8, 2023
@CaiCandong
Sometimes we need to use section titles like this:
However, the broken anchors pointing to them make the document somewhat difficult to read.
CaiCandong commented on Aug 8, 2023
Thanks for the report, I understand the problem. Besides the fullwidth `（`, does this problem also occur when using the ASCII `(` directly?
lazyky commented on Aug 8, 2023
Yes. I also tested `(`, `!`, `:`, `*`, `:` and `!`. They behave the same as `(`. @CaiCandong
gitea
github
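As a rough illustration of why all of these marks behave alike under a "keep only letters, digits, `_` and `-`" rule, the sketch below checks their Unicode categories with Python's standard `unicodedata` module. The fullwidth code points in the list are an assumption about which variants were tested, since the ASCII and fullwidth forms look identical in the text above. `is_punctuation` is a hypothetical helper.

```python
import unicodedata

# ASCII marks, followed by their fullwidth counterparts
# (assumed variants: U+FF08, U+FF01, U+FF1A, U+FF0A).
marks = ["(", "!", ":", "*", "\uff08", "\uff01", "\uff1a", "\uff0a"]


def is_punctuation(ch: str) -> bool:
    """True if the character's Unicode general category is a punctuation class (P*)."""
    return unicodedata.category(ch).startswith("P")


for ch in marks:
    # None of these are letters or digits, so a filter based on
    # isalpha()/isdigit() drops the ASCII and fullwidth forms alike.
    print(repr(ch), unicodedata.category(ch), ch.isalpha() or ch.isdigit())
```

Because every mark falls in a punctuation category and none counts as a letter or digit, a character filter of this kind does not need a separate case for fullwidth input.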
CaiCandong commented on Aug 8, 2023
I've located the code for this problem. It has to do with the `user-content-*` generation rules, but I'm not sure how GitHub handles this. Can you give me some more examples to help me refine the code?
lazyky commented on Aug 8, 2023
github
Here are the examples on GitHub:

CaiCandong commented on Aug 8, 2023
Can you help me write some test cases from github to verify that the logic of the cheanValue function is consistent with github?
@lazyky @bioinformatist
wxiaoguang commented on Aug 8, 2023
This one is also related: Different behaviors when generating Markdown links for headings containing punctuations and other symbols #19745
Quote the old comment from that issue:
I would say it's more like a `feature` but not a `bug`, because Markdown is not a strict system, and there seems to be no unique standard.
Various characters are removed or replaced during URL generation. For example, the single quote `'` in your demo file, too.
Since there is no standard, there is no right or wrong, as long as it works.
Maybe the answer to the question could be: if there is a definition in CommonMark, then make upstream `goldmark` use the CommonMark standard.
CaiCandong commented on Aug 8, 2023
I understand what you're saying, and it's not a bug. But should we adjust it so that it is consistent with GitHub/VS Code?
wxiaoguang commented on Aug 8, 2023
Just sharing the information from old issues. I am neutral on it.
bioinformatist commented on Aug 8, 2023
Got that. Sure it is not a bug, but it seems that the logic of github is more straightforward and easier to use.
lazyky commented on Aug 8, 2023
Yes. That's right, but `""` will not be rendered.
github
CaiCandong commented on Aug 8, 2023
These test cases are the ones I got from GitHub, so of course they are correct. What I mean is: can you help me add some more test cases?
Make `user-content-*` consistent with github #26388
lazyky commented on Aug 8, 2023
Ok. Below are the examples I tested on GitHub:
Make `user-content-* ` consistent with github (#26388)