Alphanumerical lists are sortable by alphabet and number, obviously, but if you have a list where each entry begins with a different punctuation mark (or any other kind of non-alphanumeric character), is there a similar standardised ordering method for them?

I imagine, for example, that a comma will come before whatever this is: ¦

I just tested an A-Z sort in Google Sheets where each cell was a different punctuation mark, and it seemed to rearrange what I’d entered into some sort of order, but is this order shared universally? Is there a global Unicode-compliant ordering method everyone uses?

Cheers!

  • fubo@lemmy.world
    3 months ago

    If your input is limited to ASCII, sure: every ASCII character has a fixed numeric code, and you can sort by that.
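As a rough illustration of the ASCII-only case, here is a minimal Python sketch that sorts punctuation by code point, which for ASCII characters is simply their position in the ASCII table. The helper name `sort_ascii` and the sample input are made up for this example.

```python
import string

def sort_ascii(chars):
    """Sort single characters by code point (hypothetical helper).

    For characters inside ASCII, the code point is the ASCII table position.
    """
    return sorted(chars, key=ord)

# Sample input (an assumption for illustration): ASCII punctuation, reversed.
marks = list(reversed(string.punctuation))
ordered = sort_ascii(marks)
print("".join(ordered))
```

This only pins down an order for the 32 punctuation marks that ASCII defines; it says nothing about characters outside that range.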

    But ASCII is only a 7-bit standard, and only supports those characters needed by American English computer users in the 1960s. Lots of characters you might see in “plain text” are not part of ASCII, including all accented characters, all non-Latin alphabets, and many common symbols and punctuation marks, such as these: £€¢©™°
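To check which characters fall outside the 7-bit range, a small sketch like the following may help. The function name `outside_ascii` is hypothetical, and the sample string just combines the symbols above with the comma and ¦ from the question.

```python
def outside_ascii(text):
    """Return the characters whose code point does not fit in 7 bits (hypothetical helper)."""
    return [c for c in text if ord(c) > 127]

# Sample string (an assumption for illustration): one ASCII comma, the rest non-ASCII.
sample = ",£€¢©™°¦"
print(outside_ascii(sample))
```

Only the comma survives the filter, so an ASCII-based ordering has nothing to say about the rest.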

    (Yes, you could get accented characters in the pre-Unicode days using 8-bit “extended ASCII”, e.g. IBM/Windows code pages. However, those are not really ASCII and they will break if the text is interpreted as the wrong code page.)
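The code page problem can be sketched with Python's built-in codecs. This assumes the `cp1252` (Windows Western European) and `cp437` (original IBM PC) codecs as stand-ins for "IBM/Windows code pages"; the variable names are illustrative only.

```python
# Encode a pound sign the way a Windows-1252 system would store it.
raw = "£".encode("cp1252")

# Decoding with the same code page round-trips correctly.
as_cp1252 = raw.decode("cp1252")

# Decoding the same byte with a different code page yields a different character.
as_cp437 = raw.decode("cp437")

# Strict ASCII cannot decode the byte at all, since it is above 0x7F.
try:
    raw.decode("ascii")
    is_ascii = True
except UnicodeDecodeError:
    is_ascii = False

print(raw, as_cp1252, as_cp437, is_ascii)
```

The same byte reads as £ under one code page and ú under the other, which is why any sort order built on 8-bit values depends on which code page you assume.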

    Unicode collation (the Unicode Collation Algorithm) is the Right Thing today.