Unicode char comparison puzzle
05-30-2018, 12:50 PM
Post: #1
Unicode char comparison puzzle
I'd like to print all the juicy non-'⛾' Unicode characters between #2600h and #26FFh, as listed at http://www.unicode-symbol.com/block/Misc_Symbols.html
Code:
On the Prime, most characters print as '⛾', so I tried to filter those out with the conditional expression
Code:
IF CHAR(cp) ≠ "⛾" THEN
When I pasted the terminal output of the Prime code above, from the Prime emulator on my iPad, into this thread, the result was surprising: more of the characters are rendered, and there are far fewer '⛾' characters.
☀☁☂☃☄★☆☇☈☉☊☋☌☍☎☏☐☑☒☓☔☕☖☗☘☙☚☛☜☝☞☟☠☡☢☣☤☥☦☧☨☩☪☫☬☭☮☯☰☱☲☳☴☵☶☷☸☹☺☻☼☽☾☿♀♁♂♃♄♅♆♇♈♉♊♋♌♍♎♏♐♑♒♓♔♕♖♗♘♙♚♛♜♝♞♟♠♡♢♣♤♥♦♧♨♩♪♫♬♭♮♯♰♱♲♳♴♵♶♷♸♹♺♻♼♽♾♿⚀⚁⚂⚃⚄⚅⚆⚇⚈⚉⚊⚋⚌⚍⚎⚏⚐⚑⚒⚓⚔⚕⚖⚗⚘⚙⚚⚛⚜⚝⚞⚟⚠⚡⚢⚣⚤⚥⚦⚧⚨⚩⚪⚫⚬⚭⚮⚯⚰⚱⚲⚳⚴⚵⚶⚷⚸⚹⚺⚻⚼⚽⚾⚿⛀⛁⛂⛃⛄⛅⛆⛇⛈⛉⛊⛋⛌⛍⛎⛏⛐⛑⛒⛓⛔⛕⛖⛗⛘⛙⛚⛛⛜⛝⛞⛟⛠⛡⛢⛣⛤⛥⛦⛧⛨⛩⛪⛫⛬⛭⛮⛯⛰⛱⛲⛳⛴⛵⛶⛷⛸⛹⛺⛻⛼⛽⛾⛿
So it looks like the Prime cannot visually represent / print out all Unicode characters, only a limited subset of them.
Question 2: does anyone know the limits of the Prime's Unicode support?
05-30-2018, 01:40 PM
Post: #2
RE: Unicode char comparison puzzle
(05-30-2018 12:50 PM)tcab Wrote: Thus it looks like the Prime cannot visually represent / print out all unicode characters, only a limited subset of them
Devices don't decide which Unicode symbols they can print; that's what fonts are for. Different fonts have glyphs for different sets of characters. I don't know of any font that has "all" Unicode characters. Some are more complete than others, but each covers only a fraction, depending on its intended use.
Back to your problem: code-wise, a Unicode character doesn't change just because the font has no glyph to display it, so there is no way to filter the text by comparing characters. What you can do is take the font in which you want to display the text, make yourself a catalog of which characters have a glyph and which ones will display the default one, and use that table to do the filtering. Basically you need one table per font (or a program to "extract" that table from any font), plus your filter code using those tables.
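The per-font table idea can be sketched roughly as follows. This is Python rather than Prime code, and the `SUPPORTED` set is a made-up placeholder, not the real coverage of any Prime font; `renderable` is likewise just an illustrative name.

```python
# Hypothetical coverage table: code points for which the font has a real glyph.
# In practice this would be extracted from the font file; the values below
# are invented for illustration only.
SUPPORTED = {0x2600, 0x2601, 0x2605, 0x2660, 0x2663, 0x2665, 0x2666}


def renderable(first, last, table):
    """Return the characters in [first, last] whose code point is in table."""
    return [chr(cp) for cp in range(first, last + 1) if cp in table]


if __name__ == "__main__":
    # Print only the characters the (hypothetical) table says have a glyph.
    print("".join(renderable(0x2600, 0x26FF, SUPPORTED)))
```

With one such table per font, the filter itself stays the same and only the table changes.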
05-30-2018, 03:34 PM
Post: #3
RE: Unicode char comparison puzzle
The font used by the Prime is called Prime Sans, and it's a modified version of Google's free Droid Sans font. The ttf file is in the Fonts folder of the HP Prime Virtual Calculator installation, so you can look at that with your favorite font analysis tool to see what characters it supports. A tool I use shows 51280 different characters, but it doesn't seem to have any of the emoji characters.
05-31-2018, 01:24 AM
Post: #4
RE: Unicode char comparison puzzle
Any chance the font set of the physical Prime can be altered, or the set updated via a loadable code page? I haven't seen anything in the documentation for that, so I presume it can't be done. Poking around in the bin file would probably come with the usual Void The Warranty admonition. I'd hate to do my own font creation and placement via graphics commands just to get a few unsupported characters!
~Mark
Remember kids, "In a democracy, you get the government you deserve."
05-31-2018, 01:34 AM
Post: #5
RE: Unicode char comparison puzzle
Thanks for the responses.
So the problem is that I want to filter out all the Unicode chars/code points that cannot be rendered. I just want to see all the "juicy" characters that the Prime is capable of rendering, and none of the noise, viz.
I don't want:
I want:
I stumbled upon this post about comparing the bitmaps of characters, and managed to write the Prime code to do it. My new character comparison function renders the character as a bitmap, which I then convert into a list of 1s and 0s. I can then use that list to compare characters as rendered.
Code:
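As a rough off-calculator illustration of the same idea (Python, not the Prime program itself): compare characters by their rendered pixels instead of by code point. The `GLYPHS` table, the tiny 3x3 bitmaps and the `render` helper are all hypothetical stand-ins for drawing a character into a grob and reading its pixels back, and #26FEh is simply assumed to be a code point with no glyph.

```python
# Hypothetical 3x3 bitmaps (row by row) standing in for what a font would draw.
PLACEHOLDER = [1, 1, 1,
               1, 0, 1,
               1, 1, 1]
GLYPHS = {
    0x2600: [0, 1, 0, 1, 1, 1, 0, 1, 0],
    0x2605: [1, 0, 1, 0, 1, 0, 1, 0, 1],
}


def render(cp):
    """Return the pixel list for cp; missing glyphs fall back to the placeholder."""
    return GLYPHS.get(cp, PLACEHOLDER)


def has_glyph(cp, missing_cp=0x26FE):
    """True if cp renders differently from a code point assumed to have no glyph."""
    return render(cp) != render(missing_cp)


def juicy(first, last):
    """Code points in [first, last] that do not render as the placeholder."""
    return [cp for cp in range(first, last + 1) if has_glyph(cp)]
```

The comparison is between whole pixel lists, so two different code points that both fall back to the placeholder compare as equal, which is exactly what a code point comparison cannot detect.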
Running across a longer range of chars takes time, but gives:
A couple of programming issues I ran into:
05-31-2018, 04:22 AM
Post: #6
RE: Unicode char comparison puzzle
05-31-2018, 04:29 AM
Post: #7
RE: Unicode char comparison puzzle
(05-31-2018 01:34 AM)tcab Wrote: A couple of programming issues I ran into:
Hello,
To use the graphics variables G1...G9 you have to dimension them first with DIMGROB. As I do not see this instruction in your code, G1 had probably already been dimensioned by another program run earlier, and G2 had not.
To compare two lists, use EQ(list1,list2).
I hope this helps.
Sorry for my English.
05-31-2018, 01:23 PM
Post: #8
RE: Unicode char comparison puzzle
Thanks for the tips re comparing lists with EQ and DIMGROB usage. I've incorporated those changes, though I then decided to render onto the main screen G0 instead, so that it serves as a visual progress indicator, together with a percentage done.
Code:
More programming questions:
06-01-2018, 04:30 AM
Post: #9
RE: Unicode char comparison puzzle
(05-31-2018 01:23 PM)tcab Wrote: Is there a way to TEXTOUT an integer in hex?
Hello,
You can write var:=TEXTOUT_P(....... and var will then contain the x coordinate of the last pixel used for displaying the message or value.
Sorry for my English.
06-01-2018, 04:53 AM
Post: #10
RE: Unicode char comparison puzzle
Hello,
TEXTOUT_P(R->B(value), 0,0) will display hex (if hex is the current mode).
TEXTOUT_P returns the x position of the NEXT pixel where the next character would be written, so
textout_p("part2", textout_p("part1", 0, 0), 0)
will write part1part2 on the screen.
cyrille
Although I work for the HP calculator group, the views and opinions I post here are my own. I do not speak for HP.
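The chaining pattern above, where each call returns the x at which the next text starts, can be mimicked in Python. This is only a toy model: `text_out`, the `screen` list and the fixed 12-pixel `ADVANCE` are assumptions for illustration, and real fonts are not fixed-width.

```python
ADVANCE = 12  # assumed fixed advance per character in pixels (a simplification)


def text_out(text, x, y, screen):
    """Record text at (x, y) and return the x where the next text should start."""
    screen.append((x, y, text))
    return x + ADVANCE * len(text)


screen = []
# The inner call draws "part1" at x=0 and hands its return value to the outer call.
next_x = text_out("part2", text_out("part1", 0, 0, screen), 0, screen)

# An integer can be shown in hex by formatting it before drawing.
text_out(format(0x26FF, "X"), next_x, 0, screen)
```

Passing the return value straight into the next call avoids having to measure the first string separately.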
06-01-2018, 07:48 AM
Post: #11
RE: Unicode char comparison puzzle
Thanks guys - very helpful.
Am now working on an optimised algorithm that:
Using the return value of TEXTOUT_P, I measured the width of a character at font size 2 as 12 pixels. Is there any similar technique to determine a character's height?
06-01-2018, 07:56 AM
Post: #12
RE: Unicode char comparison puzzle
Not really. Remember that characters can have different widths/heights and can even flow into neighboring characters in some cases. It is a very different thing than plain "bitmap" fonts.
What is actually being returned as the "width" is the stroke advancement.
TW
Although I work for HP, the views and opinions I post here are my own.
06-01-2018, 08:03 AM
(This post was last modified: 06-01-2018 08:05 AM by Tim Wessman.)
Post: #13
RE: Unicode char comparison puzzle
Also, note that we use a 16-bit-per-character encoding and currently have no plans to support any characters higher than that. So you can stop at #FFFE...
I'm a bit curious what is wrong or missing from the existing character browser? What are you trying to accomplish that isn't done there? I'm the one primarily responsible for the Unicode stuff in general, so I'm the one who may be able to provide more details in specific areas.
TW
Although I work for HP, the views and opinions I post here are my own.
06-01-2018, 08:29 AM
(This post was last modified: 06-01-2018 08:32 AM by Martin Hepperle.)
Post: #14
RE: Unicode char comparison puzzle
Some statistics:
PrimeSansMono.ttf ("Prime Sans Mono") has 613 glyphs
PrimeSansBold.ttf ("Prime Sans Bold") has 606 glyphs
PrimeSansFull.ttf ("Prime Sans") has 51279 glyphs
Printing out a character table of the latter is left as an exercise to the reader ;-)
And a small Java program (sorry, no Prime here) to create a list of the code points:
Code:
static void checkChars () throws FontFormatException, IOException
06-01-2018, 10:35 AM
Post: #15
RE: Unicode char comparison puzzle
Ok here is the new faster version - to view different ranges of characters just edit the code.
Code:
The progress indicator now displays in hex, too.
Quote: I'm a bit curious what is wrong or missing from the existing character browser? What are you trying to accomplish that isn't done there?
The Prime character browser is great - nothing missing there. I just happened to be looping through some characters one day on the Prime and wondered how I could filter out dud characters, and, well, one thing led to another. :-)
Just learning... about Unicode, pixels, coordinate systems, grobs, hex format, comparing lists, TEXTOUT, character widths & optimisation. Love it - the Prime sure is a powerful calculator - a well thought out design appropriate to the times we live in (touchscreen, Unicode, grobs, algebraic, structured programming, rechargeable, free connectivity kit/editor/emulator, iPhone versions etc.).
Andy