ToUpper() ?
03-09-2015, 01:02 PM
Post: #1
 Angus Member Posts: 212 Joined: Feb 2014
ToUpper() ?
Hello,

I was wondering whether there is already a ToUpper() function on the Prime. I didn't find it in the catalogue, but chances are high that I overlooked it.

ToUpper("aaa") -> "AAA"
03-09-2015, 06:41 PM
Post: #2
 Thomas_Sch Senior Member Posts: 377 Joined: Dec 2013
RE: ToUpper() ?
I don't know of a ready-made function, but combining ASC and CHAR gives the same result.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".
03-09-2015, 07:50 PM (This post was last modified: 03-09-2015 07:51 PM by PANAMATIK.)
Post: #3
 PANAMATIK Senior Member Posts: 1,027 Joined: Oct 2014
RE: ToUpper() ?
(03-09-2015 06:41 PM)Thomas_Sch Wrote:  I don't know a ready made function, but by combination of ASC and CHAR the result is the same.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

This obviously doesn't work with mixed upper and lower case strings like "Abcdefg".
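
To see why: subtracting 32 unconditionally also shifts characters that are already uppercase. 'A' is code 65, and 65 - 32 = 33, which is '!'. Below is a minimal C sketch of the same arithmetic. It only models the per-character subtraction, not the Prime's CHAR/ASC functions, and `shift_all_down` is a hypothetical name chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: subtract 32 from every character code, which is
   what CHAR(ASC(s)-32) does to each element. Modifies buf in place. */
void shift_all_down(char *buf)
{
    for (size_t i = 0; buf[i] != '\0'; ++i)
        buf[i] = (char)(buf[i] - 32);
}
```

Applied to "abcdefg" this yields "ABCDEFG", but applied to "Abcdefg" it yields "!BCDEFG".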

03-09-2015, 08:25 PM (This post was last modified: 03-09-2015 08:26 PM by Mark Hardman.)
Post: #4
 Mark Hardman Senior Member Posts: 525 Joined: Dec 2013
RE: ToUpper() ?
(03-09-2015 07:50 PM)PANAMATIK Wrote:
(03-09-2015 06:41 PM)Thomas_Sch Wrote:  I don't know a ready made function, but by combination of ASC and CHAR the result is the same.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

This obviously doesn't work with mixed upper and lower case strings like "Abcdefg".

Nor would it work with accented Latin characters and other non-ASCII characters (e.g. œ to Œ).

The case mapping and other special cases (e.g. for the German eszett [ß]) are fully defined and supported by the Unicode Standard under Case Mappings. It would certainly not be a trivial task to implement.

03-09-2015, 08:45 PM
Post: #5
 jebem Senior Member Posts: 1,343 Joined: Feb 2014
RE: ToUpper() ?

- Define an array variable like Table(i) with the required dimension, for instance, 256 entries (or any other value, depending on the character table size to use).
- Initialize each array position with the required Uppercase value.
- As input, use the lowercase character binary weight as an index to the Table(i) array list and get the corresponding uppercase value.

I didn't try to implement this kind of algorithm in the Prime myself, though.
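
For what it is worth, the table idea in the list above could look roughly like this in C. This is a sketch under assumptions, not Prime code: the names `upper_table`, `init_upper_table` and `table_to_upper` are hypothetical, the table has 256 entries as in the example dimension, and only 'a'..'z' are filled in.

```c
#include <stddef.h>

/* Hypothetical 256-entry table: upper_table[c] holds the uppercase form of
   code c, or c itself when no mapping is defined. Only 'a'..'z' are filled
   in here; a fuller table would list every cased character it must handle. */
unsigned char upper_table[256];

void init_upper_table(void)
{
    for (int c = 0; c < 256; ++c)
        upper_table[c] = (unsigned char)c;        /* default: unchanged */
    for (int c = 'a'; c <= 'z'; ++c)
        upper_table[c] = (unsigned char)(c - 32); /* 'a'..'z' -> 'A'..'Z' */
}

/* Convert a NUL-terminated string in place, one table lookup per character. */
void table_to_upper(char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i)
        s[i] = (char)upper_table[(unsigned char)s[i]];
}
```

After calling `init_upper_table()` once, `table_to_upper` leaves digits, spaces and uppercase letters untouched because their table entries map to themselves.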

Jose Mesquita

03-10-2015, 06:27 AM (This post was last modified: 03-10-2015 06:28 AM by Angus.)
Post: #6
 Angus Member Posts: 212 Joined: Feb 2014
RE: ToUpper() ?
I went with the following, mainly because I am interested in the alphanumeric keyboard. Initially I had problems while playing with integers, and I wanted to have string constants converted to upper case. A poorly described problem; my fault.

Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  LOCAL off, ch;
  IF TYPE(s)==2 THEN
    FOR off FROM 1 TO SIZE(s) DO
      ch:=s(off);
      IF ch>=97 AND ch<=122 THEN
        ch:=ch-32;
      END;
      s(off):=ch;
    END;
  END;
  RETURN s; // return the converted string (or the unchanged argument)
END;
03-10-2015, 06:28 AM
Post: #7
 cyrille de brébisson Senior Member Posts: 1,047 Joined: Dec 2013
RE: ToUpper() ?
Hello,

There are ~65536 chars in Prime, so your solution would use 128KB of table...
Clearly the fastest solution, but not the most memory friendly one...

Cyrille
03-10-2015, 07:51 AM
Post: #8
 Didier Lachieze Senior Member Posts: 1,495 Joined: Dec 2013
RE: ToUpper() ?
(03-09-2015 06:41 PM)Thomas_Sch Wrote:
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

(03-10-2015 06:27 AM)Angus Wrote:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  LOCAL off, ch;
  IF TYPE(s)==2 THEN
    FOR off FROM 1 TO SIZE(s) DO
      ch:=s(off);
      IF ch>=97 AND ch<=122 THEN
        ch:=ch-32;
      END;
      s(off):=ch;
    END;
  END;
  RETURN s; // return the converted string (or the unchanged argument)
END;

Combining the two above:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  IFTE(TYPE(s)==2, CHAR(EXECON("IFTE(&1>=97 AND &1<=122,&1-32,&1)",ASC(s))),s);
END;
03-10-2015, 08:10 AM (This post was last modified: 03-10-2015 08:25 AM by Angus.)
Post: #9
 Angus Member Posts: 212 Joined: Feb 2014
RE: ToUpper() ?
Hello Didier,

Do you have any experience with whether avoiding loops is a good idea, performance-wise? As in MATLAB, I mean.
03-10-2015, 08:10 AM
Post: #10
 Thomas_Sch Senior Member Posts: 377 Joined: Dec 2013
RE: ToUpper() ?
(03-10-2015 07:51 AM)Didier Lachieze Wrote:  Combining the two above:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  IFTE(TYPE(s)==2, CHAR(EXECON("IFTE(&1>=97 AND &1<=122,&1-32,&1)",ASC(s))),s);
END;
Great!!
Many thanks,
I was thinking about a simple solution, but I tripped over my own feet ;-)
03-10-2015, 08:26 AM
Post: #11
 Angus Member Posts: 212 Joined: Feb 2014
RE: ToUpper() ?
Yes, I forgot. THANK YOU! Great inspiration, indeed.
03-10-2015, 08:34 AM
Post: #12
 Didier Lachieze Senior Member Posts: 1,495 Joined: Dec 2013
RE: ToUpper() ?
(03-10-2015 08:10 AM)Angus Wrote:  hello Didier,

do you have any experience if avoiding loops is a good idea? I mean performancewise? Like in matlab, I mean.
I would say that it depends on what you do in the loop. Here, as it is a simple operation, the loop is faster than the combination of string to list (ASC), list processing (EXECON) and list to string (CHAR).
03-10-2015, 09:16 AM
Post: #13
 Angus Member Posts: 212 Joined: Feb 2014
RE: ToUpper() ?

Just a word about EXECON:
I would address its elements as &11 instead of &1, even if there is only a single list. When dealing with a second list, the &21 notation is used. To my eyes the mixed form is not straightforward, and a straightforward, clear notation is worth a lot. I tried &11 with a single list and it works.
Do I understand correctly that you can only use relative offsets up to 9? I mean, &11 is interpreted as first list, first element, and not as (implied first list) 11th relative element.
03-10-2015, 12:12 PM (This post was last modified: 03-10-2015 12:17 PM by Didier Lachieze.)
Post: #14
 Didier Lachieze Senior Member Posts: 1,495 Joined: Dec 2013
RE: ToUpper() ?
(03-10-2015 09:16 AM)Angus Wrote:  Just a word about EXECON:
I would adress its elements as &11 instead &1 even if there is only a single list. When dealing with a second the &21 notation is used. In my eyes that is not straight. A straight and clear notation is worth a lot in my eyes. I tried &11 with a single list and it works.
The notation is quite flexible but you can choose to explicitly specify the list number and the relative element position in the list.

For example, with one list a single number after '&' specifies the relative position in the list:
EXECON("&1+&2",{1,3,5}) is the same as
EXECON("&11+&12",{1,3,5})

With two lists a single number after '&' specifies the list used (the relative position is 1 by default):
EXECON("&1+&2",{1,3,5},{2,4,6}) is the same as
EXECON("&11+&21",{1,3,5},{2,4,6})
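
The two rules above can be written down as a small decision function. The C sketch below is only a model of the rules as described in this post, not the Prime's actual parser; `execon_ref` and `resolve_ref` are hypothetical names.

```c
/* Hypothetical model of how the digits after '&' are read, following the
   description above. This is not Prime code. */
struct execon_ref {
    int list;      /* 1-based list number */
    int position;  /* 1-based relative position within that list */
};

/* digits: the one or two digits after '&' (for example 1, 2, 11, 21).
   list_count: how many lists were passed to EXECON. */
struct execon_ref resolve_ref(int digits, int list_count)
{
    struct execon_ref r;
    if (digits >= 10) {            /* two digits: list number, then position */
        r.list = digits / 10;
        r.position = digits % 10;
    } else if (list_count == 1) {  /* one list: the digit is the position */
        r.list = 1;
        r.position = digits;
    } else {                       /* several lists: the digit picks the list */
        r.list = digits;
        r.position = 1;
    }
    return r;
}
```

With one list, `resolve_ref(2, 1)` and `resolve_ref(12, 1)` both give list 1, position 2; with two lists, `resolve_ref(2, 2)` and `resolve_ref(21, 2)` both give list 2, position 1.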

(03-10-2015 09:16 AM)Angus Wrote:  Do I understand correctly that you can only use relative offsets up to 9? I mean &11 is interpreted as first element, first list and not (implied first list) 11th relative element.
Yes, this is clearly stated in the manual and the calculator help.

Btw there is an error in the latest manual (but not in the calculator help): in the first example a '1' is missing after '&'.

03-10-2015, 12:55 PM
Post: #15
 Tim Wessman Senior Member Posts: 2,280 Joined: Dec 2013
RE: ToUpper() ?
(03-10-2015 12:12 PM)Didier Lachieze Wrote:  Btw there is an error in the latest manual (but not in the calculator help): in the first example a '1' is missing after '&'.

Thanks. Looks like they missed that during the file conversion work last time.

TW

Although I work for HP, the views and opinions I post here are my own.
03-11-2015, 11:44 PM
Post: #16
 BruceH Senior Member Posts: 390 Joined: Dec 2013
RE: ToUpper() ?
Old-time assembler programmers would do something like this:
Code:
CHAR(BITOR(ASC("ABcde"),#20h)) -> "abcde"
and
CHAR(BITAND(ASC("ABcde"),BITNOT(#20h))) -> "ABCDE"
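
The trick relies on the fact that in ASCII the letters 'A'..'Z' and 'a'..'z' differ only in bit 5 (20 hex). A minimal C sketch of the same masking follows. It models the bit arithmetic only and makes no claim about how BITOR/BITAND behave on the Prime; the function names are hypothetical, and the guarded variant is an addition that is not part of the post above.

```c
/* Bit 5 (0x20) is the only difference between 'A'..'Z' and 'a'..'z' in ASCII. */
int ascii_lower_bit(int c) { return c | 0x20; }   /* set bit 5   */
int ascii_upper_bit(int c) { return c & ~0x20; }  /* clear bit 5 */

/* Guarded variant (an addition for illustration): only touch 'a'..'z'. */
int ascii_upper_guarded(int c)
{
    return (c >= 'a' && c <= 'z') ? (c & ~0x20) : c;
}
```

One thing to watch for: the unguarded masks also change non-letters. Clearing bit 5 turns a space (20 hex) into code 0 and the digit '1' (31 hex) into code 11 hex, so input containing spaces or digits needs the range check.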
03-12-2015, 06:25 AM (This post was last modified: 03-12-2015 06:30 AM by bobkrohn.)
Post: #17
 bobkrohn Member Posts: 142 Joined: Dec 2014
RE: ToUpper() ?
Here are my versions of UCase and LCase.
They are easy to understand, and the concept can be adapted to many other uses, such as a filter.
The BITAND & BITOR versions above didn't work correctly.
This version works with spaces, numbers, etc.

Code:
EXPORT UCase(MyText)
BEGIN
  LOCAL i,test,temp,c;
  test:="abcdefghijklmnopqrstuvwxyz";
  temp:="";
  c:="";
  FOR i FROM 1 TO DIM(MyText) DO
    c := MID(MyText,i,1);
    IF INSTRING(test,c) THEN
      temp := temp + CHAR((ASC(c)-32));
    ELSE
      temp := temp + c;
    END;
  END;
  RETURN temp;  //-----
END;

EXPORT LCase(MyText)
BEGIN
  LOCAL i,test,temp,c;
  test := "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  temp := "";
  c := "";
  FOR i FROM 1 TO DIM(MyText) DO
    c := MID(MyText,i,1);
    IF INSTRING(test,c) THEN
      temp := temp + CHAR((ASC(c)+32));
    ELSE
      temp := temp + c;
    END;
  END;
  RETURN temp;  //-----
END;
03-12-2015, 03:49 PM
Post: #18
 cyrille de brébisson Senior Member Posts: 1,047 Joined: Dec 2013
RE: ToUpper() ?
Hello,
I think that this should work...

for A:= 1 to size(string) do
if instring("abce....yz", string(A))
string(A):= string(A)+32;
end;
end;

Cyrille
03-12-2015, 04:19 PM (This post was last modified: 03-12-2015 04:58 PM by Claudio L..)
Post: #19
 Claudio L. Senior Member Posts: 1,840 Joined: Dec 2013
RE: ToUpper() ?
(03-10-2015 06:28 AM)cyrille de brébisson Wrote:  Hello,

They are ~65536 chars in Prime, your solution would use 128KB of table...
Clearly the fastest solution, but not the most memory friendly one...

Cyrille

Here's my solution in C (can somebody translate it to Prime? Han perhaps?). I prepared these tables myself from the Unicode standard, with help from a public document about case folding. I did it a few years back, so more symbols/ranges may have been added since.

Code:
static const struct {
    unsigned short start;
    unsigned short end;
    signed int diff;
} folding_table16[] = {
    {0x0041,0x005A,32}, {0x00B5,0x00B5,775}, {0x00C0,0x00D6,32}, {0x00D8,0x00DE,32},
    {0x0100,0x012E,1}, {0x0132,0x0136,1}, {0x0139,0x0147,1}, {0x014A,0x0176,1},
    {0x0178,0x0178,-121}, {0x0179,0x017D,1}, {0x017F,0x017F,-268}, {0x0181,0x0181,210},
    {0x0182,0x0184,1}, {0x0186,0x0186,206}, {0x0187,0x0187,1}, {0x0189,0x018A,205},
    {0x018B,0x018B,1}, {0x018E,0x018E,79}, {0x018F,0x018F,202}, {0x0190,0x0190,203},
    {0x0191,0x0191,1}, {0x0193,0x0193,205}, {0x0194,0x0194,207}, {0x0196,0x0196,211},
    {0x0197,0x0197,209}, {0x0198,0x0198,1}, {0x019C,0x019C,211}, {0x019D,0x019D,213},
    {0x019F,0x019F,214}, {0x01A0,0x01A4,1}, {0x01A6,0x01A6,218}, {0x01A7,0x01A7,1},
    {0x01A9,0x01A9,218}, {0x01AC,0x01AC,1}, {0x01AE,0x01AE,218}, {0x01AF,0x01AF,1},
    {0x01B1,0x01B2,217}, {0x01B3,0x01B5,1}, {0x01B7,0x01B7,219}, {0x01B8,0x01B8,1},
    {0x01BC,0x01BC,1}, {0x01C4,0x01C4,2}, {0x01C5,0x01C5,1}, {0x01C7,0x01C7,2},
    {0x01C8,0x01C8,1}, {0x01CA,0x01CA,2}, {0x01CB,0x01DB,1}, {0x01DE,0x01EE,1},
    {0x01F1,0x01F1,2}, {0x01F2,0x01F4,1}, {0x01F6,0x01F6,-97}, {0x01F7,0x01F7,-56},
    {0x01F8,0x021E,1}, {0x0220,0x0220,-130}, {0x0222,0x0232,1}, {0x023A,0x023A,10795},
    {0x023B,0x023B,1}, {0x023D,0x023D,-163}, {0x023E,0x023E,10792}, {0x0241,0x0241,1},
    {0x0243,0x0243,-195}, {0x0244,0x0244,69}, {0x0245,0x0245,71}, {0x0246,0x024E,1},
    {0x0345,0x0345,116}, {0x0370,0x0372,1}, {0x0376,0x0376,1}, {0x0386,0x0386,38},
    {0x0388,0x038A,37}, {0x038C,0x038C,64}, {0x038E,0x038F,63}, {0x0391,0x03A1,32},
    {0x03A3,0x03AB,32}, {0x03C2,0x03C2,1}, {0x03CF,0x03CF,8}, {0x03D0,0x03D0,-30},
    {0x03D1,0x03D1,-25}, {0x03D5,0x03D5,-15}, {0x03D6,0x03D6,-22}, {0x03D8,0x03EE,1},
    {0x03F0,0x03F0,-54}, {0x03F1,0x03F1,-48}, {0x03F4,0x03F4,-60}, {0x03F5,0x03F5,-64},
    {0x03F7,0x03F7,1}, {0x03F9,0x03F9,-7}, {0x03FA,0x03FA,1}, {0x03FD,0x03FF,-130},
    {0x0400,0x040F,80}, {0x0410,0x042F,32}, {0x0460,0x0480,1}, {0x048A,0x04BE,1},
    {0x04C0,0x04C0,15}, {0x04C1,0x04CD,1}, {0x04D0,0x0526,1}, {0x0531,0x0556,48},
    {0x10A0,0x10C5,7264}, {0x10C7,0x10C7,7264}, {0x10CD,0x10CD,7264}, {0x1E00,0x1E94,1},
    {0x1E9B,0x1E9B,-58}, {0x1E9E,0x1E9E,-7615}, {0x1EA0,0x1EFE,1}, {0x1F08,0x1F0F,-8},
    {0x1F18,0x1F1D,-8}, {0x1F28,0x1F2F,-8}, {0x1F38,0x1F3F,-8}, {0x1F48,0x1F4D,-8},
    {0x1F59,0x1F59,-8}, {0x1F5B,0x1F5B,-8}, {0x1F5D,0x1F5D,-8}, {0x1F5F,0x1F5F,-8},
    {0x1F68,0x1F6F,-8}, {0x1F88,0x1F8F,-8}, {0x1F98,0x1F9F,-8}, {0x1FA8,0x1FAF,-8},
    {0x1FB8,0x1FB9,-8}, {0x1FBA,0x1FBB,-74}, {0x1FBC,0x1FBC,-9}, {0x1FBE,0x1FBE,-7173},
    {0x1FC8,0x1FCB,-86}, {0x1FCC,0x1FCC,-9}, {0x1FD8,0x1FD9,-8}, {0x1FDA,0x1FDB,-100},
    {0x1FE8,0x1FE9,-8}, {0x1FEA,0x1FEB,-112}, {0x1FEC,0x1FEC,-7}, {0x1FF8,0x1FF9,-128},
    {0x1FFA,0x1FFB,-126}, {0x1FFC,0x1FFC,-9}, {0x2126,0x2126,-7517}, {0x212A,0x212A,-8383},
    {0x212B,0x212B,-8262}, {0x2132,0x2132,28}, {0x2160,0x216F,16}, {0x2183,0x2183,1},
    {0x24B6,0x24CF,26}, {0x2C00,0x2C2E,48}, {0x2C60,0x2C60,1}, {0x2C62,0x2C62,-10743},
    {0x2C63,0x2C63,-3814}, {0x2C64,0x2C64,-10727}, {0x2C67,0x2C6B,1}, {0x2C6D,0x2C6D,-10780},
    {0x2C6E,0x2C6E,-10749}, {0x2C6F,0x2C6F,-10783}, {0x2C70,0x2C70,-10782}, {0x2C72,0x2C72,1},
    {0x2C75,0x2C75,1}, {0x2C7E,0x2C7F,-10815}, {0x2C80,0x2CE2,1}, {0x2CEB,0x2CED,1},
    {0x2CF2,0x2CF2,1}, {0xA640,0xA66C,1}, {0xA680,0xA696,1}, {0xA722,0xA72E,1},
    {0xA732,0xA76E,1}, {0xA779,0xA77B,1}, {0xA77D,0xA77D,-35332}, {0xA77E,0xA786,1},
    {0xA78B,0xA78B,1}, {0xA78D,0xA78D,-42280}, {0xA790,0xA792,1}, {0xA7A0,0xA7A8,1},
    {0xA7AA,0xA7AA,-42308}, {0xFF21,0xFF3A,32},
    {0,0,0}
};

static const struct {
    int start;
    int end;
    signed int diff;
} folding_table32[] = {
    {0x10400,0x10427,40},
    {0,0,0}
};

Each table entry has the start and end codes of a range, and an offset value. If your character is within that range (both ends included), you need to add the offset to obtain the equivalent lowercase character (this table is for case-folding = tolower()).
To be used for toupper(), you should simply add the offset to the start and end values to get the lowercase range, check if your character is within, and subtract the offset instead of adding it.
EDIT: Forgot to mention, when the offset is 1, the uppercase and lowercase symbols are alternated. See the example code below.

There are actually two tables: one for 16-bit Unicode values and a second one for 32-bit Unicode characters (only one range that can be folded is defined in 32 bits).

Code:
#include <stdint.h>   // uint32_t

uint32_t casefold(uint32_t character)
{
// THESE TABLES ARE FOR PROPER UNICODE CASE-INSENSITIVE COMPARISON
// TABLES ADD ABOUT 1400 BYTES TO LIBRARY
#include "folding_table.h"
    int idx;
    if(character<0x10000) {
        for(idx=0;folding_table16[idx].start!=0;++idx)
        {
            // table is sorted by start: below this range means no match at all
            if(character<folding_table16[idx].start) return character;
            if(character<=folding_table16[idx].end) {
                if(folding_table16[idx].diff==1) {
                    // alternating upper/lower pairs: odd offsets are already lowercase
                    if( (character-folding_table16[idx].start)&1) return character;
                }
                return character+folding_table16[idx].diff;
            }
        }
        return character;
    }
    for(idx=0;folding_table32[idx].start!=0;++idx)
    {
        if(character<folding_table32[idx].start) return character;
        if(character<=folding_table32[idx].end) {
            if(folding_table32[idx].diff==1) {
                if( (character-folding_table32[idx].start)&1) return character;
            }
            return character+folding_table32[idx].diff;
        }
    }
    return character;
}

I'll leave it for the Prime gurus to make a proper Unicode compliant ToUpper() and ToLower(). In C the tables are only 1400 bytes, not sure how much space you need on the Prime. It's not the fastest, but it's the most embeddable method I could find. Feel free to use it.
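
As a rough illustration of the toupper() recipe described above (shift each range by its offset to get the lowercase range, then subtract the offset), here is a minimal C sketch. It is an assumption-laden example rather than a finished implementation: `fold_range`, `fold_excerpt` and `toupper_sketch` are hypothetical names, and the table is only a six-entry excerpt of the 16-bit table. Because case folding maps several characters to the same result in a few cases (for example, 0x017F folds to 's'), a full inverse over the whole table would need extra care for those entries.

```c
#include <stdint.h>

/* Small excerpt of the 16-bit folding table above, same start/end/diff layout. */
struct fold_range { uint16_t start; uint16_t end; int32_t diff; };

const struct fold_range fold_excerpt[] = {
    {0x0041, 0x005A, 32}, {0x00C0, 0x00D6, 32}, {0x00D8, 0x00DE, 32},
    {0x0100, 0x012E, 1},  {0x0391, 0x03A1, 32}, {0x0410, 0x042F, 32},
    {0, 0, 0}
};

/* Hypothetical inverse of casefold(): shift each range by its offset to get
   the lowercase range, then subtract the offset when the character is inside. */
uint32_t toupper_sketch(uint32_t character)
{
    for (int idx = 0; fold_excerpt[idx].start != 0; ++idx) {
        int32_t diff = fold_excerpt[idx].diff;
        uint32_t lo = (uint32_t)(fold_excerpt[idx].start + diff);
        uint32_t hi = (uint32_t)(fold_excerpt[idx].end + diff);
        if (character < lo || character > hi)
            continue;
        /* diff == 1: upper and lower alternate, lowercase sits at odd offsets */
        if (diff == 1 && ((character - fold_excerpt[idx].start) & 1) == 0)
            return character;
        return character - (uint32_t)diff;
    }
    return character;
}
```

For instance, 0x61 ('a') maps to 0x41, 0xE9 maps to 0xC9, and 0x0101 maps to 0x0100, while 0x0102 (already uppercase in an alternating range) and 0xF7 (the division sign, which sits between two ranges) come back unchanged.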

Claudio
03-12-2015, 04:44 PM (This post was last modified: 03-12-2015 05:06 PM by bobkrohn.)
Post: #20
 bobkrohn Member Posts: 142 Joined: Dec 2014
RE: ToUpper() ?
Cyrille, my Prime doesn't accept

string(A):= string(A)+32;

Did not know that SIZE could be used on a String. Says nothing in Help.
I have not experimented with the idea of using the "string(A)+n".
I have to play with it!
How many other Functions have unknown uses?