HP Forums
ToUpper() ? - Printable Version



ToUpper() ? - Angus - 03-09-2015 01:02 PM

Hello,

I was wondering if we already have a ToUpper() function on the Prime. I didn't find it in the catalogue, but chances are high that I overlooked it.

ToUpper("aaa") -> "AAA"


RE: ToUpper() ? - Thomas_Sch - 03-09-2015 06:41 PM

I don't know a ready made function, but by combination of ASC and CHAR the result is the same.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".


RE: ToUpper() ? - PANAMATIK - 03-09-2015 07:50 PM

(03-09-2015 06:41 PM)Thomas_Sch Wrote:  I don't know a ready made function, but by combination of ASC and CHAR the result is the same.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

This obviously doesn't work with mixed upper and lower case strings like "Abcdefg".
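
To make the objection concrete, here is a minimal sketch in C (used instead of PPL so it can be compiled anywhere). The names upper_blind and upper_guarded are made up for this illustration, and plain ASCII input is assumed. It contrasts subtracting 32 from every code with subtracting it only inside 'a'..'z'.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract 32 from every character, in the spirit of CHAR(ASC(s)-32). */
void upper_blind(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; ++i)
        s[i] = (char)(s[i] - 32);
}

/* Subtract 32 only from 'a'..'z' (codes 97..122); leave the rest alone. */
void upper_guarded(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; ++i)
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] = (char)(s[i] - 32);
}
```

With "Abcdefg", upper_blind turns the leading 'A' (code 65) into '!' (code 33), giving "!BCDEFG", while upper_guarded gives "ABCDEFG".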


RE: ToUpper() ? - Mark Hardman - 03-09-2015 08:25 PM

(03-09-2015 07:50 PM)PANAMATIK Wrote:  
(03-09-2015 06:41 PM)Thomas_Sch Wrote:  I don't know a ready made function, but by combination of ASC and CHAR the result is the same.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

This obviously doesn't work with mixed upper and lower case strings like "Abcdefg".

Nor would it work with accented Latin characters and other non-ASCII characters (e.g. œ to Œ).

The case mapping and other special cases (e.g. for the German eszett [ß]) are fully defined and supported by the Unicode Standard under Case Mappings. It would certainly not be a trivial task to implement.


RE: ToUpper() ? - jebem - 03-09-2015 08:45 PM

How about this:

- Define an array variable like Table(i) with the required dimension, for instance, 256 entries (or any other value, depending on the character table size to use).
- Initialize each array position with the required Uppercase value.
- As input, use the lowercase character binary weight as an index to the Table(i) array list and get the corresponding uppercase value.

I didn't try to implement this kind of algorithm in the Prime myself, though.
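
The three steps above could look roughly like this in C. This is only a sketch of the idea, not a Prime implementation: the names upper_table, init_upper_table and upper_lookup are hypothetical, the table is limited to 256 entries, and only 'a'..'z' are given an uppercase value (any other entries would be filled in the same way).

```c
#include <assert.h>
#include <stddef.h>

/* 256-entry table: upper_table[c] holds the uppercase form of code c. */
static unsigned char upper_table[256];

/* Fill the table once: identity everywhere, except 'a'..'z' -> 'A'..'Z'. */
void init_upper_table(void)
{
    for (int c = 0; c < 256; ++c)
        upper_table[c] = (unsigned char)c;
    for (int c = 'a'; c <= 'z'; ++c)
        upper_table[c] = (unsigned char)(c - 32);
}

/* Convert in place with one table lookup per character. */
void upper_lookup(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; ++i)
        s[i] = (char)upper_table[(unsigned char)s[i]];
}
```

init_upper_table() has to be called once before the first upper_lookup(); after that, each character costs a single array access.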


RE: ToUpper() ? - Angus - 03-10-2015 06:27 AM

I went with the following because I am mainly interested in the alphanumeric keyboard. Initially I had problems while playing with integers, and I wanted to have string constants converted to upper case. I described the problem poorly. My fault.

Code:

//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  LOCAL off, ch;

  IF TYPE(s)==2 THEN
    FOR off FROM 1 TO SIZE(s) DO
      ch:=s(off);
      IF ch>=97 AND ch<=122 THEN
         ch:=ch-32;
      END;
      s(off):=ch;
    END;
  END;
  RETURN s;  // hand back the converted string (or the unchanged argument)
END;



RE: ToUpper() ? - cyrille de brébisson - 03-10-2015 06:28 AM

Hello,

There are ~65536 chars in Prime, so your solution would use 128KB of table...
Clearly the fastest solution, but not the most memory friendly one...

Cyrille
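
The 128KB figure follows if each of the ~65536 entries stores one 16-bit character code. The 2-byte entry size is an assumption made here to reproduce the number, and table_bytes is a hypothetical helper. A small C sketch of the arithmetic:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a lookup table with one fixed-size entry per character. */
size_t table_bytes(size_t entries, size_t entry_size)
{
    /* Guard against overflow of the multiplication. */
    assert(entry_size == 0 || entries <= SIZE_MAX / entry_size);
    return entries * entry_size;
}
```

table_bytes(65536, sizeof(uint16_t)) is 131072 bytes, i.e. 128 KB, whereas a 256-entry table of single bytes needs only 256 bytes.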


RE: ToUpper() ? - Didier Lachieze - 03-10-2015 07:51 AM

(03-09-2015 06:41 PM)Thomas_Sch Wrote:  
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

(03-10-2015 06:27 AM)Angus Wrote:  
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  LOCAL off, ch;

  IF TYPE(s)==2 THEN
    FOR off FROM 1 TO SIZE(s) DO
      ch:=s(off);
      IF ch>=97 AND ch<=122 THEN
         ch:=ch-32;
      END;
      s(off):=ch;
    END;
  END;
END

Combining the two above:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  IFTE(TYPE(s)==2, CHAR(EXECON("IFTE(&1>=97 AND &1<=122,&1-32,&1)",ASC(s))),s);
END;



RE: ToUpper() ? - Angus - 03-10-2015 08:10 AM

Hello Didier,

Do you have any experience of whether avoiding loops is a good idea, performance-wise? As in MATLAB, I mean.


RE: ToUpper() ? - Thomas_Sch - 03-10-2015 08:10 AM

(03-10-2015 07:51 AM)Didier Lachieze Wrote:  Combining the two above:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  IFTE(TYPE(s)==2, CHAR(EXECON("IFTE(&1>=97 AND &1<=122,&1-32,&1)",ASC(s))),s);
END;
Great!!
Many thanks,
I was thinking about a simple solution, but I tripped over my own feet ;-)
Please, please write a little tutorial about that, specifically about EXECON examples!


RE: ToUpper() ? - Angus - 03-10-2015 08:26 AM

Yes, I forgot. THANK YOU! Great inspiration, indeed.


RE: ToUpper() ? - Didier Lachieze - 03-10-2015 08:34 AM

(03-10-2015 08:10 AM)Angus Wrote:  Hello Didier,

Do you have any experience of whether avoiding loops is a good idea, performance-wise? As in MATLAB, I mean.
I would say that it depends on what you do in the loop. Here, as it is a simple operation, the loop is faster than the combination of string to list (ASC), list processing (EXECON) and list to string (CHAR).


RE: ToUpper() ? - Angus - 03-10-2015 09:16 AM

Just a word about EXECON:

I would address its elements as &11 instead of &1 even if there is only a single list. When dealing with a second list, the &21 notation is used. To my eyes, mixing the two forms is not consistent, and a consistent, clear notation is worth a lot. I tried &11 with a single list and it works.
Do I understand correctly that you can only use relative offsets up to 9? I mean, &11 is interpreted as the first element of the first list, and not as the 11th relative element of an implied first list.


RE: ToUpper() ? - Didier Lachieze - 03-10-2015 12:12 PM

(03-10-2015 09:16 AM)Angus Wrote:  Just a word about EXECON:
I would address its elements as &11 instead of &1 even if there is only a single list. When dealing with a second list, the &21 notation is used. To my eyes, mixing the two forms is not consistent, and a consistent, clear notation is worth a lot. I tried &11 with a single list and it works.
The notation is quite flexible but you can choose to explicitly specify the list number and the relative element position in the list.

For example, with one list a single number after '&' specifies the relative position in the list:
EXECON("&1+&2",{1,3,5}) is the same as
EXECON("&11+&12",{1,3,5})

With two lists a single number after '&' specifies the list used (the relative position is 1 by default):
EXECON("&1+&2",{1,3,5},{2,4,6}) is the same as
EXECON("&11+&21",{1,3,5},{2,4,6})
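
The two readings of "&1+&2" can be mimicked in C. This is only a sketch of the behaviour described in this post, not of how EXECON is implemented, and it has not been checked against a Prime; the function names sum_with_next and sum_pairwise are made up.

```c
#include <assert.h>
#include <stddef.h>

/* One list: "&1+&2" adds each element to its successor.
   Writes n-1 results into out and returns that count. */
size_t sum_with_next(const int *list, size_t n, int *out)
{
    assert(list != NULL && out != NULL);
    if (n < 2)
        return 0;
    for (size_t i = 0; i + 1 < n; ++i)
        out[i] = list[i] + list[i + 1];
    return n - 1;
}

/* Two lists: "&1+&2" adds the elements at the same position.
   Writes n results into out and returns that count. */
size_t sum_pairwise(const int *a, const int *b, size_t n, int *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
    return n;
}
```

In this sketch, the list {1,3,5} gives {4,8} with sum_with_next, and the lists {1,3,5} and {2,4,6} give {3,7,11} with sum_pairwise.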

(03-10-2015 09:16 AM)Angus Wrote:  Do I understand correctly that you can only use relative offsets up to 9? I mean, &11 is interpreted as the first element of the first list, and not as the 11th relative element of an implied first list.
Yes, this is clearly stated in the manual and the calculator help.

Btw there is an error in the latest manual (but not in the calculator help): in the first example a '1' is missing after '&'.
[Image: a0np6m]

[Image: a0np68]


RE: ToUpper() ? - Tim Wessman - 03-10-2015 12:55 PM

(03-10-2015 12:12 PM)Didier Lachieze Wrote:  Btw there is an error in the latest manual (but not in the calculator help): in the first example a '1' is missing after '&'.

Thanks. Looks like they missed that during the file conversion work last time.


RE: ToUpper() ? - BruceH - 03-11-2015 11:44 PM

Old-timer assembler programmers would do something like this:
Code:
CHAR(BITOR(ASC("ABcde"),#20h)) -> "abcde"

and

CHAR(BITAND(ASC("ABcde"),BITNOT(#20h))) -> "ABCDE"
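
The same masking idea in C, as a sketch with hypothetical names (clear_bit5, upper_masked) and assuming ASCII. The mask works because 'a'..'z' and 'A'..'Z' differ only in bit 0x20, but applied to every character it also changes codes that are not letters, so the second function masks only 'a'..'z'.

```c
#include <assert.h>
#include <stddef.h>

/* Clear bit 0x20 of one character code: 'a' (0x61) becomes 'A' (0x41). */
unsigned char clear_bit5(unsigned char c)
{
    return (unsigned char)(c & ~0x20u);
}

/* Apply the mask only to 'a'..'z' so that other characters survive. */
void upper_masked(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; ++i) {
        unsigned char c = (unsigned char)s[i];
        if (c >= 'a' && c <= 'z')
            s[i] = (char)clear_bit5(c);
    }
}
```

Without the range check, clear_bit5(' ') is 0x00 and clear_bit5('1') is 0x11, so spaces and digits are damaged.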



RE: ToUpper() ? - bobkrohn - 03-12-2015 06:25 AM

Here are my versions of UCase and LCase.
Easy to understand.
Actually this concept can be adapted to many other uses.
Like a Filter.
The BITAND & BITOR versions above didn't work correctly.
This version works with spaces, numbers, etc.

Code:

EXPORT UCase(MyText)
BEGIN

LOCAL i,test,temp,c;

test:="abcdefghijklmnopqrstuvwxyz";
temp:="";
c:="";

FOR i FROM 1 TO DIM(MyText) DO
  c := MID(MyText,i,1);
  IF INSTRING(test,c) THEN
    temp := temp + CHAR((ASC(c)-32)); 
  ELSE
    temp := temp + c; 
  END;  
END;

RETURN temp; 
//-----
END;



EXPORT LCase(MyText)
BEGIN

LOCAL i,test,temp,c;

test := "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
temp := "";
c := "";

FOR i FROM 1 TO DIM(MyText) DO
  c := MID(MyText,i,1);
  IF INSTRING(test,c) THEN
    temp := temp + CHAR((ASC(c)+32)); 
  ELSE
    temp := temp + c; 
  END;  
END;

RETURN temp; 
//-----
END;



RE: ToUpper() ? - cyrille de brébisson - 03-12-2015 03:49 PM

Hello,
I think that this should work...

for A:= 1 to size(string) do
if instring("abce....yz", string(A))
string(A):= string(A)+32;
end;
end;

Cyrille


RE: ToUpper() ? - Claudio L. - 03-12-2015 04:19 PM

(03-10-2015 06:28 AM)cyrille de brébisson Wrote:  Hello,

There are ~65536 chars in Prime, so your solution would use 128KB of table...
Clearly the fastest solution, but not the most memory friendly one...

Cyrille

Here's my solution in C (can somebody translate it to Prime? Han, perhaps?). I prepared these tables from the Unicode standard, with help from a public document about case folding. I did it a few years back, so more symbols/ranges might have been added since.

Code:

static const struct {
   unsigned short start;
   unsigned short end;
   signed int diff;
} folding_table16[] = {
{0x0041,0x005A,32},
{0x00B5,0x00B5,775},
{0x00C0,0x00D6,32},
{0x00D8,0x00DE,32},
{0x0100,0x012E,1},
{0x0132,0x0136,1},
{0x0139,0x0147,1},
{0x014A,0x0176,1},
{0x0178,0x0178,-121},
{0x0179,0x017D,1},
{0x017F,0x017F,-268},
{0x0181,0x0181,210},
{0x0182,0x0184,1},
{0x0186,0x0186,206},
{0x0187,0x0187,1},
{0x0189,0x018A,205},
{0x018B,0x018B,1},
{0x018E,0x018E,79},
{0x018F,0x018F,202},
{0x0190,0x0190,203},
{0x0191,0x0191,1},
{0x0193,0x0193,205},
{0x0194,0x0194,207},
{0x0196,0x0196,211},
{0x0197,0x0197,209},
{0x0198,0x0198,1},
{0x019C,0x019C,211},
{0x019D,0x019D,213},
{0x019F,0x019F,214},
{0x01A0,0x01A4,1},
{0x01A6,0x01A6,218},
{0x01A7,0x01A7,1},
{0x01A9,0x01A9,218},
{0x01AC,0x01AC,1},
{0x01AE,0x01AE,218},
{0x01AF,0x01AF,1},
{0x01B1,0x01B2,217},
{0x01B3,0x01B5,1},
{0x01B7,0x01B7,219},
{0x01B8,0x01B8,1},
{0x01BC,0x01BC,1},
{0x01C4,0x01C4,2},
{0x01C5,0x01C5,1},
{0x01C7,0x01C7,2},
{0x01C8,0x01C8,1},
{0x01CA,0x01CA,2},
{0x01CB,0x01DB,1},
{0x01DE,0x01EE,1},
{0x01F1,0x01F1,2},
{0x01F2,0x01F4,1},
{0x01F6,0x01F6,-97},
{0x01F7,0x01F7,-56},
{0x01F8,0x021E,1},
{0x0220,0x0220,-130},
{0x0222,0x0232,1},
{0x023A,0x023A,10795},
{0x023B,0x023B,1},
{0x023D,0x023D,-163},
{0x023E,0x023E,10792},
{0x0241,0x0241,1},
{0x0243,0x0243,-195},
{0x0244,0x0244,69},
{0x0245,0x0245,71},
{0x0246,0x024E,1},
{0x0345,0x0345,116},
{0x0370,0x0372,1},
{0x0376,0x0376,1},
{0x0386,0x0386,38},
{0x0388,0x038A,37},
{0x038C,0x038C,64},
{0x038E,0x038F,63},
{0x0391,0x03A1,32},
{0x03A3,0x03AB,32},
{0x03C2,0x03C2,1},
{0x03CF,0x03CF,8},
{0x03D0,0x03D0,-30},
{0x03D1,0x03D1,-25},
{0x03D5,0x03D5,-15},
{0x03D6,0x03D6,-22},
{0x03D8,0x03EE,1},
{0x03F0,0x03F0,-54},
{0x03F1,0x03F1,-48},
{0x03F4,0x03F4,-60},
{0x03F5,0x03F5,-64},
{0x03F7,0x03F7,1},
{0x03F9,0x03F9,-7},
{0x03FA,0x03FA,1},
{0x03FD,0x03FF,-130},
{0x0400,0x040F,80},
{0x0410,0x042F,32},
{0x0460,0x0480,1},
{0x048A,0x04BE,1},
{0x04C0,0x04C0,15},
{0x04C1,0x04CD,1},
{0x04D0,0x0526,1},
{0x0531,0x0556,48},
{0x10A0,0x10C5,7264},
{0x10C7,0x10C7,7264},
{0x10CD,0x10CD,7264},
{0x1E00,0x1E94,1},
{0x1E9B,0x1E9B,-58},
{0x1E9E,0x1E9E,-7615},
{0x1EA0,0x1EFE,1},
{0x1F08,0x1F0F,-8},
{0x1F18,0x1F1D,-8},
{0x1F28,0x1F2F,-8},
{0x1F38,0x1F3F,-8},
{0x1F48,0x1F4D,-8},
{0x1F59,0x1F59,-8},
{0x1F5B,0x1F5B,-8},
{0x1F5D,0x1F5D,-8},
{0x1F5F,0x1F5F,-8},
{0x1F68,0x1F6F,-8},
{0x1F88,0x1F8F,-8},
{0x1F98,0x1F9F,-8},
{0x1FA8,0x1FAF,-8},
{0x1FB8,0x1FB9,-8},
{0x1FBA,0x1FBB,-74},
{0x1FBC,0x1FBC,-9},
{0x1FBE,0x1FBE,-7173},
{0x1FC8,0x1FCB,-86},
{0x1FCC,0x1FCC,-9},
{0x1FD8,0x1FD9,-8},
{0x1FDA,0x1FDB,-100},
{0x1FE8,0x1FE9,-8},
{0x1FEA,0x1FEB,-112},
{0x1FEC,0x1FEC,-7},
{0x1FF8,0x1FF9,-128},
{0x1FFA,0x1FFB,-126},
{0x1FFC,0x1FFC,-9},
{0x2126,0x2126,-7517},
{0x212A,0x212A,-8383},
{0x212B,0x212B,-8262},
{0x2132,0x2132,28},
{0x2160,0x216F,16},
{0x2183,0x2183,1},
{0x24B6,0x24CF,26},
{0x2C00,0x2C2E,48},
{0x2C60,0x2C60,1},
{0x2C62,0x2C62,-10743},
{0x2C63,0x2C63,-3814},
{0x2C64,0x2C64,-10727},
{0x2C67,0x2C6B,1},
{0x2C6D,0x2C6D,-10780},
{0x2C6E,0x2C6E,-10749},
{0x2C6F,0x2C6F,-10783},
{0x2C70,0x2C70,-10782},
{0x2C72,0x2C72,1},
{0x2C75,0x2C75,1},
{0x2C7E,0x2C7F,-10815},
{0x2C80,0x2CE2,1},
{0x2CEB,0x2CED,1},
{0x2CF2,0x2CF2,1},
{0xA640,0xA66C,1},
{0xA680,0xA696,1},
{0xA722,0xA72E,1},
{0xA732,0xA76E,1},
{0xA779,0xA77B,1},
{0xA77D,0xA77D,-35332},
{0xA77E,0xA786,1},
{0xA78B,0xA78B,1},
{0xA78D,0xA78D,-42280},
{0xA790,0xA792,1},
{0xA7A0,0xA7A8,1},
{0xA7AA,0xA7AA,-42308},
{0xFF21,0xFF3A,32},
{0,0,0}
};
static const struct {
   int start;
   int end;
   signed int diff;
} folding_table32[] = {
{0x10400,0x10427,40},
{0,0,0}
};


Each table entry has the start and end codes of a range, and an offset value. If your character is within that range (both ends included), you add the offset to obtain the equivalent lowercase character (this table is for case folding, i.e. tolower()).
To use it for toupper(), add the offset to the start and end values to get the lowercase range, check whether your character is within it, and subtract the offset instead of adding it.
EDIT: Forgot to mention, when the offset is 1, the uppercase and lowercase symbols alternate within the range. See the example code below.

There are actually two tables: one for 16-bit Unicode values and a second one for 32-bit Unicode characters (only one foldable range is defined beyond 16 bits).

Code:

uint32_t casefold(uint32_t character)
{
// THESE TABLES ARE FOR PROPER UNICODE CASE-INSENSITIVE COMPARISON
// TABLES ADD ABOUT 1400 BYTES TO LIBRARY
#include "folding_table.h"
    int idx;

    if(character<0x10000) {
        for(idx=0;folding_table16[idx].start!=0;++idx)
        {
            if(character<folding_table16[idx].start) return character;
            if(character<=folding_table16[idx].end) {
                if(folding_table16[idx].diff==1) {
                    if( (character-folding_table16[idx].start)&1) return character;
                }
                return character+folding_table16[idx].diff;
            }
        }
        return character;
    }
    for(idx=0;folding_table32[idx].start!=0;++idx)
    {
        if(character<folding_table32[idx].start) return character;
        if(character<=folding_table32[idx].end) {
            if(folding_table32[idx].diff==1) {
                if( (character-folding_table32[idx].start)&1) return character;
            }
            return character+folding_table32[idx].diff;
        }
    }

    return character;
}

I'll leave it for the Prime gurus to make a proper Unicode compliant ToUpper() and ToLower(). In C the tables are only 1400 bytes, not sure how much space you need on the Prime. It's not the fastest, but it's the most embeddable method I could find. Feel free to use it.

Claudio
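
The toupper() recipe described above (shift each range by its offset, test, then subtract) might be sketched as follows. This is an illustration under assumptions, not part of the original library: fold_excerpt copies just four entries from the 16-bit table, and toupper_sketch is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small excerpt of the folding table above; {0,0,0} ends the list. */
static const struct {
    uint16_t start;
    uint16_t end;
    int32_t  diff;
} fold_excerpt[] = {
    {0x0041, 0x005A, 32},
    {0x00C0, 0x00D6, 32},
    {0x0100, 0x012E, 1},
    {0x0178, 0x0178, -121},
    {0, 0, 0}
};

/* Inverse lookup: shift each range by diff to get the lowercase range,
   then subtract diff. Ranges with diff 1 alternate upper/lower, so only
   odd offsets from start are lowercase. */
uint32_t toupper_sketch(uint32_t c)
{
    assert(c <= 0x10FFFF);  /* valid Unicode code points only */
    int32_t ch = (int32_t)c;
    for (size_t i = 0; fold_excerpt[i].start != 0; ++i) {
        int32_t lo = (int32_t)fold_excerpt[i].start + fold_excerpt[i].diff;
        int32_t hi = (int32_t)fold_excerpt[i].end + fold_excerpt[i].diff;
        if (ch < lo || ch > hi)
            continue;
        if (fold_excerpt[i].diff == 1 &&
            ((ch - (int32_t)fold_excerpt[i].start) & 1) == 0)
            return c;  /* even offset: already uppercase */
        return (uint32_t)(ch - fold_excerpt[i].diff);
    }
    return c;
}
```

Unlike casefold(), this loop cannot stop early when the character is below a range start, because the shifted lowercase ranges are no longer in ascending order (0x0178 maps down to 0x00FF, for instance). Case folding is also not always reversible, so a full-table version would need to decide which uppercase form wins when two entries fold to the same character.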


RE: ToUpper() ? - bobkrohn - 03-12-2015 04:44 PM

Cyrille, my Prime doesn't accept

string(A):= string(A)+32;

Did not know that SIZE could be used on a String. Says nothing in Help.
I have not experimented with the idea of using the "string(A)+n".
I have to play with it!
How many other Functions have unknown uses?