newRPL: Adding string processing commands?
09-21-2016, 02:47 AM
Post: #6
RE: newRPL: Adding string processing commands?
Lots of ideas!...
(09-20-2016 04:04 PM)Didier Lachieze Wrote:
The Prime commands ASC and CHAR are pretty handy:

I like it, especially since strings in newRPL are UTF-8. These commands would convert Unicode codepoints into a string and vice versa, in other words encode/decode UTF-8. ASC is perhaps not the most appropriate name, since it's not ASCII anymore. Alternative name suggestions are welcome ("UTF→" and "→UTF", or "STR2LST" and "LST2STR").

"abc" UTF→ -> { 97 98 99 }
{ 97 98 99 } →UTF -> "abc"

What to do with composite Unicode characters? Should they be split into several codepoints, or perhaps returned as a list of lists?

(09-20-2016 05:08 PM)Vtile Wrote:
Linecount - Counts lines of text object

These are good too. In RPL slang it would be something like this (I'm renaming words into tokens to make it more generic; other name suggestions are welcome):

"STR" NLINES -> N (count of lines in a text)
"STR" N NTHLINE -> "LINE" (extract the nth line of text)
"STR" N NTHLINEPOS -> POS (position of the nth line within STR)
"STR" "SEP" NTOKENS -> N (count of tokens in "STR", separated by "SEP")
"STR" "SEP" N NTHTOKEN -> "TOKEN" (extract the nth token in STR)
"STR" "SEP" N NTHTOKENPOS -> POS (position of the nth token within the string)

Notice how the lines version is the same as tokens, just using newlines as the separator. I think the LINE versions may not need to be included, just the TOKEN versions.

To trim a string:

"STR" "WHITES" TRIM -> "TRIMMED" (removes any characters present in "WHITES" from the end of "STR")
"STR" "WHITES" RTRIM -> "TRIMMED" (same as TRIM, but removes at the beginning of the string)

(09-20-2016 10:02 PM)David Hayden Wrote:
Definitely SREV - reverse a string.
I see some good ones here too:

"STR" SREV -> "RTS" (reverse a string)
"STR" RHEAD -> "R" (last character; the name RHEAD is for consistent naming with HEAD/TAIL)
"STR" RTAIL -> "ST" (all but the last character, reverse of TAIL)

The find_first_not_of() functions are the same as NTHTOKENPOS above if you request the 1st token and put all your forbidden characters as white spaces.

(09-21-2016 12:27 AM)DavidM Wrote:
(09-20-2016 03:27 PM)Claudio L. Wrote:
The latest HHC contest made me think that perhaps SUB and POS are insufficient to properly handle strings ... a reverse POS?

OK, here we go:

"STR" "SEARCH" RPOS -> pos (find the last occurrence of "SEARCH" within "STR", same as POS from the end)
"STR" "SEARCH" N NPOS -> pos (first occurrence of "SEARCH", but start from position N)
"STR" "SEARCH" N NRPOS -> pos (last occurrence of "SEARCH", but start from position N towards the first character)

(09-21-2016 01:42 AM)compsystems Wrote:
commands to send strings to a printer

That's I/O, not really string manipulation. In the future there will be a command to send strings over the serial port, and if I ever write an IrDA driver, perhaps infrared too. Although to send things to a printer in the 21st century, perhaps newRPL should be able to render text and graphics to a PDF file more than anything.

Finally, I need to add a couple of commands that are a necessary evil due to multibyte characters:

"STR" STRLEN -> N (get the length in Unicode characters; SIZE is in bytes)

There's also perhaps the need to have "byte" versions of all the commands, to treat strings as a stream of bytes rather than a Unicode string (???)
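To pin down one possible behavior for the token commands listed above, here is a rough Python sketch. None of this is decided: it assumes every character in "SEP" is a separator, that runs of separators collapse (like C's strtok), and that positions are 1-based as with POS. The function names are just lowercase stand-ins for the proposed commands.

```python
def token_spans(text, seps):
    """Return 0-based, half-open (start, end) spans of the tokens in text.

    Assumption: each character of `seps` is a separator, and consecutive
    separators collapse, so empty tokens are never produced.
    """
    spans = []
    start = None
    for i, ch in enumerate(text):
        if ch in seps:
            if start is not None:      # a token just ended
                spans.append((start, i))
                start = None
        elif start is None:            # a token just started
            start = i
    if start is not None:              # token running to the end of text
        spans.append((start, len(text)))
    return spans


def _nth_span(text, seps, n):
    spans = token_spans(text, seps)
    if not 1 <= n <= len(spans):
        raise IndexError("token index out of range")
    return spans[n - 1]


def ntokens(text, seps):
    return len(token_spans(text, seps))


def nthtoken(text, seps, n):
    start, end = _nth_span(text, seps, n)
    return text[start:end]


def nthtokenpos(text, seps, n):
    # 1-based position of the first character of the nth token.
    return _nth_span(text, seps, n)[0] + 1


# The LINE variants are the TOKEN variants with newline as the separator.
def nlines(text):
    return ntokens(text, "\n")


def nthline(text, n):
    return nthtoken(text, "\n", n)
```

With these assumptions, nthtokenpos("  xy", " ", 1) gives 3, which is the find_first_not_of() behavior mentioned above. Note that collapsing separators means empty lines would not be counted by nlines, which may or may not be what we want.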
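For the codepoint commands and the STRLEN versus SIZE distinction, a small Python sketch may help show what is at stake. Python strings are sequences of code points, so this only models the intended results, not how newRPL would walk the UTF-8 bytes. The function names are hypothetical stand-ins for "UTF→", "→UTF", STRLEN and SIZE.

```python
def utf_to_list(s):
    """Model of "UTF→": string to list of Unicode code points."""
    return [ord(ch) for ch in s]


def list_to_utf(codepoints):
    """Model of "→UTF": list of Unicode code points to string."""
    return "".join(chr(cp) for cp in codepoints)


def strlen(s):
    """Length in code points (what STRLEN would report)."""
    return len(s)


def size_bytes(s):
    """Length of the UTF-8 encoding in bytes (what SIZE reports)."""
    return len(s.encode("utf-8"))


precomposed = "\u00e9"    # e with acute accent as a single code point
combining = "e\u0301"     # 'e' followed by a combining acute accent

print(utf_to_list("abc"))             # [97, 98, 99]
print(utf_to_list(precomposed))       # [233]
print(utf_to_list(combining))         # [101, 769]
print(strlen(precomposed), size_bytes(precomposed))   # 1 2
print(strlen(combining), size_bytes(combining))       # 2 3
```

The last two cases display as the same character but give different lists and different lengths, which is exactly the open question about composite characters.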
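The three search commands (RPOS, NPOS, NRPOS) can be modeled in Python as below. This is only a sketch under assumptions that are still open: positions are 1-based, 0 means "not found" (as POS does), and for NRPOS the match must start at or before position N.

```python
def rpos(text, search):
    """Last occurrence of search in text, 1-based; 0 if not found."""
    return text.rfind(search) + 1


def npos(text, search, n):
    """First occurrence of search starting at or after position n."""
    return text.find(search, n - 1) + 1


def nrpos(text, search, n):
    """Last occurrence of search that starts at or before position n.

    Assumption: N limits where the match may *start*, so the match is
    allowed to extend past position n.
    """
    # rfind's end bound is exclusive and applies to the end of the match,
    # so extend it by len(search) to allow a match starting at n.
    return text.rfind(search, 0, n - 1 + len(search)) + 1


print(rpos("abcabc", "bc"))      # 5
print(npos("abcabc", "bc", 3))   # 5
print(nrpos("abcabc", "bc", 4))  # 2
print(rpos("abc", "x"))          # 0
```

The alternative reading of NRPOS, where the whole match must fit before position N, would simply drop the len(search) term.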