Any suggestions to remove accents from Unicode strings
05-17-2017, 08:02 AM
Post: #1
I have a simple Prime program that can search for an element name in multiple languages. On an English-configured keyboard (English being my native language), however, entering the foreign text is impractical: it requires hunting through the character set for the right accented character, or knowing its Unicode number.
Searching on the internet is much more tolerant of spelling errors. In particular, Google Translate will often successfully translate words typed in without accents. Is there any practical way to mask out accents from text in a program, to make typing in a search string easier? I cannot imagine any easy solution comparable to masking out 1 bit in ASCII to search for text whilst ignoring whether it is upper or lower case. (The particular accented languages I try are French and Polish, but I am only fluent in English.)

Stephen Lewkowicz (G1CMZ)
https://my.numworks.com/python/steveg1cmz
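One possible approach to the question above, sketched in desktop Python rather than in the Prime's own language: decompose each character into a base letter plus combining marks (Unicode normalization form NFD), then drop the marks. The helper names `strip_accents` and `matches` and the `EXTRA_FOLDS` table are hypothetical, made up for this illustration. The sketch assumes the standard `unicodedata` module is available, which may not be the case on a calculator; there, an explicit lookup table from accented to plain letters would probably have to do the whole job.

```python
import unicodedata

# Letters with no canonical decomposition survive NFD unchanged, so they
# need an explicit mapping. This table is illustrative, not exhaustive.
EXTRA_FOLDS = {"ł": "l", "Ł": "L", "ø": "o", "Ø": "O", "đ": "d", "Đ": "D"}


def strip_accents(text):
    """Return text with combining accents removed (hypothetical helper)."""
    # NFD splits e.g. "è" into "e" followed by a combining grave accent.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters, non-zero for accent marks.
    kept = [ch for ch in decomposed if not unicodedata.combining(ch)]
    return "".join(EXTRA_FOLDS.get(ch, ch) for ch in kept)


def matches(query, name):
    """Compare two strings ignoring both accents and case."""
    return strip_accents(query).casefold() == strip_accents(name).casefold()
```

With this, a search string typed as `oxygene` would match a stored name `Oxygène`, and `zloto` would match `złoto`. The stored names keep their accents; only the comparison is done on the folded copies.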
05-18-2017, 11:17 AM
Post: #2
RE: Any suggestions to remove accents from Unicode strings
(05-17-2017 08:02 AM)StephenG1CMZ Wrote: Google translate will often successfully translate words typed in without using accents.

[OFF]
Yeah, this is really a big problem with translators. For example, in Hungarian "szárnyalás" (flying freely like a bird) and "szarnyalás" (licking s...t) mean totally different things. And of course this software cannot do anything with agglutinative languages like Hungarian, so a Hungarian speaker cannot use these translators and search services efficiently. For example:

box - doboz (Hungarian)
in the box - dobozban
on the box - dobozon
to the box - dobozhoz
to the top of the box - dobozra
from the box - doboztól
with the box - dobozzal
etc...
[/OFF]

Csaba