Post Reply 
Sorting Strings
07-14-2017, 08:11 AM (This post was last modified: 07-14-2017 08:24 AM by Didier Lachieze.)
Post: #13
RE: Sorting Strings
Well my first version was more a proof of concept and was not optimized.

As you mention, calling Chunk() in the comparison function is suboptimal as each string will be split multiple times, each time the sort algorithm will compare it to another string. So we need to move the "chunkification" out of the sort function, to do it just once. This can be done by replacing each string by a list composed of the string followed by the chunks, and applying the sort function to this list of lists which requires just minimal changes to Comp(). Then extracting the first element of each sublist will return the original strings sorted.

Here is the second and faster version of Alphanum implementing this change :

Code:
// Alphanum: sort a list of strings_containing 
// a mix of letters and numbers. Given strings 
// of mixed characters and numbers, it sorts 
// the numbers in value order, while sorting 
// the non-numbers in ASCII order. 
// The end result is a natural sorting order.

Chunk();
Comp();

EXPORT Alphanum(List)
BEGIN
  LOCAL s:=SIZE(List),l;
  IF s==1  OR ΣLIST(MAKELIST(TYPE(List(I))≠2,I,1,s)) THEN 
    RETURN List; 
  ELSE
    l:=MAKELIST(CONCAT(List(I),Chunk(List(I))),I,1,s);
    l:=CAS("sort(l,(x,y)→Comp(x,y))");
    RETURN MAKELIST(l(I,1),I,1,s);
  END;
END;

// Compare two custom lists l1, l2
// each list is composed of a string followed by it's chunks 
// returns 0 if l1>l2
// returns 1 if l1≤l2

Comp(l1,l2)
BEGIN
  LOCAL s,j;
  s:=MIN(SIZE(l1),SIZE(l2));
  j:=2;
  WHILE IFTE(TYPE(l1(j))==TYPE(l2(j)),l1(j)==l2(j),0) AND j<s DO
    j:=j+1;
  END;
  IF TYPE(l1(j))==0 AND TYPE(l2(j))==2 THEN
    RETURN 1;
  END;
  IF TYPE(l1(j))==2 AND TYPE(l2(j))==0 THEN
    RETURN 0;
  END;
  IF l1(j)>l2(j) THEN
    RETURN 0;
  ELSE
    IF l1(j)==l2(j) AND j==s AND s<SIZE(l1) THEN
      RETURN 0;
    END;
    RETURN 1;
  END;
END;

// Breaks strings into chunks, where a chunk contains 
// either all alphabetic characters, or all numeric characters.

Chunk(s)
BEGIN
  LOCAL n:=DIM(s),j,t,m;
  t:=48≤s(1)≤57;
  IF n>1 THEN
   FOR j FROM 2 TO n DO
    IF (48≤s(j)≤57)≠t THEN 
      m:=IFTE(t,EXPR(LEFT(s,j-1)),LEFT(s,j-1));
      RETURN CONCAT(m,Chunk(RIGHT(s,n-j+1))); 
    END;
   END;
  END;
  RETURN IFTE(t,{EXPR(s)},{s});
END;

Another thing hitting the performance is that Comp() is written as a HOME program. So there is a penalty for the conversion from CAS to HOME to CAS environment every time sort() is calling Comp(). The solution is to rewrite Comp as a CAS function. But right now I have some troubles with the CAS "while" statement. I've converted all other instructions from HOME to CAS functions but "while" is not working, I'm getting an error message: "Error: Bad argument count"

Code:
// Compare two custom lists l1, l2
// each list is composed of a string followed by it's chunks 
// returns 0 if l1>l2
// returns 1 if l1≤l2
#cas
Comp(l1,l2):=
BEGIN
  local s,j;
  s:=min(size(l1),size(l2));
  j:=2;
  while l1(j)==l2(j) and j<s do
    j:=j+1;
  end;
  if type(l1(j))==DOM_INT and type(l2(j))==DOM_STRING then
    return 1;
  end;
  if type(l1(j))==DOM_STRING and type(l2(j))==DOM_INT then
    return 0;
  end;
  if l1(j)>l2(j) then
    return 0;
  else
    if l1(j)==l2(j) and j==s and s<size(l1) then
      return 0;
    end;
    return 1;
  end;
end;
#end
Find all posts by this user
Quote this message in a reply
Post Reply 


Messages In This Thread
Sorting Strings - Jacob Wall - 07-10-2017, 12:18 AM
RE: Sorting Strings - webmasterpdx - 07-10-2017, 04:58 AM
RE: Sorting Strings - Joe Horn - 07-10-2017, 07:22 AM
RE: Sorting Strings - Jacob Wall - 07-10-2017, 05:02 PM
RE: Sorting Strings - Tyann - 07-11-2017, 05:03 AM
RE: Sorting Strings - Tyann - 07-11-2017, 06:06 PM
RE: Sorting Strings - Jacob Wall - 07-11-2017, 06:21 PM
RE: Sorting Strings - Didier Lachieze - 07-11-2017, 11:32 PM
RE: Sorting Strings - Gilles59 - 07-12-2017, 08:28 PM
RE: Sorting Strings - Han - 07-12-2017, 12:32 AM
RE: Sorting Strings - Jacob Wall - 07-12-2017, 05:10 AM
RE: Sorting Strings - Jacob Wall - 07-14-2017, 07:02 AM
RE: Sorting Strings - Didier Lachieze - 07-14-2017 08:11 AM
RE: Sorting Strings - Jacob Wall - 07-15-2017, 03:00 AM



User(s) browsing this thread: 11 Guest(s)