Is there a method to sort a list or array of words alphabetically (natural order sorting)?
Hi, I was wondering if there is any method to sort words alphabetically. I know how it can be done in Java, but I assume MATLAB has a quicker and more compact method.
Cheers, Vish
Accepted Answer
More Answers (1)
Walter Roberson
on 6 Feb 2011
0 votes
There is no single "natural order" for sorting. The rules for sorting alphabetically depend upon language and region, especially if the language has accents or other diacritical marks.
If you were to specify exactly which characters you wish to deal with and the exact sorting order, someone might be able to assist you.
There is no MathWorks-provided function that uses the LOCALE information to determine how to sort, and there is no MathWorks-provided "dictionary sort" routine, even just for English.
6 Comments
Walter Roberson
on 6 Feb 2011
Moved: DGM
on 9 Mar 2023
sort() does not sort English words into dictionary order.
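As a minimal sketch of one way this shows up (the sample words are made up for illustration and are not from the thread): sort() on a cell array of character vectors compares character codes, so every uppercase letter precedes every lowercase letter. Sorting a lowercased copy and reusing the permutation is one possible workaround for case only; it does not address accents or other dictionary rules.

```matlab
% Sample words chosen for illustration (not from the thread).
words = {'banana', 'Cherry', 'apple'};

% sort() compares character codes, so 'C' (67) precedes 'a' (97).
byCode = sort(words);              % {'Cherry', 'apple', 'banana'}

% Possible workaround for case only: sort a lowercased copy,
% then apply the resulting permutation to the original words.
[~, idx] = sort(lower(words));
caseInsensitive = words(idx);      % {'apple', 'banana', 'Cherry'}
```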
Although it is not used as much as it should be, English allows a diaeresis, a symbol that resembles an umlaut, to be placed on the second vowel of a vowel pair. It indicates that the pair is not a diphthong: the word is formed by joining two syllables, and the vowels must be pronounced separately. The more common rendering of this is to put a dash between the two parts, but apparently that is incorrect. Dictionary sorting nominally proceeds as if the symbol were not present, but the more common practice in dictionaries is that if two words exist with otherwise the same spelling, one with the mark and one without, then the one with the mark is sorted after the one without. This is not the same as sorting as if the marked vowel were a letter that sorted after the same vowel without the mark, as that would affect the sorting relative to other words that do not have the mark.
I am having trouble thinking of an example at the moment, so I will make one up. Suppose we were creating a word to mean something that was twice illusionary. The prefix for two is "bi" or "di", so we might be tempted to form biillusionary or diillusionary. We need, though, the two i's to be pronounced separately, so this should be bii"llusionary or dii"llusionary, where the " indicates a mark similar to an umlaut that is placed upon the preceding vowel (the second i). sort() cannot handle this case.
Or more directly, sort() cannot handle the dictionary sorting of words that have a dash in them.
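A small sketch of the dash problem, assuming the dictionary convention is to ignore the dash when ordering (the sample words are invented for illustration). The dash character has code 45, which is lower than any letter, so sort() places "co-op" ahead of "coat". Building a sort key with the dashes stripped is one possible workaround, not a full dictionary sort.

```matlab
% Sample words chosen for illustration (not from the thread).
words = {'cool', 'co-op', 'coat'};

% '-' (45) sorts before every letter, so 'co-op' comes first.
byCode = sort(words);              % {'co-op', 'coat', 'cool'}

% Assumed convention: ignore dashes when comparing.
% Build dash-free keys, sort them, and reorder the original words.
keys = strrep(words, '-', '');     % {'cool', 'coop', 'coat'}
[~, idx] = sort(keys);
ignoringDash = words(idx);         % {'coat', 'cool', 'co-op'}
```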
Bruno Luong
on 6 Feb 2011
Moved: DGM
on 9 Mar 2023
A *dictionary sort* also has a second, mathematical/algorithmic sense that is far less complicated than the one Walter described. It is defined as sorting with respect to the following comparison operator:
a >= b if and only if:
    i := first index such that a(i) ~= b(i);
    if i exists, a(i) > b(i);
    otherwise, length(a) >= length(b)
This allows any set of numeric arrays, including strings, to be sorted, regardless of whether the text is English or something else.
MATLAB's sort/sortrows commands, as David indicated, perform dictionary sorting in this sense very well.
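The comparison operator defined above could be written out roughly as follows. This is only a sketch: lexGE is a hypothetical helper name (not a MathWorks function), and it assumes the first differing index is searched for only within the overlapping part of the two arrays.

```matlab
function tf = lexGE(a, b)
% LEXGE  True if a >= b in the lexicographic sense described above.
% Hypothetical helper for illustration; not a MathWorks function.
    a = a(:).';                        % treat both inputs as row vectors
    b = b(:).';
    n = min(numel(a), numel(b));       % length of the overlapping part
    i = find(a(1:n) ~= b(1:n), 1);     % first index where the arrays differ
    if ~isempty(i)
        tf = a(i) > b(i);              % decided by the first differing element
    else
        tf = numel(a) >= numel(b);     % one is a prefix of the other
    end
end
```

With this sketch, lexGE('abc', 'abb') is true, lexGE('ab', 'abc') is false, and lexGE([1 2 3], [1 2 3]) is true, since the operator works on character codes and numeric arrays alike.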
Walter Roberson
on 6 Feb 2011
Edited: DGM
on 9 Mar 2023
Reference, please, Bruno.
"[...] if and only if the first ai which is different from bi comes before bi in the alphabet". The word "alphabet" there is a link to http://en.wikipedia.org/wiki/Alphabet which describes alphabets in general, not any particular encoding.
For example, http://en.wikipedia.org/wiki/Ll#Welsh
indicates that ll in Welsh is required to sort in dictionaries as a separate letter after l ("e.g. lwc sorts before llaw")
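To make the Welsh case concrete, here is a rough sketch that handles only the "ll" digraph and assumes lowercase input using the letters a to z. It replaces "ll" in a sort key with "l" followed by "{" (code 123, just above "z"), so that "ll" behaves like a letter placed right after "l". It ignores the other Welsh digraphs and cannot tell a true "ll" from two adjacent "l" letters, so it is an illustration rather than a Welsh collation routine.

```matlab
% Words from the quoted example, plus 'lan' as an extra made-up entry.
words = {'llaw', 'lwc', 'lan'};

% Plain code-point order puts 'llaw' before 'lwc'.
byCode = sort(words);                  % {'lan', 'llaw', 'lwc'}

% Sketch: make 'll' sort as a single letter just after 'l' by
% replacing it in the key with 'l' followed by '{' (code 123 > 'z').
keys = strrep(words, 'll', 'l{');      % {'l{aw', 'lwc', 'lan'}
[~, idx] = sort(keys);
welshLike = words(idx);                % {'lan', 'lwc', 'llaw'}
```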
Walter Roberson
on 6 Feb 2011
Moved: DGM
on 9 Mar 2023
"Expert", definition: someone who tells you all the ways that something cannot be done.
Marie Trang
on 9 Mar 2023
Moved: DGM
on 9 Mar 2023
I know this is an older thread, but to help anyone looking at this: I used the sortrows() command on my 7x7 character array, and it worked fine to order the rows alphabetically according to the English alphabet. I found that sort() works to order cell arrays of strings or characters alphabetically, so whether to use sortrows() or sort() depends on whether you're working with a string/character array or a cell array of strings/characters. Hope this helps!
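A short sketch of the two cases, using made-up words rather than the 7x7 array mentioned above. Note that calling sort() directly on a character matrix sorts each column independently, which scrambles the words; that is why sortrows() is the one to use there.

```matlab
% Character matrix: char() pads the rows with trailing spaces.
C = char('pear', 'apple', 'fig');
sortedRows = sortrows(C);          % rows: 'apple', 'fig  ', 'pear '

% Cell array of character vectors: sort() orders whole words.
W = {'pear', 'apple', 'fig'};
sortedCell = sort(W);              % {'apple', 'fig', 'pear'}
```

Both calls order by character code, so the caveats raised earlier in the thread about capitals, dashes, and accented letters still apply.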