2010-08-06

Sorting and deduplicating comma-separated items with POSIX utils.

Suppose you have a string:
str="item4, item2, item3, item2, ..."
and want to sort its items alphabetically in a shell script.

To sort and drop duplicates in one step you can use sort -u. It works on lines, so you must transform the string into multiple lines, sort them, and then convert the result back to the original format.

This man does the string manipulation with a complex sed command, which I try to avoid whenever possible, because it forces you to read a lot of the sed manual and to spend a long time debugging the code:

echo "item4, item2, item3, item2, ..." \
  | sed "s| *, *|\n|g" \
  | grep -v "^ *$" \
  | sort --unique \
  | sed -e ':x;$by;N;bx' -e ':y;s/\n/, /g'

My solution is to use awk. I think it is more verbose, but it really can be written correctly on the first attempt:

echo "item4, item2, item3, item2, ..." \
  | tr -d " \n" \
  | awk 'BEGIN{RS=",";ORS="\n"}{print $0}' \
  | sort -u \
  | awk '{ORS=", "}{print $1}'
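
One thing to note about the awk pipeline above: because the last awk sets ORS to ", ", the output ends with a trailing ", " and has no final newline. Below is a minimal sketch of a variant that avoids this. The function name sort_uniq_csv is hypothetical, and dropping empty items is my own assumption (it mirrors the grep -v step of the sed version), not part of the original pipeline.

```shell
# Hypothetical helper: sort and deduplicate a comma-separated list.
# Like the original pipeline, tr removes ALL spaces, including any
# spaces inside an item.
sort_uniq_csv() {
    printf '%s' "$1" \
        | tr -d ' \n' \
        | awk 'BEGIN { RS = "," } $0 != "" { print }' \
        | sort -u \
        | awk 'NR > 1 { printf ", " }      # separator before every item but the first
               { printf "%s", $0 }
               END { if (NR) print "" }'   # final newline only if something was printed
}

sort_uniq_csv "item4, item2, item3, item2"
# prints: item2, item3, item4
```

The separator is printed before each item except the first, so nothing has to be trimmed from the end afterwards.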