I ran into an issue the other day where I needed to perform some regular expression matching on a file and then edit the file's contents. That's not a big problem really, since I do it all the time, but this case was different because the file involved was encoded in UTF-16.
I see more and more Unicode files these days and at first I was worried by what I might have to do to manipulate this particular file. A little bit of research led me to something that made life super easy!
The UNIX command 'iconv' (available on Mac OS X and Debian boxes) does the work for me. 'iconv' can convert between any of the character set encodings available on your particular system (run 'iconv -l' to list them). For my purposes, I just converted my UTF-16 file into a MacRoman file. Next, I manipulated it to my heart's content, then converted it back to UTF-16. It was easy as pie. I also used the command to convert a whole bunch of UTF-16 files to UTF-8 with no problem. Here's the actual command in action:
$ iconv -f UTF-16 -t MacRoman my_file > my_new_file
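For anyone who wants the whole round trip in one place, here is a rough sketch of how it might look as a script. It is not exactly what I ran: the file names, the sample text, the 'sed' substitution, and the 'batch' and 'utf8' directories are all made up for illustration. It also uses UTF-8 as the intermediate encoding instead of MacRoman, on the assumption that UTF-8 is available on more systems and can hold any character a UTF-16 file might contain.

```shell
#!/bin/sh
# Sketch: round-trip a UTF-16 file through UTF-8 so ordinary
# text tools can edit it. All names below are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Create a small UTF-16 sample file to work on.
printf 'hello world\n' | iconv -f UTF-8 -t UTF-16 > my_file

# 1. Convert UTF-16 to UTF-8.
iconv -f UTF-16 -t UTF-8 my_file > my_file.utf8

# 2. Edit with a regular expression (any text tool would do here).
sed 's/world/there/' my_file.utf8 > my_file.edited

# 3. Convert the edited text back to UTF-16.
iconv -f UTF-8 -t UTF-16 my_file.edited > my_new_file

# Batch version: convert every UTF-16 file in one directory
# to UTF-8 copies in another directory.
mkdir batch utf8
printf 'alpha\n' | iconv -f UTF-8 -t UTF-16 > batch/a.txt
printf 'beta\n'  | iconv -f UTF-8 -t UTF-16 > batch/b.txt

for f in batch/*.txt; do
    iconv -f UTF-16 -t UTF-8 "$f" > "utf8/$(basename "$f")"
done
```

Writing to a new file each time, and never redirecting back onto the input, matters here: the shell truncates the output file before 'iconv' gets a chance to read it.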