String operations in Swift seem easy to handle, but they need care in real development.
For example, this is a common code snippet that ranks at the top when searching for ‘swift substring’ on www.google.co.jp.
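Snippets of this kind usually bridge the string to NSString and slice it by an integer location and length. A minimal sketch of that pattern follows; the method name `nsSubstring(location:length:)` and the sample text are assumptions made for illustration, not the code from the search result.

```swift
import Foundation

extension String {
    // Hypothetical helper: slices by UTF-16 offsets through NSString.
    func nsSubstring(location: Int, length: Int) -> String {
        let range = NSRange(location: location, length: length)
        return NSString(string: self).substring(with: range)
    }
}

let greeting = "Hello, Swift"
// Location 7 and length 5 cover the word "Swift" in this ASCII-only text.
let word = greeting.nsSubstring(location: 7, length: 5)
print(word)
```

With ASCII-only input every character is one UTF-16 code unit, so the helper appears to work, which is why the problem is easy to miss.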
But it is not correct. NSString’s implementation is based on UTF-16, and handling indices into it is confusing.
Take a look at the following test.
The reason is that an index into an NSString is different from an index into a String. The text is counted as length 2 in NSString, but only 1 in String.
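The difference in length can be checked directly. This is a small sketch, assuming an emoji outside the Basic Multilingual Plane as the sample text (the text used in the original test is not known):

```swift
import Foundation

let sample = "\u{1F600}"  // 😀, stored as a surrogate pair in UTF-16

// String counts Characters (extended grapheme clusters).
let characterCount = sample.count
// The UTF-16 view counts 16-bit code units.
let utf16Count = sample.utf16.count
// NSString reports its length in UTF-16 code units as well.
let nsLength = NSString(string: sample).length

print(characterCount, utf16Count, nsLength)
```

An integer offset therefore means different things depending on which of these views it is applied to.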
So using substring in NSString gets only half of it, in terms of its binary representation. Naturally, that half cannot be decoded correctly.
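The effect of cutting a character in half can be reproduced without Foundation by slicing the UTF-16 view. This is only a sketch with an assumed sample emoji; it mimics what a range of length 1 selects rather than calling NSString itself.

```swift
let emoji = "\u{1F600}"  // 😀, two UTF-16 code units

// Keep only the first code unit, as a UTF-16 range of length 1 would.
let firstUnit = emoji.utf16.prefix(1)

// That unit is a lead surrogate, which is meaningless on its own.
let isLoneLeadSurrogate = UTF16.isLeadSurrogate(firstUnit.first!)

// Decoding repairs the ill-formed sequence with U+FFFD (replacement character).
let decoded = String(decoding: firstUnit, as: UTF16.self)
print(isLoneLeadSurrogate, decoded == "\u{FFFD}")
```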
The online Swift guide has a detailed explanation of this problem.
We should use String.Index to handle the different byte length of each character, as well as decomposed and precomposed characters.
A String in Swift is built on Unicode scalar values, each of which fits in a single 32-bit (UTF-32) unit, and a Character is an extended grapheme cluster made of one or more scalars, so the default behaviour can handle composed characters.
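A short sketch of how String treats precomposed and decomposed forms; the letter é is an assumed example:

```swift
let precomposed = "\u{E9}"    // é as a single scalar
let decomposed = "e\u{301}"   // e followed by a combining acute accent

// Both forms are one Character, although the scalar counts differ.
let precomposedCount = precomposed.count
let decomposedCount = decomposed.count
let precomposedScalars = precomposed.unicodeScalars.count
let decomposedScalars = decomposed.unicodeScalars.count

// String comparison uses canonical equivalence, so the two forms are equal.
let areEqual = precomposed == decomposed
print(precomposedCount, decomposedCount, precomposedScalars, decomposedScalars, areEqual)
```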
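One way to write a substring helper on top of String.Index is sketched below. The method name `characterSubstring(location:length:)` and the choice to return nil for out-of-range input are assumptions for illustration, not necessarily the original listing.

```swift
extension String {
    // Hypothetical helper: location and length are counted in Characters.
    func characterSubstring(location: Int, length: Int) -> String? {
        guard location >= 0, length >= 0,
              let start = index(startIndex, offsetBy: location, limitedBy: endIndex),
              let end = index(start, offsetBy: length, limitedBy: endIndex)
        else { return nil }  // the requested range runs past the string
        return String(self[start..<end])
    }
}

let mixed = "\u{1F600}abc"  // 😀abc, four Characters
let first = mixed.characterSubstring(location: 0, length: 1)
let middle = mixed.characterSubstring(location: 1, length: 2)
let tooLong = mixed.characterSubstring(location: 3, length: 5)
print(first ?? "nil", middle ?? "nil", tooLong ?? "nil")
```

Because the offsets are resolved through String.Index, the emoji is never split, whatever its encoded length. Note that moving an index by an offset walks the string, so this is linear in the offset rather than constant time.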
For further reading about encoding, I recommend this page. http://www.objc.io/issues/9-strings/unicode/#utf-8