String operations in Swift seem easy to handle, but they need care in development.
For example, this is a common code snippet that ranks at the top of the results when searching for ‘swift substring’ on www.google.co.jp.
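The exact snippet from the search result is not reproduced here. As an assumption, helpers of this kind usually bridge to NSString and take integer offsets, roughly like the sketch below. The name nsSubstring is hypothetical and used only for illustration.

```swift
import Foundation

extension String {
    // Hypothetical helper (not the original snippet): slices by UTF-16 offsets
    // by going through NSString, which is what makes it fragile.
    func nsSubstring(location: Int, length: Int) -> String {
        let range = NSRange(location: location, length: length)
        return NSString(string: self).substring(with: range)
    }
}

// With plain ASCII text the offsets happen to match the characters.
print("Hello, Swift".nsSubstring(location: 7, length: 5)) // prints "Swift"
```

With ASCII input every character is one UTF-16 code unit, so the flaw stays hidden until the text contains something like an emoji.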
But it is not correct.
NSString’s implementation is based on UTF-16, and handling indexes for it is just confusing.
Take a look at the following test.
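The original test is not shown here, so this is only a minimal sketch of such a check. It assumes an emoji as the sample text and uses the utf16 view of String to stand in for the way NSString counts length.

```swift
// Sample text chosen for illustration: one emoji outside the Basic Multilingual Plane.
let text = "😀"

// NSString measures length in UTF-16 code units; the utf16 view counts the same way.
let utf16Length = text.utf16.count
// String counts user-visible characters.
let characterLength = text.count

// Taking "length 1" in UTF-16 terms keeps only the first half of the surrogate pair.
let half = String(decoding: Array(text.utf16.prefix(1)), as: UTF16.self)

print(utf16Length)     // 2
print(characterLength) // 1
print(half == text)    // false
```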
The test fails because an index into an NSString is different from an index into a String. The text is counted as length 2 in NSString, but only 1 in String. NSString therefore gets only half of its binary representation, which of course cannot be decoded correctly.
The Swift online guide has a detailed explanation of this problem.
We should use String.Index to handle the different byte lengths of characters, as well as decomposed and precomposed characters.
A String in Swift is a collection of Character values, each built from one or more Unicode scalars. A scalar always fits in a single 4-byte UTF-32 code unit, and the default behaviour of String already handles composed characters.
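To illustrate the point about composed characters, here is a small sketch comparing the precomposed and decomposed forms of the same letter. The sample letter is an assumption chosen for illustration.

```swift
let precomposed = "\u{E9}"   // é as a single Unicode scalar
let decomposed = "e\u{301}"  // e followed by a combining acute accent

// Both forms are one Character as far as String is concerned.
print(precomposed.count, decomposed.count)                               // 1 1
// The number of underlying Unicode scalars differs.
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count) // 1 2
// String comparison treats the two forms as equal.
print(precomposed == decomposed)                                         // true
```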
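One way to apply String.Index is sketched below. This is not necessarily the original code: the helper name characterSubstring and its choice to return nil for out-of-range input are assumptions made for illustration.

```swift
extension String {
    // Hypothetical helper: returns `length` characters starting at character
    // offset `location`, or nil when the requested range does not fit.
    func characterSubstring(location: Int, length: Int) -> String? {
        guard location >= 0, length >= 0,
              let start = index(startIndex, offsetBy: location, limitedBy: endIndex),
              let end = index(start, offsetBy: length, limitedBy: endIndex)
        else { return nil }
        // Slicing with String.Index always cuts on character boundaries.
        return String(self[start..<end])
    }
}

print("a😀b".characterSubstring(location: 1, length: 1) ?? "nil") // prints "😀"
```

Because the offsets are counted in characters, an emoji or a decomposed accent is never split in the middle. Returning an optional is one possible design; trapping on invalid input would be another.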
For further reading about encoding, I recommend this page: http://www.objc.io/issues/9-strings/unicode/#utf-8