Welcome to FutureAppLaboratory

v=(*^ワ^*)=v

Swift String Operations


String operations in Swift seem easy to handle, but they need some care in real development.

For example, here is a common code snippet that shows up as the top result when searching for ‘swift substring’ on www.google.co.jp.

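import Foundation

extension String {
    // Flawed on purpose: location and length are counted in UTF-16 code units, not Characters.
    public func substring(location: Int, length: Int) -> String {
        return (self as NSString).substring(with: NSRange(location: location, length: length))
    }
}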

But it is not correct. NSString’s implementation is based on UTF-16, so its indexes count UTF-16 code units rather than characters, and handling them is just confusing.

Take a look at the following test.

IMAGE_A

The test fails because an index into an NSString is not the same as an index into a String. The text is counted as length 2 in NSString, but only 1 in String.

So taking a substring through NSString returns only half of the character's binary representation, and that half cannot be decoded correctly.
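The mismatch can be reproduced without NSString, because String's utf16 view counts the same 16-bit code units that NSString's length does. This is only a minimal sketch: the emoji is my own sample, not necessarily the text used in the test above.

```swift
let text = "😀"  // U+1F600, a sample character outside the Basic Multilingual Plane

// String counts user-perceived characters.
let characterCount = text.count        // 1

// The UTF-16 view counts 16-bit code units, like NSString's length.
let utf16Count = text.utf16.count      // 2

// Keeping only the first code unit leaves a lone high surrogate (0xD83D).
let firstUnit = text.utf16.first!
let isLoneHighSurrogate = (0xD800...0xDBFF).contains(firstUnit)

// A lone surrogate is not valid text; lossy decoding replaces it with U+FFFD.
let decoded = String(decoding: [firstUnit], as: UTF16.self)

print(characterCount, utf16Count, isLoneHighSurrogate, decoded == "\u{FFFD}")
```

Half of a surrogate pair is exactly the "half of the binary representation" described above.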

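The Swift online guide has a detailed explanation of this problem.

We should use String.Index, which handles the varying byte length of each character as well as decomposed and precomposed characters.

A Swift String is a collection of Character values, and each Character is an extended grapheme cluster made of one or more Unicode scalars. A single scalar fits in 4 bytes (as in UTF-32), but a Character has no fixed size, so String cannot be indexed by a plain integer. Its default behaviour handles composed characters correctly.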
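To make the decomposed and precomposed case concrete, here is a small sketch that compares two spellings of the same letter. The sample strings are my own choice. Both count as one Character, while the lower-level views show the different encodings.

```swift
let precomposed = "\u{E9}"     // é as a single scalar, U+00E9
let decomposed = "e\u{301}"    // e followed by U+0301 COMBINING ACUTE ACCENT

// Both are one Character, and String equality uses canonical equivalence.
print(precomposed.count, decomposed.count)                                // 1 1
print(precomposed == decomposed)                                          // true

// The encoded views differ between the two spellings.
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count)  // 1 2
print(precomposed.utf16.count, decomposed.utf16.count)                    // 1 2
print(precomposed.utf8.count, decomposed.utf8.count)                      // 2 3
```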

extension String {
    // Character-based substring: location and length are counted in Characters.
    public func substring2(location: Int, length: Int) -> String {
        assert(location >= 0 && length >= 0, "OMG")
        assert(location + length <= count, "OMG again")
        // Walk from the start of this string; both indexes belong to self.
        let start = index(startIndex, offsetBy: location)
        let end = index(start, offsetBy: length)
        return String(self[start..<end])
    }
}

IMAGE_B
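If trapping on a bad range is not what you want, one possible variation returns nil instead. This is a sketch under my own assumptions: safeSubstring is a hypothetical name that does not appear in the snippets above, and it relies on the standard index(_:offsetBy:limitedBy:) method to stop at the end of the string.

```swift
extension String {
    // Hypothetical helper: returns nil instead of trapping when the
    // requested range does not fit inside the string.
    func safeSubstring(location: Int, length: Int) -> String? {
        guard location >= 0, length >= 0,
              let start = index(startIndex, offsetBy: location, limitedBy: endIndex),
              let end = index(start, offsetBy: length, limitedBy: endIndex)
        else { return nil }
        return String(self[start..<end])
    }
}

let greeting = "Hi😀!"
print(greeting.safeSubstring(location: 2, length: 1) ?? "nil")  // 😀
print(greeting.safeSubstring(location: 3, length: 5) ?? "nil")  // nil
```

The offsets are still counted in Characters, so the emoji comes back whole.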

For further reading about encoding, I recommend this page: http://www.objc.io/issues/9-strings/unicode/#utf-8
