Optimizing CoreData full-text queries

Published on 20/07/2011

Given that the CoreData documentation describes a 10,000 object collection as a fairly small data set, I was surprised to find a simple NSFetchRequest being responsible for massive resource spikes in a recent Mac application I’ve been working on.

The NSFetchRequest in question uses an NSPredicate to fetch Files entities with a given path. As can be seen from the original query, I’m performing a case-insensitive comparison of each record’s path attribute with a receivedPath string.

NSPredicate *predicate = [NSPredicate predicateWithFormat:@"path ==[c] %@", receivedPath];

NSEntityDescription *entity = [NSEntityDescription entityForName:@"Files" inManagedObjectContext:self.moc];
NSFetchRequest *request = [[NSFetchRequest alloc] init];
[request setEntity:entity];
[request setPredicate:predicate];

NSError *error = nil;
NSArray *results = [self.moc executeFetchRequest:request error:&error];

While I had indexed the path attribute and ensured that I was using an NSSQLiteStoreType backing store, I hadn’t considered exactly how CoreData would treat the case-insensitive modifier internally. By passing my debug build the -com.apple.CoreData.SQLDebug 1 launch argument, which logs the SQL that CoreData generates, I was able to ascertain that the comparison makes use of a custom CoreData SQLite function: NSCoreDataStringCompare.

In effect, SQLite was evaluating the string comparison row by row, much like a fairly costly LIKE statement, so my index was being overlooked completely. To solve the issue I created a secondary attribute, lowercasePath, which holds a lowercased copy of path, ensured it too was indexed, and used a direct string comparison against the lowercased input, like so:

NSPredicate *predicate = [NSPredicate predicateWithFormat:@"lowercasePath == %@", [receivedPath lowercaseString]];
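
The lowercased attribute only helps if it stays in step with path. One way to do that, shown here as a rough sketch rather than the code from my application, is a custom setter on the managed object subclass. The class name Files and its property declarations are assumptions for illustration; only the path and lowercasePath attribute names come from the model described above.

```objc
#import <CoreData/CoreData.h>

// Hypothetical NSManagedObject subclass for the "Files" entity.
// Assumes the model defines two string attributes: path and lowercasePath.
@interface Files : NSManagedObject
@property (nonatomic, copy) NSString *path;
@property (nonatomic, copy) NSString *lowercasePath;
@end

@implementation Files

// CoreData supplies any accessors we do not implement ourselves.
@dynamic path;
@dynamic lowercasePath;

// Custom setter: store the path as given, then refresh the lowercased
// copy so the indexed attribute never drifts out of sync.
- (void)setPath:(NSString *)newPath
{
    [self willChangeValueForKey:@"path"];
    [self setPrimitiveValue:newPath forKey:@"path"];
    [self didChangeValueForKey:@"path"];

    // Messaging nil returns nil, so a nil path clears lowercasePath too.
    self.lowercasePath = [newPath lowercaseString];
}

@end
```

With something like this in place, any code that assigns path populates lowercasePath as a side effect. Records saved before the attribute existed would presumably still need a one-off pass to fill it in.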

The results were staggering. The function in question is no longer significant enough to even register during profiling runs.