Is making a fetch request in validateForInsert too expensive?

Question

I recently refactored my underlying data model and am now using the multi-level managed object context setup described at http://www.cocoanetics.com/2012/07/multi-context-coredata/.

I managed to isolate all my data parsing so that new managed objects are parsed and inserted in a child MOC on a background thread. These changes are eventually saved to the parent / main MOC, then ultimately written to the persistent store coordinator through its parent / writer MOC.

This improved my UI responsiveness quite noticeably, as the large batch writes were previously done on the parent / main MOC and blocked the UI thread.

I want to further improve the insertion and validation of objects. Every time the application is opened, and at a somewhat regular interval, there is a profile request, during which tens or hundreds of objects are sent with new values. I decided to just create NSManagedObjects for all these objects, insert them into the child MOC, and let validation eliminate the duplicates.

My question is whether executing an NSFetchRequest on every call to validateForInsert: for an NSManagedObject is expensive. I have seen several answers on StackOverflow that seem to use this pattern, for example: qaru.site/questions/149287 / ... . I want to do this instead of checking before the entity is created, because if two threads create the same object at the same time, both will be created, so the validation should happen during the insert / merge on the parent thread.

So, is this method expensive? Is this common practice? Also, is there a difference between using validateForInsert and validate?

-(BOOL)validateUniqueField:(id *)ioValue error:(NSError * __autoreleasing *)outError{

    // The property being validated must not already exist

    NSFetchRequest *fetchRequest = [NSFetchRequest fetchRequestWithEntityName:NSStringFromClass([self class])];
    fetchRequest.predicate = [NSPredicate predicateWithFormat:@"uniqueField == %@", *ioValue];

    int count = [self.managedObjectContext countForFetchRequest:fetchRequest error:nil];
    if (count > 0) {
        if (outError != NULL) {
            NSString *errorString = NSLocalizedString(
                                                      @"Object must have unique value for property",
                                                      @"validation: nonunique property");
            NSDictionary *userInfoDict = @{ NSLocalizedDescriptionKey : errorString };
            *outError = [[NSError alloc] initWithDomain:nil
                                                   code:0
                                               userInfo:userInfoDict];
        }
        return NO;
    }
    return YES;
}

My use case for multiple threads potentially creating the same object would be, for example, asynchronously querying all users within a scope: if two of those scopes overlap and give me the same user object at about the same time, each thread tries to create the same user in its own context. findOrCreate won't be able to detect that the object was already created in a different thread / context. I am currently handling this by doing the validation in validateForInsert.

ios core-data

mitrenegade Sep 20 '14 at 2:23

3 answers

quellish · Answer 1 · 2014-09-20T23:09:43+0000

Fetching inside a validation method?

Your question is a clever one, since it hides several questions!

So, is this method expensive?

This is potentially very expensive, since you incur a fetch for at least every object you validate as part of the save (validation is called automatically at save time).

Is this common practice?

I really hope not! I've only seen this once and it didn't work out well (keep reading).

Also, is there a difference between using validateForInsert and validate?

I'm not sure what you mean here. A managed object has the following validation methods: validateForInsert, validateForUpdate, validateForDelete. They each follow their own rules and also call validateValue:forKey:error: for individual properties, which in turn calls any implementations of the pattern validate<Key>:error:. validateForInsert, for example, will execute any insertion validation rules defined in the managed object model before calling the other validation methods (for example, marking a modeled property as not optional in the model editor is an insertion validation rule). Although validation is called automatically when the context is saved, you can call it at any time. This can be useful if you want to show the user errors that need to be fixed before saving, etc.
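As a rough illustration of calling validation on demand rather than waiting for a save, here is a minimal sketch. The helper name CheckInsertValidity is hypothetical; only validateForInsert: comes from Core Data.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not from the answer): runs insert validation on demand
// so that problems can be reported to the user before the context is saved.
// Call it on the queue of the object's managed object context.
static BOOL CheckInsertValidity(NSManagedObject *object) {
    NSError *error = nil;
    if (![object validateForInsert:&error]) {
        // A real app would present this to the user instead of logging it.
        NSLog(@"Validation failed: %@", error.localizedDescription);
        return NO;
    }
    return YES;
}
```

This runs the same checks a save would run for a newly inserted object, including any validate<Key>:error: implementations.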

That said, read on for a solution to the problem you are actually trying to solve.

About fetching inside a validator ...

It is unwise to touch the object graph inside a validator method. When you perform a fetch, you change the object graph in that context: objects are accessed, faults are fired, etc. Validation is performed automatically when you save, and changing the in-memory object graph at that point, even if you are not changing property values directly, can have dramatic and hard-to-predict side effects. It would not be a fun time.

Correct solution for uniqueness: Find or Create

You seem to be trying to ensure that your managed objects are unique. Core Data does not have a built-in mechanism for this, but there is a recommended pattern for implementing it: "find-or-create". This is done when objects are accessed, not when they are validated or saved.

Determine what makes the object unique. It can be a single property value (in your case, it appears to be a single property), or a combination of several (for example, firstName and lastName together are what make a "person" unique). Based on this uniqueness criterion, you query the context for an existing matching object. If any matches are found, return them; otherwise create an object with these values.

Here's an example based on the code in your question. It uses the "uniqueField" value as the criterion for uniqueness. Obviously, if you have multiple properties that together make your entity unique, it gets a little more complicated.

Example:

// I am using NSValue here, as your example doesn't indicate a type.
+ (void) findOrCreateWithUniqueValue:(NSValue *)value inManagedObjectContext:(NSManagedObjectContext *)managedObjectContext completion:(void (^)(NSArray *results, NSError *error))completion {

    [managedObjectContext performBlock:^{
        NSError             *error      = nil;
        NSEntityDescription *entity     = [NSEntityDescription entityForName:NSStringFromClass(self) inManagedObjectContext:managedObjectContext];
        NSFetchRequest *fetchRequest    = [[NSFetchRequest alloc] init];
        fetchRequest.entity = entity;
        fetchRequest.predicate = [NSPredicate predicateWithFormat:@"uniqueField == %@", value];

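        // executeFetchRequest:error: returns nil on failure and an empty array when nothing matches.
        NSArray *results = [managedObjectContext executeFetchRequest:fetchRequest error:&error];
        if (results != nil && [results count] == 0){
            // No matches found, create a new object
            NSManagedObject *object = [NSEntityDescription insertNewObjectForEntityForName:[entity name] inManagedObjectContext:managedObjectContext];
            // Set the value through KVC, since NSManagedObject itself declares no uniqueField property.
            [object setValue:value forKey:@"uniqueField"];
            results = [NSArray arrayWithObject:object];
        }

        if (completion != nil){
            completion(results, error);
        }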
    }];

}

This will become your main method for getting objects. In the scenario you describe in your question, you periodically get data from some source that should be applied to managed objects. Using the above method, the process would look something like this.

[MyEntityClass findOrCreateWithUniqueValue:value inManagedObjectContext:managedObjectContext completion:^(NSArray *results, NSError *error){
    if ([results count] > 0){
        for (NSManagedObject *object in results){
            // Set your new values.
            [object setValue:newValue forKey:@"someValue"];
        }
    } else {
        // No results, check the error and handle here!
    }
}];

This can be done easily, efficiently, and with appropriate data integrity. You can take advantage of batch faulting in your fetch implementation, and so on, if you are willing to spend the memory. After doing the above for all your incoming data, the context can be saved, and the objects and their values are efficiently pushed to the parent store.

This is the preferred way to implement uniqueness using Core Data. It is mentioned very briefly and indirectly in the Core Data Programming Guide.

To expand on this ... It's not unusual to need a bulk find-or-create. In your scenario, you receive a list of updates to be applied to your managed objects, creating new objects if they don't exist. Obviously, the example find-or-create method above can do this, but you can also do it much more efficiently.

Core Data has a concept called "batch faulting". Instead of faulting in each object individually, if you know you are going to use multiple objects, you can fault them all in at once. This means fewer trips to disk and better performance.

A bulk find-or-create method can take advantage of this. Keep in mind that since all of these objects will now have their faults "fired", this will use more memory, but no more than if you called the single find-or-create above on each of them.

Instead of repeating the entire previous method, I will paraphrase:

// 'values' is a collection of your unique identifiers.
+ (void) findOrCreateWithUniqueValues:(NSArray *)values inManagedObjectContext:(NSManagedObjectContext *)managedObjectContext completion:(void (^)(NSArray *results, NSError *error))completion {
    ...
    // Using IN lets a single fetch retrieve every matching object (a batch fault)
    fetchRequest.predicate = [NSPredicate predicateWithFormat:@"SELF.uniqueField IN %@", values];
    // returnsObjectsAsFaults works inconsistently across versions.
    fetchRequest.returnsObjectsAsFaults = NO;
    ...
    NSArray *results = [managedObjectContext executeFetchRequest:fetchRequest error:&error];
    // uniqueField values we initially wanted
    NSSet   *wanted = [NSSet setWithArray:values];
    // uniqueField values we got from the fetch
    NSMutableSet    *got    = [NSMutableSet setWithArray:[results valueForKeyPath:@"uniqueField"]];
    // uniqueField values we will need to create: the difference between wanted and got.
    // When nothing was fetched, this is simply everything we wanted.
    NSMutableSet    *need   = [NSMutableSet setWithSet:wanted];
    [need minusSet:got];

    NSMutableSet *resultSet = [NSMutableSet setWithArray:results];
    // At this point, walk the values in need, insert new objects and set uniqueField values, add to resultSet
    ...
    // And then pass [resultSet allObjects] to the completion block.

}
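The step the method above leaves as a comment ("walk the values in need") could look roughly like the sketch below. InsertMissingObjects is a hypothetical helper name, and setting the value through KVC is an assumption made because the concrete entity class is not shown in the answer.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper, not part of the answer's code: inserts one object per
// missing uniqueField value and returns the fetched and new objects together.
// Must be called on the context's queue, e.g. inside performBlock:.
static NSArray *InsertMissingObjects(NSSet *need,
                                     NSArray *fetched,
                                     NSString *entityName,
                                     NSManagedObjectContext *context) {
    // Start from the objects the batch fetch already returned.
    NSMutableSet *resultSet = [NSMutableSet setWithArray:(fetched ?: @[])];
    for (id uniqueValue in need) {
        NSManagedObject *object =
            [NSEntityDescription insertNewObjectForEntityForName:entityName
                                          inManagedObjectContext:context];
        // KVC is used because the entity's class is not known here.
        [object setValue:uniqueValue forKey:@"uniqueField"];
        [resultSet addObject:object];
    }
    return [resultSet allObjects];
}
```

The returned array is what would be handed to the completion block in place of [resultSet allObjects].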

Making good use of batch faulting can be a huge boost for any application that deals with many objects at once. As always, profile with Instruments. Unfortunately, faulting behavior has varied significantly between Core Data releases. In older versions, an additional fetch using managed object IDs was even more beneficial. Your mileage may vary.

adonoho · Answer 2 · 2014-09-20T22:20:37+0000

Individual calls to the database are expensive compared to a single call that compares against a set of identifiers. Where you are comparing against a single value, you can instead compare against a group of values by using the IN operator on a set or array. Hence, after pulling down the batch, extract the IDs, perhaps using -valueForKey:, and rewrite the above to accept an array of values.

malhal · Answer 3 · 2016-07-24T19:46:16+0000

I think it is fine to fetch within validation methods, e.g. validateForInsert. This is actually the only way to get the error back to the context if the fetch fails. Just make sure you pass the error param to your fetch and return false if the fetch returned nil.
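A minimal sketch of what this answer describes, adapted from the validator in the question. The class name MyEntity, the SELF exclusion in the predicate, and the choice of NSCocoaErrorDomain with NSManagedObjectValidationError are assumptions for illustration, not something the answer specifies.

```objc
#import <CoreData/CoreData.h>

// Hypothetical entity class; "uniqueField" is the attribute name from the question.
@interface MyEntity : NSManagedObject
@end

@implementation MyEntity

- (BOOL)validateUniqueField:(id *)ioValue error:(NSError * __autoreleasing *)outError {
    NSFetchRequest *fetchRequest = [NSFetchRequest fetchRequestWithEntityName:self.entity.name];
    // Assumption: exclude the object being validated so that it cannot match itself.
    fetchRequest.predicate = [NSPredicate predicateWithFormat:@"uniqueField == %@ AND SELF != %@", *ioValue, self];

    NSError *fetchError = nil;
    NSArray *matches = [self.managedObjectContext executeFetchRequest:fetchRequest error:&fetchError];
    if (matches == nil) {
        // The fetch itself failed: hand its error back through validation.
        if (outError != NULL) {
            *outError = fetchError;
        }
        return NO;
    }
    if (matches.count > 0) {
        // Another object already uses this value.
        if (outError != NULL) {
            *outError = [NSError errorWithDomain:NSCocoaErrorDomain
                                            code:NSManagedObjectValidationError
                                        userInfo:@{ NSLocalizedDescriptionKey : @"Object must have unique value for property" }];
        }
        return NO;
    }
    return YES;
}

@end
```

Note that this still performs one fetch per validated object, so the cost concerns raised in the first answer apply unchanged.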
