NSRegularExpression Crash

I am using NSRegularExpression to extract image URLs from HTML. However, when I run it, I get the following error:

*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[NSRegularExpression enumerateMatchesInString:options:range:usingBlock:]: nil argument'

I have looked at other Stack Overflow answers, but the question I found uses NSMatchingOptions, which I do not, and its answer does not explain what is wrong in my situation.

Here is the code that crashes:

NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img\\s[\\s\\S]*?src\\s*?=\\s*?['\"](.*?)['\"][\\s\\S]*?>)+?" options:NSRegularExpressionCaseInsensitive error:nil];
NSString *source = [NSString stringWithContentsOfURL:[NSURL URLWithString:object[@"link"]] encoding:NSUTF8StringEncoding error:nil];

NSArray *imageResults = [regex matchesInString:source options:0 range:NSMakeRange(0, source.length)];
NSURL *link = [imageResults.firstObject URL];
UIImage *img = [UIImage imageWithData:[NSData dataWithContentsOfURL:link]];
if (img)
{
    [self.images setObject:img forKey:object[@"link"]];
    dispatch_async(dispatch_get_main_queue(), ^{
        cell.imageView.image = img;
        [cell layoutSubviews];
    });
}

      

The error itself occurs on the line where imageResults is set.

Does anyone know what is wrong with this code?



2 answers


This answer is specific to the question and the URL provided in the comment on the previous answer. It assumes there are multiple image URLs and that all of them are needed.
Note 1: The HTML encoding is NSISOLatin1StringEncoding.
Note 2: The regular expression was changed so that "src=" no longer has to be the first attribute of the tag.

NSString *urlString = @"http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNHVXXEN0DG2pblU2_FBFfeS3klRVw&clid=c3a7d30bb8a4878e06b80cf16b898331&cid=52778623354837&ei=ugAvVMDVIsLGwAHPtYG4CA&url=http://espn.go.com/new-york/nba/story/_/id/11634537/cleveland-cavaliers-open-regularly-resting-lebron-james-season";
NSURL *url = [NSURL URLWithString:urlString];
NSError *error = nil;

// Note 1: read the page as ISO Latin-1.
NSString *source = [NSString stringWithContentsOfURL:url encoding:NSISOLatin1StringEncoding error:&error];
if (source.length) {
    // Note 2: ".*?" allows other attributes between "<img" and "src=".
    NSString *regExp = @"<img.*?\\s+src=[\"']([^\"']+)";
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regExp options:NSRegularExpressionCaseInsensitive error:&error];
    NSRange matchRange = NSMakeRange(0, source.length);

    [regex enumerateMatchesInString:source
                            options:0
                              range:matchRange
                         usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        // Capture group 1 holds the value of the src attribute.
        NSRange imgRange = [result rangeAtIndex:1];
        NSLog(@"imgRange: %@, '%@'", NSStringFromRange(imgRange), [source substringWithRange:imgRange]);
    }];
}
else {
    // source is nil or empty: the download or the decoding failed.
    NSLog(@"error: %@", error);
}

      



Output:

imgRange: {18793, 68}, 'http://a.espncdn.com/espncitysites/newyork/prod/assets/sub_ny_r3.png'
imgRange: {19784, 172}, 'http://ad.doubleclick.net/ad/espn.local.newyork.com/nba;pgtyp=story;sp=nba;tm=cle;pl=1015;pl=1966;pl=2028618;pl=215;pl=2419;objid=11634537;col=mcmenamin_dave;sz=150x45,1x1;'
imgRange: {22162, 182}, 'http://ad.doubleclick.net/ad/espn.local.newyork.com/nba;pgtyp=story;sp=nba;tm=cle;pl=1015;pl=1966;pl=2028618;pl=215;pl=2419;objid=11634537;col=mcmenamin_dave;sz=1280x946,200x800,1x1;'
imgRange: {23470, 186}, 'http://ad.doubleclick.net/ad/espn.local.newyork.com/nba;pgtyp=story;sp=nba;tm=cle;pl=1015;pl=1966;pl=2028618;pl=215;pl=2419;objid=11634537;col=mcmenamin_dave;sz=728x90,970x66,924x50,1x1;'
imgRange: {29706, 36}, 'http://a.espncdn.com/icons/in_15.png'
imgRange: {30352, 103}, 'http://a.espncdn.com/media/motion/2014/1003/dm_141003_nba_schwartz_bron/dm_141003_nba_schwartz_bron.jpg'
imgRange: {31339, 37}, 'http://a.espncdn.com/icons/video2.png'
imgRange: {34098, 65}, 'http://a.espncdn.com/photo/2014/1001/nba_a_lebron01jr_300x300.jpg'
imgRange: {35987, 55}, 'http://a.espncdn.com/i/columnists/windhorst_brian_m.jpg'
imgRange: {38249, 79}, 'http://a.espncdn.com/combiner/i?img=/photo/2014/0926/nba_a_james_mb_203x114.jpg'
imgRange: {41787, 36}, 'http://a.espncdn.com/icons/in_15.png'
imgRange: {42698, 87}, 'http://a.espncdn.com/combiner/i?img=%2fi%2fcolumnists%2fmcmenamin_dave_35.jpg&w=35&h=48'
imgRange: {48148, 68}, 'http://a.espncdn.com/photo/2014/1002/nba_garnett_wiggins_203x114.jpg'
imgRange: {48834, 33}, 'http://a.espncdn.com/icons/in.gif'
imgRange: {50157, 181}, 'http://ad.doubleclick.net/ad/espn.local.newyork.com/nba;pgtyp=story;sp=nba;tm=cle;pl=1015;pl=1966;pl=2028618;pl=215;pl=2419;objid=11634537;col=mcmenamin_dave;sz=300x600,300x250,1x1;'
imgRange: {51105, 45}, '/photo/2014/1003/mlb_g_martinez_b1_110x62.jpg'
imgRange: {51801, 41}, '/photo/2014/1003/ny_u_geno2_js_110x62.jpg'
imgRange: {52491, 43}, '/photo/2014/1002/nhl_g_fleury_b3_110x62.jpg'
imgRange: {53201, 44}, '/photo/2014/1002/ny_g_betances_js_110x62.jpg'
imgRange: {53902, 42}, '/photo/2014/1003/ny_g_murphy_js_110x62.jpg'
imgRange: {54986, 66}, 'http://a.espncdn.com/i/Integrators/shop.lebron.welcome.300x100.jpg'
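
Several of the captured values above are relative paths (for example '/photo/2014/1003/ny_g_murphy_js_110x62.jpg'), so they cannot be loaded directly. The following is only a sketch of one way to turn a captured string into an absolute NSURL; the helper name ImageURLFromSource and the pageURL parameter are assumptions for illustration and are not part of the original answer.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): converts a captured
// src string into an absolute NSURL, resolving relative paths against the
// URL of the page the HTML was loaded from.
static NSURL *ImageURLFromSource(NSString *src, NSURL *pageURL) {
    if (src.length == 0) {
        return nil;
    }
    // May return nil when src cannot be parsed as a URL string.
    NSURL *resolved = [NSURL URLWithString:src relativeToURL:pageURL];
    // absoluteURL folds the base URL and the relative part into one URL.
    return resolved.absoluteURL;
}
```

Inside the enumeration block this could be called with [source substringWithRange:imgRange] and the URL the page was actually served from; absolute values such as the espncdn.com links pass through unchanged.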


There is a problem: matchesInString:options:range: returns an array of NSTextCheckingResult objects, so the matched text has to be read from each result's range.

Here is an example; you still need to add error checking:

NSString *regExp = @"<img\\s+src=[\"']([^\"']+)";
NSError *error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:regExp options:NSRegularExpressionCaseInsensitive error:&error];

NSString *source = @"leading<img src=\"news.google.com/news/…\" alt=\"Smiley face\">more";

NSArray *matchResults = [regex matchesInString:source options:0 range:NSMakeRange(0, source.length)];
NSTextCheckingResult *result0 = matchResults[0];
NSRange imgRange = [result0 rangeAtIndex:1];
NSLog(@"imgRange: %@, '%@'", NSStringFromRange(imgRange), [source substringWithRange:imgRange]);

      



Output:

imgRange: {17, 22}, 'news.google.com/news/…'
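
The error checking this example leaves out might look like the sketch below. The "nil argument" in the exception suggests that source was nil, which is what stringWithContentsOfURL:encoding:error: returns when the download or decoding fails. The helper name FirstImageSource is an assumption for illustration, and the pattern is the one from the other answer that tolerates attributes before "src=".

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): returns the src value of the
// first <img> tag in html, or nil when html is nil or empty, the pattern
// fails to compile, or nothing matches.
static NSString *FirstImageSource(NSString *html, NSError **outError) {
    // A nil string passed to NSRegularExpression raises
    // NSInvalidArgumentException, so bail out before matching.
    if (html.length == 0) {
        return nil;
    }

    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<img.*?\\s+src=[\"']([^\"']+)"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:outError];
    if (regex == nil) {
        return nil; // *outError (if provided) describes the pattern problem
    }

    NSTextCheckingResult *match =
        [regex firstMatchInString:html options:0 range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil; // no <img ... src=...> in the document
    }

    // Capture group 1 is the attribute value between the quotes.
    NSRange srcRange = [match rangeAtIndex:1];
    if (srcRange.location == NSNotFound) {
        return nil;
    }
    return [html substringWithRange:srcRange];
}
```

A caller would check the return value for nil before building an NSURL or loading image data from it.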

See the ICU User Guide: Regular Expressions.
