Splitting an NSString with a regular expression (iOS)
I have already solved this with a loop, but I would like a cleaner answer, and I am hoping a regex guru can help me.
My original strings could look like this:
    NSString *originalString = @"343 a mr smith needs this work";
    NSString *originalStringVerTwo = @"345a mr jones needs this work as well";
    NSString *originalStringVerThree = @"345 Mrs Someone";
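I need to split each one into 3 separate new strings:
- the number, with or without a trailing "a" or "b", removing any white space in between
- the person's name, which may or may not be capitalised, e.g. Mr Smith or mrs jones
- whatever comes after that: zero or more words go into the final string
For example
- 123a mr who here are some words
- 124 b mrs jones n/p
- 654 Mr Foo
- 123 Jones n/p
- 345 n/p
should produce the following results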
Line 1

    NSString *one = @"123a";
    NSString *two = @"mr who";
    NSString *three = @"here are some words";

Line 2

    NSString *one = @"124b"; // I want the white space removed between the number and the letter
    NSString *two = @"mrs jones";
    NSString *three = @"n/p";

Line 3

    NSString *one = @"654";
    NSString *two = @"Mr Foo";
    NSString *three = @"";

Line 4

    NSString *one = @"123";
    NSString *two = @"Jones";
    NSString *three = @"n/p";

Line 5

    NSString *one = @"345";
    NSString *two = @"n/p";
    NSString *three = @"";
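The constants are:
- a 3-digit number with or without a trailing "a" or "b" (123, 123a, 123b)
- a person's name, with or without a title (Mr Jones, Jones)
- the person's name may be unknown, hence the exact text "n/p"
- the name is followed by a string of arbitrary length terminated by \n (this is a set of words\n).
Collapsing "123 a" to "123a" would be ideal, but it is not a main requirement.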
Here is a regular expression that should work:
    ^                       // start of line
    (                       // first capture group
      [\d]+                 // one or more digits
      (?:                   // start of optional non-capturing group
        \s?                 // optional whitespace
        [ab]                // character class: a or b
      )?                    // end of optional non-capturing group
    )                       // end of first capture group
    \s                      // whitespace
    (                       // second capture group
      (?:                   // non-capturing group
        Mr|Ms|Mrs|Mister    // title alternation
      )
      \s                    // whitespace
      [\w/]+                // one or more word characters or "/"
      |                     // alternation
      [\w/]+                // one or more word characters or "/"
    )                       // end of second capture group
    (?:                     // start of optional non-capturing group
      \s                    // whitespace
      (                     // third capture group
        .*                  // zero or more of any character
      )                     // end of third capture group
    )?                      // end of optional non-capturing group
    $                       // end of line
Build your regular expression. The backslashes have to be escaped so that they survive inside the NSString literal:
    NSString *regexString = @"^([\\d]+(?:\\s?[ab])?)\\s((?:Mr|Ms|Mrs|Mister)\\s[\\w/]+|[\\w/]+)(?:\\s(.*))?$";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:regexString
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:nil];
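Passing `error:nil` means a typo in the pattern silently produces a nil regex. The sketch below shows one way to surface that; `CompilePattern` is a hypothetical helper name, not part of the original answer. It also illustrates the escaping point: `@"\\d"` in source code is the two-character string `\d` at run time.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption for illustration): compiles a
// case-insensitive pattern and logs the reason if the pattern is invalid.
static NSRegularExpression *CompilePattern(NSString *pattern) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // The initializer returns nil and fills in the error for a bad pattern.
        NSLog(@"invalid pattern '%@': %@", pattern, error.localizedDescription);
    }
    return regex;
}
```

For example, `CompilePattern(@"(unclosed")` should return nil and log the reason, while a valid pattern comes back ready to use.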
Make a test array:
    NSArray *testArray = @[
        @"123a mr who here are some words",
        @"124 b mrs jones n/p",
        @"654 Mr Foo",
        @"123 Jones n/p",
        @"345 n/p",
        @"345",
        @"nothing here"
    ];
Process the test array:
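    for (NSString *string in testArray) {
        NSLog(@" ");
        NSLog(@"input: '%@'", string);
        NSRange range = NSMakeRange(0, string.length);
        if ([regex numberOfMatchesInString:string options:0 range:range] == 1) {
            // Rewrite the match as "group1\ngroup2\ngroup3", then split on the newlines.
            NSString *body = [regex stringByReplacingMatchesInString:string
                                                             options:0
                                                               range:range
                                                        withTemplate:@"$1\n$2\n$3"];
            NSArray<NSString *> *result = [body componentsSeparatedByString:@"\n"];
            // Group 1 can still hold a space between the digits and the letter
            // ("124 b"), so strip it here (assumes the separator is a plain space).
            NSString *one = [result[0] stringByReplacingOccurrencesOfString:@" " withString:@""];
            NSString *two = result[1];
            NSString *three = result[2];
            NSLog(@"one: '%@'", one);
            NSLog(@"two: '%@'", two);
            NSLog(@"three: '%@'", three);
        } else {
            NSLog(@"no match");
        }
    }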
Output:
    input: '123a mr who here are some words'
    one: '123a'
    two: 'mr who'
    three: 'here are some words'

    input: '124 b mrs jones n/p'
    one: '124b'
    two: 'mrs jones'
    three: 'n/p'

    input: '654 Mr Foo'
    one: '654'
    two: 'Mr Foo'
    three: ''

    input: '123 Jones n/p'
    one: '123'
    two: 'Jones'
    three: 'n/p'

    input: '345 n/p'
    one: '345'
    two: 'n/p'
    three: ''

    input: '345'
    no match

    input: 'nothing here'
    no match
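The loop above goes through a replacement template and a newline split. A more direct route is to read the capture groups from the match itself. The following is only a sketch under stated assumptions: it uses a variant of the pattern in which the trailing letter gets its own capture group, so the number and letter can be joined without the space, and the helper names `GroupOrEmpty` and `SplitLine` are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: substring for a capture group, or @"" when an
// optional group did not take part in the match (location is NSNotFound).
static NSString *GroupOrEmpty(NSTextCheckingResult *match, NSUInteger index, NSString *source) {
    NSRange r = [match rangeAtIndex:index];
    return (r.location == NSNotFound) ? @"" : [source substringWithRange:r];
}

// Hypothetical helper: returns @[number, name, rest], or nil when the line
// does not match. Groups: 1 = digits, 2 = optional a/b, 3 = name, 4 = rest.
static NSArray<NSString *> *SplitLine(NSString *line) {
    NSString *pattern =
        @"^(\\d+)(?:\\s?([ab]))?\\s((?:Mr|Ms|Mrs|Mister)\\s[\\w/]+|[\\w/]+)(?:\\s(.*))?$";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"invalid pattern: %@", error.localizedDescription);
        return nil;
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:line options:0 range:NSMakeRange(0, line.length)];
    if (match == nil) {
        return nil;
    }
    // Joining groups 1 and 2 drops any whitespace that sat between them.
    NSString *number = [GroupOrEmpty(match, 1, line)
        stringByAppendingString:GroupOrEmpty(match, 2, line)];
    return @[number, GroupOrEmpty(match, 3, line), GroupOrEmpty(match, 4, line)];
}
```

Called as `SplitLine(@"124 b mrs jones n/p")`, this should give the three strings "124b", "mrs jones" and "n/p". In real code the regex would normally be compiled once and reused rather than rebuilt on every call.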