Java - regular expression to detect a tag
I am trying to detect the paragraphs in an XML file. I used this code:
Pattern p = Pattern.compile("<paragraph>\\s*?(.*?)\\s*?(.*?)\\s*?(.*?)</paragraph>");
Matcher m = p.matcher(ne);
int occur = 1;
while (m.find()) {
    // Print the full text of each match.
    System.out.print("word = " + ne.substring(m.start(), m.end()) + "\n");
}
The problem is that it only detects the first paragraph. Can anyone help, please?
dreamer said... "Simple Java project":
// import java.util.regex.Matcher;
// import java.util.regex.Pattern;

StringBuilder text = new StringBuilder();
text.append("<html><something>");
text.append("<paragraph><sentence>text 1 qwe</sentence></paragraph>");
text.append("<paragraph><sentence>text 2 qwe</sentence></paragraph>");
text.append("<zzz>this text wont go</zzz>");
text.append("<paragraph><sentence>text 3 qwe</sentence></paragraph>");
text.append("</something></html>");
System.out.println(text.toString());

// Reluctant (.*?) stops at the first closing tag, so each paragraph matches separately.
Pattern p = Pattern.compile("<paragraph>(.*?)</paragraph>");
Matcher m = p.matcher(text.toString());
while (m.find()) {
    System.out.print("word = " + m.group() + "\n");
}
Output:
<html><something><paragraph><sentence>text 1 qwe</sentence></paragraph><paragraph><sentence>text 2 qwe</sentence></paragraph><zzz>this text wont go</zzz><paragraph><sentence>text 3 qwe</sentence></paragraph></something></html>
word = <paragraph><sentence>text 1 qwe</sentence></paragraph>
word = <paragraph><sentence>text 2 qwe</sentence></paragraph>
word = <paragraph><sentence>text 3 qwe</sentence></paragraph>
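The example above keeps all the text on a single line. If the paragraphs in the real file span several lines (an assumption, since the original file is not shown), the same pattern can miss them, because by default `.` does not match line terminators in Java. A minimal sketch that may help in that case: it compiles the pattern with Pattern.DOTALL and reads the inner content through capture group 1. The class name ParagraphExtractor and the method extractParagraphs are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParagraphExtractor {

    // DOTALL lets '.' match line terminators, so a paragraph may span lines.
    // The reluctant (.*?) stops at the first closing tag.
    private static final Pattern PARAGRAPH =
            Pattern.compile("<paragraph>(.*?)</paragraph>", Pattern.DOTALL);

    // Returns the trimmed inner content of every <paragraph> element, in order.
    static List<String> extractParagraphs(String xml) {
        List<String> result = new ArrayList<>();
        Matcher m = PARAGRAPH.matcher(xml);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input (made up for this sketch); the first paragraph spans lines.
        String xml = "<doc>\n"
                + "<paragraph>\n<sentence>text 1</sentence>\n</paragraph>\n"
                + "<paragraph><sentence>text 2</sentence></paragraph>\n"
                + "</doc>";
        for (String p : extractParagraphs(xml)) {
            System.out.println("paragraph = " + p);
        }
    }
}
```

This sketch assumes that paragraph tags carry no attributes and are never nested; for anything more irregular, an XML parser is likely a safer choice than a regular expression.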