node.js - How to extract HTML content using XPath with a Node.js module


I am looking for an HTML content extractor that works with XPath. I have seen various Node.js modules:

jsdom, htmlparser2, xpath, cheerio

I found cheerio works well for getting data by class, id, tag, etc., but I am not able to get data by specifying an XPath. With the xpath Node.js module I am able to get data by XPath from smaller HTML, but longer HTML gives different types of errors:

entity not found:  @#[line:120,col:9], unclosed xml attribute @#[line:1,col:877]

Note: I have no permission to change the HTML in any way.

E.g., if the HTML is:

<html>
<body>
  <div>
    <ul id="fruits">
      <li class="apple">apple</li>
      <li class="orange">orange</li>
      <li class="pear">pear</li>
    </ul>
  </div>
</body>
</html>

If I use this HTML and give the XPath //*[@id="fruits"]/li[2] to find the element with the xpath Node.js module, I do not get any error and I get the result orange. But if I use the HTML of the page http://www.infotaxi.org/india_taxi/ahmedabad_taxi.htm (which is much longer) and try to access part of the text using the XPath

//*[@id="navlistmeniu"]/li[3]/a/b

I get the error

entity not found:  @#[line:120,col:9]
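
One way to narrow this down, offered as a sketch and not a confirmed diagnosis: assuming the page was parsed with a strict XML parser (the xpath module is commonly paired with xmldom), an "entity not found" message may point at an HTML named entity such as &nbsp; that XML does not define. XML predefines only five entities (amp, lt, gt, quot, apos). The helper below, findNonXmlEntities, is a hypothetical name introduced here for illustration; it uses no third-party module and only reports where such entities occur, so the HTML itself is left unchanged. The sample markup is made up for the demonstration.

```javascript
// The five named entities that XML defines without a DTD.
const XML_PREDEFINED = new Set(['amp', 'lt', 'gt', 'quot', 'apos']);

/**
 * Hypothetical diagnostic helper (not part of any library):
 * lists named entity references that a strict XML parser would not
 * recognise, with the 1-based line and column of each occurrence.
 */
function findNonXmlEntities(markup) {
  const hits = [];
  markup.split('\n').forEach((text, i) => {
    const re = /&([A-Za-z][A-Za-z0-9]*);/g;
    let m;
    while ((m = re.exec(text)) !== null) {
      if (!XML_PREDEFINED.has(m[1])) {
        hits.push({ entity: m[0], line: i + 1, col: m.index + 1 });
      }
    }
  });
  return hits;
}

// Made-up sample: the fruits list with a few HTML-only entities added.
const sample = [
  '<ul id="fruits">',
  '  <li class="apple">apple&nbsp;&amp;&nbsp;pear</li>',
  '  <li class="orange">orange &copy; 2014</li>',
  '</ul>',
].join('\n');

const found = findNonXmlEntities(sample);
console.log(found);
```

Running the same function over the saved source of the real page and comparing the reported positions with line 120, column 9 would show whether an undefined entity sits at that spot. If it does, the failure comes from XML-style parsing of HTML and not from the XPath expression.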

Using cheerio, I am able to extract data by class, id, tag, etc., but not by XPath.

Please help.

