javascript - Cannot download HTML with PhantomJS
I have 3 different files in my project. The layout is:
- phantomjs
- -->phantomjs.js
- -->phantomjs.exe
- index.php
index.php:
$phantom_script = dirname(__file__). '\phantomjs\phantomjs.js';
$response = exec ('\phantomjs\phantomjs.exe' . $phantom_script);
echo $response;
phantomjs\phantomjs.js:
var webpage = require('webpage');
var page = webpage.create();
page.open('http://www.google.com', function(status) {
    console.log(page.content);
    phantom.exit();
});
Your usage of PhantomJS is correct according to the documentation: http://phantomjs.org/api/webpage/property/content.html
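PHP's exec function returns only the last line of the command's output, and that line may be blank or just whitespace. http://php.net/manual/fr/function.exec.php
You should pass the second parameter, &$output, which is passed by reference. It is filled with an array containing every line of the output.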
A problem you may face later: the page content needs to be evaluated before you try to read the DOM document's content, for example by reading the innerHTML of the html tag, i.e.: $('html').html();
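As a rough sketch of reading the markup from inside the page rather than through page.content, the script below uses page.evaluate, which runs its callback in the page's own context. It avoids jQuery by returning document.documentElement.outerHTML. The helper name getRenderedHtml is made up for illustration, and the `typeof phantom` guard is only there so the PhantomJS-specific part runs under PhantomJS alone.

```javascript
// Hypothetical helper (not part of PhantomJS): returns the rendered markup
// of an already loaded page. page.evaluate runs the callback inside the
// page, so `document` below is the page's DOM, not the script's scope.
function getRenderedHtml(page) {
  return page.evaluate(function () {
    return document.documentElement.outerHTML;
  });
}

// The `phantom` global only exists when the script is run by PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.open('http://www.google.com', function (status) {
    if (status === 'success') {
      console.log(getRenderedHtml(page));
    } else {
      console.log('Failed to load page: ' + status);
    }
    phantom.exit();
  });
}
```

The output spans many lines, so the PHP side still has to collect the whole output array instead of relying on the return value of exec.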
If the page does not have jQuery, you may include it yourself; see this example: https://github.com/ariya/phantomjs/blob/master/examples/phantomwebintro.js
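A minimal sketch of that idea, assuming jQuery is loaded from a CDN: page.includeJs injects the script into the page and calls back once it has loaded, after which $ is available inside page.evaluate. The helper name htmlViaJquery and the jQuery URL are assumptions chosen for illustration, so substitute whichever copy of jQuery you actually want to use.

```javascript
// Hypothetical helper: injects jQuery from `jqueryUrl` into the page, then
// passes the inner HTML of the <html> element to `done`.
function htmlViaJquery(page, jqueryUrl, done) {
  page.includeJs(jqueryUrl, function () {
    // Runs inside the page, where the freshly injected `$` is defined.
    var html = page.evaluate(function () {
      return $('html').html();
    });
    done(html);
  });
}

// The `phantom` global only exists when the script is run by PhantomJS.
if (typeof phantom !== 'undefined') {
  // Assumed URL; any reachable copy of jQuery should work.
  var JQUERY_URL = 'https://code.jquery.com/jquery-1.12.4.min.js';
  var page = require('webpage').create();
  page.open('http://www.google.com', function (status) {
    if (status !== 'success') {
      console.log('Failed to load page: ' + status);
      phantom.exit(1);
      return;
    }
    htmlViaJquery(page, JQUERY_URL, function (html) {
      console.log(html);
      phantom.exit();
    });
  });
}
```

Note that $('html').html() returns the contents of the html element without the enclosing tag itself.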
Note that Google may actively try to prevent users from scraping and saving its search results. I am not sure about that, though.