I have a very simple PHP proxy that uses only file_get_contents to fetch an HTML page and encode it with htmlentities as UTF-8. The jQuery code that makes the AJAX call should then load the whole document, tags included (<html><head></head><body>code..</body></html>), into an iframe so that I can traverse it with jQuery and look for inputs.
Is there a way to do this? Other approaches are welcome too; I only chose an iframe because I thought it was the best option. Since the response is a complete HTML document with a doctype and everything, I don't think I can just append it to a div and traverse it. My jQuery code is as follows:
    $(document).ready(function(){
        var globalCount = 0;
        function countInputs(data, url){
            var unparsedHTML = data.html; // get data from json object which is in htmlentities
            var iframeCreate = $('<iframe id="iframe"></iframe>');
            var iframe = $('#iframe');
            if(iframe.length){
                iframe.remove(); // if iframe exists remove it to clean it
                iframeCreate.insertAfter($('#result')); //create iframe
            }else{
                iframeCreate.insertAfter($('#result')); //create iframe
            }
            iframe.html(unparsedHTML).text(); // insert html in iframe using html(text).text() to decode htmlentities as seen in some stackoverflow examples
            var inputs = iframe.contents().find('input'); //find inputs on iframe
            var count = inputs.length;
            var output = '';
            globalCount = globalCount + count;
            output = "Count for url: " + url + " is: " + count + " , the global count is: " + globalCount;
            console.log(output);
            $('#result').append(output);
        }
        /*SNIP ----- SNIP */
        function getPage(urls){
            console.log("getPage");
            for(i = 0; i < urls.length; i++){
                var u = urls[i];
                console.log("new request: " + urls[i]);
                var xhr = $.ajax({
                    url: "xDomain.php",
                    type: "GET",
                    dataType: "json",
                    data: {
                        "url": u
                    }
                });
                xhr.done(function(data){
                    console.log("Done, starting next function");
                    countInputs(data, u);
                });
                xhr.fail(function (jqXHR, textStatus, errorThrown) {
                    if (typeof console == 'object' && typeof console.log == 'function') {
                        console.log(jqXHR);
                        console.log(textStatus);
                        console.log(errorThrown);
                    }
                });
            }
        }
        /*SNIP------------SNIP*/
    });
The problem is that nothing is ever loaded into the iframe. No error is thrown, and the request succeeds and returns the HTML in the response. For example, if I give the script the URL http://google.com, it should report a count of N inputs, because if you go to Google and type javascript: alert(document.getElementsByTagName('input').length) into the address bar, it alerts N, the number of inputs on the page. I hope the example makes everything clearer.
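
Since other approaches are welcome, one iframe-free direction might be to decode the entity-encoded string back into markup and count the inputs in a parsed document instead of an attached frame. The following is only a minimal sketch under assumptions: `decodeBasicEntities` and `countInputTags` are hypothetical helper names, the decoder handles only the five markup-significant entities (htmlentities can emit many more named entities), and the regex branch is a rough fallback for environments that have no `DOMParser`.

```javascript
// Hypothetical helper: decodes only the five markup-significant entities.
// A single pass is used so that "&amp;lt;" becomes "&lt;" and is not decoded twice.
function decodeBasicEntities(encoded) {
  var map = {
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#039;': "'",
    '&amp;': '&'
  };
  return encoded.replace(/&(?:lt|gt|quot|#039|amp);/g, function (match) {
    return map[match];
  });
}

// Hypothetical helper: counts <input> elements in a complete HTML string.
function countInputTags(html) {
  if (typeof DOMParser !== 'undefined') {
    // In a browser, parse into a detached document; nothing is added to the page.
    var doc = new DOMParser().parseFromString(html, 'text/html');
    return doc.getElementsByTagName('input').length;
  }
  // Rough fallback without a DOM: counts opening <input tags textually,
  // so it would also count ones inside comments or scripts.
  var matches = html.match(/<input\b/gi);
  return matches ? matches.length : 0;
}

// Sample payload standing in for data.html from the proxy (assumed shape).
var sample = '&lt;html&gt;&lt;body&gt;&lt;input type=&quot;text&quot;&gt;' +
             '&lt;input type=&quot;submit&quot;&gt;&lt;/body&gt;&lt;/html&gt;';
var decoded = decodeBasicEntities(sample);
var inputCount = countInputTags(decoded);
console.log(inputCount);
```

Inside countInputs, the same idea would amount to calling `countInputTags(decodeBasicEntities(data.html))` rather than reading the count from the iframe.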