Greetings:
I've been trying to implement CrawlTrack on a site and I noticed that on any page that invokes CrawlTrack using the following 2 lines:
$crawltsite=1;
require_once("C:/xampp/htdocs/site1/crawltrack/crawltrack.php");
The page doesn't load, and the browser throbber spins until the request times out. Tracing through the code, I found that the system hangs in file "C:\xampp\htdocs\crawltrack\include\searchenginesposition.php" at line 1767, which reads:
$crawltxml1 = file_get_contents($crawltquery1);
On the line before, it sets the variable $crawltquery1 to a Yahoo Web Search Service URL (I couldn't post the URL because of the anti-spam limitations here).
It seems to be going out to the web to pull something from Yahoo that it cannot access. I am running behind a proxy server, and I have set the $proxyhost and $proxyport variables on lines 379 & 380 and 1272 & 1273, respectively, and still no love.
My questions are these:
- Why is CrawlTrack going out to Yahoo? There are other lines of code that also have it access exalead.com (line 1865) and del.icio.us (line 2037). What is the purpose of this? Is this required for CrawlTrack functionality?
- Is there some other place where I need to designate proxy information in addition to what I already have set, listed above?
- Is there any other configuration that is required that I'm missing?
- Is there an error log somewhere that might give some hint why CrawlTrack is failing?
Any information appreciated.
Last edited by 137th Gebirg (11-08-2009 19:42:39)
Anyone?
Hi,
Sorry for the delay (2 messages + 1 email in 24 hrs; it's great to see how impatient you are to use CrawlTrack).
To answer your question: CrawlTrack sends queries to Yahoo, MSN, Exalead and Del.icio.us once a day in order to collect the information needed to fill in the indexation page.
With a fresh install, the first visit to the site triggers the first query (to Yahoo, to get the number of backlinks of the first site). The second visit triggers the second query (to Yahoo, to get the number of pages indexed if there is only one site in CrawlTrack, or otherwise to get the number of backlinks of the second site). And so on, until CrawlTrack has collected all the information it needs (six queries per site). After that, CrawlTrack stops these queries until the following day.
You can try with an empty searchenginesposition.php file to see whether these external queries are the only cause of your problem.
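To make the schedule described above concrete, here is a minimal sketch of that ordering. It is not CrawlTrack's actual code: the function name `next_query` and the idea of a per-day counter of completed queries are assumptions made purely for illustration. It shows only that each visit runs one query, that every site is covered for one query type before moving to the next type, and that nothing more is sent once all six queries per site are done.

```php
<?php
// Hypothetical illustration only; CrawlTrack's real implementation may differ.
// $doneToday: number of external queries already completed today (assumed counter).
// $siteCount: number of sites registered in CrawlTrack.
// Returns which query type (0-based) and which site (0-based) the next visit
// would trigger, or null when everything has been collected for today.
function next_query(int $doneToday, int $siteCount, int $queriesPerSite = 6): ?array
{
    if ($siteCount < 1) {
        return null; // no site configured, nothing to query
    }
    $total = $siteCount * $queriesPerSite;
    if ($doneToday >= $total) {
        return null; // all information collected; wait for the following day
    }
    return [
        'query' => intdiv($doneToday, $siteCount), // e.g. 0 = backlinks, 1 = indexed pages
        'site'  => $doneToday % $siteCount,        // sites are cycled within each query type
    ];
}
```

With one site, the second visit (`next_query(1, 1)`) moves on to the second query type; with two sites, the second visit (`next_query(1, 2)`) is still the first query type, but for the second site.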
Jean-Denis
AH! Thank you! That got it working behind the proxy server. I will re-add the reference to searchenginesposition.php when a proxy server is not present and see what happens. That should hopefully take care of it. Thanks again for your help! I definitely wanted to get this system working.
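For anyone who wants to keep the external queries while sitting behind a proxy: a plain file_get_contents() call does not go through a proxy unless it is given a stream context that says so, and it can block for a long time if the remote host is unreachable. The sketch below is a generic PHP pattern, not a patch to CrawlTrack. The helper names, the proxy host `proxy.example.local`, the port and the 5 second timeout are all made-up placeholders, and whether CrawlTrack's own $proxyhost and $proxyport settings are wired in this way is not something this thread confirms.

```php
<?php
// Hypothetical helpers for illustration; not part of CrawlTrack.

// Build HTTP stream-context options that route the request through a proxy
// and stop waiting after $timeoutSeconds.
function build_proxy_context_options(string $proxyHost, int $proxyPort, float $timeoutSeconds = 5.0): array
{
    return [
        'http' => [
            'proxy'           => "tcp://{$proxyHost}:{$proxyPort}",
            'request_fulluri' => true,            // many proxies expect the full URI in the request line
            'timeout'         => $timeoutSeconds, // read timeout in seconds
        ],
    ];
}

// Fetch a URL through the proxy; returns null on failure.
function fetch_via_proxy(string $url, string $proxyHost, int $proxyPort, float $timeoutSeconds = 5.0): ?string
{
    $context = stream_context_create(build_proxy_context_options($proxyHost, $proxyPort, $timeoutSeconds));
    $body = @file_get_contents($url, false, $context); // @ suppresses the warning emitted on failure
    return $body === false ? null : $body;
}

// Example call (placeholder values):
// $xml = fetch_via_proxy($crawltquery1, 'proxy.example.local', 8080);
```

Returning null on failure lets the calling page carry on and render even when the external service cannot be reached.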