Universal Information Crawler is a fast, precise, and reliable Internet crawler. UICrawler is a program/automated script that browses the World Wide Web in a methodical, automated manner and builds an index of the documents it accesses.
Download the code HERE.
UICrawler downloads the first website, scans its HTML for anchor tags, and extracts the outgoing links. When it finds <a href="http://uicrawler.sourceforge.net">website</a>, it copies the link and adds it to the list of pages it plans to crawl; each of those pages is then crawled and analyzed in the same way. Every downloaded page is saved into the "web" folder, so there is no need to re-download pages if a job was stopped.
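The loop described above can be sketched roughly as follows. This is a minimal illustration, not UICrawler's actual code: the function names (extract_links, cache_path, crawl) and the SHA-1 cache-file naming are assumptions made for the example; only the overall flow (parse anchors, queue links, cache pages in a "web" folder) comes from the description.

```python
import hashlib
import os
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return all outgoing links found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

def cache_path(url, cache_dir="web"):
    """Map a URL to a stable filename inside the cache folder
    (hashing the URL is an assumption of this sketch)."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".html")

def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl: fetch a page, cache it, queue its links.
    `fetch` is a caller-supplied downloader, e.g. urllib-based."""
    os.makedirs("web", exist_ok=True)
    queue, seen = [start_url], set()
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        path = cache_path(url)
        if os.path.exists(path):
            # Page already on disk from an earlier run: skip re-download.
            with open(path, encoding="utf-8") as f:
                html = f.read()
        else:
            html = fetch(url)
            with open(path, "w", encoding="utf-8") as f:
                f.write(html)
        queue.extend(extract_links(html))
    return seen
```

For example, extract_links('<a href="http://uicrawler.sourceforge.net">website</a>') returns ['http://uicrawler.sourceforge.net'], which the crawl loop would then add to its queue.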
To start using UICrawler, go to our UICrawler Wiki Page.
For an overview of the architecture, see the
UICrawler Resource Guide.
Our software is 100% GPL.
If yours is 100% GPL compliant, then you have no obligation to pay us for a license, and you can include UICrawler in your project.
You may modify the code any way you see fit. It is a great opportunity for the open source community
and for those of you who are developing open source software.
Why use it?
The Universal Information Crawler is about functional, practical crawling, not the latest bleeding-edge tricks that run
on only 2% of bleeding-edge hardware.
Supported Platforms:
Please let us know if our program works on platforms not mentioned here.
View the Resources page or the UICrawler Wiki Page for more information. Fully certified compliance on every system is still sometimes a pipe dream, and we do not expect crawler-perfect downloading across every data source available. But we do test as many as we can. Share your results with us on the UICrawler Mailing List.
Bandwidth graciously donated by Sourceforge.net