April 18, 2008 in Software (E)
[prMac.com] New York - With Blue Crab you can download all of a website's content, including HTML, PDF, graphics, video, file archives, etc., or use selective filtering to restrict downloads to specific kinds of files. For example, you can choose to save only the JPEG images Blue Crab finds, or just the PDFs.
Starting with a single URL, Blue Crab traverses the site by following the links on the textual pages it finds (HTML, CSS, etc.). Blue Crab does not stray off the domain of the starting URL; in other words, it won't download the whole web! Moreover, you can restrict Blue Crab to a subset of the given website by specifying strings that must match parts of the URL. There is also a convenient "stay in folder" option in every grabber window, which restricts the crawl to URLs whose path begins with the path of the starting URL.
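To make the scoping rules above concrete, here is a minimal Python sketch of such a check. It is an illustration only, not Blue Crab's actual code: the function name `in_scope` and its parameters are hypothetical, the "domain" is approximated by comparing host names, and the sketch assumes that every filter string must appear in the URL.

```python
from urllib.parse import urlsplit


def in_scope(url, start_url, stay_in_folder=False, required_substrings=()):
    """Return True if `url` falls inside the crawl scope set by `start_url`."""
    start = urlsplit(start_url)
    candidate = urlsplit(url)

    # Never leave the host of the starting URL (assumption: "domain" == host name).
    if candidate.hostname != start.hostname:
        return False

    # Optional "stay in folder": the path must begin with the starting URL's folder.
    if stay_in_folder:
        folder = start.path.rsplit("/", 1)[0] + "/"
        if not candidate.path.startswith(folder):
            return False

    # Optional string filters (assumption: all given strings must occur in the URL).
    return all(text in url for text in required_substrings)
```

For example, with a starting URL of `http://example.com/docs/index.html` and `stay_in_folder=True`, the URL `http://example.com/docs/a/b.pdf` is in scope, while `http://example.com/other/x.html` is not.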
Blue Crab has a special feature called the "Media Grabber" which you can use to easily download just the graphics, movies or PDFs on a website. When finding images, you can view a mini slide show as they are downloaded. You also have the option of "flattening" the download directory, i.e. putting all the downloaded images into one folder, or of preserving the folder structure on the server (just as when downloading a complete website for offline viewing).
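The difference between a "flattened" download and one that preserves the server's folder structure can be sketched in a few lines of Python. This is a hypothetical illustration, not Blue Crab's implementation: the function `local_path`, the `downloads` root folder and the `index.html` fallback name are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def local_path(url, root="downloads", flatten=False):
    """Map a URL to a local file path, flattened or mirroring the server layout."""
    parts = urlsplit(url)
    path = unquote(parts.path)

    # Keep only real path segments (drops empty, "." and ".." pieces).
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]

    # Folder-style URLs get a placeholder file name (assumption for this sketch).
    if path == "" or path.endswith("/") or not segments:
        segments.append("index.html")

    if flatten:
        # Flattened: every file lands directly in the root folder.
        return str(PurePosixPath(root, segments[-1]))

    # Preserved: host name plus the original folder structure.
    return str(PurePosixPath(root, parts.hostname or "", *segments))
```

In this sketch `http://example.com/img/2008/crab.jpg` becomes `downloads/crab.jpg` when flattened and `downloads/example.com/img/2008/crab.jpg` otherwise. A flattened layout can produce name collisions when two folders contain files with the same name; the sketch does not try to resolve them.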
Blue Crab sports two types of "grabbers" for crawling a website. The first kind, called "classic" since it was the initial grabber type in Blue Crab, uses the original methods provided by the Mac OS for downloading URLs. The classic grabber shows very detailed progress statistics, and in particular the server's HTTP response headers in log format (which provide information such as the server type).
The alternative "quick" grabber is the newer type, and forms the basis of the "download one page" and "media" grabbers. The term "quick" comes from the fact that this grabber uses newer Mac OS technology and doesn't provide as many progress statistics as the classic version. In benchmark testing it can be up to 20% faster. In addition, for certain websites, one type of grabber may yield better results than the other.
As an alternative to crawling a website, sometimes you may just want to download a collection of independent URLs. You might be surfing the web and, as you go along, find relevant PDFs or images for, say, a research project. Using the batch downloader, simply drag the URLs into the batch download window to collect them. When you are done, just click the "Start" button to download them all at once.
Finally, Blue Crab is gentle on the server. It processes only one URL at a time and may be configured to grab resources at preset time intervals. Considering the stress such programs can put on websites by downloading all of their content, it is only fair that they do so in small increments over longer periods of time. (If you've ever looked at your web logs you'll see that robots like Google are highly intermittent.)
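The "one URL at a time, at preset intervals" approach described above might look like the following Python sketch. It is an assumption-laden illustration rather than Blue Crab's code: `polite_fetch`, its `delay_seconds` default and the injectable `fetch` and `sleep` parameters are hypothetical names chosen for the example.

```python
import time
from urllib.request import urlopen


def _download(url):
    """Fetch one URL and return its body as bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def polite_fetch(urls, delay_seconds=5.0, fetch=_download, sleep=time.sleep):
    """Download URLs strictly one at a time, pausing between requests."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first, so the server
            # sees small increments spread over a longer period.
            sleep(delay_seconds)
        results[url] = fetch(url)
    return results
```

Because requests are never issued in parallel, the load on the server is bounded by the chosen interval; a longer `delay_seconds` trades total crawl time for a lighter footprint.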
A short tour is available that quickly introduces you to the many features of Blue Crab. If you would like to take the tour, visit the website.
* Powerful URL extraction algorithms yield very complete downloads compared to other products of this type.
* You can download websites either for offline browsing, via re-linking of URLs, or to create virtual copies of them on your hard drive, i.e. back up a website as-is.
* Supports both HTTP and HTTPS protocols as well as server realms. Detailed Grabber window shows server HTTP header response fields.
* Optionally display web pages as they are crawled using either your own web browser or Blue Crab's own web window.
* You can restrict a crawl to just the initial domain, or allow the program to "stray" into offsite URLs.
* Blue Crab can remap "dynamic" URLs which contain path and search arguments so that resulting disk files are navigable offline.
* Image URLs into JPEG files (or any other graphic file type supported by QuickTime) to create complete pictures of web pages, which is not possible simply by taking a single screenshot yourself.
* Blue Crab provides its URL imaging ability as a service menu item available from other applications.
* URLs can be imaged into files, or onto the clipboard.
* Built-in directory search displays found files as a hierarchical list, with double-clickable entries.
* A "batch window" enables you to download your own list of files. Simply drag and drop URLs into the batch window, or import them from an HTML file or text file.
* Grabbing in "images-only" mode just downloads image files, optionally "flattening" the results (i.e. putting all images into one folder).
* Reusable configuration settings control what is grabbed, saved or crawled:
  * Filter by size, date, file type or even content
  * Filter by filename, extension or path
* Bookmarks window for storing frequently crawled websites.
* Supports form submissions to enable you to begin a crawl from a login page.
* Supports cookies for more accurate downloads.
* Generate site maps that consist of a hierarchical display of URLs. Such maps can be optionally filtered by extension.
* Find email addresses.
* Customizable user-agent (spoofing) enables you to download websites which are platform specific.
* Optionally receive email notifications of Blue Crab's progress during a lengthy crawl.
* Dock badging provides visual feedback on progress when the program is in the background.
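As an illustration of the "dynamic" URL remapping mentioned in the feature list, the following Python sketch folds a URL's search arguments into a file name that can be saved to disk and opened offline. The naming scheme, the function name `remap_dynamic_url` and the `.html` default extension are assumptions made for this example; the scheme Blue Crab actually uses is not described here.

```python
import re
from urllib.parse import urlsplit


def remap_dynamic_url(url, default_ext=".html"):
    """Turn a URL with search arguments into a relative, disk-friendly file path."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")

    # Folder-style URLs get a placeholder file name.
    if path == "" or path.endswith("/"):
        path += "index" + default_ext

    # Static URLs keep their path unchanged.
    if not parts.query:
        return path

    # Replace characters that are awkward in file names ("=", "&", etc.).
    safe_query = re.sub(r"[^A-Za-z0-9._-]+", "_", parts.query).strip("_")
    return f"{path}_{safe_query}{default_ext}"
```

With this scheme, `http://example.com/view.php?id=42&page=2` is stored as `view.php_id_42_page_2.html`, so two pages that differ only in their search arguments end up in separate files that re-linked pages can point to.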
Limit Point Software has been avidly developing software for the Mac community since 1997. In order to improve the usefulness, simplicity and dependability of our products, user feedback has always been highly welcome and encouraged. Our products cover a diverse range of applications. The internet applications include bulk emailing, HTML form processing, web crawling, and document indexing and searching. The "Utilities" suite is a large collection of small programs for combining movies, processing images in batch, file property editing, downloading and converting YouTube files, and much, much more.