Description:
PyGalleryCrawler is a web crawler for online image galleries.
Installation:
tar -xzf pygallerycrawler.tar.gz
cd pygallerycrawler
Libraries used:
· psyco @ http://psyco.sourceforge.net
- performance
· Python Imaging Library (PIL) @ http://www.pythonware.com/products/pil/
- thumbnail generation
- size verification
· feedparser @ http://feedparser.org
- feed parsing
Usage:
chmod a+x pygallerycrawler.py
./pygallerycrawler.py the_url_you_want_crawl
If you edit config.py directly, your changes will be overwritten at the next update. Instead, create a personal configuration file and pass it with the --config (or -c) switch.
cp config.py ~/pgc_config.py
vi ~/pgc_config.py
./pygallerycrawler.py -c ~/pgc_config.py the_url_you_want_crawl
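As an illustration of how a -c/--config switch like the one above can work, here is a minimal Python 3 sketch that loads a Python file from an arbitrary path as a configuration module. It is not PyGalleryCrawler's actual code: the function names load_config and parse_args, the module name pgc_user_config, and the default value are assumptions made for this example.

```python
import argparse
import importlib.util
import os


def load_config(path):
    """Load the Python file at `path` and return it as a module object."""
    path = os.path.expanduser(path)  # allow "~/pgc_config.py"
    # "pgc_user_config" is an arbitrary module name chosen for this sketch.
    spec = importlib.util.spec_from_file_location("pgc_user_config", path)
    if spec is None or spec.loader is None:
        raise ValueError("cannot load configuration from %r" % path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def parse_args(argv):
    """Parse a command line of the form: [-c CONFIG] URL."""
    parser = argparse.ArgumentParser(description="hypothetical crawler CLI")
    parser.add_argument("-c", "--config", default="config.py",
                        help="path to a personal configuration file")
    parser.add_argument("url", help="gallery URL to crawl")
    return parser.parse_args(argv)
```

Settings defined in the personal file are then available as attributes of the returned module, so the shipped config.py can be replaced on update without touching them.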
Limitations:
· Downloaded pictures are not checked for duplicates. Some galleries use one of their own pictures as the presentation link, so that image will be downloaded twice.
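Until such a check exists, duplicates can be found after the download with a small separate script. The following Python 3 sketch groups the files in a directory by their SHA-256 digest. It is not part of PyGalleryCrawler, the function names are made up for this example, and it only detects byte-identical files (a resized or re-encoded copy would not match).

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory):
    """Return a list of groups of paths whose contents are identical."""
    by_digest = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            by_digest[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group lists files with the same content; all but the first path in a group can be deleted safely.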
Requirements:
· Python
What's New in This Release:
· A check on the image size of both pictures and thumbnails was added.
· Regexp support was improved.
· An internal algorithm was cleaned up.
· A simple feed was added, which can be tried if there is a direct link to the gallery.
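The image size check mentioned in the first item can be pictured as a comparison against minimum dimensions. The Python sketch below is only an assumption about what such a check might look like, not the project's code: the function name and the thresholds are hypothetical. With PIL, the (width, height) tuple would come from the size attribute of an opened image.

```python
def is_large_enough(size, min_size):
    """Return True if a (width, height) size meets both minimum dimensions.

    `size` and `min_size` are (width, height) tuples in pixels, in the same
    form as the `size` attribute of a PIL image.
    """
    width, height = size
    min_width, min_height = min_size
    return width >= min_width and height >= min_height


# Hypothetical thresholds: pictures need at least 400x300, thumbnails 50x50.
MIN_PICTURE_SIZE = (400, 300)
MIN_THUMBNAIL_SIZE = (50, 50)
```

A crawler could use such a test to skip thumbnails that are really spacer images, or "pictures" that are too small to be gallery content.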