Parsing lazygirls.info
Posted: Sat Jul 28, 2007 10:37 pm
Here's how the site works:
For any given person's set of images, the top level consists of multiple pages, each with 6 thumbnails (the filenames contain ".thumb.").
Clicking a thumbnail takes you to a page with a smaller-sized image (the filename contains ".sized.").
On that page, a link labeled "View Full Size" takes you to a page that contains the full-sized image.
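To make the three levels concrete, here is a rough Python sketch of the same filtering done by hand. It is not how PL works internally; the names classify, FullSizeLinkFinder and full_size_links are made up for illustration, and it assumes the ".thumb." and ".sized." markers and the "View Full Size" link text appear exactly as described above.

```python
from html.parser import HTMLParser


def classify(url):
    """Label an image URL by the marker in its filename (hypothetical helper)."""
    name = url.rsplit("/", 1)[-1]
    if ".thumb." in name:
        return "thumb"
    if ".sized." in name:
        return "sized"
    return "other"


class FullSizeLinkFinder(HTMLParser):
    """Collect the href of every <a> whose text is 'View Full Size'."""

    def __init__(self):
        super().__init__()
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text pieces seen inside that <a>
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text).strip().lower()
            if label == "view full size":
                self.links.append(self._href)
            self._href = None


def full_size_links(html):
    """Return the targets of all 'View Full Size' links in a page."""
    finder = FullSizeLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links
```

Running classify over the image URLs on a listing page should pick out the 6 thumbnails, and full_size_links on a ".sized." page should return the one link to follow. If either comes back empty for a page that PL skips, the page markup is probably different from what the filters expect.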
The problem:
I was able to get this to work only once: PL would go through every page, go through every thumbnail, and download each full-sized image...
...but when I tried it again with a different person, PL stopped parsing each page before it even reached the thumbnails. PL still goes through every page of the person's set, but it no longer follows the thumbnails. I tried going back to the original person whose images I had downloaded, but that no longer worked either. I also tried removing my ".sized." and ".thumb." constraints, but that didn't help.
I'm pretty sure my Option set is right; I'm wondering if PL keeps some sort of cache that isn't being renewed each time I start the project, or if anyone has another approach to creating the Option set.