Can Picaloader Handle this

Discuss how to download pictures from websites using PicaLoader.
(NO adult contents and advertisement please.)
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Can Picaloader Handle this

Post by merc_slk »

Hello,

I want to download the xlg pictures from this site. It uses JavaScript with a lot of variables, so it's hard to figure out how to parse the right URL.

I know the pictures are stored in www.nancymeyer.com/lingerie/assets/product_images/xlg/ but there's no pattern to follow.

Let's take a random page : http://www.nancymeyer.com/lingerie/sear ... %20CHARMEL

Now we take the picture named "Lise Charmel: Casting Lady Plunge Bra": http://www.nancymeyer.com/lingerie/prod ... t=2&s_id=0

Another page opens with a button labeled "click for alternate views".


A popup opens showing the xlg picture along with links to other xlg pictures. I want to download these pictures under my own naming scheme, such as "Lise charmel-Casting Lady Plunge Bra-xx", where xx is the picture number.

I think it's a hard one to crack. It might be possible with Lua, but I don't know enough to write the script myself.

Any help would be welcome.

Merc
KoalaBear
Posts: 325
Joined: Wed Sep 24, 2003 5:27 pm

Post by KoalaBear »

For PicaLoader 1.61 or later:

Start URL: http://www.nancymeyer.com/lingerie/search2.asp?s_id=0&search_freetext=LISE%20CHARMEL
Page URL Include Filters: CHARMEL;product\.asp\?pf_id=\w+&attr_value2=\w+$
Page URL Exclude Filters: Answer=
Picture URL Include Filters: -xlg\.jpg$
HTML Parser Script:

-- Locate the product id in the page URL (the text after 'pf_id=').
local s1,e1=string.find(HTML.Url,'pf_id=%u*');
-- Locate the quoted attribute list in the page source.
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
if (s1 and e1 and s2 and e2) then
  local id=string.sub(HTML.Url,s1+6,e1);
  local attrs=string.sub(HTML.Content,s2+17,e2-1);
  -- Queue one xlg picture per attribute; skip empty and one-character matches.
  for attr in string.gmatch(attrs,'[%l-]*') do
    if string.len(attr) > 1 then
      local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
      url=url .. id .. '-' .. attr .. '-xlg.jpg';
      AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
-- Then run the default parser on the page.
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
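
To check what these patterns extract without running a full download, the same logic can be tried in a standalone Lua interpreter. This is only a sketch: the sample URL, the page fragment, and the function name build_xlg_urls are invented for illustration, and the real page may format info_attributes differently.

```lua
-- Sample inputs, made up for illustration; not copied from the real site.
local sample_url = 'http://www.nancymeyer.com/lingerie/product.asp?pf_id=ABCDE&attr_value2=black'
local sample_content = 'var info_attributes="black,ivory,rose-pink";'

-- Hypothetical helper: returns the xlg picture URLs the parser script would queue.
local function build_xlg_urls(page_url, page_content)
  local urls = {}
  local s1, e1 = string.find(page_url, 'pf_id=%u*')
  local s2, e2 = string.find(page_content, 'info_attributes=%b""')
  if not (s1 and s2) then
    return urls
  end
  -- 'pf_id=' is 6 characters long, so the id starts at s1+6.
  local id = string.sub(page_url, s1 + 6, e1)
  -- 'info_attributes="' is 17 characters long; e2 is the closing quote.
  local attrs = string.sub(page_content, s2 + 17, e2 - 1)
  for attr in string.gmatch(attrs, '[%l-]*') do
    if string.len(attr) > 1 then
      urls[#urls + 1] = 'http://www.nancymeyer.com/lingerie/assets/product_images/xlg/'
        .. id .. '-' .. attr .. '-xlg.jpg'
    end
  end
  return urls
end

for _, u in ipairs(build_xlg_urls(sample_url, sample_content)) do
  print(u)
end
```

Because '[%l-]*' can also match an empty string, the length check is what filters out the separators between attribute names.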
Attachments
nancy.plt
select main menu->task->import tasks to import.
(9.93 KiB) Downloaded 28 times
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Super

Post by merc_slk »

Since the introduction of the Lua scripting language, I don't think anything is impossible. It's a great tool and works great. It's up to me now to get into Lua scripting. The best buy in a long time.
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Changed script

Post by merc_slk »

Hello,

I tried to change the script like this :


local s1,e1=string.find(HTML.content,'pf_id=%u*');
id=string.sub(HTML.Url,s1+6,e1);
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
attrs=string.sub(HTML.Content,s2+17,e2-1);
local S3,e3=string.find(HTML.Content,'product_name=%b""');
prdctnm=string.sub(HTML.Content,s3+14,e3-1);
productname=string.gsub(prdctnm,":","-");
if (s1 and e1 and s2 and e2 and s3 and e3) then
for attr in string.gmatch(attrs,'[%l-]*') do
if string.len(attr) > 1 then
local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
url=url .. id .. '-' .. attr .. '-xlg.jpg';
AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
end;
end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);


But it doesn't work.

Now I don't know whether it's possible to extract more than one variable from HTML.Content, or whether the content can only be scanned in a single pass.

I've noticed in the queue that "NOTE" gets a value. Where does it come from, and can it be changed or manipulated?

Can I assign new names to the downloaded pictures with values I grabbed from the page content? Like "productname" in the example above.

Can I make new subdirectories with values I grabbed from the page content? A part of "productname" in the example above.



With Regards,

Jef
KoalaBear
Posts: 325
Joined: Wed Sep 24, 2003 5:27 pm

Post by KoalaBear »

When you press the Start button, PicaLoader works like this:
1. Push the Start URL into the download queue.
2. Take a URL from the download queue and download its content.
3. Analyze the content: if it is a picture, save it to disk; if it is an HTML page, call the HTML Parser Script with the page's URL and content.
4. If the download queue is not empty, go to step 2.
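
The four steps can be sketched as a small loop in plain Lua. This is an illustration only, not PicaLoader's actual code: the fake_web table, the crawl function, and the "seen" check that avoids queueing a URL twice are all assumptions made for the example.

```lua
-- Hypothetical in-memory "web": URL -> kind and outgoing links.
local fake_web = {
  ['start.html'] = { kind = 'html', links = { 'a.jpg', 'page2.html' } },
  ['page2.html'] = { kind = 'html', links = { 'b.jpg', 'a.jpg' } },
  ['a.jpg'] = { kind = 'picture' },
  ['b.jpg'] = { kind = 'picture' },
}

local function crawl(start_url)
  -- Step 1: push the start URL into the queue.
  local queue = { start_url }
  local seen = { [start_url] = true }  -- assumed de-duplication
  local saved = {}
  local head = 1
  -- Step 4: keep going while the queue is not empty.
  while head <= #queue do
    -- Step 2: take the next URL and "download" it.
    local url = queue[head]
    head = head + 1
    local item = fake_web[url]
    -- Step 3: save pictures, parse HTML pages for more links.
    if item and item.kind == 'picture' then
      saved[#saved + 1] = url
    elseif item then
      -- This inner loop stands in for the parser script calling AddLink.
      for _, link in ipairs(item.links) do
        if not seen[link] then
          seen[link] = true
          queue[#queue + 1] = link
        end
      end
    end
  end
  return saved
end

for _, name in ipairs(crawl('start.html')) do
  print(name)
end
```

The point of the sketch is that the parser script runs once per downloaded HTML page, and anything it adds with AddLink is simply queued for a later pass.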

By default, PicaLoader uses the title of the HTML page that linked to a picture as the picture's "NOTE". You can change it through the third parameter when you call the AddLink function.

Also, Lua is case sensitive!
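
A short standalone sketch of what case sensitivity means here. The HTML table below is a plain stand-in with made-up values, not PicaLoader's real object; pcall is used only so the errors can be shown without stopping the script.

```lua
-- Stand-in for the HTML object, with invented sample values.
local HTML = {
  Url = 'http://example.com/product.asp?pf_id=ABC',
  Content = 'product_name="Demo: Item"',
}

-- HTML.content (lowercase c) is a different field and was never set, so it is nil.
print(HTML.Content ~= nil, HTML.content == nil)

-- Passing that nil to string.find raises an error.
local ok_find = pcall(string.find, HTML.content, 'pf_id=%u*')
print(ok_find)

-- Assigning to S3 does not create s3: the lowercase name is still nil.
local S3, e3 = string.find(HTML.Content, 'product_name=%b""')
local ok_sub = pcall(function()
  return string.sub(HTML.Content, s3 + 14, e3 - 1)
end)
print(S3 ~= nil, s3 == nil, ok_sub)
```

Both pcall results are false, which corresponds to the two places where a script mixing HTML.content with HTML.Content, or S3 with s3, would stop with an error.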

So your script should look like this:

-- Find all three fields first, then check that every search succeeded.
local s1,e1=string.find(HTML.Url,'pf_id=%u*');
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
local s3,e3=string.find(HTML.Content,'product_name=%b""');
if (s1 and e1 and s2 and e2 and s3 and e3) then
  local id=string.sub(HTML.Url,s1+6,e1);
  local attrs=string.sub(HTML.Content,s2+17,e2-1);
  local prdctnm=string.sub(HTML.Content,s3+14,e3-1);
  -- Replace ':' with '-' in the product name.
  local productname=string.gsub(prdctnm,":","-");
  for attr in string.gmatch(attrs,'[%l-]*') do
    if string.len(attr) > 1 then
      local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
      url=url .. id .. '-' .. attr .. '-xlg.jpg';
      -- The third parameter becomes the picture's "NOTE".
      AddLink(url,HTML.Url,productname,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);


If you want to customize the local filename of a picture, check main menu->Project->Properties->Picture Filename->Customize Filename.

P.S. Upgrade to PicaLoader 1.62 for a long "Note" field.
roopam005
Posts: 36
Joined: Sun Oct 17, 2004 12:16 pm

Post by roopam005 »

Why, after downloading some photos, does it download only webpages, for hours?
Is there anything wrong in the settings, or anything missing in the parser script?

And what exact string should be used in the customized filename to assign new names to pictures with values grabbed from the page content?
roopam