Can Picaloader Handle this

merc_slk · Post by **merc_slk** » Sat Feb 16, 2008 11:26 am

Hello,

I want to download the xlg pictures from this site. It uses JavaScript with a lot of variables, so it's hard to figure out how to build the right URL.

I know the pictures are stored in www.nancymeyer.com/lingerie/assets/product_images/xlg/ but there's no pattern to follow.

Let's take a random page: http://www.nancymeyer.com/lingerie/sear ... %20CHARMEL

Now we take the picture named "Lise Charmel: Casting Lady Plunge Bra": http://www.nancymeyer.com/lingerie/prod ... t=2&s_id=0

Another page opens with a button labelled "click for alternate views".

Clicking it opens a popup containing the xlg picture and some links to other xlg pictures. I want to download these pictures, but named with my own reference, like "Lise charmel-Casting Lady Plunge Bra-xx", where xx is the number of the picture.

I think it's a hard one to crack. Maybe it is possible with Lua, but I don't know enough to write the script myself.

Any help would be welcome.

Merc

KoalaBear · Post by **KoalaBear** » Sun Feb 17, 2008 5:38 am

For PicaLoader 1.61 or later:

Start URL:http://www.nancymeyer.com/lingerie/search2.asp?s_id=0&search_freetext=LISE%20CHARMEL
Page URL Include Filters:CHARMEL;product\.asp\?pf_id=\w+&attr_value2=\w+$
Page URL Exclude Filters:Answer=
Picture URL Include Filters:-xlg\.jpg$
HTML Parser Script:

-- Product ID from the page URL (the run of uppercase letters after pf_id=)
local s1,e1=string.find(HTML.Url,'pf_id=%u*');
-- Quoted attribute list from the page source
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
if (s1 and e1 and s2 and e2) then
  local id=string.sub(HTML.Url,s1+6,e1);
  local attrs=string.sub(HTML.Content,s2+17,e2-1);
  -- Queue one xlg picture URL per attribute name
  for attr in string.gmatch(attrs,'[%l-]*') do
    if string.len(attr) > 1 then
      local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
      url=url .. id .. '-' .. attr .. '-xlg.jpg';
      AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
-- Let the default parser handle the other links on the page
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
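
If you want to see what the two patterns pick out before running a project, here is a minimal sketch for a standalone Lua interpreter. The sample URL, the page fragment and the helper names `extract_id` and `extract_attrs` are made up for illustration, and the real page markup may differ. It uses `[%l-]+` instead of `[%l-]*` so that empty matches never appear.

```lua
-- Hypothetical inputs; the real URL and page source may look different.
local url = 'http://www.nancymeyer.com/lingerie/product.asp?pf_id=ABCD&attr_value2=x'
local content = 'var info_attributes="front,back,side-view";'

-- Returns the run of uppercase letters after "pf_id=", or nil if absent.
local function extract_id(u)
  local s, e = string.find(u, 'pf_id=%u*')
  if not s then return nil end
  return string.sub(u, s + 6, e)          -- skip the 6 characters of "pf_id="
end

-- Returns the attribute names found inside info_attributes="...".
local function extract_attrs(c)
  local s, e = string.find(c, 'info_attributes=%b""')
  if not s then return {} end
  -- skip the 16 characters of "info_attributes=" plus the opening quote,
  -- and stop just before the closing quote
  local attrs = string.sub(c, s + 17, e - 1)
  local out = {}
  for attr in string.gmatch(attrs, '[%l-]+') do
    if #attr > 1 then out[#out + 1] = attr end
  end
  return out
end

print(extract_id(url))
for _, attr in ipairs(extract_attrs(content)) do
  print(attr)
end
```

Note that `%u*` stops at the first character that is not an uppercase letter, so an ID containing digits would be cut short.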

merc_slk · Post by **merc_slk** » Sun Feb 17, 2008 3:42 pm

Since the introduction of the Lua scripting language, I don't think anything is impossible. It's a great tool and it works great. Now it's up to me to get into Lua scripting. The best buy in a long time.

merc_slk · Post by **merc_slk** » Thu Feb 21, 2008 8:25 am

Hello,

I tried to change the script like this:

local s1,e1=string.find(HTML.content,'pf_id=%u*');
id=string.sub(HTML.Url,s1+6,e1);
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
attrs=string.sub(HTML.Content,s2+17,e2-1);
local S3,e3=string.find(HTML.Content,'product_name=%b""');
prdctnm=string.sub(HTML.Content,s3+14,e3-1);
productname=string.gsub(prdctnm,":","-");
if (s1 and e1 and s2 and e2 and s3 and e3) then
for attr in string.gmatch(attrs,'[%l-]*') do
if string.len(attr) > 1 then
local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
url=url .. id .. '-' .. attr .. '-xlg.jpg';
AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
end;
end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);

But it doesn't work.

I don't know whether it's possible to read more values from HTML.Content, or whether the content can only be scanned in a single pass.

I've noticed in the queue that "NOTE" gets a value. Where does it come from, and can it be changed or manipulated?

Can I assign new names to the downloaded pictures using values I grabbed from the page content, like "productname" in the example above?

Can I create new subdirectories using values I grabbed from the page content, such as a part of "productname" in the example above?

With Regards,

Jef

KoalaBear · Post by **KoalaBear** » Fri Feb 22, 2008 2:36 am

When you press the Start button, PicaLoader works like this:
1. Push the Start URL into the download queue.
2. Get a URL from the download queue and download that URL's content.
3. Analyze the content. If it is a picture, save it to disk. If it is an HTML page, call the HTML Parser script with its URL and content.
4. If the download queue is not empty, go to step 2.
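
The steps above can be sketched as a plain Lua loop. This is only an illustration of the queue behaviour described, not PicaLoader's actual code: `crawl`, `fetch`, `parse`, the `seen` table (which skips URLs already queued) and the tiny fake site are all assumptions.

```lua
-- fetch(url) returns { kind = 'picture' | 'html', content = ... } or nil.
-- parse(url, content) returns a list of links found on an HTML page.
local function crawl(start_url, fetch, parse)
  local queue = { start_url }               -- step 1: push the Start URL
  local seen = { [start_url] = true }       -- assumption: never queue a URL twice
  local saved = {}
  local head = 1
  while head <= #queue do                   -- step 4: repeat until the queue is empty
    local url = queue[head]                 -- step 2: take the next URL
    head = head + 1
    local item = fetch(url)
    if item and item.kind == 'picture' then
      saved[#saved + 1] = url               -- step 3: a picture is saved
    elseif item then
      for _, link in ipairs(parse(url, item.content)) do
        if not seen[link] then              -- step 3: an HTML page feeds the queue
          seen[link] = true
          queue[#queue + 1] = link
        end
      end
    end
  end
  return saved
end

-- A fake two-page site standing in for real downloads.
local site = {
  ['start'] = { kind = 'html', content = 'a.jpg page2' },
  ['page2'] = { kind = 'html', content = 'b.jpg start' },
  ['a.jpg'] = { kind = 'picture' },
  ['b.jpg'] = { kind = 'picture' },
}
local function fetch(url) return site[url] end
local function parse(_, content)
  local links = {}
  for word in string.gmatch(content, '%S+') do links[#links + 1] = word end
  return links
end

local saved = crawl('start', fetch, parse)
for _, url in ipairs(saved) do print(url) end
```

In this picture, the HTML Parser script plays the role of `parse`: every AddLink call puts one more URL into the queue.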

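By default, PicaLoader uses the title of the HTML page that linked to a picture as the picture's "NOTE". You can change it with the third parameter when you call the AddLink function.

Also, Lua is case sensitive! In your script, HTML.content should be HTML.Content and S3 should be s3. The string.sub calls should also come after the check that string.find succeeded.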
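
A small sketch of the case-sensitivity point, runnable in a standalone Lua interpreter. The `HTML` table here is only a stand-in with made-up values (the field names are taken from the scripts in this thread), and `pcall` is used so the errors are captured instead of stopping the script.

```lua
-- Stand-in for the table the parser script receives; the values are invented.
local HTML = {
  Url = 'http://www.nancymeyer.com/lingerie/product.asp?pf_id=ABCD',
  Content = 'product_name="Lise Charmel: Casting Lady Plunge Bra"',
}

-- HTML.content (lowercase c) is a different field that was never set.
local wrong_field = HTML.content

-- string.find refuses a nil subject, so this call fails.
local find_ok = pcall(string.find, HTML.content, 'pf_id=%u*')

-- S3 and s3 are two different variables; s3 was never assigned.
local S3, e3 = string.find(HTML.Content, 'product_name=%b""')
local sub_ok = pcall(function()
  return string.sub(HTML.Content, s3 + 14, e3 - 1)   -- arithmetic on nil
end)

-- With matching names the extraction works.
local name = string.sub(HTML.Content, S3 + 14, e3 - 1)

print(wrong_field, find_ok, sub_ok)
print(name)
```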

So your script should look like this:

-- Locate the product ID, the attribute list and the product name
local s1,e1=string.find(HTML.Url,'pf_id=%u*');
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
local s3,e3=string.find(HTML.Content,'product_name=%b""');
-- Only extract substrings once all three searches have succeeded
if (s1 and e1 and s2 and e2 and s3 and e3) then
  local id=string.sub(HTML.Url,s1+6,e1);
  local attrs=string.sub(HTML.Content,s2+17,e2-1);
  local prdctnm=string.sub(HTML.Content,s3+14,e3-1);
  -- Replace ":" with "-" in the product name
  local productname=string.gsub(prdctnm,":","-");
  for attr in string.gmatch(attrs,'[%l-]*') do
    if string.len(attr) > 1 then
      local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
      url=url .. id .. '-' .. attr .. '-xlg.jpg';
      -- The third parameter becomes the picture's "NOTE"
      AddLink(url,HTML.Url,productname,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
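
For the "-xx" numbering asked about at the start of the thread, one possible approach is to build a separate note for each picture. The helper below is a hypothetical sketch (`make_note` is not a PicaLoader function), and whether the note can then be used in the customized filename is something to check in PicaLoader itself.

```lua
-- Builds a note such as "Lise Charmel-Casting Lady Plunge Bra-01".
-- make_note is a made-up helper name, not part of PicaLoader.
local function make_note(product_name, index)
  -- turn ":" (with any surrounding spaces) into a single "-"
  local name = string.gsub(product_name, '%s*:%s*', '-')
  -- drop characters that Windows does not allow in file names
  name = string.gsub(name, '[\\/*?"<>|]', '')
  return string.format('%s-%02d', name, index)
end

print(make_note('Lise Charmel: Casting Lady Plunge Bra', 1))
print(make_note('Lise Charmel: Casting Lady Plunge Bra', 12))

-- Inside the parser script, a counter incremented in the attribute loop
-- could supply the index, with the result passed as the third AddLink
-- parameter, for example:
--   AddLink(url, HTML.Url, make_note(prdctnm, n), '', HTML.TaskID, HTML.Level)
```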

If you want to customize the local filename of a picture, check main menu->Project->Properties->Picture Filename->Customize Filename.

P.S. Upgrade to PicaLoader 1.62 for long "Note" field support.

roopam005 · Post by **roopam005** » Sat Mar 22, 2008 5:56 am

Why, after downloading some photos, does it download only web pages for hours?
Is there anything wrong in the settings, or anything missing in the parser script?

And what exact string should be used in the customized filename to assign new names to pictures with values grabbed from the page content?

