Hello,
I want to download the xlg pictures from this site. It uses JavaScript with a lot of variables, so it's hard to figure out how to build the right URL.
I know the pictures are stored in www.nancymeyer.com/lingerie/assets/product_images/xlg/ but there's no obvious pattern to follow.
Let's take a random page: http://www.nancymeyer.com/lingerie/sear ... %20CHARMEL
Now we open the picture named "Lise Charmel: Casting Lady Plunge Bra": http://www.nancymeyer.com/lingerie/prod ... t=2&s_id=0
Another page opens with a button "click for alternate views".
A popup opens with the xlg picture in it and some links to other xlg pictures. I want to download these pictures, but named with my own reference, like "Lise charmel-Casting Lady Plunge Bra-xx", where xx is the number of the picture.
I think it's a hard one to crack. Maybe it is possible with Lua, but my knowledge is not enough to write a script.
Any help would be welcome.
Merc
Can Picaloader Handle this
for PicaLoader 1.61 or later:
Start URL:http://www.nancymeyer.com/lingerie/search2.asp?s_id=0&search_freetext=LISE%20CHARMEL
Page URL Include Filters:CHARMEL;product\.asp\?pf_id=\w+&attr_value2=\w+$
Page URL Exclude Filters:Answer=
Picture URL Include Filters:-xlg\.jpg$
HTML Parser Script:
-- Locate the product id in the page URL (the uppercase letters after "pf_id=").
local s1,e1=string.find(HTML.Url,'pf_id=%u*');
-- Locate the quoted attribute list that follows info_attributes= in the page source.
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
if (s1 and e1 and s2 and e2) then
  id=string.sub(HTML.Url,s1+6,e1);
  attrs=string.sub(HTML.Content,s2+17,e2-1);
  -- Queue one xlg picture URL per attribute name.
  for attr in string.gmatch(attrs,'[%l-]*') do
    if string.len(attr) > 1 then
      local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
      url=url .. id .. '-' .. attr .. '-xlg.jpg';
      AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
-- Hand the page to the default parser as well.
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
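To see what the two patterns in this script pick out, here is a minimal sketch that runs in a plain Lua interpreter, outside PicaLoader. The HTML table and the AddLink function are stand-ins written for this example, and the page URL and info_attributes value are made-up sample data, not taken from the real site.

```lua
-- Stand-ins for PicaLoader's objects (assumptions for this example only).
local HTML = {
  Url = 'http://www.nancymeyer.com/lingerie/product.asp?pf_id=ABCD&attr_value2=x',
  Content = 'var info_attributes="front,back,side-a";',
}
local links = {}
local function AddLink(url) links[#links + 1] = url end

-- 'pf_id=' is 6 characters long, so the id starts at s1 + 6.
local s1, e1 = string.find(HTML.Url, 'pf_id=%u*')
-- %b"" matches from one double quote to the next; the text between
-- the quotes starts 17 characters after the start of the match.
local s2, e2 = string.find(HTML.Content, 'info_attributes=%b""')

if s1 and s2 then
  local id = string.sub(HTML.Url, s1 + 6, e1)
  local attrs = string.sub(HTML.Content, s2 + 17, e2 - 1)
  local base = 'http://www.nancymeyer.com/lingerie/assets/product_images/xlg/'
  -- [%l-]* matches runs of lowercase letters and hyphens; the length
  -- check skips the empty matches found at the separators.
  for attr in string.gmatch(attrs, '[%l-]*') do
    if string.len(attr) > 1 then
      AddLink(base .. id .. '-' .. attr .. '-xlg.jpg')
    end
  end
end

for _, url in ipairs(links) do print(url) end
```

With this sample data the loop queues one URL for each of front, back and side-a. Note that an attribute name of a single letter would be skipped by the length check.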
Attachment: nancy.plt (select main menu->task->import tasks to import)
Changed script
Hello,
I tried to change the script like this :
local s1,e1=string.find(HTML.content,'pf_id=%u*');
id=string.sub(HTML.Url,s1+6,e1);
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
attrs=string.sub(HTML.Content,s2+17,e2-1);
local S3,e3=string.find(HTML.Content,'product_name=%b""');
prdctnm=string.sub(HTML.Content,s3+14,e3-1);
productname=string.gsub(prdctnm,":","-");
if (s1 and e1 and s2 and e2 and s3 and e3) then
for attr in string.gmatch(attrs,'[%l-]*') do
if string.len(attr) > 1 then
local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
url=url .. id .. '-' .. attr .. '-xlg.jpg';
AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
end;
end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
But it doesn't work.
I don't know whether it's possible to pull more variables out of HTML.Content, or whether browsing the content is a one-time pass.
I've noticed in the queue that "NOTE" gets a value. Where does it come from, and can it be changed or manipulated?
Can I assign new names to the downloaded pictures with values I grabbed from the page content, like "productname" in the example above?
Can I make new subdirectories with values I grabbed from the page content, for example a part of "productname" in the example above?
With Regards,
Jef
When you press the Start button, PicaLoader works like this:
1. Push the Start URL into the download queue.
2. Get a URL from the download queue and download that URL's content.
3. Analyze the content: if it is a picture, save it to disk; if it is an HTML page, call the HTML Parser Script with its URL and content.
4. If the download queue is not empty, go to step 2.
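The four steps can be sketched in plain Lua as follows. This is only an illustration of the loop, not PicaLoader's actual code: the in-memory "web" table replaces real downloads, the parse function stands in for the HTML Parser Script, and the "seen" set that keeps a URL from being queued twice is an assumption of this sketch.

```lua
-- Hypothetical in-memory "web": URL -> kind of content and outgoing links.
local web = {
  ['start']    = { kind = 'html', links = { 'page1', 'pic1.jpg' } },
  ['page1']    = { kind = 'html', links = { 'pic2.jpg', 'start' } },
  ['pic1.jpg'] = { kind = 'picture' },
  ['pic2.jpg'] = { kind = 'picture' },
}

-- Step 1: the queue starts with the Start URL.
local queue, seen, saved = { 'start' }, { start = true }, {}

-- Stand-in for the HTML Parser Script: queue every link not seen before.
local function parse(page)
  for _, link in ipairs(page.links) do
    if not seen[link] then
      seen[link] = true
      queue[#queue + 1] = link
    end
  end
end

while #queue > 0 do                    -- step 4: repeat until the queue is empty
  local url = table.remove(queue, 1)   -- step 2: take the next URL
  local item = web[url]                -- "download" its content
  if item.kind == 'picture' then       -- step 3: save pictures, parse pages
    saved[#saved + 1] = url
  else
    parse(item)
  end
end

for _, url in ipairs(saved) do print(url) end
```

The loop only ends when no page adds new URLs, so in this model the number of pages visited depends entirely on which links the parser step puts into the queue.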
By default, PicaLoader uses the title of the HTML page that linked to a picture as the picture's "NOTE". You can change it with the third parameter of the AddLink function.
Also, Lua is case sensitive!
So your script should look like this:
-- Search for everything first, then check the results before using them.
local s1,e1=string.find(HTML.Url,'pf_id=%u*');
local s2,e2=string.find(HTML.Content,'info_attributes=%b""');
local s3,e3=string.find(HTML.Content,'product_name=%b""');
if (s1 and e1 and s2 and e2 and s3 and e3) then
  id=string.sub(HTML.Url,s1+6,e1);
  attrs=string.sub(HTML.Content,s2+17,e2-1);
  prdctnm=string.sub(HTML.Content,s3+14,e3-1);
  productname=string.gsub(prdctnm,":","-");
  for attr in string.gmatch(attrs,'[%l-]*') do
    if string.len(attr) > 1 then
      local url='http://www.nancymeyer.com/lingerie/assets/product_images/xlg/';
      url=url .. id .. '-' .. attr .. '-xlg.jpg';
      -- The third parameter becomes the picture's "NOTE".
      AddLink(url,HTML.Url,productname,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
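The case-sensitivity point can be checked in a plain Lua interpreter. The sketch below uses a stand-in HTML table with made-up sample content (an assumption for this example) and shows that HTML.content and HTML.Content are different fields, and that S3 and s3 are different variables.

```lua
-- Stand-in for PicaLoader's HTML object; the content is sample data.
local HTML = {
  Url = 'product.asp?pf_id=ABCD',
  Content = 'product_name="Lise Charmel: Casting Lady Plunge Bra"',
}

-- HTML.content (lowercase c) does not exist, so it is nil.
local wrong_field = HTML.content
-- Passing nil to string.find raises an error; pcall reports it as false.
local ok = pcall(string.find, HTML.content, 'pf_id=%u*')

-- S3 is assigned here, but s3 (lowercase) is a different, unset variable.
local S3, e3 = string.find(HTML.Content, 'product_name=%b""')
local s3_is_nil = (s3 == nil)

-- With consistent names the extraction works. 'product_name=' is 13
-- characters long, so the text inside the quotes starts at S3 + 14.
local name = string.sub(HTML.Content, S3 + 14, e3 - 1)
-- The parentheses keep only the first result of gsub (the new string).
local productname = (string.gsub(name, ':', '-'))

print(wrong_field, ok, s3_is_nil, productname)
```

This also shows why the string.sub calls belong inside the if check: using a position that is nil in arithmetic such as s3+14 raises an error before the check is ever reached.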
If you want to customize the local filename of a picture, check main menu->Project->Properties->Picture Filename->Customize Filename.
P.S. Upgrade to PicaLoader 1.62 for a longer "Note" field.
Why, after downloading some photos, does it download only webpages... for hours?
Is anything wrong in the settings, or is anything missing in the parser script?
And what exact string should be used in the customized filename to assign new names to pictures with values grabbed from the page content?
roopam