Tried my own script but no luck

merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Post by merc_slk »

Hello,

I tried the script below on this website, but I think I missed something and can't figure out what.

if string.find(HTML.Url,'IDType=prd',1,true) then
  local s1,e1=string.find(HTML.Url,'SetIDWebObject\?ID=%w*');
  if (s1 and e1) then
    product=string.sub(HTML.Url,s1+19,e1);
    local url='http://www.glamonweb.com/GlamOnWeb/images/catalog/';
    url=url .. 'prd' .. product .. '_F.jpg';
    AddLink(url,HTML.Url,productname,'',HTML.TaskID,HTML.Level);
    local url='http://www.glamonweb.com/GlamOnWeb/images/catalog/';
    url=url .. 'prd' .. product .. '_R.jpg';
    AddLink(url,HTML.Url,productname,'',HTML.TaskID,HTML.Level);
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);


It seems the parser never enters my script. By the way, I ticked only JPG, but PL also fetches the GIFs, and I don't know why.
Attachments
laperla.JPG (85.74 KiB)
KoalaBear
Posts: 325
Joined: Wed Sep 24, 2003 5:27 pm

Post by KoalaBear »

1. This site requires you to choose your location first. We can use PicaLoader's Manual Login feature for this: set http://www.glamonweb.com/GlamOnWeb/index.jsp as the manual login URL. When the download starts, click your location (e.g. United Kingdom) in the manual login dialog, then click the OK button in the top left corner.
2. Checking the start URL in IE shows that it links first to a collection page whose URL looks like xxxxxxxIDType=ser, and that the collection page links to the product pages, whose URLs look like xxxxxxIDType=prd. So we set Page URL Include Filters: IDType=prd$;IDType=ser$
3. The big pictures' URLs look like this: xxxxxxxx_P.jpg, so we set Picture URL Include Filters: _P\.jpg$
4. This site uses Adobe Flash to show the big pictures, so we have to use PicaLoader's HTML Parser Script to add the picture URLs ourselves. There are a few errors in your script; it should look like this:

if string.find(HTML.Url,'IDType=prd',1,true) then
  -- '?' is a magic character in Lua patterns, so it is escaped with '%'.
  local s1,e1=string.find(HTML.Url,'SetIDWebObject%?ID=%d+');
  if (s1 and e1) then
    -- 'SetIDWebObject?ID=' is 18 characters long, so the ID starts at s1+18.
    local product=string.sub(HTML.Url,s1+18,e1);
    local productname=HTML.Title;
    -- Use the text between <title> and the next '<' when it is present.
    s1,e1=string.find(HTML.Content,'<title%b><');
    if (s1 and e1) then
      productname=string.sub(HTML.Content,s1+7,e1-1);
    end;
    local url='http://www.glamonweb.com/GlamOnWeb/images/catalog/';
    url=url .. 'prd' .. product .. 'f_P.jpg';
    AddLink(url,HTML.Url,productname,'',HTML.TaskID,HTML.Level);
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
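
For anyone who wants to check the offset outside PicaLoader, here is a small standalone sketch in plain Lua (5.1 or later assumed). The sample URL is made up for illustration; only the SetIDWebObject?ID= part comes from the script above, and extract_product_id is a hypothetical helper, not a PicaLoader function. It shows that the prefix is 18 characters long and that a capture returns the same ID without counting characters by hand.

```lua
-- Hypothetical product-page URL; the real site's URLs may look different.
local sample_url = 'http://www.glamonweb.com/GlamOnWeb/SetIDWebObject?ID=8131&IDType=prd'

-- Hypothetical helper: return the numeric ID from a product URL, or nil.
local function extract_product_id(url)
  -- The parentheses capture only the digits after 'ID='.
  return string.match(url, 'SetIDWebObject%?ID=(%d+)')
end

-- Same approach as the script: find the match, then skip the fixed prefix.
local prefix = 'SetIDWebObject?ID='
local s1, e1 = string.find(sample_url, 'SetIDWebObject%?ID=%d+')
local by_offset = string.sub(sample_url, s1 + #prefix, e1)

-- Alternative: let the pattern capture the ID directly.
local by_capture = extract_product_id(sample_url)

print(#prefix, by_offset, by_capture)
```

Assuming PicaLoader's embedded Lua provides string.match, the string.find/string.sub pair in the parser script could be replaced by the single capture call.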
Attachments
glamon.plt (9.93 KiB)
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Changed script, Only analyzes first page but does nothing

Post by merc_slk »

I rewrote the script and made some changes to the filter expressions.

The picture to be downloaded can be found in HTML.Content:

value="/GlamOnWeb/images/catalog/prd8131f.jpg"

and then a suffix is appended: _P, _R, _F, _PR, or nothing at all.

So I changed the script like this:


if string.find(HTML.Url,'IDType=prd',1,true) then
  local s1,e1=string.find(HTML.Content,'value="/GlamOnWeb/images/catalog/prd+\%w+');
  if (s1 and e1) then
    product=string.sub(HTML.Content,s1+34,e1);
    local views=",_P,_R,_F,_PR";
    for view in string.gmatch(views,[%u_]*) do
      local url='http://www.glamonweb.com/GlamOnWeb/images/catalog/';
      url=url .. product .. view .. '.jpg';
      AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);


But after opening and analyzing the first page, nothing happens. The hrefs matching "prd$" or "ser$" aren't even downloaded, so the Lua script is never reached. I run into this problem very often, and then suddenly, without my changing anything, it runs marvelously???
P.S. In which cases should I select "support J/VB/JAVASCRIPT"?
Attachments
laperla1.JPG (83.45 KiB)
KoalaBear
Posts: 325
Joined: Wed Sep 24, 2003 5:27 pm

Re: Changed script, Only analyzes first page but does nothin

Post by KoalaBear »

merc_slk wrote: I rewrote the script and made some changes to the filter expressions.

The picture to be downloaded can be found in HTML.Content:

value="/GlamOnWeb/images/catalog/prd8131f.jpg"

and then a suffix is appended: _P, _R, _F, _PR, or nothing at all.

So I changed the script like this:

if string.find(HTML.Url,'IDType=prd',1,true) then
  local s1,e1=string.find(HTML.Content,'value="/GlamOnWeb/images/catalog/prd+\%w+');
should be:
  local s1,e1=string.find(HTML.Content,'value="/GlamOnWeb/images/catalog/prd%w+');
(In Lua patterns the escape character is %, not \ as in regular expressions, and the + after prd is a quantifier on the d, so neither the \ nor that + belongs here.)

  if (s1 and e1) then
    product=string.sub(HTML.Content,s1+34,e1);
should be:
    product=string.sub(HTML.Content,s1+33,e1);
(value="/GlamOnWeb/images/catalog/ is 33 characters long, so s1+33 is the p of prd.)

    local views=",_P,_R,_F,_PR";
    for view in string.gmatch(views,[%u_]*) do
should be:
    for view in string.gmatch(views,'[%u_]+') do
(The pattern must be a quoted string. * means match 0 or more times, so it can also match the empty string; + requires at least one character.)

      local url='http://www.glamonweb.com/GlamOnWeb/images/catalog/';
      url=url .. product .. view .. '.jpg';
      AddLink(url,HTML.Url,HTML.Title,'',HTML.TaskID,HTML.Level);
    end;
  end;
end;
DefaultParser(HTML.Content,HTML.Url,HTML.Title,HTML.TaskID,HTML.Level);
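
The corrected lines can also be tried in a standalone Lua interpreter (5.1 or later assumed). The sketch below uses only the value="..." fragment quoted from the page; no PicaLoader functions are called, and the found and urls tables exist purely for illustration. It also shows one possible way to keep the "no suffix" picture, which the '[%u_]+' loop skips because it only yields non-empty matches.

```lua
-- Fragment quoted from the product page in this thread.
local content = 'value="/GlamOnWeb/images/catalog/prd8131f.jpg"'
local base = 'http://www.glamonweb.com/GlamOnWeb/images/catalog/'

-- Capture the file stem (prd plus letters/digits); %w+ stops at the '.'.
local product = string.match(content, 'value="/GlamOnWeb/images/catalog/(prd%w+)')

-- '[%u_]+' yields each non-empty run of uppercase letters and underscores.
local found = {}
for view in string.gmatch(',_P,_R,_F,_PR', '[%u_]+') do
  found[#found + 1] = view
end

-- Listing the suffixes in a table keeps the empty "no suffix" variant too.
local urls = {}
for _, view in ipairs({ '', '_P', '_R', '_F', '_PR' }) do
  urls[#urls + 1] = base .. product .. view .. '.jpg'
end

for _, u in ipairs(urls) do
  print(u)
end
```

Inside the parser script, each entry of urls would be passed to AddLink the same way as in the loop above.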

But after opening and analyzing the first page, nothing happens. The hrefs matching "prd$" or "ser$" aren't even downloaded, so the Lua script is never reached. I run into this problem very often, and then suddenly, without my changing anything, it runs marvelously???
Please visit the start URL with IE: you will find that the site redirects to its index page and asks you to select your language.
Try getting IE to show the correct content of the start URL before running PicaLoader, and keep IE open; PicaLoader will automatically use IE's cookies.

P.S. In which cases should I select "support J/VB/JAVASCRIPT"?
Select this if the web site uses JavaScript links that PicaLoader's default HTML parser can handle and those links are useful to us.
For this site, it's not necessary.
And your Picture URL Include Filters:
prd+\w+\.jpg$
should be:
prd[\w_]+\.jpg$
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Still won't run

Post by merc_slk »

Hi KoalaBear,

First of all, thanks for the explanation and the help, also with the scripts for other sites. I now have a better idea of how PL works.

I changed the script as you advised and kept IE open before opening and running the project in PL. The only thing that happened was "loading the URL, waiting ....", no analysing, and after a pause the error that 1 web page was loaded and no pictures. When I cleared the script, the "ser" and "prd" URLs were downloaded. When I put the script back in, same story: nothing happens.

I don't know what is happening. The only thing I noticed is that the line
local s1,e1=string.find(HTML.Content,'value="/GlamOnWeb/images/catalog/prd%w+');
is split into 3 lines in my HTML Parser Script list box.

Is the script loaded and analyzed before the task is run?

Many thanks.
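
On whether a script is read correctly before it runs: how PicaLoader loads the script is not documented in this thread, but plain Lua can at least tell whether a piece of script text compiles. A minimal sketch, assuming Lua 5.1 or later; check is a hypothetical helper, and the three snippets are the pattern on one line, the same pattern with a real line break inside the quotes, and the unquoted gmatch pattern from the earlier version of the script.

```lua
-- loadstring exists in Lua 5.1; from 5.2 on, load accepts a string.
local compile = loadstring or load

-- Hypothetical helper: true if the source compiles, else false and the error.
local function check(source)
  local chunk, err = compile(source)
  return chunk ~= nil, err
end

-- The find pattern kept on a single line.
local one_line = [[local p = 'value="/GlamOnWeb/images/catalog/prd%w+']]
-- The same pattern with a real newline inside the quoted string.
local hard_break = "local p = 'value=\"/GlamOnWeb/\nimages/catalog/prd%w+'"
-- The gmatch call with its pattern left unquoted.
local unquoted = "for view in string.gmatch(views,[%u_]*) do end"

print(check(one_line))
print(check(hard_break))
print(check(unquoted))
```

If the list box only wraps the long line visually, that is harmless; a real line break inside the quoted pattern would stop the whole script from compiling, and so would the unquoted pattern.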
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Who has tried

Post by merc_slk »

Hello,

Has anybody tried this modified script?? I can't get it running. Only when I clear the Lua script box are the included URLs downloaded, so that part works. With the script loaded, nothing is downloaded.

Any answer or solution is welcome.

Merc
KoalaBear
Posts: 325
Joined: Wed Sep 24, 2003 5:27 pm

Post by KoalaBear »

My settings:
Attachments
result: glamon2.jpg (105.22 KiB)
settings: glamon2s.jpg (84.65 KiB)
task export file: glamon2.plt (9.93 KiB)
merc_slk
Posts: 17
Joined: Sat Jan 26, 2008 4:34 pm

Let's summarize.

Post by merc_slk »

The first script did its work but did not download all the pictures.
It loaded the "ser" and "prd" URLs from "marchi.jsp" and some pictures.

Second, we changed the script and nothing happened any more; only when I cleared the script were the "ser" and "prd" URLs downloaded again, but not the required pictures.

Third, the URL to start the download from was changed, and the modified script works.

So I don't understand what caused the problem. One script works with the URL marchi.jsp but the modified one does not??? Now the modified one works with the modified start URL!! A little flaw in the program? It sure would be nice if I could see what the parser is doing, or if it wrote a log of what it does. With a problem like this one, that would make things easier to understand.

But anyway thanks again for the help KoalaBear. Thanks a lot.

Jef