Not applicable

Website data extract

Hello,

I am trying to compile a local copy of the PDGA player statistics. The data is available on their website, and I can load it into QlikView, but only one page at a time. There are currently 1500 pages, and there will be more in the future. Is there a way to set up the script so that it pulls all 1500 pages and not just the first one? Here is the website I am trying to pull the data from: PDGA Player Statistics | Professional Disc Golf Association

LOAD Name,

     [PDGA #],

     Rating,

     Year,

     Class,

     Gender,

     Bracket,

     Country,

     [State/Prov],

     Events,

     Points,

     Cash

FROM

[http://www.pdga.com/players/stats?order=Rating&sort=Desc]

(html, codepage is 1252, embedded labels, table is @1);

1 Solution

Accepted Solutions
robert_mika
Master III

Create a variable that changes the part of the link that identifies the page:

...http://www.pdga.com/players/stats?Year=2016&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All...3&order=Rating&sort=Desc

and then loop through all the pages:

Loops in the Script

Not an easy task, but possible.


4 Replies
el_aprendiz111
Specialist

Hi,

[Attached screenshot: web.png]

effinty2112
Master

Hi Jason,

          Try:

for i = 0 to 10

if i = 0 then

Let vWebPage = 'http://www.pdga.com/players/stats?Year=All&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All&...';

ELSE

Let vWebPage = 'http://www.pdga.com/players/stats?Year=All&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All&...' & $(i);

End if;

Data:

LOAD Name,

     [PDGA #],

     Rating,

     Year,

     Class,

     Gender,

     Bracket,

     Country,

     [State/Prov],

     Events,

     Points,

     Cash

FROM

[$(vWebPage)]

(html, codepage is 1252, embedded labels, table is @1);

Next i;

This loads the first eleven pages of stats. Since there are 3918 pages in total (i goes from 0 to 3917), I would load them in stages of a few hundred pages each and store each stage to a QVD until you have them all.
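The staged approach could look roughly like the sketch below. It is only an illustration: vBaseUrl, vFirstPage, vLastPage and the QVD file name are made-up names, and vBaseUrl has to hold the full stats URL up to the page number, which is truncated in this thread and is left as "..." here.

```
// Sketch only: all variable names and the QVD file name are hypothetical.
// Replace the "..." with the real query string ending just before the page number.
Let vBaseUrl   = 'http://www.pdga.com/players/stats?...';
Let vFirstPage = 0;     // first page of this stage
Let vLastPage  = 199;   // last page of this stage

For i = $(vFirstPage) to $(vLastPage)

    Let vWebPage = '$(vBaseUrl)' & $(i);

    // Every pass loads the same field list, so the rows are
    // appended to the existing Stage table automatically.
    Stage:
    LOAD Name,
         [PDGA #],
         Rating,
         Year,
         Class,
         Gender,
         Bracket,
         Country,
         [State/Prov],
         Events,
         Points,
         Cash
    FROM [$(vWebPage)]
    (html, codepage is 1252, embedded labels, table is @1);

Next i;

// One QVD per stage, named after the page range it covers
Store Stage into [PlayerStats_$(vFirstPage)_$(vLastPage).qvd] (qvd);
Drop Table Stage;
```

Change vFirstPage and vLastPage for each run (200 to 399, 400 to 599, and so on). Once every stage has been stored, a single wildcard load such as LOAD * FROM [PlayerStats_*.qvd] (qvd); should bring all the stages back into one table.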

Cheers

Andrew

Not applicable
Author

I was able to do it like this.

SET ThousandSep=',';

SET DecimalSep='.';

SET MoneyThousandSep=',';

SET MoneyDecimalSep='.';

SET MoneyFormat='$#,##0.00;($#,##0.00)';

SET TimeFormat='h:mm:ss TT';

SET DateFormat='M/D/YYYY';

SET TimestampFormat='M/D/YYYY h:mm:ss[.fff] TT';

SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';

SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';

SET a=1;

[Player Stats]:

LOAD Name,

     [PDGA #] as PDGANum,

     Rating,

     Year,

     Class,

     Gender,

     Bracket,

     Country,

     [State/Prov],

     Events,

     Points,

     Cash

FROM

[http://www.pdga.com/players/stats]

(html, codepage is 1252, embedded labels, table is @1);

For a=1 to 1499

Concatenate ([Player Stats])

LOAD Name,

     [PDGA #] as PDGANum,

     Rating,

     Year,

     Class,

     Gender,

     Bracket,

     Country,

     [State/Prov],

     Events,

     Points,

     Cash

FROM

[http://www.pdga.com/players/stats?Year=2016&Class=All&Gender=All&Bracket=All&Country=All&StateProv=A...)]

(html, codepage is 1252, embedded labels, table is @1);

Next

Store [Player Stats] into AllPlayerDataPull.qvd;
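Since the number of pages will grow, a variation that does not hard-code 1499 might be worth trying. The sketch below keeps loading until a page fails to load or adds no rows. It is untested against the PDGA site: vBaseUrl, vMaxPages and vRowsBefore are hypothetical names, the "..." stands for the part of the URL that is truncated in this thread, and it assumes the site returns an empty table (or an error) for a page past the end. If the site instead repeats the last page, the loop will simply run until the safety cap.

```
// Sketch only: vBaseUrl, vMaxPages and vRowsBefore are hypothetical.
// Replace the "..." with the real query string ending just before the page number.
Let vBaseUrl  = 'http://www.pdga.com/players/stats?...';
Let vMaxPages = 5000;   // safety cap so the loop always ends
Set ErrorMode = 0;      // keep the script running if a page fails to load

For a = 0 to $(vMaxPages)

    // Row count before this page; Alt() gives 0 while the table does not exist yet
    Let vRowsBefore = Alt(NoOfRows('Player Stats'), 0);

    // Same field list on every pass, so the rows are appended automatically
    [Player Stats]:
    LOAD Name,
         [PDGA #] as PDGANum,
         Rating,
         Year,
         Class,
         Gender,
         Bracket,
         Country,
         [State/Prov],
         Events,
         Points,
         Cash
    FROM [$(vBaseUrl)$(a)]
    (html, codepage is 1252, embedded labels, table is @1);

    // Stop as soon as a page fails or contributes nothing
    If Alt(NoOfRows('Player Stats'), 0) = $(vRowsBefore) then
        Exit For
    End If

Next a

Set ErrorMode = 1;      // back to the default error handling
Store [Player Stats] into AllPlayerDataPull.qvd (qvd);
```

Comparing the row count before and after each load avoids having to know the page count in advance, at the cost of one extra request at the end.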