Not applicable

How to parse a subtag in an XML file using QlikView?

I have an XML file in which one tag contains other tags nested inside its text. For example:

<p>The details of this book can be obtained from

<url href="https://xyz.org"> Founders Association </url>. The book has captured the <url href="https://xyz1.org"> XYZ Publisher</url> attention of the audience </p>.

I tried to load the <p> tag using

Load p

from abc.xml

The result it shows is as follows:

The details of this book can be obtained from The book has captured the attention of the audience

But my desired result should be:

The details of this book can be obtained from Founders Association. The book has captured the XYZ Publisher attention of the audience.

Can someone help me out to achieve this?

14 Replies
adamdavi3s
Master

Can you share the xml?

I think the structure will matter in figuring this out.

Not applicable
Author

Hi Adam,

It is not possible to share the XML, but the structure I have given is similar to that of the original. I am not concerned about the data in other tags. My only concern is to fetch all the text inside the <p> tag. The only difference is that some <p> tags contain no <url> tag, whereas others contain one or two. I cannot fetch the text inside the <url> tags. The data fetched by my script is as shown above.

Regards,

Arghya Ray

jonathandienst
Partner - Champion III

You haven't really given us very much to work with. A few more examples would help, as would information on how the XML is delivered (as fragments from a database, from a text file, or from a web page) and the script that is not working. Without these we can only guess at a solution (and Sherlock Holmes has not logged in recently).

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
adamdavi3s
Master

gah I missed a trick there with my username

adamdavi3s
Master

Hi Arghya,

Hmm, I wonder if it might be easier to parse it as text, but without the XML it's really hard to know whether that will work.

Can you provide just two or three lines of the XML (plus any header etc.), with the data made suitable for sharing?

adamdavi3s
Master

Text parsing is dirty, but it gets you what you want.

(Attached screenshot: Capture.PNG)

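// Read the file as plain text, one record per line, bypassing the XML parser.
// (The source file path goes between FROM and the format specification.)
load:
LOAD @1
FROM
(txt, codepage is 1252, no labels, delimiter is '\t', msq);

// Strip the attributes and opening tag of up to two <url> elements per line,
// then remove the remaining <p>, </p> and </url> tags.
test:
load
replace(replace(replace(replace(replace(replace(replace(@1,TextBetween(@1,'<url','>'),''),'<url>',''),textbetween(replace(replace(@1,TextBetween(@1,'<url','>'),''),'<url>',''),'<url','>'),''),'<url>',''),'<p>',''),'</p>',''),'</url>','') as cleaned1
resident load;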
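
The nested replace() calls above cover at most two <url> tags per line. A minimal sketch of a more general variant follows. It collects every tag found in the text into a mapping table and strips them all with MapSubString(). The names RawLines, Line, TagMap and CleanedText are made up for illustration, the two sample rows are hard-coded so the snippet stands alone, and it has not been run against the real file:

```qlik
// Sample rows standing in for the lines read from the file (hypothetical data source).
RawLines:
LOAD '<p>Sa Re Ga<url href="pqr.org">pqr.org</url>Pa Dha<url href="pqr1.org">pqr1.org</url>Ni Sa</p>' AS Line
AUTOGENERATE 1;

Concatenate (RawLines)
LOAD '<p>The path to enlightment is to know your inner self</p>' AS Line
AUTOGENERATE 1;

// One mapping row per distinct tag: the n-th '<'...'>' pair of each line,
// for n = 1 .. number of '<' characters in that line.
// Each tag maps to a single space so the words around it stay separated.
TagMap:
MAPPING LOAD DISTINCT
    '<' & TextBetween(Line, '<', '>', IterNo()) & '>' AS Tag,
    ' ' AS Replacement
RESIDENT RawLines
WHILE IterNo() <= SubStringCount(Line, '<');

// Replace every mapped tag in each line, then trim the outer spaces.
Cleaned:
LOAD Trim(MapSubString('TagMap', Line)) AS CleanedText
RESIDENT RawLines;

DROP TABLE RawLines;
```

Because the mapping is built from the data itself, lines with zero, one or many tags are handled the same way. Where a tag is already next to a space in the source, the result will contain a double space, which may need a further replace().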

Not applicable
Author

Hi Adam,

Please find the format of the xml below:

<item>
  <section>
    <title>Acknowledgements</title>
    <p>There was a time<url href="xyz.org">xyz.org</url> Founders Association <url href="abc.org">abc.org</url>during ages</p>
  </section>
  <section>
    <title>Acknowledgements1</title>
    <p>Sa Re Ga<url href="pqr.org">pqr.org</url>Pa Dha<url href="pqr1.org">pqr1.org</url>Ni Sa</p>
  </section>
  <section>
    <title>Acknowledgements2</title>
    <p>The glory<url href="badumtiss.org">badumtiss.org</url>leads to<url href="badumtiss1.org">badumtiss1.org</url>success for all</p>
  </section>
  <section>
    <title>Acknowledgements3</title>
    <p>Chase your dreams<url href="tukur.org">tukur.org</url>And one day it will lead you to victory</p>
  </section>
  <section>
    <title>Acknowledgements4</title>
    <p>The path to enlightment is to know your inner self</p>
  </section>
</item>

Thanks and regards,

Arghya

adamdavi3s
Master

I'm afraid this one seems to work straight away: Qlik just pulls it in using the XML import for <p>.

Assuming this is what you want, of course.

(Attached screenshot: Capture.PNG)

Not applicable
Author

Hi Adam,

The data set you have provided is not correct.

My desired result should be:

Chase your dreams tukur.org And one day it will lead you to victory

Sa Re Ga pqr.org Pa Dha pqr1.org Ni Sa

The glory badumtiss.org leads to badumtiss1.org success for all

The path to enlightment is to know your inner self

There was a time xyz.org Founders Association abc.org during ages