
Extracting data from XML

  • 01-12-2009 2:46am
    #1
    Closed Accounts Posts: 496 ✭✭


    <rss version="2.0">
    <channel>
    <title>Latest P</title>
    <link>/P</link>
    <description>A brief overview of the weather for the week ahead.</description>
    <language>en-us</language>
    <item>
    <title>Tue 1/12 : 2 Stars</title>
    <link>/P</link>
    <description>
    Rating :: 2 Stars. SSSS :: 11 ft @ 7 secs. Wind :: 26mph SSE </description>
    </item>

    Above is a sample of the XML, from which I want to extract and display items based on the bolded text (i.e. those which have a rating of 3 or higher). I'm having problems getting into this node. Is this caused by many of the nodes having the same name, such as description, title etc.? Or am I making an error with my node numbers?

    Could anyone give me an example of how to create the correct selection and parse the results for display?


Comments

  • Registered Users Posts: 9,579 ✭✭✭Webmonkey


    First of all, it would be useful to know what language you are using :)

    Also, can you show us the code you have so far?


  • Registered Users Posts: 18,272 ✭✭✭✭Atomic Pineapple


    DOM or SAX parser?


  • Registered Users Posts: 2,234 ✭✭✭techguy


    I've been looking to do a similar thing but just haven't gotten around to it.

    Hope the following helps.

    From: http://www.velocityreviews.com/forums/t296139-read-xml-from-string-instead-of-file-c.html
    using System;
    using System.IO;
    using System.Xml;

    public class XmlFromString
    {
        public static void Main(string[] args)
        {
            // Keep the literal on one line; a regular C# string cannot span lines.
            string xml = "<?xml version='1.0'?><person firstname='john' lastname='smith' />";

            // Parse the XML held in the string rather than loading a file.
            XmlDocument doc = new XmlDocument();
            doc.InnerXml = xml;

            // Attributes are read straight off the root element.
            XmlElement root = doc.DocumentElement;
            Console.WriteLine(" The firstname : {0} lastname: {1}",
                root.GetAttribute("firstname"), root.GetAttribute("lastname"));
        }
    }


  • Moderators, Technology & Internet Moderators Posts: 1,335 Mod ✭✭✭✭croo


    If you are using Java, I believe XPath is a pretty common approach.
    http://www.javabeat.net/tips/182-how-to-query-xml-using-xpath.html
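
    XPath is not Java-only; desktop .NET exposes it through XmlDocument.SelectNodes. The sketch below is only an illustration against the sample feed from the first post: the class and method names are made up, and it has not been tried on the Compact Framework. Spelling out the full path is one way around the "same name" worry, because the channel's own description sits beside the items rather than inside them.

```csharp
using System;
using System.Xml;

public static class XPathItems
{
    // Returns the trimmed <description> text of every <item> in an RSS 2.0 document.
    // (Hypothetical helper, named here for illustration only.)
    public static string[] ItemDescriptions(string rssXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);

        // The path names each level, so the channel-level <description>
        // (a sibling of <item>, not a child of it) is never matched.
        XmlNodeList nodes = doc.SelectNodes("/rss/channel/item/description");

        string[] result = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
        {
            result[i] = nodes[i].InnerText.Trim();
        }
        return result;
    }

    public static void Main()
    {
        // Sample feed taken from the first post, closed off so it is well-formed.
        string xml =
            "<rss version='2.0'><channel>" +
            "<title>Latest P</title>" +
            "<description>A brief overview of the weather for the week ahead.</description>" +
            "<item><title>Tue 1/12 : 2 Stars</title>" +
            "<description> Rating :: 2 Stars. SSSS :: 11 ft @ 7 secs. Wind :: 26mph SSE </description>" +
            "</item></channel></rss>";

        foreach (string description in ItemDescriptions(xml))
        {
            Console.WriteLine(description);
        }
    }
}
```

    A path like //description would pick up the channel description as well, which is why the longer form is used here.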


  • Closed Accounts Posts: 496 ✭✭j0e


    Hey guys, thanks for the responses. I am using Visual Basic with the Windows Mobile 5 Pocket PC SDK.

    I'm trying to make a simple app to display some RSS feeds based on the rating, but I'm finding it hard to single out the rating.


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    That's .NET, right? Here's a potential solution in C# using LINQ to XML. Dunno if that will work on a mobile, and I've been working on my master's all day so the head is a little fried, but it works for me (it's ASP.NET and can be tidied up a lot :o, I'll leave that to you):
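    // Needs: using System.Linq; using System.Text; using System.Xml.Linq;
    protected void Button1_Click(object sender, EventArgs e)
    {
        XDocument feedXML = XDocument.Load(Server.MapPath("Feed.xml"));

        // Project each <item> to its trimmed description text.
        var items = from item in feedXML.Descendants("item")
                    select new
                    {
                        Description = item.Element("description").Value.Trim()
                    };

        // Append every description; assigning Label1.Text inside the loop
        // would leave only the last item showing.
        var text = new StringBuilder();
        foreach (var item in items)
        {
            text.Append(item.Description).Append("<br />");
        }
        this.Label1.Text = text.ToString();
    }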
    

    Here's the XML for feed.xml, it will look familiar to you:
    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <title>Latest P</title>
        <link>/P</link>
        <description>A brief overview of the weather for the week ahead.</description>
        <language>en-us</language>
        <item>
          <title>Tue 1/12 : 2 Stars</title>
          <link>/P</link>
          <description>
            Rating :: 2 Stars. SSSS :: 11 ft @ 7 secs. Wind :: 26mph SSE
          </description>
        </item>
      </channel>
    </rss>
    
    
    

    You'll also have to work out how to get the Rating out of the returned string, maybe using an offset or something.

    Have a look at Scott Gu's blog on this too.
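
    For the offset idea, something along these lines might do. It is a rough sketch rather than tested production code: it assumes every description contains the literal text "Rating :: " followed by a whole number, as in the sample feed. The RatingFilter class, its method names and the second, 4-star item are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RatingFilter
{
    // Assumed marker, copied from the sample description text.
    private const string Marker = "Rating :: ";

    // Reads the digits that follow the marker; returns -1 if the marker or number is missing.
    public static int ParseRating(string description)
    {
        int start = description.IndexOf(Marker, StringComparison.Ordinal);
        if (start < 0) return -1;
        start += Marker.Length;   // offset of the first character after the marker

        int end = start;
        while (end < description.Length && description[end] >= '0' && description[end] <= '9')
        {
            end++;
        }
        if (end == start) return -1;

        return int.Parse(description.Substring(start, end - start));
    }

    // Returns the trimmed descriptions of all items rated at or above minRating.
    public static List<string> ItemsRatedAtLeast(XDocument feed, int minRating)
    {
        return feed.Descendants("item")
                   .Select(item => ((string)item.Element("description") ?? "").Trim())
                   .Where(text => ParseRating(text) >= minRating)
                   .ToList();
    }

    public static void Main()
    {
        // First item is from the thread; the second is made up so the filter has something to keep.
        XDocument feed = XDocument.Parse(
            "<rss version='2.0'><channel>" +
            "<item><description> Rating :: 2 Stars. SSSS :: 11 ft @ 7 secs. Wind :: 26mph SSE </description></item>" +
            "<item><description> Rating :: 4 Stars. SSSS :: 6.5 ft @ 9 secs. Wind :: 12mph W </description></item>" +
            "</channel></rss>");

        foreach (string text in ItemsRatedAtLeast(feed, 3))
        {
            Console.WriteLine(text);
        }
    }
}
```

    Items with no recognisable rating come back as -1 and so drop out of any "3 or higher" filter instead of throwing.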


  • Closed Accounts Posts: 496 ✭✭j0e


    Update: I've got it displaying, no parsing as yet.

    But how do I split the string so the description is displayed as shown below?

    At the moment I'm using a stretched-out textbox. This is similar to my first request: I want to search through a string and retrieve a char, or edit the string according to the char. In this case, take a new line every time it encounters a full stop.

    Rating :: 2 Stars.
    SSSS :: 11 ft @ 7 secs.
    Wind :: 26mph SSE


  • Registered Users Posts: 9,579 ✭✭✭Webmonkey


    Do a split on the string using the full stop as a delimiter.


  • Registered Users Posts: 515 ✭✭✭NeverSayDie


    As suggested by Webmonkey, a Split() might suffice, otherwise, you can use something called "regular expressions" (that's a general tool across lots of platforms, not just used in VB.NET) to parse up the string you extracted from the node.

    You'll find lots of info on regexes around the web, here for example;
    http://www.4guysfromrolla.com/articles/022603-1.aspx


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    Yeah I'd go with split() too.


  • Closed Accounts Posts: 496 ✭✭j0e


    Nice one lads, thanks for all the input. I'll work on the implementation tonight, working on a paper at the minute :|. Once it's up and running I'll post up the code for future reference.


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    Could you update the thread to [Solved]? You'll need to edit your first post and use the dropdown beside the Title field.

    Thanks.


  • Closed Accounts Posts: 496 ✭✭j0e


    Yeah, no worries. I was trying the split() with a full stop to cut them; it works fine until one of the numbers goes to a decimal, so I'm trying to find a way around that.

    Would you like me to alter the original post to contain the solution also?


  • Registered Users Posts: 515 ✭✭✭NeverSayDie


    j0e wrote: »
    Yeah, no worries. I was trying the split() with a full stop to cut them; it works fine until one of the numbers goes to a decimal, so I'm trying to find a way around that.

    That's a bit trickier. Assuming it's specific things like the rating you're after, you should be able to use regexes to match particular combinations of terms in the string - eg, extract the number that appears after a "Rating :: " sequence and before a period following one or more letters, that kind of thing.
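
    As a rough illustration of that kind of match, here is a sketch in C#. The pattern is a guess based on the one sample description in this thread (a whole number, then a word such as "Stars", then a full stop), and the RatingRegex class and TryGetRating method are names made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RatingRegex
{
    // "Rating", "::", the digits we want (captured), then a word such as "Stars" and a full stop.
    private static readonly Regex Pattern =
        new Regex(@"Rating\s*::\s*([0-9]+)\s+[A-Za-z]+\.");

    // Returns true and sets rating when the description contains a rating in the assumed format.
    public static bool TryGetRating(string description, out int rating)
    {
        rating = 0;
        Match match = Pattern.Match(description);
        return match.Success && int.TryParse(match.Groups[1].Value, out rating);
    }

    public static void Main()
    {
        string description = "Rating :: 2 Stars. SSSS :: 11 ft @ 7 secs. Wind :: 26mph SSE";

        int rating;
        if (TryGetRating(description, out rating))
        {
            Console.WriteLine("Rating is " + rating);
        }
        else
        {
            Console.WriteLine("No rating found");
        }
    }
}
```

    Because the number is captured on its own, decimals elsewhere in the string (wave height, for instance) do not affect the result.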

    And on a semi-related note, be careful when you're parsing over-verbose XML;
    http://thedailywtf.com/Articles/Special-Delivery.aspx
    :)


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    Nah, it's fine, just update the status to solved. Although you could add the solution in a new post so people can find it in future.


  • Closed Accounts Posts: 496 ✭✭j0e


    I was doing a bit of research on the split, but got no real answer.
    There is a dirty hack, if I could do split() on "s."

    But I'm reading up on regular expressions to see if that's any help.


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    Maybe the regex route is the way to go. You could always split on ". ", which is a period followed by a space; that would avoid splitting on the decimal points in numbers.
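
    A quick sketch of the ". " split, in case it helps. It assumes every full stop that ends a field is followed by a space, which holds for the sample description but may not hold for every feed. The 6.5 ft value is made up to show a decimal surviving, and the DescriptionLines class is just a name for the example.

```csharp
using System;

public static class DescriptionLines
{
    // Splits a description into fields on ". " (full stop plus space),
    // so a decimal such as 6.5 is left in one piece.
    public static string[] SplitFields(string description)
    {
        string[] parts = description.Trim().Split(
            new[] { ". " }, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < parts.Length; i++)
        {
            // The delimiter is consumed by Split; also drop a trailing full stop on the last field.
            parts[i] = parts[i].Trim().TrimEnd('.');
        }
        return parts;
    }

    public static void Main()
    {
        // Layout copied from the thread's sample; the decimal wave height is invented.
        string description = " Rating :: 2 Stars. SSSS :: 6.5 ft @ 7 secs. Wind :: 26mph SSE ";

        // One field per line, ready to drop into a multiline textbox.
        Console.WriteLine(string.Join(Environment.NewLine, SplitFields(description)));
    }
}
```

    Note that the full stops themselves are dropped from the output; append one to each field if the display needs them.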

