
PHP rss feed parser help

  • 28-01-2005 12:59pm
    #1
    Registered Users Posts: 1,747 ✭✭✭


    I am looking for a bit of help to better understand a script so I can easily adapt it now and in the future.

    I am using this PHP script in a page to parse an rss feed:
    <?php
    /*
    Class RSSParser: 2 October 2002
    Original Author: Duncan Gough
    Single Feed Hack: By Gregg 't3h GeeK' van der Sluys, http://geek.scorpiorising.ca 12-13-03. 
    */
    
    class RSSParser	{
    
        var $title			= "";
        var $link 			= "";
        var $description 	= "";
        var $inside_item 	= false;
    
    
        var $all_rss_urls = array(
    							"http://www.spoiltchild.com/new/rss.xml"	=> "Spoiltchild.com",
    				);
    
    	function startElement( $parser, $name, $attrs='' ){
    		global $current_tag;
    
    		$current_tag = $name;
    
    		if( $current_tag == "ITEM" )
    			$this->inside_item = true;
    
    	} // endfunc startElement
    
    	function endElement( $parser, $tagName, $attrs='' ){
    		global $current_tag;
    
        	if ( $tagName == "ITEM" ) {
    
    			printf( "\t<div class='newsitem'><h3><a href='%s' target='_blank'> %s</a></h3>\n", trim( $this->link ), htmlspecialchars( trim( $this->title ) ) );
        		printf( "\t<p>%s </p></div>\n", htmlspecialchars( trim( $this->description ) ) );
    
        		$this->title = "";
        		$this->description = "";
        		$this->link = "";
        		$this->inside_item = false;
    
        	}
    
    	} // endfunc endElement
    
    	function characterData( $parser, $data ){
    		global $current_tag;
    
    		if( $this->inside_item ){
    			switch($current_tag){
    
    				case "TITLE":
    					$this->title .= $data;
    					break;
    				case "DESCRIPTION":
    					$this->description .= $data;
    					break;
    				case "LINK":
    					$this->link .= $data;
    					break;
    
    				default:
    					break;
    
    			} // endswitch
    
    		} // end if
    
    	} // endfunc characterData
    
    	function parse_results( $xml_parser, $rss_parser, $file )	{
    
    		xml_set_object( $xml_parser, $rss_parser ); // call-time pass-by-reference (&) is an error in PHP 5.4+
    		xml_set_element_handler( $xml_parser, "startElement", "endElement" );
    		xml_set_character_data_handler( $xml_parser, "characterData" );
    
    		$fp = fopen("$file","r") or die( "Error reading XML file, $file" );
    
    		while ($data = fread($fp, 4096))	{
    
    			// parse the data
    			xml_parse( $xml_parser, $data, feof($fp) ) or die( sprintf( "XML error: %s at line %d", xml_error_string( xml_get_error_code($xml_parser) ), xml_get_current_line_number( $xml_parser ) ) );
    
    		} // endwhile
    
    		fclose($fp);
    
    		xml_parser_free( $xml_parser );
    
    	} // endfunc parse_results
    
    	function show_title( $rss_url ){
    					?>
    						<small></small>
    					<?php
    	} // endfunc show_title
    
    } // endclass RSSParser
    
    global $rss_url;
    
    // Set a default feed
    if( $rss_url == "" )
    	$rss_url = "http://www.spoiltchild.com/new/rss.xml";
    
    $xml_parser = xml_parser_create();
    $rss_parser = new RSSParser();
    
    $rss_parser->show_title( $rss_url );
    $rss_parser->parse_results( $xml_parser, $rss_parser, $rss_url );
    
    ?>
    

    You can see this script in action here: http://www.spoiltchild.com/new/news_test.php

    And the source XML feed here: http://www.spoiltchild.com/new/rss.xml

    My question is: how do I pick out the individual elements and display them, such as the publish date and image, and how can I take the URL of each news piece and add it to a 'More' link at the end of the description?

    At the moment it seems to only pull and print each node (%s) in order. How do I add to what it pulls from the XML and change the order?

    thanks


Comments

  • Closed Accounts Posts: 4,655 ✭✭✭Ph3n0m


    If you look carefully you can see the pattern


    printf( "\t<div class='newsitem'><h3><a href='%s' target='_blank'> %s</a></h3>\n", trim( $this->link ), htmlspecialchars( trim( $this->title ) ) );

    the first %s refers to trim( $this->link )
    the second %s refers to htmlspecialchars( trim( $this->title ) )

    and in this line

    printf( "\t<p>%s </p></div>\n", htmlspecialchars( trim( $this->description ) ) );

    the %s refers to htmlspecialchars( trim( $this->description ) )


    Using that logic, you can put whatever you want into each %s by changing the argument that follows the format string, using the property that holds the matching <tag> from the XML.
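
To make the placeholder order concrete, here is a minimal sketch. The $link and $title values are made-up stand-ins for $this->link and $this->title, and sprintf() is used instead of printf() only so the result can be looked at before it is echoed.

```php
<?php
// Hypothetical sample values standing in for $this->link and $this->title.
$link  = "http://example.com/story";
$title = "Fish & Chips";

// Each %s is filled, left to right, by the arguments after the format string.
$html = sprintf(
    "<h3><a href='%s'>%s</a></h3>",
    trim($link),                      // first %s
    htmlspecialchars(trim($title))    // second %s, with & escaped as &amp;
);

echo $html, "\n";
```

Swapping the order of the arguments (and of the %s placeholders they belong to) is what changes the order of the output.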

    For example, say I want the headline to link to the story, with the date in brackets beside it and a shortened description underneath:


    printf( "\t<div class='newsitem'><h3><a href='%s' target='_blank'> %s</a> (%s)</h3>\n", trim( $this->link ), htmlspecialchars(trim($this->title)), trim( $this->date ) );


    and now for the description
    printf( "\t<p>%s </p></div>\n", htmlspecialchars( trim( $this->description ) ) );

    or something similar to that


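    And don't forget to declare any extra vars you want at the start of the class,

    i.e.

    var $date = "";

    and reset them along with the others in endElement():

    $this->date = "";

    You also need a matching case in characterData() that appends the data to the new var. The parser upper-cases tag names, so a <pubDate> tag, for example, would arrive as "PUBDATE".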
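
Putting those pieces together, a rough sketch might look like the following. It is a cut-down, hypothetical class: MiniRssParser, the $output property and the sample feed string are all made up for illustration and are not part of the original script. It assumes the feed stores the date in a <pubDate> tag, which the parser reports in upper case as PUBDATE, and it shows one way to reuse the item URL for a 'More' link after the description.

```php
<?php
// Sketch only: a trimmed-down parser that also collects <pubDate>.
// MiniRssParser, $output and the sample feed below are hypothetical.
class MiniRssParser {
    public $title = "";
    public $link = "";
    public $description = "";
    public $date = "";            // extra var for the <pubDate> text
    public $inside_item = false;
    public $current_tag = "";
    public $output = "";          // collected HTML, instead of printing directly

    function startElement($parser, $name, $attrs) {
        $this->current_tag = $name;
        if ($name == "ITEM") {
            $this->inside_item = true;
        }
    }

    function endElement($parser, $name) {
        if ($name == "ITEM") {
            // Arguments fill the %s placeholders left to right,
            // so the link can come last and feed the 'More' link.
            $this->output .= sprintf(
                "<div class='newsitem'><h3>%s (%s)</h3>\n<p>%s <a href='%s'>More</a></p></div>\n",
                htmlspecialchars(trim($this->title)),
                htmlspecialchars(trim($this->date)),
                htmlspecialchars(trim($this->description)),
                htmlspecialchars(trim($this->link))
            );
            // Reset everything, including the new var, for the next item.
            $this->title = "";
            $this->link = "";
            $this->description = "";
            $this->date = "";
            $this->inside_item = false;
        }
    }

    function characterData($parser, $data) {
        if (!$this->inside_item) {
            return;
        }
        switch ($this->current_tag) {
            case "TITLE":       $this->title .= $data;       break;
            case "LINK":        $this->link .= $data;        break;
            case "DESCRIPTION": $this->description .= $data; break;
            case "PUBDATE":     $this->date .= $data;        break; // new case
        }
    }
}

// Made-up one-item feed so the sketch runs without fetching anything.
$feed = "<rss version=\"2.0\"><channel><item>"
      . "<title>Hello</title>"
      . "<link>http://example.com/1</link>"
      . "<description>First post</description>"
      . "<pubDate>Tue, 21 Dec 2004 07:02:00 EDT</pubDate>"
      . "</item></channel></rss>";

$rss = new MiniRssParser();
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, [$rss, "startElement"], [$rss, "endElement"]);
xml_set_character_data_handler($xml_parser, [$rss, "characterData"]);
xml_parse($xml_parser, $feed, true) or die("XML error");

echo $rss->output;
```

The same three changes (extra var, extra case, extra reset) should carry over to the original RSSParser class; the handlers here are attached as array callables rather than through xml_set_object(), which is just a choice made for this sketch.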


  • Registered Users Posts: 1,747 ✭✭✭Figment


    Cool, thanks. I got it now.

    Next thing...
    The date is coming in as "Tue, 21 Dec 2004 07:02:00 EDT".
    Is there any way to drop the last few characters so I end up with "Tue, 21 Dec 2004"?

    Thanks for your help :)


  • Registered Users Posts: 1,169 ✭✭✭dangerman


    Just use substr():

    http://ie.php.net/manual/en/function.substr.php

    So for "Tue, 21 Dec 2004" I think it would be

    $date = substr($date, 0, 16);
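
As a quick check of that, here is a small sketch using the sample date from this thread. The variable names are only for illustration, and the rtrim() is an added assumption to cover feeds that write a one-digit day, where the first 16 characters would end in a space.

```php
<?php
// Sample value quoted earlier in the thread.
$date = "Tue, 21 Dec 2004 07:02:00 EDT";

// Keep the first 16 characters, i.e. everything before the time.
$short = rtrim(substr($date, 0, 16));
echo $short, "\n";

// Hypothetical one-digit day: the 16th character is a space, which rtrim() drops.
$short_single = rtrim(substr("Tue, 7 Dec 2004 07:02:00 EDT", 0, 16));
echo $short_single, "\n";
```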


  • Registered Users Posts: 1,747 ✭✭✭Figment


    Sorry, my knowledge of PHP is below basic. Where would I place
    $pubDate = substr($pubDate, 0, 16); in that code?

    A few attempts gave me errors.


  • Closed Accounts Posts: 4,655 ✭✭✭Ph3n0m


    If you are using the following


    printf( "\t<div class='newsitem'><h3><a href='%s' target='_blank'> %s</a> (%s)</h3>\n", trim( $this->link ), htmlspecialchars(trim($this->title)), trim( $this->date ) );


    then just use this version of it (I think this will work)


    printf( "\t<div class='newsitem'><h3><a href='%s' target='_blank'> %s</a> (%s)</h3>\n", trim( $this->link ), htmlspecialchars(trim($this->title)), substr(trim( $this->date ), 0, 16) );


  • Registered Users Posts: 1,747 ✭✭✭Figment


    That did the job brilliant. Thank you.


  • Banned (with Prison Access) Posts: 16,659 ✭✭✭✭dahamsta


    Unless you're doing it to educate yourself, you'd be a lot better off using Magpie. It's been around for an age so it's as mature as you get, and it'll parse just about anything.

    adam


  • Registered Users Posts: 1,747 ✭✭✭Figment


    I looked at Magpie first but it was overkill for what I needed.
    I also needed a solution with less of an install footprint (none), so the script I used above was ideal.

    Thanks

