
perl regex - double search/replace

  • 13-07-2012 2:30pm
    #1
    Registered Users Posts: 6,501 ✭✭✭


    I am modifying a custom CMS for a client.
    The very old editor that it has (htmlarea v3, I think) corrupts 'iframe' references, so I have changed the angle brackets to square brackets for storing in the database and convert them back with a regex when the HTML pages are created.

    The editor also converts double quotes to the HTML entity (&quot;).

    My code is:
    $page_content_temp =~ s/\[iframe (.+?)\](.*)\[\/iframe\]/<iframe $1>$2<\/iframe>/sg;
    $page_content_temp =~ s/&quot;/"/sg; # Return &quot; to actual double quotes.
    

    The second line is the problem. I would like it to only change '&quot;' where it appears within the 'iframe' code.
    Should I write another few lines to capture the first line's $1 and $2 into new variables, then process them before reassembling?


Comments

  • Registered Users Posts: 6,501 ✭✭✭daymobrew


    I was thinking of something like:
    # Extract the attributes of the iframe tag, where the '&quot;' bits will be
    ($iframe_attrs) = ($data =~ /\[iframe (.+?)\].*\[\/iframe\]/);
    $iframe_attrs =~ s/&quot;/"/sg;  # Change &quot; to double quotes.
    
    # Search/replace on the full data, but use the modified $iframe_attrs.
    $data =~ s/\[iframe (.+?)\](.*)\[\/iframe\]/<iframe ${iframe_attrs}>$2<\/iframe>/g;
    
    The bug with this code is that there could be multiple iframe elements in $data.
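
To make the multiple-iframe problem concrete, here is a small sketch. The sample string is made up, and the body match uses a non-greedy `(.*?)` (an assumption, not the original pattern) so that each match covers a single iframe and the attribute problem shows up on its own.

```perl
use strict;
use warnings;

# Made-up sample data with two iframes (not from the real CMS).
my $data = '[iframe src=&quot;a.html&quot;][/iframe] [iframe src=&quot;b.html&quot;][/iframe]';

# A list-context match without /g returns the captures of the first match only.
my ($iframe_attrs) = $data =~ /\[iframe (.+?)\]/;
$iframe_attrs =~ s/&quot;/"/g;

# Every iframe is rewritten, but always with the attributes captured
# from the first one, so the second src is lost.
(my $out = $data) =~ s/\[iframe (.+?)\](.*?)\[\/iframe\]/<iframe $iframe_attrs>$2<\/iframe>/g;

print "$out\n";
```

Both iframes come out pointing at a.html, which is why the attributes need to be processed per match rather than captured once up front.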


  • Subscribers Posts: 4,075 ✭✭✭IRLConor


    From the "crimes against readability" department:
    $page_content_temp =~ s {\[iframe(.*?)\](.*?)\[/iframe\]}
                            {
                                my ($attrs, $content) = ($1, $2);
                                # Decode &quot; in both the attributes and the body.
                                s/&quot;/"/g for $attrs, $content;
                                "<iframe$attrs>$content</iframe>"
                            }sexg;
    

    :)
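
If the inline block is too dense, the same idea can probably be split out into a named sub. This is only a sketch: `restore_iframe` is a hypothetical helper name, the sample string is invented, and it assumes `&quot;` should be decoded in both the attributes and the body of each iframe while the rest of the page is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): rebuild one iframe from its
# bracketed form, decoding &quot; only inside that iframe.
sub restore_iframe {
    my ($attrs, $body) = @_;           # copy the captures before any new match
    s/&quot;/"/g for $attrs, $body;    # modifies the local copies only
    return "<iframe$attrs>$body</iframe>";
}

# Invented sample content with two iframes and a &quot; outside them.
my $page_content_temp = join '',
    'He said &quot;hi&quot;. ',
    '[iframe src=&quot;a.html&quot;][/iframe] and ',
    '[iframe src=&quot;b.html&quot; width=&quot;10&quot;]x &quot;y&quot;[/iframe]';

# /e evaluates the replacement as code once per match, /g repeats it for
# every iframe, and /s lets . match newlines inside an iframe.
$page_content_temp =~ s{\[iframe(.*?)\](.*?)\[/iframe\]}
                       {restore_iframe($1, $2)}sge;

print "$page_content_temp\n";
```

The `&quot;` entities outside the iframes stay as they are, and each iframe gets its own attributes back because the sub runs once per match.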

