
Grep type question

  • 09-01-2010 4:45pm
    #1
    Posts: 5,589 ✭✭✭


    I want to be able to select and copy multiple lines of code that lie within an HTML element with a unique id. So, if the source page looks like this:
    Blah
    Blah
    <div id="unique">
    Line 1
    Line 2
    Line 3
    </div>
    blah
    blah
    

    Grep seems to fail me here, as I can't select more than one line of code and I want to select lines 1-3. I'm guessing this is simple enough, but I can't find much on the internet. Anyone have any ideas? Thanks!


Comments

  • Posts: 0 [Deleted User]


    jQuery would be perfect for this - it has selectors to target the relevant div and it's then easy to pull out the content. You'll find all you need on the jQuery site - a line of code will do it.


  • Posts: 5,589 ✭✭✭ [Deleted User]


    Cheers - I'd rather not use javascript, I was hoping to get this done as a shell script.


  • Moderators, Education Moderators, Home & Garden Moderators Posts: 8,171 Mod ✭✭✭✭Jonathan




  • Registered Users Posts: 1,109 ✭✭✭Skrynesaver


    Something like the following should do it. WARNING: untested.
    perl -e 'while (<>){$found = 0 if ($found && /<\/div/); print if $found; $found = 1 if (/div id="unique"/);}' "$FILENAME"
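For anyone who wants to stay in plain shell, the same flag idea can be sketched with awk. This is only a rough sketch: the function name extract_div_flag is made up for illustration, and it assumes the opening and closing tags each sit on their own line, as in the sample page.

```shell
#!/bin/sh
# extract_div_flag FILE (hypothetical helper name)
# Prints the lines strictly between the <div id="unique"> line and the
# next line containing </div. Assumes each tag is on a line of its own.
extract_div_flag() {
    awk '
        /<\/div/          { inside = 0 }  # closing tag: clear the flag before printing
        inside            { print }       # print only while the flag is set
        /div id="unique"/ { inside = 1 }  # opening tag: printing starts on the next line
    ' "$1"
}
```

The order of the three rules is what keeps both tag lines out of the output: the flag is cleared before the print rule runs and set after it.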
    


  • Registered Users Posts: 6,509 ✭✭✭daymobrew


    Or look at the 3 dot perl range operator. Again, this is untested.
    perl -e 'while (<>){print if /div id="unique"/ ... /<\/div/;}'  $FILENAME
    
    There is a 2 dot version too:
    perl -e 'while (<>){print if /div id="unique"/ .. /<\/div/;}'  $FILENAME
    
    IIRC one will print the div lines, the other won't.

    Edit: From a quick experiment this morning, both code snippets print the div lines. I might be doing something wrong.
    <div id="unique">
    Line 1
    Line 2
    Line 3
    </div>
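For comparison, sed has the same kind of line range, and it also includes both endpoints. The perlop documentation describes the three-dot form as the one that, like sed, does not test the right-hand pattern until the next line; the two forms only differ when both patterns match the same line, which would explain why both snippets print the div lines here. A minimal sketch, with print_div_inclusive being a made-up name:

```shell
#!/bin/sh
# print_div_inclusive FILE (hypothetical helper name)
# Prints from the <div id="unique"> line through the next line that
# contains </div, with both endpoint lines included.
print_div_inclusive() {
    sed -n '/div id="unique"/,/<\/div/p' "$1"
}
```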
    


  • Registered Users Posts: 6,509 ✭✭✭daymobrew


    I haven't been able to get my code snippet working.
    The range operator docs say that the operator returns values that could be useful but I couldn't figure out how to access this returned value.
    "The value returned is either the empty string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string "E0" appended to it, which doesn't affect its numeric value, but gives you something to search for if you want to exclude the endpoint. You can exclude the beginning point by waiting for the sequence number to be greater than 1."
    This is getting closer:
    perl -e 'while (<>){print if ((/div id="unique"/ ... /<\/div/) > 1)}' $FILENAME
    
    This returns:
    Line 1
    Line 2
    Line 3
    </div>
    


  • Registered Users Posts: 6,509 ✭✭✭daymobrew


    I think I finally got it (I just couldn't let this one go):
    ~> perl -e 'while (<>){ $range = (/div id="unique"/ ... /<\/div/); print if ($range > 1 && $range !~ /E0/)}' $FILENAME
    
    It stores the result of the range operator and then checks it. The range operator returns '5E0' for the last matching line (the one with '</div>'), so the !~ /E0/ test drops that line and the > 1 test drops the opening line.
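A rough sed equivalent of dropping both endpoints, for a pure shell script: select the range, delete the two tag lines inside it, and print what is left. The name extract_div_body is hypothetical, and this assumes no nested div inside the block, since the range ends at the first line containing </div.

```shell
#!/bin/sh
# extract_div_body FILE (hypothetical helper name)
# Selects the range from <div id="unique"> to the next </div line, then
# deletes the two endpoint lines so only the lines in between are printed.
extract_div_body() {
    sed -n '/div id="unique"/,/<\/div/{
        /div id="unique"/d
        /<\/div/d
        p
    }' "$1"
}
```

Usage would be something like extract_div_body "$FILENAME" > body.txt.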

