Perl: Email and Url parsing

  • 12-06-2003 12:41pm
    #1
    Registered Users Posts: 383 ✭✭cherrio


    Has anyone got some short code for parsing URLs and email addresses from a body of text? Actual code, as opposed to available modules.

    In Perl, of course :)


Comments

  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    Originally posted by cherrio
    email addresses from a body of text.
    I smell a would-be Spammer here...


  • Registered Users Posts: 383 ✭✭cherrio


    ??????????

    What is that supposed to mean???

    I am talking about how to highlight URLs and email addresses in a user-submitted body of text.

    Exactly what vBulletin does,
    http://www.somesite.com
    vBulletin makes it clickable automatically. I want to do this but in Perl.

    How on earth did you come up with spam??


  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    Originally posted by cherrio
    What is that supposed to mean???

    I am talking about how to highlight URLs and email addresses in a user-submitted body of text.
    You never said what you needed it for, and parsing email addresses from the body of an email or HTML page is how spam-harvesting spiders do their work.

    I'd imagine using a regular expression would be the way to go. Here's a quick one to extract URLs that took me all of 30 seconds to find with Google.


  • Registered Users Posts: 383 ✭✭cherrio


    I think you jumped to a very large and inaccurate conclusion after reading my post. As a mod, you should certainly have a presumption of innocence, not guilt. And be certain before you start name-calling.


    Anyone else have any ideas? I can do it in PHP, but want to do it in Perl. http://aspn.activestate.com/ASPN/Cookbook/Rx/Recipe/59821 is not what I'm looking for; I'm trying to extract the URL from a textarea, not from HTML source code.


  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    Originally posted by cherrio
    I think you jumped to a very large and inaccurate conclusion after reading my post. As a mod, you should certainly have a presumption of innocence, not guilt. And be certain before you start name-calling.
    Aww, diddums.

    On this forum I'm not a mod, just another Joe Schmoe like you.

    That we should all just have a presumption of innocence is your opinion, and one that I would not share. I've been about long enough to realize that, on the Internet, if someone looks as if they're up to no good, they probably are.


  • Registered Users Posts: 383 ✭✭cherrio


    Any other basic human rights that you don't believe in? Maybe free speech?

    To get back on topic, does anybody have any suggestions?


  • Registered Users Posts: 491 ✭✭Silent Bob


    Originally posted by cherrio
    Any other basic human rights that you don't believe in? Maybe free speech?

    This is a private bulletin board. Free speech doesn't exist here.


  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    Originally posted by cherrio
    Any other basic human rights that you don't believe in? Maybe free speech?
    setmypeoplefree.jpg


  • Registered Users Posts: 68,317 ✭✭✭✭seamus


    I don't care what you want to use it for.

    If you know Perl, then it should be pretty much identical to how you do it in PHP.

    Just think (in regular expression terms) about how you would identify a URL or email address:

    URL: Begins with 'http://' or 'www.' and ends at the first whitespace character you find. This is how vBulletin auto-parses URLs.
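
    A minimal sketch of that URL rule in Perl, using only core features and no modules. The name `linkify_urls` and the sample text are made up for illustration, and the lookbehind that stops 'www.' matching in the middle of a word is an extra assumption on top of the rule above.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap anything that looks like a URL in an <a> tag.
# Rule: starts with "http://" or "www." and runs to the next whitespace.
sub linkify_urls {
    my ($text) = @_;
    $text =~ s{
        (?<![\w/.])                 # assumption: not in the middle of a word or another URL
        ( (?:http://|www\.) \S+ )   # the prefix, then everything up to whitespace
    }{
        my $shown = $1;
        # A bare "www." address needs a scheme before it can be used as an href.
        my $href  = $shown =~ m{^http://} ? $shown : "http://$shown";
        qq{<a href="$href">$shown</a>};
    }gex;
    return $text;
}

my $body = "Try http://www.somesite.com or www.example.org today";
print linkify_urls($body), "\n";
```

    Because the match runs to the next whitespace, trailing punctuation such as a full stop after the URL ends up inside the link, and user-submitted text would normally be HTML-escaped before a substitution like this is applied.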

    Email addresses are the subject of much academic discussion.

    Suffice to say, the basic requirement is:

    First character is a letter. There is then any number of alphanumeric characters. Then an @ sign. Then any number of alphanumeric characters, then a dot (.) followed by the two- or three-letter TLD.
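
    A rough sketch of that email rule as a Perl regex, again with no modules. The names `$email_re` and `linkify_emails` are invented for this example, and the pattern is deliberately loosened from the description above (it also allows '.', '_' and '-' in the local part, subdomains, and longer TLDs), so treat it as a starting point rather than an RFC-accurate matcher.

```perl
use strict;
use warnings;

# Hypothetical pattern, loosened slightly from the basic requirement above.
my $email_re = qr{
    \b
    [A-Za-z] [A-Za-z0-9._-]*    # local part: a letter, then letters, digits, dot, underscore or hyphen
    \@
    (?: [A-Za-z0-9-]+ \. )+     # one or more domain labels, each followed by a dot
    [A-Za-z]{2,}                # TLD: at least two letters
    \b
}x;

# Hypothetical helper: turn each matched address into a mailto: link.
sub linkify_emails {
    my ($text) = @_;
    $text =~ s{($email_re)}{<a href="mailto:$1">$1</a>}g;
    return $text;
}

print linkify_emails('Mail someone@example.com for help.'), "\n";
```

    The trailing `\b` keeps a sentence-ending full stop out of the match, but plenty of valid addresses (a local part starting with a digit, for instance) will still be missed.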

    :)


  • Registered Users Posts: 1,722 ✭✭✭Thorbar


    Originally posted by cherrio
    I think you jumped to a very large and inaccurate conclusion after reading my post.

    What, The Corinthian would do such a thing? I'd never believe that.


  • Closed Accounts Posts: 9,314 ✭✭✭Talliesin


    Originally posted by cherrio
    Any other basic human rights that you don't believe in? Maybe free speech?

    I really hate people turning any minor confrontation on a list or board into a dramatic battle about fundamental human rights. I hate it on so many levels for so many reasons.

    I probably shouldn't ban you just for doing something that I personally hate, but don't tempt me.


  • Registered Users Posts: 383 ✭✭cherrio


    Originally posted by Talliesin
    I really hate people turning any minor confrontation on a list or board into a dramatic battle about fundamental human rights. I hate it on so many levels for so many reasons.

    That's good for you. But I was being sarcastic.

    Now unless people have something to post about my original question, please don't post at all.


  • Closed Accounts Posts: 304 ✭✭Zaltais


    A quick 30 second search on CPAN finds me:

    URI::Find and Email::Find

    Is this 'on topic' enough for you?????

    Or should I delete this post?


  • Registered Users Posts: 383 ✭✭cherrio


    Yeah Zaltais, I found those as well. But I want to use actual code, not loadable external modules. Any other suggestions?


  • Closed Accounts Posts: 304 ✭✭Zaltais


    Why not use external modules? AFAIK these modules follow the RFCs for both URLs and email addresses very closely, and are likely to catch a much larger range of possibilities than anything you are likely to write in the space of an afternoon...

    But, whatever gets your goat I suppose....

    You need to do some research on regexes (regular expressions).

    Read the documentation that comes with your distro. From a shell or command prompt:

    # perldoc perlretut
    # perldoc perlre

    Or the online versions:

    perlretut

    perlre

    If you want more of a walkthrough, read Learning Perl and/or Programming Perl.

    If you want example code that you can pilfer to do what you're looking for, spend an hour on Google.

