Regular expression help

  • 15-09-2005 2:45pm
    #1
    Closed Accounts Posts: 1,525 ✭✭✭


    I'm getting a bit confused by the whole regular expression thing.
    I'm having issues regarding partial matching.
    I already spent 2 hours yesterday figuring out that the HTML space &nbsp; is different to a normal space :mad:

    What I want to do is to have say the name John Murphy and have that match the text J Murphy. I could match by just the surname but that would be too inexact.

    Is there a way to do this all in one expression? I know that you can use \b to determine word boundaries, but I'm a bit stumped regarding individually checking each word. Will I have to have an individual check for each word (e.g. check that J matches and then that Murphy matches)? Any help would be appreciated!


Comments

  • Registered Users Posts: 21,264 ✭✭✭✭Hobbes


    Have you tried the manual online? Incidentally, not everyone can read the black text (I have a black background).


  • Registered Users Posts: 1,275 ✭✭✭bpmurray


    Just a couple of points: &nbsp; is actually a "non-breaking space". If you have "John Murphy" at the end of the line, it can break between "John" and "Murphy"; however, "John&nbsp;Murphy" does not break between the words: it's a *non-breaking* space. For the purpose of text analysis, though, it's treated as a space. Have a look at the Unicode character properties data available from www.unicode.org.
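
    To make the non-breaking space point concrete, here is a minimal Python sketch. Python is an assumption here, since the thread never says which regexp engine is in use. It shows that a literal space in a pattern does not match U+00A0, while \s does for Python str patterns.

```python
import re

NBSP = "\u00a0"  # the character that the HTML entity &nbsp; decodes to

text = f"John{NBSP}Murphy"

# A literal ASCII space in the pattern does not match the non-breaking space.
literal_space = re.search(r"John Murphy", text)

# For str patterns, \s matches Unicode whitespace, which includes U+00A0.
any_whitespace = re.search(r"John\sMurphy", text)

# Alternatively, normalise the text before matching.
normalised = text.replace(NBSP, " ")

print(literal_space is None)       # the literal-space pattern fails
print(any_whitespace is not None)  # the \s pattern succeeds
print(normalised == "John Murphy")
```

    Other engines may treat \s differently, so it is worth checking the manual for the one you are using.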

    As far as regular expressions are concerned, if you want "John Murphy", "J murphy" and "Johnney Murphey" all to match one expression, you have to look at what you're really trying to match. In these cases it's:
    "J"(optional "ohn" (optional "ney"))<whitespace>"Murph"(optional "e")"y"

    Depending on which regexp flavour you're using, it might be:
    ^J(ohn(ney)?)? *Murph(e)?y$

    Clearer? Or did I confuse the issue more?
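
    A quick way to see what that pattern accepts is to run it over a few names. This is only a sketch in Python (an assumed engine); the re.IGNORECASE flag is an addition that is not in the pattern as posted, and is there so that the lowercase "J murphy" also matches.

```python
import re

# bpmurray's pattern, unchanged. re.IGNORECASE is an added assumption so that
# the lowercase "J murphy" from the post also matches.
pattern = re.compile(r"^J(ohn(ney)?)? *Murph(e)?y$", re.IGNORECASE)

candidates = [
    "John Murphy",
    "J murphy",
    "Johnney Murphey",
    "Jo Murphy",
    "Joe Murphy",
    "JMurphy",
]

results = {name: bool(pattern.match(name)) for name in candidates}

for name, matched in results.items():
    print(f"{name!r}: {matched}")
```

    Note that " *" means zero or more spaces, so "JMurphy" is accepted as well, while "Jo Murphy" is rejected because "ohn" is optional only as a whole.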


  • Closed Accounts Posts: 779 ✭✭✭homeOwner


    vorbis wrote:
    What I want to do is to have say the name John Murphy and have that match the text J Murphy. I could match by just the surname but that would be too inexact.

    Not sure if I understand what you are matching, but try

    J[A-Za-z]*.Murphy

    which is basically 'J' followed by zero or more occurrences of any letter, followed by any single character (i.e. a space), followed by the word Murphy. This will also match Jane Murphy though. Play with the above in TextPad; it has a regexp matcher to find what you are looking for.

    For Unix, put ^ at the start and $ at the end so the pattern has to match the whole line.
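
    The description above ('J', zero or more letters, any one character, then Murphy, with ^ and $ around it) can be tried out as follows. This is a sketch in Python rather than TextPad, and the list of names is made up for illustration.

```python
import re

# 'J', zero or more letters, any single character, then 'Murphy',
# anchored with ^ and $ so the whole string has to match.
pattern = re.compile(r"^J[A-Za-z]*.Murphy$")

names = ["John Murphy", "J Murphy", "Jane Murphy", "John\u00a0Murphy", "Tom Murphy"]

matches = {name: bool(pattern.match(name)) for name in names}

for name, matched in matches.items():
    print(f"{name!r}: {matched}")
```

    Because "." accepts any character, the non-breaking space version matches too, and so does Jane Murphy, as noted above.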


  • Closed Accounts Posts: 1,525 ✭✭✭vorbis


    bpmurray, your pattern is kinda close to what I'm trying to do.
    I'd like to match as many characters as are given.
    E.g. if a word has three letters, then it should match the first three letters of the name.
    So for John Murphy:
    Jo Murphy would match and Joh Murphy would match, but Joe Murphy wouldn't.
    Is this possible with regular expressions?
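
    One way to get this kind of prefix matching is to nest optional groups, so that each letter is only allowed if all the letters before it were present. The sketch below builds such a pattern in Python; the helper names prefix_pattern and name_pattern are hypothetical, and using \s+ between the two words is an assumption about how the names are separated.

```python
import re


def prefix_pattern(word: str) -> str:
    """Return a regex fragment that matches any non-empty prefix of word."""
    first, rest = word[0], word[1:]
    fragment = ""
    # Build from the last letter inwards: (?:n)? -> (?:h(?:n)?)? -> ...
    for ch in reversed(rest):
        fragment = f"(?:{re.escape(ch)}{fragment})?"
    return re.escape(first) + fragment


def name_pattern(first_name: str, surname: str) -> re.Pattern:
    """Match a prefix of first_name, whitespace, then the full surname."""
    return re.compile(rf"^{prefix_pattern(first_name)}\s+{re.escape(surname)}$")


matcher = name_pattern("John", "Murphy")

for candidate in ["J Murphy", "Jo Murphy", "Joh Murphy", "John Murphy", "Joe Murphy"]:
    print(f"{candidate!r}: {bool(matcher.match(candidate))}")
```

    For "John" the generated fragment is J(?:o(?:h(?:n)?)?)?, so "Jo" and "Joh" are accepted but "Joe" is not, because "e" is never a valid next letter.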

