Searching an unsorted text file to count number of occurrences of a word.

  • 28-04-2008 7:29pm
    #1
    Closed Accounts Posts: 3


    Hi,

    This is part of an assignment that I have at the moment. Basically, I have to design a program that will take in a word and return the number of times it occurs in a given unsorted text file.

    Will this data need to be sorted? Or can I just go ahead and design a loop that will scan the whole document? I will be trying the latter first, because it only has to go through the document once, as opposed to sorting it and then searching it.

    Am I on the right track here?

    Thanks..


Comments

  • Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    Like you say, you already have to iterate through the array at least once to sort it, so why not just count as you go? You only gain anything by sorting if you need to search the same array for multiple words (and even then there are better ways of doing that; google hash tables if you're interested).
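
    To illustrate the hash table idea, here is a minimal Python sketch. It assumes that a "word" is a run of letters, digits or apostrophes and that matching is case-insensitive; neither is stated in the assignment. The function name build_word_counts is made up for this example. The text is scanned once, and after that any number of words can be looked up without re-reading it.

```python
import re
from collections import Counter


def build_word_counts(text):
    """Scan the text once and return a hash table mapping word -> count."""
    # Assumption: words are runs of letters, digits or apostrophes,
    # compared case-insensitively.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    return Counter(words)


sample = "the cat sat on the mat. The end."
counts = build_word_counts(sample)

# Each lookup is a single hash table access; no rescan of the text.
print(counts["the"])  # 3
print(counts["dog"])  # 0 (Counter returns 0 for missing keys)
```

    For a single word this is no faster than a plain loop, since the whole text still has to be read once. It only pays off when several different words are looked up in the same text.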


  • Closed Accounts Posts: 1,444 ✭✭✭Cantab.


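    grep -ow 'word' filename.txt | wc -l

    should get you started (-o prints each match on its own line and -w matches whole words only, so wc -l counts occurrences rather than matching lines)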


  • Registered Users Posts: 2,931 ✭✭✭Ginger


    In .NET, use Regex.Matches and read the Count property of the MatchCollection it returns.
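
    The same regex idea can be sketched in Python. Note that this uses Python's re module, not the .NET Regex class, and the function name count_with_regex is made up for this example. It assumes whole-word, case-insensitive matching, which the assignment may or may not want.

```python
import re


def count_with_regex(text, word):
    """Count whole-word, case-insensitive matches of word in text."""
    # re.escape guards against regex metacharacters in the search word;
    # \b anchors the match at word boundaries so "the" does not match "then".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


sample = "The cat sat on the mat. Then the cat left."
print(count_with_regex(sample, "the"))  # 3
```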


  • Closed Accounts Posts: 4,368 ✭✭✭thelordofcheese


    imboard2 wrote: »
    Hi,

    This is part of an assignment that I have at the moment. Basically, I have to design a program that will take in a word and return the number of times it occurs in a given unsorted text file.

    Will this data need to be sorted? Or can I just go ahead and design a loop that will scan the whole document? I will be trying the latter first, because it only has to go through the document once, as opposed to sorting it and then searching it.

    Am I on the right track here?

    Thanks..

    Unless your assignment calls for the data to be sorted, don't bother.
    A simple loop like you've suggested should suffice.
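
    A rough Python sketch of such a loop, for illustration only. The function names count_word and count_word_in_file are made up here, and it assumes that words are separated by whitespace, that surrounding punctuation is ignored, and that case does not matter. Check those assumptions against the assignment.

```python
def count_word(lines, target):
    """Count occurrences of target in an iterable of lines, in one pass."""
    target = target.lower()
    total = 0
    for line in lines:
        for token in line.lower().split():
            # Strip surrounding punctuation so "mat." matches "mat".
            if token.strip(".,;:!?\"'()[]") == target:
                total += 1
    return total


def count_word_in_file(path, target):
    """Read the file line by line, so it never has to be sorted or fully loaded."""
    with open(path, encoding="utf-8") as f:
        return count_word(f, target)


sample = ["The cat sat on the mat.", "Then the cat left."]
print(count_word(sample, "the"))  # 3
```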

    Though, if it's only part of the whole assignment, you'd probably be best off looking at what's needed in the rest of it before making any decisions. You may need to sort the file or search for multiple terms or something else later on, and wouldn't it be nice not to have to rewrite a large chunk of it?

