Search Engine & Indexer for File servers?

  • 21-07-2006 10:44am
    #1
    Registered Users Posts: 68,317 ✭✭✭✭


    OK, this could really belong in any of a number of forums, but I figure that here is as good as any.

    Our users are, let's say, non-technical. We've built a 1.5TB file server for the office, and don't really want to just go ahead and give each department some space and let them work away. It'll get filled up with crap, and people will still be unable to find their files.

    So we're going to use a defined file structure and organise the entire company's files into a single tree. But that's superfluous. I need an indexer which will crawl this tree and extract such information as filename, file path, filetype, date created, date modified, etc etc. If this came with a web frontend for searching the index, all well and good. If not, I can code one myself.

    Any ideas? :)


Comments

  • Registered Users Posts: 1,452 ✭✭✭tomED


    Not entirely sure what you are trying to do, but thought I'd suggest this anyway.

    Google has an enterprise edition of their desktop application. I imagine it might be something to try out.

    http://desktop.google.com/en/GB/enterprise/

    If I'm way off - sorry! :)

    Tom


  • Registered Users Posts: 68,317 ✭✭✭✭seamus


    No, that's the kind of idea, although something as vague as Google may not be perfect. I think the enterprise edition is still Google Desktop, it's just manageable centrally, so it wouldn't work for my purposes.

    It wouldn't be a major issue for myself, except that I wouldn't be overly sure how to code a good indexer. :)


  • Registered Users Posts: 1,452 ✭✭✭tomED


    seamus wrote:
    No, that's the kind of idea, although something as vague as Google may not be perfect. I think the enterprise edition is still Google Desktop, it's just manageable centrally, so it wouldn't work for my purposes.

    It wouldn't be a major issue for myself, except that I wouldn't be overly sure how to code a good indexer. :)

    If it's on Windows, you could use the built-in Windows Indexing Service and then write some ASP code for the search engine. I worked on a project many, many moons ago that did something like this for a large intranet. We ran into a great many problems, mind, but in the end it worked pretty well.

    If I'm causing more headaches than I'm solving, just tell me :)


  • Registered Users Posts: 7,412 ✭✭✭jmcc


    seamus wrote:
    So we're going to use a defined file structure and organise the entire company's files into a single tree. But that's superfluous. I need an indexer which will crawl this tree and extract such information as filename, file path, filetype, date created, date modified, etc etc. If this came with a web frontend for searching the index, all well and good. If not, I can code one myself.

    Any ideas? :)
    It sounds more like a requirement for a script that traverses directories, stores the results in a flat file or db and has a search form attached. On Linux, slocate/updatedb does this automatically. I suppose it would be possible to do something similar with Windows scripting.
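
A rough sketch of that kind of script, in Python with SQLite as the db. This is only one way it could be done, not anything from the thread: the `files` table layout and the `build_index` and `search` names are made up for illustration, and a web search form would still need to be put in front of `search`.

```python
import os
import sqlite3
from datetime import datetime, timezone


def _iso(ts):
    # Store timestamps as ISO 8601 strings in UTC so they sort as text.
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


def build_index(root, db_path=":memory:"):
    """Walk `root` and record basic metadata for every file found."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, name TEXT, ext TEXT, "
        "size INTEGER, created TEXT, modified TEXT)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # unreadable or vanished file: skip it
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?, ?, ?)",
                (
                    full,
                    name,
                    os.path.splitext(name)[1].lower(),
                    st.st_size,
                    # st_ctime is the creation time on Windows only; on
                    # Unix it is the last metadata change.
                    _iso(st.st_ctime),
                    _iso(st.st_mtime),
                ),
            )
    conn.commit()
    return conn


def search(conn, term):
    """Return paths whose filename contains `term` (ASCII case-insensitive)."""
    rows = conn.execute(
        "SELECT path FROM files WHERE name LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

Re-running `build_index` against the same database file refreshes existing rows, since the path is the primary key. It does not remove rows for files that have since been deleted, so a real version would need a cleanup pass.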

    Regards...jmcc


  • Registered Users Posts: 68,317 ✭✭✭✭seamus


    Cheers guys, it sounds like it's something that's not specifically catered for. I'm not crazy about Windows indexing.

    I think we may be better off deciding on the filing structure before figuring out how to search it, or otherwise using a custom Document Management System to do it. The problem is that they'd like to use custom attributes for each document (such as Department, Supplier/Client Name, etc etc), to further aid in searching. If it was purely a DMS, then this wouldn't be a big deal.

    But free easy-to-use DMSs don't seem to be that easy to come by. Even Owl is super popular, but a standard user would generate a huge number of queries from it.
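
For what it's worth, custom attributes like these don't strictly need a full DMS. A minimal sketch, assuming the attributes live in a side table keyed by file path: the `attributes` table and the function names are hypothetical, and the attribute names are just the examples above.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) a store with one row per (file, attribute) pair."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attributes ("
        "path TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL, "
        "PRIMARY KEY (path, key))"
    )
    return conn


def set_attribute(conn, path, key, value):
    """Attach or overwrite one attribute on a file."""
    conn.execute(
        "INSERT OR REPLACE INTO attributes VALUES (?, ?, ?)",
        (path, key, value),
    )
    conn.commit()


def find_by_attributes(conn, wanted):
    """Return paths that match every key/value pair in the `wanted` dict."""
    if not wanted:
        return []
    clauses = " OR ".join("(key = ? AND value = ?)" for _ in wanted)
    params = [item for pair in wanted.items() for item in pair]
    # (path, key) is unique, so a path matching all pairs has exactly
    # len(wanted) matching rows.
    sql = (
        f"SELECT path FROM attributes WHERE {clauses} "
        "GROUP BY path HAVING COUNT(*) = ? ORDER BY path"
    )
    return [row[0] for row in conn.execute(sql, params + [len(wanted)])]
```

The catch is the same one a DMS has: somebody still has to enter the attributes, and they go stale if a file is moved or renamed outside the tool.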

    Thanks anyway guys.


  • Registered Users Posts: 2,031 ✭✭✭colm_c


    For the Google end of things, they removed the indexing of file shares because documents were being indexed that shouldn't be, and it was eating up people's local hard drives.

    If you're interested in an Enterprise Google solution, you may want to look into a Google Search Appliance. I've implemented a few, and the results are quite impressive.

    The whole appliance is a 2U YELLOW server which sits on your LAN, configurable via a web interface.

    It'll pretty much crawl anything you can throw at it - fileshares, websites, databases even.

    You can throw back pretty much any information found from the file to the frontend, last modified, path, user etc.

    It looks to me that this is going to be more of an intranet than a fileshare. If the structure in the fileshare is logical and each department knows what files they're working on, then it should be pretty easy to find a file.

    Maybe consider having an intranet of some sort on the file-server - even a wiki would be good and has search and version control for content to boot.

