
Designing Large Content Websites

  • 24-07-2002 5:24pm
    #1
    Moderators, Science, Health & Environment Moderators Posts: 8,962 Mod ✭✭✭✭


    Howdy guys. Just wondering what the conventional wisdom on this matter is. I've been toying with different ways of handling large amounts of documents for a website.

    1. Index and content both in the database: an index table plus a content table, with a record for each paragraph or something like that. Each paragraph of a document has an optional title, image, etc.
    2. As above, but with one blob field holding the whole document's content - problems with formatting arise, and I would rather not have the HTML in the blob.
    3. All documents in XML files.
    4. Index table in the database and documents in XML files.

    I think it's a choice between 1 and 4, but not having a huge amount of experience in the matter I'd love to hear alternatives or endorsements etc. Option 1 looks like the best option to me - all the data is in one repository, it can be retrieved in any format using code, and it's very searchable using, say, the indexing powers of SQL Server. Option 4 would be good too, but I wonder how the speed of transforming XML documents - and more importantly searching them - compares to using the database.
    Comments?
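For anyone weighing option 1, here is a minimal sketch of the two-table layout the post describes - an index of documents plus a content table with one record per paragraph. SQLite and all table/column names here are my own illustration (the poster mentions SQL Server), but the shape carries over:

```python
import sqlite3

# In-memory DB purely for illustration; a real site would point at SQL Server etc.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (
    doc_id  INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE paragraph (
    para_id  INTEGER PRIMARY KEY,
    doc_id   INTEGER NOT NULL REFERENCES document(doc_id),
    position INTEGER NOT NULL,   -- order of the paragraph within its document
    title    TEXT,               -- optional per-paragraph heading
    image    TEXT,               -- optional image path
    body     TEXT NOT NULL       -- plain text, no HTML in the database
);
""")

conn.execute("INSERT INTO document (doc_id, title) VALUES (1, 'Sample doc')")
conn.executemany(
    "INSERT INTO paragraph (doc_id, position, title, body) VALUES (?, ?, ?, ?)",
    [(1, 1, "Intro", "First paragraph."), (1, 2, None, "Second paragraph.")],
)

# Reassemble the document in paragraph order; presentation (HTML or otherwise)
# is applied by code at retrieval time, keeping markup out of the data.
rows = conn.execute(
    "SELECT title, body FROM paragraph WHERE doc_id = 1 ORDER BY position"
).fetchall()
for title, body in rows:
    print(title or "(untitled)", "-", body)
```

Because each paragraph is its own row, search results can point at the exact paragraph that matched rather than just the document.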


Comments

  • Registered Users Posts: 500 ✭✭✭Nuphor


    Well, probably the easiest (and most configurable) way of building a site with a large amount of content is to use a content management system such as PostNuke.

    All manner of highly configurable plugins and mods are available, and the system uses a MySQL backend, which makes it wicked fast.

    Installation is a cinch too.

    Hope this helps...


  • Registered Users Posts: 7,412 ✭✭✭jmcc


    Originally posted by musician

    1. Index and content both in the database: an index table plus a content table, with a record for each paragraph or something like that. Each paragraph of a document has an optional title, image, etc.
    2. As above, but with one blob field holding the whole document's content - problems with formatting arise, and I would rather not have the HTML in the blob.
    3. All documents in XML files.
    4. Index table in the database and documents in XML files.

    If the data does not change on a continual (daily/weekly) basis, then static webpages with a database backend may be the best bet, as you would only regenerate the complete site monthly or so. New material could be added by updating only the indices and the relevant document.

    Generating the pages dynamically is processor intensive, especially with a big database. The performance of the site will also degrade when you have a lot of users.

    I don't know how many documents you are talking about, so the solution here is fairly nebulous. For what it's worth, from experience: WhoisIreland.com has about 60,000 webpages on public access and about another 70,000 on private access, and a database backend plus static HTML is used to publish the site. Two separate search engines then index all these pages (public/private) and make them searchable.
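The publish-from-the-database approach described above can be sketched as a small batch job that regenerates flat HTML whenever content changes. SQLite, the schema, and the file layout here are my own illustration, not the actual setup behind WhoisIreland.com:

```python
import sqlite3
from pathlib import Path

OUTPUT_DIR = Path("site")  # where the regenerated static pages land


def publish(conn: sqlite3.Connection, out_dir: Path = OUTPUT_DIR) -> int:
    """Write one static HTML file per document row; return the page count.

    Run as a cron/batch job, the web server then serves only flat files,
    so request-time load is independent of database size and traffic.
    """
    out_dir.mkdir(exist_ok=True)
    count = 0
    for doc_id, title, body in conn.execute(
        "SELECT doc_id, title, body FROM document"
    ):
        html = (
            f"<html><head><title>{title}</title></head>"
            f"<body>{body}</body></html>"
        )
        (out_dir / f"doc_{doc_id}.html").write_text(html)
        count += 1
    return count


# Demo with a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (doc_id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.execute("INSERT INTO document VALUES (1, 'Hello', 'Static page body')")
print(publish(conn), "page(s) written")
```

Incremental updates fall out naturally: publish only the documents whose rows changed, and let an external indexer crawl the flat files for search.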

    Regards...jmcc


  • Moderators, Science, Health & Environment Moderators Posts: 8,962 Mod ✭✭✭✭mewso


    Thanks for the replies, guys, but I've just realised I will probably have to go with the database-only solution. The reason is that the authorised authors will be doing this over our intranet, and they won't necessarily have direct access to the web site, which is blocked by a firewall. They will, however, have access to the SQL Server that the website uses, as the intranet uses the same one. So that kind of forces my decision.

