Another database design question

Kai · 07-07-2005 02:40PM #1

Howdy

Ive been given the task to build a request
website including the database backend. Im having some problems
deciding on the best design for the DB. As im quite new to this i'd
like to learn the best / most efficient way to handle each of the
requests.

The basic brief is for a login based site that allows users to input
requests using forms. There are about 20 - 30 different types of
request each with different options or inputs ( dropdowns,
freetext...). These request are obviously stored in a database to be
processed by an Administrator later on.

Now this is probably a very common type of site, but im having
problems figuring out how to store each of the requests as each are
different in some way. Should i have 1 large table with lots of
columns which can catch all the different types of submitted forms ?
Or should i break each report down and store different parts of it in
different tables and then somehow join them all back up when the
Administrator comes to Process them. How exactly would i do that ?

Anyone have any ideas ? As i said im new to this but i probably could
get it to work using a massive table but it doesnt feel like the best
way to do it, especially when you consider that new request types will
have to be added in the future. Id prefer to learn about the Correct
way to build this type of DB as ill more than likely have to build
something similar in future.

Thanks

Tom Dunne · 07-07-2005 03:49PM

You would probably have a table for users, request types and the actual requests. They can be linked together with primary key/foreign key relationships - e.g. Request number

Kai · 07-07-2005 04:12PM

tom dunne wrote:

You would probably have a table for users, request types and the actual requests. They can be linked together with primary key/foreign key relationships - e.g. Request number

Yea i thought of that Tom, but the problem is that there are 20 different types of requests, so each one has slightly different storage needs. for example 1 might have 3 dropdowns and 3 free text boxes where another will have only 1 dropdown and 1 freetext. There are several different input types such as multiple selected items in a listbox.

How would i store different types in 1 table. The only way i can see that working is to use string columns and then store the inputs as comma delimited values or something and then have the site Parse this back into a form that makes sense but this still feels like a messy way to do things.

BizzyC · 07-07-2005 04:22PM

Maybe if you gave more details about what kind of request you mean.
If you're new to DB design, I assume you haven't looked into using stored procedures and functions yet.

bonkey · 07-07-2005 04:22PM

How would i store different types in 1 table

With different columns, each allowing NULL values. If you can come up with a table-structure capable of holding input from *any* of your 20 "screen types", then all you do is not populate (i.e. set to NULL) the non-required fields. Ideally store an additional value which tells you which "screen type" a record belongs to, et voila.

Of course, if your 20 screen types are different enough that this becomes too unwieldy, then you'll know you shouldn't have been trying to put the data all together

TBH, it sounds like you should read up on relational design. Its not rocket-science to get the basics, and unless you've a really good reason to its not a design that you should readily move away from.

Tom Dunne · 07-07-2005 04:32PM

I would agree with Bonkey - start off with 1 table, with several fields that can be null and take it from there. It's a trade-off between manageability vs. complexity. The design of any database/application is always iterative - the first couple of attempts are never going to be the most efficient.

Also, give us some concrete examples.

Zaph0d · 07-07-2005 05:22PM

I've done this on several occasions. It's very common.
You see this data model needed to store survey results, exams, service requests and many other situations. This type of data has the characteristic that its schema is unpredictable.

There are three main approaches. You have to decide which is suitable for you.

You say that there are 20-30 service request types each with varying fields.

Option One : One Big Table
Bonkey's suggestion is to use a single table with every possible attribute for all service requests. Then NULLs are used for N/A values. This is simple and good for querying. Fast and easy.

It has the following drawbacks:
You need to redesign the table every time a new service request is added.
Your table may be huge with loads of NULL attributes in every row. SQL Server can store 1024 columns per row and each row must be less than 8K.
What happens if an attribute is required to hold multiple data values? Now you have to stoop to something filthy and perverse like this:

How would i store different types in 1 table. The only way i can see that working is to use string columns and then store the inputs as comma delimited values or something and then have the site Parse this back into a form that makes sense but this still feels like a messy way to do things.

Storing multiple values in a single attribute breaks first normal form. It means you can't index the individual values or use the values in queries without dipping into substrings.

Option Two: Field Name-Value pairs
You start by making tables that describe the service request. You need a header and a detail table. The header table contains one row for each type of service request your system can handle. It stores fixed info about the service request such as the name and the date created. The detail table contains a row for each attribute of the service request type. Each row has a foreign key pointing at the header table and also stores info like the attribute name, the datatype, the max length etc etc. So you have two tables that store metadata describing the types of request your system can accept.

Now you need header and detail data tables to store the individual requests. Each row in these tables is an instantiation of the type data in the metadata tables. They have foreign keys pointing back at their metaparents.
The structure of the detail data table includes: ID (primary key), ServiceDetailTypeID (FK), value (varchar(255))

As you need to store every record in the same kind of row, you choose, a big varchar as the data type and then do type conversions whenever you need the real data.

So you have a more flexible approach here but you also have type conversion crappo and you have your old multiple values problem again (you can get around this with another layer of table (a detail detail table).

Also your queries will be wierd because all the data is essentially being stored in one field in the database.

Option Three: Flexible Schema with supertypes & subtypes.
In this method you use the same metadata tables as above and the same header data table. However, you generate a new table for each service request type that stores all the fields needed to fit a request into one row. When you need multiple answers in a field, you need to generate another table that stores these multiple answers.

This design is flexible, allowing new service types to be added. It produces a proper schema that a DBA or developer can understand. It is efficient with space (who cares?)

However, it requires that whenever a new service type is added that a table is dynamically created and added to the schema. This is unusual and would frighten many people. Schema changes are normally carried out when a DB is offline. You might end up with hundreds of tables.

I hope this is some use to you. If I am not clear on anything, just ask.

is_that_so · 07-07-2005 05:27PM

Would do normalisation on all of this first and separate entities into appropriate subsets. Much easier to work with lots of small subsets than a large set.

I agree with Zaph0d on this and would suggest that option 3 would also allow for easier long term maintenance as well as being the most flexible.

ressem · 07-07-2005 11:42PM

I'm not a DB expert by any means, but aren't MS advertising the XML blob type and xQuery heavily in Sql server 2005? And Oracle implementing similar.

Might just be for clueless newbies like myself who never covered Codd in school but the intention is that it's suitable for document-centric projects like this appears to be.

Given that the database proposed
seems to be oriented towards presenting info to a small number of consumers,
and the number of fields in a ticket tracking system, required by automated processes for searching/ordering/indexing might be low, say; ticketID, status, dateopen, dateclosed, current owner, opened by, last event (your supertype data).

might not an XML file stored in a TEXT or Blob field be appropriate for the majority of subtype data?
using XSLTs to present this file appropriately for each specified viewer
be a good-enough solution?
and XSDs to describe the schemas.

Admittedly we're waiting on a standard query language for the XML field type.

Having said that, any references (on or offline) on implementing option3 correctly would be appreciated.

Zaph0d · 08-07-2005 10:51AM

Well I'm not an XML expert and there is a tendency for nerds to advocate the technology they understand and denigrate alternative methods often as a cover for their own ignorance.

That said, I don't recommend XML for data storage ever.

XML has been touted over the last few years as the new technology to solve all your problems. Every new programming tool now seems to offer 'XML support' in the same way that washing powders went 'biological'. Naturally programmers with one eye on their future careers, tend to reach for XML as a solution to whatever they are doing.

XML is a data interchange format like CSV or fixed width columns. Now, you wouldn't design a database with the data stored in CSV text files, would you? If you keep your data in text files you have a problem if someone moves or deletes one of these files, so you need to manage the file system somehow. Your data is not indexed. If one of your files is altered by someone using notepad, then what record do you have of these changes? Bye bye transactions concurrency security logging performance...

But what about the benefits of XML? The human readability, the schema flexibility, the self describing integration of data and metadata, the multiplatform whatchamacallit. In fact, this is all bollocks. Fixed width text files are human readable. And they come with optional integrated metadata (column names in the first line). And they have the flexility to have any number of columns. And they are acceptable to many system architectures.

XML does allow you to show hierarchical tree structures in a single document. It's very verbose but this feature makes it an improvement on other text formats like CSV. On the other hand, parsing an XML file is harder than parsing CSV or fixed width.

ooh ooh what about you store the xml documents in the database, like in huge text fields? Then you could use special tools to extract the data values from those huge text clumps. And other special tools to put the data in. And others to update and delete it. What about we shove food up our asses and crap out of our mouths?

XQuery? I've got an ODBC driver for text files. It didn't make me give up storing data in a database.

XML is a buzzword. A text interchange format. No more and no less.

Maybe I am missing the point and XML text files are a really smart place to keep your data.

ressem · 08-07-2005 01:19PM

Ouch burn,

and pretty accurate.

Wasn't being sarcastic or trying to denigrate Option 3, obviously I'm in no position to.
Like OP Would really like info on where to find best practice guidelines, as I do before working with anything with which I'm unfamiliar.

So I don't attempt to create a DB in future implementing dynamic nested tables as a binary tree structure, as found on the 'google is my friend' channel, for my next hobby project.

(And I know the other opinion, that I don't know enough and I should demand that employer fires me and hires someone with an advanced DB qualification to do the telephony project, please leave aside for the moment)

For example O'Reilly lists 9 books related to DB design. The 'for mortals' one doesn't cover it.
Looking at "Designing Effective Database Systems" which talks about dimensionless databases, dimension tables,star schemas, snowflaking none of which I'd heard of before this thread made me aware of the gap in my knowledge.

Are these the terms I should be researching before such a setup is demanded of me? Or wrong route altogether?

(digging hole deeper)
Disagree that XML is harder to parse than CVS. Agree human readability not the issue.

Special Tools? Does it count when they're integrated into the DB engine like
SQL>SELECT XDBUriTYPE(any_path).getXML()
2 FROM resource_view
3 WHERE under_path(res,'/CD')=1
4 AND extractValue(res,'/Resource/ContentType')='text/xml'
(oracle 9i DR2 feature, borrowed from oracle jan/feb mag 2003 of resident OCP)

Oracle Jul/aug 2002 p106.
suggests that storing your data as native XML can be justified if you are publishing the data in XML most of the time, with rare relational access.

We use voiceXML, and webui for user interfaces, so perhaps we're the special case.

(Isn't XML old enough though to rule me out as a bandwagon jumper? Not as if I was yelling oodbms and hibernate will solve all + laundry)

Zaph0d · 08-07-2005 02:12PM

ressem wrote:

Wasn't being sarcastic or trying to denigrate Option 3, obviously I'm in no position to.

I wasn't suggesting that the three options I listed are the best - they're just the ones I'm familiar with. They are all slightly unsatisfying so if anyone knows another way I'd like to hear it.

For example O'Reilly lists 9 books related to DB design. The 'for mortals' one doesn't cover it.

I haven't read this but I'm told it's very basic.

Looking at "Designing Effective Database Systems" which talks about dimensionless databases, dimension tables,star schemas, snowflaking none of which I'd heard of before this thread made me aware of the gap in my knowledge.

these are terms from data warehouse design (OLAP - advanced denormalised reporting systems) and don't apply to standard OLTP databases.

If you want to learn about databases, first learn about ER modelling. There are about 5 steps to this process and any twit can do it. You don't need a book just look it up on the web. This teaches you how to get from an english description of the system you are modelling to a logical design and from there to a physical design (tables and attributes etc). Then you need to learn SQL which is a continuation of the abstract alegbra you did in school: Venn diagrams and the like. The basics are easy and later you have to learn some tricks for more complex queries. Joe Celko has written various popular books on SQL techniques.

Most databases you need to design have been designed by someone else before. So it's worthwhile finding out these design patterns rather than badly reinventing the wheel on every project. There aren't a lot of books on db design patterns. The most common are the data model reource book by Silverston. This book comes in two parts: one with data models common to all industries eg customers invoices etc. The other describes data models specific to a number of industries eg insurance policies, e-commerce shopping carts, bank accounts etc. The models are all logical rather than physical and you have to pay a huge amount of $ if you want his physical designs.

Advanced SQL also follows patterns. eg code to find rows that are in one table but not in another. or code to remove duplicates from a table. Once you twig these everything becomes easier. I don't know a book that tells you this stuff. Maybe celko?

Then you need to know how SQL maps to OO classes and UML diagrams. SQL for Smarties by Celko (again) covers this.

Also you should find out about temporal databases and how time effects data models. All real world databases are temporal so this is very important.

(oracle 9i DR2 feature, borrowed from oracle jan/feb mag 2003 of resident OCP)

just because Oracle offers you a feature doesn't mean that it's healthy or good for you. What would you say to a whore who offered a no condom service?

Zaph0d · 11-07-2005 01:11AM

ressem wrote:

Oracle Jul/aug 2002 p106.
suggests that storing your data as native XML can be justified if you are publishing the data in XML most of the time, with rare relational access.

We use voiceXML, and webui for user interfaces, so perhaps we're the special case.

voiceXML and HTML are fairly similar. One is a markup language for text interaction with a client and the other for voice interaction. Would you store HTML pages in a database? I've never seen this done.

Databases are used to store data. You store it in the most generalised way without reference to the presentation of the data. Then you can attach a reporting server or a web server and generate web pages based on querying this data. Or equally you could attach a vxml server and reuse the same data to present information over the phone. Once the data is updated on the database, all downstream systems receive the updates.

Another database design question

Comments