
distributed database architecture for a large database

  • 26-06-2003 6:11pm
    #1
    Registered Users Posts: 944 ✭✭✭


    I've got an interesting problem for you all. I'd love to hear what you think.

    I've simplified the database design of my web application to its root problem, so it'll be very easy to see my difficulty, and if you're feeling generous possibly offer a solution.

    I have two tables.

    member table (circa 1,000,000 rows - 100 meg - easily fit in a single table)
    username varchar(30) pk
    password varchar(30)
    settings varchar(30)


    member_log table (circa 3,000,000,000 rows - a few 100 gigs - a few 100 very simple updates and selects a second, with some quite long selects every minute or so - when the update is being done it needs to select the settings for the user from the member table before it does the update to the member_log table)

    logid medint pk
    fk_username varchar(30) fk
    description varchar(200)

    Now ehm … what's the most efficient way of doing this?

    What I would like to do is have a copy of the member table on every server, then break up the member_log table based on the username and spread it across multiple servers:

    database server a

    full member table
    1/4 member_log table

    database server b

    full member table
    1/4 member_log table

    database server c

    full member table
    1/4 member_log table

    database server d

    full member table
    1/4 member_log table

    In the future, if the servers start to slow down then I'll just add

    database server e

    full member table
    a share of the member_log table

    Well that's what I'd like to do, but I don't know how to do this.

    My main problem is keeping the full member table in sync.

    I can't use replication, because I only want to keep one table in sync, not the whole database.

    So I don't know what to do. How do I do this, and do it efficiently?
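
    Just to show the kind of routing I have in mind, here's a minimal PHP sketch (purely illustrative; the hostnames are made up and the hash choice is just an example):

    <?php
    // Purely illustrative: pick which of the four log servers holds a
    // given member's slice of member_log, by hashing the username.
    $logServers = array(
        'db-a.example.com',
        'db-b.example.com',
        'db-c.example.com',
        'db-d.example.com',
    );

    function pickLogServer($username, $servers)
    {
        // abs() guards against crc32() returning a negative int on 32-bit builds
        $index = abs(crc32($username)) % count($servers);
        return $servers[$index];
    }

    $host = pickLogServer('some_user', $logServers);
    // ...connect to $host and run the member_log update there,
    // after reading the settings from the local copy of member.
    ?>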


Comments

  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    Originally posted by nahdoic
    I can't use replication, because I only want to keep one table in sync, not the whole database.

    Depends on your DB system. I know that MSSQL can synchronise individual tables with no problems.

    (An alternate solution would be to update all databases from the client whenever this data is changed, running some form of distributed transaction, rather than relying on DB-to-DB synching. Not very pretty though.)
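
    Something like this crude PHP sketch is roughly what I mean by pushing the change from the client (hostnames and credentials are invented, and a real version would need proper error handling or an actual distributed transaction):

    <?php
    // Crude client-side fan-out: push the same member change to every
    // server that holds a copy of the table. Hostnames/credentials invented.
    $memberServers = array('db-a.example.com', 'db-b.example.com',
                           'db-c.example.com', 'db-d.example.com');

    function updateSettingsEverywhere($servers, $username, $settings)
    {
        foreach ($servers as $host) {
            $link = mysqli_connect($host, 'appuser', 'secret', 'mydb');
            if (!$link) {
                // A real version needs to queue/retry or roll the others back,
                // which is exactly why this isn't very pretty.
                continue;
            }
            $sql = sprintf("UPDATE member SET settings = '%s' WHERE username = '%s'",
                           mysqli_real_escape_string($link, $settings),
                           mysqli_real_escape_string($link, $username));
            mysqli_query($link, $sql);
            mysqli_close($link);
        }
    }
    ?>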

    Your partitioning idea raises questions...are you going to cluster the data back into a single system in some way, or will you simply be able to figure out which server will carry the data you're looking for in advance so you can route the query appropriately from the controlling software (presumably the web server)?

    Do you have any existing hardware, or is this planned as a "from-scratch" design? And if so, are there technology/budgetary considerations?

    jc


  • Registered Users Posts: 944 ✭✭✭nahdoic


    hi,

    Thx for replying.

    Depends on your DB system. I know that MSSQL can synchronise individual tables with no problems.

    Yeah, actually even MySQL can do this, but it was hidden away.
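
    For anyone else looking at this, the slave-side replication filters are the bit that was hidden away; something along these lines in my.cnf (the database and table names are just an example):

    # on the slave: replicate only the member table, ignore everything else
    replicate-do-table = mydb.member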

    (An alternate solution would be to update all databases from the client whenever this data is changed, running some form of distributed transaction, rather than relying on DB-to-DB synching. Not very pretty though.)

    That would work alright, but ughhhh, as you say not pretty :)

    Your partitioning idea raises questions...are you going to cluster the data back into a single system in some way, or will you simply be able to figure out which server will carry the data you're looking for in advance so you can route the query appropriately from the controlling software (presumably the web server)?

    Yeah, interesting question. It actually got me thinking: I don't need the full member table on each server.

    I'll just break my member table up into two tables.

    member_main table
    username varchar(30) pk
    password varchar(30)
    log_server varchar(30)*

    * log_server will know which server to look at when doing reports, so I can route the query appropriately

    member_settings table
    username varchar(30) pk
    settings varchar(30)

    member_log table
    logid medint pk
    fk_username varchar(30) fk
    description varchar(200)

    Then I could distribute it across the servers like this:

    database server a

    full member_main table

    database server b

    1/4 member_settings table
    1/4 member_log table

    database server c

    1/4 member_settings table
    1/4 member_log table

    database server d

    1/4 member_settings table
    1/4 member_log table

    database server e

    1/4 member_settings table
    1/4 member_log table

    Yes, that would work quite nicely, be uber efficient, and be very manageable and scalable.
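
    Roughly how I picture the routing working under that layout (a quick PHP sketch, with invented hostnames and credentials, just to make the two-step lookup concrete):

    <?php
    // Step 1: on the central server (a), find which server holds this
    // member's settings and log data.
    $username = 'some_user'; // whoever is logged in
    $main = mysqli_connect('db-a.example.com', 'appuser', 'secret', 'mydb');
    $u    = mysqli_real_escape_string($main, $username);
    $res  = mysqli_query($main,
            "SELECT log_server FROM member_main WHERE username = '$u'");
    $row  = mysqli_fetch_assoc($res);

    // Step 2: connect to that server for member_settings and member_log work.
    $logDb = mysqli_connect($row['log_server'], 'appuser', 'secret', 'mydb');
    $res   = mysqli_query($logDb,
             "SELECT settings FROM member_settings WHERE username = '$u'");
    $settings = mysqli_fetch_assoc($res);
    // ...then do the member_log update on $logDb using those settings.
    ?>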

    Do you have any existing hardware, or is this planned as a "from-scratch" design? And if so, are there technology/budgetary considerations?

    I do have existing hardware, and there is a current system running on a single database. But I'm doing a "from-scratch" design now, to facilitate the future demand. It's growing at a fantastically consistent pace, and when I introduce my new services I'm going to need multiple servers to handle all the data.

    The margins are tight, it's a very competitive market so it has to be the most efficient and cost effective way I can make it. Leaving out its size, it's a very simple database, with very simple updates and selects. No need for stored procedures, sub-selects, transactions. Speed and efficiency are the two most crucial factors, so I'm probably leaning towards MySQL. It's currently running on MySQL and I'm extremely impressed with how it's performing.

    Any more thoughts and ideas please send them on, this is very much the 'design' stage. :)


  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    The one thing I'd be really concerned about is the performance of MySQL on a database of that size. I'll admit to not being totally au fait with it, but I would always have been of the impression that it was not really designed to scale up to multi-GB tables (which seems to be what you're looking at here).

    Also, I would be inclined to not use a varchar(30) field as the primary key. I would be more inclined to make it a Unique field (but not a PK) on your "small" table, and have a numeric PK for the linking. I could easily be wrong, but I seem to recall that most DB systems perform numeric comparisons more efficiently than string-comparisons, which may be of benefit in your joins (and queries in general).
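
    Roughly what I mean, using the original layout (the new field names are only an example):

    member table (suggested)
    memberid int auto_increment pk
    username varchar(30) unique
    password varchar(30)
    settings varchar(30)

    member_log table (suggested)
    logid medint pk
    fk_memberid int fk
    description varchar(200)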

    jc


  • Registered Users Posts: 491 ✭✭flav0rflav


    You've already made the design decision to divide it up? Why?

    Because of the size of it? Or because you think it won't be fast enough?

    Have you done any performance measurements? I'd expect modern DBs to be independent of table size for the simple operations you describe. No?


  • Registered Users Posts: 944 ✭✭✭nahdoic


    The one thing I'd be really concerned about is the performance of MySQL on a database of that size. I'll admit to not being totally au fait with it, but I would always have been of the impression that it was not really designed to scale up to multi-GB tables (which seems to be what you're looking at here).

    You're right, it generally isn't. It can handle gigabyte tables grand; the problem comes down to the underlying OS. On Unix this generally means a single table file is limited to 2GB or 4GB, depending on the filesystem.

    But again that's fine, I already knew about this, and I'll just be breaking up the member_log table on each server into member_log_1, member_log_2 ... member_log_10, so I'll be keeping each table at around a gigabyte.

    It is a pretty interesting situation, whereby the database can be partitioned like this.
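
    Picking the sub-table could be as simple as something like this (illustrative PHP only; the split into ten and the hash choice are just examples):

    <?php
    // Illustrative only: keep each physical table around a gigabyte by
    // splitting member_log into member_log_1 .. member_log_10 per server.
    function logTableFor($username, $pieces = 10)
    {
        $n = (abs(crc32($username)) % $pieces) + 1;   // 1..10
        return 'member_log_' . $n;
    }

    // e.g. the insert for this member would go to something like:
    // INSERT INTO member_log_7 (fk_username, description) VALUES (...)
    echo logTableFor('some_user');
    ?>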

    Also, I would be inclined to not use a varchar(30) field as the primary key. I would be more inclined to make it a Unique field (but not a PK) on your "small" table, and have a numeric PK for the linking. I could easily be wrong, but I seem to recall that most DB systems perform numeric comparisons more efficiently than string-comparisons, which may be of benefit in your joins (and queries in general).

    You'd be right again. This is the one thing I have to keep, though, because of the old system. But with a full index cache of the primary keys, I couldn't notice any performance difference between using an int and a varchar.


  • Registered Users Posts: 944 ✭✭✭nahdoic


    because you think it won't be fast enough?

    Yep, that's the numero uno.

    Have you done any performance measurements?

    No, I haven't done any performance measurements across different operating systems and DBMSs; that's no easy undertaking. I could be very silly, but I'm looking at my current system and at how incredibly fast and efficiently it's handling the current database size. I've slammed it with Apache Bench and it can handle 60,000,000 updates and 20,000,000 selects a day before it starts to slow down.

    If I can recreate that on multiple servers, then my cost margins are going to be very low and I could make a real success of this.

    I'd expect modern DBs to be independent of table size for the simple operations you describe. No?

    With a major performance hit, yes. And it would certainly be easier to maintain that way as well.

    But it is a balancing act between maintenance, cost and efficiency.


  • Registered Users Posts: 3,316 ✭✭✭ButcherOfNog


    Originally posted by nahdoic
    member table (circa 1,000,000 rows - 100 meg - easily fit in a single table)
    username varchar(30) pk
    password varchar(30)
    settings varchar(30)

    member_log table (circa 3,000,000,000 rows - a few 100 gigs - a few 100 very simple updates and selects a second, with some quite long selects every minute or so - when the update is being done it needs to select the settings for the user from the member table before it does the update to the member_log table)

    A DB Cluster is what you want but prolly can't afford :)

    Until you do some tests you'll not know if the overhead of spreading out the tables and trying to keep the members table in sync will give you any sort of benefit; it may even leave you worse off.

    If you go for your solution, then you'd need a table that would cross-reference members with servers, I'd presume. Reading what yer at tho, the members table doesn't and shouldn't really cause any sort of problem, as it's big, but not overly so. I can't see any reason to have it sitting on a number of different servers. The activity on this table is probably pretty simple and quick.

    The member_log table tho is the problem: a massive table with some long slow queries on it. Do you need every single record in this table for those queries? Could some data not be archived out periodically? I'd guess that hitting this table and the slow queries on it affect overall performance, so do farm it out onto a different server from the rest of the DB. Other than that, try it and test it; if it's no good, try and test summit else.


  • Registered Users Posts: 944 ✭✭✭nahdoic


    A DB Cluster is what you want but prolly can't afford

    I’ll have one database cluster to go please.

    Until you do some tests you'll not know if the overhead of spreading out the tables and trying to keep the members table in sync will give you any sort of benefit; it may even leave you worse off.

    Well, I won't be keeping the members table in sync anymore (courtesy of your next suggestion). Spreading out the member_log table is the only option, and it has to be done, somehow, whether an advanced DBMS handles it or I develop my own code to manage it.

    If you go for your solution, then you'd need a table that would cross-reference members with servers, I'd presume. Reading what yer at tho, the members table doesn't and shouldn't really cause any sort of problem, as it's big, but not overly so. I can't see any reason to have it sitting on a number of different servers. The activity on this table is probably pretty simple and quick.

    Yep, you're dead right. I posted the same problem on another couple of forums and the MySQL mailing list, and that's one thing that I've decided: the members table is definitely just going to be kept on one server.

    The member_log table tho is the problem: a massive table with some long slow queries on it. Do you need every single record in this table for those queries? Could some data not be archived out periodically? I'd guess that hitting this table and the slow queries on it affect overall performance, so do farm it out onto a different server from the rest of the DB. Other than that, try it and test it; if it's no good, try and test summit else.

    Good point. But the way the log is kept, each member basically has a certain amount of space that they can use up, be it 100 rows or a million rows; they'll be paying for that space. Once you get your 101st hit, and you only have space for 100 hits, it'll write over your 1st one. This is the way the system works now and it works very well.
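
    Roughly how the overwrite might look (an illustrative sketch only; the quota handling is simplified):

    <?php
    // Illustrative sketch of the overwrite: once a member has used their
    // quota of rows, drop their oldest entry before inserting the new one.
    function addLogEntry($db, $username, $description, $quota)
    {
        $u = mysqli_real_escape_string($db, $username);
        $d = mysqli_real_escape_string($db, $description);

        $res = mysqli_query($db,
            "SELECT COUNT(*) AS c FROM member_log WHERE fk_username = '$u'");
        $row = mysqli_fetch_assoc($res);

        if ($row['c'] >= $quota) {
            // Make room by removing this member's oldest entry (lowest logid).
            mysqli_query($db,
                "DELETE FROM member_log
                 WHERE fk_username = '$u'
                 ORDER BY logid ASC LIMIT 1");
        }
        mysqli_query($db,
            "INSERT INTO member_log (fk_username, description)
             VALUES ('$u', '$d')");
    }
    ?>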

    I’ve pretty much decided how I’m going to handle it. It’s going to use PHP and MySQL.

    I’m developing this single Database class in PHP that will handle all of this work transparently for me, so I really don’t need to worry about it. And the code will be quite portable and maintainable.

    This class will handle two concurrent connections, one to the member table and one to the appropriate log table. All I need to supply to the Database class will be the “username” of the user, and the “project” currently being looked at.

    So when I call my functions like

    createNewProject($projectName, $projectType, $projectSettings)

    the database class will handle all the complexity for me. It will run a select to see what type of user that person is and what sort of project they are making. It will then create the new project on the appropriate log server, and create the appropriate entry on the member server.

    Then for running the reports I’ll have equally simple methods in the database class like

    returnSummary($projectName, $startDate, $endDate)

    This will keep all my SQL commands in one file in one class, and it means that, in the future, should I be in the enviable position of being able to afford a DB cluster, the code and system should be very easy to port across.
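
    Stripped right down, the bare bones of the class might look something like this (hostnames, credentials and the SQL details are placeholders, and the method bodies are just comments for now):

    <?php
    // Bare-bones skeleton of the planned Database class: one connection to
    // the member server, one to whichever server holds this user's log data.
    // Hostnames, credentials and the project SQL are placeholders.
    class Database
    {
        public $memberDb;   // central member server
        public $logDb;      // this user's log server
        public $username;

        public function __construct($username)
        {
            $this->username = $username;
            $this->memberDb = mysqli_connect('db-a.example.com', 'appuser', 'secret', 'mydb');

            $u   = mysqli_real_escape_string($this->memberDb, $username);
            $res = mysqli_query($this->memberDb,
                   "SELECT log_server FROM member_main WHERE username = '$u'");
            $row = mysqli_fetch_assoc($res);

            $this->logDb = mysqli_connect($row['log_server'], 'appuser', 'secret', 'mydb');
        }

        public function createNewProject($projectName, $projectType, $projectSettings)
        {
            // look up the user's type/settings on $this->memberDb, then create
            // the project rows on $this->logDb and the entry on $this->memberDb.
        }

        public function returnSummary($projectName, $startDate, $endDate)
        {
            // run the report query against $this->logDb for the date range.
        }
    }
    ?>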

    The only difficulty would be reconstructing the tables, but I can't see that being a problem; the SQL for it would be very easy, you're just talking about a few unions. The only real issue would be the length of time it would take to move all that into a new DBMS, but that's a long way away.

    I'm actually very excited about this project now; it's going to be a lot of fun, and I've got the bare bones of it together. You should see all the plans I've done for it: a full 80 pages of foolscap with more LDMs, flow diagrams and GUI layouts than you could shake a very big stick at.

    It will take a few months to finish fully, but it will be well worth it. It’ll also mean that I’ll be making my living from a pure .com if it pans out. And if the whole thing goes belly up, it’ll have been great experience. Either way I should be able to submit it as my 4th year project.

