
N-Tier Distributed

  • 08-07-2004 2:49pm
    Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    Hi All,

    I'm working for a Progress 4GL house but we're switching to .Net for our next product. I'm pretty good with OO and I think I've got my head wrapped around N-Tier OK, but I'm not too sure of the best method/structure to use for my data layer. It's basically a sales/CRM system (on VB.Net & SQL Server) with reps going around taking orders and making calls on their laptops. I'm basically going to have them connecting to a main database back in HQ via GPRS/VPN. The problem is that there will be times when they're out of coverage. To deal with this I'll obviously have a local db (probably MSDE), but I'm not sure how to design my data layer to deal with it. Should I use the HQ db wherever possible, only using local when needed, and sync whenever they connect to HQ? Should they use local always and have a background service synchronise when it can? Should they use both simultaneously where possible, checking the HQ for updates every now and then? And so on....
    I'm not going to ask anyone to design this for me (but if you want to, feel free :D ). I'm just hoping somebody might know of some articles, tutorials, books or maybe a more specialised forum that deals with the various theories and options behind this kind of thing. I've tried the usual suspects (MSDN, GotDotNet and loads of others) but can't find anything of much use, so any and all tips/pointers/petty bickering etc. are welcome.


    Thanks,
    Steve.


Comments

  • Registered Users Posts: 2,426 ✭✭✭ressem


    The summary of the course
    http://www.microsoft.com/traincert/syllabi/2390APrelim.asp#materials
    describes the topics at which you should be looking, assuming that you can't do the course itself.

    If you choose ado.net...
    Have you read up on datasets yet?

    You interact with the dataset at all times, and write logic as to when to read/save or connect to the remote data source.

    http://www.bridgeport.edu/sed/projects/cs597/Fall_2001/dkiran/chapter3.html
    "
    ...
    The DataSet is persisted in memory, and the data therein can be manipulated and updated independent of the database. When appropriate, the DataSet can then act as a template for updating the central database.
    "

    Or, for code and solutions to basic problems like autoincrement in disconnected solutions:
    http://www.awprofessional.com/articles/article.asp?p=169485&seqNum=3
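
    The autoincrement trick that sort of article deals with usually boils down to something like this in ADO.NET (a sketch; "OrderID" is an example column name):

        ' Hand out *negative* placeholder keys on the client so offline
        ' inserts can never collide with identity values assigned at HQ.
        Sub UsePlaceholderKeys(ByVal orders As DataTable)
            Dim idCol As DataColumn = orders.Columns("OrderID")
            idCol.AutoIncrement = True
            idCol.AutoIncrementSeed = -1    ' first local key is -1
            idCol.AutoIncrementStep = -1    ' then -2, -3, ...
        End Sub
        ' On Update, the InsertCommand can run SELECT SCOPE_IDENTITY() so the
        ' real server-assigned key replaces the placeholder in the row.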


    ---
    And there's the SQL Server-dependent option, merge replication:
    http://builder.com.com/5100-6387-1058880.html

    google for: net disconnected database
    and there's a fair few decent links


  • Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    Thanks, some great starting points there, should keep me busy for a while.
    I should have really thought of using 'Disconnected' as a keyword.

    I've been playing with ADO.Net and datasets for a few weeks now and they definitely seem very useful. I'd prefer not to have to manage dumping their contents to a file if the laptop's being shut down though, even though it seems simple enough using XML (and I still don't trust hibernate fully).
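
    (For the record, dumping a DataSet to disk is only a call each way, and DiffGram mode even keeps the pending row states. The path is just an example:)

        ' Persist pending work across a shutdown. DiffGram mode preserves
        ' row states (Added/Modified/Deleted), so unsent changes survive.
        Sub SaveCache(ByVal ds As DataSet)
            ds.WriteXml("C:\SalesApp\cache.xml", XmlWriteMode.DiffGram)
        End Sub

        ' On the next startup, reload into a DataSet that already has the
        ' same schema (a DiffGram carries data and row states, not schema).
        Sub LoadCache(ByVal ds As DataSet)
            ds.ReadXml("C:\SalesApp\cache.xml", XmlReadMode.DiffGram)
        End Sub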


  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    Originally posted by stevenmu
    I'm basically going to have them connecting to a main database back in HQ via GPRS/VPN. The problem is that there will be times when they're out of coverage. To deal with this I'll obviously have a local db (probably MSDE), but I'm not sure how to design my data layer to deal with it.

    Classic problem, and as such there's the inevitable range of approaches which can be used. Trying to decide between them is never easy, because a lot depends on factors which are probably themselves a bit iffy.

    If you check the Microsoft Practices and Patterns stuff online, they should have an "application block" for this type of stuff, which will make use of locally-cached data (at a guess through writing recordsets to disk as XML) and so on. This can completely remove the need for a local DB, as your data is simply cached locally in files, and outstanding requests/changes to be sent to the main server are held in some form of queue (possibly MSMQ, possibly a "hand-rolled" one).
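
    To make the queue idea concrete, here is a very rough hand-rolled sketch (this is not the application block's actual API, just the shape of the idea):

        Imports System.Data
        Imports System.IO

        ' Pending change sets are serialised to disk as diffgrams and
        ' replayed once a connection comes back.
        Public Class OfflineQueue
            Public Delegate Sub SendHandler(ByVal diffgramXml As String)

            Private _dir As String

            Public Sub New(ByVal dir As String)
                _dir = dir
                Directory.CreateDirectory(dir)   ' no-op if it already exists
            End Sub

            ' Park one pending change set (a DataSet diffgram) for later upload.
            Public Sub Enqueue(ByVal changes As DataSet)
                Dim fileName As String = Path.Combine(_dir, Guid.NewGuid().ToString() & ".xml")
                changes.WriteXml(fileName, XmlWriteMode.DiffGram)
            End Sub

            ' Replay everything when we're back online. Delete only after a
            ' successful send, so a failure just stays queued for next time.
            Public Sub Drain(ByVal send As SendHandler)
                For Each fileName As String In Directory.GetFiles(_dir, "*.xml")
                    Dim reader As New StreamReader(fileName)
                    Dim xml As String = reader.ReadToEnd()
                    reader.Close()
                    send(xml)           ' throws if we're offline
                    File.Delete(fileName)
                Next
            End Sub
        End Class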

    I'd also suggest having a read-up on "Smart Clients", because this is a central issue in Smart Client design, and you'll find some good stuff there.
    Originally posted by stevenmu
    Should I use the HQ db wherever possible,

    I wouldn't be inclined to. If you implement a record-set-caching solution (as very briefly outlined above), then sure...you technically do use the HQ db whenever possible, but only to refresh the cache as needed and to process your queue....you generally don't want to implement a second set of code to handle "direct" connections.

    If you do go with a local DB (it may be a requirement rather than an option, depending on the size of the data, the need for proper RDB functionality for efficient searches over large data, or whatever), then I would generally try to design it so that the app always connects to the local DB, and handle replication separately.
    Originally posted by stevenmu
    only using local when needed, and sync whenever they connect to HQ? Should they use local always and have a background service synchronise when it can?

    Well, if you use a local DB, then I'd agree that what you should probably be doing is running MSDE 2000 on the client, with SQL 2000 on the server. Then you can set up SQL Server's own replication to use merge-replication (by definition 2-way). You can then look at the best options for scheduling this: it can be on demand, scheduled, or whatever. The SQL Server 2000 Books Online explain merge replication quite well, including what's available to handle conflict-resolution issues (i.e. when two separate changes are made to the same data, how do we merge them back?).

    Semi-connected applications like this are becoming a central issue in development, so you should find no shortage of info on it. I know it was a pretty central theme at this year's TechEd (mostly in the form of being a central challenge in the design of so-called Smart Clients).

    Cheers,

    jc


  • Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    I think we are going to need a local DB; the reps all like to be able to run reports showing sales figures, client histories and that kind of thing. I'm sure it would work with XML files, but I'm not so sure how performance would hold up over a few years' worth of data (I'd imagine it's not very good). I think this is one of those times where it's better to stick to the tried and trusted way of doing things.

    I had been leaning towards a local db with a "hand-rolled" message queue, as that's what our current version uses, but that's really a leftover from the days where the only way to connect back was to find a phone line to plug into. I didn't want to trust SQL's automatic replication, but I was just looking through merge-replication when you posted, and with the custom conflict resolvers it looks pretty good (even the defaults are probably enough for most situations).

    As for timing the synchronisation, I think on application open and close would be enough. Currently all the message queues get transmitted up once a night and it's the next night before changes come down to the laptops, so that would be a huge step up. At the rate most of our customers would use the system there would be at most 100 to 200 records coming down on startup, and then only whatever changes were made on close, so performance should be pretty good.

    I'll have a look into smart clients when I get a chance. I have to admit I've heard the term but don't know a thing about them.


    Thanks,
    Steve


  • Registered Users Posts: 2,426 ✭✭✭ressem


    gack... choking on buzzword

    Please don't have potential customers looking for auto-updating applications on Monday morning.

    About saving as XML: I don't see why you would, other than for debug/backup.
    If you really want to be nostalgic, you can save updates as zipped, signed XML for pigeon post :rolleyes:

    A local MSDE database is too cheap to ignore, assuming you've got a licence to embed it.

    Just to add more options:
    You saw the remote data access stuff in the last link?
    A 1-hour howto, if merge replication looks over the top:
    http://www.devbuzz.com/content/eyecandy_01_pg2.asp

    My 2 cents: give 'em a synchronise button, a "last sync completed" display, and a progress bar they can hit for debugging/firewall & proxy checks/to convince them that the app is alive.

    Connect almost silently when the app is started/stopped and a connection is available. Fail politely.

    Usual UI stuff.
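
    Something like this, maybe, for the "fail politely" part (RunSync and the status label are placeholders for whatever actually kicks off the sync, not real APIs):

        ' Try to sync on app start/exit without nagging the rep when the
        ' link is down; the app keeps working against the local MSDE db.
        Sub SyncQuietly(ByVal statusLabel As System.Windows.Forms.Label)
            Try
                statusLabel.Text = "Synchronising..."
                RunSync()   ' placeholder: merge agent, RDA pull, or queue drain
                statusLabel.Text = "Last sync: " & DateTime.Now.ToShortTimeString()
            Catch ex As Exception
                ' Out of coverage, VPN down, firewall in the way: note it quietly.
                statusLabel.Text = "Offline; will retry on next start/exit"
            End Try
        End Sub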


  • Advertisement
  • Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    Originally posted by ressem
    gack... choking on buzzword
    Yeah, I've been reading way too many M$ sites lately.
    Originally posted by ressem
    Please don't have potential customers looking for auto-updating applications on Monday morning.
    There've been too many Mondays where they'd ring up with some catastrophe or other and it would turn out they hadn't bothered to update over the weekend.

    Originally posted by ressem
    My 2 cents: give 'em a synchronise button, a "last sync completed" display, and a progress bar they can hit for debugging/firewall & proxy checks/to convince them that the app is alive.

    Connect almost silently when the app is started/stopped and a connection is available. Fail politely.

    I'll probably give them a button so that they feel good about themselves, but try and do it whenever they start/stop, just to be sure to be sure.


  • Closed Accounts Posts: 8,264 ✭✭✭RicardoSmith


    I've had to use XML as a local datastore in the past and it's not a replacement for a local database. Performance will suffer with XML, it will contain duplicate data, and once you see you're storing duplicate data you know it's not the right approach. I don't see what "wins" XML gives you in this instance.

    Be sure to put an audit trail on the client, at least for replication, because you are sure to have some data issues at least initially. Resolving them without an audit trail on the client/server will be a lot harder. These days it's good practice to do it anyway.

    Interesting little project.


  • Closed Accounts Posts: 79 ✭✭zt


    Are you retaining any element of the Progress system? The Progress database performs very well in the real world, and with the data connection options you could easily use it as the master database. It also allows you to go to UNIX or other platforms if you need to scale.

    I would also suggest that the key to replication is to ensure that only one 'real' copy of the master data exists. This means that if sales people download data, they download a read-only copy. If a sales person creates data, those are new records.

    Two-way updates in a replication scheme are horrible and very complicated. An example would be a customer record that could be updated by multiple users. That opens the door to replication conflicts.

    We designed a similar system recently and used the idea of fragments. These fragments were chunks of data. Read-only copies of certain fragments were available from the central system. We used XML and web services to provide these fragments.

    Remote sites generated XML fragments that were also sent back to the central service.

    The term fragment reflected that the message was a piece of the overall data. The master database was considered the 'reliable' version of the data.

    You should research store-and-forward and XML messaging architectures. These are the common approaches to replication-based systems.
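
    In ADO.NET terms, the sending half of a fragment scheme can be as small as this sketch (SalesService and SubmitFragment are made-up names for whatever web service receives the fragments):

        ' Ship only the changed rows as an XML "fragment" via a web service.
        Sub SendFragment(ByVal ds As DataSet)
            Dim changes As DataSet = ds.GetChanges()
            If Not changes Is Nothing Then
                Dim svc As New SalesService()   ' hypothetical VS-generated web reference
                svc.SubmitFragment(changes)     ' ASMX services can take DataSets directly
                ds.AcceptChanges()              ' mark local rows clean once accepted
            End If
        End Sub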

    Somebody else suggested using a database at the mobile level. If you plan to store any volume of data at the mobile level, or plan to allow queries, this is an absolute must.


  • Registered Users Posts: 2,426 ✭✭✭ressem


    Given that the salespeople want to run report generation tools on disconnected machines, a local database is a must.

    (The alternative of embedding Crystal Dev within your app is ugly IMO.)



    Have used Dataset-generated fragments between remote databases in my stuff in the past (had to avoid MS SQL Server) without surprising issues.

    The SQL merge agent might make a mistake? :eek:
    As opposed to our own server-based fragment updater.

    Replication conflicts are always going to be a pain in any system like this regardless of how it's done.

    But given that we're able to use SQL Server on both ends of the connection, it's worth looking at the benefits it'd bring if it works: possibly reducing traffic, and letting us concentrate on getting the merge rules correct instead of inserting a bottleneck programmed by someone still a bit light on experience with the platform.

    Anyone here used merge-replication in anger?


  • Closed Accounts Posts: 79 ✭✭zt


    Originally posted by ressem

    Replication conflicts are always going to be a pain in any system like this regardless of how it's done.
    It is generally possible, with good design, to avoid replication conflicts.

    When designing this type of architecture, I would divide data into central shared and remote private. Central shared data should be read-only on remote nodes. Remote nodes should only create new data records.
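
    One cheap way to enforce that rule on the client, sketched against an ADO.NET DataTable:

        ' Mark every column of a centrally-owned table read-only on the
        ' remote node; any attempt to edit an existing row then throws.
        Sub MarkCentralTableReadOnly(ByVal centralTable As DataTable)
            For Each col As DataColumn In centralTable.Columns
                col.ReadOnly = True
            Next
        End Sub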

    I am not familiar with the MS replication mechanism, although I have worked with similar database tools. These tools are specifically intended for cloning part or all of a database. I would generally suggest that they are good when you need to distribute an entire database, but fail when the master and slave applications require subsets of data.

    An example might clarify the above: a sales person will generally only need their own customers; they don't require the full customer list. Often database replication tools do not provide for replicating a subset of the data.

    A further issue is the protocol used by the replication tool. Is it a proprietary protocol? Does it require specific server configuration? Does it run securely over the Internet and does it work with common firewalls?

    My own choice for interconnecting the clients and servers would be some type of XML over HTTP. This allows the client and server to change independently in the future, and it could allow third-party integration without a fuss.


  • Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    It seems pretty clear that having local databases is the way to go. I was pretty sure of this, but it's always nice to have it confirmed by others. Seeing as MSDE is free it seems like a pretty good way to go, on the client side anyway. Progress databases do perform quite well and they're very stable, but keeping one on the server side would mean an ODBC connection somewhere along the line, which can cut performance a bit. Progress licensing can also be a bit restrictive and can cost a lot more than SQL Server licences, especially because we want to keep open the option of a web interface as well (allowing our clients' customers to bypass the reps for certain products and place orders directly). Progress reps tend to demand CPU licensing for this, which is pretty expensive. I think I'll stick with SQL Server for now, and when I get that going fine I might try a separate data layer for MySQL or something similar to give a *nix server option, and save a bundle on licensing to boot.

    The main problem with the replication, as zt pointed out, will be if more than one rep changes something like a customer record. Unfortunately we want everybody's database to have all of the information: team leaders like being able to look at everyone else's work, and it makes the laptops interchangeable with a simple change of the rep code in the configuration. From what I've seen, merge-replication can work at a column level, so if one rep changes a customer's phone number and another rep changes his email address, both changes stick. I think changes are date-stamped, so if both reps change a phone number the later one sticks. That seems pretty acceptable to me; the only improvement I can think of would be to give preference to team leaders. This may be possible with custom conflict resolvers (unless that's just a buzz phrase that doesn't mean anything), but if not it's no big deal. Once we get a good-sized database set up and some data entry programs going, we'll test it out and see how it works.

    zt, SQL Server has different types of replication. Some are just the standard clone ones like in other systems, but merge replication seems to be designed with exactly this type of situation in mind. I think it basically works by maintaining an internal audit table and tracking when it was last synchronised; when it connects, it sends up/takes down the audit records from since it last synched. I'd imagine that if it works as advertised it will be more efficient than custom-written stuff, with "if it works" being the biggest part of that.
    Originally posted by RicardoSmith
    Interesting little project.

    It sure is. It's a lot better than just maintaining the old version.


  • Closed Accounts Posts: 79 ✭✭zt


    The Progress ODBC drivers were never great. Don't know if this has improved recently.

    They do seem to be losing existing customers because of their pricing. This is a great shame because resellers were always the most important channel for Progress.

    Have fun.

