How to avoid manually writing/managing SQL

magooly · 26-02-2014 10:05PM #1

Hi guys - My team and I are rapidly developing an Webapp backed by an Oracle DB. We use maven's plugin flyway to manage our db creation and population from INSERT SQL scripts. Typically we add 3-4 tables per sprint and / or modify the existing tables structure.

We model the schema in an external tool that generates the schema including the constraints and run this in first followed by the SQL INSERTs to ensure the integrity of all the data.

We spend too much time managing the changes to the SQL to cover the new tables - by this I mean adding the extra column data to the existing SQL INSERT statements not to mention the manual creation of the new SQL INSERT data particularly when they reference a foreign key.

Surely there is another way, maybe maintaining raw data in Excel and passing this through a parser to the DB. Has anyone any ideas?

10 tables so far and up to 1000 SQL statements, DB is not live so we tear it down on every build.

Thanks

Edit: The inserted data is static reference data the platform depends on to function - menus etc. The architecture is Tomcat, JSF, Spring, JPA(Hibernate), Oracle

fergalr · 26-02-2014 10:52PM

There's probably a really obvious answer here, but is there a reason you don't use an ORM?

I mean, ORMs have their problems definitely, but you mention you are 'rapidly developing a webapp' which is sort of the sweet spot for the tech - and if you are starting to get bogged down in having to manage SQL to deal with minor updates, well, have you considered going down the ORM route?

magooly · 26-02-2014 11:20PM

fergalr wrote: »

There's probably a really obvious answer here, but is there a reason you don't use an ORM?

I mean, ORMs have their problems definitely, but you mention you are 'rapidly developing a webapp' which is sort of the sweet spot for the tech - and if you are starting to get bogged down in having to manage SQL to deal with minor updates, well, have you considered going down the ORM route?

I should carify this question. We use JPA (hibernate as the implementation) within the web app itself but my question is related to the import of static reference data required for the platform to function. This data drives menus, table data and business logic. Thanks Fergair

colm_c · 26-02-2014 11:40PM

Not familiar with hibernate, but a lot of the ORM's have seed mechanism where you can write the data you want to have in your DB in code or at least in configuration files, then it's a lot easier the manage and version control.

magooly · 27-02-2014 12:10AM

colm_c wrote: »

Not familiar with hibernate, but a lot of the ORM's have seed mechanism where you can write the data you want to have in your DB in code or at least in configuration files, then it's a lot easier the manage and version control.

Thanks Colm this is the track Im looking to go down - seeding is believing. This is plenty to go on cheers.

ChRoMe · 27-02-2014 01:00AM

magooly wrote: »

Thanks Colm this is the track Im looking to go down - seeding is believing. This is plenty to go on cheers.

Appfuse has an exact sample of what you want using the dbunit maven plugin

TrueDub · 27-02-2014 12:41PM

+1 for dbunit

moycullen14 · 27-02-2014 05:27PM

magooly wrote: »

Hi guys - My team and I are rapidly developing an Webapp backed by an Oracle DB. We use maven's plugin flyway to manage our db creation and population from INSERT SQL scripts. Typically we add 3-4 tables per sprint and / or modify the existing tables structure.

We model the schema in an external tool that generates the schema including the constraints and run this in first followed by the SQL INSERTs to ensure the integrity of all the data.

We spend too much time managing the changes to the SQL to cover the new tables - by this I mean adding the extra column data to the existing SQL INSERT statements not to mention the manual creation of the new SQL INSERT data particularly when they reference a foreign key.

Surely there is another way, maybe maintaining raw data in Excel and passing this through a parser to the DB. Has anyone any ideas?

10 tables so far and up to 1000 SQL statements, DB is not live so we tear it down on every build.

Thanks

Edit: The inserted data is static reference data the platform depends on to function - menus etc. The architecture is Tomcat, JSF, Spring, JPA(Hibernate), Oracle

For development, gibemn that you're using hibernate, using the

hbm2ddl.auto=update

will work fine.

This takes the existing schema and modifies it to match the current object model. Note that it is non-destructive (it doesn't delete anything) so, after a while, you end up with a schema that is well out of date, but for development it's fine.

In a production system, things get a lot trickier and you will either have to:

make the changes to an existing schema
or create a new schema (
```
hbm2ddl.auto=create
```
and copy the data from the old schema to the new one.

You might want to look at something like liquibase. Schema versioning is a whole pile of fun.

DublinWriter · 03-03-2014 09:38PM

magooly wrote: »

10 tables so far and up to 1000 SQL statements, DB is not live so we tear it down on every build.

Sorry, I'm old-school.

This sounds like complete lunacy to me. Surely there has to be some kind of robust meta-data layer that you can implement?

Colonel Panic · 03-03-2014 10:37PM

With Entity Framework, I'll sometimes take the code first approach and let it generate the database and then provide code to initialise it from files but only for smallish projects.

Jim2007 · 05-03-2014 11:21PM

magooly wrote: »

Surely there is another way, maybe maintaining raw data in Excel and passing this through a parser to the DB. Has anyone any ideas?

Yes XL is fine for this. Just keep the data in columns and write a VBA function to pick up the values and output the SQL inserts in the last column, something like:

=CreateInsert(A1, B1, C1.....)

Then just copy and paste the values in this column into your SQL editor and execute them against the BD.

fergalr · 05-03-2014 11:28PM

I don't really understand this thread; if the data is changing so much, maybe a read/write format that is more flexible than SQL is the answer?

It seems strange to be talking about various complex schemes to generate SQL based on parsing an Excel sheet, etc.

That said, maybe there's good reasons you need to do this, which are obvious to the other posters?

cython · 06-03-2014 09:38AM

One suggestion that I found online would be if you have an Oracle client on the machine running your builds, then could you use SQL*Loader via the maven-exec plugin to pull in data from CSV files as you suggest?

neil_hosey · 06-03-2014 11:55AM

Colonel Panic wrote: »

With Entity Framework, I'll sometimes take the code first approach and let it generate the database and then provide code to initialise it from files but only for smallish projects.

Entity framwork code first is very good, and is improving with each release.

The inclusion of auto-migrations is bringing it closer to NHibernate on the migrations side of things, and it is definately ahead on the fluent API side of things.

Jim2007 · 07-03-2014 01:32PM

fergalr wrote: »

I don't really understand this thread; if the data is changing so much, maybe a read/write format that is more flexible than SQL is the answer?

It seems strange to be talking about various complex schemes to generate SQL based on parsing an Excel sheet, etc.

That said, maybe there's good reasons you need to do this, which are obvious to the other posters?

Actually using XL to manage seed data and so on as I described is one of the most useful, almost any business user is capable of managing lists and a VBA function can be used to generate the correct SQL for what ever target database you wish to use, there is no requirement that the target DBMS supports a particular import format or anything like that.

My workflow when I get an updated file from the business, is to simply add the VBA function, create the SQL script and added to the build in Team City.

GreeBo · 07-03-2014 02:28PM

Still a manual workflow though...I agree with fergair, I must be missing something as what you are trying doing seems very complicated...

Jim2007 · 07-03-2014 07:49PM

GreeBo wrote: »

Still a manual workflow though...I agree with fergair, I must be missing something as what you are trying doing seems very complicated...

Let's say you need to set up all the data to reguired to test say 10 different scenarios for the calculation of attribution on 20 different investment portfolios. How to you capture the 2,000 or more data points from the business user, validate them, allow him to check them and finally set them up in the data base for the test?

GreeBo · 07-03-2014 08:35PM

I'd set them up once, export then and build unit tests with dbunit ( fine it before)

During normal use, how does the data get entered, a ui or something else?

magooly · 08-03-2014 08:04PM

Thanks for the replies, looks like a hot topic.

Heres an example scenario with rough ddl.

Say I have an existing table with 4 columns:

TABLE REF_DATA (
Integer Primary Key,
Integer Foreign Key(to another table PK),
Text name,
Decimal tax,
Decimal price
);

To populate the 500 rows of seed reference data for this table I have a simple insert.sql script - lets just say it was created manually from an export from excel and tacked together with search/replace.

Now in the next Sprint I decide to add 1 more column
Decimal discount

This must be populated with specified discount rates that vary per name field.

To populate this I must either update the 500 insert.sql lines with the new column data or create a 500 line update.sql.

This is a very simplified example but I think it explains things.

DBUnit is a great suggestion to write the SQL

Jim2007 · 08-03-2014 09:32PM

GreeBo wrote: »

I'd set them up once, export then and build unit tests with dbunit ( fine it before)

During normal use, how does the data get entered, a ui or something else?

But where do you keep the data, do all the calculations to come up with the expected out comes, allow the business users to modify the data (which in turn impacts the expected out come), create new test cases and so on. And then when the whole thing has been modelled, create the required SQL to inserted in the database?

I use XL for this, the business users can do there stuff there to come up with the test case and I can extract the pieces needed to be inserted into the database with a couple of VBA functions.

ChRoMe · 09-03-2014 06:47PM

Two pages in and no one has suggested NoSQL MongoDB?

stevenmu · 10-03-2014 03:33PM

I'm not very familiar with Oracle, is there not some kind of ETL tools available like DTS/SSIS for the SQL Server world?

That would make it easy to recreate your databases, import data from flat files if needed etc, with one simple to manage package which could be quickly and easily updated if needed due to schema changes.

DublinWriter · 10-03-2014 03:41PM

stevenmu wrote: »

I'm not very familiar with Oracle, is there not some kind of ETL tools available like DTS/SSIS for the SQL Server world?

SSIS works perfectly well in the Oracle world.

stevenmu · 10-03-2014 04:45PM

Good point, if the OPs willing/able to install that it should work perfectly.

Jim2007 · 10-03-2014 09:12PM

stevenmu wrote: »

That would make it easy to recreate your databases, import data from flat files if needed...

Did you ever try and get business users to maintain flat files...

stevenmu · 11-03-2014 10:50AM

Yes, and it's a huge pain, but better than maintaining 1000+SQL Inserts.

(I'm including Excel worksheets in "flat files" btw)

How to avoid manually writing/managing SQL

Comments