Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie

Database design question: multiple tables or query from one table

Options
  • 15-05-2014 9:26am
    #1
    Registered Users Posts: 1,931 ✭✭✭


    I have to make a decision how to design a database to store gps locations belonging to different tracks. DB is mysql running on raspberry pi at this moment.

    Option 1:
    Table with:
    - track id (6 chars)
    - timestamp
    - latitude
    - longitude
    - [possibly other gps location parameters]

    When track expires drop all data with certain "track id"

    To get all points belonging to a track I'd have to use single query, filtered by "track id".

    Option 2
    Create a table for each track with the same fields as above except the "track id". "track id" would be the name of the table. All points stored in the table belong to the track, so no filtering required.

    When track expires drop whole table defining that track

    Now the question: which way is better and more efficient? Creating multiple tables or one big table + filtering?

    Number of tracks/points is currently undefined - the project is in early development stage (more here: http://www.boards.ie/vbulletin/showthread.php?t=2057188741 )

    All DB handling will be from php


Comments

  • Registered Users Posts: 2,022 ✭✭✭Colonel Panic


    Creating a table for a track and dropping it when it expires? That's a paddlin'!

    What's so bad about having a where clause that just shows a single or range of track ids? That's the point of SQL so it's not inefficient, it's pretty much how it's done.

    Also, you don't drop a record, you delete it. "delete from Table where trackid = something"


  • Registered Users Posts: 1,931 ✭✭✭PrzemoF


    Creating a table for a track and dropping it when it expires? That's a paddlin'!

    What's so bad about having a where clause that just shows a single or range of track ids? That's the point of SQL so it's not inefficient, it's pretty much how it's done.

    Also, you don't drop a record, you delete it. "delete from Table where trackid = something"

    I have virtually 0 (z-e-r-o) practical experience of how to design a db, so your answer is exactly what I wanted :D Thanks!


  • Registered Users Posts: 7,242 ✭✭✭Tow


    They will tech you to use Option 1 in collage and obey Codd's Rules etc, at least they did years ago.

    In reality may modern high performance real world databases (MongoDB) have moved away from the pure relational model. Option 2 would be much faster, unless MySQL does strange things when you create and drop tables.

    When is the money (including lost growth) Michael Noonan took in the Pension Levy going to be paid back?



  • Registered Users Posts: 1,421 ✭✭✭Merrion


    Also Option 1 will allow you to do queries such as "select all the tracks that have a longitude greater than x"


  • Registered Users Posts: 2,022 ✭✭✭Colonel Panic


    Tow wrote: »
    They will tech you to use Option 1 in collage and obey Codd's Rules etc, at least they did years ago.

    In reality may modern high performance real world databases (MongoDB) have moved away from the pure relational model. Option 2 would be much faster, unless MySQL does strange things when you create and drop tables.

    That post doesn't really make any sense.

    You say a completely different type of database is faster for adding tables when there is no concept of tables in MongoDB but collections of data.

    Then, you go on to say "unless MySQL does strange things wen you create and drop tables", so you don't even know the impact of doing it! (Or why it's a bad thing).

    Finally, this user is a newbie developer just learning MySQL so why confuse him with this stuff when he's just working things out?


  • Advertisement
  • Technology & Internet Moderators Posts: 28,799 Mod ✭✭✭✭oscarBravo


    Tow wrote: »
    Option 2 would be much faster...
    I doubt it. Creating and dropping tables isn't free, and selecting rows based on an indexed column is exactly what a DBMS is designed to do as quickly as possible.


  • Registered Users Posts: 7,242 ✭✭✭Tow


    Mongo was just the latest example of a non relational database. If his data is hierarchical in nature then I see little point in forcing it into a relational model. Years ago I would have said Option 1 was the only way to go, but not now. Also the OP is developing on a Raspberry Pi, so has limited processing resources and they want to create battery powered device.

    When is the money (including lost growth) Michael Noonan took in the Pension Levy going to be paid back?



  • Technology & Internet Moderators Posts: 28,799 Mod ✭✭✭✭oscarBravo


    Tow wrote: »
    Mongo was just the latest example of a non relational database. If his data is hierarchical in nature then I see little point in forcing it into a relational model. Years ago I would have said Option 1 was the only way to go, but not now. Also the OP is developing on a Raspberry Pi, so has limited processing resources and they want to create battery powered device.

    Still doesn't make sense. Option 2 in a Mongo context would suggest creating a new collection for every track, which would still be a bad idea.

    I mean, how would you get a list of tracks - "show collections" and filter out any collection that isn't a track? What if you want a list of tracks for a specific date, or within a geographical boundary?

    Option 2 is the wrong way to do it, no matter what database you're using.


  • Registered Users Posts: 1,931 ✭✭✭PrzemoF


    I'm really impressed with the size of the storm my question created! :-)
    Thanks for all opinions - I'm learning a lot here. Keep them coming! :-)

    P.S. RPI won't be battery operated - the device sending location over internet (phone/tablet) will be battery operated, but that's not really related to the db as the db will be run on the server.


  • Registered Users Posts: 2,781 ✭✭✭amen


    why are you using a character field (char 6) to for TrackID ?

    Would you not be better using an int?

    I read your other post and I'm not quite sure what you are trying to do but if it involves users recording their locations how will that work with your TrackIDs

    are you going to have a user table and a UserTrack table ?


  • Advertisement
  • Registered Users Posts: 1,931 ✭✭✭PrzemoF


    amen wrote: »
    why are you using a character field (char 6) to for TrackID ?

    Would you not be better using an int?

    I read your other post and I'm not quite sure what you are trying to do but if it involves users recording their locations how will that work with your TrackIDs

    are you going to have a user table and a UserTrack table ?

    The track id will be used as part of the link that will point to page showing the track. Number of combinations (capital letters + letters + numbers) for a 6 char string is 3.8e10, so to get the same number of combinations with INT I'd have to use 11 digits. I'll stick with char.

    Something like this (key is the track id):

    http://www.*******.eu/map.php?key=2WgT7a

    There will be no users, just tracks. New tracking request will generate new track id (key).

    The full picture:
    client 1 (sender): a mobile device opens webpage/application that reads current location and send it to server. The server generates unique track id (key) that is send back to the device. The key is valid for till the track expires (set by user). Then all location points sent by the device are send together with the key and the server stores them in mysql database.

    client 2 (receiver): opens webpage using key provided by client 1 ( i.e. http://www.******.eu/map.php?key=2WgT7a ) and can watch current location of client 1. Server will generate map image showing the track (map rendered on server) or will send the track points to the client 2 device (map rendering on the client device)

    ****** - I don't want anybody to mess with the server - it's not ready yet :D


  • Registered Users Posts: 27,161 ✭✭✭✭GreeBo


    oscarBravo wrote: »
    Still doesn't make sense. Option 2 in a Mongo context would suggest creating a new collection for every track, which would still be a bad idea.

    I mean, how would you get a list of tracks - "show collections" and filter out any collection that isn't a track? What if you want a list of tracks for a specific date, or within a geographical boundary?

    Option 2 is the wrong way to do it, no matter what database you're using.

    Wouldnt you just have a track collection and add/remove tracks from that collection as required? (or set some status on them)

    As for querying:
    db.tracks.find( { date: { $eq: myDate } } )?


  • Technology & Internet Moderators Posts: 28,799 Mod ✭✭✭✭oscarBravo


    GreeBo wrote: »
    Wouldnt you just have a track collection and add/remove tracks from that collection as required? (or set some status on them)

    As for querying:
    db.tracks.find( { date: { $eq: myDate } } )?

    Yes. That's option 1. :)


  • Registered Users Posts: 7,468 ✭✭✭Evil Phil


    PrzemoF wrote: »

    Option 1:
    Table with:
    - track id (6 chars)
    - timestamp
    - latitude
    - longitude
    - [possibly other gps location parameters]

    When track expires drop all data with certain "track id"

    To get all points belonging to a track I'd have to use single query, filtered by "track id".


    I wouldn't "drop", assuming you mean delete the data, I'd add an active or enabled field and set this to true if it's still being used and false to exclude it from any queries. Has the same effect but means you hang onto the data.


  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    PrzemoF wrote: »
    Now the question: which way is better and more efficient? Creating multiple tables or one big table + filtering?
    When I first scanned your question I thought you wanted to avoid a series of complex SQL JOINs with multiple relational tables.

    Reading it again - and do correct me if I'm off - I've realized you'd just be doing a simple SELECT * FROM MyTABLE WHERE track_id = 1234, then deleting it eventually with a DELETE FROM MyTABLE WHERE track_id = 1234. So rather than doing that, you're asking whether creating a new table for each track, like this:
    CREATE TABLE IF NOT EXISTS track_1234 (
      timestamp INT NOT NULL, 
      latitude FLOAT( 10, 6 ) NOT NULL,
      longitude FLOAT( 10, 6 ) NOT NULL
    ) ENGINE = InnoDB;
    
    And then eventually run a DROP TABLE IF EXISTS track_1234.

    All ultimately because you want to run a query that doesn't have a WHERE clause.

    So, between a single table query based upon a track_id field versus creating and dropping tables where you still have to query them (but just with no WHERE clause), which do you think is going to use up fewer resources and thus be more efficient?

    This would depend on the following:
    • What is the resource cost of creating and dropping tables, of adding a where clause to a SELECT or DELETE statement?
    • What impact on the database with there be of a single table with a LOT of data, versus LOTs of tables?
    • How often will you be creating and/or deleting a new track?
    • How often will you be querying the table(s)?
    • How many entries will a track typically have?
    Calculate the above and you have your answer, but honestly my feeling is that going down the multiple table route will end up significantly less efficient and also create problems down the road if you want to start adding functionality such as selecting more than one track at a time.


  • Registered Users Posts: 1,931 ✭✭✭PrzemoF



    This would depend on the following:
    • What is the resource cost of creating and dropping tables, of adding a where clause to a SELECT or DELETE statement?
    • What impact on the database with there be of a single table with a LOT of data, versus LOTs of tables?
    • How often will you be creating and/or deleting a new track?
    • How often will you be querying the table(s)?
    • How many entries will a track typically have?
    1. I have no clue, I could possibly measure it, but it's too early for it.
    2. same as above
    3. that depends how may users I'll get, so again might vary from 2 a day to 100 per hour or more :-)
    4. one track per user, probably on average a track will live an hour
    5. that depends on the numbers of observers, so probably the same as the number of users sending the track (one to one seems to be the most common situation)
    6. probably up to 100, but that will be my decision

    Thanks for the detailed answer! I have mysql db with one table up & running. Now I have convince firefoxos application to work in background and update location every (x) seconds. But that's for another thread...

    Thanks again for all good advices that I got from this thread - I have much better idea what to do and what to expect from mysql db now!

    P.S. The address is http://geofinder.eu WARNING: if you agree to "share the location" it will land in my mysql db, but no other information is collected except location. If anybody is interested to see the whole code (there are bits of php) I might post it on github.


  • Moderators, Technology & Internet Moderators Posts: 1,334 Mod ✭✭✭✭croo


    What is the resource cost of creating and dropping tables
    1. I have no clue,
    It would be very database specific but I would imagine a database schema change has a very high overhead.

    Typically with a database there are many changes happening at the same time and the essence of a good db server is how it manages all this.

    Some database events that occur cause the server to stop processing all transactions while it addresses the event. This is a very bad thing to happen to a database and stopping it happening in a busy database is normally a key goal.

    In a best case scenario I would think a schema change (create/drop table) fits into this area. At worst, it might also flush all the existing transactions before applying the change and while it does this it would not be accepting any other transactions. In a busy multi-user system this would be a big problem.

    As I said, the impact would be very database specific but I would expect the worst case (flushing) is the most likely. For this alone I would not be even considering the schema change approach - not unless as I said you are talking about a 1 user system and even then I would be sceptical.


Advertisement