
Boards.ie Downtime - 2009-01-06 - Technical Discourse

  • 06-01-2009 6:08pm
    #1
    Subscribers Posts: 9,716 ✭✭✭


    May as well start keeping a nice public record of these things here so that for generations people can see how much a fecking eejit I was. Although not necessarily full of equations and code, any discussion I'll want to engage in here will be of a technical nature, and so don't necessarily expect to understand it unless you're a bit of a coder/sysadmin yourself.

    This particular downtime concerns the database of boards.ie (surprise) and may involve several actions depending on what I have time to do.

    Primary problem is that backups are not completing correctly at the moment, and this is a problem from a redundancy pov obviously. What's happening right now is a gigantic ugly pointer error, which I think is probably something to do with a buffer being too big or small, but could equally be a bug with the version of MySQL we're running (5.0.22).

    So, three things I would like to do:

    First, kill all connections to the database and try running a backup to see if the problem occurs with zero external connections and an empty queue.

    Second, change some of the memory allocated to various buffers in my.cnf, fairly basic stuff, then try running a backup.

    If it's simply a buffer problem, that'll leave me with some time to start the process of setting up our newly purchased secondary DB server to work in a Master-Slave configuration.

    Will need to update MySQL from 5.0.22 to the latest build; hopefully this shouldn't cause a problem. Either way, it will require a fully functioning backup before this step is taken.

    Once both database servers have matching versions of MySQL, the new server will be set up as the slave for testing, the "old" server remaining as the master (specs are similar).

    Once replication is working, boards.ie can be brought back online, and monitoring for instabilities and problems will continue for a week or two before any further action is taken.

    Dump of relevant log information from the mysql logs.

    If anyone with experience of doing similar wishes to comment, I'd be much obliged :)

    I'll only be doing the replication if everything else happens to go well, I'd prefer not to use up three hours if possible.


Comments

  • Subscribers Posts: 9,716 ✭✭✭CuLT


    Absolute miserable failure to do anything there other than cause 6 hours downtime and exacerbate my ill health.

    Took boards offline.

    Switched redirects to maintenance server.

    Began test database backup, "failed" at 3GB mark (no error in log).

    Tweaked some my.cnf values, "failed" at 3GB mark (no error in log).

    Flushed tables with lock, took MySQL offline, ran myisamchk, errors found, no repairs made.

    Ran myisamchk with repair, apparently just ended up corrupting/crashing tables

    Manually ran CHECK and REPAIR on tables via mysql CLI

    posthash table crashed shortly after 23:09, just after bringing site back up.

    Site functionally online at 00:10, six hours downtime approx. Disastrous, but no major information loss detected yet, and there is still the possibility that the nightly backup will complete correctly. Will check tomorrow.
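For anyone retracing the manual CHECK step above, a minimal sketch follows. It only generates `CHECK TABLE` statements so the output can be reviewed before anything is piped into the `mysql` client; the function name is made up for illustration and the table names are placeholders, not a statement of what was actually run on the boards.ie server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one CHECK TABLE statement per table name given.
# Nothing is executed against the database here; the output is plain SQL text.
emit_check_sql() {
    for table in "$@"; do
        printf 'CHECK TABLE `%s`;\n' "$table"
    done
}

# Example (assumed database name db_name, run by hand once reviewed):
#   emit_check_sql posthash table_1 | mysql db_name
```

Checking first and repairing only the tables that report a problem keeps the repair step as small as possible.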


  • Moderators, Music Moderators Posts: 23,359 Mod ✭✭✭✭feylya


    How exactly are you backing up the DB? Presumably it's a mysqldump, but where are you dumping the file to? Are you going straight onto the MySQL server itself or SCPing to another box? Do you have any monitoring in place to keep an eye on disk space/load/memory on the DB box and whichever machine you're backing up to? Are any of the tables partitioned?

    What sort of connection do you have between the DB servers? Assuming it's gigabit and depending on the size of the DB, it may be better just to replicate live across the servers without dumping from the master first.

    Be careful about upgrading MySQL - we had some nasty surprises with compatibility.


  • Subscribers Posts: 9,716 ✭✭✭CuLT


    Gonna split this up to answer, I'm heading straight to bed after this though so sorry if I don't respond any further until tomorrow!
    feylya wrote: »
    How exactly are you backing up the DB? Presumably it's a mysqldump, but where are you dumping the file to? Are you going straight onto the MySQL server itself or SCPing to another box?
    Dump gets tarballed on the fly to a local directory, datestamped and replicated off to a NAS box.
    Do you have any monitoring in place to keep an eye on disk space/load/memory on the DB box and whichever machine you're backing up to?
    Yep, munin.
    Are any of the tables partitioned?
    No, requires a potentially significant departure from the vBulletin trunk and if we plan to keep current with updates it would be a heavily time consuming task for me, I reckon.
    What sort of connection do you have between the DB servers? Assuming it's gigabit and depending on the size of the DB, it may be better just to replicate live across the servers without dumping from the master first.
    Yeah it's gigabit, having never tested it practically (only the live server and the second one that have the specs to run the boards db in any usable form) I wasn't sure which would be faster, and any tutorials seem to be vague on it.
    Be careful about upgrading MySQL - we had some nasty surprises with compatibility.

    Seeing as it's only an upgrade from 5.0.22 -> 5.0.67 or something like that I don't imagine there'll be any hiccups, regi and ecksor seem to agree with me there, but yeah, it's not something I actually want to do even then.

    I'd rather get the db up and running on the new machine rather than risk trouble with the current one, and then use the old one as the "slave", but it seems to be a bit of a coin toss.


  • Moderators, Music Moderators Posts: 23,359 Mod ✭✭✭✭feylya


    CuLT wrote: »
    Gonna split this up to answer, I'm heading straight to bed after this though so sorry if I don't respond any further until tomorrow!


    Dump gets tarballed on the fly to a local directory, datestamped and replicated off to a NAS box.

    Yep, munin.

    What sort of file space is it taking up as it dumps and compresses?
    No, requires a potentially significant departure from the vBulletin trunk and if we plan to keep current with updates it would be a heavily time consuming task for me, I reckon.

    Yeah it's gigabit, having never tested it practically (only the live server and the second one that have the specs to run the boards db in any usable form) I wasn't sure which would be faster, and any tutorials seem to be vague on it.

    Hmm, I was going to suggest starting a replication from your last proper backup but you'll be missing the status details that you'd need.
    Seeing as it's only an upgrade from 5.0.22 -> 5.0.67 or something like that I don't imagine there'll be any hiccups, regi and ecksor seem to agree with me there, but yeah, it's not something I actually want to do even then.

    If it's only to 5.0.67 instead of 5.1, you should be alright
    I'd rather get the db up and running on the new machine rather than risk trouble with the current one, and then use the old one as the "slave", but it seems to be a bit of a coin toss.

    I'd agree with moving over to the newer server for the live one. Once you have it running as a slave, it shouldn't be hugely difficult to move it over to Master-Master replication and then slave the old box.
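On the "status details" point: a slave needs the master's binary log file name and position as of the moment the dump was taken. A rough sketch of recovering them from a dump, assuming binary logging is enabled on the master and the dump was made with mysqldump's `--master-data` option, which writes a `CHANGE MASTER TO` line near the top of the file. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "LOG_FILE LOG_POS" taken from the first
# CHANGE MASTER TO line found in a dump made with --master-data.
binlog_coordinates() {
    dump_file="$1"
    # grep -m 1 stops at the first match, so only the dump header is read.
    grep -m 1 'CHANGE MASTER TO' "$dump_file" \
        | sed -E "s/.*MASTER_LOG_FILE='([^']+)', *MASTER_LOG_POS=([0-9]+).*/\1 \2/"
}

# Example (assumed path):
#   binlog_coordinates /backups/boards.sql
```

Those two values are what the slave's own `CHANGE MASTER TO` statement needs, together with the master host and replication user.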


  • Subscribers Posts: 4,075 ✭✭✭IRLConor


    CuLT wrote: »
    Began test database backup, "failed" at 3GB mark (no error in log).

    Tweaked some my.cnf values, "failed" at 3GB mark (no error in log).

    Did it fail at exactly 3GB or around that value? Did it fail at exactly the same point both times?


  • Subscribers Posts: 9,716 ✭✭✭CuLT


    What sort of file space is it taking up as it dumps and compresses?
    I'm not sure really, it gzips it as it dumps it, I don't know if that takes temp space additionally, uncompressed the db is about 8-9 GB, or should be.
    Did it fail at exactly 3GB or around that value? Did it fail at exactly the same point both times?
    Around that value, I'm pretty sure it's failing on a particular query, or at a certain table or something but I actually just don't know how to go about determining where.
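One possible reason for a "failure" with nothing logged, offered as a guess since the actual backup script isn't shown: when a dump is piped straight into gzip, the shell reports gzip's exit status by default, so a dump that dies part-way can look like a success. A minimal bash sketch that checks both halves of the pipe; the function name and paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a dump command, gzip its stdout into OUT_FILE,
# and fail if either the dump command or gzip exits non-zero.
# Usage: dump_and_gzip OUT_FILE COMMAND [ARGS...]
dump_and_gzip() {
    local out_file="$1"
    shift
    "$@" | gzip > "$out_file"
    # PIPESTATUS must be copied immediately, before any other command runs.
    local statuses=("${PIPESTATUS[@]}")
    if [ "${statuses[0]}" -ne 0 ]; then
        echo "dump command failed with status ${statuses[0]}" >&2
        return 1
    fi
    if [ "${statuses[1]}" -ne 0 ]; then
        echo "gzip failed with status ${statuses[1]}" >&2
        return 1
    fi
    return 0
}

# Example (assumed database name and path):
#   dump_and_gzip /backups/boards.sql.gz mysqldump -u root -p db_name
```

On the temp space question: gzip used as a pipe filter compresses the stream as it arrives, so the only disk space it consumes is the compressed output file itself.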


  • Moderators, Music Moderators Posts: 23,359 Mod ✭✭✭✭feylya


    If you have the space, dump the db to an uncompressed sql file and then tail it to find the last table and record it backed up. Then, run a query to find the next record and see if it's funky.
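A rough sketch of that approach, assuming an uncompressed dump in mysqldump's usual format where each data statement starts with INSERT INTO followed by the backtick-quoted table name. The function name and the file path are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of the last table that has an
# INSERT statement in a (possibly truncated) mysqldump file.
last_dumped_table() {
    dump_file="$1"
    # -o prints only the matched prefix, so very long INSERT lines stay cheap to handle.
    grep -o '^INSERT INTO `[^`]*`' "$dump_file" | tail -n 1 | cut -d'`' -f2
}

# Example (assumed path):
#   last_dumped_table /backups/boards.sql
#   tail -c 2000 /backups/boards.sql    # raw end of the file, to eyeball the last row written
```

With extended inserts a single line holds many rows, so the `tail -c` view of the final bytes is usually the quicker way to see which record the dump stopped on.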


  • Subscribers Posts: 9,716 ✭✭✭CuLT


    feylya wrote: »
    If you have the space, dump the db to an uncompressed sql file and then tail it to find the last table and record it backed up. Then, run a query to find the next record and see if it's funky.
    Oh, feck, of course. That should have been somewhat blindingly obvious, thanks :)


  • Moderators, Music Moderators Posts: 23,359 Mod ✭✭✭✭feylya


    Invoice is in the post ;)


  • Registered Users Posts: 85 ✭✭df_h


    Hi Cult

    firstly hello from Galway, @argnite referred me, we work together
    I run several large custom coded sites on few dozen linux servers with about 10million visitors a month

    I dont know the specifics of boards.ie setup but might be able to help with mysql

    Firstly upgrading to 5.0.67 is painless, but don't try 5.1 (it's still very buggy)
    May I recommend compiling it instead of installing from RPM; this way you could trim things down and it would be a little bit faster

    do this on a virtual machine first to test etc


    here are the compilation details
    http://paste.pierce.tv/?p=kmps4uotz9&pasteid=494

    put the mysql data files under "files" tweak the above script to own locations, users, groups, where possible use innodb table type

    heres the my.cnf i use (8core/8gb server)
    http://paste.pierce.tv/?p=aextdoeavh&pasteid=495

    once again i dont know specifics but you could tweak


    now to backup the database i hope yee are using mysqldump?

    something like
    /servers/server/mysql_5.0.67/bin/mysqldump --socket=/server/tmp/mysql.sock -u root --password=pass db_name --ignore-table=db_name.table_1 --ignore-table=db_name.table_2 > /boards.sql


    now i don't know what the boards database looks like but i presume there are a lot of unneeded temp tables that there's no point wasting time backing up, i think vBulletin has a table for the search index etc? so set these tables to ignore

    also when dumping disable the site (or it'll crawl more)
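If the list of skipped tables grows, the options can be generated rather than typed out. A small sketch, keeping in mind that mysqldump's `--ignore-table` expects the database name and table name together (`db_name.tbl_name`). The function name is hypothetical and the table names are the same placeholders as in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one --ignore-table=DB.TABLE option per table.
# Usage: ignore_table_args DB_NAME TABLE [TABLE...]
ignore_table_args() {
    db_name="$1"
    shift
    for table in "$@"; do
        # "--" stops printf from treating the format string as an option.
        printf -- '--ignore-table=%s.%s\n' "$db_name" "$table"
    done
}

# Example (relies on word splitting, so table names must not contain spaces):
#   mysqldump -u root -p db_name $(ignore_table_args db_name table_1 table_2) > /boards.sql
```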




    some other optimisations unrelated to mysql

    use nginx, it's a lot faster than apache, use php5 with APC enabled; i can provide detailed info on these too if needed

    anyways i hope thats a start?


    cheers


  • Subscribers Posts: 9,716 ✭✭✭CuLT


    Thanks for posting df_h, it's always most useful to understand how others have approached the same/similar problems. You've got me thinking in a couple of directions I might not otherwise have.

    I've been hesitant to go down the nginx route because it closes off htaccess to us as an option, but it's certainly on the cards.

    Cheers for the input.

