
Need to get a server powered on in the morning - Very concerned

  • 10-03-2010 12:04am
    #1
    Closed Accounts Posts: 179 ✭✭


    Hi guys,
    I was doing some work on a server today. I opened it up to check for a connector and disconnected one or two boards and one or two cables. The cables ran from the I/O board to the SAS controller. I lifted and reseated the network card at the back and also the SAS card - reseating the SAS card involved disconnecting a few drives as well. Then someone came in and wrapped their hand around the SAS card without earthing themselves - I was careful to ground myself all along.

    Anyway, it's an IBM x366. I plugged all the cables back in at the back - power, network etc. - and the network lights came on. There is a system power light on the front beside the power button to indicate power, and this is on. The problem is that nothing happens when I press the power button - I tried holding it down for up to 15 seconds etc. and still nothing.

    I'm afraid that the SAS drive controller got damaged by static from that person - at least, that's the only cause I'm aware of at the moment. Perhaps reseating the cards I released might do something. Anyway, I'm concerned as I need to get it running for something important. I don't think there are any spares (hardware support is outsourced overall) and this isn't usually my area in the workplace, which of course is adding to my concerns.

    Any help appreciated here guys on what to check out for in the morning.

    Thanks.


Comments

  • Registered Users, Registered Users 2 Posts: 919 ✭✭✭n0brain3r


    If it doesn't even try to spin up the fans or load the BIOS, I doubt your SAS controller is causing the issue. You could always try powering up without it.

    If you have an amber power light but nothing happens when you press the power button, I'd suspect you haven't reconnected the CPU power. It's a square 4-pin connector, and the socket is usually somewhere close to the CPU. I'm not familiar with the IBM hardware you mentioned, but this connector exists on older hardware. Check all the power connections on the motherboard and make sure they're seated correctly and pushed home. Something could have been knocked, so reseat your RAM and any expansion cards in the system. Then try powering up again.

    If everything looks connected properly and it still doesn't power up, it's time to get back to basics: disconnect or remove everything you moved yesterday and try powering up with as little as possible connected, leaving out the hard drives and controllers too. You want to get the BIOS loading and the fans spinning up at least.

    Also be very careful about the order your hard drives are connected to the SAS controller. If they're in a hot-swappable bay you should be OK, but if they're internal, note where and what they are plugged into.


  • Closed Accounts Posts: 52 ✭✭_PCahill_


    There is a good chance you'll be fine. All the power lights mean is that the PSU has power.

    If it's not turning on, it's because the motherboard is telling it not to. This would not happen for a blown SAS board (the machine should still POST/error check and print stuff on screen). This is actually a feature: the machine is detecting a serious fault and won't turn on because the cause of the error could be serious or could destroy the machine.

    Did you add any new hardware into it since it last booted?

    Normally if you can't get it to try to boot up (i.e. POST), it's a sign that the CPU or RAM is installed incorrectly or not installed. Either that or you have a short circuit.

    Modern machines are very well designed - if the SAS card was installed when the person potentially put static into it, the machine would more than likely absorb it fine (unless they were touching the chip pins directly - but those are probably not even exposed on your SAS board).

    Try this:
    Turn off. Remove the SAS card and network card. Then turn the machine on. See if you can get it to POST (print up the self checks as it boots...). It will give you an error about the SAS/hard disk not being detected, but that's fine - if you get this far you should be OK, so just turn off, reseat the cards and boot up! :)

    If it still doesn't boot, try to reseat the RAM in it.

    If that still doesn't work, I'd guess you have dodgy cables. Cables with bad folds in them DO cause shorts and will result in failure to start. Unplug ANY hard disk/CD-ROM/DAT/etc. drives (you can leave their power connectors in, just take the data ones out). Make sure to unplug these at the mainboard and not at the drive end. Then try booting again.

    If you still have no luck send a reply and I'll list some other suggestions :)

    Good luck...


  • Closed Accounts Posts: 179 ✭✭irlforum


    Thanks for this, and thanks also to n0brain3r. It will be the morning before I can do anything here.


  • Closed Accounts Posts: 179 ✭✭irlforum


    One more thing to add actually - I've just read over your post a few times there to understand it clearly.

    While I understand that I am removing the SAS and network cards to rule out a potential short circuit, I have just remembered that as well as disconnecting the SAS from the I/O board I also pulled another cable from the I/O board. This cable (can't think what these are called) is like a thin film with exposed connectors at the end - no hard physical connector at all. I suspect it was for either the RAM, CPU or CD-ROM drive. Maybe this is the CPU cable you are referring to, n0brain3r?

    I pulled it out (forcibly) and forced it back in to re-insert it. I don't know what I was thinking, not checking what it was for earlier - I was under pressure, I suppose.


  • Registered Users, Registered Users 2 Posts: 8,813 ✭✭✭BaconZombie


    This connection is why it's not powering on: that cable connects the circuitry that the power button is attached to.

    There should be a little flip release on the cable; check this and try reconnecting it.

    irlforum wrote: »
    I have just remembered that as well as disconnecting the SAS from the I/O board I also pulled another cable from the I/O board. This cable (can't think what these are called) is like a thin film with exposed connectors at the end - no hard physical connector at all. I suspect it was for either the RAM, CPU or CD-ROM drive. Maybe this is the CPU cable you are referring to, n0brain3r?

    I pulled it out (forcibly) and forced it back in to re-insert it. I don't know what I was thinking, not checking what it was for earlier - I was under pressure, I suppose.


  • Closed Accounts Posts: 179 ✭✭irlforum


    Hi,
    I have it turned on here and opened up to reseat that connector. The I/O board is showing error 06, which I can't find any info on from IBM or Google, though I'm continuing to search.

    The connector does not seem to have any release anywhere!! :mad:


  • Closed Accounts Posts: 179 ✭✭irlforum


    I'm just updating here in case anyone logs in. I'm taking a step back from it now.

    I was looking for SAS controller info, which wasn't coming up in the main BIOS. I checked to make sure there was no quick boot set, and there wasn't. When I went into hardware diagnostics, I could then see a printout for the SAS controller and it said "Not installed".

    However, when I try to boot the machine with "SAS Planar" set as the first device, I get "master boot file does not exist or is invalid". So it's accessing the disks, but I have no config for SAS Planar.


  • Registered Users, Registered Users 2 Posts: 919 ✭✭✭n0brain3r


    How did you get on with this today?


  • Registered Users, Registered Users 2 Posts: 2,743 ✭✭✭funk-you


    There are a few things I would check.

    On the front of the box where the power button is, there is a blue latch. Release this and slide out the light path diagnostics (LPD) panel. When you attempt to power on, do any amber lights appear next to any of the values on it? If so, which one?

    The thin ribbon cable you were talking about connects the LPD panel and power button to the system board. Check that you are at least getting a power light on the LPD panel; if not, reseat the cable.

    If you can power on the system, does it POST? You will hear two beeps if it completes. This can take anywhere from 30 seconds to 2 minutes. What happens after POST (beeps)?

    If the system doesn't POST, what output are you getting on the GUI/monitor?

    If the system does begin POST, you should be able to see a reference to either physical drives or a RAID card/configuration and whether it is installed. There may also be a reference and a prompt to enter MegaRAID or ServeRAID Manager. If there is, follow the prompt and check whether an array is configured. If it is not, do you need the data on the drives? Can you create a new one?

    Is there an RSA card in the system? Can you log into it and retrieve the system error log? Login details are USERID and PASSW0RD. That's a zero in the password. This is the quickest way to find the fault. Post it here or PM me and I'll take a look.

    Are there any faults listed in the error log in the BIOS?

    Before going ahead with any of this, I would go back and reseat all components. Take your time and check whether you have missed any small sets of cables or connected one to the wrong slot/connection. Make sure all cards are securely seated; don't be too afraid of them. Tbh, it smells of either a loose cable or connection, probably to either the DASD backplane or the controller.

    Also, try to power on the system with the lid open. DO NOT TOUCH THE INSIDE. Are there any amber lights near to or on any component?

    Below is the link to the problem determination and service guide for the x366. Everything you'll need is in there.

    ftp://ftp.software.ibm.com/systems/support/system_x_pdf/49y0058.pdf

    Let me know how you get on.

    -Funk


  • Closed Accounts Posts: 179 ✭✭irlforum


    Guys... got it to POST but it's not booting Server 2003. The problem was the LPD panel connector.

    Now when I try to boot it says master boot record does not exist or is invalid.

    It's a cheap board for connecting the SAS drives and there doesn't seem to be any config for it available at all through the BIOS.

    So maybe the master boot record is frigged - I'm thinking of loading the Recovery Console from the Windows CD. It's really annoying that I can't check any SAS controller config - is this normal?

    Is it easy for the drives to just lose a master boot record like this? On "select boot device" I have an option of two hard disks (hard disk 1 and hard disk 2) - there are four drives in total physically installed on the SAS board though. Both drives available in the boot menu give the "master boot record does not exist or is invalid" error. Grrr.


  • Registered Users, Registered Users 2 Posts: 919 ✭✭✭n0brain3r


    How many drives have you in the server? Do you see the card's BIOS load at any stage and list the drives attached?


  • Closed Accounts Posts: 179 ✭✭irlforum


    n0brain3r wrote: »
    How many drives have you in the server? Do you see the card's BIOS load at any stage and list the drives attached?

    4 drives installed physically and connected to the same SAS board. Nothing comes up for the SAS board at all during BIOS. I looked around to see if there might be a quick boot setting on as well but couldn't find one.


  • Registered Users, Registered Users 2 Posts: 919 ✭✭✭n0brain3r


    I was hoping it was fewer, as we could have tried a direct connection to the motherboard. With 4 drives there is no way of knowing whether it was set up with RAID 5; it could be any combination, or just standalone drives. Can you try the controller in another slot? What does the BIOS boot order list?


  • Registered Users, Registered Users 2 Posts: 2,743 ✭✭✭funk-you


    irlforum wrote: »
    Guys... got it to POST but it's not booting Server 2003. The problem was the LPD panel connector.

    Now when I try to boot it says master boot record does not exist or is invalid.

    It's a cheap board for connecting the SAS drives and there doesn't seem to be any config for it available at all through the BIOS.

    So maybe the master boot record is frigged - I'm thinking of loading the Recovery Console from the Windows CD. It's really annoying that I can't check any SAS controller config - is this normal?

    Is it easy for the drives to just lose a master boot record like this? On "select boot device" I have an option of two hard disks (hard disk 1 and hard disk 2) - there are four drives in total physically installed on the SAS board though. Both drives available in the boot menu give the "master boot record does not exist or is invalid" error. Grrr.

    Okay, a couple of things.

    Do you need the data on the drives?

    Can you see all drives physically present in the BIOS? What state are they in?

    Is there an RSA card installed? If so, just log in and check the error log. It will let you know if there is a hardware failure.

    What RAID controller are you using? 8i? The IBM part number or the FRU will have two digits, a letter and four digits. It'll be on a sticker on the card. Post it here.

    It looks like either you've lost your RAID array and need to recover it, or the RAID controller has failed. You can recover the array from the disks, but you need to get into the RAID manager. Is there an option for this? Do you have a ServeRAID/MegaRAID CD?

    Is there another system you can try the card in, or can you take a known good card and try it in yours?

    Is there any change when you remove the controller and boot? Have you tried alternate slots? Is the cache battery connected to the controller?

    Can you post screen shots of what you are seeing during POST? I have a spare x366 here, so I can recreate the fault and work through it with you if you want.

    -Funk


  • Closed Accounts Posts: 179 ✭✭irlforum


    funk-you wrote: »
    Okay, a couple of things.

    Do you need the data on the drives?

    Can you see all drives physically present in the BIOS? What state are they in?

    Is there an RSA card installed? If so, just log in and check the error log. It will let you know if there is a hardware failure.

    Well, I really don't want to set up Windows again - the data isn't critical though.

    funk-you wrote: »
    What RAID controller are you using? 8i? The IBM part number or the FRU will have two digits, a letter and four digits. It'll be on a sticker on the card. Post it here.

    I'm not even sure if there is a RAID controller in this machine. It has slots for 6 or 8 drives on the front, which are connected to a board (the SAS board is what I'm calling this), and this is connected to the I/O board. This SAS board looks basic and has no brand name, though there is a white sticker - I will try and get the FRU tomorrow. Is your x366 a similar-sounding setup?

    funk-you wrote: »
    It looks like either you've lost your RAID array and need to recover it, or the RAID controller has failed. You can recover the array from the disks, but you need to get into the RAID manager. Is there an option for this? Do you have a ServeRAID/MegaRAID CD?

    Nothing about RAID is coming up in the BIOS - I really don't think there's a RAID card in the machine. As per my description above, I think it's just a SAS board connected to the I/O board. The SAS board looks very basic - I'll try and get the FRU tomorrow - though it could possibly have RAID integrated, I suppose. The PCI slots only have network cards installed.

    funk-you wrote: »
    Is there another system you can try the card in, or can you take a known good card and try it in yours?

    Is there any change when you remove the controller and boot? Have you tried alternate slots? Is the cache battery connected to the controller?

    Can you post screen shots of what you are seeing during POST? I have a spare x366 here, so I can recreate the fault and work through it with you if you want.

    -Funk

    It's POSTing fine now - it's just that I'm getting "master boot record invalid or not available" on the only two drives I get as boot devices in the "select boot device" menu. Surely if it can physically take 6 drives, all of these need to be available through the BIOS - or could it be that they are accessed via a software config in the OS? What I'm saying is that maybe everything is working from a hardware POV and the master boot record is actually gone.


  • Registered Users, Registered Users 2 Posts: 2,743 ✭✭✭funk-you


    When you post the FRU, post a picture of the DASD backplane, controllers etc.

    -Funk


  • Closed Accounts Posts: 179 ✭✭irlforum


    Gonna try and get this info over to you early this week.


  • Closed Accounts Posts: 179 ✭✭irlforum


    Sorry for only getting back now - appreciate you going to the trouble here.

    FRU number is FRU13M 7880 602.

    Have the pictures here now but no USB - will bring camera home - should be a proper USB cable there.


  • Registered Users, Registered Users 2 Posts: 4,109 ✭✭✭sutty


    irlforum wrote: »
    Sorry for only getting back now - appreciate you going to the trouble here.

    FRU number is FRU13M 7880 602.

    Have the pictures here now but no USB - will bring camera home - should be a proper USB cable there.


    That's your backplane. It's not a controller card. The controller card is a large card sitting in a PCI-X slot. Make sure the cables going from the backplane card to the SAS controller are connected in the same order as before. If you have mixed them up, the 4 drives could be detected in the wrong order, meaning your old drive 1 is now drive 3 or something to that effect. That would mean the system is looking at the wrong drive for the MBR.
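
    To put the "wrong drive for the MBR" point in concrete terms: a valid MBR is the first 512-byte sector of a disk, it ends in the two bytes 0x55 0xAA, and it holds four 16-byte partition entries starting at offset 446. The following is only a rough Python sketch of that check, not an IBM or Windows tool. The function names are made up for illustration, and it assumes you have some way of getting at the first sector of each drive (for example from a live CD). Also bear in mind that if the drives are members of a hardware RAID array, the first sector of a single member disk may not be the volume's MBR at all.

```python
import struct
from typing import List, NamedTuple

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # four 16-byte entries start here
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR


class Partition(NamedTuple):
    index: int          # 1-based slot in the partition table
    bootable: bool      # True when the status byte is 0x80 (active)
    type_id: int        # partition type, e.g. 0x07 for NTFS
    start_lba: int      # first sector of the partition
    sector_count: int   # length of the partition in sectors


def parse_mbr(sector: bytes) -> List[Partition]:
    """Return the non-empty partition entries of an MBR sector.

    Raises ValueError if the sector is short or lacks the boot signature.
    """
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("no 0x55AA boot signature: not a valid MBR")
    partitions = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + 16 * i
        entry = sector[start:start + 16]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), LBA start, length
        status, type_id, start_lba, count = struct.unpack("<B3xB3xII", entry)
        if type_id != 0:  # type 0 marks an unused slot
            partitions.append(Partition(i + 1, status == 0x80, type_id, start_lba, count))
    return partitions


def read_first_sector(path: str) -> bytes:
    """Read the first sector from a disk image or raw device (hypothetical path)."""
    with open(path, "rb") as f:
        return f.read(SECTOR_SIZE)
```

    Running parse_mbr(read_first_sector(path)) against each drive in turn would show which one, if any, still carries a boot signature and a partition marked bootable. If one does, that is the drive the BIOS should be booting from.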

    As for the ribbon cable connector: the top surface or sides can sometimes be the latch that clamps onto the ribbon, like in the picture below. You need to pull them out before you can remove or reseat the cable.

    20070403-gbaccelerator-sp-lcd-connector.jpg


  • Closed Accounts Posts: 179 ✭✭irlforum


    sutty wrote: »
    That's your backplane. It's not a controller card. The controller card is a large card sitting in a PCI-X slot. Make sure the cables going from the backplane card to the SAS controller are connected in the same order as before. If you have mixed them up, the 4 drives could be detected in the wrong order, meaning your old drive 1 is now drive 3 or something to that effect. That would mean the system is looking at the wrong drive for the MBR.

    As for the ribbon cable connector: the top surface or sides can sometimes be the latch that clamps onto the ribbon, like in the picture below. You need to pull them out before you can remove or reseat the cable.

    20070403-gbaccelerator-sp-lcd-connector.jpg

    This server is still down, I'm afraid - I haven't had the chance to look at it properly since.

    The cables from the backplane (there are two of these) are actually connected to the I/O plate, which includes the above switch (in the image above) for error lights etc. The only PCI or PCI-X card is a fiber optic network card, if I recall correctly.

    As I said, there are two connectors between the backplane and I/O plate and I have actually tried switching them around. I was reading in a manual that the CDs that come with this server contain apps to configure the hard disk setup, but I haven't tried them yet.

    Seeing as the disks are connected directly to the I/O board, do you think there is no RAID setup on this server at all? I do know that a lot of the data is accessed via a SAN, for example, so maybe a RAID setup was not thought of as important or required when initially setting up the machine.

