StarTech PEXM2SAT3422 initial problems
Background
The card:
The StarTech PEXM2SAT3422 is a PCIe x4 card featuring 4 SATA ports:
- 2 x SATA connector
- 2 x M.2 NGFF connector
It uses the Marvell 88SE9230 chipset.
The card features RAID 0, 1, and 10, as well as SATA pass-through.
Upon initially deploying the card, I experienced some very serious problems with it.
My setup:
Server is a SuperMicro X9SCM in a 2U half-depth chassis.
A 3Ware card is installed with two 1TB WD Black HDs.
Motherboard Intel SATA has 2 Kingston SSDs installed in a RAID1 (Microsoft Dynamic Disk).
OS is Windows Server 2012 R2.
This server has been rock solid for 3 years.
About 1 year ago:
I added an Ablecomm SATA card that has 2 M.2 connectors. It has housed a single Crucial MX300 M.2 SATA SSD. This setup has been rock solid with no hiccups whatsoever.
Today:
I replaced the Ablecomm PCIe card with the StarTech card and moved the MX300 over. I also installed a 2nd 1TB MX300 alongside it. The plan was to mirror the 2 MX300s and start using them for production workloads.
Here are the problems I encountered:
- One M.2 disk disappeared (showed offline in the Marvell tray tool). This degraded the RAID1. I had to shut down the server and power it back up to see the disk again.
- I cannot update the firmware of the Crucial MX300 1TB SSD while it is connected through this PCIe card. This is true in both RAID mode and pass-through mode. (This seems like a glitch, as I would expect pass-through mode to expose the drive directly from the Marvell controller to the OS, just like any other storage controller.)
- Both MX300s disappeared while the RAID1 was rebuilding after my reboot (the reboot I did after the first drive initially disappeared).
- The Marvell management utility Windows service stopped running (I believe it crashed). I had to restart it manually, and only then discovered that the drives were missing entirely.
- Upon rebooting the server, I get a black screen with a text cursor and the OS never loads.
Theories & Additional Details:
Problems 1-3 happened while the disks were under maximum load transferring VHD files AND I was using the Marvell management utility (web interface) to view health and configuration information. I don't know whether viewing this info while the controller is under load could cause a disk to drop off, but it is likely that the disks would have fallen offline regardless.
The 2 SSDs are not on the same firmware version. They are one revision apart.
I suspect the current black screen at boot-up is actually the controller rebuilding the RAID, because the LEDs on the card for both SSDs are flashing like crazy. If this is true, it means the card halts boot-up (potentially for hours) during an array rebuild.
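One way to test the first theory would be to log which disks Windows can see at regular intervals while the transfer runs, once with the management interface open and once with it closed. Below is a minimal sketch of that idea, not anything supplied by StarTech or Marvell. It assumes Python 3 is installed on the server and that the card is in pass-through mode, so the SSDs show up under their own model names (in RAID mode Windows would most likely see only the controller's virtual disk). The "MX300" model substring, the expected count of 2, and the 30-second poll interval are my own placeholder values. The script also prints the firmware revisions it finds, which makes the version mismatch easy to confirm.

```python
import csv
import io
import subprocess
import time
from datetime import datetime

# Assumptions for illustration: adjust to match what Windows actually reports.
EXPECTED_MODEL = "MX300"   # substring expected in the Model column
EXPECTED_COUNT = 2         # number of SSDs attached to the card


def parse_wmic_csv(text):
    """Parse `wmic ... /format:csv` output into a list of dicts."""
    # wmic pads its output with blank lines, so drop those first.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return list(csv.DictReader(io.StringIO("\n".join(lines))))


def summarize(disks, model=EXPECTED_MODEL):
    """Return (count, sorted firmware revisions) for disks matching model."""
    matches = [d for d in disks if model in (d.get("Model") or "")]
    revisions = sorted({(d.get("FirmwareRevision") or "").strip() for d in matches})
    return len(matches), revisions


def query_disks():
    """Ask Windows which disks it currently sees (Windows only)."""
    out = subprocess.check_output(
        ["wmic", "diskdrive", "get", "Model,FirmwareRevision,Status", "/format:csv"],
        universal_newlines=True,
    )
    return parse_wmic_csv(out)


def watch(interval_seconds=30):
    """Print one timestamped line per poll and flag a missing disk."""
    while True:
        count, revisions = summarize(query_disks())
        flag = "OK" if count == EXPECTED_COUNT else "MISSING DISK"
        print(f"{datetime.now().isoformat()} {flag} count={count} firmware={revisions}")
        time.sleep(interval_seconds)
```

Calling `watch()` from a command prompt and redirecting the output to a file on a disk that is not behind the Marvell controller would leave a timestamped record of when a drive dropped off, which could then be lined up against what I was doing in the management interface at the time.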
Game Plan:
Let the rebuild complete (assuming that's even happening at this point).
If it completes:
- Try to figure out how to update the firmware on the 2nd Crucial MX300.
- Call StarTech and find out whether I'm doing anything wrong.
- Look into card firmware updates, drivers (though I already installed their driver), etc.
If it doesn't complete:
- Power off
- Pull out the card
- Remove the 2nd (older firmware) SSD
- Reinstall the card
- Move the VHD files that are on it back to other storage.
- Contact StarTech, try firmware updates, etc.
- (Leaving it in this state should at least verify the card stability for a single disk.)
- Move the Ablecomm card & old SSD to a different PC & update firmware there.
- Verify whether I have the correct version of the Crucial utility for firmware update (I updated the other MX300 SSD using this same card 1-2 weeks ago in a Windows 7 PC).
Final Thoughts & Conclusions:
It seems I should have gone with a real RAID controller. I realize that technically this is a real RAID controller, but I mean something like an LSI or SuperMicro that is rock solid.