The #1 rule in troubleshooting is isolate, isolate, isolate. Eliminate as many variables as possible.
Aye. Work out every component that could be involved, then rerun the test as many times as needed, replacing only one component at a time with a new one, so you can see exactly when it fails.
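
For anyone who likes to see it written down, here is a rough sketch of that one-swap-at-a-time loop. Everything in it is made up for illustration: the function name `find_faulty_component`, the slot names, and the `system_works` callback (which stands in for whatever "does it boot / does it still crash" check you actually do by hand). It also assumes only one part is bad; if two parts are faulty, no single swap will fix it and you get `None` back.

```python
from typing import Callable, Dict, Optional


def find_faulty_component(
    suspect: Dict[str, str],
    spares: Dict[str, str],
    system_works: Callable[[Dict[str, str]], bool],
) -> Optional[str]:
    """Swap in one spare part at a time and retest.

    Returns the slot whose replacement makes the system work,
    or None if no single swap fixes the problem.
    All names here are hypothetical placeholders.
    """
    for slot in suspect:
        trial = dict(suspect)        # start again from the original build
        trial[slot] = spares[slot]   # change exactly one variable
        if system_works(trial):      # the manual "does it work now?" check
            return slot
    return None


# Hypothetical example: the RAM is the bad part.
suspect = {"psu": "old-psu", "ram": "old-ram", "gpu": "old-gpu"}
spares = {"psu": "spare-psu", "ram": "spare-ram", "gpu": "spare-gpu"}


def boots(build: Dict[str, str]) -> bool:
    # Stand-in for a real test: the build works once the old RAM is gone.
    return build["ram"] != "old-ram"


print(find_faulty_component(suspect, spares, boots))  # prints: ram
```

The point of copying the original build on every pass is that each trial differs from the failing setup by exactly one part, so a pass or fail can be pinned on that part alone.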

Nobody remind me to run this check on everything attached to the mobo, please. Apparently I already did.