Hi TCLUG! This is my first post to the mailing list.
<br />
<br />
I have a question about IBM Netfinity servers and the ServeRAID controller (aic7xxx series). My company got a Netfinity 6000R (also known as an xSeries 350). The machine has two Xeon processors and three 34 GB SCSI drives in a RAID 5 array. We are running Red Hat 7.3 and we patched the kernel to 2.4.18-4.
<br />
<br />
I have looked EVERYWHERE trying to find information on the following error. It appeared a few times in posts to the kernel and hardware mailing lists at kernel.org, but there never seemed to be a definitive answer.
<br />
<br />
Whenever we try to move a large number of files through the RAID controller (general file moves to/from Samba clients, or to a tape backup), we receive 10 to 30 I/O errors like the following (the sector numbers change each time):
<br />
<br />
Jun 5 08:09:07 gar kernel: I/O error: dev 08:05, sector 49041320
<br />
Jun 5 08:09:07 gar kernel: SCSI disk error : host 3 channel 0 id 0 lun 0 return code = 70000
<br />
Jun 5 08:09:07 gar kernel: I/O error: dev 08:05, sector 49041328
<br />
Jun 5 08:09:07 gar kernel: SCSI disk error : host 3 channel 0 id 0 lun 0 return code = 70000
<br />
<br />
<br />
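For anyone reading these lines, here is a minimal sketch of how the numbers appear to break down. It rests on two assumptions that are not confirmed in this post: that the 2.4 kernel prints both the device number and the return code in hexadecimal, and that the return code packs the status, message, host, and driver bytes from lowest to highest. The function names are made up for illustration.

```python
# Host-byte names as listed in the Linux SCSI headers (assumed to apply here).
HOST_BYTE_NAMES = {
    0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
    0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}


def decode_scsi_result(hex_text):
    """Split a logged 'return code' (hex string) into its four bytes."""
    result = int(hex_text, 16)
    return {
        "status_byte": result & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "driver_byte": (result >> 24) & 0xFF,
    }


def sd_device_name(dev_text):
    """Map a logged 'major:minor' pair (hex) on SCSI disk major 8 to /dev/sdXN."""
    major, minor = (int(part, 16) for part in dev_text.split(":"))
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    disk_letter = chr(ord("a") + minor // 16)  # 16 minors per disk
    partition = minor % 16                     # 0 means the whole disk
    return "/dev/sd" + disk_letter + (str(partition) if partition else "")


if __name__ == "__main__":
    fields = decode_scsi_result("70000")
    print(fields, HOST_BYTE_NAMES.get(fields["host_byte"], "unknown"))
    print(sd_device_name("08:05"))
```

If those assumptions hold, the host byte is the only non-zero field, which would point at the adapter/driver level rather than at a status returned by the disk itself.
<br />
<br />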
Does anyone have any insight into this problem? So far I have tried turning the nmi_watchdog option off, changing the RAID controller's timer from its default 64 ms to 256 ms, and updating all the BIOS, driver, and firmware levels on the motherboard, tape drive, RAID controller, etc.
<br />
<br />
It works fine when it's just poking along, but when we try to move a lot of data through it, it generates these errors.
<br />
<br />
Thanks,
<br />
<br />
Brent Friedman
<br />
<br />
PS - I'm a developer by trade, not a dyed-in-the-wool admin type. But I have spent days trying to find a solution, and I'm not intimately familiar with scsi / raid error stuff.