Discussion:
Diagnosis help needed
michelle
2012-06-24 07:01:50 UTC
Hi Folks,

Sorry to come to you with a long and convoluted problem.

Situation: a standard home PC with 4 GB of RAM, running two SSDs in a
mirrored root pool, plus three 2 TB hard drives in a raidz.

System is...

OpenIndiana Development oi_151.1.4 X86 (powered by illumos)
Copyright 2011 Oracle and/or its affiliates. All rights reserved.
Use is subject to license terms.
Assembled 22 April 2012

I have an external "toaster" dock which takes two hard drives; one is
connected via eSATA, the other runs via USB (although in this
instance there was no drive in that socket), because I've had a
long-running battle trying to get an affordable (to me) eSATA card
that will give me another eSATA channel.

The ZFS pool was getting full, with something like only 50 GB free. I
would start file copies off the server to an external drive via an SMB
client, go to bed, and wake up to find the process had frozen.
Diagnosis led me to the server, which appeared to hang on any login
attempt. It didn't even respond properly to the power button.

Eventually it came to a dirty, hard power-down.
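For context on just how full the pool was, here is some rough arithmetic. This is only a sketch: it assumes a 3-disk raidz yields about two drives' worth of usable space, "50 gig" means 50 GiB, and it ignores parity, metadata and reservation overheads that real ZFS accounting includes:

```python
# Rough estimate of pool fullness (assumptions noted above; actual
# "zfs list -o space" numbers will differ somewhat).
DRIVE_BYTES = 3907029168 * 512        # one 2 TB drive, 512-byte sectors
usable = 2 * DRIVE_BYTES              # ~2 data drives' worth in a 3-disk raidz
free = 50 * 2**30                     # ~50 GiB reported free
pct_used = 100 * (1 - free / usable)
print(f"pool roughly {pct_used:.1f}% full")   # roughly 98.7% full
```

That would put the pool well past the point where ZFS is generally said to slow down badly (commonly cited as somewhere around 80-90% full), which may be relevant to the frozen copies.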

On coming up again at 07:13, it gave this...

Jun 24 07:01:55 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:01:55 jaguar This may result in reduced system performance.
Jun 24 07:01:55 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:01:55 jaguar This may result in reduced system performance.
Jun 24 07:10:25 jaguar power: [ID 199196 kern.notice] NOTICE: Power
Button pressed 3 times, cancelling all requests
Jun 24 07:10:43 jaguar suspend: [ID 221072 daemon.notice] System is
being shut down.
Jun 24 07:10:43 jaguar poweroff: [ID 330035 auth.crit] initiated by mich
on /dev/console
Jun 24 07:13:20 jaguar genunix: [ID 108120 kern.notice] ^MOpenIndiana
Build oi_151a4 64-bit (illumos 13676:98ca40df9171)
Jun 24 07:13:20 jaguar genunix: [ID 107366 kern.notice] SunOS Release
5.11 - Copyright 1983-2010 Oracle and/or its affiliates.
Jun 24 07:13:20 jaguar genunix: [ID 864463 kern.notice] All rights
reserved. Use is subject to license terms.
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: lgpg
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: tsc
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: msr
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: mtrr
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: pge
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: de
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: cmov
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: mmx
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: mca
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: pae
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: cv8
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: pat
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: sep
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: sse
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: sse2
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: htt
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: asysc
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: nx
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: sse3
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: cx16
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: cmp
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: tscp
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: mwait
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: cpuid
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: ssse3
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: sse4_1
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: sse4_2
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: clfsh
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: 64
Jun 24 07:13:20 jaguar unix: [ID 223955 kern.info] x86_feature: vmx
Jun 24 07:13:20 jaguar unix: [ID 168242 kern.info] mem = 3923444K
(0xef77d000)
Jun 24 07:13:20 jaguar acpica: [ID 233941 kern.notice] ACPI: RSDP f70c0
00014 (v0 GBT )
Jun 24 07:13:20 jaguar acpica: [ID 329277 kern.notice] ACPI: RSDT
d77e3040 00040 (v1 GBT GBTUACPI 42302E31 GBTU 01010101)
Jun 24 07:13:20 jaguar acpica: [ID 229170 kern.notice] ACPI: FACP
d77e30c0 00074 (v1 GBT GBTUACPI 42302E31 GBTU 01010101)
Jun 24 07:13:20 jaguar acpica: [ID 764759 kern.notice] ACPI: DSDT
d77e3180 055B4 (v1 GBT GBTUACPI 00001000 MSFT 0100000C)
Jun 24 07:13:20 jaguar acpica: [ID 347281 kern.notice] ACPI: FACS
d77e0000 00040
Jun 24 07:13:20 jaguar acpica: [ID 193408 kern.notice] ACPI: HPET
d77e8880 00038 (v1 GBT GBTUACPI 42302E31 GBTU 00000098)
Jun 24 07:13:20 jaguar acpica: [ID 265575 kern.notice] ACPI: MCFG
d77e8900 0003C (v1 GBT GBTUACPI 42302E31 GBTU 01010101)
Jun 24 07:13:20 jaguar acpica: [ID 254724 kern.notice] ACPI: EUDS
d77e8940 004D0 (v1 GBT 00000000 00000000)
Jun 24 07:13:20 jaguar acpica: [ID 403521 kern.notice] ACPI: TAMG
d77e8e10 00A4A (v1 GBT GBT B0 5455312E BG?? 53450101)
Jun 24 07:13:20 jaguar acpica: [ID 651266 kern.notice] ACPI: APIC
d77e8780 000BC (v1 GBT GBTUACPI 42302E31 GBTU 01010101)
Jun 24 07:13:20 jaguar acpica: [ID 186916 kern.notice] ACPI: SSDT
d77e9880 01BF8 (v1 INTEL PPM RCM 80000001 INTL 20061109)
Jun 24 07:13:20 jaguar unix: [ID 190185 kern.info] SMBIOS v2.4 loaded
(1190 bytes)
Jun 24 07:13:20 jaguar unix: [ID 972737 kern.info] Skipping psm: xpv_psm
Jun 24 07:13:20 jaguar rootnex: [ID 466748 kern.info] root nexus = i86pc
Jun 24 07:13:20 jaguar iommulib: [ID 321598 kern.info] NOTICE:
iommulib_nexus_register: rootnex-1: Succesfully registered NEXUS i86pc
nexops=fffffffffbd13720
Jun 24 07:13:20 jaguar rootnex: [ID 349649 kern.info] pseudo0 at root
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] pseudo0 is /pseudo
Jun 24 07:13:20 jaguar rootnex: [ID 349649 kern.info] scsi_vhci0 at root
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] scsi_vhci0 is
/scsi_vhci
Jun 24 07:13:20 jaguar pci_autoconfig: [ID 139057 kern.info] NOTICE:
reprogram io-range on ppb[0/1c/0]: 0x1000 ~ 0x1fff
Jun 24 07:13:20 jaguar pci_autoconfig: [ID 596873 kern.info] NOTICE:
reprogram mem-range on ppb[0/1c/0]: 0xd7800000 ~ 0xd78fffff
Jun 24 07:13:20 jaguar pci_autoconfig: [ID 596873 kern.info] NOTICE:
reprogram mem-range on ppb[0/1c/5]: 0xd7900000 ~ 0xd79fffff
Jun 24 07:13:20 jaguar pci_autoconfig: [ID 595143 kern.info] NOTICE: add
io-range on subtractive ppb[0/1e/0]: 0x2000 ~ 0x2fff
Jun 24 07:13:20 jaguar genunix: [ID 596552 kern.info] Reading Intel
IOMMU boot options
Jun 24 07:13:20 jaguar rootnex: [ID 349649 kern.info] npe0 at root:
space 0 offset 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] npe0 is /***@0,0
Jun 24 07:13:20 jaguar npe: [ID 236367 kern.info] PCI Express-device:
***@1f, isa0
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 0 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 1 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 2 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 3 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 4 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 5 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 6 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 615120 kern.info] NOTICE: apic:
local nmi: 7 0x0 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 419660 kern.info] pcplusmp: irq 0x9
vector 0x80 ioapic 0x2 intin 0x9 is bound to cpu 1
Jun 24 07:13:20 jaguar pcplusmp: [ID 419660 kern.info] pcplusmp: irq 0xb
vector 0xd1 ioapic 0x2 intin 0xb is bound to cpu 2
Jun 24 07:13:20 jaguar amd_iommu: [ID 251261 kern.info] NOTICE:
amd_iommu: No AMD IOMMU ACPI IVRS table
Jun 24 07:13:20 jaguar pseudo: [ID 129642 kern.info] pseudo-device: acpippm0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] acpippm0 is
/pseudo/***@0
Jun 24 07:13:20 jaguar pseudo: [ID 129642 kern.info] pseudo-device: ppm0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] ppm0 is /pseudo/***@0
Jun 24 07:13:20 jaguar ahci: [ID 405770 kern.info] NOTICE: ahci0: hba
AHCI version = 1.30
Jun 24 07:13:20 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,010601 (ahci) instance 0 irq 0x18 vector 0x40 ioapic 0xff intin
0xff is bound to cpu 3
Jun 24 07:13:20 jaguar sata: [ID 663010 kern.info]
/***@0,0/pci1458,***@1f,2 :
Jun 24 07:13:20 jaguar sata: [ID 761595 kern.info] SATA disk device
at port 0
Jun 24 07:13:20 jaguar sata: [ID 846691 kern.info] model INTEL
SSDSA2M040G2GC
Jun 24 07:13:20 jaguar sata: [ID 693010 kern.info] firmware 2CV102HB
Jun 24 07:13:20 jaguar sata: [ID 163988 kern.info] serial number
CVGB949301PH040GGN
Jun 24 07:13:20 jaguar sata: [ID 594940 kern.info] supported features:
Jun 24 07:13:20 jaguar sata: [ID 981177 kern.info] 48-bit LBA, DMA,
Native Command Queueing, SMART, SMART self-test
Jun 24 07:13:20 jaguar sata: [ID 643337 kern.info] SATA Gen2
signaling speed (3.0Gbps)
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] Supported queue
depth 32
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] capacity =
78165360 sectors
Jun 24 07:13:20 jaguar scsi: [ID 583861 kern.info] sd0 at ahci0: target
0 lun 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] sd0 is
/***@0,0/pci1458,***@1f,2/***@0,0
Jun 24 07:13:20 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1f,2/***@0,0 (sd0) online
Jun 24 07:13:20 jaguar sata: [ID 663010 kern.info]
/***@0,0/pci1458,***@1f,2 :
Jun 24 07:13:20 jaguar sata: [ID 761595 kern.info] SATA disk device
at port 1
Jun 24 07:13:20 jaguar sata: [ID 846691 kern.info] model INTEL
SSDSA2M040G2GC
Jun 24 07:13:20 jaguar sata: [ID 693010 kern.info] firmware 2CV102HB
Jun 24 07:13:20 jaguar sata: [ID 163988 kern.info] serial number
CVGB949301PC040GGN
Jun 24 07:13:20 jaguar sata: [ID 594940 kern.info] supported features:
Jun 24 07:13:20 jaguar sata: [ID 981177 kern.info] 48-bit LBA, DMA,
Native Command Queueing, SMART, SMART self-test
Jun 24 07:13:20 jaguar sata: [ID 643337 kern.info] SATA Gen2
signaling speed (3.0Gbps)
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] Supported queue
depth 32
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] capacity =
78163247 sectors
Jun 24 07:13:20 jaguar scsi: [ID 583861 kern.info] sd2 at ahci0: target
1 lun 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] sd2 is
/***@0,0/pci1458,***@1f,2/***@1,0
Jun 24 07:13:20 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1f,2/***@1,0 (sd2) online
Jun 24 07:13:20 jaguar sata: [ID 663010 kern.info]
/***@0,0/pci1458,***@1f,2 :
Jun 24 07:13:20 jaguar sata: [ID 761595 kern.info] SATA disk device
at port 2
Jun 24 07:13:20 jaguar sata: [ID 846691 kern.info] model ST32000542AS
Jun 24 07:13:20 jaguar sata: [ID 693010 kern.info] firmware CC34
Jun 24 07:13:20 jaguar sata: [ID 163988 kern.info] serial
number 5XW17ARW
Jun 24 07:13:20 jaguar sata: [ID 594940 kern.info] supported features:
Jun 24 07:13:20 jaguar sata: [ID 981177 kern.info] 48-bit LBA, DMA,
Native Command Queueing, SMART, SMART self-test
Jun 24 07:13:20 jaguar sata: [ID 643337 kern.info] SATA Gen2
signaling speed (3.0Gbps)
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] Supported queue
depth 32
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] capacity =
3907029168 sectors
Jun 24 07:13:20 jaguar scsi: [ID 583861 kern.info] sd3 at ahci0: target
2 lun 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] sd3 is
/***@0,0/pci1458,***@1f,2/***@2,0
Jun 24 07:13:20 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1f,2/***@2,0 (sd3) online
Jun 24 07:13:20 jaguar sata: [ID 663010 kern.info]
/***@0,0/pci1458,***@1f,2 :
Jun 24 07:13:20 jaguar sata: [ID 761595 kern.info] SATA disk device
at port 3
Jun 24 07:13:20 jaguar sata: [ID 846691 kern.info] model WDC
WD20EARS-00MVWB0
Jun 24 07:13:20 jaguar sata: [ID 693010 kern.info] firmware 51.0AB51
Jun 24 07:13:20 jaguar sata: [ID 163988 kern.info] serial
number WD-WMAZA0555575
Jun 24 07:13:20 jaguar sata: [ID 594940 kern.info] supported features:
Jun 24 07:13:20 jaguar sata: [ID 981177 kern.info] 48-bit LBA, DMA,
Native Command Queueing, SMART, SMART self-test
Jun 24 07:13:20 jaguar sata: [ID 643337 kern.info] SATA Gen2
signaling speed (3.0Gbps)
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] Supported queue
depth 32
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] capacity =
3907029168 sectors
Jun 24 07:13:20 jaguar scsi: [ID 583861 kern.info] sd4 at ahci0: target
3 lun 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] sd4 is
/***@0,0/pci1458,***@1f,2/***@3,0
Jun 24 07:13:20 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1f,2/***@3,0 (sd4) online
Jun 24 07:13:20 jaguar sata: [ID 663010 kern.info]
/***@0,0/pci1458,***@1f,2 :
Jun 24 07:13:20 jaguar sata: [ID 761595 kern.info] SATA disk device
at port 4
Jun 24 07:13:20 jaguar sata: [ID 846691 kern.info] model WDC
WD20EARS-00MVWB0
Jun 24 07:13:20 jaguar sata: [ID 693010 kern.info] firmware 51.0AB51
Jun 24 07:13:20 jaguar sata: [ID 163988 kern.info] serial
number WD-WMAZA0484508
Jun 24 07:13:20 jaguar sata: [ID 594940 kern.info] supported features:
Jun 24 07:13:20 jaguar sata: [ID 981177 kern.info] 48-bit LBA, DMA,
Native Command Queueing, SMART, SMART self-test
Jun 24 07:13:20 jaguar sata: [ID 643337 kern.info] SATA Gen2
signaling speed (3.0Gbps)
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] Supported queue
depth 32
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] capacity =
3907029168 sectors
Jun 24 07:13:20 jaguar scsi: [ID 583861 kern.info] sd5 at ahci0: target
4 lun 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] sd5 is
/***@0,0/pci1458,***@1f,2/***@4,0
Jun 24 07:13:20 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1f,2/***@4,0 (sd5) online
Jun 24 07:13:20 jaguar sata: [ID 663010 kern.info]
/***@0,0/pci1458,***@1f,2 :
Jun 24 07:13:20 jaguar sata: [ID 761595 kern.info] SATA disk device
at port 5
Jun 24 07:13:20 jaguar sata: [ID 846691 kern.info] model SAMSUNG
HD154UI
Jun 24 07:13:20 jaguar sata: [ID 693010 kern.info] firmware 1AG01118
Jun 24 07:13:20 jaguar sata: [ID 163988 kern.info] serial number
S1Y6J90SB10084
Jun 24 07:13:20 jaguar sata: [ID 594940 kern.info] supported features:
Jun 24 07:13:20 jaguar sata: [ID 981177 kern.info] 48-bit LBA, DMA,
Native Command Queueing, SMART, SMART self-test
Jun 24 07:13:20 jaguar sata: [ID 643337 kern.info] SATA Gen2
signaling speed (3.0Gbps)
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] Supported queue
depth 32
Jun 24 07:13:20 jaguar sata: [ID 349649 kern.info] capacity =
2930277168 sectors
Jun 24 07:13:20 jaguar scsi: [ID 583861 kern.info] sd6 at ahci0: target
5 lun 0
Jun 24 07:13:20 jaguar genunix: [ID 936769 kern.info] sd6 is
/***@0,0/pci1458,***@1f,2/***@5,0
Jun 24 07:13:20 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1f,2/***@5,0 (sd6) online
Jun 24 07:13:20 jaguar zfs: [ID 249136 kern.info] imported version 0
pool rpool using 28
Jun 24 07:13:20 jaguar genunix: [ID 308332 kern.info] root on
rpool/ROOT/openindiana-3 fstype zfs
Jun 24 07:13:21 jaguar rootnex: [ID 349649 kern.info] acpinex0 at root
Jun 24 07:13:21 jaguar genunix: [ID 936769 kern.info] acpinex0 is /fw
Jun 24 07:13:21 jaguar acpinex: [ID 328922 kern.info] acpinex: ***@0,
cpudrv0
Jun 24 07:13:21 jaguar genunix: [ID 408114 kern.info] /fw/***@0
(cpudrv0) online
Jun 24 07:13:21 jaguar pseudo: [ID 129642 kern.info] pseudo-device: dld0
Jun 24 07:13:21 jaguar genunix: [ID 936769 kern.info] dld0 is /pseudo/***@0
Jun 24 07:13:21 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,0c0320 (ehci) instance 0 irq 0x12 vector 0x81 ioapic 0x2 intin
0x12 is bound to cpu 0
Jun 24 07:13:22 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1a,7, ehci0
Jun 24 07:13:22 jaguar genunix: [ID 936769 kern.info] ehci0 is
/***@0,0/pci1458,***@1a,7
Jun 24 07:13:22 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,0c0320 (ehci) instance 1 irq 0x17 vector 0x82 ioapic 0x2 intin
0x17 is bound to cpu 1
Jun 24 07:13:23 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1d,7, ehci1
Jun 24 07:13:23 jaguar genunix: [ID 936769 kern.info] ehci1 is
/***@0,0/pci1458,***@1d,7
Jun 24 07:13:23 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,0c0300 (uhci) instance 0 irq 0x10 vector 0x83 ioapic 0x2 intin
0x10 is bound to cpu 2
Jun 24 07:13:23 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1a, uhci0
Jun 24 07:13:23 jaguar genunix: [ID 936769 kern.info] uhci0 is
/***@0,0/pci1458,***@1a
Jun 24 07:13:23 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,0c0300 (uhci) instance 1 irq 0x15 vector 0x84 ioapic 0x2 intin
0x15 is bound to cpu 3
Jun 24 07:13:23 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1a,1, uhci1
Jun 24 07:13:23 jaguar genunix: [ID 936769 kern.info] uhci1 is
/***@0,0/pci1458,***@1a,1
Jun 24 07:13:23 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1a,2, uhci2
Jun 24 07:13:23 jaguar genunix: [ID 936769 kern.info] uhci2 is
/***@0,0/pci1458,***@1a,2
Jun 24 07:13:24 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1d, uhci3
Jun 24 07:13:24 jaguar genunix: [ID 936769 kern.info] uhci3 is
/***@0,0/pci1458,***@1d
Jun 24 07:13:24 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,0c0300 (uhci) instance 4 irq 0x13 vector 0x85 ioapic 0x2 intin
0x13 is bound to cpu 0
Jun 24 07:13:24 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1d,1, uhci4
Jun 24 07:13:24 jaguar genunix: [ID 936769 kern.info] uhci4 is
/***@0,0/pci1458,***@1d,1
Jun 24 07:13:24 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1d,2, uhci5
Jun 24 07:13:24 jaguar genunix: [ID 936769 kern.info] uhci5 is
/***@0,0/pci1458,***@1d,2
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu0: x86 (chipid 0x0
GenuineIntel 20655 family 6 model 37 step 5 clock 3067 MHz)
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu0: Intel(r)
Core(tm) i3 CPU 540 @ 3.07GHz
Jun 24 07:13:24 jaguar acpinex: [ID 328922 kern.info] acpinex: ***@1,
cpudrv1
Jun 24 07:13:24 jaguar genunix: [ID 408114 kern.info] /fw/***@1
(cpudrv1) online
Jun 24 07:13:24 jaguar acpinex: [ID 328922 kern.info] acpinex: ***@2,
cpudrv2
Jun 24 07:13:24 jaguar genunix: [ID 408114 kern.info] /fw/***@2
(cpudrv2) online
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu2: x86 (chipid 0x0
GenuineIntel 20655 family 6 model 37 step 5 clock 3067 MHz)
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu2: Intel(r)
Core(tm) i3 CPU 540 @ 3.07GHz
Jun 24 07:13:24 jaguar unix: [ID 557947 kern.info] cpu2 initialization
complete - online
Jun 24 07:13:24 jaguar acpinex: [ID 328922 kern.info] acpinex: ***@3,
cpudrv3
Jun 24 07:13:24 jaguar genunix: [ID 408114 kern.info] /fw/***@3
(cpudrv3) online
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu1: x86 (chipid 0x0
GenuineIntel 20655 family 6 model 37 step 5 clock 3067 MHz)
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu3: x86 (chipid 0x0
GenuineIntel 20655 family 6 model 37 step 5 clock 3067 MHz)
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu1: Intel(r)
Core(tm) i3 CPU 540 @ 3.07GHz
Jun 24 07:13:24 jaguar unix: [ID 950921 kern.info] cpu3: Intel(r)
Core(tm) i3 CPU 540 @ 3.07GHz
Jun 24 07:13:24 jaguar unix: [ID 557947 kern.info] cpu1 initialization
complete - online
Jun 24 07:13:24 jaguar unix: [ID 557947 kern.info] cpu3 initialization
complete - online
Jun 24 07:13:24 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci8086,***@1c, pcieb1
Jun 24 07:13:24 jaguar genunix: [ID 936769 kern.info] pcieb1 is
/***@0,0/pci8086,***@1c
Jun 24 07:13:24 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci8086,***@1c,4, pcieb2
Jun 24 07:13:24 jaguar genunix: [ID 936769 kern.info] pcieb2 is
/***@0,0/pci8086,***@1c,4
Jun 24 07:13:24 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci8086,***@1c,5, pcieb3
Jun 24 07:13:24 jaguar genunix: [ID 936769 kern.info] pcieb3 is
/***@0,0/pci8086,***@1c,5
Jun 24 07:13:25 jaguar usba: [ID 912658 kern.info] USB 2.0 device
(usb5e3,608) operating at hi speed (USB 2.x) on USB 2.0 root hub: ***@1,
hubd3 at bus address 2
Jun 24 07:13:25 jaguar usba: [ID 349649 kern.info] USB2.0 Hub
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] hubd3 is
/***@0,0/pci1458,***@1a,7/***@1
Jun 24 07:13:25 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@1 (hubd3) online
Jun 24 07:13:25 jaguar usba: [ID 912658 kern.info] USB 2.0 device
(usb5e3,608) operating at hi speed (USB 2.x) on USB 2.0 root hub: ***@6,
hubd1 at bus address 3
Jun 24 07:13:25 jaguar usba: [ID 349649 kern.info] USB2.0 Hub
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] hubd1 is
/***@0,0/pci1458,***@1a,7/***@6
Jun 24 07:13:25 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@6 (hubd1) online
Jun 24 07:13:25 jaguar usba: [ID 912658 kern.info] USB 1.10 device
(usb4b4,333) operating at low speed (USB 1.x) on USB 2.0 external hub:
***@1, hid5 at bus address 4
Jun 24 07:13:25 jaguar usba: [ID 349649 kern.info] Cypress Semi.
Optical Mouse for U+P
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] hid5 is
/***@0,0/pci1458,***@1a,7/***@6/***@1
Jun 24 07:13:25 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@6/***@1 (hid5) online
Jun 24 07:13:25 jaguar usba: [ID 912658 kern.info] USB 1.10 device
(usbd3d,1) operating at low speed (USB 1.x) on USB 2.0 external hub:
***@2, usb_mid2 at bus address 5
Jun 24 07:13:25 jaguar usba: [ID 349649 kern.info] USBPS2
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] usb_mid2 is
/***@0,0/pci1458,***@1a,7/***@6/***@2
Jun 24 07:13:25 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@6/***@2 (usb_mid2) online
Jun 24 07:13:25 jaguar usba: [ID 912658 kern.info] USB 1.10 interface
(usbifd3d,1.config1.0) operating at low speed (USB 1.x) on USB 2.0
external hub: ***@0, hid6 at bus address 5
Jun 24 07:13:25 jaguar usba: [ID 349649 kern.info] USBPS2
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] hid6 is
/***@0,0/pci1458,***@1a,7/***@6/***@2/***@0
Jun 24 07:13:25 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
stmf_sbd0
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] stmf_sbd0 is
/pseudo/***@0
Jun 24 07:13:25 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@6/***@2/***@0 (hid6) online
Jun 24 07:13:25 jaguar usba: [ID 912658 kern.info] USB 1.10 interface
(usbifd3d,1.config1.1) operating at low speed (USB 1.x) on USB 2.0
external hub: ***@1, hid7 at bus address 5
Jun 24 07:13:25 jaguar usba: [ID 349649 kern.info] USBPS2
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] hid7 is
/***@0,0/pci1458,***@1a,7/***@6/***@2/***@1
Jun 24 07:13:25 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@6/***@2/***@1 (hid7) online
Jun 24 07:13:25 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci8086,***@1e, pci_pci0
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] pci_pci0 is
/***@0,0/pci8086,***@1e
Jun 24 07:13:25 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pciclass,0c0010 (hci1394) instance 0 irq 0x11 vector 0x86 ioapic 0x2
intin 0x11 is bound to cpu 1
Jun 24 07:13:25 jaguar pci_pci: [ID 370704 kern.info] PCI-device:
pci1458,***@7, hci13940
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] hci13940 is
/***@0,0/pci8086,***@1e/pci1458,***@7
Jun 24 07:13:25 jaguar pseudo: [ID 129642 kern.info] pseudo-device: audio0
Jun 24 07:13:25 jaguar genunix: [ID 936769 kern.info] audio0 is
/pseudo/***@0
Jun 24 07:13:26 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp:
pci10ec,8168 (rge) instance 0 irq 0x19 vector 0x60 ioapic 0xff intin
0xff is bound to cpu 2
Jun 24 07:13:26 jaguar rge: [ID 801725 kern.info] NOTICE: rge0: Using
MSI interrupt type
Jun 24 07:13:26 jaguar mac: [ID 469746 kern.info] NOTICE: rge0 registered
Jun 24 07:13:27 jaguar pseudo: [ID 129642 kern.info] pseudo-device: zfs0
Jun 24 07:13:27 jaguar genunix: [ID 936769 kern.info] zfs0 is /pseudo/***@0
Jun 24 07:13:27 jaguar nwamd[76]: [ID 605049 daemon.error] 1:
nwamd_set_unset_link_properties: dladm_set_linkprop failed: operation
not supported
Jun 24 07:13:30 jaguar mac: [ID 435574 kern.info] NOTICE: rge0 link up,
1000 Mbps, full duplex
Jun 24 07:13:33 jaguar genunix: [ID 227219 kern.info] This Solaris
instance has UUID 351542cb-5131-c7ac-9876-e2c340b173a7
Jun 24 07:13:33 jaguar genunix: [ID 454863 kern.info] dump on
/dev/zvol/dsk/rpool/dump size 1979 MB
Jun 24 07:13:34 jaguar npe: [ID 236367 kern.info] PCI Express-device:
pci1458,***@1b, audiohd0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] audiohd0 is
/***@0,0/pci1458,***@1b
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: pm0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] pm0 is /pseudo/***@0
Jun 24 07:13:34 jaguar rootnex: [ID 349649 kern.info] iscsi0 at root
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] iscsi0 is /iscsi
Jun 24 07:13:34 jaguar rootnex: [ID 349649 kern.info] xsvc0 at root:
space 0 offset 0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] xsvc0 is /***@0,0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: power0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] power0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: srn0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] srn0 is /pseudo/***@0
Jun 24 07:13:34 jaguar /usr/lib/power/powerd: [ID 387247 daemon.error]
Able to open /dev/srn
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: devinfo0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] devinfo0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: pseudo1
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] pseudo1 is
/pseudo/***@1
Jun 24 07:13:34 jaguar acpinex: [ID 328922 kern.info] acpinex: ***@0,
acpinex1
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] acpinex1 is /fw/***@0
Jun 24 07:13:34 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp: asy
(asy) instance 0 irq 0x4 vector 0xb0 ioapic 0x2 intin 0x4 is bound to cpu 3
Jun 24 07:13:34 jaguar isa: [ID 202937 kern.info] ISA-device: asy0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] asy0 is
/***@0,0/***@1f/***@1,3f8
Jun 24 07:13:34 jaguar isa: [ID 202937 kern.info] ISA-device: pit_beep0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] pit_beep0 is
/***@0,0/***@1f/pit_beep
Jun 24 07:13:34 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:13:34 jaguar This may result in reduced system performance.
Jun 24 07:13:34 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:13:34 jaguar This may result in reduced system performance.
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: llc10
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] llc10 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: lofi0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] lofi0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
ramdisk1024
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] ramdisk1024 is
/pseudo/***@1024
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: ucode0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] ucode0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
nvidia255
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] nvidia255 is
/pseudo/***@255
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fcp0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] fcp0 is /pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: dcpc0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] dcpc0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: dtrace0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] dtrace0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
fasttrap0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] fasttrap0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fbt0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] fbt0 is /pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
lockstat0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] lockstat0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: profile0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] profile0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: sdt0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] sdt0 is /pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
systrace0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] systrace0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fcsm0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] fcsm0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fct0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] fct0 is /pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: stmf0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] stmf0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: pool0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] pool0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: bpf0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] bpf0 is /pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: winlock0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] winlock0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fssnap0
Jun 24 07:13:34 jaguar genunix: [ID 936769 kern.info] fssnap0 is
/pseudo/***@0
Jun 24 07:13:34 jaguar ipf: [ID 774698 kern.info] IP Filter: v4.1.9,
running.
Jun 24 07:13:35 jaguar pseudo: [ID 129642 kern.info] pseudo-device: nsmb0
Jun 24 07:13:35 jaguar genunix: [ID 936769 kern.info] nsmb0 is
/pseudo/***@0
Jun 24 07:13:36 jaguar console-kit-daemon[516]: [ID 702911
daemon.warning] WARNING: signal "open_session_request" (from
"OpenSessionRequest") exported but not found in object class "CkSeat"
Jun 24 07:13:36 jaguar console-kit-daemon[516]: [ID 702911
daemon.warning] WARNING: signal "close_session_request" (from
"CloseSessionRequest") exported but not found in object class "CkSeat"
Jun 24 07:13:36 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:13:36 jaguar This may result in reduced system performance.
Jun 24 07:13:36 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:13:36 jaguar This may result in reduced system performance.

---
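As a sanity check on the boot log above: the driver reports each drive's capacity in sectors, and (assuming 512-byte sectors) those counts convert to the expected nominal sizes:

```python
# Convert the logged sector counts to decimal terabytes,
# assuming 512-byte sectors as reported by the sata driver.
for name, sectors in [
    ("INTEL SSDSA2M040G2GC", 78165360),     # root mirror SSD
    ("ST32000542AS",         3907029168),   # 2 TB Seagate
    ("WDC WD20EARS-00MVWB0", 3907029168),   # 2 TB WD (x2 in the log)
    ("SAMSUNG HD154UI",      2930277168),   # 1.5 TB Samsung
]:
    print(f"{name}: {sectors * 512 / 1e12:.2f} TB")
```

So the enumeration matches the hardware I expected to see attached.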

...then I determined to unplug the external USB hard drive and got this...

---

Jun 24 07:20:52 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:20:52 jaguar This may result in reduced system performance.
Jun 24 07:20:52 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:20:52 jaguar This may result in reduced system performance.
Jun 24 07:45:43 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0:
ahci port 3 has task file error
Jun 24 07:45:43 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 07:45:43 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 07:45:43 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed
Jun 24 07:46:04 jaguar genunix: [ID 408114 kern.info]
/***@0,0/pci1458,***@1a,7/***@1 (hubd3) removed
Jun 24 07:46:04 jaguar rootnex: [ID 349649 kern.info] xsvc0 at root:
space 0 offset 0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] xsvc0 is /***@0,0
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/aszeszo.fastdev01.uk.openindiana.org/masterconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/aszeszo.fastdev01.uk.openindiana.org/zoneconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/trisk.fastdev01.uk.openindiana.org/masterconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/trisk.fastdev01.uk.openindiana.org/zoneconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/transcode.fastdev01.uk.openindiana.org/zoneconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/transcode.fastdev01.uk.openindiana.org/masterconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/tomww.fastdev01.uk.openindiana.org/zoneconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/tomww.fastdev01.uk.openindiana.org/masterconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/python.fastdev01.uk.openindiana.org/zoneconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/python.fastdev01.uk.openindiana.org/masterconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/jds.fastdev01.uk.openindiana.org/zoneconsole: No such file or
directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/jds.fastdev01.uk.openindiana.org/masterconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/sfw.fastdev01.uk.openindiana.org/zoneconsole: No such file or
directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/sfw.fastdev01.uk.openindiana.org/masterconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for /dev/zcons/jdsbuild/zoneconsole:
No such file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/jdsbuild/masterconsole: No such file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/onnv.fastdev01.uk.openindiana.org/masterconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/onnv.fastdev01.uk.openindiana.org/zoneconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/g11n.fastdev01.uk.openindiana.org/masterconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/g11n.fastdev01.uk.openindiana.org/zoneconsole: No such file
or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/alasdair.fastdev01.uk.openindiana.org/zoneconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/alasdair.fastdev01.uk.openindiana.org/masterconsole: No such
file or directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/test01.alasdair.openindiana.org/zoneconsole: No such file or
directory
Jun 24 07:46:04 jaguar devfsadmd[255]: [ID 317882 daemon.error]
build_devlink_list: readlink failed for
/dev/zcons/test01.alasdair.openindiana.org/masterconsole: No such file
or directory
Jun 24 07:46:04 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:46:04 jaguar This may result in reduced system performance.
Jun 24 07:46:04 jaguar unix: [ID 954099 kern.info] NOTICE: IRQ16 is
being shared by drivers with different interrupt levels.
Jun 24 07:46:04 jaguar This may result in reduced system performance.
Jun 24 07:46:04 jaguar pcplusmp: [ID 805372 kern.info] pcplusmp: asy
(asy) instance 0 irq 0x4 vector 0xb0 ioapic 0x2 intin 0x4 is bound to cpu 0
Jun 24 07:46:04 jaguar isa: [ID 202937 kern.info] ISA-device: asy0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] asy0 is
/***@0,0/***@1f/***@1,3f8
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: llc10
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] llc10 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: lofi0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] lofi0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
ramdisk1024
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] ramdisk1024 is
/pseudo/***@1024
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: ucode0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] ucode0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
nvidia255
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] nvidia255 is
/pseudo/***@255
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fcp0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] fcp0 is /pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: dcpc0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] dcpc0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
fasttrap0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] fasttrap0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fbt0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] fbt0 is /pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
lockstat0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] lockstat0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: profile0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] profile0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: sdt0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] sdt0 is /pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device:
systrace0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] systrace0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fcsm0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] fcsm0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fct0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] fct0 is /pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: stmf0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] stmf0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: bpf0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] bpf0 is /pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: winlock0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] winlock0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: fssnap0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] fssnap0 is
/pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: pm0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] pm0 is /pseudo/***@0
Jun 24 07:46:04 jaguar pseudo: [ID 129642 kern.info] pseudo-device: nsmb0
Jun 24 07:46:04 jaguar genunix: [ID 936769 kern.info] nsmb0 is
/pseudo/***@0


---

... and it has stopped complaining for a few minutes now.

However, the terminal was giving me...

"Jun 24 07:06:04 jaguar devfsadmd[255]: build_devlink_list:readkubj
fauked fir /dev/zcons/test01.alasdair.openindiana.org/masterconsule: No
such file or directory

I'm also unsure of the event at 07:13:27 - nwamd[76]: [ID 605049
daemon.error] 1: nwamd_set_unset_link_properties: dladm_set_linkprop
failed: operation not supported

Could the lack of free space on the ZFS set also have caused a problem,
or is it likely that the weight of another problem, possibly the USB
external drive connection, caused it to keel over?
Jan Owoc
2012-06-24 16:44:33 UTC
Permalink
Post by michelle
Situation - home-type standard PC, 4gig of RAM, running two SSDs in a
mirrored root raid pool. Three 2tb hard drives in a raidz.
System is...
            OpenIndiana Development oi_151.1.4 X86 (powered by illumos)
[...]
Post by michelle
I have an external, "toaster" which takes two hard drives, one is connected
via e-sata; the other is running via USB (although for this instance, there
was no drive in teh socket) because I've had a long running battle to try
and get an affordable (to me) e-sata card that will give me another e-sata
channel.
So it's a 2-bay RAID enclosure with either USB or eSATA connections.
One of the two bays is occupied - and how is the enclosure being
connected?
Post by michelle
The ZFS set was getting full; something like only 50gig free.
You probably have two zpools - one is the mirrored "rpool", while the
other is your data pool, say, "tank". Am I understanding that it's
"tank" that has 50 GB (out of 4TB) free, while "rpool" does not have
any problems?
Post by michelle
I was starting
file copies off the server to an external drive via an SMB client, and going
to bed, to wake up and find the process had frozen. Diagnosis led me to the
server, which appeared to hang on any log on attempt. It even didn't listen
to the power button properly.
Is this external drive the RAID enclosure discussed above, connected
via Ethernet and showing up as an SMB device, or is it attached to a
separate computer?


My thoughts are:

1) if "rpool" is not full, then the system should not freeze.
Depending on any snapshots, even removing files from "tank" may fail
and if you do it via SMB, as opposed to over a local command line, you
won't know why. Could you try logging on to the system, and copying
the files from the server to a client, so you see any local error
messages on the command line?

2) if, for some reason, data corruption crept in, ZFS will refuse to return
bad data. Could you run a "zpool scrub" on each of the pools and then
verify that they are healthy?
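The scrub-and-verify step above can be sketched as a couple of commands.
The pool names "rpool" and "tank" are just this thread's examples, and the
DRYRUN wrapper is my own addition so the commands are echoed for review
rather than run blindly:

```shell
# Sketch of the scrub suggestion (pool names are this thread's examples).
# With DRYRUN=1 (the default here) commands are only echoed; set DRYRUN=0
# to actually run them.
DRYRUN=${DRYRUN:-1}
run() {
    if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}
for pool in rpool tank; do
    run zpool scrub "$pool"
done
# Once the scrubs finish, -x prints only pools that have problems.
run zpool status -x
```

Note that a scrub runs in the background; `zpool status` shows its progress.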
Post by michelle
Could the lack of free space on the ZFS set also have caused a problem, or
is it likely that the weight of another problem, possibly the USB external
drive connection, caused it to keel over?
Not sure if it will help, but could you give details on what the
enclosure is (brand/model), and what motherboard/USB controller are on
the server?


Jan
michelle
2012-06-24 17:00:35 UTC
Permalink
The motherboard is a Gigabyte GA-H55M-UD2H and has five SATA sockets,
two IDE and one E-sata.

The SATA are, I believe, Intel.

The mb has five internal sata - two are rpool which are SSDs with plenty
of space on them.

The other three are given over to a "tank" mounted at /mirror.

The external toaster is a Sharkoon QuickPort Duo II, where drive 1 is
connected via E-sata and, when I connect the second, it is via USB
because I only have one e-sata port.

I believe that disconnecting the USB has stopped the IRQ resource
problem, but now I am getting this...

Jun 24 15:59:46 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 15:59:46 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 15:59:46 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed
Jun 24 15:59:46 jaguar ahci: [ID 811322 kern.info] NOTICE: ahci0:
ahci_tran_reset_dport port 3 reset device
Jun 24 15:59:51 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0:
ahci port 3 has task file error
Jun 24 15:59:51 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 15:59:51 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 15:59:51 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed
Jun 24 15:59:54 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0:
ahci port 3 has task file error
Jun 24 15:59:54 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 15:59:54 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 15:59:54 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed
Jun 24 16:00:00 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0:
ahci port 3 has task file error
Jun 24 16:00:00 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 16:00:00 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 16:00:00 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed


The set is usually automatically scrubbed once a month.

I have removed about 200 gig of data and it seems to be stable-ish.

I'll begin another scrub of the tank now. It will likely take 10 hours.
_______________________________________________
OpenIndiana-discuss mailing list
http://openindiana.org/mailman/listinfo/openindiana-discuss
michelle
2012-06-25 06:30:55 UTC
Permalink
Came back to it this morning; there had been errors. However, now the
system has crashed.

It was responding to console, but when I asked for a zpool status, the
terminal froze.

I had another terminal that was already logged on, and gave an init 6,
but the server didn't respond. It's just sitting there.

---

The end result was that I had to hit the power again.

Reboot and log on gave zpool result of...

***@jaguar:~# zpool status
pool: data
state: ONLINE
scan: scrub in progress since Sun Jun 24 18:00:28 2012
1.38T scanned out of 5.03T at 177K/s, (scan is slow, no estimated time)
3.22M repaired, 27.53% done
config:

NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
c2t2d0 ONLINE 0 0 0
c2t3d0 ONLINE 0 0 0 (repairing)
c2t4d0 ONLINE 0 0 0

errors: No known data errors

pool: rpool
state: ONLINE
scan: scrub repaired 0 in 0h1m with 0 errors on Sat Jun 9 00:01:26 2012
config:

NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0s0 ONLINE 0 0 0
c2t1d0s0 ONLINE 0 0 0

errors: No known data errors

.. and this is the messages at about the time of the crash...

Jun 24 21:20:55 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed
Jun 24 21:20:58 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0:
ahci port 3 has task file error
Jun 24 21:20:58 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 21:20:58 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 21:21:00 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed
Jun 24 21:21:27 jaguar sata: [ID 801845 kern.info]
/***@0,0/pci1458,***@1f,2:
Jun 24 21:21:27 jaguar SATA port 3 error
Jun 24 21:21:27 jaguar sata: [ID 801845 kern.info]
/***@0,0/pci1458,***@1f,2:
Jun 24 21:21:27 jaguar SATA port 3 error
Jun 24 21:21:27 jaguar sata: [ID 801845 kern.info]
/***@0,0/pci1458,***@1f,2:
Jun 24 21:21:27 jaguar SATA port 3 error
Jun 24 21:21:27 jaguar sata: [ID 801845 kern.info]
/***@0,0/pci1458,***@1f,2:
Jun 24 21:21:27 jaguar SATA port 3 error
Jun 24 21:21:42 jaguar fmd: [ID 377184 daemon.error] SUNW-MSG-ID:
ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
Jun 24 21:21:42 jaguar EVENT-TIME: Sun Jun 24 21:21:42 BST 2012
Jun 24 21:21:42 jaguar PLATFORM: H55M-UD2H, CSN: -, HOSTNAME: jaguar
Jun 24 21:21:42 jaguar SOURCE: zfs-diagnosis, REV: 1.0
Jun 24 21:21:42 jaguar EVENT-ID: fbae1f99-ef10-ca46-c308-958e66bd0ddb
Jun 24 21:21:42 jaguar DESC: The number of I/O errors associated with a
ZFS device exceeded
Jun 24 21:21:42 jaguar acceptable levels. Refer to
http://illumos.org/msg/ZFS-8000-FD for more information.
Jun 24 21:21:42 jaguar AUTO-RESPONSE: The device has been offlined and
marked as faulted. An attempt
Jun 24 21:21:42 jaguar will be made to activate a hot spare if
available.
Jun 24 21:21:42 jaguar IMPACT: Fault tolerance of the pool may be
compromised.
Jun 24 21:21:42 jaguar REC-ACTION: Run 'zpool status -x' and replace the
bad device.
Jun 24 21:21:55 jaguar ahci: [ID 517647 kern.warning] WARNING: ahci0:
watchdog port 3 satapkt 0xffffff020f879888 timed out
Jun 25 07:24:04 jaguar genunix: [ID 108120 kern.notice] ^MOpenIndiana
Build oi_151a4 64-bit (illumos 13676:98ca40df9171)
Jun 25 07:24:04 jaguar genunix: [ID 107366 kern.notice] SunOS Release
5.11 - Copyright 1983-2010 Oracle and/or its affiliates.
Jun 25 07:24:04 jaguar genunix: [ID 864463 kern.notice] All rights
reserved. Use is subject to license terms.
Jun 25 07:24:04 jaguar unix: [ID 223955 kern.info] x86_feature: lgpg

On the face of it, a drive failure has taken the system down.

I'm going to have to check the connections,
michelle
2012-06-25 18:08:02 UTC
Permalink
OK - my back is up against the wall.

***@jaguar:~# zpool status
pool: data
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
scan: scrub repaired 7.10M in 24h30m with 0 errors on Mon Jun 25
18:30:48 2012
config:

NAME STATE READ WRITE CKSUM
data DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
c2t2d0 ONLINE 0 0 0
c2t3d0 FAULTED 0 102 0 too many errors
c2t4d0 ONLINE 0 0 0

errors: No known data errors


The cables appear fine, so I'm dealing with either a controller issue
or a hard drive issue; I don't know how to interpret those earlier
entries in "messages".

I need help please to make the right call, before I potentially lose
4 TB of files.
Udo Grabowski (IMK)
2012-06-25 18:21:08 UTC
Permalink
Post by michelle
OK - my back is up against the wall.
pool: data
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
scan: scrub repaired 7.10M in 24h30m with 0 errors on Mon Jun 25
18:30:48 2012
NAME STATE READ WRITE CKSUM
data DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
c2t2d0 ONLINE 0 0 0
c2t3d0 FAULTED 0 102 0 too many errors
c2t4d0 ONLINE 0 0 0
errors: No known data errors
The cables appear fine, so I'm dealing with either a controller issue,
or a hard drive issue, I don't know how to interpret those earlier
"messages" messages.
I need help please to make the right call, before I potentially loose
4tb of files.
Probably a failing drive. Use 'fmdump -eV' for the SCSI errors; the
dev links in /dev/rdsk/c2t3d0 give you the device path to look for.
Use 'iostat -Exn' to get the SMART errors for that drive, and
/var/adm/messages probably has some details. 'fmadm faulty' output
will convince your support (if you have one) to replace the drive.
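The device-path lookup mentioned above is just a symlink read: the
/dev/rdsk entries point into the /devices tree. A tiny illustration
(the helper name is mine, not a system command):

```shell
# /dev/rdsk entries are symlinks into the /devices tree; printing the
# link target gives the physical device path to match against fmdump
# and /var/adm/messages output. (devpath is my helper, not a system tool.)
devpath() {
    ls -l "$1" | awk '{ print $NF }'   # last field of ls -l is the target
}
# usage: devpath /dev/rdsk/c2t3d0s0
```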
--
Dr.Udo Grabowski Inst.f.Meteorology a.Climate Research IMK-ASF-SAT
www-imk.fzk.de/asf/sat/grabowski/ www.imk-asf.kit.edu/english/sat.php
KIT - Karlsruhe Institute of Technology http://www.kit.edu
Postfach 3640,76021 Karlsruhe,Germany T:(+49)721 608-26026 F:-926026
Edward M
2012-06-25 18:34:23 UTC
Permalink
Post by michelle
The cables appear fine, so I'm dealing with either a controller issue,
or a hard drive issue, I don't know how to interpret those earlier
"messages" messages.
Appears it is the HD. Have you looked at this doc, titled:

Too many I/O errors on ZFS device

reference:
http://illumos.org/msg/ZFS-8000-FD

The action section mentions how to fix it. Hope it helps. :-)
Sašo Kiselkov
2012-06-25 18:40:30 UTC
Permalink
Post by michelle
OK - my back is up against the wall.
pool: data
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
scan: scrub repaired 7.10M in 24h30m with 0 errors on Mon Jun 25
18:30:48 2012
NAME STATE READ WRITE CKSUM
data DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
c2t2d0 ONLINE 0 0 0
c2t3d0 FAULTED 0 102 0 too many errors
c2t4d0 ONLINE 0 0 0
errors: No known data errors
The cables appear fine, so I'm dealing with either a controller issue,
or a hard drive issue, I don't know how to interpret those earlier
"messages" messages.
I need help please to make the right call, before I potentially loose
4tb of files.
Have you tried running "iostat -En c2t3d0 1" and watching it while
placing load on the machine? I've had bad SAS cabling give me weird ZFS
behavior. Meanwhile, I went on a hunt for all sorts of firmware issues,
not realizing it was the physical layer that bit me.

--
Saso
michelle
2012-06-25 18:50:07 UTC
Permalink
Well, right now I can't do anything.

I asked it to unmount the data set and it said it was busy.

I checked all my client links and everything was closed, so I asked it
to export again. It still said it was busy.

I decided to use -f and then it froze.

I logged in to another terminal session and issued an init 0, but the
box just sits there and refuses to go down.
michelle
2012-06-25 19:31:27 UTC
Permalink
I did a hard reset and moved the drive to another channel.

The fault followed the drive so I'm certain it is the drive, as people
have said.

The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.

I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
Dan Swartzendruber
2012-06-25 19:34:05 UTC
Permalink
Post by michelle
I did a hard reset and moved the drive to another channel.
The fault followed the drive so I'm certain it is the drive, as people
have said.
The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.
I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
Given that Nexenta has (basically) the same underlying OS as OI, you may
be in for a disappointment if you think this will help you. Can't speak
to Schillix.
Sašo Kiselkov
2012-06-25 19:52:03 UTC
Permalink
Post by michelle
I did a hard reset and moved the drive to another channel.
The fault followed the drive so I'm certain it is the drive, as people
have said.
Then return it for warranty repairs or get a new one. SMART data should
help you get a clearer picture of exactly what's going on too.
Post by michelle
The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.
You're obviously dealing with very bad hardware behaving in very
unpredictable ways. It is true that ZFS was built to run on inexpensive
hardware; however, the ways in which badly behaving controllers can
screw you over can far exceed ZFS's ability to cope. Also,
"inexpensive" is a flexible term - SAS storage is inexpensive compared
to FC, but nowhere near "on-board SATA ports"-inexpensive.

It is likely that what you're seeing isn't so much the fault of ZFS
itself, but rather the SATA HBA not being able to cope with strange
device behavior (your logs actually tell that story) - don't make the
mistake of thinking all SATA chips can cope with hot-plug or weird bus
errors. Enterprise vendors, even with SATA hardware, test the various
failure modes and make sure things go down in at least a somewhat
predictable manner.
Post by michelle
I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
I can't speak for FreeNAS, but Nexenta and Schillix share practically
the same OS kernel underneath, so I don't think you'll get different
behavior from them. As I said, I think it was most probably caused by
your on-board SATA HBA anyway; no matter the OS, badly done hardware
tends to carry over into the software world.

--
Saso
Ray Arachelian
2012-06-25 21:06:07 UTC
Permalink
Post by michelle
I did a hard reset and moved the drive to another channel.
The fault followed the drive so I'm certain it is the drive, as people
have said.
The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.
I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
My guess is that since all pools by default have failmode set to "wait"
on failure, it'll wait forever.

You could change it to "continue", which will return an error instead,
but that could lead to worse behavior in some cases. (You could also set
it to "panic", but you'll like that far less.)
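For reference, the failmode property being discussed can be inspected and
changed as below. The pool name "data" is taken from this thread, and the
DRYRUN wrapper is my own addition so the commands are echoed rather than
run; weigh the "wait"/"continue"/"panic" trade-offs first.

```shell
# Hedged sketch: inspect and change the pool failmode property. DRYRUN=1
# (the default here) only echoes the commands; pool name "data" is from
# this thread.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }
run zpool get failmode data            # one of: wait | continue | panic
run zpool set failmode=continue data   # return EIO instead of blocking
```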
ken mays
2012-06-25 22:50:55 UTC
Permalink
Feel free to test the included 32-bit build of openusb 1.1.6 for oi_151a at:

https://www.illumos.org/issues/2934

An update (or patch) may resolve a few reported bugs in USB I/O transfers.

The original openusb 1.0.1 userland port is maintained upstream.

~ Ken Mays
Richard Elling
2012-06-26 00:19:45 UTC
Permalink
Post by Ray Arachelian
Post by michelle
I did a hard reset and moved the drive to another channel.
The fault followed the drive so I'm certain it is the drive, as people
have said.
The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.
I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
My guess is that since all the pools are by default set to have failmode
set to "wait" on failure, it'll wait forever.
I do not believe this is a case where failmode is used -- the pool state is
not FAULTED.

In my experience, this is a case where the disk is there, accepting
commands, but not actually responding. Eventually (by default on most
Solaris derivatives, after 60 seconds * 3 retries), something else is
attempted, which also fails. Also in my experience, I see this on
consumer-grade drives (qv TLER discussions). However, we can't know more
until we see the "fmdump -eV" error log descriptive output.
-- richard
--
ZFS storage and performance consulting at http://www.RichardElling.com
michelle
2012-06-26 00:34:24 UTC
Permalink
Apologies,

This went to an individual rather than back to the group.

----

Thanks for the response.

The thing that set off major alarms in my head is the fact that these
errors caused OI to freeze up to the degree where it needed to be
powered off. It would acknowledge the power switch instruction, but
wouldn't power down.

Messages is full of ...

Jun 24 21:20:58 jaguar ahci: [ID 296163 kern.warning] WARNING: ahci0:
ahci port 3 has task file error
Jun 24 21:20:58 jaguar ahci: [ID 687168 kern.warning] WARNING: ahci0:
ahci port 3 is trying to do error recovery
Jun 24 21:20:58 jaguar ahci: [ID 693748 kern.warning] WARNING: ahci0:
ahci port 3 task_file_status = 0x4041
Jun 24 21:21:00 jaguar ahci: [ID 657156 kern.warning] WARNING: ahci0:
error recovery for port 3 succeed

The fmdump gives loads of ...

Jun 25 2012 17:20:59.709529495 ereport.fs.zfs.probe_failure
nvlist version: 0
class = ereport.fs.zfs.probe_failure
ena = 0x9af81a688900c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = zfs
pool = 0xb4f18a4cb65803ce
vdev = 0xe8c24676aa3c2a12
(end detector)

pool = data
pool_guid = 0xb4f18a4cb65803ce
pool_context = 0
pool_failmode = wait
vdev_guid = 0xe8c24676aa3c2a12
vdev_type = disk
vdev_path = /dev/dsk/c2t3d0s0
vdev_devid =
id1,***@SATA_____WDC_WD20EARS-00M_____WD-WMAZA0555575/a
parent_guid = 0x394c2bd12b4ffcfc
parent_type = raidz
prev_state = 0x0
__ttl = 0x1
__tod = 0x4fe88feb 0x2a4a8f97

iostat gives...

***@jaguar:~# iostat -Exn
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 1.1 3.4 8.4 0.0 0.0 0.5 0.5 0 0 c2t0d0
0.1 1.1 3.5 8.4 0.0 0.0 0.5 0.5 0 0 c2t1d0
332.1 6.6 29970.5 65.3 0.1 2.6 0.2 7.7 2 39 c2t2d0
261.3 6.6 23380.2 61.2 0.5 4.4 1.8 16.3 8 68 c2t3d0
373.4 7.8 29970.7 65.2 0.1 2.9 0.3 7.6 2 47 c2t4d0
c2t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: INTEL SSDSA2M040 Revision: 02HB Serial No:
CVGB949301PH040
Size: 40.02GB <40020664320 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 36 Predictive Failure Analysis: 0
c2t1d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: INTEL SSDSA2M040 Revision: 02HB Serial No:
CVGB949301PC040
Size: 40.02GB <40019582464 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 35 Predictive Failure Analysis: 0
c2t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: ST32000542AS Revision: CC34 Serial No:
5XW17ARW
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 36 Predictive Failure Analysis: 0
c2t3d0 Soft Errors: 0 Hard Errors: 795 Transport Errors: 0
Vendor: ATA Product: WDC WD20EARS-00M Revision: AB51 Serial No:
WD-WMAZA0555575
Size: 2000.40GB <2000398934016 bytes>
Media Error: 370 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 70 Predictive Failure Analysis: 0
c2t4d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD20EARS-00M Revision: AB51 Serial No:
WD-WMAZA0484508
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 36 Predictive Failure Analysis: 0
Mike La Spina
2012-06-26 04:36:41 UTC
Permalink
The error reported as task_file_status = 0x4041 on port 3 is the result
of an ATA response in which the PxIS.TFES bit was set. The ahci driver
must do a port reset at that point. The SATA disk failed to perform an
operation and reported it. I suspect this is not a data error and may be
more along the lines of a firmware fault on the SATA controller. So it
may well be a faulty disk failing in the most unruly way, i.e.
conditionally: it could be anything around or on the ATA chain, a bad
cable, running too hot, interference, mobo firmware ... etc.
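[Editorial aside: on my reading of the AHCI task-file register layout (treat
this as a sketch, not gospel), the low byte of the logged value is the ATA
status register and the next byte is the ATA error register, so 0x4041 splits
into status 0x41 (DRDY | ERR) and error 0x40, the uncorrectable-data (UNC)
bit -- which lines up with the media errors iostat reports for c2t3d0.]

```shell
# Split the logged task_file_status into its ATA status and error bytes.
tfs=0x4041
status=$(( tfs & 0xff ))        # low byte: ATA status register
error=$(( tfs >> 8 & 0xff ))    # next byte: ATA error register
printf 'status=0x%02x error=0x%02x\n' "$status" "$error"
# status 0x41 = DRDY | ERR; error 0x40 = UNC (uncorrectable data)
```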

michelle
2012-06-26 06:52:14 UTC
Permalink
Many thanks to all.

I now have a better understanding of what is happening, why OI locked up
the way it did, and that, with the exception of FreeNAS (ZFS v15) and
NAS4Free (which claims ZFS v28), it would be the same story with
Schillix and NexentaStor.

I've sold some of my possessions in order to buy the remaining 3tb
drives I need, so while they take their time to arrive, I'll try to get
hold of a machine and try out NAS4Free, as its web-based control system
looks nice and central while still offering a CLI.

Thanks once again; now I have the information I need to make some solid
decisions.
Mike La Spina
2012-06-26 00:12:30 UTC
Permalink
I have observed the same SATA hard disk error-wait behaviour across many
operating systems. It's a SATA hardware issue; I have even observed it
on expensive high-end storage servers (HP, IBM, etc.). The SATA disk or
subsystem keeps trying to correct/recover errors itself when it should
not, and should simply return the fault to the caller (ZFS in this
case). ZFS can handle the issue much more diplomatically and will
correct the data fault immediately. Don't blame ZFS; it is well
designed. SATA, on the other hand, is trying hard to do a bad thing.

-----Original Message-----
From: michelle [mailto:***@msknight.com]
Sent: Monday, June 25, 2012 2:31 PM
To: Discussion list for OpenIndiana
Subject: Re: [OpenIndiana-discuss] Diagnosis help needed

I did a hard reset and moved the drive to another channel.

The fault followed the drive so I'm certain it is the drive, as people
have said.

The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.

I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
Martin Frost
2012-06-25 21:20:32 UTC
Permalink
Date: Mon, 25 Jun 2012 17:06:07 -0400
OpenPGP: id=E556D4A0
Post by michelle
I did a hard reset and moved the drive to another channel.
The fault followed the drive so I'm certain it is the drive, as people
have said.
The thing that bugs me is that this ZFS fault locked up the OS - and
that's a real concern.
I think I'm going to need to have a hard think about my options and
possibly leave OI for FreeNAS, Nexenta or Schillix.
Post by Ray Arachelian
My guess is that since all the pools are by default set to have failmode
set to "wait" on failure, it'll wait forever.
Now, changing it to "continue" will return an error instead, but it
could lead to worse behavior in some cases. (You could set it to panic
also, but you'll like that far less.)
Why should it return an error when no data has been lost yet, given
the redundancy (which is all that's lost so far)?

Martin
Ray Arachelian
2012-06-26 10:54:44 UTC
Permalink
Post by Martin Frost
Post by Ray Arachelian
My guess is that since all the pools are by default set to have failmode
set to "wait" on failure, it'll wait forever.
Now, changing it to "continue" will return an error instead, but it
could lead to worse behavior in some cases. (You could set it to panic
also, but you'll like that far less.)
Why should it return an error when no data has been lost yet, given
the redundancy (which is all that's lost so far)?
Exactly my point: in some cases, such as this, you want it to wait; in
others, where there's no chance of the device ever coming back, you don't.
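[Editorial aside: for anyone wanting to experiment with the trade-off Martin
and Ray are debating, failmode is an ordinary pool property. The commands
below assume the pool name "data" from earlier in the thread and need a live
pool to run against.]

```shell
# Show the current failure-mode policy. The ZFS default is "wait",
# which blocks I/O until the device recovers or is detached -- the
# hang-like behaviour described earlier in this thread.
zpool get failmode data

# "continue" returns EIO to new I/O instead of blocking; "panic"
# crash-dumps the host. Neither repairs anything -- they only change
# how the pool behaves while a device is unreachable.
zpool set failmode=continue data
```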