So, while I was away for a week in the Lake District just gone, I noticed on Friday that all the email alerts for my home network, and more importantly this web server, had stopped flooding my inbox. Some might say that's a small relief, but as I was on holiday I decided that A) there was nothing I could do remotely and B) I would check the firewall and resolve the issue when I got home.

I should add that for the last couple of weeks I'd been noticing more and more network drops: for no apparent reason the WAN link would just drop, no errors or warnings, just *poof*, gone. The network infrastructure people reading this will jump straight to “Oh, it’s a SPOF (Single Point of Failure)”, what with there being only one WAN connection and no redundancy or 3G/4G backup, and so on.
Yes, I realise this, and it may change after this blog post and spate of outages, but as there’s currently no financial return from this project/network, it’s not a major problem, just an annoyance.
Anyway, that aside, prior to leaving for the Lake District I installed smartmontools (which provides smartctl) on the OPNSense firewall, ran some checks and found the following:
sudo pkg install smartmontools
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital RE4
Device Model:     WDC WD2503ABYX-01WERA1
Serial Number:    WD-WMAYS0079516
LU WWN Device Id: 5 0014ee 003734f6b
Firmware Version: 01.01S02
User Capacity:    251,059,544,064 bytes [251 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Jul 13 15:53:00 2019 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
SMART REPORT – Notice those lovely Pre-Failure Warning Flags!
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    167
  3 Spin_Up_Time            POS--K   150   141   021    -    3491
  4 Start_Stop_Count        -O--CK   100   100   000    -    89
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   046   046   000    -    39634
 10 Spin_Retry_Count        -O--CK   100   253   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    87
192 Power-Off_Retract_Count -O--CK   200   200   000    -    78
193 Load_Cycle_Count        -O--CK   200   200   000    -    10
194 Temperature_Celsius     -O---K   110   092   000    -    33
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   199   000    -    0
198 Offline_Uncorrectable   ----CK   200   199   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning
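For anyone who would rather not eyeball that table, here is a minimal Python sketch that parses a few of the rows from the report above. One thing worth knowing while reading it: the P flag marks an attribute as being of the pre-failure type, and it is the FAIL column (a "-" on every row here) that records whether a threshold has actually been crossed. The function names and the `needs_attention` rule of thumb are my own assumptions for illustration, not something smartctl provides.

```python
import re

# A few rows copied from the report above (table layout with a FLAGS column).
SAMPLE = """\
  1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    167
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  9 Power_On_Hours          -O--CK   046   046   000    -    39634
197 Current_Pending_Sector  -O--CK   200   199   000    -    0
"""

# One attribute row: ID, name, six flag characters, VALUE, WORST, THRESH, FAIL, RAW.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flags>[A-Z-]{6})\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<fail>\S+)\s+(?P<raw>\d+)"
)


def parse_attributes(text):
    """Turn attribute rows into dictionaries; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        rows.append({
            "id": int(m["id"]),
            "name": m["name"],
            "prefail": m["flags"][0] == "P",  # first flag column is P
            "value": int(m["value"]),
            "worst": int(m["worst"]),
            "thresh": int(m["thresh"]),
            "fail": m["fail"],
            "raw": int(m["raw"]),
        })
    return rows


def needs_attention(row):
    """Hypothetical rule of thumb: FAIL is set, or VALUE has sunk to THRESH."""
    failed = row["fail"] != "-"
    at_threshold = row["thresh"] > 0 and row["value"] <= row["thresh"]
    return failed or at_threshold


if __name__ == "__main__":
    for row in parse_attributes(SAMPLE):
        kind = "pre-fail" if row["prefail"] else "old-age"
        print(row["name"], kind, "CHECK" if needs_attention(row) else "ok")
```

By this measure none of the sampled rows has crossed a threshold, so what stands out in the report is the non-zero Raw_Read_Error_Rate and the 39,634 power-on hours (about four and a half years of spinning) rather than a hard SMART failure.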
So my temporary solution, until I replace the physical firewall, was to purchase a Crucial BX500 120GB 2.5″ SSD (Amazon Link) and a 3.5″ to 2.5″ conversion tray kit (Amazon Link).
After 45 minutes of de-racking the firewall, it was time to open the Sophos UTM unit, give it a spring clean and replace the failing Western Digital hard drive.
A Top-Down View of the Sophos UTM 220 (Data Sheet Download)
Amazon Purchases arrived, Quick Photo before installation.
SSD installed into the conversion tray, ready for the soft test to ensure that the motherboard actually detects the SSD. SPOILER: it was detected!
After replacing the disk, I downloaded the latest version of OPNSense, created a bootable USB using Rufus and ran through the installation process. Once the base installation of OPNSense was complete, I logged in, configured the WAN interface for the BT Business Line and then wanted to check the status of the new Crucial SSD. By default, ‘smartmontools’ is not installed on the FreeBSD-based firewall, so I used the following command:
sudo pkg install smartmontools
Once ‘smartctl’ was installed, I ran the following command to find out which device node the Crucial SSD had been assigned:
sudo smartctl --scan
The scan reported the following device:
/dev/ada0 -d atacam # /dev/ada0, ATA device
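If there were more than one disk in the box, picking the device names out by hand would get old quickly. As a rough sketch (the `parse_scan` helper is my own, hypothetical name, and it assumes every line follows the same "device -d type # comment" layout as the one above), the scan output could be split up like this:

```python
def parse_scan(text):
    """Return (device, type) pairs from 'smartctl --scan' style output."""
    devices = []
    for line in text.splitlines():
        # Drop the trailing '# ...' comment, then split what is left on whitespace.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 3 and entry[1] == "-d":
            devices.append((entry[0], entry[2]))
    return devices


# The single line reported by the firewall above.
SCAN_OUTPUT = "/dev/ada0 -d atacam # /dev/ada0, ATA device\n"

print(parse_scan(SCAN_OUTPUT))  # [('/dev/ada0', 'atacam')]
```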
Then, using smartctl -i /dev/ada0, I was able to obtain the following information:
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.2-RELEASE-p9-HBSD amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Crucial/Micron BX/MX1/2/3/500, M5/600, 1100 SSDs
Device Model:     CT120BX500SSD1
Serial Number:    1921E1833763
LU WWN Device Id: 0 000000 000000000
Firmware Version: M6CR013
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Jul 13 19:45:18 2019 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
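One detail in that output is easy to miss: the SSD advertises 6.0 Gb/s but the link is currently running at 3.0 Gb/s, which I assume is down to the SATA controller in the UTM 220 (the old WD drive reported 3.0 Gb/s as well). Below is a small sketch that pulls the "Key: value" pairs out of the information section and extracts both speeds. `parse_info` and `link_speeds` are hypothetical helper names of my own, and note that a repeated key such as "SMART support is" simply keeps its last value.

```python
import re

# Three lines copied from the information section above.
INFO = """\
Device Model:     CT120BX500SSD1
Rotation Rate:    Solid State Device
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
"""


def parse_info(text):
    """Split 'Key: value' lines into a dict; lines without a value are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def link_speeds(sata_line):
    """Return (maximum, current) link speed in Gb/s, or None if not present."""
    m = re.search(r"([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)", sata_line)
    return (float(m.group(1)), float(m.group(2))) if m else None


if __name__ == "__main__":
    info = parse_info(INFO)
    maximum, current = link_speeds(info["SATA Version is"])
    if current < maximum:
        print(f"{info['Device Model']} is linked at {current} of {maximum} Gb/s")
```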
With that, I was happy, as I’ll be able to use the SMART support to monitor the life of the SSD. I then restored the rest of the config and was back in business!
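As for how that monitoring might look, one option is a small script run from cron that calls smartctl and decodes its exit status, which the smartctl man page documents as a bitmask. The sketch below is only that, a sketch: the `check_disk` wrapper, the default of /dev/ada0 and the wording of the messages are my own assumptions, and smartmontools also ships the smartd daemon, which is the more usual way to do this.

```python
import subprocess

# Meaning of each exit-status bit, paraphrased from the smartctl man page.
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed or device is in a low-power mode",
    2: "a SMART or ATA command to the disk failed",
    3: "SMART status check returned DISK FAILING",
    4: "pre-fail attributes found at or below threshold",
    5: "attributes have been at or below threshold in the past",
    6: "device error log contains errors",
    7: "device self-test log contains errors",
}


def decode_exit_status(status):
    """Return the message for every bit that is set in the exit status."""
    return [msg for bit, msg in EXIT_BITS.items() if status & (1 << bit)]


def check_disk(device="/dev/ada0"):
    """Run 'smartctl -H -A' against the device and decode the result."""
    result = subprocess.run(
        ["smartctl", "-H", "-A", device],
        capture_output=True,
        text=True,
    )
    return decode_exit_status(result.returncode)


if __name__ == "__main__":
    problems = check_disk()
    print("\n".join(problems) if problems else "No problems reported")
```

It needs to run as root (or via sudo) to be able to open the device, the same as the smartctl commands earlier in the post.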