
System Lockup SDD Drive

This is not about Artix itself, but about problems with an SSD.

I have a newish Acer laptop with an SSD, as opposed to a traditional spinning hard drive.  I experience hard lockups of the system (no mouse, no keyboard, no <CTRL><ALT><F2> to switch to another session, though it sometimes responds to <ALT><SYSRQ><B>), roughly every couple of days.  If the system doesn't respond to the above recovery methods, I have to shut down via the power button.

Upon reboot, there are many lines of "inode" errors, which pass by so fast I can't read them.  I believe "inode" errors relate to disk problems, which would point at my SSD.  So I assume the lockups are due to problems with the SSD?

My question: surely it is not normal for an SSD to be this unreliable?  Is there any "tuning" that can be done to help this poor SSD function more reliably?  Or is the SSD defective?



Re: System Lockup SDD Drive

Reply #2
Install the smartmontools package and look at what the command
Code:
sudo smartctl -a <your drive>
shows, where <your drive> is the device name of your SSD, presumably /dev/sda.
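
If the device name is not obvious, the following is a minimal sketch for finding it, assuming a Linux system with lsblk from util-linux. The function names list_disks and whole_disk are made up for illustration; whole_disk only strips a partition suffix so that smartctl can be pointed at the drive itself rather than at one of its partitions.

```shell
#!/bin/sh
# List whole disks only (-d). ROTA=0 means non-rotational, i.e. an SSD.
list_disks() {
    lsblk -d -o NAME,ROTA,TRAN,MODEL
}

# Hypothetical helper: map a partition path to its parent disk.
#   /dev/nvme0n1p1 -> /dev/nvme0n1
#   /dev/sda2      -> /dev/sda
whole_disk() {
    case "$1" in
        /dev/nvme*n*p*) printf '%s\n' "${1%p*}" ;;
        /dev/sd*)       printf '%s\n' "$1" | sed 's/[0-9]*$//' ;;
        *)              printf '%s\n' "$1" ;;
    esac
}
```

With these defined, something like `sudo smartctl -a "$(whole_disk /dev/nvme0n1p1)"` queries the drive rather than the partition.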


Re: System Lockup SDD Drive

Reply #4
Installed smartctl and ran:

Code:
ACER5:[root]:~# smartctl -a /dev/nvme0n1p1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.8-artix1-1] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       WDC PC SN520 SDAPNUW-128G-1014
Serial Number:                      20467J454707
Firmware Version:                   20110000
PCI Vendor/Subsystem ID:            0x15b7
IEEE OUI Identifier:                0x001b44
Total NVM Capacity:                 128,035,676,160 [128 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          128,035,676,160 [128 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            001b44 4a46b83a65
Local Time is:                      Wed Aug 25 07:19:33 2021 EDT
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Log Page Attributes (0x02):         Cmd_Eff_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     86 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     2.50W       -        -    0  0  0  0        0       0
 1 +     2.50W       -        -    1  1  1  1        0       0
 2 +     1.70W       -        -    2  2  2  2        0       0
 3 -   0.0250W       -        -    3  3  3  3     5000    9000
 4 -   0.0025W       -        -    4  4  4  4     5000   44000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        36 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    8,701,185 [4.45 TB]
Data Units Written:                 1,510,585 [773 GB]
Host Read Commands:                 81,088,560
Host Write Commands:                22,768,275
Controller Busy Time:               111
Power Cycles:                       572
Power On Hours:                     316
Unsafe Shutdowns:                   24
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged


The problem is random in nature.  I can go days or even a week without it happening.  Then suddenly, for no apparent reason, the machine locks up.

I'm keeping a closer watch on things, and will look further into smartctl as well as monitor the logs.



 

Re: System Lockup SDD Drive

Reply #5
The SMART record seems to be clean, no errors.
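
For repeated checks, the relevant fields can be pulled out of the report automatically. This is only a sketch, assuming the NVMe field labels exactly as they appear in the smartctl output posted above; the function names smart_summary and smart_clean are invented for this example.

```shell
#!/bin/sh
# Read `smartctl -a` output on stdin and print the health-related fields.
# The labels are the ones shown in the NVMe report earlier in this thread.
smart_summary() {
    grep -E '^(SMART overall-health|Critical Warning|Available Spare:|Percentage Used|Unsafe Shutdowns|Media and Data Integrity Errors|Error Information Log Entries)'
}

# Exit 0 only if the report shows no critical warning and no media errors.
smart_clean() {
    awk -F': *' '
        /^Critical Warning/                { warn = $2 }
        /^Media and Data Integrity Errors/ { media = $2 }
        END { exit !(warn == "0x00" && media == "0") }
    '
}
```

Usage would be along the lines of `sudo smartctl -a /dev/nvme0n1 | smart_summary`. In the posted report, the only notable value is Unsafe Shutdowns at 24, which fits the power-button shutdowns described rather than pointing at a drive fault.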

BTW, laptops can be quite capricious and have insidious hardware faults. There are two possibilities:

1. It's a hardware issue. Maybe it's related to the SSD (to be precise, the NVMe drive), or maybe it's a power issue. Those are common on laptops: batteries and power controllers are often buggy. You may want to take your laptop in for service while it is still under warranty.

2. It's a software issue. It may be related to the kernel, which may not fully support the newest hardware yet. Do you use the latest kernel version? Did this laptop come with a preinstalled Linux such as Ubuntu (which would give you some assurance that Linux supports the laptop's hardware)?

Ah, and regarding this:
Upon reboot, there are many lines of "inode" errors, which pass by so fast I can't read them.
Perhaps you can read them by running:
Code:
sudo dmesg | grep inode
But those inode errors are most likely related to the fact that you are using a journaling filesystem like ext4. After a sudden power-off, the journal may be left damaged, especially if you are using an SSD and the journal was not flushed to the disk immediately.
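
A slightly broader filter than a plain grep for "inode" can be sketched as follows. The pattern list (inode, ext4, nvme, I/O error) is an assumption about which messages are interesting here, and fs_messages is a made-up name; the function just reads log text on stdin.

```shell
#!/bin/sh
# Keep only log lines that mention the filesystem or the NVMe drive.
# Reads from stdin, so it works on live dmesg output or on a saved log file.
fs_messages() {
    grep -iE 'inode|ext4|nvme|i/o error'
}
```

For example, `sudo dmesg -T | fs_messages | less` pages through the matches with readable timestamps. The kernel ring buffer only holds messages from the current boot, so this is best run soon after the reboot that follows a lockup.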

Re: System Lockup SDD Drive

Reply #6
SSD support in Linux is relatively poor and very dependent on model and manufacturer.

https://wiki.archlinux.org/title/Solid_state_drive/NVMe
https://wiki.archlinux.org/title/Solid_state_drive

Edit: Also, with regards to topic, especially check the Airflow section.

Re: System Lockup SDD Drive

Reply #7
If I asked you this question, it's because I had the same symptoms on a laptop with an SSD.
It turned out that the swap partition was not listed in /etc/fstab and was therefore not activated at startup (I had forgotten it during installation).
So, after a certain amount of use (surfing the internet, YouTube, etc.) the system froze and the machine had to be rebooted with the power button.
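
A quick way to check for this situation is sketched below, assuming a standard fstab layout where the third field is the filesystem type. The function names are hypothetical, and /proc/swaps is Linux-specific.

```shell
#!/bin/sh
# Print the swap entries of an fstab-style file (default /etc/fstab),
# skipping commented-out lines. Field 3 is the filesystem type.
fstab_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap"' "${1:-/etc/fstab}"
}

# Report whether swap is configured and whether any is active right now.
check_swap() {
    if [ -z "$(fstab_swap_entries "$@")" ]; then
        echo "no swap entry in fstab"
    fi
    # /proc/swaps has one header line; anything beyond it is active swap.
    if [ "$(wc -l < /proc/swaps)" -le 1 ]; then
        echo "no active swap"
    fi
}
```

Running `check_swap` with no arguments inspects /etc/fstab; `swapon --show` gives the same information about active swap in a more readable table.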

Re: System Lockup SDD Drive

Reply #8
System lockups can be caused by many things. Mostly I get graphics-related ones, and recently one caused by a Broadcom wireless card, which I swapped for another make.

I've not had any particular problems with SSDs, except on rare occasions, usually after transporting my laptop, when it might not find the drive on boot; shaking it or tapping the end of the caddy and trying again seems to work. So you could try unplugging and replugging the drive if it is easily accessible, and check that any mounting screws are securely fitted.

BTRFS is COW (Copy On Write) and is far more resilient to unplanned power-offs, as this means the file system should never be in an inconsistent state. Although I try to avoid them, I've had loads of hard power-offs and never had any problems, except possibly losing unsaved work.

Some BIOS options may include RAID, which can overwrite the backup GPT table at the end of the drive, so they need to be set to AHCI. gdisk will fix this, but use it with caution. I doubt that's your problem, but something as simple as a BIOS setting might affect the drive.

I once had a faulty mobo that would cause lockups 10 or 20 minutes after startup, whether I was doing nothing or doing random things. It was quite difficult to determine that it was faulty, whereas software-based problems usually have some obvious origin.

Re: System Lockup SDD Drive

Reply #9

 Do you use the latest kernel version? Did this laptop come with a preinstalled Linux such as Ubuntu (which would give you some assurance that Linux supports the laptop's hardware)?


Yes, latest kernel; I run pacman -Su weekly.
No, the laptop came with Windows.  I formatted that off without even booting it, and installed Artix.



But those inode errors are most likely related to the fact that you are using a journaling filesystem like ext4. After a sudden power-off, the journal may be left damaged, especially if you are using an SSD and the journal was not flushed to the disk immediately.

Good to know, thanks.


Re: System Lockup SDD Drive

Reply #10
Due to the random nature of this issue it's hard to determine its cause, alas.

As already mentioned, it can be caused by a wrong partition setup, by video drivers (check your GPU drivers: which ones do you use? Try both proprietary and free), and also by Xorg input drivers. AFAIK there are two variants of them, evdev and libinput, and one of them is considered obsolete and can cause problems, but I forget which one. I had problems with both in various distros. As of now, I use libinput with Artix, but evdev with Debian also works fine.

In search of clues, you can check dmesg and the Xorg logs for error messages. Another option is to install and configure rsyslog. It will store kernel messages on disk, at the cost of extra writes to your SSD, but after the next failure something informative may remain in the logs.
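
For the Xorg side, a minimal sketch: Xorg tags error lines with (EE) and warning lines with (WW), so those can be filtered out of the log. The function name xorg_problems is invented here, and the log path in the usage note is an assumption, since it varies by setup.

```shell
#!/bin/sh
# Xorg marks errors with (EE) and warnings with (WW) in its log.
# Reads the log on stdin and prints only those lines.
xorg_problems() {
    grep -E '\((EE|WW)\)'
}
```

Typical usage would be `xorg_problems < /var/log/Xorg.0.log`; on setups where Xorg runs without root, the log may instead live under ~/.local/share/xorg/. The legend line near the top of the log also matches and can be ignored.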