Hi there,
just an update.
It is indeed the power management. I was able to hook the needed modifications into the kernel command line, and the NVMe is now being recognized.
The catch in my case: the WD SN700 4TB is still unstable, even with the additional 5 V power.
But if you want to give it a try with your drive, I will describe what I have done.
Disclaimer: use this at your own risk!
Get your current kernel parameters
cat /proc/cmdline
It should look like
cgroup_disable=pressure log_buf_len=256K earlycon=msm_geni_serial,0x994000 rcupdate.rcu_expedited=1 rcu_nocbs=0-7 kpti=off noinitrd console=ttyMSM0,115200,n8 video=Virtual-1:d earlycon=msm_geni_serial,0x994000 androidboot.hardware=qcom androidboot.console=ttyMSM0 androidboot.memcg=1 lpm_levels.sleep_disabled=1 msm_rtb.filter=0x237 service_locator.enable=1 firmware_class.path=/lib/firmware androidboot.usbcontroller=a600000.dwc3 swiotlb=2048 loop.max_part=7 cgroup.memory=nokmem,nosocket reboot=panic_warm net.ifnames=0 apparmor=1 security=apparmor root=PARTLABEL=system_a androidboot.bootdevice=1d84000.ufshc androidboot.fstab_suffix=default androidboot.serialno=97979b3 androidboot.baseband=msm msm_drm.dsi_display0=qcom,mdss_dsi_lt9611_720p_video: systemd.setenv="SLOT_SUFFIX=_a" skip_initramfs rootwait rw init=/sbin/init
It consists of a prefix + kernel parameters + suffix.
The parameters defined in boot_a are just this portion:
noinitrd console=ttyMSM0,115200,n8 video=Virtual-1:d earlycon=msm_geni_serial,0x994000 androidboot.hardware=qcom androidboot.console=ttyMSM0 androidboot.memcg=1 lpm_levels.sleep_disabled=1 msm_rtb.filter=0x237 service_locator.enable=1 firmware_class.path=/lib/firmware androidboot.usbcontroller=a600000.dwc3 swiotlb=2048 loop.max_part=7 cgroup.memory=nokmem,nosocket reboot=panic_warm net.ifnames=0 apparmor=1 security=apparmor
The goal is to append three additional parameters to the kernel command line to disable the PCIe power management while keeping the total length below 512 bytes (including the terminating NUL at the end).
Confirm that boot_a is on /dev/sdg24 (it is in my case, and it should be the same everywhere):
lsblk -o NAME,SIZE,PARTLABEL /dev/sdg
Should output ├─sdg24 96M boot_a
Then create a working directory under /root to hold the backup and the intermediate files:
mkdir -p /root/bootmod
cd /root/bootmod
The following will back up boot_a into boot_a.backup.img and extract the 512 bytes starting at offset 0x40. The last command shows the kernel parameters stored in boot_a.
dd if=/dev/sdg24 of=boot_a.backup.img bs=4M status=progress
dd if=/dev/sdg24 of=cmdline.bin bs=1 skip=$((0x40)) count=512
tr '\0' '\n' < cmdline.bin | head -1
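Since the backup is the only way back, it may be worth confirming that it really matches the partition before changing anything. A minimal sketch, assuming cmp is available; verify_backup is a made-up helper name, not part of the original steps:

```shell
# Hypothetical helper: compare a block device (or any file) with its backup image.
# Prints a message and returns non-zero if the two differ.
verify_backup() {
    if cmp -s "$1" "$2"; then
        echo "backup matches"
    else
        echo "backup DIFFERS" >&2
        return 1
    fi
}

# On the device from this post the call would be:
#   verify_backup /dev/sdg24 boot_a.backup.img
```

If it reports a difference, redo the backup before continuing.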
Now append the needed parameters and make sure we do not define more than fits into boot_a:
CURRENT=$(tr '\0' '\n' < cmdline.bin | head -1)
EXTRA="nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off"
NEW="$CURRENT $EXTRA"
LEN=$(printf '%s' "$NEW" | wc -c)
echo $LEN
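LEN must be <= 511. Otherwise the truncate step further down cuts the string off at 512 bytes and drops the terminating NUL, which would corrupt the command line in boot_a.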
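Instead of checking the number by eye, the limit can be turned into a guard. This is only a sketch; cmdline_fits is a hypothetical helper name and the example string is made up. In the real procedure you would pass "$NEW":

```shell
# Hypothetical helper: succeeds only if the string plus its terminating NUL
# fits into the 512-byte cmdline field (i.e. at most 511 bytes of text).
cmdline_fits() {
    # $(( ... )) strips any padding that wc may add around the number
    len=$(($(printf '%s' "$1" | wc -c)))
    [ "$len" -le 511 ]
}

# Example with a made-up short command line.
if cmdline_fits "console=ttyMSM0,115200,n8 pcie_aspm=off"; then
    echo "OK: safe to write"
else
    echo "TOO LONG: do not write" >&2
fi
```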
If it is OK, we can create the 512-byte cmdline block, overwrite only that part of boot_a, and cross-check the result by reading it back directly from sdg24:
printf '%s\0' "$NEW" > newcmdline.bin
truncate -s 512 newcmdline.bin
dd if=newcmdline.bin of=/dev/sdg24 bs=1 seek=$((0x40)) conv=notrunc
sync
dd if=/dev/sdg24 bs=1 skip=$((0x40)) count=512 | tr '\0' '\n' | head -1
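The same write and read-back can be rehearsed on an ordinary file before touching the partition, for example on a copy of boot_a.backup.img. A rough sketch, assuming GNU dd and truncate; patch_cmdline and read_cmdline are hypothetical helper names, and the demo below uses an empty 4 KiB scratch file rather than a real boot image:

```shell
# Hypothetical helper: write a NUL-terminated, zero-padded 512-byte cmdline
# field at offset 0x40 of the given image file, leaving the rest untouched.
patch_cmdline() {
    img=$1
    new=$2
    len=$(($(printf '%s' "$new" | wc -c)))
    if [ "$len" -gt 511 ]; then
        echo "refusing: $len bytes do not fit" >&2
        return 1
    fi
    field=$(mktemp)
    printf '%s\0' "$new" > "$field"
    truncate -s 512 "$field"
    dd if="$field" of="$img" bs=1 seek=$((0x40)) conv=notrunc status=none
    rm -f "$field"
}

# Hypothetical helper: print the cmdline stored at offset 0x40.
read_cmdline() {
    dd if="$1" bs=1 skip=$((0x40)) count=512 status=none | tr '\0' '\n' | head -1
}

# Demo on a scratch file (not a real boot image).
demo=$(mktemp)
truncate -s 4096 "$demo"
patch_cmdline "$demo" "console=ttyMSM0 pcie_aspm=off"
read_cmdline "$demo"
rm -f "$demo"
```

For a rehearsal closer to the real thing: cp boot_a.backup.img boot_a.test.img, then patch_cmdline boot_a.test.img "$NEW" and read_cmdline boot_a.test.img.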
Reboot and check again with
cat /proc/cmdline
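The output is long, so the three additions are easy to miss. A small sketch that lists any of them that did not make it into the running command line; missing_params is a hypothetical helper name:

```shell
# Hypothetical helper: print each expected parameter that is NOT present
# as a whole word in the given command line string.
missing_params() {
    for p in nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off; do
        case " $1 " in
            *" $p "*) ;;          # found, nothing to report
            *) echo "$p" ;;       # not found
        esac
    done
}

# Check the running kernel; no output means all three are active.
if [ -r /proc/cmdline ]; then
    missing_params "$(cat /proc/cmdline)"
fi
```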
Now the NVMe should be kept alive during boot.
Create a partition, mount it, and run some tests.
If you want to revert to the original boot_a:
dd if=boot_a.backup.img of=/dev/sdg24 bs=4M conv=fsync
Hope it helps somewhat.
Best regards
Thomas