* For some reason, Samsung shipped the A72 S kernel for the A52 as well, but only renamed the defconfig without changing device-specific settings such as the tele-camera, panel, or fingerprint drivers in the defconfig
* Manually restore these to what they were on R
Change-Id: I9d69c9f8db3ff1d2dbc5246673fb4ab8f0463946
The call to pm_runtime_get_sync() in ufshcd_program_key() can deadlock
because it waits for the UFS controller to be resumed, but it can itself
be reached while resuming the UFS controller via:
- ufshcd_runtime_resume()
- ufshcd_resume()
- ufshcd_reset_and_restore()
- ufshcd_host_reset_and_restore()
- ufshcd_hba_enable()
- ufshcd_hba_execute_hce()
- ufshcd_hba_start()
- ufshcd_crypto_enable()
- keyslot_manager_reprogram_all_keys()
- ufshcd_crypto_keyslot_program()
- ufshcd_program_key()
But pm_runtime_get_sync() *is* needed when evicting a key. Also, on
pre-4.20 kernels it's needed when programming a keyslot for a bio since
the block layer used to resume the device in a different place.
Thus, it's hard for drivers to know what to do in .keyslot_program() and
.keyslot_evict(). In old kernels it may even be impossible unless we
were to pass more information down from the keyslot_manager.
There's also another possible deadlock: keyslot programming and eviction
take ksm->lock for write and then resume the device, which may result in
ksm->lock being taken again via the above call stack. To fix this, we
should resume the device before taking ksm->lock.
Fix these problems by moving to a better design where the block layer
(namely, the keyslot manager) handles runtime power management instead
of drivers. This is analogous to the block layer's existing runtime
power management support (blk-pm), which handles resuming devices when
bios are submitted to them so that drivers don't need to handle it.
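The lock-ordering problem described above can be illustrated with a small
userspace model. This is only a sketch, not the kernel code: every name
below (model_dev, model_resume(), model_program_key_old(), and so on) is
hypothetical, and the non-recursive lock is modeled by a flag so that a
re-entrant acquisition is recorded instead of hanging.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	bool suspended;   /* device is runtime-suspended */
	bool lock_held;   /* stands in for ksm->lock held for write */
	bool deadlocked;  /* set if the lock is requested while already held */
};

/* Acquire the modeled lock; record a deadlock instead of blocking forever. */
static void model_lock(struct model_dev *d)
{
	if (d->lock_held)
		d->deadlocked = true;
	d->lock_held = true;
}

static void model_unlock(struct model_dev *d)
{
	d->lock_held = false;
}

/* Resuming reprograms all keyslots, which needs the lock (see call stack). */
static void model_resume(struct model_dev *d)
{
	if (!d->suspended)
		return;
	model_lock(d);
	if (!d->deadlocked)
		model_unlock(d);
	d->suspended = false;
}

/* Old ordering: take the lock, then resume the device. */
static bool model_program_key_old(struct model_dev *d)
{
	model_lock(d);
	model_resume(d);
	model_unlock(d);
	return !d->deadlocked;
}

/* New ordering: resume first, then take the lock. */
static bool model_program_key_new(struct model_dev *d)
{
	model_resume(d);
	model_lock(d);
	model_unlock(d);
	return !d->deadlocked;
}
```

With a suspended device, the old ordering reports a deadlock while the new
ordering completes; with an already-active device both orderings succeed.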
Test: Tested on coral with:
echo 5 > /sys/bus/platform/devices/1d84000.ufshc/rpm_lvl
sleep 30
touch /data && sync # hangs before this fix
Also verified via kvm-xfstests that blk-crypto-fallback continues
to work both with and without CONFIG_PM=y.
Bug: 137270441
Bug: 149368295
Change-Id: I6bc9fb81854afe7edf490d71796ee68a61f7cbc8
Signed-off-by: Eric Biggers <ebiggers@google.com>
Git-Commit: 8d97219e60
Git-Repo: https://android.googlesource.com/kernel/common
[neersoni@codeaurora.org]: fixed compilation issues.
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
[vagrawa@codeaurora.org]: fix merge conflicts.
Signed-off-by: Vaibhav Agrawal <vagrawa@codeaurora.org>
When queue is in PREEMPT_ONLY mode, only REQ_PREEMPT request
can be allocated and dispatched, other requests won't be allowed
to enter I/O path.
This is useful for supporting safe SCSI quiesce.
Part of this patch is from Bart's '[PATCH v4 4/7] block: Add the QUEUE_FLAG_PREEMPT_ONLY
request queue flag'.
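The admission rule described above can be sketched as a single predicate.
This is a minimal model under stated assumptions: the flag values and the
names model_queue and model_queue_may_enter() are hypothetical and do not
match the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel's differ. */
#define MODEL_REQ_PREEMPT        (1u << 0)
#define MODEL_QUEUE_PREEMPT_ONLY (1u << 0)

struct model_queue {
	unsigned int flags;
};

/* Returns true if a request with op flags @op may enter the I/O path. */
static bool model_queue_may_enter(const struct model_queue *q, unsigned int op)
{
	if (!(q->flags & MODEL_QUEUE_PREEMPT_ONLY))
		return true;                      /* normal mode: everything passes */
	return (op & MODEL_REQ_PREEMPT) != 0; /* quiesced: only preempt requests */
}
```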
Change-Id: Iffe29f0d6385e56bd6352c3f5c09a11346c12142
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Patch-mainline: linux-block@vger.kernel.org @ 03/10/2017, 22:04
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
We need this return value in the following patch to decide
whether an explicit synchronize_rcu() is needed.
Change-Id: I75559b7ec1dee4df607fd3e67abc6954b656edf4
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Patch-mainline: linux-block@vger.kernel.org @ 03/10/2017, 22:04
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
We need to check whether the request to be allocated is PREEMPT_ONLY,
and have to pass the REQ_PREEMPT flag to blk_queue_enter(), so pass
'op' to blk_queue_enter() directly.
Change-Id: I53bafb80d59917f65b5855571489638d9fe507c3
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Patch-mainline: linux-block@vger.kernel.org @ 03/10/2017, 22:04
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
This patch does not change any functionality but makes the
REQ_PREEMPT flag available to blk_get_request(). A later patch
will add code to blk_get_request() that checks the REQ_PREEMPT
flag. Note: the IDE sense_rq request is allocated statically so
there is no blk_get_request() call that corresponds to this
request.
Change-Id: I380e869515f106e882c03b5305dc8e675eefd915
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Patch-mainline: linux-block@vger.kernel.org @ 03/10/2017, 22:04
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
This patch makes it possible to pause request allocation for
the legacy block layer by calling blk_mq_freeze_queue() and
blk_mq_unfreeze_queue().
Change-Id: Id6d86a2e73f01043349969bd5df99e10c8a6a092
Signed-off-by: Ming Lei <ming.lei@redhat.com>
[ bvanassche: Combined two patches into one, edited a comment and made sure
REQ_NOWAIT is handled properly in blk_old_get_request() ]
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Git-commit: 055f6e18e08f5b7fd98171fce857a0bad87a919d
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
Backport a fix from the v7 inline crypto patchset which ensures that the
block layer knows the number of DUN bytes the inline encryption hardware
supports, so that hardware isn't used when it shouldn't be.
(This unfortunately means introducing some increasingly long argument
lists; this was all already fixed up in later versions of the patchset.)
To avoid breaking the KMI for drivers, don't add a dun_bytes argument to
keyslot_manager_create() but rather allow drivers to call
keyslot_manager_set_max_dun_bytes() to override the default. Also,
don't add dun_bytes as a new field in 'struct blk_crypto_key' but rather
pack it into the existing 'hash' field which is for block layer use.
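Packing a small value into an existing 32-bit field can be sketched as
follows. The bit layout here (top 4 bits for dun_bytes, low 28 bits for a
truncated hash) is an assumption made for illustration, and the helper
names are hypothetical rather than the patch's actual API.

```c
#include <stdint.h>

/* Assumed layout for illustration: top 4 bits = dun_bytes, low 28 = hash. */
static uint32_t model_pack_hash_dun(uint32_t hash, unsigned int dun_bytes)
{
	/* dun_bytes must fit in 4 bits; the hash loses its low 4 bits. */
	return ((uint32_t)(dun_bytes & 0xFu) << 28) | (hash >> 4);
}

/* Recover dun_bytes from the packed field. */
static unsigned int model_unpack_dun_bytes(uint32_t packed)
{
	return packed >> 28;
}
```

The struct size and field offsets stay unchanged, which is the point of
reusing the field; the cost is a slightly shorter hash.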
Bug: 144046242
Bug: 153512828
Change-Id: I285f36557fb3eafc5f2f64727ef1740938b59dd7
Signed-off-by: Eric Biggers <ebiggers@google.com>
Git-commit: 72091967bf
Git-repo: https://android.googlesource.com/kernel/common/+/refs/heads/android-4.14-stable
[neersoni@codeaurora.org: back port the changes and update ufshcd-crypto-qti.c
file to specify max dun byte support]
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
Version 1.0.0 did not add iv_offset to the DUN and did not mandate a
sector size. This resulted in an on-disk data format different from
the one version 2.1.0 will support. To support OTA upgrades with the
legacy data format, adapt the sector size and iv_offset if the legacy
encryption algorithm is used. Also fix a compilation issue in the block
crypto fallback that uses the keyslot manager API.
Change-Id: I3b7a0279bcb98c3cba9dec3f572c12d618fdc816
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
This reverts commit b73e822d12.
This is reverted to integrate new file encryption framework support changes
to ensure all fixes are present to use new encryption policies.
Change-Id: I455ec66664064069ac34e6fe410bd28dc3a53d07
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
Remove the Per File Key logic based inline crypto support
for file encryption framework.
Change-Id: I90071562ba5c41b9db470363edac35c9fe5e4efa
Signed-off-by: Neeraj Soni <neersoni@codeaurora.org>
c57952b UPSTREAM: ubifs: wire up FS_IOC_GET_ENCRYPTION_NONCE
379237b UPSTREAM: f2fs: wire up FS_IOC_GET_ENCRYPTION_NONCE
10e5acf UPSTREAM: ext4: wire up FS_IOC_GET_ENCRYPTION_NONCE
63bf273 ANDROID: scsi: ufs: add ->map_sg_crypto() variant op
10d4512 FROMLIST: f2fs: Handle casefolding with Encryption
4efb7e2 ANDROID: fscrypt: fall back to filesystem-layer crypto when needed
a14fa7b ANDROID: block: require drivers to declare supported crypto key type(s)
5578bea ANDROID: block: make blk_crypto_start_using_mode() properly check for support
e9c80bd UPSTREAM: fscrypt: add FS_IOC_GET_ENCRYPTION_NONCE ioctl
9e469e7 UPSTREAM: fscrypt: don't evict dirty inodes after removing key
53f2446 fscrypt: don't evict dirty inodes after removing key
207be96 FROMLIST: fscrypt: Have filesystems handle their d_ops
06ab740 ANDROID: dm: Add wrapped key support in dm-default-key
23e670a ANDROID: dm: add support for passing through derive_raw_secret
166fda7 ANDROID: block: Prevent crypto fallback for wrapped keys
fe6e855 fscrypt: improve format of no-key names
216d8ca fscrypt: clarify what is meant by a per-file key
7e25032 fscrypt: derive dirhash key for casefolded directories
e16d849 fscrypt: don't allow v1 policies with casefolding
0bc68c1 fscrypt: add "fscrypt_" prefix to fname_encrypt()
85b9c3e fscrypt: don't print name of busy file when removing key
9c5c8c5 fscrypt: document gfp_flags for bounce page allocation
bee5bd5 fscrypt: optimize fscrypt_zeroout_range()
1c88eea fscrypt: remove redundant bi_status check
04f5184 fscrypt: Allow modular crypto algorithms
737ae90 fscrypt: include <linux/ioctl.h> in UAPI header
8842133 fscrypt: don't check for ENOKEY from fscrypt_get_encryption_info()
b21b79d fscrypt: remove fscrypt_is_direct_key_policy()
19b132b fscrypt: move fscrypt_valid_enc_modes() to policy.c
add6ac4 fscrypt: check for appropriate use of DIRECT_KEY flag earlier
2454b5b fscrypt: split up fscrypt_supported_policy() by policy version
bfa4ca6 fscrypt: introduce fscrypt_needs_contents_encryption()
3871977 fscrypt: move fscrypt_d_revalidate() to fname.c
39a0acc fscrypt: constify inode parameter to filename encryption functions
3942229 fscrypt: constify struct fscrypt_hkdf parameter to fscrypt_hkdf_expand()
a7b6398 fscrypt: verify that the crypto_skcipher has the correct ivsize
9c1b3af fscrypt: use crypto_skcipher_driver_name()
3529026 fscrypt: support passing a keyring key to FS_IOC_ADD_ENCRYPTION_KEY
Change-Id: Ib1abe832e16d5f40bfcc9e34bdccbb063b37dbbc
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
commit a75ca9303175d36af93c0937dd9b1a6422908b8d upstream.
commit e7bf90e5afe3 ("block/bio-integrity: fix a memory leak bug") added
a kfree() for 'buf' if bio_integrity_add_page() returns '0'. However,
the object will be freed in bio_integrity_free() since 'bio->bi_opf' and
'bio->bi_integrity' were set previously in bio_integrity_alloc().
Fixes: commit e7bf90e5afe3 ("block/bio-integrity: fix a memory leak bug")
Signed-off-by: yu kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes the 4.14 backport commit 574eb136ec, which was upstream
commit f5bbbbe4d63577026f908a809f22f5fd5a90ea1f.
The upstream commit added a call to synchronize_rcu() to
__blk_mq_update_nr_hw_queues(), just after freezing queues.
In the backport this landed just after unfreezing queues.
This commit moves the call to its intended place.
Fixes: 574eb136ec ("blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter")
Signed-off-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
In bfq_idle_slice_timer(), bfqq = bfqd->in_service_queue is read
outside the bfqd->lock critical section. The bfqq, which is non-NULL
in bfq_idle_slice_timer(), may be freed after it is passed to
bfq_idle_slice_timer_body(), so we would access freed memory.
In addition, because bfqq may be subject to this race, we should
first check whether bfqq is still in service before operating on it
in bfq_idle_slice_timer_body(). If the racing bfqq is not in service,
it has already been expired through __bfq_bfqq_expire(), and its
wait_request flag has been cleared in __bfq_bfqd_reset_in_service().
So we do not need to clear wait_request again for a bfqq that is not
in service.
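The in-service check can be sketched in a simplified userspace model.
The struct and function names below are hypothetical, and locking is
elided: the function is assumed to run with the scheduler lock held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler's queue and data structures. */
struct model_bfqq { bool wait_request; };
struct model_bfqd { struct model_bfqq *in_service_queue; };

/*
 * Assumed to be called with the scheduler lock held (elided in this model).
 * @bfqq was sampled by the timer without the lock, so it is only compared,
 * never dereferenced, until it is known to still be in service.
 * Returns true if the queue was still in service and was handled.
 */
static bool model_idle_timer_body(struct model_bfqd *bfqd,
				  struct model_bfqq *bfqq)
{
	if (bfqq != bfqd->in_service_queue)
		return false;  /* expired (and possibly freed) in the meantime */
	bfqq->wait_request = false;
	return true;
}
```

The pointer comparison is safe even when bfqq is stale, because a stale
pointer is never dereferenced on the early-return path.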
KASAN log is given as follows:
[13058.354613] ==============================================================
[13058.354640] BUG: KASAN: use-after-free in bfq_idle_slice_timer+0xac/0x290
[13058.354644] Read of size 8 at addr ffffa02cf3e63f78 by task fork13/19767
[13058.354646]
[13058.354655] CPU: 96 PID: 19767 Comm: fork13
[13058.354661] Call trace:
[13058.354667] dump_backtrace+0x0/0x310
[13058.354672] show_stack+0x28/0x38
[13058.354681] dump_stack+0xd8/0x108
[13058.354687] print_address_description+0x68/0x2d0
[13058.354690] kasan_report+0x124/0x2e0
[13058.354697] __asan_load8+0x88/0xb0
[13058.354702] bfq_idle_slice_timer+0xac/0x290
[13058.354707] __hrtimer_run_queues+0x298/0x8b8
[13058.354710] hrtimer_interrupt+0x1b8/0x678
[13058.354716] arch_timer_handler_phys+0x4c/0x78
[13058.354722] handle_percpu_devid_irq+0xf0/0x558
[13058.354731] generic_handle_irq+0x50/0x70
[13058.354735] __handle_domain_irq+0x94/0x110
[13058.354739] gic_handle_irq+0x8c/0x1b0
[13058.354742] el1_irq+0xb8/0x140
[13058.354748] do_wp_page+0x260/0xe28
[13058.354752] __handle_mm_fault+0x8ec/0x9b0
[13058.354756] handle_mm_fault+0x280/0x460
[13058.354762] do_page_fault+0x3ec/0x890
[13058.354765] do_mem_abort+0xc0/0x1b0
[13058.354768] el0_da+0x24/0x28
[13058.354770]
[13058.354773] Allocated by task 19731:
[13058.354780] kasan_kmalloc+0xe0/0x190
[13058.354784] kasan_slab_alloc+0x14/0x20
[13058.354788] kmem_cache_alloc_node+0x130/0x440
[13058.354793] bfq_get_queue+0x138/0x858
[13058.354797] bfq_get_bfqq_handle_split+0xd4/0x328
[13058.354801] bfq_init_rq+0x1f4/0x1180
[13058.354806] bfq_insert_requests+0x264/0x1c98
[13058.354811] blk_mq_sched_insert_requests+0x1c4/0x488
[13058.354818] blk_mq_flush_plug_list+0x2d4/0x6e0
[13058.354826] blk_flush_plug_list+0x230/0x548
[13058.354830] blk_finish_plug+0x60/0x80
[13058.354838] read_pages+0xec/0x2c0
[13058.354842] __do_page_cache_readahead+0x374/0x438
[13058.354846] ondemand_readahead+0x24c/0x6b0
[13058.354851] page_cache_sync_readahead+0x17c/0x2f8
[13058.354858] generic_file_buffered_read+0x588/0xc58
[13058.354862] generic_file_read_iter+0x1b4/0x278
[13058.354965] ext4_file_read_iter+0xa8/0x1d8 [ext4]
[13058.354972] __vfs_read+0x238/0x320
[13058.354976] vfs_read+0xbc/0x1c0
[13058.354980] ksys_read+0xdc/0x1b8
[13058.354984] __arm64_sys_read+0x50/0x60
[13058.354990] el0_svc_common+0xb4/0x1d8
[13058.354994] el0_svc_handler+0x50/0xa8
[13058.354998] el0_svc+0x8/0xc
[13058.354999]
[13058.355001] Freed by task 19731:
[13058.355007] __kasan_slab_free+0x120/0x228
[13058.355010] kasan_slab_free+0x10/0x18
[13058.355014] kmem_cache_free+0x288/0x3f0
[13058.355018] bfq_put_queue+0x134/0x208
[13058.355022] bfq_exit_icq_bfqq+0x164/0x348
[13058.355026] bfq_exit_icq+0x28/0x40
[13058.355030] ioc_exit_icq+0xa0/0x150
[13058.355035] put_io_context_active+0x250/0x438
[13058.355038] exit_io_context+0xd0/0x138
[13058.355045] do_exit+0x734/0xc58
[13058.355050] do_group_exit+0x78/0x220
[13058.355054] __wake_up_parent+0x0/0x50
[13058.355058] el0_svc_common+0xb4/0x1d8
[13058.355062] el0_svc_handler+0x50/0xa8
[13058.355066] el0_svc+0x8/0xc
Change-Id: I510c704a6f2324741d70db33f0350e14642fe92f
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reported-by: Wang Wang <wangwang2@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Feilong Lin <linfeilong@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Git-commit: 2f95fa5c955d0a9987ffdc3a095e2f4e62c5f2a9
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
[ Upstream commit 2f95fa5c955d0a9987ffdc3a095e2f4e62c5f2a9 ]
In bfq_idle_slice_timer(), bfqq = bfqd->in_service_queue is read
outside the bfqd->lock critical section. The bfqq, which is non-NULL
in bfq_idle_slice_timer(), may be freed after it is passed to
bfq_idle_slice_timer_body(), so we would access freed memory.
In addition, because bfqq may be subject to this race, we should
first check whether bfqq is still in service before operating on it
in bfq_idle_slice_timer_body(). If the racing bfqq is not in service,
it has already been expired through __bfq_bfqq_expire(), and its
wait_request flag has been cleared in __bfq_bfqd_reset_in_service().
So we do not need to clear wait_request again for a bfqq that is not
in service.
KASAN log is given as follows:
[13058.354613] ==================================================================
[13058.354640] BUG: KASAN: use-after-free in bfq_idle_slice_timer+0xac/0x290
[13058.354644] Read of size 8 at addr ffffa02cf3e63f78 by task fork13/19767
[13058.354646]
[13058.354655] CPU: 96 PID: 19767 Comm: fork13
[13058.354661] Call trace:
[13058.354667] dump_backtrace+0x0/0x310
[13058.354672] show_stack+0x28/0x38
[13058.354681] dump_stack+0xd8/0x108
[13058.354687] print_address_description+0x68/0x2d0
[13058.354690] kasan_report+0x124/0x2e0
[13058.354697] __asan_load8+0x88/0xb0
[13058.354702] bfq_idle_slice_timer+0xac/0x290
[13058.354707] __hrtimer_run_queues+0x298/0x8b8
[13058.354710] hrtimer_interrupt+0x1b8/0x678
[13058.354716] arch_timer_handler_phys+0x4c/0x78
[13058.354722] handle_percpu_devid_irq+0xf0/0x558
[13058.354731] generic_handle_irq+0x50/0x70
[13058.354735] __handle_domain_irq+0x94/0x110
[13058.354739] gic_handle_irq+0x8c/0x1b0
[13058.354742] el1_irq+0xb8/0x140
[13058.354748] do_wp_page+0x260/0xe28
[13058.354752] __handle_mm_fault+0x8ec/0x9b0
[13058.354756] handle_mm_fault+0x280/0x460
[13058.354762] do_page_fault+0x3ec/0x890
[13058.354765] do_mem_abort+0xc0/0x1b0
[13058.354768] el0_da+0x24/0x28
[13058.354770]
[13058.354773] Allocated by task 19731:
[13058.354780] kasan_kmalloc+0xe0/0x190
[13058.354784] kasan_slab_alloc+0x14/0x20
[13058.354788] kmem_cache_alloc_node+0x130/0x440
[13058.354793] bfq_get_queue+0x138/0x858
[13058.354797] bfq_get_bfqq_handle_split+0xd4/0x328
[13058.354801] bfq_init_rq+0x1f4/0x1180
[13058.354806] bfq_insert_requests+0x264/0x1c98
[13058.354811] blk_mq_sched_insert_requests+0x1c4/0x488
[13058.354818] blk_mq_flush_plug_list+0x2d4/0x6e0
[13058.354826] blk_flush_plug_list+0x230/0x548
[13058.354830] blk_finish_plug+0x60/0x80
[13058.354838] read_pages+0xec/0x2c0
[13058.354842] __do_page_cache_readahead+0x374/0x438
[13058.354846] ondemand_readahead+0x24c/0x6b0
[13058.354851] page_cache_sync_readahead+0x17c/0x2f8
[13058.354858] generic_file_buffered_read+0x588/0xc58
[13058.354862] generic_file_read_iter+0x1b4/0x278
[13058.354965] ext4_file_read_iter+0xa8/0x1d8 [ext4]
[13058.354972] __vfs_read+0x238/0x320
[13058.354976] vfs_read+0xbc/0x1c0
[13058.354980] ksys_read+0xdc/0x1b8
[13058.354984] __arm64_sys_read+0x50/0x60
[13058.354990] el0_svc_common+0xb4/0x1d8
[13058.354994] el0_svc_handler+0x50/0xa8
[13058.354998] el0_svc+0x8/0xc
[13058.354999]
[13058.355001] Freed by task 19731:
[13058.355007] __kasan_slab_free+0x120/0x228
[13058.355010] kasan_slab_free+0x10/0x18
[13058.355014] kmem_cache_free+0x288/0x3f0
[13058.355018] bfq_put_queue+0x134/0x208
[13058.355022] bfq_exit_icq_bfqq+0x164/0x348
[13058.355026] bfq_exit_icq+0x28/0x40
[13058.355030] ioc_exit_icq+0xa0/0x150
[13058.355035] put_io_context_active+0x250/0x438
[13058.355038] exit_io_context+0xd0/0x138
[13058.355045] do_exit+0x734/0xc58
[13058.355050] do_group_exit+0x78/0x220
[13058.355054] __wake_up_parent+0x0/0x50
[13058.355058] el0_svc_common+0xb4/0x1d8
[13058.355062] el0_svc_handler+0x50/0xa8
[13058.355066] el0_svc+0x8/0xc
[13058.355067]
[13058.355071] The buggy address belongs to the object at ffffa02cf3e63e70
 which belongs to the cache bfq_queue of size 464
[13058.355075] The buggy address is located 264 bytes inside of
 464-byte region [ffffa02cf3e63e70, ffffa02cf3e64040)
[13058.355077] The buggy address belongs to the page:
[13058.355083] page:ffff7e80b3cf9800 count:1 mapcount:0 mapping:ffff802db5c90780 index:0xffffa02cf3e606f0 compound_mapcount: 0
[13058.366175] flags: 0x2ffffe0000008100(slab|head)
[13058.370781] raw: 2ffffe0000008100 ffff7e80b53b1408 ffffa02d730c1c90 ffff802db5c90780
[13058.370787] raw: ffffa02cf3e606f0 0000000000370023 00000001ffffffff 0000000000000000
[13058.370789] page dumped because: kasan: bad access detected
[13058.370791]
[13058.370792] Memory state around the buggy address:
[13058.370797] ffffa02cf3e63e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb
[13058.370801] ffffa02cf3e63e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13058.370805] >ffffa02cf3e63f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13058.370808] ^
[13058.370811] ffffa02cf3e63f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13058.370815] ffffa02cf3e64000: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[13058.370817] ==================================================================
[13058.370820] Disabling lock debugging due to kernel taint
Here, we directly pass the bfqd to bfq_idle_slice_timer_body func.
--
V2->V3: rewrite the comment as suggested by Paolo Valente
V1->V2: add one comment, and add Fixes and Reported-by tag.
Fixes: aee69d78d ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reported-by: Wang Wang <wangwang2@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Feilong Lin <linfeilong@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 30a2da7b7e225ef6c87a660419ea04d3cef3f6a7 ]
There is a potential race between ioc_release_fn() and
ioc_clear_queue(), shown below, which leads to the kernel crash
reported further down. It can also result in a use-after-free
issue.
context#1:                          context#2:
ioc_release_fn()                    __ioc_clear_queue() gets the same icq
->spin_lock(&ioc->lock);            ->spin_lock(&ioc->lock);
->ioc_destroy_icq(icq);
  ->list_del_init(&icq->q_node);
  ->call_rcu(&icq->__rcu_head,
             icq_free_icq_rcu);
->spin_unlock(&ioc->lock);
                                    ->ioc_destroy_icq(icq);
                                      ->hlist_del_init(&icq->ioc_node);
                                      This results into below crash as this memory
                                      is now used by icq->__rcu_head in context#1.
                                      There is a chance that icq could be free'd
                                      as well.
22150.386550: <6> Unable to handle kernel write to read-only memory
at virtual address ffffffaa8d31ca50
...
Call trace:
22150.607350: <2> ioc_destroy_icq+0x44/0x110
22150.611202: <2> ioc_clear_queue+0xac/0x148
22150.615056: <2> blk_cleanup_queue+0x11c/0x1a0
22150.619174: <2> __scsi_remove_device+0xdc/0x128
22150.623465: <2> scsi_forget_host+0x2c/0x78
22150.627315: <2> scsi_remove_host+0x7c/0x2a0
22150.631257: <2> usb_stor_disconnect+0x74/0xc8
22150.635371: <2> usb_unbind_interface+0xc8/0x278
22150.639665: <2> device_release_driver_internal+0x198/0x250
22150.644897: <2> device_release_driver+0x24/0x30
22150.649176: <2> bus_remove_device+0xec/0x140
22150.653204: <2> device_del+0x270/0x460
22150.656712: <2> usb_disable_device+0x120/0x390
22150.660918: <2> usb_disconnect+0xf4/0x2e0
22150.664684: <2> hub_event+0xd70/0x17e8
22150.668197: <2> process_one_work+0x210/0x480
22150.672222: <2> worker_thread+0x32c/0x4c8
Fix this by adding a new ICQ_DESTROYED flag, set in ioc_destroy_icq(),
to indicate that this icq has already been marked as destroyed. Also,
ensure __ioc_clear_queue() accesses the icq within
rcu_read_lock()/rcu_read_unlock() so that the icq is not freed while
it is still in use.
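The effect of a "destroyed" flag can be sketched as an idempotent
teardown. This model folds the set and the check into one hypothetical
helper for brevity; the flag value and all names are illustrative, and
the RCU protection mentioned above is not modeled.

```c
#include <stdbool.h>

#define MODEL_ICQ_DESTROYED (1u << 0)  /* hypothetical bit value */

struct model_icq {
	unsigned int flags;
	int teardown_count;  /* how many times teardown actually ran */
};

/* Tear down @icq at most once; later callers see the flag and back off. */
static bool model_destroy_icq(struct model_icq *icq)
{
	if (icq->flags & MODEL_ICQ_DESTROYED)
		return false;
	icq->flags |= MODEL_ICQ_DESTROYED;
	icq->teardown_count++;
	return true;
}
```

The second context that reaches the same icq then skips the list
manipulation instead of writing into memory reused by the RCU head.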
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Co-developed-by: Pradeep P V K <ppvk@codeaurora.org>
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e74d93e96d721c4297f2a900ad0191890d2fc2b0 ]
Field bdi->io_pages added in commit 9491ae4aad ("mm: don't cap request
size based on read-ahead setting") removes unneeded split of read requests.
Stacked drivers do not call blk_queue_max_hw_sectors(). Instead they set
limits of their devices by blk_set_stacking_limits() + disk_stack_limits().
Field bdi->io_pages stays zero until the user sets max_sectors_kb via sysfs.
This patch updates io_pages after merging limits in disk_stack_limits().
Commit c6d6e9b0f6b4 ("dm: do not allow readahead to limit IO size") fixed
the same problem for device-mapper devices, this one fixes MD RAIDs.
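The conversion involved is a unit change from 512-byte sectors to pages.
A minimal sketch, assuming io_pages is derived from the merged
max_sectors limit and assuming 4 KiB pages; the macro and function names
are hypothetical:

```c
#define MODEL_PAGE_SHIFT   12  /* assumes 4 KiB pages */
#define MODEL_SECTOR_SHIFT 9   /* 512-byte sectors */

/* Convert a request-size limit in sectors to whole pages. */
static unsigned long model_io_pages(unsigned int max_sectors)
{
	return max_sectors >> (MODEL_PAGE_SHIFT - MODEL_SECTOR_SHIFT);
}
```

If this value is never recomputed after limits are stacked, it stays at
its initial zero, which is the situation the patch addresses.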
Fixes: 9491ae4aad ("mm: don't cap request size based on read-ahead setting")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----------------------------------------------------------------
This patch is not on mainline and is meant for 4.19 stable *only*.
The reasoning for that follows the patch description.
-----------------------------------------------------------------
Commit 37f9579f4c31 ("blk-mq: Avoid that submitting a bio concurrently
with device removal triggers a crash") introduced a NULL pointer
dereference in generic_make_request(). The patch sets q to NULL and
enter_succeeded to false; right after, there's an 'if (enter_succeeded)'
which is not taken, and then the 'else' will dereference q in
blk_queue_dying(q).
This patch just moves the 'q = NULL' to a point in which it won't trigger
the oops, although the semantics of this NULLification remains untouched.
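The control-flow problem can be sketched in a userspace model. All names
and return codes are hypothetical: the buggy variant returns -1 at the
point where the real code would dereference NULL, and the fixed variant
clears the pointer only after its last use.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_q { bool dying; };

/* Return codes: 0 = submitted, 1 = I/O error, 2 = would block, -1 = oops. */

/* Buggy shape: q is cleared before the failure branch inspects it. */
static int model_submit_buggy(struct model_q *q, bool enter_ok)
{
	bool enter_succeeded = true;

	if (!enter_ok) {
		enter_succeeded = false;
		q = NULL;
	}
	if (enter_succeeded)
		return 0;
	if (q == NULL)
		return -1;  /* stands in for the NULL pointer dereference */
	return q->dying ? 1 : 2;
}

/* Fixed shape: q is inspected first and cleared afterwards. */
static int model_submit_fixed(struct model_q *q, bool enter_ok)
{
	int ret;

	if (enter_ok)
		return 0;
	ret = q->dying ? 1 : 2;  /* q is still valid here */
	q = NULL;                /* cleared only after its last use */
	(void)q;
	return ret;
}
```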
A simple test case/reproducer is as follows:
a) Build kernel v4.19.56-stable with CONFIG_BLK_CGROUP=n.
b) Create a raid0 md array with 2 NVMe devices as members, and mount
it with an ext4 filesystem.
c) Run the following oneliner (supposing the raid0 is mounted in /mnt):
(dd of=/mnt/tmp if=/dev/zero bs=1M count=999 &); sleep 0.3;
echo 1 > /sys/block/nvme1n1/device/device/remove
(where nvme1n1 is the 2nd array member)
This will trigger the following oops:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
RIP: 0010:generic_make_request+0x32b/0x400
Call Trace:
submit_bio+0x73/0x140
ext4_io_submit+0x4d/0x60
ext4_writepages+0x626/0xe90
do_writepages+0x4b/0xe0
[...]
This patch has no functional changes and preserves the md/raid0 behavior
when a member is removed before kernel v4.17.
----------------------------
Why this is not on mainline?
----------------------------
The patch was originally submitted upstream in linux-raid and
linux-block mailing-lists - it was initially accepted by Song Liu,
but Christoph Hellwig[0] observed that there was a clean-up series
ready to be accepted from Ming Lei[1] that fixed the same issue.
The accepted patches from Ming's series in upstream are: commit
47cdee29ef9d ("block: move blk_exit_queue into __blk_release_queue") and
commit fe2008640ae3 ("block: don't protect generic_make_request_checks
with blk_queue_enter"). Those patches basically do a clean-up in the
block layer involving:
1) Putting back blk_exit_queue() logic into __blk_release_queue(); that
path was changed in the past and the logic from blk_exit_queue() was
added to blk_cleanup_queue().
2) Removing the guard/protection in generic_make_request_checks() with
blk_queue_enter().
The problem with Ming's series for -stable is that it relies on the
legacy request IO path removal. So it's "backport-able" to v5.0+,
but doing that for earlier versions (like 4.19) would require complex
code changes. Hence, Christoph and Song Liu suggested that this patch
be submitted to stable only; otherwise merging it upstream would add
code to fix a path removed in a subsequent commit.
[0] lore.kernel.org/linux-block/20190521172258.GA32702@infradead.org
[1] lore.kernel.org/linux-block/20190515030310.20393-1-ming.lei@redhat.com
Change-Id: Idc4807280e16512cfbe7b0b4fcbc724c3e637042
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Eric Ren <renzhengeek@gmail.com>
Fixes: 37f9579f4c31 ("blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit: c9d8d3e9d7a0db238dbef5e85405d41051cb1ff7
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block
Signed-off-by: Sridhar Arra <sarra@codeaurora.org>
If we end up splitting a bio and the queue goes away between
the initial submission and the later split submission, then we
can block forever in blk_queue_enter() waiting for the reference
to drop to zero. This will never happen, since we already hold
a reference.
Mark a split bio as already having entered the queue, so we can
just use the live non-blocking queue enter variant.
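A rough user-space model of the idea follows. All names here
(queue_sketch, bio_sketch, queue_entered, submit_enter_sketch) are
hypothetical, and the blocking wait is modelled as a WOULD_BLOCK return
value rather than an actual sleep.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request queue with a usage counter. */
struct queue_sketch {
    int refs;       /* models the queue usage reference count */
    bool frozen;    /* the queue is going away and waits for refs == 0 */
};

/* Hypothetical stand-in for a bio. */
struct bio_sketch {
    bool queue_entered; /* set on a split bio: a reference is already held */
};

enum enter_result { ENTERED, WOULD_BLOCK };

static enum enter_result submit_enter_sketch(struct queue_sketch *q,
                                             const struct bio_sketch *bio)
{
    if (bio->queue_entered) {
        /*
         * Live, non-blocking variant: the submitter already holds a
         * reference, so taking another one cannot race with teardown.
         */
        q->refs++;
        return ENTERED;
    }

    /*
     * Regular variant: with the queue frozen this would wait for the
     * reference count to drop to zero, which never happens while the
     * original submission still holds its reference.
     */
    if (q->frozen)
        return WOULD_BLOCK;

    q->refs++;
    return ENTERED;
}
```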
Thanks to Tetsuo Handa for the analysis.
Change-Id: Ifb03a192324bcb53c79f6d249422cce37cae2776
Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Git-commit: cd4a4ae4683dc2e09380118e205e057896dcda2b
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block
Signed-off-by: Sridhar Arra <sarra@codeaurora.org>
Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
it is no longer safe to access cgroup information during or after the
blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
call with blk_queue_enter() / blk_queue_exit().
Change-Id: I156a5ecabeb21b320d9303f8cb193b273cf3b8a1
Reported-by: Ming Lei <ming.lei@redhat.com>
Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Git-commit: 37f9579f4c31a6d698dbf3016d7bf132f9288d30
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block
Signed-off-by: Sridhar Arra <sarra@codeaurora.org>
commit 530ca2c9bd6949c72c9b5cfc330cb3dbccaa3f5b upstream.
A recent commit runs tag iterator callbacks under the rcu read lock,
but existing callbacks do not satisfy the non-blocking requirement.
The commit intended to prevent an iterator from accessing a queue that's
being modified. This patch fixes the original issue by taking a queue
reference instead of reading it, which allows callbacks to make blocking
calls.
Fixes: f5bbbbe4d6357 ("blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter")
Acked-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5bbbbe4d63577026f908a809f22f5fd5a90ea1f upstream.
For blk-mq, part_in_flight/rw will invoke blk_mq_in_flight/rw to
account the inflight requests. It will access the queue_hw_ctx and
nr_hw_queues w/o any protection. When an nr_hw_queues update and
blk_mq_in_flight/rw occur concurrently, a panic can result.
Before nr_hw_queues is updated, the queue is frozen, so we can use
q_usage_counter to avoid the race. percpu_ref_is_zero is used here
so that we will not miss any in-flight request. The accesses to
nr_hw_queues and queue_hw_ctx in blk_mq_queue_tag_busy_iter are
under an rcu critical section, so __blk_mq_update_nr_hw_queues can use
synchronize_rcu to ensure that the zeroed q_usage_counter is globally
visible.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Giuliano Procida <gprocida@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need a way to tell which type of keys the inline crypto hardware
supports (standard, wrapped, or both), so that fallbacks can be used
when needed (either blk-crypto-fallback, or fscrypt fs-layer crypto).
We can't simply assume that
keyslot_mgmt_ll_ops::derive_raw_secret == NULL
means only standard keys are supported and that
keyslot_mgmt_ll_ops::derive_raw_secret != NULL
means that only wrapped keys are supported, because device-mapper
devices always implement this method. Also, hardware might support both
types of keys.
Therefore, add a field keyslot_manager::features which contains a
bitmask of flags which indicate the supported types of keys. Drivers
will need to fill this in. This patch makes the UFS standard crypto
code set BLK_CRYPTO_FEATURE_STANDARD_KEYS, but UFS variant drivers may
need to set BLK_CRYPTO_FEATURE_WRAPPED_KEYS instead.
Then, make keyslot_manager_crypto_mode_supported() take the key type
into account.
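As an illustration only, the check could look roughly like the sketch
below. The two flag names come from the text above; their bit values,
struct ksm_sketch, NUM_MODES_SKETCH and mode_supported_sketch() are
assumptions made for this example and are not the kernel definitions.

```c
#include <stdbool.h>

/* Flag names from the commit text; the bit positions are assumed. */
#define BLK_CRYPTO_FEATURE_STANDARD_KEYS (1u << 0)
#define BLK_CRYPTO_FEATURE_WRAPPED_KEYS  (1u << 1)

#define NUM_MODES_SKETCH 4

/* Hypothetical, trimmed-down keyslot manager. */
struct ksm_sketch {
    unsigned int features;  /* bitmask of supported key types */
    /* Per crypto mode: bitmask of supported power-of-two data unit sizes. */
    unsigned int du_sizes[NUM_MODES_SKETCH];
};

static bool mode_supported_sketch(const struct ksm_sketch *ksm,
                                  unsigned int mode,
                                  unsigned int data_unit_size,
                                  bool is_hw_wrapped)
{
    unsigned int needed = is_hw_wrapped ? BLK_CRYPTO_FEATURE_WRAPPED_KEYS
                                        : BLK_CRYPTO_FEATURE_STANDARD_KEYS;

    if (!ksm || mode >= NUM_MODES_SKETCH)
        return false;
    /* The key type must be supported before the mode is even considered. */
    if (!(ksm->features & needed))
        return false;
    return (ksm->du_sizes[mode] & data_unit_size) != 0;
}
```

A caller that gets false here would use one of the fallbacks instead of
the inline crypto hardware.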
Bug: 137270441
Bug: 151100202
Test: 'atest vts_kernel_encryption_test' on Pixel 4 with the
inline crypto patches backported, and also on Cuttlefish.
Change-Id: Ied846c2767c1fd2f438792dcfd3649157e68b005
Signed-off-by: Eric Biggers <ebiggers@google.com>
If blk-crypto-fallback is needed but is disabled by kconfig, make
blk_crypto_start_using_mode() return an error rather than succeeding.
Use ENOPKG, which matches the error code used by fscrypt when crypto API
support is missing with fs-layer encryption.
Also, if blk-crypto-fallback is needed but the algorithm is missing from
the kernel's crypto API, change the error code from ENOENT to ENOPKG.
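The resulting error policy can be summarized with a small sketch. The
function start_using_mode_sketch() and its three boolean parameters are
hypothetical; they only model the decision, not the real signature of
blk_crypto_start_using_mode().

```c
#include <errno.h>
#include <stdbool.h>

#ifndef ENOPKG
#define ENOPKG 65   /* Linux value, only for platforms whose errno.h lacks it */
#endif

/*
 * Returns 0 when the mode can be used, or -ENOPKG when the fallback is
 * needed but cannot be provided.
 */
static int start_using_mode_sketch(bool hw_supports_mode,
                                   bool fallback_enabled,
                                   bool algorithm_available)
{
    if (hw_supports_mode)
        return 0;           /* no fallback needed */
    if (!fallback_enabled)
        return -ENOPKG;     /* previously reported as success */
    if (!algorithm_available)
        return -ENOPKG;     /* previously -ENOENT */
    return 0;
}
```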
This is needed for VtsKernelEncryptionTest to pass on some devices.
Bug: 137270441
Bug: 151100202
Test: 'atest vts_kernel_encryption_test' on Pixel 4 with the
inline crypto patches backported, and also on Cuttlefish.
Change-Id: Iedf00ca8e48c74a5d4c40b12712f38738a04ef11
Signed-off-by: Eric Biggers <ebiggers@google.com>
[ Upstream commit 14afc59361976c0ba39e3a9589c3eaa43ebc7e1d ]
The bfq_find_set_group() function takes as input a blkcg (which represents
a cgroup) and retrieves the corresponding bfq_group, then it updates the
bfq internal group hierarchy (see comments inside the function for why
this is needed) and finally it returns the bfq_group.
In the hierarchy update cycle, the pointer holding the correct bfq_group
that has to be returned is mistakenly used to traverse the hierarchy
bottom to top, meaning that in each iteration it gets overwritten with the
parent of the current group. Since the update cycle stops at root's
children (depth = 2), the overwrite becomes a problem only if the blkcg
describes a cgroup at a hierarchy level deeper than that (depth > 2). In
this case the root's child that happens to be also an ancestor of the
correct bfq_group is returned. The main consequence is that processes
contained in a cgroup at depth greater than 2 are wrongly placed in the
group described above by BFQ.
This commit fixes this problem by using a different bfq_group pointer in
the update cycle in order to avoid the overwrite of the variable holding
the original group reference.
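The overwrite can be reproduced with a simplified model. The sketch below
is not the BFQ code: struct group_sketch and both functions are
hypothetical, and the hierarchy update is reduced to setting a flag on
each visited group. Both functions assume grp is a proper descendant of
root.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bfq_group. */
struct group_sketch {
    struct group_sketch *parent;
    int linked;     /* set when the update cycle has visited the group */
};

/* Buggy pattern: the pointer to be returned is also the loop cursor. */
static struct group_sketch *find_set_group_buggy(struct group_sketch *grp,
                                                 const struct group_sketch *root)
{
    for (;; grp = grp->parent) {
        grp->linked = 1;
        if (grp->parent == root)
            break;      /* the cycle stops at root's children */
    }
    return grp;         /* root's child, not the requested group */
}

/* Fixed pattern: a separate cursor walks the hierarchy. */
static struct group_sketch *find_set_group_fixed(struct group_sketch *grp,
                                                 const struct group_sketch *root)
{
    struct group_sketch *cur;

    for (cur = grp;; cur = cur->parent) {
        cur->linked = 1;
        if (cur->parent == root)
            break;
    }
    return grp;         /* still the group that was looked up */
}
```

For a group that is a direct child of the root the two variants return
the same pointer, which matches the observation that only deeper levels
are affected.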
Reported-by: Kwon Je Oh <kwonje.oh2@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato95@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Avoid that the following race can occur:
blk_cleanup_queue() blkcg_print_blkgs()
spin_lock_irq(lock) (1) spin_lock_irq(blkg->q->queue_lock) (2,5)
q->queue_lock = &q->__queue_lock (3)
spin_unlock_irq(lock) (4)
spin_unlock_irq(blkg->q->queue_lock) (6)
(1) take driver lock;
(2) busy loop for driver lock;
(3) override driver lock with internal lock;
(4) unlock driver lock;
(5) can take driver lock now;
(6) but unlock internal lock.
This change is safe because only the SCSI core and the NVME core keep
a reference on a request queue after having called blk_cleanup_queue().
Neither driver accesses any of the removed data structures between its
blk_cleanup_queue() and blk_put_queue() calls.
Change-Id: I231cdcd45d3a880eeb744183860d943776ce2fee
Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Git-commit: a063057d7c731cffa7d10740e8ebc2970df8dbb3
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git
Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
blk-crypto-fallback does not support wrapped keys, hence
prevent falling back when program_key fails. Add an 'is_hw_wrapped'
flag to blk-crypto-key to indicate whether the key is wrapped
when the key is initialized.
Bug: 147209885
Test: Validate FBE, simulate a failure in the underlying blk
device and ensure the call fails without falling back
to blk-crypto-fallback.
Change-Id: I8bc301ca1ac9e55ba6ab622e8325486916b45c56
Signed-off-by: Barani Muthukumaran <bmuthuku@codeaurora.org>
The call to pm_runtime_get_sync() in ufshcd_program_key() can deadlock
because it waits for the UFS controller to be resumed, but it can itself
be reached while resuming the UFS controller via:
- ufshcd_runtime_resume()
- ufshcd_resume()
- ufshcd_reset_and_restore()
- ufshcd_host_reset_and_restore()
- ufshcd_hba_enable()
- ufshcd_hba_execute_hce()
- ufshcd_hba_start()
- ufshcd_crypto_enable()
- keyslot_manager_reprogram_all_keys()
- ufshcd_crypto_keyslot_program()
- ufshcd_program_key()
But pm_runtime_get_sync() *is* needed when evicting a key. Also, on
pre-4.20 kernels it's needed when programming a keyslot for a bio since
the block layer used to resume the device in a different place.
Thus, it's hard for drivers to know what to do in .keyslot_program() and
.keyslot_evict(). In old kernels it may even be impossible unless we
were to pass more information down from the keyslot_manager.
There's also another possible deadlock: keyslot programming and eviction
take ksm->lock for write and then resume the device, which may result in
ksm->lock being taken again via the above call stack. To fix this, we
should resume the device before taking ksm->lock.
Fix these problems by moving to a better design where the block layer
(namely, the keyslot manager) handles runtime power management instead
of drivers. This is analogous to the block layer's existing runtime
power management support (blk-pm), which handles resuming devices when
bios are submitted to them so that drivers don't need to handle it.
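The lock-ordering part of the problem can be illustrated with a toy
model. Everything below is hypothetical (struct dev_sketch and all
function names); ksm->lock is modelled as a non-recursive flag, and a
second acquisition is recorded as a deadlock instead of actually hanging.

```c
#include <stdbool.h>

struct dev_sketch {
    bool suspended;
    bool lock_held;     /* models ksm->lock held for write */
    bool deadlocked;    /* set when the lock is requested while held */
    int programmed;     /* number of keyslots programmed */
};

static bool ksm_lock_sketch(struct dev_sketch *d)
{
    if (d->lock_held) {
        d->deadlocked = true;   /* a real lock would block forever here */
        return false;
    }
    d->lock_held = true;
    return true;
}

static void ksm_unlock_sketch(struct dev_sketch *d)
{
    d->lock_held = false;
}

/* Models runtime resume: reprogramming all keys needs the lock. */
static void device_resume_sketch(struct dev_sketch *d)
{
    if (!d->suspended)
        return;
    if (!ksm_lock_sketch(d))
        return;                 /* resume cannot complete */
    ksm_unlock_sketch(d);
    d->suspended = false;
}

/* Old ordering: take the lock, then resume the device. */
static bool program_key_lock_first(struct dev_sketch *d)
{
    bool ok;

    if (!ksm_lock_sketch(d))
        return false;
    device_resume_sketch(d);
    ok = !d->suspended;
    if (ok)
        d->programmed++;
    ksm_unlock_sketch(d);
    return ok;
}

/* New ordering: resume the device before taking the lock. */
static bool program_key_resume_first(struct dev_sketch *d)
{
    device_resume_sketch(d);
    if (d->suspended || !ksm_lock_sketch(d))
        return false;
    d->programmed++;
    ksm_unlock_sketch(d);
    return true;
}
```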
Test: Tested on coral with:
echo 5 > /sys/bus/platform/devices/1d84000.ufshc/rpm_lvl
sleep 30
touch /data && sync # hangs before this fix
Also verified via kvm-xfstests that blk-crypto-fallback continues
to work both with and without CONFIG_PM=y.
Bug: 137270441
Bug: 149368295
Change-Id: I6bc9fb81854afe7edf490d71796ee68a61f7cbc8
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add a device-mapper target "dm-default-key" which assigns an encryption
key to bios that aren't for the contents of an encrypted file.
This ensures that all blocks on-disk will be encrypted with some key,
without the performance hit of file contents being encrypted twice when
fscrypt (File-Based Encryption) is used.
It is only appropriate to use dm-default-key when key configuration is
tightly controlled, like it is in Android, such that all fscrypt keys
are at least as hard to compromise as the default key.
Compared to the original version of dm-default-key, this has been
modified to use the new vendor-independent inline encryption framework
(which works even when no inline encryption hardware is present), the
table syntax has been changed to match dm-crypt, and support for
specifying Adiantum encryption has been added. These changes also mean
that dm-default-key now always explicitly specifies the DUN (the IV).
Also, to handle f2fs moving blocks of encrypted files around without the
key, and to handle ext4 and f2fs filesystems mounted without
'-o inlinecrypt', the mapping logic is no longer "set a key on the bio
if it doesn't have one already", but rather "set a key on the bio unless
the bio has the bi_skip_dm_default_key flag set". Filesystems set this
flag on *all* bios for encrypted file contents, regardless of whether
they are encrypting/decrypting the file using inline encryption or the
traditional filesystem-layer encryption, or moving the raw data.
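The difference between the two mapping rules can be sketched as follows.
Only the flag name bi_skip_dm_default_key comes from the text; struct
bio_sketch, its crypt_key member and both functions are hypothetical
simplifications.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down bio. */
struct bio_sketch {
    const void *crypt_key;          /* inline crypto key, if any */
    bool bi_skip_dm_default_key;    /* set by the filesystem */
};

/* Old rule: set a key on the bio if it doesn't have one already. */
static void map_old_rule(struct bio_sketch *bio, const void *default_key)
{
    if (bio->crypt_key == NULL)
        bio->crypt_key = default_key;
}

/* New rule: set a key on the bio unless the skip flag is set. */
static void map_new_rule(struct bio_sketch *bio, const void *default_key)
{
    if (!bio->bi_skip_dm_default_key)
        bio->crypt_key = default_key;
}
```

The interesting case is a bio carrying file contents that were already
encrypted at the filesystem layer: it has no inline key, so the old rule
would encrypt it a second time, while the new rule leaves it alone
because the filesystem set the flag.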
For the bi_skip_dm_default_key flag, a new field in struct bio is used
rather than a bit in bi_opf so that fscrypt_set_bio_crypt_ctx() can set
the flag, minimizing the changes needed to filesystems. (bi_opf is
usually overwritten after fscrypt_set_bio_crypt_ctx() is called.)
Bug: 137270441
Bug: 147814592
Change-Id: I69c9cd1e968ccf990e4ad96e5115b662237f5095
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Update the device-mapper core to support exposing the inline crypto
support of the underlying device(s) through the device-mapper device.
This works by creating a "passthrough keyslot manager" for the dm
device, which declares support for the set of (crypto_mode,
data_unit_size) combos which all the underlying devices support. When a
supported combo is used, the bio cloning code handles cloning the crypto
context to the bios for all the underlying devices. When an unsupported
combo is used, the blk-crypto fallback is used as usual.
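One way to picture the "set of combos which all the underlying devices
support" is as an intersection of per-mode bitmasks. This is a sketch
under that assumption; struct crypto_caps_sketch, NUM_MODES_SKETCH and
intersect_caps_sketch() are hypothetical names, and data unit sizes are
assumed to be powers of two so that they can be OR-ed into a mask.

```c
#include <stddef.h>

#define NUM_MODES_SKETCH 4

/* Per crypto mode: bitmask of supported data unit sizes. */
struct crypto_caps_sketch {
    unsigned int du_sizes[NUM_MODES_SKETCH];
};

/* Compute what a dm device stacked on devs[0..ndevs-1] could advertise. */
static void intersect_caps_sketch(struct crypto_caps_sketch *out,
                                  const struct crypto_caps_sketch *devs,
                                  size_t ndevs)
{
    size_t mode, i;

    for (mode = 0; mode < NUM_MODES_SKETCH; mode++) {
        unsigned int mask = ndevs ? ~0u : 0u;

        /* A combo survives only if every underlying device has it. */
        for (i = 0; i < ndevs; i++)
            mask &= devs[i].du_sizes[mode];
        out->du_sizes[mode] = mask;
    }
}
```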
Crypto support on each underlying device is ignored unless the
corresponding dm target opts into exposing it. This is needed because
for inline crypto to semantically operate on the original bio, the data
must not be transformed by the dm target. Thus, targets like dm-linear
can expose crypto support of the underlying device, but targets like
dm-crypt can't. (dm-crypt could use inline crypto itself, though.)
When a key is evicted from the dm device, it is evicted from all
underlying devices.
Bug: 137270441
Bug: 147814592
Change-Id: If28b574f2e28268db5eb9f325d4cf8f96cb63e3f
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
The regular keyslot manager is designed for devices that have a small
number of keyslots that need to be programmed with keys ahead of time,
and bios that are sent to the device need to be tagged with a keyslot
index.
Some inline encryption hardware may not have any limitations on the
number of keyslots, and may instead allow each bio to be tagged with
a raw key, data unit number, etc. rather than a pre-programmed keyslot's
index. These devices don't need any sort of keyslot management, and it's
better for these devices not to have to allocate a regular keyslot
manager with some fixed number of keyslots. These devices can instead
set up a passthrough keyslot manager in their request queue, which
requires fewer resources than a regular keyslot manager, as it simply
does no-ops when trying to program keys into slots.
Separately, the device mapper may map over devices that have inline
encryption hardware, and it wants to pass the key along to the
underlying hardware. While the DM layer can expose inline encryption
capabilities by setting up a regular keyslot manager with some fixed
number of keyslots in the dm device's request queue, this only wastes
memory since the keys programmed into the dm device's request queue
will never be used. Instead, it's better to set up a passthrough
keyslot manager for dm devices.
Bug: 137270441
Bug: 147814592
Change-Id: I6d91e83e86a73b0d6066873c8a9117cf2c089234
Signed-off-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Export the blk-crypto symbols needed for modules to use inline crypto.
These would have already been exported, except that so far they've only
been used by fs/crypto/, which is no longer modular.
Bug: 137270441
Bug: 147814592
Change-Id: I64bf98aecabe891c188b30dd50124aacb1e008ca
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
While we're waiting for v7 of the inline crypto patchset, fix some bugs
that made it into the v6 patchset, including one that caused bios with
an encryption context to never be merged, and one that could cause
non-contiguous pages to be incorrectly added to a bio.
Bug: 137270441
Change-Id: I3911fcd6c76b5c9063b86d6af6267ad990a46718
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Some inline encryption hardware supports protecting the keys in hardware
and only exposing wrapped keys to software. To use this capability,
userspace must provide a hardware-wrapped key rather than a raw key.
However, users of inline encryption in the kernel won't necessarily use
the user-specified key directly for inline encryption. E.g. with
fscrypt with IV_INO_LBLK_64 policies, each user-provided key is used to
derive a file contents encryption key, filenames encryption key, and key
identifier. Since inline encryption can only be used with file
contents, if the user were to provide a wrapped key there would
(naively) be no way to encrypt filenames or derive the key identifier.
This problem is solved by designing the hardware to internally use the
unwrapped key as input to a KDF from which multiple cryptographically
isolated keys can be derived, including both the inline crypto key (not
exposed to software) and a secret that *is* exposed to software.
Add a function to the keyslot manager to allow upper layers to request
this software secret from a hardware-wrapped key.
Bug: 147209885
Change-Id: I32f3aa4f25bcf6b9d6f7d8890260533fad00dd1d
Co-developed-by: Gaurav Kashyap <gaurkash@codeaurora.org>
Signed-off-by: Gaurav Kashyap <gaurkash@codeaurora.org>
Signed-off-by: Barani Muthukumaran <bmuthuku@codeaurora.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Currently, blk-crypto uses the algorithm to determine the size of keys.
However, some inline encryption hardware supports protecting keys from
software by wrapping the storage keys with an ephemeral key. Since
these wrapped keys are not of a fixed size, add the capability to
provide the key size when initializing a blk_crypto_key, and update the
keyslot manager to take size into account when comparing keys.
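A size-aware comparison might look like the sketch below. The struct
layout, MAX_KEY_SIZE_SKETCH and keys_equal_sketch() are assumptions made
for illustration and do not reflect the actual blk_crypto_key definition.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_KEY_SIZE_SKETCH 128

/* Hypothetical, trimmed-down key: the size is stored with the key. */
struct key_sketch {
    int crypto_mode;
    unsigned int size;  /* explicit, since wrapped keys vary in size */
    unsigned char raw[MAX_KEY_SIZE_SKETCH];
};

static bool keys_equal_sketch(const struct key_sketch *a,
                              const struct key_sketch *b)
{
    /* Compare the sizes first so that memcmp() covers the same length. */
    return a->crypto_mode == b->crypto_mode &&
           a->size == b->size &&
           memcmp(a->raw, b->raw, a->size) == 0;
}
```

Without the size check, two keys of different lengths that share a
common prefix could be mistaken for the same key.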
Bug: 147209885
Change-Id: I9bf26d06d18a2d671c51111b4896abe4df303988
Co-developed-by: Gaurav Kashyap <gaurkash@codeaurora.org>
Signed-off-by: Gaurav Kashyap <gaurkash@codeaurora.org>
Signed-off-by: Barani Muthukumaran <bmuthuku@codeaurora.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Changes v5 => v6:
- Blk-crypto's kernel crypto API fallback is no longer restricted to
8-byte DUNs. It's also now separately configurable from blk-crypto, and
can be disabled entirely, while still allowing the kernel to use inline
encryption hardware. Further, struct bio_crypt_ctx takes up less space,
and no longer contains the information needed by the crypto API
fallback - the fallback allocates the required memory when necessary.
- Blk-crypto now supports all file content encryption modes supported by
fscrypt.
- Fixed bio merging logic in blk-merge.c
- Fscrypt now supports inline encryption with the direct key policy, since
blk-crypto now has support for larger DUNs.
- Keyslot manager now uses a hashtable to look up which keyslot contains
any particular key (thanks Eric!)
- Fscrypt support for inline encryption now handles filesystems with
multiple underlying block devices (thanks Eric!)
- Numerous cleanups
Backport notes: In the time between the update from v5 to v6,
"scsi: ufs: override auto suspend tunables for ufs" was merged in
upstream, and as a result, UFSHCD_CAP_RPM_AUTOSUSPEND took up the
7th bit in the ufs crypto caps - however, that patch has not been
backported to 4.14 yet, so we manually change UFSHCD_CAP_CRYPTO to
use the 8th bit (to reflect what's in v6 in android-mainline).
Bug: 137270441
Test: refer to I26376479ee38259b8c35732cb3a1d7e15f9b05a3
Change-Id: I13e2e327e0b4784b394cb1e7cf32a04856d95f01
Link: https://lore.kernel.org/linux-block/20191218145136.172774-1-satyat@google.com/
Signed-off-by: Satya Tangirala <satyat@google.com>