[01:00:47] *** Joins: tkulasek (~tkulasek@134.134.139.83)
[09:23:24] *** Quits: tkulasek (~tkulasek@134.134.139.83) (Ping timeout: 260 seconds)
[09:28:21] *** Joins: travis-ci (~travis-ci@ec2-54-167-141-20.compute-1.amazonaws.com)
[09:28:22] <travis-ci> (spdk/master) bdev/qos: Add unit tests for spdk_bdev_set_qos_limit_iops (Ben Walker)
[09:28:22] <travis-ci> Diff URL: https://github.com/spdk/spdk/compare/8e17d9f21f3c...e18d2b768751
[09:28:22] *** Parts: travis-ci (~travis-ci@ec2-54-167-141-20.compute-1.amazonaws.com) ()
[09:49:23] <pwodkowx> is spdk_tgt supporting iSCSI?
[09:49:39] <bwalker> it's supposed to - I'm not ramped up on what the current state is
[09:49:48] <bwalker> iscsi, nvmf, vhost all in one
[09:50:40] <pwodkowx> some extra switch or something? I put a breakpoint in spsk_subsystem_init but not get there
[09:51:23] <bwalker> I haven't used it yet - was on sabbatical when they wrote it
[09:51:33] <bwalker> make sure the iscsi stuff is linked in
[09:55:00] <bwalker> this is neat: https://review.gerrithub.io/#/c/spdk/spdk/+/409577/
[11:09:36] <pwodkowx> jimharris: about bdev_iscsi, are we missing iscsi_set_target_name() and iscsi_set_initiaor_username_pwd() calls in create_iscsi_disk()??
[11:10:44] <pwodkowx> apps/examples from libiscsi call those bere connecting
[11:10:57] <pwodkowx> * before connecting
[11:10:58] <jimharris> username_pwd should be added to the TODO list - it's needed if CHAP is enabled
[11:11:40] <pwodkowx> ok, Add those as I'm already there?
[11:11:40] <jimharris> iscsi_set_target_name?
[11:13:09] <jimharris> let's make initiator_username_pwd a separate patch - you'll need to set up the iSCSI target for CHAP to test it
[11:13:45] <jimharris> oh - iscsi_set_targetname
[11:14:30] <pwodkowx> see examples/iscsi-dd.c: iscsi_set_targetname(client.src_iscsi, iscsi_url->target)
[11:15:11] <jimharris> i don't think we need that since we're using iscsi_parse_full_url
[11:15:32] <jimharris> i think iscsi_set_targetname would be used if you wanted to do that more piecemeal instead of specifying the full URL in a string
[11:15:47] <jimharris> it's possible you can pass CHAP user/pass in the URL string too
[11:15:57] <pwodkowx> ok
[11:15:57] <jimharris> in which case we shouldn't add set_initiator_username_pwd
[11:16:18] <jimharris> could you check that?  we may eventually want to support a piecemeal approach especially over RPC
[11:16:45] <jimharris> the idea of typing out the full URL string is not a pleasant thought
[11:17:01] <pwodkowx> yep, testing the RPC now, and this is why I'm asking :P
[11:18:06] <jimharris> i'm open to changes - when i got the original module up and running i just picked the full URL path since that was the easiest to get something running
[11:19:43] <pwodkowx> ok, will leave this for separate patchset
[11:21:10] <pwodkowx> from other side: about blobs and lvols. Pawel Kaminski is often hitting crashes if there are some unrecoverable errors in blobstore
[11:21:31] <pwodkowx> most annoying one is spdk_bit_array_get(bs->used_clusters, cluster_num) == false
[11:21:41] <jimharris> can you describe specifically what you mean by "unrecoverable errors"?
[11:22:00] <pwodkowx> spdk_tgt: blobstore.c:81: _spdk_bs_claim_cluster: Assertion `spdk_bit_array_get(bs->used_clusters, cluster_num) == false' failed.
[11:23:13] <jimharris> does he have a reproducible test case?
[11:23:18] <pwodkowx> this is debug build, but in release build we will simply ignore this. Should this be converted to some kind of runtime errors that will abort any IO to broken blobstore?
[11:24:19] <jimharris> does he hit this during runtime of some test, or when the blobstore is first loaded?
[11:26:17] <pwodkowx> I asked him to narrow this down but no luck yet.
[11:26:41] <pwodkowx> 100% repro was when we hit the 32MB cluster size issue, but this was fixed so we don't have a simple way to reproduce this.
[11:28:17] <pwodkowx> but the issue is that if blobstore is broken SPDK will crash in debug build. So there is no other way to fix this than overwriting nvme in linux
[11:29:36] <pwodkowx> Ask him to report this in github issues?
[11:29:45] <jimharris> definitely
[11:29:59] <pwodkowx> ok, will do.
[11:31:54] <jimharris> bwalker: is the used cluster mask on disk protected by a crc?
[11:32:55] <bwalker> I don't think it is
[11:33:02] <bwalker> that needs to be added
[11:33:22] <bwalker> the metadata pages each have a crc, and I think the whole blobstore header does now too
[11:33:29] <bwalker> but the masks don't as far as I can remember
[15:17:53] *** Joins: travis-ci (~travis-ci@ec2-184-73-56-0.compute-1.amazonaws.com)
[15:17:54] <travis-ci> (spdk/master) bdev/virtio/scsi: validate bs and num_blocks before creating a bdev (Dariusz Stojaczyk)
[15:17:54] <travis-ci> Diff URL: https://github.com/spdk/spdk/compare/e18d2b768751...14cf1ad3ac9f
[15:17:54] *** Parts: travis-ci (~travis-ci@ec2-184-73-56-0.compute-1.amazonaws.com) ()
[16:00:43] <peluse> so this is kidna fun....
[16:01:32] <peluse> can do crypto IO to different bdevs using different PMDs now at the same time.  ptd0 us uing QAT and ptd1 is using software PMD:
[16:01:34] <peluse> Running I/O for 10 seconds...
[16:01:34] <peluse>  Logical core: 0
[16:01:34] <peluse>  ptd0                :   85966.60 IO/s     335.81 MB/s
[16:01:34] <peluse>  Logical core: 1
[16:01:34] <peluse>  ptd1                :   76558.40 IO/s     299.06 MB/s
[16:01:36] <peluse>  =====================================================
[16:01:38] <peluse>  Total               :  162525.00 IO/s     634.86 MB/s
[16:01:45] <peluse> super quick test, 4K reads
[16:02:10] <peluse> QAT using 3DES CBC and software PME using AES_NI CBC
[16:04:04] <peluse> but still a few beers away from being able dynamically/efficiently select which PMD to use based on # of available PMDs, number of lcores and number of supported qpairs per PMD. And to make it really fun, 1 QAT device presents 32 PMDs with 2 QP each and a SW PMD is just 1 device with 8 QP each.  My brain hurts :)
[18:41:42] <peluse> PS: both of those were to a RAM disk...
[19:41:36] <peluse> here's the same thing with a single NVMe device split in 2 with the split module
[19:41:38] <peluse> Running I/O for 10 seconds...
[19:41:39] <peluse>  Logical core: 0
[19:41:39] <peluse>  cry0                :   57678.60 IO/s     225.31 MB/s
[19:41:39] <peluse>  Logical core: 1
[19:41:39] <peluse>  cry1                :   62243.00 IO/s     243.14 MB/s
[19:41:40] <peluse>  =====================================================
[19:41:41] <peluse>  Total               :  119921.60 IO/s     468.44 MB/s
[19:45:53] <peluse> unclear as to why QAT is slower here, have some ideas though... still early
[21:39:27] *** Quits: pwodkowx (pwodkowx@nat/intel/x-pkpmobcinybjfhcv) (Ping timeout: 240 seconds)
[21:39:53] *** Joins: pwodkowx (pwodkowx@nat/intel/x-jrkpgmntidtnkwuj)