[01:02:47] @jimharris @sethhowe - I noticed that after merging https://review.gerrithub.io/#/c/406248/ the execution time of one of the tests has increased significantly.
[01:03:29] Not sure if it's a problem or if we're ok with that.
[01:04:29] The problem is with the test/nvmf/fio/fio.sh test, specifically with the step where we do the hotplug/hotdetach test.
[01:05:21] Previously FIO exited immediately after removing the bdev, but now it waits until the timeout kicks in.
[01:06:15] This adds about 2 extra minutes of execution time, on both the VM and physical test servers I think.
[08:58:49] for anyone who regularly attends the SPDK community meetings, http://www.spdk.io/community/ has been updated to show the next meeting times in your system's local time zone (Wednesday, April 11, 2018 9:00 PM MST for me)
[09:12:35] (spdk/master) nvmf: reduce log level of informational messages (Daniel Verkamp)
[09:12:35] Diff URL: https://github.com/spdk/spdk/compare/4c7733618a79...9770ee7817d7
[09:13:23] but note that Trello and announcements will still be made using UTC...
[09:13:49] jimharris, or anyone interested in the ongoing crypto work... have a few options to consider
[09:14:15] right now, the proposed RFC patch supports the virtual PMD that does AES-NI CBC, and that's all it supports
[09:14:57] one path forward would be a separate vbdev per PMD (a different DPDK virtual PMD, or HW like Intel QAT)
[09:15:14] however, since DPDK has a unified API called cryptodev, we don't have to go that route
[09:15:58] we can have just one, slightly more complicated vbdev, where multiple virtual PMD types as well as HW can be supported and chosen via the config file
[09:16:39] the latter seems like the best option, but before I embark on that design based on the current RFC patch I wanted to get some feedback...
[09:17:53] (the former would have a few benefits though, mainly keeping each per-PMD vbdev simple for those who aren't interested in multiple drivers and/or mixing HW and virtual PMDs)
[09:33:10] I'd do one crypto bdev with an option to select the encryption algorithm
[09:38:46] bwalker, thanks, that's where I'm leaning as well, but certainly open to other thoughts if there's some other reason I hadn't thought of for having multiple vbdev modules (one per algorithm)
[09:56:12] I'd also just always use a poller
[09:56:18] in the implementation
[09:56:31] you can't know ahead of time whether the encryption algorithm is async or not
[09:56:44] but from your performance measurements, always using a poller doesn't seem to hurt to me
[09:56:47] if anything, it helps
[09:57:26] (spdk/master) nvmf: Queue incoming requests to a paused subsystem (Ben Walker)
[09:57:26] Diff URL: https://github.com/spdk/spdk/compare/9770ee7817d7...fe54959b623c
[10:16:55] bwalker, actually we do know, as there are only a dozen of them and each has to be manually added to be supported
[10:17:35] but it's not exposed in their public API, so even if one is known to be synchronous today it could later be changed to asynchronous
[10:18:47] we aren't just writing to the API though, the init is all manual. Anyway, it doesn't make much of a perf difference, and it can't be 100% one way or the other, because if the PMD is full at enqueue time I have to call the poller manually
[10:19:10] but the most consistent way is to always use the poller via the reactor and only call it manually in that rare exception when the PMD is full at submission
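To make the always-use-a-poller approach concrete, here is a minimal sketch of a per-channel dequeue poller against the DPDK cryptodev API. The crypto_io_channel layout, the crypto_op_complete() bookkeeping helper, and the burst size are illustrative assumptions, not the actual RFC patch:

    /*
     * Hedged sketch, not the RFC patch: drain completed crypto ops from a
     * DPDK cryptodev queue pair on every reactor iteration. One bdev_io may
     * fan out into many crypto ops, so completion bookkeeping is deferred
     * to an assumed helper.
     */
    #include "spdk/io_channel.h"
    #include <rte_cryptodev.h>
    #include <rte_crypto.h>

    #define MAX_DEQUEUE_BURST 64

    struct crypto_io_channel {
        uint8_t cdev_id;              /* DPDK crypto device id */
        uint16_t qp_id;               /* queue pair owned by this channel */
        struct spdk_poller *poller;   /* registered at channel creation */
    };

    /* Assumed helper: decrements the parent bdev_io's outstanding-op count
     * and calls spdk_bdev_io_complete() when the last op comes back. */
    void crypto_op_complete(struct rte_crypto_op *op, bool success);

    static int
    crypto_dev_poller(void *arg)
    {
        struct crypto_io_channel *ch = arg;
        struct rte_crypto_op *ops[MAX_DEQUEUE_BURST];
        uint16_t i, num_dequeued;

        num_dequeued = rte_cryptodev_dequeue_burst(ch->cdev_id, ch->qp_id,
                                                   ops, MAX_DEQUEUE_BURST);
        for (i = 0; i < num_dequeued; i++) {
            crypto_op_complete(ops[i],
                               ops[i]->status == RTE_CRYPTO_OP_STATUS_SUCCESS);
        }

        /* Non-zero tells the reactor this poller found work to do. */
        return num_dequeued;
    }

    /* At channel creation, run the poller on every reactor iteration: */
    /* ch->poller = spdk_poller_register(crypto_dev_poller, ch, 0); */

Running the poller unconditionally, as suggested above, works for both synchronous and asynchronous PMDs, since a synchronous PMD's completions are simply ready on the first dequeue.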
[10:27:39] the poller is for periodically checking dequeue, right? enqueue-full is a separate condition that you have to handle regardless
[10:27:49] because calling dequeue isn't guaranteed to fix enqueue-full
[10:28:19] it is for a synchronous encryption PMD, but not an async one
[10:28:30] yes, that's what the poller is for, but calling dequeue is the suggested path for handling the full condition (the only path)
[10:29:28] that won't work if the enqueue is full but none of the outstanding encryption operations are actually done
[10:30:01] correct
[10:30:38] so how do you handle that case?
[10:30:53] the options are to sit there and spin until there's room, or to save enough context off to defer the remaining items and try to enqueue them at some later point in time
[10:31:09] the facilities to handle the latter are a bit more complex of course, but doable
[10:31:25] I think you queue the bdev_io until an encryption operation completes
[10:31:34] that's the most robust solution
[10:31:45] and once you have that, then you use it to handle enqueue-full in all cases
[10:31:51] it's not a bdev_io
[10:31:52] you never call dequeue in response to enqueue-full
[10:32:03] the code calling enqueue has a bdev_io though
[10:32:07] in our bdev module
[10:32:41] these are different structures that we build up prior to enqueuing
[10:32:58] we don't know if it's going to be full or not until we try to enqueue a batch of them, and it may or may not swallow them all
[10:33:51] and each crypto op is 1:1 with an I/O vector, not a bdev I/O
[10:34:13] so I could easily have a partial bdev_io complete when I get a full condition back
[10:35:00] "complete" in that last sentence meaning encrypted/decrypted, not complete with respect to any other layer
[10:35:10] I see
[10:35:31] so you have to break up the buffers on some bdev_ios to be suitable for the encryption algorithms?
[10:35:31] so I'd have to queue and maintain state for the bdev_io as well as the remaining vectors to be processed by the engine
[10:35:43] which is probably why the maintainers suggested dequeuing until it can take more, all inline :)
[10:35:50] what are the rules for when you need to split?
[10:36:25] I don't actually break up the SGLs; I walk them and create a crypto op per vector, but there is a max vector size, so it is possible that one I/O vector can turn into multiple crypto ops, unfortunately
[10:36:51] the max size is 32K, but for performance reasons the maintainers suggested something much smaller to help the back-end pipeline smooth out; right now I think I have it at 4K
[10:36:55] per vector
[10:36:57] so you create a crypto op per element in the SGL
[10:37:12] yes
[10:37:17] I see - you break it up into small pieces
[10:37:19] or more than one per element if it's > 4K
[10:37:20] ok, I get that
[10:37:31] that makes it harder to queue for sure
[10:37:34] yeah, but I never have to actually touch the provided IOV list
[10:37:40] yeah :(
[10:37:42] in an effort to just get it working I totally agree - just spin on dequeue
[10:38:00] but ultimately I think we'll want to implement the queueing
[10:38:06] but on the plus side, they were surprised I could even get to a queue-full condition; I had to make some #defines artificially small to do so
[10:38:08] if this starts to get widespread use
[10:38:18] for sure
[10:39:22] gotta run, catch ya after lunch... thanks bwalker
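Sketching the submit path described above - one crypto op per SGL element, split at the 4K ceiling, with rejected ops deferred rather than spun on when the device reports full. This continues the earlier sketch (reusing struct crypto_io_channel); build_crypto_op(), queue_remaining_ops(), and the burst sizing are assumptions, not the actual patch:

    /*
     * Hedged sketch of the submit path discussed above. Walks the bdev_io's
     * iovecs, emits one crypto op per CRYPTO_MAX_OP_LEN bytes, and on a
     * partial enqueue defers the rejected ops instead of spinning on
     * dequeue. Helper names are illustrative.
     */
    #include "spdk/bdev_module.h"  /* struct spdk_bdev_io internals */
    #include "spdk/util.h"         /* spdk_min() */

    #define CRYPTO_MAX_OP_LEN  4096  /* the smaller per-op ceiling noted above */
    #define MAX_ENQUEUE_BURST  128   /* for brevity, assumes one bdev_io never
                                      * produces more ops than this; real code
                                      * would have to cap and iterate */

    /* Assumed helper: builds one symmetric crypto op covering len bytes at
     * buf and points it back at bdev_io for completion bookkeeping. */
    struct rte_crypto_op *build_crypto_op(struct crypto_io_channel *ch,
                                          struct spdk_bdev_io *bdev_io,
                                          void *buf, uint32_t len);

    /* Assumed helper: saves the ops the device refused on a per-channel
     * list; the dequeue poller retries them once completions free slots. */
    void queue_remaining_ops(struct crypto_io_channel *ch,
                             struct rte_crypto_op **ops, uint16_t count);

    static void
    crypto_submit_bdev_io(struct crypto_io_channel *ch,
                          struct spdk_bdev_io *bdev_io)
    {
        struct rte_crypto_op *ops[MAX_ENQUEUE_BURST];
        uint16_t num_ops = 0, num_enqueued;
        int i;

        for (i = 0; i < bdev_io->u.bdev.iovcnt; i++) {
            uint8_t *buf = bdev_io->u.bdev.iovs[i].iov_base;
            size_t remaining = bdev_io->u.bdev.iovs[i].iov_len;

            /* One op per iovec element; elements larger than the 4K
             * ceiling fan out into multiple ops. */
            while (remaining > 0) {
                uint32_t len = spdk_min(remaining, CRYPTO_MAX_OP_LEN);

                ops[num_ops++] = build_crypto_op(ch, bdev_io, buf, len);
                buf += len;
                remaining -= len;
            }
        }

        num_enqueued = rte_cryptodev_enqueue_burst(ch->cdev_id, ch->qp_id,
                                                   ops, num_ops);
        if (num_enqueued < num_ops) {
            /* Queue full: defer the tail rather than spinning here. */
            queue_remaining_ops(ch, &ops[num_enqueued],
                                num_ops - num_enqueued);
        }
    }

Once that deferral list exists, the same mechanism can hold an entire incoming bdev_io when the device is already saturated, which is the "queue the bdev_io until an encryption operation completes" approach suggested above.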
[12:32:58] hmm, the hotplug test case in test/nvmf/fio is taking a full minute before it times out - I wonder if there is some fio option we can tweak to make that quicker
[12:34:08] actually, it looks like the delete_bdev RPC is hanging - that's the minute timeout
[12:40:20] I think the nvmf hot-remove path is probably missing code to close all of its io_channels; it looks like it currently just calls spdk_bdev_close() immediately
[12:40:33] in _spdk_nvmf_ns_hot_remove() -> spdk_nvmf_subsystem_remove_ns()
[12:42:33] I finally configured my znc
[12:42:42] hopefully there will be just one darsto from now on
[13:00:07] (spdk/master) bdev/virtio/scsi: fix bdev->name memory leak (Dariusz Stojaczyk)
[13:00:08] Diff URL: https://github.com/spdk/spdk/compare/be98deff3e1e...20d8fec0397f
[13:00:39] (spdk/master) nvmf: use standard types in spdk_nvmf_valid_nqn() (Daniel Verkamp)
[13:00:40] Diff URL: https://github.com/spdk/spdk/compare/20d8fec0397f...9689e6cca576
[13:30:34] drv: I'll take a look
[13:32:09] yeah, I think this is broken
[13:32:26] I see it - it will happen if you have two namespaces, say nsid 5 and 10
[13:32:30] and you delete 5 but leave 10
[13:32:41] the channel won't ever get destroyed
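The shape of the fix being discussed is roughly "release the namespace's per-channel resources on every poll group before closing the descriptor". A hedged sketch using SPDK's spdk_for_each_channel() pattern follows; the nvmf_ns structure and the release helper are illustrative, not the actual lib/nvmf code:

    /*
     * Hedged sketch of the kind of fix discussed above: instead of calling
     * spdk_bdev_close() immediately in the hot-remove path, first visit
     * every poll-group io_channel and release the removed namespace's
     * resources, then close the descriptor in the completion callback.
     */
    #include "spdk/io_channel.h"
    #include "spdk/bdev.h"

    struct nvmf_ns {
        struct spdk_bdev_desc *desc;  /* descriptor opened when the ns was added */
        uint32_t nsid;
    };

    /* Assumed helper: drops the io_channel this poll group holds for ns. */
    void nvmf_poll_group_release_ns_channel(struct spdk_io_channel *ch,
                                            struct nvmf_ns *ns);

    static void
    hot_remove_visit_channel(struct spdk_io_channel_iter *i)
    {
        struct spdk_io_channel *ch = spdk_io_channel_iter_get_channel(i);
        struct nvmf_ns *ns = spdk_io_channel_iter_get_ctx(i);

        nvmf_poll_group_release_ns_channel(ch, ns);
        spdk_for_each_channel_continue(i, 0);
    }

    static void
    hot_remove_done(struct spdk_io_channel_iter *i, int status)
    {
        struct nvmf_ns *ns = spdk_io_channel_iter_get_ctx(i);

        /* Only now is it safe to close; closing immediately is what left
         * the channel alive and made delete_bdev hang for a minute. */
        spdk_bdev_close(ns->desc);
    }

    static void
    nvmf_ns_hot_remove(struct nvmf_ns *ns, void *tgt_io_device)
    {
        spdk_for_each_channel(tgt_io_device, hot_remove_visit_channel, ns,
                              hot_remove_done);
    }

Deferring the close to the for_each completion also explains the two-namespace case above: each namespace's channels must be torn down individually, so deleting nsid 5 while nsid 10 remains can no longer leave an orphaned channel behind.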
[13:47:06] jimharris, bwalker: quick RPC load_config fix if you have a moment: https://review.gerrithub.io/#/c/406659/
[14:32:28] (spdk/master) scripts/rpc.py: fix load_config method check (Daniel Verkamp)
[14:32:29] Diff URL: https://github.com/spdk/spdk/compare/9689e6cca576...46cbc7408a34
[15:16:20] jimharris, bwalker: another simple JSON config review (I've tested this one locally): https://review.gerrithub.io/#/c/406587/
[15:29:25] (spdk/master) bdev/nvme: add JSON config dump (Pawel Wodkowski)
[15:29:26] Diff URL: https://github.com/spdk/spdk/compare/46cbc7408a34...33aad6ee8d8e
[15:43:42] jimharris: mszwed made a comment about a potential cleanup on https://review.gerrithub.io/#/c/404616/ - I went ahead and merged it, just wanted to make sure you saw it
[15:46:13] (spdk/master) blob: make spdk_blob_resize an async operation (Jim Harris)
[15:46:13] Diff URL: https://github.com/spdk/spdk/compare/33aad6ee8d8e...463925ff0ff9
[16:57:55] hmm, I'm trying to run the lvol tests on my machine, and it seems like the order the tests run in with "all" mode is wrong
[16:58:25] it runs test case 1, 100, 10000, in that order, then tries to run 101, which doesn't work, since 10000 causes the target to exit
[16:58:45] but there is too much python magic going on for me to understand what is supposed to be happening
[16:59:07] the order for "all" seems to be based on this giant nested thing from lvol_test.py:
[16:59:08] for num_test in [i.split("test_case")[1] for i in dir(TestCases) if "test_case" in i]:
[16:59:42] I assume dir(TestCases) can return things in whatever order it feels like?
[17:05:20] I think this only works during the CI runs because we run an explicit set of tests, not all
[17:13:29] I believe this should fix it: https://review.gerrithub.io/#/c/406677/
[17:31:21] for sure!
[17:34:30] hmm, fixed and simplified, nice