[00:12:05] *** Quits: lhodev (~lhodev@66-90-218-190.dyn.grandenetworks.net) (Ping timeout: 256 seconds)
[00:12:13] *** Joins: lhodev_ (~lhodev@66-90-218-190.dyn.grandenetworks.net)
[00:52:52] *** Joins: tkulasek_ (~tkulasek@134.134.139.75)
[00:52:53] *** Joins: tkulasek (~tkulasek@134.134.139.75)
[02:18:19] *** Quits: tkulasek (~tkulasek@134.134.139.75) (Quit: Leaving)
[02:18:37] *** Quits: tkulasek_ (~tkulasek@134.134.139.75) (Quit: Leaving)
[02:18:48] *** Joins: tkulasek_ (~tkulasek@192.55.54.44)
[02:22:46] *** Joins: dlw (~Thunderbi@114.255.44.143)
[03:44:51] *** Quits: dlw (~Thunderbi@114.255.44.143) (Ping timeout: 240 seconds)
[04:19:19] *** Joins: johnmeneghini (~johnmeneg@216.240.30.5)
[06:40:38] *** Quits: drv (daniel@oak.drv.nu) (Ping timeout: 260 seconds)
[06:45:54] *** Joins: drv (daniel@oak.drv.nu)
[06:45:54] *** ChanServ sets mode: +o drv
[06:49:16] *** Quits: drv (daniel@oak.drv.nu) (Client Quit)
[06:50:51] *** Joins: drv (daniel@oak.drv.nu)
[06:50:51] *** ChanServ sets mode: +o drv
[08:53:55] johnmeneghini: sorry for the delay - it's running now
[11:14:26] *** Joins: travis-ci (~travis-ci@ec2-54-158-138-97.compute-1.amazonaws.com)
[11:14:27] (spdk/master) jsonrpc: fix closed connection hadling (Pawel Wodkowski)
[11:14:27] Diff URL: https://github.com/spdk/spdk/compare/967339f3e533...01a9118d0c29
[11:14:27] *** Parts: travis-ci (~travis-ci@ec2-54-158-138-97.compute-1.amazonaws.com) ()
[12:06:06] jimharris drv: Are you submitting talks for KVM Forum on Oct 24-26 in Edinburgh, Scotland? https://events.linuxfoundation.org/events/kvm-forum-2018/program/cfp/
[12:06:54] I looked at the SPDK Summit slides. It would be great to see SPDK material at KVM Forum and learn more about the use cases you've been targeting.
[12:07:58] bwalker: could you take a look at https://review.gerrithub.io/#/c/spdk/spdk/+/412695/ again?
[12:08:28] stefanha: we don't have anything submitted yet
[12:09:01] The CFP deadline is June 14
[12:09:02] not sure if drv or i can get travel budget approved, but it would be great to get someone from our Intel team in Poland to attend
[12:10:45] *** Quits: tkulasek_ (~tkulasek@192.55.54.44) (Ping timeout: 260 seconds)
[12:14:03] jimharris: That would be cool. Talks can be either for users (overview of how to configure/deploy SPDK) or internals (vhost-user, I/O architecture, etc)
[12:21:48] *** Joins: JoeGruher (c037362d@gateway/web/freenode/ip.192.55.54.45)
[12:22:19] When using the SPDK NVMe FIO plugin to access a namespace on an NVMe-oF target, this is the example given: filename=trtype=RDMA adrfam=IPv4 traddr=192.168.100.8 trsvcid=4420 ns=1
[12:22:47] but it doesn't seem to allow me to specify which subsystem on the target to access? shouldn't we need an NQN or something?
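For reference, that filename line is just one key in an ordinary fio job file. A minimal sketch of a complete job using it follows; the transport values are the ones from the example above, while the other options, the ioengine name, and the plugin path are typical settings that may need adjusting for a given build and setup:

    [global]
    ioengine=spdk        ; engine name registered by the SPDK NVMe fio_plugin
    thread=1             ; the SPDK plugins require fio's thread mode
    direct=1
    rw=randread
    bs=4k
    iodepth=32
    time_based=1
    runtime=60

    [remote-ns1]
    ; transport ID string from the example above, one namespace per job
    filename=trtype=RDMA adrfam=IPv4 traddr=192.168.100.8 trsvcid=4420 ns=1

Run with the plugin preloaded, e.g. LD_PRELOAD=<spdk repo>/examples/nvme/fio_plugin/fio_plugin fio nvmf.fio (path is illustrative).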
[12:35:08] *** Joins: travis-ci (~travis-ci@ec2-54-157-238-183.compute-1.amazonaws.com)
[12:35:09] (spdk/master) test/vhost: move negative tests to separate file (Karol Latecki)
[12:35:09] Diff URL: https://github.com/spdk/spdk/compare/4404da7ceaec...0af5182e1994
[12:35:09] *** Parts: travis-ci (~travis-ci@ec2-54-157-238-183.compute-1.amazonaws.com) ()
[12:36:19] *** Joins: travis-ci (~travis-ci@ec2-54-158-138-97.compute-1.amazonaws.com)
[12:36:20] (spdk/master) blobstore: freeze I/O during resize (Piotr Pelplinski)
[12:36:21] Diff URL: https://github.com/spdk/spdk/compare/0af5182e1994...69fa57cdf079
[12:36:21] *** Parts: travis-ci (~travis-ci@ec2-54-158-138-97.compute-1.amazonaws.com) ()
[12:39:53] JoeGruher: you can specify the subsystem NQN with the subnqn= key in the filename
[12:39:58] *** Quits: darsto (~darsto@89-68-12-72.dynamic.chello.pl) (Ping timeout: 260 seconds)
[12:40:15] if you don't specify a subnqn, it will try to connect to a discovery service at that address
[12:41:46] *** Quits: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl) (Ping timeout: 264 seconds)
[12:42:39] got it, thx
[12:42:57] i realize now my FIO plugin build failed on these systems for some reason
[12:42:58] *** Joins: gila (~gila@static.214.50.9.5.clients.your-server.de)
[12:43:11] so i'll have to figure that out first
[12:44:34] *** Joins: darsto (~darsto@89-68-12-72.dynamic.chello.pl)
[12:59:51] jimharris: I posted another comment on https://review.gerrithub.io/#/c/spdk/spdk/+/412695/ - want to get your input before I send Vishal off on a wild goose chase :)
[13:01:41] i was thinking we don't worry about precise stats
[13:09:16] *** Quits: pohly (~pohly@p54BD5098.dip0.t-ipconnect.de) (Quit: Leaving.)
[13:19:17] jimharris: took_action determines whether the poller goes idle or not - shouldn't that be the same metric we use to track idle/active time?
[13:20:36] I'm also not sure why we need all these extra now = spdk_get_ticks calls either
[13:23:09] does the SPDK initiator work with the Linux kernel target?
[13:23:12] I think we only need to capture the tsc when the state flips from active to idle and vice versa
[13:23:19] yes it does
[13:23:25] I can run IO for about 10 seconds and then it drops to zero and I get this printed on the target system: [ 323.024748] nvmet: ctrl 1 keep-alive timer (10 seconds) expired! [ 323.036784] nvmet: ctrl 1 fatal error occurred!
[13:23:39] hmm
[13:23:44] what version of the kernel?
[13:23:55] 4.16.14
[13:24:04] is this with the nvme fio_plugin?
[13:24:14] yes, fio plugin on the initiator system
[13:24:42] at a glance, it looks like we never call spdk_nvme_ctrlr_process_admin_completions() from the nvme fio_plugin, which is what sends the keep-alives
[13:24:49] not sure how nobody noticed this before
[13:25:02] yeah - that's just a bug
[13:25:18] must only be using fio on local disks everywhere
[13:25:39] will this also cause a failure with SPDK target, or only kernel target?
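The bug identified above comes down to the plugin never servicing the admin queue, which is what generates keep-alives. Below is a minimal sketch of the kind of fix being discussed: rate-limit an admin-queue poll from the plugin's I/O polling path. spdk_nvme_ctrlr_process_admin_completions(), spdk_get_ticks(), and spdk_get_ticks_hz() are real SPDK calls; the struct and function names are illustrative, not the actual fio_plugin code.

    #include "spdk/nvme.h"
    #include "spdk/env.h"

    /* Hypothetical per-controller state kept by the plugin. */
    struct plugin_ctrlr {
        struct spdk_nvme_ctrlr *ctrlr;
        uint64_t next_admin_poll_tsc;
    };

    /* Call this from the plugin's I/O polling loop (e.g. its getevents hook). */
    static void
    poll_admin_queue_if_due(struct plugin_ctrlr *pc)
    {
        uint64_t now = spdk_get_ticks();

        if (now >= pc->next_admin_poll_tsc) {
            /* Processing admin completions is what sends keep-alives
             * to the remote target. */
            spdk_nvme_ctrlr_process_admin_completions(pc->ctrlr);
            /* Re-arm for roughly one second from now. */
            pc->next_admin_poll_tsc = now + spdk_get_ticks_hz();
        }
    }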
[13:25:47] bdev_nvme does poll for admin completions, so if you are willing to switch to the bdev fio_plugin, that should work
[13:25:58] the SPDK target doesn't currently enforce keep alives
[13:26:16] (which is also a bug that should eventually get fixed)
[13:26:23] I don't actually think the kernel target used to enforce either
[13:26:27] I think that's more recent
[13:27:25] OK - so I can try nvme plugin with SPDK target and bdev plugin with kernel target
[13:27:36] or bdev with both
[13:28:08] for the bdev plugin, looks like I need to attach the target to create a bdev on the initiator system, then I run FIO against the bdev - how do I create the bdev, is there a rpc.py command to do it?
[13:29:00] I believe you pass the fio plugin a configuration file that defines the bdevs
[13:29:28] the same kind of configuration format that you pass to the nvme-of target
[13:30:02] see examples/bdev/fio_plugin/README.md
[13:30:08] I see, spdk_conf=./examples/bdev/fio_plugin/bdev.conf
[13:30:13] yep
[13:30:33] but that bdev.conf example only shows a malloc device
[13:30:44] what contents should bdev.conf have for an nvmeof target device
[13:31:07] [Nvme]
[13:31:07] TransportID "trtype:PCIe traddr:0000:00:00.0" Nvme0
[13:31:25] no - trtype rdma, right
[13:31:32] except trtype:RDMA traddr: trsvcid: subnqn:
[13:31:33] what would that look like
[13:31:37] ah ok
[13:31:43] bwalker: i don't think so - the "idle" stuff in spdk_reactor_run today is all around whether there were any events/pollers/timers that executed
[13:32:20] but in vishal's code, it's interpreting the poller return code to see if it did any "real" work
[13:32:34] it's conflating two different concepts of idle
[13:32:46] which probably needs to be fixed - i'm just saying we should do that separately from vishal's patch
[13:33:02] why? Isn't the point of vishal's patch to change the meaning of idle?
[13:33:35] the point of vishal's patch is to get an idea of how much a reactor core is actually being utilized in terms of cpu cycles
[13:33:43] top always says 100% since it's polling
[13:33:52] his patch adds a lot of spdk_get_ticks() overhead that could be eliminated if it was tied to the idle mechanism
[13:34:07] how much does his patch hurt performance in benchmarks?
[13:34:15] i've done measurements and the extra spdk_get_ticks overhead is very minimal
[13:35:38] it's effectively one extra spdk_get_ticks call on every iteration through the loop
[13:35:48] correct
[13:36:18] or more precisely, we call spdk_get_ticks every time through the loop now instead of once every five times through the loop
[13:37:35] FIO doesn't seem to like the spdk_conf parameter... that goes right into the FIO .ini file?
[13:37:36] Bad option
[13:40:48] didn't we put in the count-to-5 (only checking timed pollers every 5th iteration) because of a performance benchmark?
[13:40:57] so if we take that out - what changed?
[13:41:02] that's changing our mind
[13:41:56] i think previously there was a suspicion that extra get_ticks calls would hurt performance
[13:42:10] I thought we measured it with your event perf tool
[13:42:33] i can re-run the data again with bdevperf
[13:43:02] it's a bit noticeable with something like event perf which is really the worst case scenario
[13:44:08] vishal's first patch changes it from a call to spdk_get_ticks() every 5th iteration to a call on every iteration (if there are timed pollers registered). His second patch, the one I commented on, changes it to every iteration if there is any type of poller
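As a rough illustration of the suggestion above to only capture the tsc when the reactor flips between active and idle, here is a standalone sketch of transition-based accounting. This is not the actual spdk_reactor_run() code or Vishal's patch; only spdk_get_ticks() is a real SPDK call, everything else is hypothetical.

    #include <stdbool.h>
    #include <stdint.h>
    #include "spdk/env.h"

    /* Hypothetical per-reactor counters; last_flip_tsc should be seeded with
     * spdk_get_ticks() when the reactor starts. */
    struct reactor_stats {
        bool     idle;            /* current state */
        uint64_t last_flip_tsc;   /* tsc when the state last changed */
        uint64_t busy_tsc;        /* accumulated active time */
        uint64_t idle_tsc;        /* accumulated idle time */
    };

    /* Called once per loop iteration with whether any poller did real work. */
    static void
    account_iteration(struct reactor_stats *s, bool did_work)
    {
        bool now_idle = !did_work;

        if (now_idle != s->idle) {
            /* Only read the tsc on a transition, not every iteration. */
            uint64_t now = spdk_get_ticks();

            if (s->idle) {
                s->idle_tsc += now - s->last_flip_tsc;
            } else {
                s->busy_tsc += now - s->last_flip_tsc;
            }
            s->last_flip_tsc = now;
            s->idle = now_idle;
        }
    }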
[13:44:33] ah ok if I do the ld_preload and change the ioengine to spdk_bdev then FIO doesn't complain about the parameter
[13:46:10] but i get this failure: nvme_rdma.c: 803:nvme_rdma_qpair_connect: *ERROR*: Unhandled ADRFAM 0 nvme_rdma.c:1393:nvme_rdma_ctrlr_construct: *ERROR*: failed to create admin qpair
[13:46:59] trtype:RDMA traddr: trsvcid: subnqn: adrfam:IPv4
[13:48:23] great that works
[13:48:31] except I get the same failure as with the nvme plugin: [ 1917.393609] nvmet: ctrl 1 keep-alive timer (10 seconds) expired! [ 1917.405660] nvmet: ctrl 1 fatal error occurred!
[13:48:51] ran great for the first ten seconds tho
[13:49:16] hmm
[13:50:26] hm, the bdev_nvme adminq poller is supposed to run every second by default (unless you've changed AdminPollRate in the [Nvme] section), so that should be fine
[13:50:40] unless we're getting the wrong keep-alive timeout value back from the target somehow
[13:50:46] jimharris: the return code of the poller functions is used to decide whether to increment active/idle/unknown time
[13:50:54] yes
[13:51:01] but I just looked through the code base and some pollers return -1 for active, some return 0 for active, and some return > 0 for active
[13:51:09] so the stats captured in that patch are meaningless
[13:51:30] you have to make all pollers return active/idle before you can gather the stats
[13:52:11] i suggested that vishal work on the stat collection; we also need to go through all of the pollers to have them return the correct values
[13:52:19] and if the pollers are correctly returning active/idle, you may as well improve the took_action flag while you're there
[13:52:34] drv originally had all of them return -1 to mean "unknown" - but it's possible more recent pollers aren't doing the right thing
[13:53:08] really? if an nvme-of poll group is idle for one second, we don't want the reactor to go to sleep - what will wake it up?
[13:54:07] it will only sleep for a configurable amount of time
[13:54:13] your maximum acceptable latency
[13:54:20] if you set that to 0, it won't ever sleep
[13:54:58] it could be that new pollers are returning 0 for "success"
[13:55:03] and that's what I'm seeing
[14:00:53] currently, as long as there is one poller on the reactor, it will never sleep, even if you set the max_delay_us
[14:01:40] it just keeps track of how long it's been since either an event executed or a poller existed
[14:02:30] but if we change that determination based on vishal's stats - now you could have cases where the target is just idle for a while, not doing any I/Os, and the reactor would go to sleep
[14:03:21] personally, I think we should just remove the max_delay_us stuff for now - maybe in the future when we get the dynamic threading stuff implemented, we can put a reactor to sleep when it's not running any threads
[14:08:00] in fio I can't specify an NQN like this in the filename parameter because the ':' breaks it, right? subnqn=nqn.2018-05.io.spdk:nqn01
[14:08:44] is there a workaround?
[14:10:00] drv: i responded to https://review.gerrithub.io/#/c/spdk/spdk/+/414472/
[14:15:31] oh, I see
[14:15:34] I'm OK either way
[14:18:41] *** Quits: johnmeneghini (~johnmeneg@216.240.30.5) (Quit: Leaving.)
[14:20:44] JoeGruher: right, I don't think there is a workaround for that currently (unless there's some escaping mechanism in FIO that we didn't find yet)
[14:21:18] can i create nqns without a ':' on the spdk target? i get an error if i just change it to a '.' - the target seems to enforce some NQN rules?
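Pulling together the bdev fio_plugin pieces reported working above, a sketch of the two files involved might look like the following. The address, NQN, bdev name, and plugin path are placeholders; the bdev created for the first namespace of an [Nvme] controller named Nvme0 is typically Nvme0n1.

    # bdev.conf - same config format the NVMe-oF target uses
    [Nvme]
      TransportID "trtype:RDMA adrfam:IPv4 traddr:192.168.100.8 trsvcid:4420 subnqn:nqn.2018-05.io.spdk:nqn01" Nvme0

    ; job.fio
    [global]
    ioengine=spdk_bdev   ; engine name registered by the bdev fio_plugin
    spdk_conf=./bdev.conf
    thread=1
    direct=1
    rw=randread
    bs=4k
    iodepth=32

    [job1]
    filename=Nvme0n1     ; bdev name created from the TransportID above

Run with something like LD_PRELOAD=<spdk repo>/examples/bdev/fio_plugin/fio_plugin fio job.fio (path is illustrative).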
[14:25:18] which doesn't really seem necessary, why does the target care if i want to use an out of spec nqn format
[14:25:56] "message": "Invalid parameters", "code": -32602
[14:28:13] no, we check that it's a valid NQN, so it won't allow you to make one without a : in it
[14:28:22] but you could patch that out for testing
[14:28:45] the check is the spdk_nvmf_valid_nqn call in spdk_nvmf_subsystem_create
[14:29:01] so how do you use the fio nvme plugin for nvmeof, if fio can't handle a ':', and the spdk target won't let you omit the ':' in the NQN
[14:29:34] it seems like it renders that feature useless? why does the fio nvme plugin support rdma transport at all then?
[14:40:35] JoeGruher: we just found the bug in the bdev fio_plugin with keep alive
[14:40:40] trying to come up with a fix
[14:41:58] cool
[14:42:27] you can use an invalid NQN with the kernel target that doesn't have a ':' in it
[14:42:32] because the kernel allows for invalid NQNs
[14:42:37] so then you could use our fio nvme plugin
[14:42:50] but I agree - this is a problem
[14:42:56] and I'm not entirely sure how to solve it
[14:44:27] i'd suggest just not enforcing valid NQNs, it doesn't really seem necessary
[14:44:38] as you mentioned, the kernel target doesn't enforce, no one seems to mind :)
[14:45:09] I commented out the valid NQN check and created a non-spec NQN and I am successfully running the FIO NVMe plugin against the SPDK target now, woohoo
[14:45:48] maybe we can downgrade the valid NQN check to a warning
[14:51:20] bwalker: i just went through all of the poller functions in spdk - the iscsi initiator wasn't returning the right values (it was always returning 0, indicating idle) - but otherwise they are all returning a correct value (including returning -1 if they don't know)
[14:53:31] that's not terrible then
[14:54:00] if it really doesn't hurt performance to add those extra spdk_get_ticks, we can move ahead with that then
[14:54:13] and clean it up once the pollers are cleaned up
[14:55:16] i ran some more tests on my system and put the results in gerrithub
[14:55:35] we did base our original thinking on event_perf, which does show a degradation with vishal's patch
[14:55:45] but a "real" workload shows no difference
[14:56:57] yeah, I'm in favor of removing the once-every-5 check just for simplicity, if nothing else
[15:04:26] *** Quits: JoeGruher (c037362d@gateway/web/freenode/ip.192.55.54.45) (Quit: Page closed)
[15:05:05] *** Quits: gila (~gila@static.214.50.9.5.clients.your-server.de) (Ping timeout: 240 seconds)
[15:08:16] *** Joins: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl)
[15:25:11] *** Joins: Jianjian (~jianjian@208.185.211.6)
[15:58:52] jimharris: you're probably already on top of it, but I added some comments about bsdump: https://review.gerrithub.io/#/c/spdk/spdk/+/414479/
[15:59:35] you also don't dump the masks right now
[15:59:51] yeah - there's a lot of stuff that isn't getting dumped yet
[16:00:00] snapshot related stuff, per-blob flags
[16:07:41] drv: fixed two of your comments and replied to the other
[16:12:01] jimharris: I see where the xattr_name gets terminated in the blobstore.c code, but this is actually the xattr value for the xattr named "name"
[16:12:04] if that's not confusing enough :)
[16:12:37] also, I think peluse factored out the hex dump thing from log, so we could use that here if you wanted to
[16:13:04] spdk_trace_dump() takes a FILE*
[16:15:46] oh - that's where it was, I was looking for it but couldn't find it
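For context, the kind of standalone hex dump helper being weighed here against spdk_trace_dump() is just the classic offset/hex/ASCII formatter writing to a FILE*. A generic sketch follows; this is illustrative, not the actual bsdump code.

    #include <ctype.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Generic hex dump: offset, 16 hex bytes per row, then printable ASCII. */
    static void
    hex_dump(FILE *fp, const void *buf, size_t len)
    {
        const uint8_t *p = buf;
        size_t i, j;

        for (i = 0; i < len; i += 16) {
            fprintf(fp, "%08zx  ", i);
            for (j = 0; j < 16; j++) {
                if (i + j < len) {
                    fprintf(fp, "%02x ", p[i + j]);
                } else {
                    fprintf(fp, "   ");
                }
            }
            fprintf(fp, " ");
            for (j = 0; j < 16 && i + j < len; j++) {
                fputc(isprint(p[i + j]) ? p[i + j] : '.', fp);
            }
            fputc('\n', fp);
        }
    }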
[16:17:05] *** Quits: Jianjian (~jianjian@208.185.211.6) (Remote host closed the connection)
[16:17:32] *** Joins: Jianjian (~jianjian@208.185.211.6)
[16:17:56] probably could use a better name
[16:22:13] *** Quits: Jianjian (~jianjian@208.185.211.6) (Ping timeout: 256 seconds)
[16:23:55] i think i'll keep my hex dump function for now
[16:24:07] i just tried it out but i don't care for how it looks in this case
[16:24:33] yeah, that's fine
[16:33:12] bsdump output in case you're interested
[16:33:13] http://spdk.intel.com/public/spdk/builds/review/7c62ff12fc63e6e282184e68c4d52158a5f2ae2d.1528499432/fedora-04/rocksdb/bsdump.txt
[16:33:47] (that's an intel-internal link)
[16:34:14] now we just need it to fail :)
[16:40:46] exactly!
[16:41:31] i was telling bwalker earlier that i already found a pseudo-bug with this bsdump tool - we don't coalesce unallocated clusters into a single extent
[20:43:19] *** Joins: Jianjian (~jianjian@c-73-231-38-189.hsd1.ca.comcast.net)
[21:40:45] *** Quits: Jianjian (~jianjian@c-73-231-38-189.hsd1.ca.comcast.net) (Remote host closed the connection)
[21:41:26] *** Joins: Jianjian (~jianjian@c-73-231-38-189.hsd1.ca.comcast.net)
[21:45:05] *** Quits: darsto (~darsto@89-68-12-72.dynamic.chello.pl) (Ping timeout: 240 seconds)
[21:50:43] *** Joins: darsto (~darsto@89-68-12-72.dynamic.chello.pl)
[22:30:31] *** Quits: Jianjian (~jianjian@c-73-231-38-189.hsd1.ca.comcast.net) (Remote host closed the connection)
[22:31:06] *** Joins: Jianjian (~jianjian@c-73-231-38-189.hsd1.ca.comcast.net)
[22:35:30] *** Quits: Jianjian (~jianjian@c-73-231-38-189.hsd1.ca.comcast.net) (Ping timeout: 260 seconds)