[00:08:27] *** Joins: tomzawadzki (tomzawadzk@nat/intel/x-idjdlzrkqjdkiwbo)
[00:28:54] *** Joins: tkulasek (~tkulasek@192.55.54.40)
[01:31:42] Hi all. Is the crypto bdev from @peluse on master?
[03:05:39] Hi alekseymmm. No, it is not yet: https://review.gerrithub.io/c/spdk/spdk/+/403107
[03:22:48] *** Quits: alekseymmm (050811aa@gateway/web/freenode/ip.5.8.17.170) (Quit: Page closed)
[03:24:56] *** Quits: darsto (~darsto@89-68-111-146.dynamic.chello.pl) (Quit: /quit)
[03:28:14] *** Joins: alekseymmm (050811aa@gateway/web/freenode/ip.5.8.17.170)
[05:08:59] *** Joins: darsto (~darsto@89-68-111-146.dynamic.chello.pl)
[05:23:43] *** Joins: bluebird (~bluebird@p5DE94EC4.dip0.t-ipconnect.de)
[05:42:22] *** Quits: bluebird (~bluebird@p5DE94EC4.dip0.t-ipconnect.de) (Ping timeout: 268 seconds)
[06:53:17] *** Quits: tkulasek (~tkulasek@192.55.54.40) (Ping timeout: 268 seconds)
[07:50:18] *** Quits: tomzawadzki (tomzawadzk@nat/intel/x-idjdlzrkqjdkiwbo) (Ping timeout: 264 seconds)
[09:16:11] Good morning, folks. Do I remember correctly from the last community meeting that someone was going to create a new Packaging board for SPDK?
[09:16:28] That is, on Trello.
[09:34:45] *** Joins: travis-ci (~travis-ci@ec2-54-196-94-221.compute-1.amazonaws.com)
[09:34:46] (spdk/master) test/bdev: properly cleanup bdev layer in bdev_io_wait_test (Jim Harris)
[09:34:47] Diff URL: https://github.com/spdk/spdk/compare/da01835d84a1...8ef7818a2e5b
[09:34:47] *** Parts: travis-ci (~travis-ci@ec2-54-196-94-221.compute-1.amazonaws.com) ()
[10:25:21] *** Quits: darsto (~darsto@89-68-111-146.dynamic.chello.pl) (*.net *.split)
[10:25:23] *** Quits: pwodkowx (~pwodkowx@134.134.139.72) (*.net *.split)
[10:28:14] *** Joins: darsto (~darsto@89-68-111-146.dynamic.chello.pl)
[11:16:04] *** Quits: ChanServ (ChanServ@services.) (*.net *.split)
[11:22:07] *** Joins: ChanServ (ChanServ@services.)
[11:22:07] *** verne.freenode.net sets mode: +o ChanServ
[11:40:33] lhodev: yes - but I don't think it's been set up quite yet
[12:10:35] *** Joins: Tracy35 (0cda5282@gateway/web/freenode/ip.12.218.82.130)
[12:13:50] @bwalker: With the new SPDK (master taken on Friday) we still get a segfault, now in a new location: spdk_nvmf_qpair_request_cleanup. As mentioned late Friday, without the switch it is OK; with the switch we get this segfault.
[12:15:03] ok, two things:
[12:15:11] 1) this proves that your connection is dropping because of a switch issue
[12:15:21] 2) SPDK shouldn't crash when the connection drops unexpectedly
[12:15:37] is there a bug report on GitHub for the crash already?
[12:16:06] specifically, the backtrace
[12:16:22] Not yet. I can open a case on GitHub and put the backtrace there.
[12:16:28] that would be great
[12:19:08] About the switch: previously it was not configured, and when checked it reported a high number (~8000) of discarded packets whenever there was a crash. Now the switch has been configured correctly, and after an SPDK crash it reports 0 discarded packets. But in SPDK we still see disconnects.
[12:21:00] Will give details in the log. For your quick info, the fault is in "assert(qpair->state == SPDK_NVMF_QPAIR_ACTIVE);" and qpair->state is SPDK_NVMF_QPAIR_INACTIVE.
[12:21:24] ok - I'm sure SPDK is trying to do RDMA error recovery on the connection
[12:21:42] which is not well tested, because we don't currently have a way to really control how the connections drop
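
For context on the crash being discussed, a minimal sketch of the failure mode reported at [12:13:50] and [12:21:00]; the types and function below are simplified stand-ins for illustration, not SPDK's actual nvmf internals:

    #include <assert.h>
    #include <stdint.h>

    /* Simplified stand-ins for illustration -- not SPDK's real types. */
    enum nvmf_qpair_state {
            NVMF_QPAIR_ACTIVE,
            NVMF_QPAIR_INACTIVE,
    };

    struct nvmf_qpair {
            enum nvmf_qpair_state state;
            uint32_t outstanding_reqs;
    };

    /* Called as each in-flight request is released. The reported crash
     * matches an assert of this shape: the cleanup path assumes the
     * qpair is still ACTIVE, but an unexpected connection drop (here,
     * packets discarded by the switch) can mark it INACTIVE while
     * requests are still outstanding, so the assert fires. */
    static void
    nvmf_qpair_request_cleanup(struct nvmf_qpair *qpair)
    {
            assert(qpair->state == NVMF_QPAIR_ACTIVE);
            qpair->outstanding_reqs--;
    }
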
[12:22:38] In the auto regression test, is a switch part of the test configuration?
[12:23:02] no, they are all loopback
[12:23:08] Do you need my help to reproduce the issue further?
[12:23:17] Ok
[12:23:29] I likely will, once I look at the backtrace and come up with some theories
[12:23:48] When you say loopback, do you mean point to point?
[12:24:03] no - initiator and target running on the same system
[12:24:13] you can loop a cable back on a physical NIC
[12:24:17] between ports
[12:24:34] Got it.
[12:25:23] Appreciate your help. Will open a case on GitHub so that we can take it further. Thanks
[12:26:59] darsto: Can rte_malloc() return buffers that span multiple memory segments in DPDK? Basically, does rte_malloc() guarantee that things are physically contiguous? In DPDK 18.02?
[12:28:31] I'm fairly sure you said it could a few weeks ago, but I can't remember if that was only for the new DPDK memory management
[12:32:17] bwalker: rte_malloc() buffers do not have to be physically contiguous
[12:34:07] but there is RTE_IOVA_VA mode, which uses the vaddr as the iova. We don't make use of it right now, but if we did, then obviously each virtually contiguous buffer would be physically contiguous as well
[12:36:04] well, technically not physically contiguous, but iova-contiguous
[12:36:34] even on DPDK 18.02 rte_malloc() is not guaranteed to be physically contiguous
[12:36:38] ?
[12:36:57] only with dynamic memory management
[12:37:03] that's why there's --legacy-mem
[12:37:31] so in --legacy-mem it's guaranteed physically contiguous?
[12:37:40] on DPDK 18.02 there isn't the legacy-mem thing - that's always on
[12:37:48] yes and yes
[12:38:25] crap. I'm debugging something for someone and I don't have a lot of info on what's going wrong, but rte_malloc() is returning a buffer and somehow it is in two different RDMA memory regions
[12:38:28] --legacy-mem was made for DPDK users that need some time to accommodate their applications
[12:38:30] just like us
[12:38:46] and there should be 1 RDMA memory region per memory segment
[12:39:13] that's possible
[12:39:37] I thought memory segments would be coalesced such that you wouldn't get two that are physically contiguous
[12:39:38] imagine starting the app with --mem-size 512 and trying to allocate a 1GB buffer
[12:39:55] this is all --legacy-mem mode
[12:40:00] so that would fail
[12:40:04] oh ok
[12:40:08] it's DPDK 18.02
[12:41:49] yeah, I don't think it's possible on DPDK 18.02
[12:42:00] memory segments should be coalesced, but only if they have the same mountpoint and size
[12:42:23] but it doesn't look like rte_malloc() will allow an allocation to span two memory segments
[12:42:26] as far as I can tell
[12:42:30] correct
[12:42:49] and if there's one RDMA memory region per memory segment
[12:42:56] then a buffer won't ever span two RDMA memory regions
[12:44:21] I take back my 'correct'. I would need to revisit that code again
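
To make the contiguity question above concrete, a hedged sketch of how one might verify it on DPDK 18.02; rte_mem_virt2iova() is DPDK's public translation helper, while the function name, alignment, and page-size handling here are assumptions of this example:

    #include <stddef.h>
    #include <rte_memory.h>

    /* Hedged sketch for DPDK 18.02: walk a rte_malloc() buffer at page
     * granularity and check whether it is IOVA-contiguous. Assumes buf
     * is page-aligned and pgsz is the hugepage size in use; both are
     * supplied by the caller and are assumptions of this example. */
    static int
    buf_is_iova_contiguous(const void *buf, size_t len, size_t pgsz)
    {
            const char *p = buf;
            rte_iova_t prev = rte_mem_virt2iova(p);
            size_t off;

            for (off = pgsz; off < len; off += pgsz) {
                    rte_iova_t cur = rte_mem_virt2iova(p + off);

                    if (cur != prev + pgsz) {
                            return 0; /* crosses an IOVA discontinuity */
                    }
                    prev = cur;
            }
            return 1;
    }
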
[13:18:11] oh, this is hacky. On DPDK 18.02 we called spdk_mem_register() separately on each page, but we were doing it just after rte_eal_init() and before any mem_map could be allocated
[13:21:12] so whenever a mem_map was allocated afterwards, it was coalescing virtually contiguous regions
[13:21:45] which means an rte_malloc() buffer could not possibly span two RDMA regions
[13:22:01] peluse: pushed some comments on your crypto patch - it's getting there
[13:22:29] sethhowe: I swear that https://review.gerrithub.io/#/c/spdk/spdk/+/423157/ was in the build pool queue earlier, but I don't see it there now and it never ran
[13:23:42] let me look real quick
[13:27:39] jimharris: It looks like it's failing because our submodule doesn't recognize the dpdk commit.
[13:28:20] jimharris: So the Chandler pool keeps failing it and putting it back at the top of the queue.
[13:30:29] oh
[13:31:08] When I try to check it out on my own machine I get the same error. Has this change to our dpdk submodule not been merged?
[13:32:23] *** Joins: travis-ci (~travis-ci@ec2-54-163-147-142.compute-1.amazonaws.com)
[13:32:24] (spdk/master) trace: update TpointGroupMask comment and related code (Liang Yan)
[13:32:24] Diff URL: https://github.com/spdk/spdk/compare/242201d2c9bf...1d1496dc0d7e
[13:32:24] *** Parts: travis-ci (~travis-ci@ec2-54-163-147-142.compute-1.amazonaws.com) ()
[13:32:39] I had pushed it to GerritHub but not GitHub
[13:32:43] should work now
[13:32:52] thanks!
[13:33:02] bwalker: https://review.gerrithub.io/#/c/spdk/spdk/+/423050/
[13:37:52] *** Joins: travis-ci (~travis-ci@ec2-54-163-147-142.compute-1.amazonaws.com)
[13:37:53] (spdk/master) bdev: increment io_time if queue depth > 0 (Seth Howell)
[13:37:53] Diff URL: https://github.com/spdk/spdk/compare/1d1496dc0d7e...b7d9caf2e620
[13:37:53] *** Parts: travis-ci (~travis-ci@ec2-54-163-147-142.compute-1.amazonaws.com) ()
[13:59:41] *** Joins: travis-ci (~travis-ci@ec2-54-163-147-142.compute-1.amazonaws.com)
[13:59:42] (spdk/master) test/nvmf: add bdev_io_wait tests (Tomasz Zawadzki)
[13:59:42] Diff URL: https://github.com/spdk/spdk/compare/b7d9caf2e620...75408adc3db7
[13:59:42] *** Parts: travis-ci (~travis-ci@ec2-54-163-147-142.compute-1.amazonaws.com) ()
[15:15:24] *** Quits: Tracy35 (0cda5282@gateway/web/freenode/ip.12.218.82.130) (Ping timeout: 252 seconds)
[15:33:14] *** Joins: Tracy35 (~Tracy35@12.218.82.130)
[15:33:34] *** Quits: Tracy35 (~Tracy35@12.218.82.130) (Client Quit)
[15:34:42] *** Joins: Tracy35 (~Tracy35@12.218.82.130)
[15:35:17] *** Quits: Tracy35 (~Tracy35@12.218.82.130) (Client Quit)
[15:49:45] *** Quits: ChanServ (ChanServ@services.) (*.net *.split)
[16:03:49] *** Joins: ChanServ (ChanServ@services.)
[16:03:49] *** verne.freenode.net sets mode: +o ChanServ
[16:09:03] *** Joins: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97)
[19:29:30] *** Quits: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97) (Ping timeout: 252 seconds)
[19:29:32] *** Quits: guerby (~guerby@april/board/guerby) (*.net *.split)
[20:29:51] *** Quits: bwalker (bwalker@nat/intel/x-yjafjefmughyleem) (ZNC - http://znc.in)
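
Picking back up the [13:18:11]-[13:21:45] explanation of per-page registration, a hedged sketch of that pattern using SPDK's public spdk_mem_register(); the page size and helper name are illustrative assumptions:

    #include <stddef.h>
    #include "spdk/env.h"

    /* Sketch of the per-page registration pattern described at
     * [13:18:11]: register a hugepage-backed region with
     * spdk_mem_register() one page at a time, right after
     * rte_eal_init() and before any mem_map exists. A mem_map
     * allocated later sees one virtually contiguous registered range
     * and coalesces it, so an rte_malloc() buffer inside it cannot
     * span two RDMA regions. HUGEPAGE_SZ (2 MB) and the helper name
     * are illustrative assumptions. */
    #define HUGEPAGE_SZ (2UL * 1024 * 1024)

    static int
    register_region_per_page(void *vaddr, size_t len)
    {
            size_t off;

            for (off = 0; off < len; off += HUGEPAGE_SZ) {
                    int rc = spdk_mem_register((char *)vaddr + off, HUGEPAGE_SZ);

                    if (rc != 0) {
                            return rc;
                    }
            }
            return 0;
    }
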