[09:39:05] *** Quits: tomzawadzki (uid327004@gateway/web/irccloud.com/x-qbjgkrctqqjyedrb) (Quit: Connection closed for inactivity)
[09:43:18] *** Joins: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com)
[09:43:19] (spdk/master) nvmf: Shorten the shutdown test (Ben Walker)
[09:43:19] Diff URL: https://github.com/spdk/spdk/compare/fd1e204647d7...08f64b576176
[09:43:19] *** Parts: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com) ()
[09:47:17] bwalker: jimharris: Are there any gotchas for SPDK/DPDK on systems where the default hugepage size = 1GB and only 1GB pages are reserved; i.e. no 2MB pages?
[09:49:15] not that I know of - we've run that way before and had success
[09:49:25] beyond the fact that you can only reserve memory in 1GB units
[09:49:29] the main gotcha is that you can't allocate less than 1GB to any one application - but i think for these kinds of use cases, that's a reasonable limitation
[09:50:59] Thanks. What if transparent hugepages is enabled, too? Is that known to cause any issues?
[09:52:18] you can't use transparent hugepages for DMA-able memory
[09:52:22] but they otherwise don't interfere
[09:52:32] if you have both some reserved hugepages and some used for transparent hugepages, that should work
[09:52:54] or with the new dynamic memory allocation, you can reserve a bunch of memory on the system for use in hugepages
[09:53:07] and then as memory is allocated it will get assigned to the regular hugepage pool or consumed as transparent hugepages
[09:53:28] I'm not sure if a hugepage ever gets "released" back to the regular hugepage pool after it is used as a transparent hugepage one time though - not sure where Linux is at with that
[09:55:39] bwalker: Thx. Trying to support someone who is mysteriously running into trouble (again) during nvmf_tgt startup. Ugh. Seeing ibv_reg_mr() failures again as well as a couple other memory-related complaints.
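For anyone wanting to confirm which hugepage sizes are actually reserved and whether transparent hugepages are switched on, here is a minimal sketch. It is not something posted in the channel. It assumes a Linux host exposing the standard sysfs entries under /sys/kernel/mm, and the function names are made up for illustration.

```python
from pathlib import Path

# Standard Linux sysfs locations; absent on non-Linux hosts.
HUGEPAGES_DIR = Path("/sys/kernel/mm/hugepages")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active mode from a string such as 'always [madvise] never'.

    The kernel marks the active setting with square brackets.
    Returns None if no bracketed token is present.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def reserved_hugepages(root=HUGEPAGES_DIR):
    """Map hugepage size in kB to the number of reserved pages.

    Example result on a 1GB-only system: {1048576: 8}
    """
    counts = {}
    if not root.is_dir():
        return counts
    for entry in root.glob("hugepages-*kB"):
        size_kb = int(entry.name[len("hugepages-"):-len("kB")])
        counts[size_kb] = int((entry / "nr_hugepages").read_text())
    return counts


if __name__ == "__main__":
    for size_kb, count in sorted(reserved_hugepages().items()):
        print(f"{size_kb} kB pages reserved: {count}")
    if THP_ENABLED.exists():
        print("transparent hugepages:", parse_thp_mode(THP_ENABLED.read_text()))
```

A system with only 1GB pages reserved would be expected to show a non-zero count for 1048576 kB and zero (or no entry) for 2048 kB.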
[09:58:03] are they trying to use the new dynamic memory stuff with RDMA?
[09:58:14] some NICs have fairly limited numbers of available ibv_mrs
[09:58:28] and the dynamic memory stuff means we have to allocate 1 per 2MB hugepage, more or less
[10:06:31] Actually, no. They're running much older SPDK/DPDK (and I don't know why, yet). Per their screen output,
[10:06:33] Starting SPDK v18.04 / DPDK 17.11.0 initialization...
[10:15:59] oh, then not related to dynamic memory
[10:16:17] that's not that old - just last April
[10:16:32] pretty good relative to what I've seen a lot of people running
[10:19:22] I recall seeing a Mellanox driver parameter, log_num_mtt (and also log_mtts_per_seg) which were tunable. Mellanox has an older doc (page created in 2013, updated in 2015) describing how one might set these based on the physical memory in a system.
[10:20:02] However, I note that log_num_mtt no longer appears and I *think* I recall reading somewhere awhile back that maybe it's calculated automatically now. Any of that familiar to you?
[10:20:58] I thought it was a fixed value to be honest
[10:21:34] *** Joins: fionatrahe (~fionatrah@134.134.139.72)
[10:27:07] Well, I don't think it's related to that, but I'm not 100% sure. I have verified that they have large values for their ulimit hard/soft memlock. Ditto for /proc/sys/vm/max_map_count.
[10:29:59] hmm
[10:32:24] Nothing appears in dmesg, either
[10:34:35] On launching nvmf_tgt, too, they also restricted the amount of mem to use (-s 4096).
[10:35:04] The system has a large amount of memory: 196693012 kB
[10:36:10] ^ that usually causes some trouble
[10:36:21] are there any errors prefixed with EAL?
[10:37:01] The other memory complaints at startup of nvmf_tgt -- i.e. before the ibv_reg_mr() failures -- include:
[10:37:02] reactor.c: 671:spdk_reactors_init: *NOTICE*: Event_mempool creation failed on preferred socket 0.
[10:37:03] RING: Cannot reserve memory
[10:37:03] reactor.c: 549:spdk_reactor_construct: *NOTICE*: Ring creation failed on preferred socket 0. Try other sockets.
[10:37:31] There's this beauty: EAL: WARNING: Master core has no memory on local socket!
[10:39:53] that shouldn't cause those RDMA failures though
[10:43:05] still, the memory was allocated manually on one socket and spdk was run on another
[10:43:14] you might want to get that fixed up anyway
[10:44:06] `setup.sh status` will tell you how many hugepages are reserved on each socket
[10:44:44] then `./dpdk/usertools/cpu_layout.py` will tell you which cpu ids correspond to which socket
[10:58:01] darsto: thanks for the suggestions. I'm trying to gain access to the system where this was reported.
[11:01:50] *** Joins: travis-ci (~travis-ci@ec2-54-161-182-138.compute-1.amazonaws.com)
[11:01:51] (spdk/master) mk: set executable bit only for real libraries (Pawel Wodkowski)
[11:01:51] Diff URL: https://github.com/spdk/spdk/compare/bb2486a468ec...535466d7cd5d
[11:01:51] *** Parts: travis-ci (~travis-ci@ec2-54-161-182-138.compute-1.amazonaws.com) ()
[11:06:32] *** Joins: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com)
[11:06:33] (spdk/master) Bdev/QoS: return actual submitted IO count (GangCao)
[11:06:33] Diff URL: https://github.com/spdk/spdk/compare/535466d7cd5d...91420344f4ed
[11:06:33] *** Parts: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com) ()
[11:07:53] *** Joins: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com)
[11:07:54] (spdk/master) nvmf: don't implicitly create the transport in tgt listen. (Seth Howell)
[11:07:54] Diff URL: https://github.com/spdk/spdk/compare/91420344f4ed...7f128c757bce
[11:07:54] *** Parts: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com) ()
[11:23:59] *** Joins: JoeGruher (86868b53@gateway/web/freenode/ip.134.134.139.83)
[11:24:26] hello. pkgdep.sh is listing some packages as unable to locate, and then 'error!'.
any suggestions what i should do? i think the problem may be some packages are not available in Ubuntu 18.04. is it known if SPDK can be run on 18.04?
[11:24:45] which packages?
[11:25:37] libcunit1-dev, astyle, pep8, lcov, clang
[11:26:02] in some cases names may have changed - apt search shows no pep8, but there is a python3-pep8 package
[11:26:25] there's a libclang1-6.0 package.... but no hits for an apt search on 'lcov'
[11:27:12] no hits for 'astyle'
[11:27:30] it looks like the newest ubuntu we test on currently is 17.10
[11:28:41] https://packages.ubuntu.com/bionic/astyle
[11:29:13] https://packages.ubuntu.com/bionic/lcov
[11:29:47] I don't know what [universe] means, but I assume the package repository
[11:30:13] oh, it's non-canonical maintained stuff
[11:30:19] do you have the universe repo enabled?
[11:30:25] https://help.ubuntu.com/community/Repositories/Ubuntu
[11:30:30] nope looks like i don't - i'll try that
[11:30:43] that's funny i think older versions enabled universe by default
[11:31:00] maybe they're slowly locking it down like an apple product
[11:31:10] hah
[11:31:26] by the way, for my unsolicited two cents, it doesn't make a lot of sense to test a non-LTS version and leave out the latest LTS version
[11:31:45] I think when we set up the ubuntu test machines, it was before 18.04 came out
[11:31:54] we do 16.04 LTS and 17.10
[11:32:00] probably set it up at the beginning of the year
[11:32:04] i'd suggest picking up 18.04 and dropping 17.10 now that 18.04 is out
[11:32:09] yeah - I'll add it to the list
[11:32:14] cool
[11:34:09] *** Joins: travis-ci (~travis-ci@ec2-54-167-223-61.compute-1.amazonaws.com)
[11:34:10] (spdk/master) nvme: improve probe error handling in MP even further (Darek Stojaczyk)
[11:34:10] Diff URL: https://github.com/spdk/spdk/compare/7f128c757bce...04ee899fcfa2
[11:34:10] *** Parts: travis-ci (~travis-ci@ec2-54-167-223-61.compute-1.amazonaws.com) ()
[11:34:16] cool, confirmed pkgdep runs happing after i enabled the
universe repos
[11:35:34] er, happily
[11:39:49] The Infiniband Verbs opcode Send With Invalidate is either not supported or is not functional with the current version of libibverbs installed on this system. Please upgrade to at least version 1.1.
[11:40:25] yeah that is a fun one
[11:40:33] my package reports version 17.1-1
[11:40:45] *** Joins: travis-ci (~travis-ci@ec2-54-159-105-224.compute-1.amazonaws.com)
[11:40:46] (spdk/master) version: fix version string (Darek Stojaczyk)
[11:40:46] that probably means 1.0, maybe
[11:40:46] Diff URL: https://github.com/spdk/spdk/compare/04ee899fcfa2...c7bb861a852d
[11:40:46] *** Parts: travis-ci (~travis-ci@ec2-54-159-105-224.compute-1.amazonaws.com) ()
[11:40:53] ok
[11:40:56] where do i find 1.1?
[11:40:56] so here's the issue with that
[11:41:20] the Linux kernel NVMe-oF initiator uses a new RDMA operation called Send With Invalidate
[11:41:30] by "new" I mean less than a decade old
[11:41:56] and the verbs interface for user space only implemented support for this in version 1.1, which again isn't "new"
[11:42:18] oh so this is only a problem if i use the kernel initiator with spdk target
[11:42:27] so your target will work, but it falls back to a much slower path when connecting to NVMe-oF kernel initiators
[11:42:35] and additionally, only for kernels newer than 4.14
[11:42:44] if you use an old kernel or the SPDK initiator, it's fine
[11:43:18] I can probably work around it then.... but it would be nice to just fix it with newer libibverbs. Do you know if a newer package is available for Ubuntu?
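As a rough way to tell whether a given initiator host falls into the "kernel newer than 4.14" case mentioned above, here is a minimal sketch. It is not from the channel. The 4.14 threshold is taken from the discussion as stated, the strict "greater than" reading is an assumption, and the function names are hypothetical. The version number is only a heuristic, since distribution kernels may carry backported patches.

```python
import platform
import re

# Threshold as stated in the discussion: initiators on kernels newer
# than 4.14 are said to use Send With Invalidate. Treating 4.14 itself
# as "not newer" is an assumption.
THRESHOLD = (4, 14)


def kernel_version(release):
    """Extract (major, minor) from a release string like '4.15.0-20-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def initiator_uses_send_with_invalidate(release):
    """Heuristic only: compare the kernel's (major, minor) to the threshold."""
    return kernel_version(release) > THRESHOLD


if __name__ == "__main__":
    release = platform.release()
    print(release, "->", initiator_uses_send_with_invalidate(release))
```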
[11:43:39] *** Joins: travis-ci (~travis-ci@ec2-54-159-105-224.compute-1.amazonaws.com)
[11:43:40] (spdk/master) test: fix nvmf connect/disconnect (Vitaliy Mysak)
[11:43:40] Diff URL: https://github.com/spdk/spdk/compare/c7bb861a852d...8cbbdf2ddf06
[11:43:40] *** Parts: travis-ci (~travis-ci@ec2-54-159-105-224.compute-1.amazonaws.com) ()
[11:44:39] http://www.mellanox.com/page/mlnx_ofed_eula?mtag=linux_sw_drivers&mrequest=downloads&mtype=ofed&mver=MLNX_OFED-4.4-2.0.7.0&mname=MLNX_OFED_LINUX-4.4-2.0.7.0-ubuntu18.04-x86_64.tgz
[11:44:55] Mellanox releases their own version called "OFED"
[11:45:06] *** Quits: lhodev (~lhodev@66-90-218-190.dyn.grandenetworks.net) (Ping timeout: 252 seconds)
[11:45:08] it's the same set of packages, but is often much newer than the ones packaged by the distributions
[11:45:27] I mostly use Fedora, which stays reasonably close to the latest
[11:45:33] and so I don't typically install the OFED thing
[11:45:41] but Ubuntu hangs way further back
[11:45:45] Yeah I don't like to install OFED if I can help it
[11:45:59] It usually just breaks everything in my limited experience :)
[11:46:09] that's been my experience as well
[11:46:42] I guess I'll just proceed by avoiding kernel init with spdk target for now, plan is to run spdk/spdk anyway, and maybe kernel/kernel as well
[11:46:42] *** Joins: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com)
[11:46:43] (spdk/master) check_format: use -P$(nproc) to speed up this script (Pawel Wodkowski)
[11:46:43] Diff URL: https://github.com/spdk/spdk/compare/8cbbdf2ddf06...38103f1f5993
[11:46:43] *** Parts: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com) ()
[11:47:03] that ubuntu kernel may not have the send with invalidate patches backported anyway
[11:47:04] thanks!
[11:47:18] 4.18.16-041816-generic
[11:47:23] oh, that definitely does
[11:47:32] it's the latest stable
[11:47:37] well, as of a couple days ago
[11:48:03] *** Joins: travis-ci (~travis-ci@ec2-54-211-33-91.compute-1.amazonaws.com)
[11:48:04] (spdk/master) doc: Fix the table on app_overview.html (Seth Howell)
[11:48:04] Diff URL: https://github.com/spdk/spdk/compare/38103f1f5993...4a91adb802ca
[11:48:04] *** Parts: travis-ci (~travis-ci@ec2-54-211-33-91.compute-1.amazonaws.com) ()
[11:48:11] hey, while i've got you, what's the status of multipath support in SPDK?
[11:48:18] for NVMe-oF?
[11:48:41] yeah... like target with two NICs, initiator with two NICs, initiator connects two separate network paths to target system
[11:49:01] not PCIe multipath from the target to the NVMe device
[11:49:07] *** Joins: lhodev (~lhodev@inet-hqmc05-o.oracle.com)
[11:49:13] there is some support
[11:49:23] you can create a subsystem and make it listen on two different addresses
[11:49:35] and two different initiators can connect to it
[11:49:47] and that should be fine - they'll need to coordinate reads and writes to the backing storage somehow
[11:50:21] we don't have support for ANA, which attempts to report which paths are optimal
[11:50:38] on our initiator side, it doesn't have any comprehension of multipathing
[11:50:55] so you are free to connect to the same subsystem twice using two different NICs (so they're on different paths)
[11:51:01] but it will show up as two separate block devices
[11:51:20] right, I see, nothing to present them as a single block device and manage path selection
[11:51:34] At some point I'd like us to add that, but it's not currently present
[11:52:07] *** Joins: travis-ci (~travis-ci@ec2-54-159-105-224.compute-1.amazonaws.com)
[11:52:08] (spdk/master) test/nvmf: Fix check_ip_is_soft_roce (Seth Howell)
[11:52:08] Diff URL: https://github.com/spdk/spdk/compare/4a91adb802ca...2d153b29dbe0
[11:52:08] *** Parts: travis-ci
(~travis-ci@ec2-54-159-105-224.compute-1.amazonaws.com) ()
[11:52:13] yeah kernel has a basic version now i've played with a little... it just does active/passive paths, active/active would be neat for more bandwidth
[11:52:33] yep I think there are quite a few neat things that could be done
[11:52:41] hehe yeah
[11:52:46] thx!
[11:52:49] sure thing
[11:54:25] sorry, one more question I guess, since I just ran the unit test - I passed but have to warnings - should I care? WARN: lcov not installed or SPDK built without coverage! WARN: neither valgrind nor ASAN is enabled!
[11:54:29] two*
[11:55:42] that's just telling you that you don't have some tools installed
[11:55:52] so the unit tests are not performing some cool things we do
[11:56:07] the tests are still running, but they aren't generating code coverage metrics (lcov)
[11:56:14] and they aren't checking for memory leaks (valgrind/asan)
[11:57:14] ah, ok, cool
[11:57:23] darsto: Can you please expand on your "trouble" comment with respect to large memory? Other than ensuring large soft/hard ulimit memlock and vm.max_map_count, are there any other configuration parameters anywhere -- i.e. kernel, DPDK, ethernet driver, etc. -- that I should be aware of?
[11:58:10] JoeGruher: what version of gcc are you running?
[11:58:28] lhodev: gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
[11:59:05] Ah, ok. Good. From my own experience, I ran into an issue with an older version of gcc that yielded corrupted gcno files. As such, that gave lcov fits.
[11:59:33] *** Joins: travis-ci (~travis-ci@ec2-54-80-113-186.compute-1.amazonaws.com)
[11:59:34] (spdk/master) Doxyfile: fix indentation (Darek Stojaczyk)
[11:59:34] Diff URL: https://github.com/spdk/spdk/compare/2d153b29dbe0...aef69a986b44
[11:59:34] *** Parts: travis-ci (~travis-ci@ec2-54-80-113-186.compute-1.amazonaws.com) ()
[12:00:08] *** Joins: travis-ci (~travis-ci@ec2-54-198-135-204.compute-1.amazonaws.com)
[12:00:09] (spdk/master) bdev/raid: avoid reference null pointer in log message (wuzhouhui)
[12:00:09] Diff URL: https://github.com/spdk/spdk/compare/aef69a986b44...1ab092348ad5
[12:00:09] *** Parts: travis-ci (~travis-ci@ec2-54-198-135-204.compute-1.amazonaws.com) ()
[12:06:27] *** Joins: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com)
[12:06:28] (spdk/master) rpc: revise comments for construct_nvme_bdev (Liu Xiaodong)
[12:06:29] Diff URL: https://github.com/spdk/spdk/compare/1ab092348ad5...98187ed9a750
[12:06:30] *** Parts: travis-ci (~travis-ci@ec2-54-204-172-201.compute-1.amazonaws.com) ()
[12:08:25] lhodev: not really. I just remember we used to have problems with that multiple times
[12:09:25] but since your DPDK initializes correctly and fails on ibv_reg_mr, then that's something new
[12:10:34] *** Joins: travis-ci (~travis-ci@ec2-54-198-135-204.compute-1.amazonaws.com)
[12:10:35] (spdk/master) dpdk: update submodule (paul luse)
[12:10:35] Diff URL: https://github.com/spdk/spdk/compare/98187ed9a750...d853ced622c9
[12:10:35] *** Parts: travis-ci (~travis-ci@ec2-54-198-135-204.compute-1.amazonaws.com) ()
[12:10:39] I have seen this before, but it was due to a low ulimit memlock.
[12:10:50] By "this", I refer to the failure of ibv_reg_mr().
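Since a low memlock limit came up as a past cause of ibv_reg_mr() failures, here is a minimal sketch for inspecting the two settings named in the conversation from inside a process. It is not from the channel. It assumes a Linux host, uses Python's standard `resource` module, and the helper names are made up for illustration. It only reports values and makes no judgement about what counts as "large enough".

```python
import resource
from pathlib import Path

# Standard procfs location for vm.max_map_count on Linux.
MAX_MAP_COUNT = Path("/proc/sys/vm/max_map_count")


def describe_limit(value):
    """Render an rlimit value; RLIM_INFINITY means no limit is enforced."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values for this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def max_map_count(path=MAX_MAP_COUNT):
    """Return vm.max_map_count as an int, or None if the file is absent."""
    if not path.exists():
        return None
    return int(path.read_text())


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("memlock soft:", describe_limit(soft))
    print("memlock hard:", describe_limit(hard))
    print("vm.max_map_count:", max_map_count())
```

Note that the limits reported are those of the process running the script, so it should be launched the same way (same user, same shell or service manager) as nvmf_tgt for the numbers to be comparable.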
[12:15:33] *** Joins: alekseymmm (bcf3adf1@gateway/web/freenode/ip.188.243.173.241)
[12:19:37] *** Joins: travis-ci (~travis-ci@ec2-54-161-182-138.compute-1.amazonaws.com)
[12:19:38] (spdk/master) barrier.h: fix load fence on armv8 (Kefu Chai)
[12:19:38] Diff URL: https://github.com/spdk/spdk/compare/d853ced622c9...04df6e694056
[12:19:38] *** Parts: travis-ci (~travis-ci@ec2-54-161-182-138.compute-1.amazonaws.com) ()
[12:39:42] *** Quits: JoeGruher (86868b53@gateway/web/freenode/ip.134.134.139.83) (Quit: Page closed)
[14:09:31] jimharris: Can you respond to my comment on https://review.gerrithub.io/#/c/spdk/spdk/+/424973/ with respect to the version?
[14:27:45] alekseymmm: I posted a potential fix for your fio issue
[14:27:48] responded to github bug
[14:38:53] *** Joins: travis-ci (~travis-ci@ec2-54-211-135-142.compute-1.amazonaws.com)
[14:38:54] (spdk/master) iscsi: Support Base64 constants for iSCSI certification CHAP test (Shuhei Matsumoto)
[14:38:54] Diff URL: https://github.com/spdk/spdk/compare/04df6e694056...4e54b1a082f2
[14:38:54] *** Parts: travis-ci (~travis-ci@ec2-54-211-135-142.compute-1.amazonaws.com) ()
[14:52:59] *** Quits: guerby (~guerby@april/board/guerby) (Remote host closed the connection)
[14:55:26] *** Joins: guerby (~guerby@april/board/guerby)
[15:17:31] lhodev: done
[15:17:52] i think the 18.10 was still there from before the release
[15:18:10] i think we should change it to 19.01-pre on master, then change it to 18.10.1 when we cherry-pick it back to the 18.10 branch?
[15:22:19] *** Joins: travis-ci (~travis-ci@ec2-54-211-38-250.compute-1.amazonaws.com)
[15:22:20] (spdk/master) bdev/raid: raid_bdev_add_base_device: fix wrong param in log (wuzhouhui)
[15:22:20] Diff URL: https://github.com/spdk/spdk/compare/4e54b1a082f2...7806ece53bf0
[15:22:20] *** Parts: travis-ci (~travis-ci@ec2-54-211-38-250.compute-1.amazonaws.com) ()
[15:23:14] Thanks jimharris. Sounds good to me.
[17:11:06] @bwalker Thanks a lot.
Going to check it soon
[17:11:13] *** Quits: alekseymmm (bcf3adf1@gateway/web/freenode/ip.188.243.173.241) (Quit: Page closed)