[03:30:34] *** Joins: BelAzi (~BelAzi@193.158.83.227)
[07:23:14] *** Quits: BelAzi (~BelAzi@193.158.83.227) (Ping timeout: 272 seconds)
[07:37:12] *** Joins: BelAzi (~BelAzi@193.158.83.227)
[08:17:58] *** Quits: BelAzi (~BelAzi@193.158.83.227) (Ping timeout: 245 seconds)
[08:21:40] *** Joins: BelAzi (~BelAzi@193.158.83.227)
[08:43:10] *** Quits: BelAzi (~BelAzi@193.158.83.227) (Ping timeout: 250 seconds)
[08:47:00] PSA: There were some hardware-related failures over the weekend on the Jenkins test pool. I am retriggering a lot of the patches that failed, but if you happen to look back in the next few hours and your patch has not run, please feel free to retrigger it.
[08:50:32] yup, noticed that too. sethhowe did a little work on some of the issues they had with the QAT cards as well - just fixed this morning
[08:54:46] *** Joins: BelAzi (~BelAzi@193.158.83.227)
[08:55:06] peluse: i've seen this failure a few times over the last few days - it hit one of my patches again just now:
[08:55:06] https://10.102.17.104:8080/job/Other/job/blockdev_autotest/11852/artifact/build.log
[08:55:16] any thoughts?
[08:55:58] looking...
[08:57:31] sethhowe: i just ran an experiment with linking all of the libraries with --whole-archive
[08:57:45] the increased size of nvmf_tgt was less than 1%
[08:58:18] jimharris, yeah I saw this too once last week. I thought I mentioned it on IRC, asking bwalker if it was related to the passthru 'already claimed' issue from early/mid last week, but either I didn't bring it up or didn't hear a response. Did you work on the PT claiming thing, and if so, what was the root cause there?
[08:59:49] want me to look into the "io_device 0x2cc1940 already registered" failure before I get rolling on other shit?
[09:00:14] jimharris: Awesome! Sounds like we're good to move forward on that plan then. I haven't quite started on the hierarchy patches yet. I have been trying to get our bdev tests working after binding the QAT cards to the 01.org driver (this will be necessary for compressdev to work)
[09:00:29] i have a few fixes out for the already claimed thing
[09:00:42] but this hit even with those fixes in place
[09:01:03] see three patches starting here: https://review.gerrithub.io/#/c/spdk/spdk/+/432600/
[09:03:03] crap, that was my bad on that one... when I added RPC testing for PT I failed to add the delete. Srrrr
[09:03:51] jimharris, so do you want me to debug the io_device issue now then?
[09:04:44] yeah - i'm guessing if you are able to reproduce it locally you'll have a fix pretty quick
[09:04:55] I'm on it...
[09:05:03] but i was running this json_config test locally late last week, and don't think i hit this issue at all
[09:15:38] hmmm, yeah I ran them locally when I added RPC testing to crypto - won't run for me now, at least on one machine. DPDK won't init for some reason. Happy Monday, right??
[09:28:48] *** Quits: BelAzi (~BelAzi@193.158.83.227) (Ping timeout: 245 seconds)
[09:31:35] heh, of course, doesn't fail for me locally on first run at least. Will keep poking
[11:18:24] jimharris, actually there's at least one thing (propagate register err code) that you just added to PT that's not in crypto. Will be out for a few hours but will probably do a quick pass patch on crypto to get it up to date with PT while I look closer into how the device was still there following clear_config
[11:18:39] cool
[11:19:49] *** Joins: JoeGruher (c037362a@gateway/web/freenode/ip.192.55.54.42)
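For reference, the --whole-archive experiment mentioned at 08:57 amounts to a link line along these lines; the object file and library names here are illustrative, not the exact SPDK link command:

    # Keep every object from the listed static archives, not just the ones
    # that resolve an undefined symbol - this is what grows the binary size.
    cc -o nvmf_tgt nvmf_main.o \
        -Wl,--whole-archive -lspdk_nvmf -lspdk_bdev -Wl,--no-whole-archive \
        -lspdk_util -lpthread -lrt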
[11:20:07] Got this when running setup.sh... haven't seen this before: "Current user memlock limit: 16 MB"
[11:20:38] interesting - you get that when running setup.sh as root?
[11:20:46] well, with sudo
[11:21:46] what does "sudo ulimit -a" show?
[11:21:55] specifically for max locked memory
[11:22:13] and can you narrow down where setup.sh is throwing this error?
[11:22:42] i don't seem to have a ulimit command on ubuntu 18.04
[11:23:00] setup.sh prints it as a warning at the end:
[11:23:02] Current user memlock limit: 16 MB. This is the maximum amount of memory you will be able to use with DPDK and VFIO if run as current user. To change this, please adjust limits.conf memlock limit for current user. ## WARNING: memlock limit is less than 64MB ## DPDK with VFIO may not be able to initialize if run as current user.
[11:25:17] OK, if I do sudo sh -c "ulimit -a" that works - locked memory(kbytes) 16384
[11:25:26] ok - are you bringing spdk up for the first time on this system?
[11:26:00] sort of. i cloned the OS from another system, which didn't have this problem, so the underlying hardware is new / has changed.
[11:27:47] does /etc/security/limits.conf have any memlock entries?
[11:28:07] no, everything in the file is commented out
[11:28:44] hmmmm
[11:29:24] new system has at least the same amount of memory as the original system?
[11:29:51] may want to double check things like kernel command line parameters for number of hugepages
[11:31:40] linux /boot/vmlinuz-4.18.16-041816-generic root=UUID=9e8940e1-dd53-11e8-9a9b-a4bf011d7fa7 ro console=tty0 console=ttyS1,115200 intel_iommu=on
[11:32:18] VT-d is enabled in BIOS on both systems?
[11:32:47] ah, let me look, something in the bios that's different would certainly make sense
[11:34:50] *** Joins: travis-ci (~travis-ci@ec2-54-80-235-78.compute-1.amazonaws.com)
[11:34:52] (spdk/master) nvmf: Use bdevperf for the shutdown test (Ben Walker)
[11:34:52] Diff URL: https://github.com/spdk/spdk/compare/bf1a82cf5ac2...edb693c58df1
[11:34:52] *** Parts: travis-ci (~travis-ci@ec2-54-80-235-78.compute-1.amazonaws.com) ()
[11:38:22] Yeah, at least according to the BIOS screens, VT-d is set to enabled
[11:39:02] any the cloned system has >= memory of the original system?
[11:39:05] any => and
[11:40:45] No, actually, the source system was running 512GB and the cloned system has 384GB
[11:45:43] 16MB memlock limit is the default on ubuntu
[11:46:00] unless you want to run SPDK as non-root, it shouldn't cause any trouble
[11:46:20] JoeGruher says he is running as root though
[11:46:41] sorry, i didn't follow the entire discussion
[11:46:45] is there any issue though?
[11:47:01] i'm running setup.sh with sudo
[11:47:17] he's cloned the OS disk from another system, and on the new system, hits this memlock issue
[11:47:59] i'm not really sure what the problem could be here - certainly seems like something related to the cloning process
[11:49:11] Yeah. I guess I could do a fresh install and see if that fixes the problem. It's just time consuming to set up from scratch, so I wanted to see if there's a known fix for this situation.
[11:49:48] i've never seen this come up before
[11:50:01] if you sudo su, and then do a ulimit -a, does it still show the memlock limit?
[11:51:12] yeah, it does
[11:52:31] but root doesn't care about that limit
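If you do want a bigger memlock limit for a non-root user, setup.sh's warning points at /etc/security/limits.conf; a minimal sketch, with a hypothetical user name and a 64 MB value:

    # append soft/hard memlock limits (values in KB) for user "joe";
    # the change takes effect on the next login session
    printf 'joe    soft    memlock    65536\njoe    hard    memlock    65536\n' | sudo tee -a /etc/security/limits.conf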
[11:53:00] on the old system i think it wasn't binding to vfio-pci
[11:53:57] it was using uio_pci_generic
[11:54:20] your current system has the iommu enabled according to your kernel boot parameters
[11:54:23] which is why it's using vfio-pci
[11:55:01] so does the old system but for some reason that one still binds to generic
[11:55:15] oh yeah, the old system also has 'locked memory(kbytes) 16384' but doesn't generate the warning when running setup.sh
[11:55:45] i can try removing the kernel boot parameter on the new system and see if it binds to generic and that prevents the setup.sh warning
[11:55:57] but don't i want to be using vfio-pci
[11:56:53] in general, yes
[11:56:56] that's the newer setup
[11:57:43] hmm
[11:58:01] do you know if you're testing with vfio-pci in ubuntu?
[11:59:27] I don't actually. sethhowe?
[11:59:59] JoeGruher: We aren't. All of our ubuntu systems are VMs
[12:01:49] so maybe i'm in a bit of an unexplored corner case with ubuntu + vfio-pci. i also wonder why the older system doesn't bind to vfio-pci even when i force intel_iommu=on. they're both skylake systems.
[12:03:11] confirmed - on the new system if i remove intel_iommu=on, i bind to uio_pci_generic when i run setup.sh, and then i don't see that warning from setup.sh
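A quick way to confirm which driver a device actually landed on after setup.sh; the PCI address below is just a placeholder:

    # show the kernel driver currently bound to a given NVMe device
    lspci -k -s 0000:5e:00.0 | grep -i 'kernel driver in use'
    # with intel_iommu=on this typically reports vfio-pci; without it,
    # setup.sh falls back to uio_pci_generic, matching what was seen above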
[12:47:18] *** Quits: JoeGruher (c037362a@gateway/web/freenode/ip.192.55.54.42) (Quit: Page closed)
[13:18:47] *** Joins: travis-ci (~travis-ci@ec2-54-211-144-2.compute-1.amazonaws.com)
[13:18:48] (spdk/master) bdev/passthru: check early for duplicate passthru bdev names (Jim Harris)
[13:18:48] Diff URL: https://github.com/spdk/spdk/compare/edb693c58df1...0d8783cabc50
[13:18:48] *** Parts: travis-ci (~travis-ci@ec2-54-211-144-2.compute-1.amazonaws.com) ()
[13:19:03] *** Joins: travis-ci (~travis-ci@ec2-174-129-149-194.compute-1.amazonaws.com)
[13:19:04] (spdk/master) iscsi: change connection messages to DEBUGLOGs (Jim Harris)
[13:19:05] Diff URL: https://github.com/spdk/spdk/compare/0d8783cabc50...807c3a2b27a6
[13:19:05] *** Parts: travis-ci (~travis-ci@ec2-174-129-149-194.compute-1.amazonaws.com) ()
[15:00:05] *** Joins: travis-ci (~travis-ci@ec2-54-80-140-157.compute-1.amazonaws.com)
[15:00:06] (spdk/master) bdev_aio: enable double buffering on write path (Piotr Pelplinski)
[15:00:06] Diff URL: https://github.com/spdk/spdk/compare/807c3a2b27a6...17d652d72009
[15:00:06] *** Parts: travis-ci (~travis-ci@ec2-54-80-140-157.compute-1.amazonaws.com) ()
[15:25:25] *** Joins: travis-ci (~travis-ci@ec2-54-163-45-129.compute-1.amazonaws.com)
[15:25:26] (spdk/master) app: RPC to wait for app subsystem initialization (Seth Howell)
[15:25:27] Diff URL: https://github.com/spdk/spdk/compare/17d652d72009...b7f54bd66eef
[15:25:27] *** Parts: travis-ci (~travis-ci@ec2-54-163-45-129.compute-1.amazonaws.com) ()
[15:27:21] *** Joins: travis-ci (~travis-ci@ec2-54-80-140-157.compute-1.amazonaws.com)
[15:27:22] (spdk/master) reduce: write path of pm file to backing dev (Jim Harris)
[15:27:22] Diff URL: https://github.com/spdk/spdk/compare/b7f54bd66eef...cfc372c2ec9e
[15:27:22] *** Parts: travis-ci (~travis-ci@ec2-54-80-140-157.compute-1.amazonaws.com) ()
[15:27:51] *** Joins: travis-ci (~travis-ci@ec2-54-211-207-110.compute-1.amazonaws.com)
[15:27:52] (spdk/master) vfio: don't use VFIO when IOMMU is disabled (Darek Stojaczyk)
[15:27:52] Diff URL: https://github.com/spdk/spdk/compare/cfc372c2ec9e...b3db2a65eacd
[15:27:52] *** Parts: travis-ci (~travis-ci@ec2-54-211-207-110.compute-1.amazonaws.com) ()
[15:31:15] *** Joins: travis-ci (~travis-ci@ec2-54-224-121-86.compute-1.amazonaws.com)
[15:31:16] (spdk/master) test: Add support to configure ipsec git repo (Ed rodriguez)
[15:31:17] Diff URL: https://github.com/spdk/spdk/compare/b3db2a65eacd...9ba446a25ce4
[15:31:17] *** Parts: travis-ci (~travis-ci@ec2-54-224-121-86.compute-1.amazonaws.com) ()
[15:36:53] *** Joins: JoeGruher (c0373626@gateway/web/freenode/ip.192.55.54.38)
[15:41:20] if I'm running the SPDK NVMeoF target on a dual socket system, and each socket has a NIC and NVMe attached, is there a way to set up SPDK so it uses cores on each socket, but those cores only service their NUMA-local NIC and NVMe? I know I can use the reactor mask to give SPDK cores on both sockets, and I can set things up so storage on socket A is only exposed through the NIC on socket A, but by default nothing would prevent those I
[15:42:31] could I run two separate instances of nvmf_tgt with unique reactor masks and config files, for example?
[15:43:22] you can definitely just run two targets
[15:43:27] and that's a valid way to do it
[15:43:37] you don't have much control right now within one target though
[15:44:13] Can I use rpc.py for configuration in that scenario? It doesn't grok multiple instances of nvmf_tgt, does it?
[15:44:55] I've gotten in the habit of using a script with rpc.py for configuring the target, but no reason I can't go back to a config file, I just have to refresh myself on how setting up that way works
[15:45:16] the rpc.py script can do it too - the -s and -p options let you connect to a server and a port
[15:45:24] let me look at exactly how you do that with a domain socket
[15:46:25] yeah when you start the target, there is a -r option
[15:46:30] which is the UNIX domain socket for rpc
[15:46:36] by default it is /var/tmp/spdk.sock
[15:46:42] but you can change it to /var/tmp/spdk1.sock
[15:46:45] and /var/tmp/spdk2.sock
[15:46:56] and then when you connect with rpc.py, use the -s option to set the socket file
[15:47:01] rpc.py -s /var/tmp/spdk1.sock
[15:47:28] neat! I'll check it out. thanks.
[15:47:49] it even supports tcp sockets, if you are in an environment where security isn't important
[15:47:57] so you can configure remotely
[15:48:31] awesome
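Putting those pieces together, a minimal sketch of the two-instance approach; the core masks, socket paths, and the get_bdevs call are illustrative, not a prescribed configuration:

    # one target per CPU socket, each with its own core mask and RPC socket
    ./app/nvmf_tgt/nvmf_tgt -m 0x00FF -r /var/tmp/spdk1.sock &
    ./app/nvmf_tgt/nvmf_tgt -m 0xFF00 -r /var/tmp/spdk2.sock &
    # configure each instance through its own socket with -s
    ./scripts/rpc.py -s /var/tmp/spdk1.sock get_bdevs
    ./scripts/rpc.py -s /var/tmp/spdk2.sock get_bdevs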
[16:25:27] bwalker, you there?
[16:25:32] ya
[16:26:26] QQ.. I'm gonna do a small patch for the crypto module that includes all of the relevant fixes recently in the PT module around RPC and registration, etc., from Jim. Do you care if they're all in one patch (not a lot of stuff and all related, but Jim's was a series)?
[16:27:07] if it's 3 or 4 one-line fixes grouped together I don't care, but generally a patch series is easier to review and what I prefer
[16:28:31] no problem, I'll do a series, thanks!
[16:32:53] jimharris, LOL not sure if you noticed as you were updating vbdev_passthru_register(), but that function no longer registers a bdev :) It changed when the RPC stuff was added. I'll circle back after the crypto patch and the io_device-already-registered stuff and clean up the names...
[16:34:19] jimharris, oh wait, was looking at an old version in my IDE. disregard
[16:34:39] fyi - bwalker and i will both miss the community meeting tomorrow
[16:34:59] it's possible i missed some other cleanup while i was in there mucking with the passthru module
[16:39:23] *** Quits: JoeGruher (c0373626@gateway/web/freenode/ip.192.55.54.38) (Quit: Page closed)
[17:01:03] OK
[17:01:04] *** Quits: gila (~gila@5ED74129.cm-7-8b.dynamic.ziggo.nl) (Quit: My Mac Pro has gone to sleep. ZZZzzz…)
[17:23:03] jimharris, had to play dad taxi for a bit; found a few places in the crypto vbdev where we weren't unregistering the io_device in certain failures, but don't think that's the root cause for the CI thing. Will walk through the clear_config python stuff next and see what/how/where bdev/vbdev ordering is happening....
[20:27:46] jimharris, FYI one small interesting data point on the io_device failure... the last 4-5 failures that I tracked down all failed on the Jenkins CI and passed on the CH CI. More tomorrow...
[21:19:04] *** Joins: tomzawadzki (uid327004@gateway/web/irccloud.com/x-vhmraxxnavrpadmy)