[00:13:01] *** Joins: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl)
[00:54:27] *** Quits: ziyeyang_ (~ziyeyang@134.134.139.73) (Quit: Leaving)
[02:45:13] *** Quits: cunyinch_ (~cunyinch@134.134.139.78) (Remote host closed the connection)
[03:42:31] *** Quits: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) (Ping timeout: 240 seconds)
[03:46:14] *** Joins: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl)
[04:13:03] *** Quits: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) (Ping timeout: 276 seconds)
[04:25:44] *** Joins: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl)
[05:03:39] *** Joins: ziyeyang_ (~ziyeyang@134.134.139.75)
[05:19:10] "what about using a ramdisk instead? run parted on the ramdisk, then specify the ramdisk as an AIO bdev"
[05:19:50] For this suggestion, I tried it and found a bug with our AIO device; it can also be reproduced with the split driver.
[05:20:40] If you split an AIO device into 2, then run test/lib/bdev/blockdev.sh, the bdevperf test with verify mode will also fail.
[05:20:56] And today, I did not have the time to look at it.
[05:22:26] *** Quits: ziyeyang_ (~ziyeyang@134.134.139.75) (Quit: Leaving)
[06:49:06] drv, the env.sh failure on my test machine is in the vtophys DPDK init, will look into it more later today...
[06:49:08] Starting DPDK 17.05.0 initialization...
[06:49:08] [ DPDK EAL parameters: vtophys -c 0x1 --file-prefix=spdk_pid23341 ]
[06:49:08] EAL: Detected 72 lcore(s)
[06:49:08] EAL: No free hugepages reported in hugepages-2048kB
[06:49:08] EAL: No free hugepages reported in hugepages-1048576kB
[06:49:09] EAL: FATAL: Cannot get hugepage information.
[06:49:11] EAL: Cannot get hugepage information.
[06:49:15] Failed to initialize DPDK
[07:07:21] *** Joins: swmoon (afcb4763@gateway/web/freenode/ip.175.203.71.99)
[07:07:45] *** Quits: swmoon (afcb4763@gateway/web/freenode/ip.175.203.71.99) (Client Quit)
[07:16:37] *** Quits: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) (Ping timeout: 255 seconds)
[07:16:42] note: I can run vtophys directly and it works fine...
[07:18:27] also, just for my own curiosity, what's the deal with this msg that I see even on successful runs (and quite often): "EAL: No free hugepages reported in hugepages-1048576kB"
[07:19:46] *** Joins: gila (~gila@ec2-54-91-114-223.compute-1.amazonaws.com)
[09:15:10] Good morning everyone, just a heads up that I am adding an Ubuntu machine to the test pool this morning.
[09:15:19] I had to make some slight changes to the autotest script to accommodate this machine, so please rebase your changes off of the latest master before submitting them.
[09:15:38] *** Quits: sethhowe_ (~sethhowe@192.55.54.39) (Remote host closed the connection)
[09:16:28] peluse: that message just means your system has no 1GB hugepages
[09:16:41] the troubling one is the one above it that says you have no 2MB hugepages
[09:16:53] what's the output of /proc/meminfo? sometimes when things crash, the hugepages aren't released
[09:17:01] cat /proc/meminfo
[09:17:16] or if you rebooted and hadn't run the setup.sh script yet?
[09:23:50] *** Joins: sethhowe (sethhowe@nat/intel/x-jisnxezwwfvaidim)
[09:24:05] *** Quits: sethhowe (sethhowe@nat/intel/x-jisnxezwwfvaidim) (Remote host closed the connection)
[09:33:44] *** Joins: sethhowe (~sethhowe@192.55.54.39)
[09:36:07] *** Quits: sethhowe (~sethhowe@192.55.54.39) (Remote host closed the connection)
[10:07:13] *** Joins: sethhowe (~sethhowe@192.55.54.44)
[10:53:18] sethhowe: can you add the uuid library to all of the test systems?
one of our new features (logical volumes) needs that library for generating uuids
[11:04:47] I think you want libuuid-devel on fedora
[11:38:52] bwalker: gerrit says this needs a rebase https://review.gerrithub.io/#/c/365295/
[12:22:22] bwalker, will check here in a few, thanks.
[12:22:54] jimharris, wrt setup.sh, I didn't hunt through the scripts but I assume that's in there somewhere, starting with autorun.sh?
[12:31:14] * peluse wonders who the genius is that made the Intel server BIOS setup default to a blink of an eye to see which key to hit to enter setup...
[12:52:22] bwalker, here's my before and after /proc/meminfo https://gist.github.com/peluse/f994763912c048d67bb3df13c9778807 if you can take a quick peek
[12:54:13] peluse: autotest.sh will call scripts/setup.sh
[12:54:35] jimharris, yeah, I figured and was just going to look
[12:54:55] sethhowe: can you take a look at this log? http://spdk.intel.com/public/spdk/builds/review/ae5b4a0f6cc425fbbc60ce1ae6837057ac9cc5c0.1498074406/vm-ubuntu-01/build.log
[12:55:14] OLD_STDERR and OLD_STDOUT - do you know what those files are?
[12:55:46] bwalker, note that immediately after the failure, if I run the test manually it works fine
[12:57:15] jimharris, one is the old std error and one is the old std out ;)
[12:57:29] oh - of course - i should have figured that out :)
[12:57:39] I'm here to help of course :)
[12:58:53] i'm not sure how to log in to vm-ubuntu-01
[12:59:30] sethhowe, I assume you know that it exited because of extra files in the spdk dir from a failed run? I asked drv about that the other day; you have to manually clean them out between runs if you're messing around, getting failures, and then restarting manually
[13:00:20] jimharris, you are making this too easy for me to resist today... try "ssh vm-ubuntu-01" :)
[13:01:12] sethhowe, FYI I just commented out that "exit 1" in mine for the time being...
[13:04:40] peluse jimharris: Those files are created by lcov in ubuntu.
I pushed a patch yesterday to delete them. If you rebased on top of master this morning, the patch should have passed. If you did rebase and it failed, I need to do some more fixing.
[13:09:26] jimharris: You can get into the vm's by ssh sys_sgsw@cpdk-ci-000.ch.intel.com, then tmux attach; the vm's are running in separate windows of the tmux session.
[13:12:39] jimharris: I made some comments on the previous rev of your nbd review
[13:12:43] you fixed one of them already :)
[13:37:00] peluse: /proc/meminfo looked fine
[13:37:04] you are running as root, right?
[13:42:34] drv: thanks - yeah, there were some shortcuts I took last night to get this working that I forgot to fix :)
[13:44:09] yeah, looks good overall, just nit-picky stuff
[13:51:57] bwalker, well, I was running sudo ../spdk/autorun.sh
[13:52:04] but also tried as root, same result
[13:55:11] I am not using the same .conf file for the test though, let me grab that and rerun
[13:57:03] sudo is fine
[13:58:04] I don't really have a theory as to why you are seeing that failure
[14:00:27] yeah, right after the failure I can run it manually with the exact same params and it works, so I'll start by double-checking everything that happened before against good log output and make sure I'm not missing a subtle failure or other message of some kind
[14:17:33] does it reproduce if you run the full autorun but have SPDK_TEST_NVME_MULTIPROCESS=0 in autorun-spdk.conf?
[14:34:45] *** Quits: sethhowe (~sethhowe@192.55.54.44) (Remote host closed the connection)
[14:36:39] *** Joins: sethhowe (~sethhowe@192.55.54.44)
[14:44:57] drv, I'm not sure my .conf changes are being picked up; I'm putting the file in ~/ but in the test output where you see the exports, they don't match.
I'll try just commenting out some of the earlier tests one by one and see if one of them missing makes it pass
[14:47:58] peluse: if autorun-spdk.conf is working, you should see a 'source /home/sys_sgsw/autorun-spdk.conf' line as one of the very first things in the log
[14:48:05] well, your home dir rather than sys_sgsw
[14:48:52] yeah, that's where it's at. I'll look next time I run it; just commented out all the earlier tests and it passed. Will start adding them back and see where it pukes
[14:49:42] I would guess that some earlier test is not correctly releasing the hugepage files
[14:50:29] yeah, but why on my system and not the test systems?
[14:50:59] no idea :)
[14:51:08] you just have the magic touch
[14:51:15] oh come on, I know you know but you're just not saying :)
[14:51:32] if you can narrow it down to a specific test, we can try to strace it or something and see what's going wrong
[14:51:38] but I would bet money that it's the NVMe multiprocess test
[14:51:40] yup
[14:52:05] I'm just gonna rerun a few times like it is and make sure it's consistent first
[14:54:13] peluse: I just had a thought
[14:54:19] try just running ../spdk/autorun.sh
[14:54:20] no sudo
[14:54:26] the scripts will automatically sudo the right parts
[14:54:28] bwalker, will do
[14:54:39] that's why it isn't finding your config file
[14:54:49] it's looking in root's home
[14:55:48] maybe put a whoami in autorun.sh with a warning if it is root?
[14:55:55] bwalker, OK, need to check on some permissions I guess... Creating CONFIG.local..../configure: line 228: config.h: Permission denied
[14:56:26] yeah, you should not be running autorun.sh as sudo - it calls sudo itself
[14:56:26] do "sudo git clean -x -d -f -f"
[14:56:33] in your repository
[14:56:38] if you have nothing in there you want to keep
[14:56:56] you ran our configure script as root, so now you can't delete the local config file
[14:57:03] you also shouldn't run the configure script as root
[14:58:32] OK, retrying...
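[editor's note] The whoami-style guard suggested at 14:55 could be as small as the following sketch; the function name, message wording, and placement at the top of autorun.sh are hypothetical, not from the log:

```shell
#!/bin/sh
# Hypothetical guard for the top of autorun.sh: the script calls sudo itself
# for the parts that need it, and running the whole thing via sudo makes it
# look for autorun-spdk.conf in root's home instead of the invoking user's.
check_not_root() {
    if [ "$1" -eq 0 ]; then
        echo "Do not run autorun.sh as root or via sudo; it calls sudo itself." >&2
        return 1
    fi
}

# In autorun.sh this would be: check_not_root "$(id -u)" || exit 1
# Demonstrated here with fixed uids so it runs the same for any user:
check_not_root 1000 && echo "uid 1000: allowed"
check_not_root 0 2>/dev/null || echo "uid 0: rejected"
```

This still assumes the regular user has passwordless sudo set up, as noted in the discussion.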
[15:02:35] if it all just works, we'll put out some patches that will catch running as root
[15:03:03] oh, it also assumes your user has passwordless sudo set up
[15:06:02] so running w/o sudo didn't fix the issue, but I can still run w/ or w/o sudo as long as the earlier tests are commented out, so I'll try w/o sudo with tests added back in one at a time to see which one is the instigator
[15:06:19] *** Quits: sethhowe (~sethhowe@192.55.54.44) (Remote host closed the connection)
[15:14:51] seems to be something in nvme.sh. have to go pick up my son, will narrow it down further later. also will swing by the lab and grab more memory just for fun... see ya in a bit
[15:17:46] bwalker: i added a couple of comments to your bdev fio plugin patch - i didn't really get to the code yet since i wasn't sure if it was ready for review or not
[15:18:12] you can review it when you have time - the part that's missing is the actual tests that use it
[15:20:41] *** Joins: sethhowe (~sethhowe@134.134.139.72)
[16:26:11] *** Joins: ziyeyang_ (~ziyeyang@134.134.139.74)
[17:28:25] *** Quits: ziyeyang_ (~ziyeyang@134.134.139.74) (Remote host closed the connection)
[17:28:46] *** Joins: ziyeyang_ (~ziyeyang@192.55.54.42)
[17:35:54] *** Quits: ziyeyang_ (~ziyeyang@192.55.54.42) (Remote host closed the connection)
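[editor's note] The hugepage-leak theory from 14:49 can be checked between runs straight from /proc/meminfo. A minimal sketch, assuming a Linux host and that /dev/hugepages is the hugetlbfs mount point (the common default, but not confirmed in the log):

```shell
#!/bin/sh
# Compare total vs. free hugepages with no SPDK/DPDK app running; a large
# gap suggests a crashed test left its hugepage files behind.
total=$(awk '/^HugePages_Total:/ {print $2}' /proc/meminfo)
free=$(awk '/^HugePages_Free:/ {print $2}' /proc/meminfo)
echo "hugepages: total=$total free=$free"
# Stale per-process files usually sit in the hugetlbfs mount; removing them
# (or re-running scripts/setup.sh) releases the pages:
ls /dev/hugepages/ 2>/dev/null || echo "no hugetlbfs mount at /dev/hugepages"
```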