[05:34:31] *** Joins: nKumar (uid239884@gateway/web/irccloud.com/x-dofuurpcwckbzigg)
[05:40:37] So I am attempting to run the hello_blob example on my device, however I'm trying to do it using actual NVMe drives instead of the memory but am having some issues. What should the hello_blob.conf file look like for NVMe for this example?
[06:30:24] one sec
[06:36:42] [Nvme]
[06:36:42] TransportID "trtype:PCIe traddr:0000:06:00.0" Nvme0
[06:37:20] and the "06" in there depends on your system. You can see the address when you start the program and SPDK comes up, hello_world won't work that first time but go back and edit the address and it will the next time
[06:37:26] then in the code you have to change one line:
[06:37:42] bdev = spdk_bdev_get_by_name("Nvme0n1");
[06:37:56] where it used to say Malloc0 or something like that
[06:41:02] here's the output that you'll see as spdk starts up where you can get the nvme address on your system "EAL: PCI device 0000:06:00.0 on NUMA socket 0"
[07:21:59] *** Joins: lhodev (~Adium@inet-hqmc01-o.oracle.com)
[07:39:30] *** ChanServ sets mode: +o jimharris
[08:25:18] drv: i'm looking at that vagrant centos7 issue
[08:59:00] *** ChanServ sets mode: +o bwalker
[09:03:30] peluse: regarding iteration - if you want to break earlier, just call spdk_bs_md_close_blob
[09:10:43] jimharris: he pushed a patch on github for that vagrant thing
[09:11:04] https://github.com/spdk/spdk/pull/180
[09:11:21] well, one of the two vagrant issues
[09:11:32] yeah - I'm testing it
[09:12:00] the patch also installs git in the VM and does the submodule update in the VM which I don't think is right
[09:12:54] i'm wondering how we can make it easier for folks to get set up to push to gerrithub
[09:13:27] I wonder if it is scriptable via their REST API
[09:14:08] the main annoyance is probably having to do the github account import steps, and I don't think there's much we can do about making that easier
[09:15:04] we could at least script adding the "review" remote to .git/config, and downloading the gerrit hook
[09:15:21] but the https password or ssh side is a bit more challenging
[09:15:46] personally I use ssh - do most people use https passwords to interact with github/gerrithub?
[09:16:05] I use https
[09:16:08] I think https is easier to set up
[09:16:13] I use SSH
[09:16:16] ultimately there isn't much difference
[09:16:37] with ssh you don't have an extra step with gerrithub
[09:16:45] you either generate an https password on gerrithub or you upload your ssh key to gerrithub
[09:16:52] it's the same number of steps
[09:17:17] the advantage to https is proxy configuration is usually already worked out for most companies
[09:17:19] hmmm - I was thinking I didn't have to upload my keys separately for gerrithub
[09:17:26] but I guess that's wrong
[09:17:47] maybe it can import - I didn't use ssh for github either
[09:18:05] and I'm confident most people use https on github
[09:22:29] i think you'll find a fair number of sophisticated github users that use ssh
[09:23:17] but regardless - my only point is that anything we do to help automate should cover either case
[09:24:06] bwalker, thanks yeah that was clear :) Just odd that we (the app) didn't open it but do need to close it which is why I mentioned it...
[09:28:56] how do we want to handle patch submissions that still come in through github?
[09:29:31] jimharris, can't we redirect them to use gerrithub? Who is trying to make pull requests?
[09:30:46] oh, I see it now
[09:30:49] we can redirect them to use gerrithub, but there will still be folks who are accustomed to just doing pull requests and setting up for gerrithub submission does require a lot of manual steps
[09:31:54] I don't think it's *that* many steps but our directions are sort of lengthy so it's a bit daunting for anyone in a hurry. One of the things on the todo list is to make it more like the readme in the repo and then have a section later on w/all the details for someone who cares
[09:31:55] i think we want to redirect folks to gerrithub - just wondering if we should be a bit lenient in the short term
[09:32:25] yeah, like separate out the URLs for viewing the dashboard from the stuff to set it up to push a patch
[09:32:57] yeah, maybe I should just push a proposal sooner than later...
[09:33:08] *** Parts: lhodev (~Adium@inet-hqmc01-o.oracle.com) ()
[09:33:15] for now that, for the one liner that's mentioned earlier, one of us can just do it I suppose
[09:34:14] man, I can't type this morning "for now though" I meant
[09:37:47] I updated RocksDB in the test pool which requires a patch to our repo - make sure you rebase before submitting new patches
[09:39:27] bwalker: looks like DPDK will soon be able to just use virtual address as IOVA
[09:39:29] drv, when you get a chance can you take a look at https://review.gerrithub.io/#/c/370775/ and chime in with your opinion. gracias
[09:40:17] IOVA as in the bus address?
[09:40:34] dma address
[09:40:37] yes
[09:40:56] that's probably a smart design
[09:41:10] it's a patch set from Cavium, but reading through the patch descriptions this *should* work like we want it to for us too
[09:41:39] so then we need a way to ask, for a given PCI device, whether the IOMMU is enabled or not
[09:41:44] if it is enabled, we just use VA
[09:41:49] if it is not, we ask for the PA
[09:42:04] or DPDK could transparently do that in the translation call too
[09:42:11] http://dpdk.org/ml/archives/dev/2017-August/072871.html
[09:44:02] so it should just work then
[09:44:14] I don't think it will break my fixes to run as an unprivileged user either
[09:53:40] jimharris, does the NVMe driver work with SSDs that use other LBA sizes, like 4104? Any changes need to be made if so?
[09:54:49] yes - the NVMe driver works for 'non-standard' LBA sizes
[10:02:39] OK, I haven't tried it but nkumar got errors running hello_blob against one setup for 4104 but not one for 4096, I'll try it on my end and see if I get similar errors and go from there. thanks!
[10:03:02] *** Joins: lhodev (~Adium@inet-hqmc01-o.oracle.com)
[10:03:16] *** Quits: lhodev (~Adium@inet-hqmc01-o.oracle.com) (Client Quit)
[10:03:29] blobstore does not work with 4104 namespaces
[10:03:37] yeah that's blobstore that's not working
[10:03:49] the NVMe driver works, but blobstore needs a block size of 4096
[10:04:15] got it, thanks!
[10:04:40] blobstore may work with 512 too
[10:04:43] I'd have to look
[10:04:45] good to know!
[10:04:51] but it needs to be easily divisible into 4096
[10:04:51] blobstore works with 512
[10:07:24] I'll throw a check in spdk_bdev_create_bs_dev() if you think that makes sense and return an error code for unsupported block sizes?
[10:08:43] *** Joins: johnmeneghini (~johnmeneg@pool-96-252-112-122.bstnma.fios.verizon.net)
[10:09:51] yeah that would be a good addition
[10:19:04] hi, I saw the following error when using the kernel initiator to connect to SPDK NVMe-oF target "nvme connect -n "nqn.2016-06.io.spdk:cnode1" -t rdma -a 192.168.10.10 -s 4420 -q "nqn.2016-06.io.spdk:init1"
[10:19:04] Failed to write to /dev/nvme-fabrics: Invalid cross-device link
[10:19:04] "
[10:19:20] Looks like it has to do with the number of io queues
[10:19:39] dmesg has
[10:19:40] [262720.987901] nvme nvme0: creating 15 I/O queues.
[10:19:40] [262721.496897] nvme nvme0: Connect command failed, error wo/DNR bit: -16402
[10:19:40] [262721.496945] nvme nvme0: failed to connect i/o queue: -18
[10:19:50] *** Quits: johnmeneghini (~johnmeneg@pool-96-252-112-122.bstnma.fios.verizon.net) (Quit: Leaving.)
[10:20:06] so I changed the number of i/o queues and I can connect successfully
[10:20:32] root@nvmeof-cli001:/home/bench-admin# nvme connect -n "nqn.2016-06.io.spdk:cnode1" -t rdma -a 192.168.10.10 -s 4420 -q "nqn.2016-06.io.spdk:init1" -i 12
[10:20:32] dmesg
[10:20:32] [263422.072530] nvme nvme0: creating 12 I/O queues.
[10:20:32] [263422.511279] nvme nvme0: new ctrl: NQN "nqn.2016-06.io.spdk:cnode1", addr 192.168.10.10:4420
[10:20:56] The system has 2 NUMA nodes with 12 cores per node
[10:21:17] if I try to do anything greater than 12 queues I get the failure
[10:21:43] *** Joins: johnmeneghini (~johnmeneg@pool-96-252-112-122.bstnma.fios.verizon.net)
[10:21:44] Is this expected on the host?
[10:21:46] jkkariu: what kernel version are you using?
[10:21:56] 4.9
[10:22:35] there was a bug in the Linux kernel NVMe-oF host code that wasn't fixed until (if I remember correctly) 4.11
[10:23:03] if you use latest Fedora, it should already have 4.11
[10:24:11] Ok. Thanks. Let me download 4.11 and see if I can do i/o queues greater than 12
[10:24:26] hopefully fedora updates to 4.12 soon because then soft RoCE works with NVMe-oF
[10:24:38] that makes everything really convenient
[10:28:38] *** Joins: lhodev (~Adium@inet-hqmc01-o.oracle.com)
[10:40:14] I'm actually using the soft RoCE stuff to test NVMe-oF at my desk now and it's super convenient
[10:41:01] *** Parts: lhodev (~Adium@inet-hqmc01-o.oracle.com) ()
[12:11:12] hmm, just had a test failure (unrelated to my patch) that I haven't seen in a while, thought these issues were cleared up: run_test ./test/vhost/spdk_vhost.sh --integrity
[12:11:20] or maybe I did just really break something :)
[12:11:28] anyone wanna look or should I just resubmit
[12:11:42] ? https://review.gerrithub.io/#/c/374182/
[12:38:28] hmm, that is worrying - obviously has nothing to do with your patch
[12:41:02] yeah, ran through the 2nd time OK...
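(editor's note) The block-size rule from the 10:03 to 10:09 exchange (4096 and 512 work, 4104 does not, and the size "needs to be easily divisible into 4096") can be sketched as a small predicate. This is only an illustration under assumptions: the names BLOB_IO_UNIT_SIZE and blob_block_size_supported are hypothetical, and this is not the check that was proposed for spdk_bdev_create_bs_dev() nor any existing SPDK API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the 4096-byte unit the discussion says blobstore needs. */
#define BLOB_IO_UNIT_SIZE 4096u

/*
 * Hypothetical helper, not part of SPDK: returns true when the device block
 * size divides evenly into 4096 (for example 512 or 4096) and false
 * otherwise (for example 4104, or a zero block size).
 */
bool
blob_block_size_supported(uint32_t block_size)
{
	if (block_size == 0) {
		return false;
	}
	return (BLOB_IO_UNIT_SIZE % block_size) == 0;
}
```

A caller could run a predicate like this before creating the blobstore device and return an error code for unsupported sizes, as suggested at 10:07:24.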
[12:55:01] i have a git submodule question
[12:56:36] I want to use the dpdk submodule in spdk, but the url for my git server is different. How do I change the url of the git server without modifying the .gitmodules file in spdk?
[13:00:08] johnmeneghini: I think you can modify your local .git/config [submodule "dpdk"] to point it at a different URL
[13:00:50] yes, I've tried that. Haven't been able to get it to work successfully. I'll try again and report back.
[13:38:39] peluse: the dependencies on your ut/nvme patches are still a bit out of whack
[13:39:01] this one looks good
[13:39:01] https://review.gerrithub.io/#/c/372541/
[13:39:21] hmmm
[13:39:28] which one looks "suspect"?
[13:39:31] but the other patches depend on the older version of that patch with ALL CAPS
[13:40:15] not sure how that keeps happening, I must be missing a step when I updated the commit msgs or something
[13:40:23] if you click on that link, you see the next patch in the series is the nvme_allocate_request_null() patch, but it has the green squiggly line meaning it is based on a different version of the patch
[13:40:47] https://review.gerrithub.io/#/c/372542/
[13:41:05] when you click on that one (for nvme_allocate_request_null), it is dependent on the UNIT TEST version of the patch
[13:41:13] I guess to be totally correct I probably just need to rebase that one?
[13:41:31] i have to jump in the car here in 20 min, I'll look at the one real quick as an example
[13:41:32] but it's that way through the rest of the patches
[13:41:44] well, yeah would have to do all of them I suppose
[13:41:46] it looks like you pushed a new rev of all of them yesterday morning
[13:41:47] when you're about to push, you should be able to see the full up-to-date list in 'git log', and then push the final patch in the series for review
[13:41:55] but does it matter merging them one at a time?
[13:41:55] that should make sure all of them are up to date
[13:42:28] it means we have to rebase each one individually rather than checking in the whole series at once
[13:42:34] not the end of the world, but it takes longer
[13:42:35] ahhh
[13:42:42] no, I'd rather get it right
[13:43:39] yeah, they all say merge conflict now anyway so I have to f with them...
[13:44:28] jimharris, yeah yesterday I fixed all the commit msgs again
[13:49:13] So spdk_nvme_probe() is first in the chain and nvme_allocate_request() is last. I would assume if I rebased the first and pushed that all the ones after it would get updated but drv you are saying push the final patch, not the first?
[13:49:30] right, if you push just the first one in the series, it won't update any of the other ones
[13:49:40] so they will all still depend on old versions of the previous patches
[13:49:53] if you push the last one in the series, it will push any of the ones it depends on too
[13:49:54] why does that seem backwards to me?
[13:50:16] let's clarify what we mean by "first" and "last" :)
[13:50:20] "first" meaning first in the series to be checked in, which is at the bottom of the list on gerrithub, just to be confusing
[13:50:21] first means it has no dependencies, last means it has all the dependencies :)
[13:50:58] same thing right? everyone else depends on the first right?
[13:51:11] right
[13:51:27] from my vantage point, it looks like each of the patches got pushed, with a version where the commit message says ut/nvme (not UNIT TEST), but each of them is dependent on the older version of all of the patches preceding it
[13:52:54] jimharris, yeah that would make sense as to why the CAPS shit keeps coming back, I'm just not sure how I fix those later dependencies in the chain to point to the correct earlier later versions of the patches. Heh, earlier later. Earlier in the series, later revision of the patch
[13:57:37] i think you need to run through it F2F with drv or bwalker if you can - this is super easy with stgit but seems like git rebase -i is throwing you for a loop
[13:58:08] can you do a git log right now and see the right commit messages for all of your patches?
[14:02:03] peluse: drv thinks this makes it more confusing, but sethhowe and I swear by this alias to visualize the tree
[14:02:05] [alias]
[14:02:05] tree = log --graph --abbrev-commit --decorate --format=format:'%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n'' %C(white)%s%C(reset) %C(bold white)- %an%C(reset)' --exclude=/origin/dev/* --branches=*
[14:02:16] then you type "git tree"
[14:02:19] and it draws you the graph
[14:02:25] HEAD is where you are now
[14:05:53] you may also want to set your pager to less
[14:06:00] [core]
[14:06:08] pager = "less --tabs=1,5 -XRF"
[14:07:59] actually for our code do pager = "less --tabs=1,9 -XRF"
[14:09:19] *** Quits: gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) (Quit: My Mac Pro has gone to sleep. ZZZzzz…)
[14:09:31] the -X is cool because it makes it so the screen doesn't clear when you exit less
[14:09:40] so you can draw the tree, exit the pager
[14:09:44] then still see it and type out commit hashes
[14:09:53] without switching to like another tmux pane
[15:06:21] drv: can you +2 this again? i had to rebase it https://review.gerrithub.io/#/c/373272/
[15:12:02] jimharris: done
[16:14:20] so we can't revert the defer put_io_channel patch, because of the nvme bdev spinwait stuff that got added
[16:14:41] yeah - it is breaking stuff in blobstore too
[16:14:59] so it needs a good re-think
[16:17:25] should we go ahead with the "defer destroying bs_dev" patch then? https://review.gerrithub.io/#/c/371885/
[16:17:33] (the lvol stuff is on top of that currently)
[16:24:48] I think so
[16:51:25] *** Quits: ziyeyang (~ziyeyang@192.55.54.44) (*.net *.split)
[16:51:25] *** Quits: ppelplin (~ppelplin@192.55.54.44) (*.net *.split)
[16:51:25] *** Quits: gangcao (~gangcao@192.55.54.44) (*.net *.split)
[16:51:25] *** Quits: pbshah1 (~pbshah1@192.55.54.44) (*.net *.split)
[16:51:25] *** Quits: mszwed (~mszwed@192.55.54.44) (*.net *.split)
[16:51:25] *** Quits: changpe1 (~changpe1@192.55.54.44) (*.net *.split)
[16:51:25] *** Quits: kjakimia (~kjakimia@192.55.54.44) (*.net *.split)
[16:51:37] *** Quits: ChanServ (ChanServ@services.) (*.net *.split)
[16:55:21] *** Joins: ziyeyang (~ziyeyang@192.55.54.44)
[16:55:22] *** Joins: ppelplin (~ppelplin@192.55.54.44)
[16:55:22] *** Joins: gangcao (~gangcao@192.55.54.44)
[16:55:22] *** Joins: pbshah1 (~pbshah1@192.55.54.44)
[16:55:22] *** Joins: mszwed (~mszwed@192.55.54.44)
[16:55:22] *** Joins: changpe1 (~changpe1@192.55.54.44)
[16:55:22] *** Joins: kjakimia (~kjakimia@192.55.54.44)
[16:56:01] *** Joins: ChanServ (ChanServ@services.)
[16:56:02] *** orwell.freenode.net sets mode: +o ChanServ
[16:56:45] *** Quits: ChanServ (ChanServ@services.) (*.net *.split)
[16:57:29] *** Quits: ziyeyang (~ziyeyang@192.55.54.44) (*.net *.split)
[16:57:29] *** Quits: ppelplin (~ppelplin@192.55.54.44) (*.net *.split)
[16:57:29] *** Quits: gangcao (~gangcao@192.55.54.44) (*.net *.split)
[16:57:29] *** Quits: pbshah1 (~pbshah1@192.55.54.44) (*.net *.split)
[16:57:29] *** Quits: mszwed (~mszwed@192.55.54.44) (*.net *.split)
[16:57:29] *** Quits: changpe1 (~changpe1@192.55.54.44) (*.net *.split)
[16:57:29] *** Quits: kjakimia (~kjakimia@192.55.54.44) (*.net *.split)
[16:59:49] *** Joins: ziyeyang (~ziyeyang@192.55.54.44)
[16:59:50] *** Joins: ppelplin (~ppelplin@192.55.54.44)
[16:59:50] *** Joins: gangcao (~gangcao@192.55.54.44)
[16:59:50] *** Joins: pbshah1 (~pbshah1@192.55.54.44)
[16:59:50] *** Joins: mszwed (~mszwed@192.55.54.44)
[16:59:50] *** Joins: changpe1 (~changpe1@192.55.54.44)
[16:59:50] *** Joins: kjakimia (~kjakimia@192.55.54.44)
[17:00:06] *** Joins: ChanServ (ChanServ@services.)
[17:00:06] *** orwell.freenode.net sets mode: +o ChanServ
[17:55:44] *** ChanServ sets mode: +o peluse
[17:56:23] bwalker, yeah I already use that one - you gave it to me a while ago thanks :)
[17:57:21] jimharris, yeah I chatted with drv and bwalker and we think it's clear how I messed up the chain. wasn't w/rebase -i it was by updating each commit msg individually. Have a plan to fix them all here as soon as I have another beer :)
[17:57:38] * peluse building up the courage...
[18:22:35] I think it might have worked! Will check back in a bit
[18:29:15] so the dependencies all look right and there were no issues as I started with the first, rebased and then cherry picked up the chain and pushed. A few show conflicts with others though that I don't totally understand where/why but I dunno, take a look and see what happens to the rest when one is found worthy of merging :)
[19:34:08] *** Quits: nKumar (uid239884@gateway/web/irccloud.com/x-dofuurpcwckbzigg) (Quit: Connection closed for inactivity)
[19:36:28] *** Joins: lhodev (~Adium@inet-hqmc07-o.oracle.com)
[20:33:43] so 2 of the UT patches failed for unrelated reasons. Can someone remove the -1 label so I don't have to resubmit and risk breaking the chain again...
[21:25:24] *** Quits: lhodev (~Adium@inet-hqmc07-o.oracle.com) (Quit: Leaving.)