[00:25:47] *** Joins: tkulasek_ (~tkulasek@192.55.54.39)
[02:08:02] *** Quits: dlw (~Thunderbi@114.255.44.143) (Remote host closed the connection)
[02:08:29] *** Joins: dlw (~Thunderbi@114.255.44.143)
[03:42:23] *** Joins: johnmeneghini (~johnmeneg@100.0.53.181)
[03:56:09] *** Quits: dlw (~Thunderbi@114.255.44.143) (Ping timeout: 264 seconds)
[06:07:27] *** Joins: lyan (~lyan@2605:a000:160e:2124:4a4d:7eff:fef2:eea3)
[06:47:31] *** Joins: dlw (~Thunderbi@111.197.236.176)
[07:17:29] *** Quits: dlw (~Thunderbi@111.197.236.176) (Ping timeout: 256 seconds)
[07:34:44] *** Joins: dlw (~Thunderbi@111.197.236.176)
[07:50:21] *** Quits: dlw (~Thunderbi@111.197.236.176) (Ping timeout: 240 seconds)
[08:10:56] darsto: excellent!
[08:25:28] pushed https://review.gerrithub.io/#/c/spdk/spdk/+/416420/ for changing the base-virtaddr
[09:04:45] looks good - I don't see any more of the "Base virtual address hint [...] not respected!" messages in any of the logs
[09:10:48] jimharris: I responded to your comments on the zcopy patch
[09:13:01] I also originally named the operations spdk_bdev_map_blocks and spdk_bdev_unmap_blocks
[09:13:09] but obviously, spdk_bdev_unmap_blocks means something else already
[09:13:38] I kind of liked modeling it after mmap in that way, but it isn't an exact analogy
[09:13:44] so I moved away from that
[09:21:09] *** Joins: travis-ci (~travis-ci@ec2-107-22-7-197.compute-1.amazonaws.com)
[09:21:09] (spdk/master) blob: factor out mask loading into a function (Daniel Verkamp)
[09:21:10] Diff URL: https://github.com/spdk/spdk/compare/8f26b74e020a...f4a018722c0f
[09:21:10] *** Parts: travis-ci (~travis-ci@ec2-107-22-7-197.compute-1.amazonaws.com) ()
[09:33:12] *** Joins: pault (c7ff2cfa@gateway/web/freenode/ip.199.255.44.250)
[09:34:29] *** Parts: pault (c7ff2cfa@gateway/web/freenode/ip.199.255.44.250) ()
[09:46:10] sethhowe_: can you respond to my email?
[09:46:46] Also, I am seeing the following error when I run check_format.sh on my FreeBSD platform
[09:46:48] Checking Python style.../usr/local/lib/python2.7/site-packages/pep8.py:2124: UserWarning:
[09:46:48] pep8 has been renamed to pycodestyle (GitHub issue #466)
[09:46:48] Use of the pep8 tool will be removed in a future release.
[09:46:48] Please install and use `pycodestyle` instead.
[09:46:49] $ pip install pycodestyle
[09:46:50] $ pycodestyle ...
[09:58:29] thanks for the report john - could you submit a patch that checks for this in check_format.sh?
[09:59:00] i.e. if freebsd, then try pep8 and if pep8 isn't available do pycodestyle instead
[09:59:14] or maybe try pycodestyle first since long term that's what will be used
[09:59:19] I think the pep8 -> pycodestyle rename is not freebsd specific
[09:59:30] or just if hash pycodestyle, run that, else if hash pep8, run that.
[09:59:34] I don't think it's a check format problem. I think the pep utility is broken on my FreeBSD box
[09:59:49] https://github.com/PyCQA/pycodestyle/issues/466
[10:28:17] Yes, I got that issue with the pip change. My question to Seth is: why is no-one else seeing this on the SPDK CITs? My conclusion is that you guys must be running a different version of the FreeBSD tool chain. I'll open a bug report and propose a patch.
[10:41:19] it looks like this is a recent change, so you probably just have a newer version of FreeBSD
[11:05:02] jimharris: It's restarting now.
[11:53:50] *** Quits: tkulasek_ (~tkulasek@192.55.54.39) (Ping timeout: 268 seconds)
[12:19:29] I tried Daniel's fix and pycodestyle is finding python errors that pep8 does not.
[12:19:32] Checking Python style...
Python formatting errors detected
[12:19:33] scripts/spdkcli.py:25:5: E722 do not use bare except'
[12:19:33] test/iscsi_tgt/rpc_config/rpc_config.py:434:9: E722 do not use bare except'
[12:19:33] test/lvol/lvol_test.py:53:5: E722 do not use bare except'
[12:20:34] i think more recent versions of pep8 throw that E722 error too
[12:20:57] john - can you push a patch that fixes these E722 errors?
[12:21:15] drv: do you have that 4TB SSD back in your system?
[12:23:38] Are you guys running MacOS Sierra ? (10.12.6)
[12:24:49] yes - i'm running 10.12.6
[12:25:21] OK. I'll push up my patch and you can try it out.
[12:50:03] *** Joins: alekseymmm (bcf3adf1@gateway/web/freenode/ip.188.243.173.241)
[12:51:03] @jimharris Hello, could you kindly answer some of my questions about bdevperf app from spdk?
[12:53:09] jimharris: I have the 4TB disk back in my box now
[12:55:53] sure alekseymmm
[12:56:19] drv - can you try the latest rev of your patch? i just pushed a small mod to it
[12:56:26] sure
[12:56:44] do you want me to try the whole series or just the first patch?
[12:56:52] As far as I could understand from my tests and the code of bdevperf, it can test many bdevs at once but every single bdev is tested in a single thread (on 1 core)
[12:57:09] is it possible to test 1 bdev but from many threads
[12:57:28] like what fio in linux does with numjobs parameter
[12:57:55] currently it's not possible, but you are not the first person to want that ;-)
[12:58:12] fne plans to do so ?
[12:58:22] any plans to do so ?
[12:59:11] I think it would be more useful to test 1 bdev from many threads. this way you could get its full performance
[12:59:13] not yet - bwalker and i have talked about this previously - as this gets more complicated, bdevperf (and nvme/perf) will really need some kind of config file mechanism like fio
[12:59:44] but maybe a shorter-term solution could be a flag to bdevperf that says to test every bdev on every thread?
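The checker fallback suggested earlier in the log for check_format.sh (run pycodestyle if it is installed, otherwise fall back to pep8) could look roughly like the sketch below. It is written in Python for illustration only and is not the patch that was submitted; the function name `pick_style_checker` and its injectable `which` parameter are hypothetical.

```python
import shutil


def pick_style_checker(which=shutil.which):
    """Return the first installed style checker, preferring pycodestyle.

    `which` is injectable so the lookup can be exercised without touching
    the real PATH (hypothetical helper, not part of SPDK).
    """
    for tool in ("pycodestyle", "pep8"):
        if which(tool) is not None:
            return tool
    return None


if __name__ == "__main__":
    checker = pick_style_checker()
    if checker is None:
        print("neither pycodestyle nor pep8 found; skipping Python style check")
    else:
        print("would run:", checker)
```

Trying pycodestyle first matches the remark that it is what will be used long term, while still working on systems that only ship the older pep8 name.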
[12:59:56] i suspect this would cover the vast majority of use cases
[13:00:45] Yes, in the code you allocate a head array of length num cores, but use only 1 head item if there is only 1 bdev. I think it is possible to initialize all of them with targets
[13:02:36] "but maybe a shorter-term solution could be a flag to bdevperf that says to test every bdev on every thread?" Are you planning to do that?
[13:03:35] what is the nvme/perf you mentioned before ? what is the difference from bdevperf?
[13:04:02] i don't have any plans to do that currently - if you'd like to submit a patch for it I would be more than happy to review
[13:04:19] nvme/perf is similar to bdevperf, except it bolts directly to the SPDK NVMe driver
[13:04:53] Hmm I have no experience in submitting patches via gerrithub, so it could take a while for me (
[13:05:29] i don't think there's a big rush - and any of us here on IRC would be happy to help if you run into problems
[13:06:00] http://www.spdk.io/development/ gives a lot of good instructions for submitting via gerrithub
[13:06:43] What kind of multithread bdev test do you like more: running many bdevs from many threads, or just specifying a -b "bdev name" flag (like in hello world) and testing only this bdev from many threads?
[13:07:14] I think the first option is more complicated
[13:08:12] looking at the bdevperf code...
[13:09:11] me either
[13:10:12] so -b "bdev name" could be useful as a way to say "only test this bdev" - whether that's single thread or multi-thread
[13:10:38] i think having -b "bdev name" imply that it should be tested from all threads might be a bit confusing
[13:11:05] no, I mean -b defines the bdev name and -m defines coremask
[13:11:51] so if you want to test like Malloc0 from 8 threads you just run bdevperf -b Malloc0 -m0xff
[13:11:55] so if you want to test like Malloc0 from 8 threads you just run bdevperf -b Malloc0 -m 0xff
[13:13:06] i guess i think that's too implicit - should specifying -b Malloc0 be treated differently than a conf file that only describes Malloc0?
[13:13:55] i don't think testing all bdevs on all threads is really that much harder than just one bdev on all threads
[13:14:17] bdevperf_construct_targets needs to be refactored a bit - primarily pull out the code for building the struct io_target into a separate function
[13:14:20] so you don't like the idea of -b flag ?
[13:14:42] I like the idea of -b flag to say "only test this one bdev, even if the conf file specifies more"
[13:14:56] i just don't like -b to also imply test it on all threads
[13:15:33] I think -m specifies the threads.
[13:16:29] it does, but today, if there are 2 bdevs and user specifies 0x7 (meaning three threads) - the first two threads get one bdev each and the third won't run any bdevs
[13:16:44] I got it
[13:17:52] once bdevperf_construct_targets is refactored, your new flag could choose one of two scheduling options - the default one we use today, or a new one that goes through the full list of bdevs for each thread
[13:18:17] but the multithreaded bdevperf should then run 3 threads and each of them submits ios to both bdevs and each thread submits up to queuedepth ios.
[13:19:05] exactly
[13:19:35] what do you think a good option would be for this multi-thread capability? -M?
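The two scheduling options discussed above can be compared with a small model. This is a Python sketch for illustration, not bdevperf's C code: the function names are hypothetical, and the round-robin wrap-around when there are more bdevs than cores is an assumption the log does not spell out.

```python
def cores_from_mask(mask):
    """Return the core indices whose bits are set in a core mask such as 0x7."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def schedule_default(bdevs, cores):
    """Model of today's behavior as described in the log: each bdev is
    handed to exactly one core, so extra cores stay idle.
    Assumption: bdevs wrap around round-robin if they outnumber the cores.
    """
    if not cores:
        raise ValueError("core mask selects no cores")
    plan = {core: [] for core in cores}
    for i, bdev in enumerate(bdevs):
        plan[cores[i % len(cores)]].append(bdev)
    return plan


def schedule_all_threads(bdevs, cores):
    """Model of the proposed option: every core walks the full bdev list."""
    if not cores:
        raise ValueError("core mask selects no cores")
    return {core: list(bdevs) for core in cores}


if __name__ == "__main__":
    cores = cores_from_mask(0x7)  # three threads, as in the example above
    print(schedule_default(["Malloc0", "Malloc1"], cores))
    print(schedule_all_threads(["Malloc0", "Malloc1"], cores))
```

With mask 0x7 and two bdevs, the default plan leaves the third core without work, while the all-threads plan gives each of the three cores both bdevs, each submitting up to the queue depth on its own.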
[13:20:04] never mind - we already use -M for something else
[13:20:12] R/W mix
[13:20:20] -M is r/w mix
[13:20:24] yep
[13:20:57] -T and -t are also out (there's a patch in flight to use -T in bdevperf)
[13:21:07] thinking T as in thread
[13:21:07] what for ?
[13:21:18] oh wait - no, it's -L
[13:21:24] -T would work
[13:21:49] so i think we're saying it would look like: bdevperf -T -m 0xF
[13:22:01] which would mean test every bdev on all four threads?
[13:23:00] in fact we don't have right now any flags without value. Is it ok to have just "-T"
[13:23:15] ?
[13:23:22] yes - that's fine, just don't put a : after the letter in the getopt string
[13:23:38] also fyi we try to keep the getopt string in alphabetical order
[13:23:43] so it would go at the end
[13:24:09] it will be like this "c:d:m:q:s:t:w:M:P:S:T"
[13:24:14] yep
[13:24:38] i would suggest doing this in two patches - first one does the refactoring to pull the io_target building code into one function
[13:24:40] ok then it sounds like an interesting exercise for me if no one is planning to do something like this
[13:24:44] then the second one does this new multi-thread options
[13:24:46] option
[13:24:50] will make it easier to review
[13:25:14] and please - any questions you have on gerrithub, etc anyone here will be glad to help
[13:25:44] ok thanks then. the problem is timezones )
[13:26:30] what time zone are you in?
part of our spdk team at intel is based in poland
[13:26:48] and a lot of them are here on irc during their working hours
[13:27:22] Oh then it is not that bad then, I am gmt+3
[13:27:31] asia is also well represented - part of spdk team is in shanghai and shuhei matsumoto from hitachi is also usually on irc during his working hours
[13:27:47] part of intel spdk team i meant to say
[13:28:01] Thanks for all the ideas
[13:29:34] jimharris: things look good with https://review.gerrithub.io/#/c/spdk/spdk/+/416448/ - the bdev shows up with the right size now on my 4 TB device
[13:30:13] excellent
[13:30:42] looks like it needs a rebase, though
[13:31:06] oh yeah - i forgot to do that after pulling down your patch
[13:31:51] OK Jim you've got your wish. If you checkout my branch on your Macbook and follow the instructions in scripts/vagrant/README.md, you should be able to build a FreeBSD vagrant box and test out my patch.
[13:31:53] https://review.gerrithub.io/#/c/spdk/spdk/+/416461/
[13:32:16] I developed these fixes on my FreeBSD vbox.
[13:35:35] jimharris why didn't you initially design bdevperf to test only one bdev ? then if you wish to test many bdevs just run it several times... And it would be easier to implement multithreading
[13:37:02] there are a lot of cases where we want to test multiple bdevs at once to test throughput on one thread for example - i.e. one nvme ssd might get 500K IO/s, but a CPU core is capable of doing many more IO than that
[13:45:06] got it
[13:56:28] bwalker: could you take a look at the patches from tomek k that need another +2?
[14:07:21] sure
[14:10:27] drv, johnmeneghini: https://review.gerrithub.io/#/c/spdk/spdk/+/416434/1/include/spdk/nvme.h
[14:10:51] for trsvcid - can FC just be a zero-length string just like PCIe?
[14:11:05] hmm, yeah, I missed that
[14:11:54] let me look at the spec and see what it says
[14:12:11] ok - i'll wait to remove your +2 then :)
[14:12:12] but I think it should be able to be an empty string - we will convert that into all spaces since the TRSVCID field is defined as an "ASCII string" per the NVMe spec
[14:13:40] the fc-nvmf spec draft I have says "Transport Service ID: shall be set to the ASCII string “none” (see NVMe over Fabrics)."
[14:13:55] I don't know if that is the same in the final version
[14:14:43] and it also says "Transport Address: shall be set to “nn-0xWWNN:pn-0xWWPN” where: a) WWNN is the Node_Name of the target NVMe_Port; and b) WWPN is the N_Port_Name of the target NVMe_Port."
[14:14:46] even then, it seems like it's easier for SPDK to fill out whatever the spec says should be there since we can key off of the trtype
[14:15:22] yes
[14:15:41] and there are two places where this matters: on the host side, and in the target discovery log page
[14:15:46] the host can just ignore these if they have no meaning for FC
[14:15:53] and the target should fill out the values as the spec requires
[14:21:12] I started thinking about how to actually use this zcopy stuff from within nvmf
[14:21:19] and then my brain melted
[14:21:35] but in all seriousness, it's going to take quite a bit of work to restructure the nvmf request state machine
[14:21:58] there is a whole series of states that get the data buffer and perform the RDMA transfer, all without ever parsing the nvme command itself
[14:22:22] but when we introduce this, for just read and write we need to use the bdev zcopy thing
[14:22:30] but for all other commands, we need to use the existing nvmf buffer pool
[14:22:38] which means we have to shift the command parsing much earlier
[14:23:04] i feel like maybe we don't have the API quite right yet
[14:23:18] I just uploaded a new revision
[14:23:20] we have two things sort of combined - zero copy and buffer allocation
[14:28:56] for trsvcid -
can FC just be a zero-length string just like PCIe?
[14:29:04] No, not for FC-NVMe
[14:29:22] It's a string, but it can't be zero length.
[14:29:53] I started thinking about how to actually use this zcopy stuff from within nvmf:
[14:30:16] I will have to look at what you mean by zcopy
[14:30:50] But the goal of the BDEV-IO changes is to eliminate all copies in both the read and write path
[14:37:34] johnmeneghini: what needs to go in trsvcid for FC, then?
[14:37:38] the spec I have says it should literally say "none"
[14:37:42] but I don't know if that's the final spec
[14:38:42] and we will do the conversion from C-style string to space-padded string in the discovery code; it doesn't need to be space-padded in the transport_id
[14:38:56] e.g. see spdk_nvmf_rdma_discover(), which uses spdk_strcpy_pad() to fill out trsvcid and traddr
[14:41:24] drv: what needs to go in trsvcid for FC, then?
[14:43:09] Sorry, I need to look at the spec. to answer clearly.
[14:43:15] I'll be back
[14:45:02] johnmeneghini - i'm just saying that for the SPDK APIs, when the user specifies an NVMe-FC controller to connect to, they just zero the trsvcid since it doesn't really apply to FC - we can do the conversion to all spaces, or "none" internally in SPDK
[14:45:16] looking at /home/sys_sgsw removal
[14:45:33] what if, instead of /home/${USER}, we parameterized this directory altogether?
[14:45:35] Thanks Jim
[14:45:57] I agree, something like a parameterized change is needed
[14:46:08] let's assume we call it DEPENDENCY_DIR
[14:46:24] and this is where you put VM images, nvme-cli checkouts, etc.
[14:46:46] autotest_common.sh can just define this as /home/sys_sgsw by default, but user can override it
[14:47:11] then this patch could pass the existing test pools without having to move or rename anything
[14:47:12] We can't hard code it, and we can't assume that $USER will work because these scripts are all called by a number of different users, including root - much to my disdain.
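The DEPENDENCY_DIR idea above (default to /home/sys_sgsw, let the caller override it through the environment) can be sketched as follows. This is a Python illustration of the default-with-override pattern only; the real change would live in the shell script autotest_common.sh, and the helper name `dependency_dir` is hypothetical.

```python
import os

# Default taken from the discussion; everything else here is illustrative.
DEFAULT_DEPENDENCY_DIR = "/home/sys_sgsw"


def dependency_dir(env=None):
    """Return the directory holding VM images, nvme-cli checkouts, etc.

    Uses DEPENDENCY_DIR from the environment when it is set and non-empty,
    otherwise falls back to the default. `env` is injectable for testing.
    """
    env = os.environ if env is None else env
    return env.get("DEPENDENCY_DIR") or DEFAULT_DEPENDENCY_DIR


if __name__ == "__main__":
    print("dependencies expected under:", dependency_dir())
```

Because the value is read from the environment rather than derived from $USER, the same scripts behave identically whether they are run by a regular user, by root, or inside a vagrant box.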
[14:47:43] you don't have to hardcode it
[14:47:44] DEPENDENCY_DIR sounds like a good idea to me.
[14:47:49] you would do something like:
[14:47:59] DEPENDENCY_DIR=/home/${USER} some_spdk_script.sh
[14:48:27] would that work for you john?
[14:49:14] darsto: did you test the ASAN patch with DPDK 18.05?
[14:49:37] That will work for me. I just need all of these scripts to support a programmable user directory, including /home/vagrant!
[14:50:28] could you modify your /home/${USER} patch to take this approach instead?
[14:50:57] Hey guys, I'm hitting a time out because I am on EST.
[14:51:12] Yes, I will work on this patch.
[14:51:33] The /home/${USER} stuff won't work
[14:52:15] ok john - let's chat about it more in our meeting tomorrow
[14:52:26] unless you have a brief moment to explain now
[14:53:15] I need to log off and leave to pick up my motorcycle and my wife. I'll let you figure out the order of that
[14:53:39] My wife just called and said the car is smoking.
[14:53:48] So that's the order
[14:54:00] lol
[14:56:55] Sorry to quit just when I am warming up. I
[14:57:03] will talk to you tomorrow.
[14:58:37] Try my patch! I want to get the vbox stuff working!
[14:58:39] https://review.gerrithub.io/#/c/spdk/spdk/+/416461/
[14:58:47] *** Quits: johnmeneghini (~johnmeneg@100.0.53.181) (Quit: Leaving.)
[14:58:51] i'm trying - i can't select the distro
[14:59:01] debugging create_vbox.sh as we speak
[15:08:43] In the linux kernel there is a quite useful nvme-loop.ko driver for nvmeof testing, which is basically a local loopback backend for nvmeof. One can use it to try nvmeof when there is no infiniband or other rdma device: just do nvme connect-all -t loop with the corresponding target started and you get /dev/nvme?n1 for tests. Is there any kind of loopback nvmeof target in spdk?
[15:09:12] you can use the linux kernel's soft roce driver in loopback
[15:09:45] i am asking about spdk nvmeof target
[15:10:00] oh sorry
[15:10:15] so what parameters should i specify in config then ?
[15:10:45] if you have soft roce running, just set the target up to listen on the NIC's IP
[15:10:50] and have the initiator connect to that same IP
[15:11:19] What if I don't have any NIC and just want to test it locally
[15:11:26] would it work?
[15:11:45] I think soft roce does have to attach to a real NIC, but I'm not sure
[15:11:57] you could try to attach it to your lo device
[15:12:15] I am not sure either, ok then, I'll look in this direction
[15:12:47] I don't think it will let you attach it to your lo device
[15:13:02] but if you have an onboard nic on the system, you can definitely run loopback attached to that
[15:13:10] it doesn't need to be an RDMA NIC, just any NIC
[15:14:10] But spdk nvmeof says in config that the only supported trtype is rdma. Would it be fine with this configuration ?
[15:14:29] I mean it is not "real" rdma
[15:15:30] well at least it would be interesting to try this way, thanks
[15:16:31] functionally it works the same and is great for development, testing, etc.
- but performance/efficiency isn't as good obviously since the host CPU is doing the data placement
[15:16:56] yes, sure, it is only for testing purposes
[15:21:42] bye
[15:26:19] *** Quits: alekseymmm (bcf3adf1@gateway/web/freenode/ip.188.243.173.241) (Ping timeout: 260 seconds)
[15:48:58] *** Joins: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97)
[15:50:06] *** Joins: travis-ci (~travis-ci@ec2-174-129-94-18.compute-1.amazonaws.com)
[15:50:07] (spdk/master) blob: always use uint64_t to represent page_idx (Jim Harris)
[15:50:07] Diff URL: https://github.com/spdk/spdk/compare/841f0beae542...f3001308726b
[15:50:07] *** Parts: travis-ci (~travis-ci@ec2-174-129-94-18.compute-1.amazonaws.com) ()
[15:53:11] *** Joins: travis-ci (~travis-ci@ec2-107-22-7-197.compute-1.amazonaws.com)
[15:53:12] (spdk/master) env_dpdk: pick base-virtaddr that ASAN won't override (Jim Harris)
[15:53:12] Diff URL: https://github.com/spdk/spdk/compare/f3001308726b...9d04d0efd5e0
[15:53:12] *** Parts: travis-ci (~travis-ci@ec2-107-22-7-197.compute-1.amazonaws.com) ()
[15:55:46] *** Joins: travis-ci (~travis-ci@ec2-174-129-94-18.compute-1.amazonaws.com)
[15:55:47] (spdk/master) blobstore: add decouple parent function (Tomasz Kulasek)
[15:55:48] Diff URL: https://github.com/spdk/spdk/compare/9d04d0efd5e0...635a1aa8a962
[15:55:48] *** Parts: travis-ci (~travis-ci@ec2-174-129-94-18.compute-1.amazonaws.com) ()
[18:40:31] *** Joins: dlw (~Thunderbi@114.255.44.143)
[18:52:55] changpe1: I noticed that I had missed the critical logic in your patch. Now it looks fine to me and you don't have to change anything. Thanks.
[18:57:12] *** Quits: lyan (~lyan@2605:a000:160e:2124:4a4d:7eff:fef2:eea3) (Remote host closed the connection)
[19:04:04] *** Joins: lhodev (~lhodev@66-90-218-190.dyn.grandenetworks.net)
[19:12:15] *** Quits: lhodev (~lhodev@66-90-218-190.dyn.grandenetworks.net) (Quit: Textual IRC Client: www.textualapp.com)
[19:29:15] *** Joins: lhodev (~lhodev@66-90-218-190.dyn.grandenetworks.net)
[19:42:17] *** Joins: johnmeneghini (~johnmeneg@pool-100-0-53-181.bstnma.fios.verizon.net)
[20:40:09] *** Quits: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97) (Ping timeout: 260 seconds)
[21:11:41] *** Quits: johnmeneghini (~johnmeneg@pool-100-0-53-181.bstnma.fios.verizon.net) (Quit: Leaving.)
[21:35:10] *** Joins: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97)
[22:07:39] *** Quits: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97) (Ping timeout: 260 seconds)
[22:31:44] jimharris: I ran it once with DPDK 18.05 primary process and didn't get any warnings