[00:56:02] *** Quits: ziyeyang_ (ziyeyang@nat/intel/x-ttkaskwqcoyocgcu) (Quit: Leaving) [01:32:51] *** Joins: tomzawadzki (tzawadzk@nat/intel/x-gtmigccbcuwuczrk) [04:41:08] *** Joins: ziyeyang_ (~ziyeyang@192.55.54.45) [04:41:45] Though vhost-blk is disabled by Jim, SPDK_vhost related test still fails randomly [05:46:21] *** Quits: ziyeyang_ (~ziyeyang@192.55.54.45) (Remote host closed the connection) [07:09:33] *** Quits: tomzawadzki (tzawadzk@nat/intel/x-gtmigccbcuwuczrk) (Ping timeout: 260 seconds) [08:06:33] ziyeyang, looks like it was merged 10 hrs ago and is pretty clearly commented out in autotest.sh - do you have a build that was rebased after the change that still ran it? [08:12:07] turns out the vhost autotest scripts is randomly failing in both vhost-blk and vhost-scsi mode [08:12:14] i'm actively working on it with karol [08:39:25] coolio [09:05:49] jimharris, anything change wrt the dpdk submodule recently? I cloned a fresh repo (and of course init'd the module) on 2 different dev systems and in both cases a make failes with env.c:38:24: fatal error: rte_config.h: No such file or directory [09:06:25] hmmm, is failes the British spelling? :) [09:08:48] bloody submodules [09:11:45] you also ran configure? [09:13:43] yeah, I forgot that part :( [09:14:05] strange thing is, on my main dev system I was working fine last night and this morning it wouldn't build, that's why I went and tried a few other systems with a fresh repo [09:14:30] configure fixed it there too, wonder what I did to mess that up. oh well, thx [09:33:55] FYI I had two +2 on this (hasn't been merged yet) but it got rebased based on the push of a dependent patch, no other changes.... https://review.gerrithub.io/#/c/368420/ [11:03:43] jimharris: I don't totally understand https://review.gerrithub.io/#/c/369470/ - does removing the 'tee' actually fix the problem? [11:06:02] it fixes a different problem than this fio issue [11:06:15] one sec - let me find one of those failures so you can understand [11:06:30] we should probably check in the tee removal as a separate patch [11:06:59] I also don't understand the -x in qemu-src [11:07:29] oh, it's a fiotest/autotest.sh parameter, not part of --qemu-src [11:08:03] anyway, seems unrelated to the tee removal patch as well [11:09:27] hmmm - having trouble finding one of those ATM - but one of the failures was the script was starting vhost in the background (like we do for iscsi and nvmf tests) but doing a bunch of fancy indirection so they could both tee the output and echo $! to a temp pidfile [11:09:51] and sometimes they would try to waitforlisten on the pid in that file and it would report the pidfile did not exist yet [11:11:16] so I suggested get rid of the tee stuff and just use $! directly to get the vhost pid rather than relying on the pidfile mechanism [11:11:24] ok, that seems like a good change [11:31:53] bwalker: i've pushed some changes that get rid of the bdev/vbdev module differentiation [11:32:03] ok will take a look [11:33:29] debating whether to get rid of the bdev/vbdev differentiation itself (not just the module differences) - i.e. everyone calls spdk_bdev_register() and you pass in a NULL array of base bdevs if its a physical bdev (nvme namespace, malloc disk, etc.) [11:34:58] i think consolidating on spdk_bdev_register() is fine - it's a tiny bit more work for the module developer/reader to understand, but i think if you're working on an spdk bdev module you really need to understand these details anyways [11:35:40] I like unifying - maybe a separate call to associate instead of passing it into register? [11:35:47] I have to take a closer look at the API [11:36:22] one problem i see with that is when do you know to pass the bdev to other modules for examination? [11:36:37] yeah true [11:36:44] or maybe you do the association first? [11:36:49] hmmm - that might actually work [11:38:53] yeah - please take a look at the spdk_internal/bdev.h api when you have a sec (from my latest patch) - besides this potential change to spdk_bdev_register() i think it looks in pretty good shape [11:39:31] so then next question is do we get rid of the "vbdev" string everywhere too? i.e. vbdev_gpt.c => bdev_gpt.c? [11:39:42] i feel like that's something we can do later [11:40:09] that doesn't impact public API [11:40:25] so we didn't move to the open/close model - we still have claim/unclaim [11:40:51] we have both [11:41:13] so modules like gpt still do an open to try to read the partition table [11:41:25] if/when they find a gpt and want to create bdevs on top of it, then they do a claim [11:41:51] ok, I need to pull the whole patch series [11:41:55] so I can see both header files [11:42:07] this makes it a lot easier to let a bunch of vbdev modules do their examinations in parallel [11:42:25] (we might get rid of it in the code, but the phrase "vbdev modules" is still very convenient) [11:44:01] i have a patch mostly complete that moves to passing the bdev_desc instead of the bdev to the various i/o api calls [11:44:42] you need a descriptor to "claim" a bdev? [11:45:00] oh - i need to add some documentation on that [11:45:16] no, the purpose there is to upgrade your descriptor to read/write [11:46:13] i.e. for logical volumes, you've already opened the bdev to read the metadata - what happens when you find a logical volume store there and now want claim the bdev? you need your descriptor to be read/write now [11:46:27] yeah, ok [11:46:53] i need to get that patch wrapped up - going to be tight for this release though [11:47:13] it's a rather significant api change (but transparent to bdev modules) [11:47:25] in terms of processing I/Os coming in [11:55:56] I think there are some issues with ordering during module initialization [11:56:04] they're sorted based on whether they have an examine callback [11:56:13] but what if you have an lvol manager inside a GPT partition [11:56:24] there is no guarantee that the lvol stuff runs after the GPT stuff [11:56:32] that's fine [11:56:56] neither lvol nor gpt will start examining until some physical bdev module like nvme registers the first bdev [11:57:15] oh, of course [11:57:19] so by sorting based on examination, we know that gpt and lvol initialization routines have run before they get their first examine callback [12:00:13] I think we should definitely do the bdev_desc change, but your series out so far looks good [12:00:37] this is a big clarity improvement [12:06:57] drv: http://spdk.intel.com/public/spdk/builds/review/74443d452e6683b2ada76b16be4ec625bd8d5368.1499972018/vm-fedora-03/build.log [12:07:03] example of that pidfile issue [12:43:30] bwalker: bdev descriptor API patch pushed [13:51:41] jimharris: opinions on https://review.gerrithub.io/#/c/369470/ ( [13:51:44] test/vhost: remove tee pipe from vhost start command)? [13:52:04] it sneaks the vhost-blk test back in too, which I would prefer to do in a separate step [14:23:34] drv: i'm ok with it [14:51:57] drv: going to see if this gets us anywhere: https://review.gerrithub.io/#/c/369501/ [14:52:32] could be a suitable workaround for now - if it actually works, we'll know more about what the issue really is too (we need to wait longer to start the client most likely) [14:57:14] hmm, yeah