[00:41:37] *** Quits: ziyeyang_ (~ziyeyang@134.134.139.74) (Quit: Leaving)
[05:41:23] *** Quits: jkkariu (~jkkariu@192.55.54.44) (Ping timeout: 246 seconds)
[05:42:23] *** Joins: jkkariu (~jkkariu@192.55.54.44)
[06:18:48] <peluse> yeah, I'm about to rebase a chain of 6 or 7 patches and I'll bet half of them fail because of this. Has anyone started looking into this one?
[06:25:10] <peluse> bwalker, if you're out there I think I just managed to work on a chain of patches making changes in the middle/rebasing dependencies, etc., using the raw git method you showed me. Fingers crossed I didn't jack it all up :)  Thanks!
[06:50:42] <peluse> 1/5 so far hit the rnd failure above... 4 still running (well, 5 after resubmitting that bogus failure)
[07:29:07] <peluse> OK, only that one.  Had another failure too that I've seen before that I don't *think* is me.  Its a scan-build failure but the warnings aren't on changes I've made and the scan-build output isn't in the dir that the build.log says it should be.  I'm going to resubmit but Seth maybe you can help me out with where the output is supposed to be for a CI scan-build run?  https://gist.github.com/peluse/33abbd05108905700b4cde2ea5999967
[07:30:11] <peluse> ps: scan-build didn't fail for me locally... re-running that before resubmission to make sure as its caught a few issues in my UT additions so far...
[07:38:11] <peluse> oh never, mind.  its mine but oddly enough I have to look at the report (locally) to see it, the warnings in the build.log output don't include it. Also, have to take off for a bit but...
[07:38:37] <peluse> bah never mind, will look at it closer later
[08:30:38] <bwalker> jimharris: great choice of 'examine' instead of 'taste'
[08:30:56] <bwalker> Now I have some more radical thoughts
[08:31:09] <bwalker> 1) Do we even need vbdevs as a separate thing from bdevs?
[08:31:49] <bwalker> with the examination mechanism, the order of initialization doesn't actually matter much. The vbdevs are created in response to that examine callback.
[08:32:18] <bwalker> 2) Does module initialization still need to be asynchronous, or does only the examine step need to be async?
[08:37:45] <bwalker> it would be nice if we could be super clear in the API about "bdev modules" and "bdevs", and have those be the only two concepts
[08:38:08] <bwalker> I think today there is some calls that look like they're on bdevs but are really on the module.
[08:38:23] <bwalker>  /s/is/are
[08:56:01] <drv> jimharris: it looks like the first lvol patch adds blob as a bdev module dependency, which breaks the RocksDB build, since it links spdk_blob twice
[09:39:52] <jimharris> bwalker: not radical thoughts at all
[09:40:25] <jimharris> for #1, yes, I think we could consolidate now on a single bdev interface - I'd suggest we keep this patch set as-is and then I can take a look at that
[09:40:56] <jimharris> for #2, module initialization does not need to be asynchronous anymore - that was on my todo list after this patch set gets merged
[10:02:49] <peluse> heh, I'm always up for making a difference ;)
[10:05:21] <jimharris> peluse: looking at your unit test patches - something we'll need for our asynchronous callback driven modules is a way for a stub to save the value of some argument so we can evaluate it later
[10:06:21] <peluse> OK, so I am stuck with a silly UT case where I have to alloc memory and call the code under test which frees it based on one of the pass parms. If I free it again in the test code, valgrind pukes.  if I don't, the clang static-anal pukes. Any suggestions?.
[10:06:57] <jimharris> for example, a very common unit test paradigm with blobstore (and blobfs) is to call a blobstore function and pass it a function pointer to a test stub
[10:07:08] <jimharris> blobstore will call that function pointer with a parameter denoting pass/fail
[10:07:29] <jimharris> the blobstore function itself, and the function pointer both return void
[10:07:32] <peluse> jimharris, can you point me to an example function?
[10:07:50] <jimharris> pretty much anything in the lib/blob unit tests
[10:08:30] <peluse> jimharris, OK, I'll look at one shortly...
[10:08:49] <jimharris> yeah - doesn't need to be now, but will need it eventually
[10:09:06] <jimharris> which function specifically are you hitting this UT case issue?
[10:09:14] <peluse> on the malloc thing I just posted a Q about, I think if I mock the spdk_free then I'll be OK
[10:09:31] <peluse> test case is test_nvme_user_copy_cmd_complete()
[10:11:37] <peluse> testing now...
[10:14:25] <peluse> yay mocks! that solved it, I alloc/free all in the UT code now and everyone is happy :)
[10:20:11] <drv> hmm, bdev_ut doesn't build if VTune integration is enabled... I wonder how this has been tested
[10:20:59] <jimharris> drv: your gpt partition name patches need to be rebased
[10:21:09] <drv> ok
[13:50:21] <drv> jimharris: this patch and the one after it fix a bug in the new async JSON-RPC handling: https://review.gerrithub.io/#/c/369292/
[13:50:36] <drv> it's hit a couple of times in the test pool already (iscsi test hangs while calling an RPC)
[14:42:20] <peluse> so FYI on the UT patch chain, I was doing one step wrong in rebasing the top for changes early in the chain so they're not up to date with earlier patches and I'm just gonna screw it up even more trying to get it right so I'll rebase each upstream patch from master as earlier ones land.  ugh
[15:25:13] <peluse> is the only thing I have to do to use ASAN is enable it in CONFIG.local and if so how can I tell that its doing tis thing?
[15:28:54] <drv> peluse: you can just do ./configure --enable-asan (along with any other options that you need)
[15:30:15] <peluse> OK, but that will put it in CONFIG.local right? Updating unittest.sh per bwalker's input that if ASAN is being used locally that unittest.sh should not use valgrind
[15:30:39] <peluse> so really the question is, if I just look in CONFIG.local is that good enough?
[15:31:33] *** Joins: sethhowe_ (~sethhowe@192.55.54.42)
[15:32:39] <drv> yes
[15:32:39] <drv> yep, it should work just like the CONFIG_COVERAGE test
[15:32:52] *** Quits: sethhowe (~sethhowe@192.55.54.42) (Remote host closed the connection)
[15:33:11] <peluse> K, thanks
[15:36:13] <peluse> drv, hmm... the CONFIG_COVERAGE check in unittest.sh is looking at config.h - I was about ready to push a check in CONFIG.local for ASAN. Given that our readme tells people to edit CONFIG.local directly I think we should be checking CONFIG.local for both options?
[15:37:19] <drv> the top-level Makefile regenerates config.h from CONFIG.local, so they should always be in sync
[15:37:35] <peluse> ahh, OK
[15:37:50] <drv> and the user can override options on the make command line, so config.h is the right place
[15:38:10] <drv> (e.g. make CONFIG_ASAN=y won't write CONFIG.local but it will update config.h)
[15:41:03] <peluse> drv, OK, I'll update it real quick to use that instead, thanks
[15:44:26] <peluse> jimharris, see my note on the * mock thing, I think I can make it a lot clearer.  thanks for the feedback :)
[15:45:26] <jimharris> cool
[16:33:32] <peluse> jimharris, OK see if that's clearer. It only shows the case being used so far, setting the mock return to a * value. The other global is still in the stub defn so can be used to mock something other than NULL return so will show how to do that when we run into a case
[16:33:45] <peluse> but I think this is a lot easier to see what's going on...
[16:43:13] <peluse> drv, thanks for all the reviews BTW and putting up with my shit.. can't believe how much 2 yrs of Python and a year of powershell have made me so careless with details...
[16:44:58] <jimharris> peluse: oh wow - yes, that is much clearer - my feeble brain caught on to this much more quickly :)
[16:46:49] <peluse> jimharris, sheeeeit :)
[16:47:05] <drv> peluse: haha, no problem - we'll have the Python out of your head in no time
[16:47:31] <drv> jimharris: do you know if there's any reason we don't test iSCSI with digests enabled? (or are we intending to and the code coverage is wrong?)
[16:47:31] <peluse> yeah, I'm getting too old to remember more than one thing at a time.  C til the end!
[16:48:14] <jimharris> no - we should add some tests that enable it
[16:48:34] <jimharris> there are no tests currently that try to enable it to my knowledge
[16:48:41] <jimharris> well - maybe some calsoft tests?
[16:48:50] <drv> I thought calsoft did, but it doesn't show up in coverage
[16:49:00] <drv> e.g. http://spdk.intel.com/public/spdk/builds/review/ec4d9d17a1181d4cdb0ff072e50d070dcd6b9c8f.1499899656/coverage/lib/iscsi/iscsi.c.gcov.html
[16:49:16] <drv> we never go into the header_digest or data_digest blocks
[16:49:42] <jimharris> maybe try to enable digests for the calsoft tests?
[16:50:04] <jimharris> well...maybe not - not sure how well the calsoft tests themselves have been validated for digests
[16:50:09] <drv> well, I'll put it on the backlog at least :)
[16:50:36] <drv> I was just poking at the CRC-32 stuff and got sucked in
[16:52:31] <drv> I think our code that sends data with digests enabled is making some unfounded assumptions
[16:52:49] * peluse makes unfounded assumptions daily
[16:52:58] <drv> spdk_iscsi_build_iovecs() just rounds up the data buffer len with ISCSI_ALIGN, but that will presumably read off the end of the buffer if it isn't already a multiple of 4
[16:53:18] <drv> (in real life, our data buffers are almost certainly always multiples of 4 anyway, so it may not matter in practice)
[16:54:04] <drv> if we wanted to do this correctly, we'd probably have to add another iovec entry for padding after the data, then take that into account when calculating the digest
[16:54:45] <jimharris> is that a digest-specific problem?
[16:54:55] <jimharris> sounds like an assert to me
[16:54:56] <drv> we have a helper function, spdk_fixup_crc32c(), which tries to handle the padding without overreading, but we always round up the length before passing it to fixup
[16:55:14] <drv> yeah, I guess the actual data buffer overread is not digest specific
[16:55:59] <drv> hmm, I wonder if this path gets hit for things other than block data - I guess INQUIRY results and stuff like that are still "data", so they may not be a multiple of 4?
[16:56:48] <jimharris> true
[16:57:15] <drv> it shouldn't be very hard to just add another iovec entry, since we have this nice iovec stuff already set up
[16:57:24] <jimharris> for iSCSI, the data segments are always 4-byte multiples
[16:58:02] <drv> the iSCSI RFC says DataSegmentLength is length in bytes excluding padding, so it sounds like there could be cases where it's not already a multiple of 4?
[16:58:07] <jimharris> other information in the pdu will tell it if the valid data does not consume all of the bytes
[16:59:15] <jimharris> yeah - data_segment_length could be 18
[16:59:36] <jimharris> so we have to read 20 bytes off of the wire, but only 18 of them are valid
[16:59:43] <drv> yeah, the receive path is fine, I think
[17:00:05] <drv> but the send path is just taking pdu->buf and potentially reading the rounded-up size
[17:00:57] <drv> pdu->data, not pdu->buf
[17:03:18] <drv> plus I don't see anywhere that we zero out the padding bytes on send, although I may be missing it
[17:08:38] <drv> I think I found a concrete example
[17:08:56] <drv> spdk_iscsi_task_response() sets pdu->data to the sense buffer, which is typically 18 bytes
[17:09:20] <drv> it
[17:09:37] <drv> it's possible that we end up with a zeroed-out sense buffer by accident here, but that seems pretty fragile
[17:41:04] <drv> I think our AHS handling is currently broken, too - we have pdu->ahs and pdu->ahs_data, but we never set pdu->ahs to point at ahs_data
[18:22:48] *** Joins: ziyeyang_ (ziyeyang@nat/intel/x-ttkaskwqcoyocgcu)