[00:15:40] *** Joins: tomzawadzki (~tomzawadz@192.55.54.42)
[00:58:36] *** Quits: ziyeyang_ (~ziyeyang@192.55.54.42) (Quit: Leaving)
[01:22:34] *** Quits: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97) (Ping timeout: 260 seconds)
[07:56:54] sethhowe, slightly newer output: https://ci.spdk.io/spdk/builds/review/53913c20478dc1eebb7ff780f76288b9cb9a8596.1517409419/fedora-04/build.log
[08:05:38] sethhowe, better yet just ping me when you get in, I have a few questions about perm levels when running autotest.sh.... thanks!
[08:47:23] *** Quits: sethhowe (sethhowe@nat/intel/x-ekcxacdskkbplcfm) (Remote host closed the connection)
[08:47:42] *** Joins: sethhowe (sethhowe@nat/intel/x-qqbflnldfwnpghmq)
[08:51:27] looks like test/vhost/integrity/ is trying to use virt-manager, but it's not installed on fedora-08, and it looks like the permissions aren't set up for virt to run our custom qemu build on fedora-05
[08:51:36] can we tweak this to just run qemu directly like the other tests?
[08:53:24] also, based on some additional timing markers, it looks like copying the SPDK code into the VM for the virtio-pci test takes almost a full minute: https://ci.spdk.io/spdk/builds/review/6003faf4389ce9173e2d39b9244eda5a3fbd93eb.1517356865/fedora-07/timing.svg
[08:53:46] seems like it might be time to go back and optimize that test to just copy in the pre-built binaries we need
[08:58:00] is it copying the .git directory?
[08:58:53] iscsi_tgt/ext4test uses rsync --exclude=".git"
[09:00:11] that would probably help - I think it's copying everything right now
[09:00:36] although I think the build pool is using a shallow clone, so hopefully .git isn't too big
[09:00:45] (if that's not true, we should look into doing that as well to help build time in general)
[09:01:27] I will check to confirm that we are doing a shallow clone in the build pool.
[09:07:26] correct vbdev observation(?): they aren't transparent to the SPDK application; if I add a vbdev to my conf file and my app was previously using the name of the underlying bdev, I need to change that to the name of the vbdev? Or am I missing some logic somewhere?
[09:09:00] your app needs to use the name of the vbdev
[09:10:35] yeah, that's what I was asking. thanks
[09:11:24] other than hotremove my passthru bdev is working now - i just haven't added the hotremove cb fn yet is all...
[09:12:30] darsto_: you around?
[09:12:51] bwalker: y
[09:13:43] for the bdev programmer's guide - I thought ppelplin was writing that. I jumped in and updated it to cover for him since he's got higher-priority stuff with snapshots right now
[09:13:59] but you uploaded a new rev yesterday at essentially the same time as me
[09:14:28] you're free to have it - just making sure that's your intention
[09:15:28] I uploaded my guide just because I saw yours
[09:15:42] Your version is fine - and it's finished
[09:15:53] kind of finished
[09:15:59] it's a bit of a different approach and could probably use some examples
[09:16:11] I'm missing lots of detail on how to write a bdev module
[09:16:20] in particular, vbdevs aren't covered
[09:16:27] and using the bdev_part stuff
[09:17:02] if you were working on it, I don't mind if you keep at it. Take anything you want from mine and just put it into yours
[09:17:26] or if you want me to continue with the version I uploaded, I can do that too. Totally up to you
[09:18:51] heh, I can't wait to read the part on creating a vbdev module!
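For the copy-into-the-VM discussion above, the ext4test-style approach is a plain rsync that skips the repository metadata, and the build pool shallow clone is a one-line git option; the paths and target here are illustrative, not the actual test-pool layout:

    rsync -r --exclude=".git" /path/to/spdk/ root@<vm-ip>:/root/spdk/
    git clone --depth 1 https://github.com/spdk/spdk.git    # shallow clone keeps .git small

Copying only the pre-built binaries the test actually needs would cut the transfer down further, as suggested above.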
[09:21:18] bwalker: go ahead with yours
[09:23:09] I'd actually say it's fine as it is
[09:23:22] the parts that are there are sufficient for writing a basic bdev
[09:23:32] there is just so much more to say on the more advanced topics
[09:23:57] and now that I've written the guide, there's a whole bunch of things I want to change in the code to make it clearer
[09:24:07] ^ that
[09:24:11] that's how it usually goes
[09:24:36] I'll make a list and peluse can implement all of it
[09:24:57] since he's writing a basic bdev example already
[09:26:18] * peluse is all over it...
[09:40:30] bwalker: thanks for taking this over
[09:40:43] it's better for all of us if it's done really well
[10:01:34] jimharris/bwalker: I tagged this patch and the one before it as 18.01; please review: https://review.gerrithub.io/#/c/397573/
[10:04:13] I just realized my version of the bdev guide doesn't even talk about "claiming"
[10:08:11] reviewed and pushed
[10:18:18] bwalker, it's always the little shit :)
[10:30:37] pushed new version of two patches from maciek for review: https://review.gerrithub.io/#/c/397562/
[10:31:51] oops, guess we both did that
[10:39:18] drv: will this autoclose the github issue "Fixes GitHub issue #232." ?
[10:39:28] I believe so
[10:39:35] not sure how the magic issue closing logic works
[10:39:39] afair there has to be nothing after the number
[10:40:01] if not, I can always close it by hand afterwards :)
[10:40:21] there are a couple of keywords to use: resolves/resolved/fixes/fixed
[10:40:46] if you use any of these followed by a #number it will autoclose the issue
[10:40:56] so the '.' will cause it to not work?
[10:41:07] afair
[10:41:10] yep
[10:42:04] if you say "Fixes github issue #232" it won't close the issue either
[10:42:19] it has to be e.g. "Fixes #232"
[10:42:51] hmm, I see
[10:43:02] not sure if I really like that - if we ever switch issue trackers, it would be confusing
[10:58:40] jimharris: fix for the JSON array decoding GitHub issue is up for review: https://review.gerrithub.io/#/c/397591/
[11:01:43] also looks like the 1 GB -> 4 MB cluster size change breaks the lvol tests for some reason (maybe an lvol create is hardcoding a number of clusters somewhere?)
[11:05:05] *** Quits: tomzawadzki (~tomzawadz@192.55.54.42) (Ping timeout: 240 seconds)
[11:16:19] which failure are you looking at, drv? i see where a construct_lvol_store fails with an EEXIST error
[11:16:35] yes, that seems like it's probably leftover state from a previous test
[11:16:52] maybe a previous create was supposed to fail, but now succeeds due to the cluster size change?
[11:18:20] hmmm - yeah, could be
[11:54:28] drv, wrt the PG reference name, I'll go change to just use the old blob name to avoid having to find/touch other places...
[11:57:48] I just need to change the .md filename and the Doxyfile and index references though, right (and not all the tags inside blob.md that say blob_pg)?
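On the auto-close question earlier: GitHub recognizes the closing keywords close/closes/closed, fix/fixes/fixed, and resolve/resolves/resolved, and the keyword has to be followed directly by the issue reference, e.g. a commit message line such as:

    Fixes #232

Putting words in between, as in "Fixes GitHub issue #232", is not treated as a closing reference.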
[12:00:45] well, I gotta run for a few, so if the internal ones need to be changed also, feel free to do so; or, assuming it can wait, I'll do it this afternoon
[12:00:54] but I pushed the other changes just now
[12:33:35] peluse: the #blob_pg name would have to be changed to #blob if you want the existing references to work
[12:33:49] the other ones can probably stay #blob_pg_whatever if you prefer
[12:33:59] but might as well be consistent
[12:34:37] the actual .md filename doesn't really matter (the first # name determines the output HTML filename)
[12:48:32] i think i've fixed the 4MiB cluster issue: https://review.gerrithub.io/#/c/397562/
[12:49:07] you were right - the failure is because of leftover state from the test literally just before it
[12:49:13] looks like there's a tab that snuck into the .py file
[12:49:17] crap
[12:50:10] these test cases could definitely use some better comments
[12:50:24] yes
[12:50:27] I think most of them have descriptions over in the test plan, but those should ideally be in the test script itself as comments
[12:51:51] this one actually doesn't match up with the description in test/lvol/test_plan.md for test case 601
[12:52:26] assuming this passes the test pool, i'll shoot a note out to take a look at it
[12:52:33] oh, I see, the cluster size argument in the construct_lvol_store is 0
[12:52:35] i just kept that test doing what it did before
[12:52:45] yeah - use the default
[12:52:47] so it does match the description, I just read it wrong
[12:53:34] it needs to be looked at - that construct_lvol_store is supposed to fail - but with this 4MiB patch, the construct_lvol_store works but the test script still marks the test case as a PASS
[12:53:42] yeah, I'm not sure it's testing what it intends to test
[12:54:10] since the construct_lvol_store cluster_sz parameter is optional, 0 is just interpreted as "use default" rather than as an invalid case
[12:54:55] as long as it passes the test pool, I'm happy with checking your fix in and investigating it separately
[13:05:18] looks like it will pass this time
[13:07:09] the next patch in the series needs a small fix
[13:07:13] *** Joins: lhodev (~Adium@inet-hqmc05-o.oracle.com)
[13:16:19] bah - yeah, just noticed that
[13:17:00] we already use %d for cluster_sz in a bunch of other places
[13:17:13] I'm not saying it's right - I'm just saying we can fix this later :)
[13:18:15] not a bunch of places - 2 other places in this file - I'll add a separate patch that fixes those all up
[13:23:12] hmm, ok
[13:24:45] never mind - I updated the patch
[13:25:54] I think we can (theoretically, at least) configure a 2 GiB cluster_sz, which would print wrong with %d
[13:26:15] bwalker: can you look at the first two patches starting at https://review.gerrithub.io/#/c/397562/
[13:26:36] *** Parts: lhodev (~Adium@inet-hqmc05-o.oracle.com) ()
[13:28:23] k
[13:39:46] i rebased the original lvol test patch - that should be in good shape now
[13:40:24] awesome
[13:40:29] what else do we need for the release?
[13:40:33] speak now or forever hold your peace :)
[13:57:17] on the status page, what do folks think about putting the Pending Approval list above the Build Pool box?
[13:57:35] fine with me
[13:57:40] sethhowe
[13:58:57] sure, that should be doable with just a change to the HTML template
[13:59:30] yeah - I figured it would be super easy, just wasn't sure if everyone agreed that the Pending Approval list should be at the top
[14:00:15] push the lvol test patch?
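To illustrate the %d concern raised above, here is a minimal sketch, assuming cluster_sz is a 32-bit unsigned field as in the blobstore options; the exact call sites in the code differ:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t cluster_sz = 2048u * 1024 * 1024;  /* a theoretical 2 GiB cluster size */

        printf("%d\n", cluster_sz);           /* wraps: typically prints -2147483648 */
        printf("%" PRIu32 "\n", cluster_sz);  /* prints 2147483648 */
        return 0;
    }

Switching the existing %d format specifiers to PRIu32 (or %u) avoids the wraparound for large cluster sizes.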
[14:01:06] if it passed, go for it
[14:53:47] jimharris: After I added TRACE_READ_FROM_SOCKET_START for tracking the socket read start TS, it looks like TRACE_READ_FROM_SOCKET_START gets called after TRACE_READ_FROM_SOCKET_DONE in the output trace log. TRACE_READ_FROM_SOCKET_START was added in the lib/iscsi/iscsi.c file, in the spdk_iscsi_read_pdu function.
[15:02:36] oh - i forgot that READ_FROM_SOCKET_DONE is done on *each* successful read of data from the socket
[15:02:47] ok - you'll need to move the SOCKET_START call into spdk_iscsi_conn_read_data
[15:03:04] oh wait
[15:04:03] hmmm - spdk_trace_record() isn't going to handle that very well - currently spdk_trace_record() has no way to pass a TSC, it just uses the TSC of when you call it
[15:06:03] you'll want to change spdk_trace_record() to an internal function (i.e. _spdk_trace_record()) that takes an extra tsc parameter
[15:06:28] change spdk_trace_record() to just a single call that passes spdk_get_ticks() as the tsc parameter
[15:06:48] and another version called something like spdk_trace_record_tsc() that allows the caller to pass a specific tsc value
[15:06:59] Right OK
[15:07:16] then in spdk_iscsi_conn_read_data, use this new tsc variant, but only call it when spdk_sock_recv() returned data
[15:07:36] does that make sense?
[15:08:01] Yep let me try it
[15:29:18] wrt status page changes, keep in mind that the implementation will change in the move to Jenkins, so sethhowe, be sure to keep the folks working on that activity up to date so we don't move to the new system and lose these little tweaks...
[15:29:55] (but we'll be running them in parallel for a while, so we'll have plenty of time to catch them, still though)
[16:16:47] v18.01 has been tagged: https://github.com/spdk/spdk/releases/tag/v18.01
[16:17:30] *** Joins: Shuhei (caf6fc61@gateway/web/freenode/ip.202.246.252.97)
[16:52:52] nice
[18:11:06] *** Quits: guerby (~guerby@april/board/guerby) (Remote host closed the connection)
[18:11:19] *** Joins: guerby (~guerby@ip165.tetaneutral.net)
[18:11:19] *** Quits: guerby (~guerby@ip165.tetaneutral.net) (Changing host)
[18:11:19] *** Joins: guerby (~guerby@april/board/guerby)
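A rough sketch of the trace-record split described above; the parameter list is modeled on the spdk_trace_record() prototype of this era and should be read as illustrative, not as the exact lib/trace code:

    /* Internal variant that records an explicit timestamp. */
    static void
    _spdk_trace_record(uint64_t tsc, uint16_t tpoint_id, uint16_t poller_id,
                       uint32_t size, uint64_t object_id, uint64_t arg1)
    {
            /* ...fill in the trace entry as before, but use the tsc passed in... */
    }

    /* Existing API keeps its behavior: timestamp taken at the call site. */
    void
    spdk_trace_record(uint16_t tpoint_id, uint16_t poller_id, uint32_t size,
                      uint64_t object_id, uint64_t arg1)
    {
            _spdk_trace_record(spdk_get_ticks(), tpoint_id, poller_id, size,
                               object_id, arg1);
    }

    /* New variant for callers that captured the timestamp earlier. */
    void
    spdk_trace_record_tsc(uint64_t tsc, uint16_t tpoint_id, uint16_t poller_id,
                          uint32_t size, uint64_t object_id, uint64_t arg1)
    {
            _spdk_trace_record(tsc, tpoint_id, poller_id, size, object_id, arg1);
    }

spdk_iscsi_conn_read_data() would then grab the tsc before calling spdk_sock_recv() and pass it to spdk_trace_record_tsc() only when data was actually returned.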