libceph: delete from need_resend_linger before check_linger_pool_dne()
author Ilya Dryomov <idryomov@gmail.com>
Thu, 15 Jun 2017 14:30:55 +0000 (16:30 +0200)
committer Ilya Dryomov <idryomov@gmail.com>
Fri, 7 Jul 2017 15:25:16 +0000 (17:25 +0200)
When processing a map update consisting of multiple incrementals, we
may end up running check_linger_pool_dne() on a lingering request that
was previously added to the need_resend_linger list.  If it is concluded
that the target pool doesn't exist, the request is killed off while
still on the need_resend_linger list, which leads to a crash on a NULL
lreq->osd in kick_requests():

    libceph: linger_id 18446462598732840961 pool does not exist
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: ceph_osdc_handle_map+0x4ae/0x870

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 518dbac..576101b 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -3243,6 +3243,7 @@ static void scan_requests(struct ceph_osd *osd,
                                list_add_tail(&lreq->scan_item, need_resend_linger);
                        break;
                case CALC_TARGET_POOL_DNE:
+                       list_del_init(&lreq->scan_item);
                        check_linger_pool_dne(lreq);
                        break;
                }
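
The effect of the added list_del_init() can be illustrated outside the kernel with a minimal userspace sketch. Everything here is an assumption made for illustration: the list helpers are simplified stand-ins for the kernel's <linux/list.h>, struct linger_req is a hypothetical reduction of the real lingering request, and setting osd to NULL stands in for check_linger_pool_dne() killing the request. It is not the libceph code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static void list_del_init(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	INIT_LIST_HEAD(item);
}

/* Hypothetical, heavily reduced model of a lingering request. */
struct linger_req {
	struct list_head scan_item;
	void *osd;	/* NULL once the request has been killed off */
};

/*
 * Model a map update made of two incrementals.  Returns the number of
 * killed requests (osd == NULL) still queued for resend afterwards.
 */
static int stale_after_map_update(int apply_fix)
{
	struct list_head need_resend_linger;
	struct linger_req lreq;
	struct list_head *pos;
	int dummy_osd = 0, stale = 0;

	INIT_LIST_HEAD(&need_resend_linger);
	INIT_LIST_HEAD(&lreq.scan_item);
	lreq.osd = &dummy_osd;

	/* Incremental 1: the request needs a resend and is queued. */
	list_add_tail(&lreq.scan_item, &need_resend_linger);
	assert(!list_empty(&need_resend_linger));

	/* Incremental 2: the target pool is found not to exist. */
	if (apply_fix)
		list_del_init(&lreq.scan_item);
	lreq.osd = NULL;	/* stand-in for check_linger_pool_dne() */

	/* Stand-in for kick_requests(): count entries it would dereference. */
	for (pos = need_resend_linger.next; pos != &need_resend_linger;
	     pos = pos->next) {
		struct linger_req *r = (struct linger_req *)
			((char *)pos - offsetof(struct linger_req, scan_item));
		if (r->osd == NULL)
			stale++;
	}
	return stale;
}
```

In this model, stale_after_map_update(0) returns 1, since the killed request is still on the list that the resend pass walks, while stale_after_map_update(1) returns 0 because the request was unlinked first.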