TI AM335x: Thread hangs after calling QMutex::lock()

December 16, 2016

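The program's scheduling policy is set to SCHED_RR. The situation is this: one thread receives data arriving over a socket and normally blocks in accept(). When a request comes in, the thread accepts it and then calls QMutex::lock() to acquire a shared resource. Run directly from a console, this code works without any problem. When the program is started by a script in init.d, however, the thread gets stuck inside QMutex::lock().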
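Since the behaviour differs between the two launch paths, it may be worth confirming which policy and priority the thread really ends up with in each case. The following is only a sketch of such a check: `policyName` and `currentSchedInfo` are hypothetical helper names that are not part of the original program, and the code assumes the Linux behaviour that a pid of 0 refers to the caller.

```cpp
#include <sched.h>
#include <string>

// Hypothetical helper: map a scheduling policy constant to a readable name.
std::string policyName(int policy) {
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

// Hypothetical helper: query the policy and priority of the caller.
// Returns true on success, false if either query fails.
bool currentSchedInfo(int &policy, int &priority) {
    policy = sched_getscheduler(0);        // 0 means "the caller"
    if (policy < 0)
        return false;
    sched_param param{};
    if (sched_getparam(0, &param) != 0)
        return false;
    priority = param.sched_priority;
    return true;
}
```

Logging `policyName(policy)` and `priority` from the receiving thread, once when started from a console and once when started from init.d, would show whether the two environments differ.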

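From experience, this looked like yet another case of a thread being scheduled out and never coming back. In such cases I usually first try adding a 0-second sleep to shift the timing slightly and see whether that helps. Sure enough, with that change the hang no longer occurred. But why? To see in more detail what was going on, I turned to ftrace.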

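First I let the program start from init.d. The trace showed that the last syscall the thread entered was futex (the sys_enter_futex event), after which the scheduler switched it out and ran other threads. So my guess was right: the thread called futex and then got stuck.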

Next I ran the program from init.d again, but with a 0-second sleep just before QMutex::lock(). The trace-cmd output showed that sys_enter_futex was never hit! The thread happily went on to do its work after QMutex::lock().
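The experiment amounts to something like the sketch below. It is not the original code: `SharedResource` and `handleRequest` are hypothetical names, `std::mutex` stands in for QMutex so that the sketch compiles without Qt, and it assumes that "sleep 0s" means a zero-second POSIX `sleep()`.

```cpp
#include <mutex>
#include <unistd.h>

// Hypothetical shared resource; std::mutex stands in for QMutex here.
struct SharedResource {
    std::mutex mutex;
    int requestsHandled = 0;
};

// Hypothetical request handler showing where the workaround sits.
void handleRequest(SharedResource &res) {
    sleep(0);  // assumption: "sleep 0s" is a zero-second POSIX sleep
    std::lock_guard<std::mutex> guard(res.mutex);  // lock is taken only after the sleep
    ++res.requestsHandled;                         // work done while holding the lock
}
```

The only difference from the failing version is the `sleep(0)` line, which changes the moment at which the thread reaches the lock.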

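At first I was puzzled, because this was not what I had expected. So I dug out the Qt source code and, after a good deal of head-scratching, finally got some idea of why.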

First, QMutex is implemented in

qtbase\include\QtCore\qmutex.h

Let's look at its lock() implementation:

void QMutex::lock() QT_MUTEX_LOCK_NOEXCEPT
{
    QMutexData *current;
    if (fastTryLock(current))
        return;
    if (QT_PREPEND_NAMESPACE(isRecursive)(current))
        static_cast<QRecursiveMutexPrivate *>(current)->lock(-1);
    else
        lockInternal();
}

The first thing called is fastTryLock, an inline function:

inline bool fastTryLock(QMutexData *&current) Q_DECL_NOTHROW {
    return d_ptr.testAndSetAcquire(0, dummyLocked(), current);
}

This d_ptr is:

QBasicAtomicPointer<QMutexData> d_ptr;

OK, it is one of Qt's atomic pointers. fastTryLock in turn calls testAndSetAcquire, and there is also dummyLocked():

bool testAndSetAcquire(Type expectedValue, Type newValue, Type &currentValue) Q_DECL_NOTHROW
{ return Ops::testAndSetAcquire(_q_value, expectedValue, newValue, &currentValue); }

static inline QMutexData *dummyLocked() {
    return reinterpret_cast<QMutexData *>(quintptr(1));
}

Hmm, my head was starting to spin. Then, in

qtbase\src\corelib\thread\qmutex_linux.cpp

I happened to come across the following comment:

/*
 * QBasicMutex implementation on Linux with futexes
 *
 * QBasicMutex contains one pointer value, which can contain one of four
 * different values:
 *    0x0       unlocked, non-recursive mutex
 *    0x1       locked non-recursive mutex, no waiters
 *    0x3       locked non-recursive mutex, at least one waiter
 *   > 0x3      recursive mutex, points to a QMutexPrivate object
 *
 * LOCKING (non-recursive):
 *
 * A non-recursive mutex starts in the 0x0 state, indicating that it's
 * unlocked. When the first thread attempts to lock it, it will perform a
 * testAndSetAcquire from 0x0 to 0x1. If that succeeds, the caller concludes
 * that it successfully locked the mutex. That happens in fastTryLock().
 *
 * If that testAndSetAcquire fails, QBasicMutex::lockInternal is called.
 *
 * lockInternal will examine the value of the pointer. Otherwise, it will use
 * futexes to sleep and wait for another thread to unlock. To do that, it needs
 * to set a pointer value of 0x3, which indicates that thread is waiting. It
 * does that by a simple fetchAndStoreAcquire operation.
 *
 * If the pointer value was 0x0, it means we succeeded in acquiring the mutex.
 * For other values, it will then call FUTEX_WAIT and with an expected value of
 * 0x3.
 *
 * If the pointer value changed before futex(2) managed to sleep, it will
 * return -1 / EWOULDBLOCK, in which case we have to start over. And even if we
 * are woken up directly by a FUTEX_WAKE, we need to acquire the mutex, so we
 * start over again.
 *
 * UNLOCKING (non-recursive):
 *
 * To unlock, we need to set a value of 0x0 to indicate it's unlocked. The
 * first attempt is a testAndSetRelease operation from 0x1 to 0x0. If that
 * succeeds, we're done.
 *
 * If it fails, unlockInternal() is called. The only possibility is that the
 * mutex value was 0x3, which indicates some other thread is waiting or was
 * waiting in the past. We then set the mutex to 0x0 and perform a FUTEX_WAKE.
 */

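The key part is the LOCKING (non-recursive) section. It says that a non-recursive mutex (like the one we use) first uses testAndSetAcquire to check whether the value is 0; if so, it sets the value to 1, which means the lock has been acquired. If the value is not 0 to begin with, someone else already holds the lock, so lockInternal() is called next.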
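The fast path described in that comment can be sketched with `std::atomic` in place of Qt's QBasicAtomicPointer. This is an illustration only, not Qt's code: `TinyMutexState`, `kUnlocked`, `kLocked` and `fastTryUnlock` are names made up for this sketch, and only the 0x0 and 0x1 states from the comment are modelled.

```cpp
#include <atomic>
#include <cstdint>

// Sentinel values taken from the Qt comment: 0x0 = unlocked, 0x1 = locked.
constexpr std::uintptr_t kUnlocked = 0x0;
constexpr std::uintptr_t kLocked   = 0x1;

// Hypothetical stand-in for the mutex's d_ptr word.
struct TinyMutexState {
    std::atomic<std::uintptr_t> word{kUnlocked};
};

// Analogue of fastTryLock(): a single compare-and-swap from 0x0 to 0x1
// with acquire ordering. On failure, `current` receives the value found.
bool fastTryLock(TinyMutexState &m, std::uintptr_t &current) {
    current = kUnlocked;
    return m.word.compare_exchange_strong(current, kLocked,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
}

// Analogue of the unlock fast path: compare-and-swap from 0x1 back to 0x0
// with release ordering. Fails if the word holds anything other than 0x1.
bool fastTryUnlock(TinyMutexState &m) {
    std::uintptr_t expected = kLocked;
    return m.word.compare_exchange_strong(expected, kUnlocked,
                                          std::memory_order_release,
                                          std::memory_order_relaxed);
}
```

A first call to `fastTryLock` on a fresh state succeeds without any syscall; a second call fails and reports 0x1 in `current`, which is the point at which Qt would fall through to lockInternal().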

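This gives a rough picture: internally, a Qt lock is really an operation on an atomic pointer. If the value is 0, nobody holds the lock, so Qt sets it to 1 to mark it as taken. Because the operation is atomic, it is guaranteed to be thread-safe. Only when the lock is already held does Qt go on to call lockInternal(). That function is implemented as: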

void QBasicMutex::lockInternal() Q_DECL_NOTHROW
{
    Q_ASSERT(!isRecursive());
    lockInternal_helper<false>(d_ptr);
}

template <bool IsTimed> static inline
bool lockInternal_helper(QBasicAtomicPointer<QMutexData> &d_ptr, int timeout = -1, QElapsedTimer *elapsedTimer = 0) Q_DECL_NOTHROW
{
    if (!IsTimed)
        timeout = -1;
    // we're here because fastTryLock() has just failed
    if (timeout == 0)
        return false;
    struct timespec ts, *pts = 0;
    if (IsTimed && timeout > 0) {
        ts.tv_sec = timeout / 1000;
        ts.tv_nsec = (timeout % 1000) * 1000 * 1000;
    }

    // the mutex is locked already, set a bit indicating we're waiting
    while (d_ptr.fetchAndStoreAcquire(dummyFutexValue()) != 0) {
        if (IsTimed && pts == &ts) {
            // recalculate the timeout
            qint64 xtimeout = qint64(timeout) * 1000 * 1000;
            xtimeout -= elapsedTimer->nsecsElapsed();
            if (xtimeout <= 0) {
                // timer expired after we returned
                return false;
            }
            ts.tv_sec = xtimeout / Q_INT64_C(1000) / 1000 / 1000;
            ts.tv_nsec = xtimeout % (Q_INT64_C(1000) * 1000 * 1000);
        }
        if (IsTimed && timeout > 0)
            pts = &ts;
        // successfully set the waiting bit, now sleep
        int r = _q_futex(&d_ptr, FUTEX_WAIT, quintptr(dummyFutexValue()), pts);
        if (IsTimed && r != 0 && errno == ETIMEDOUT)
            return false;
        // we got woken up, so try to acquire the mutex
        // note we must set to dummyFutexValue because there could be other threads
        // also waiting
    }

    Q_ASSERT(d_ptr.load());
    return true;
}

Good, this function finally shows us the key piece: _q_futex. Its body is:

static inline int _q_futex(void *addr, int op, int val, const struct timespec *timeout) Q_DECL_NOTHROW
{
    volatile int *int_addr = reinterpret_cast<volatile int *>(addr);
#if Q_BYTE_ORDER == Q_BIG_ENDIAN && QT_POINTER_SIZE == 8
    int_addr++; //We want a pointer to the 32 least significant bit of QMutex::d
#endif
    int *addr2 = 0;
    int val2 = 0;
    // we use __NR_futex because some libcs (like Android's bionic) don't
    // provide SYS_futex etc.
    return syscall(__NR_futex, int_addr, op | futexFlags(), val, timeout, addr2, val2);
}

Bingo! lockInternal() does indeed end up in the futex syscall, which is what shows up in the trace as sys_enter_futex.
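As a side note on the `int_addr++` line in _q_futex: a futex operates on a 32-bit word, while the mutex state lives in a pointer-sized value, so on a big-endian 64-bit build the low 32 bits sit in the second half. The sketch below illustrates that address adjustment with made-up helper names (`isLittleEndian`, `lowWordOffset`, `futexWordOf`) and assumes only little- and big-endian layouts matter. On a 32-bit little-endian target, which is the usual configuration for the AM335x, the offset is simply 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Runtime endianness probe: true if the lowest-addressed byte holds the
// least significant bits.
bool isLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical helper mirroring the int_addr adjustment: byte offset of the
// least significant 32 bits inside a pointer-sized word.
std::size_t lowWordOffset() {
    if (sizeof(std::uintptr_t) == 8 && !isLittleEndian())
        return 4;   // big-endian, 64-bit pointers: low half comes second
    return 0;       // little-endian, or 32-bit pointers
}

// Read the 32-bit value that a futex call on this word would compare against.
std::uint32_t futexWordOf(const std::uintptr_t &word) {
    std::uint32_t low = 0;
    const unsigned char *base = reinterpret_cast<const unsigned char *>(&word);
    std::memcpy(&low, base + lowWordOffset(), sizeof low);
    return low;
}
```

With this adjustment, a state of 0x3 stored in the pointer-sized word is seen as 3 by the 32-bit comparison regardless of byte order.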

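With this in mind, we can go back and explain why the same QMutex::lock() call does not always show sys_enter_futex in the trace-cmd output: if the QMutex is not currently held, there is no need to make the futex syscall at all!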
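To make that concrete, here is a simplified, Linux-only mutex that follows the three states from the Qt comment (0 = unlocked, 1 = locked, 3 = locked with possible waiters) and counts how often it actually issues FUTEX_WAIT. It is a sketch under those assumptions rather than Qt's implementation: the class `FutexMutex`, its counters and the two demo functions are all invented for illustration, and timeouts and recursive mutexes are left out.

```cpp
#include <atomic>
#include <cstdint>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <thread>
#include <unistd.h>

class FutexMutex {
public:
    void lock() {
        std::uint32_t expected = 0;
        // Fast path: 0 -> 1 entirely in user space, no syscall.
        if (word_.compare_exchange_strong(expected, 1, std::memory_order_acquire))
            return;
        // Slow path: mark the word as contended (3); sleep while it stays 3.
        while (word_.exchange(3, std::memory_order_acquire) != 0) {
            ++waitCalls_;
            syscall(SYS_futex, address(), FUTEX_WAIT_PRIVATE, 3, nullptr, nullptr, 0);
        }
    }
    void unlock() {
        std::uint32_t expected = 1;
        // Fast path: 1 -> 0 means nobody was waiting.
        if (word_.compare_exchange_strong(expected, 0, std::memory_order_release))
            return;
        // The word was 3: release the lock and wake one waiter.
        word_.store(0, std::memory_order_release);
        syscall(SYS_futex, address(), FUTEX_WAKE_PRIVATE, 1, nullptr, nullptr, 0);
    }
    int waitCalls() const { return waitCalls_.load(); }

private:
    static_assert(sizeof(std::atomic<std::uint32_t>) == sizeof(std::uint32_t),
                  "futex needs a plain 32-bit word");
    std::uint32_t *address() { return reinterpret_cast<std::uint32_t *>(&word_); }
    std::atomic<std::uint32_t> word_{0};
    std::atomic<int> waitCalls_{0};
};

// Uncontended case: lock and unlock from a single thread.
int uncontendedWaitCalls() {
    FutexMutex m;
    m.lock();
    m.unlock();
    return m.waitCalls();
}

// Contended case: this thread holds the lock while another tries to take it.
int contendedWaitCalls() {
    FutexMutex m;
    m.lock();
    std::thread other([&m] { m.lock(); m.unlock(); });
    while (m.waitCalls() == 0)          // wait until the other thread hit the slow path
        std::this_thread::yield();
    m.unlock();
    other.join();
    return m.waitCalls();
}
```

`uncontendedWaitCalls()` never reaches the syscall, which matches a trace without sys_enter_futex, while `contendedWaitCalls()` makes at least one FUTEX_WAIT call because the lock is already held when the second thread arrives.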

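However, none of this explains why the thread gets stuck after entering futex. All we can say is that sleeping for 0 seconds before lock() made the contention on this QMutex go away, so the lock was acquired without trouble and the thread carried on with its work.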

For now, I'll just live with this workaround! XD
