Solved: dlib detector too slow on the TK1 board


We recently started using dlib on our TK1 (ARM) board, but it takes too long (about 3 s) to detect one face in a picture.

We installed it with 'pip install dlib' and tested it with the code below:

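import dlib
from skimage import io  # assumed source of io.imread (scikit-image)

detector = dlib.get_frontal_face_detector()
img = io.imread("/home/ubuntu/face.jpg")
for i in range(1000):
    dets = detector(img, 1)  # second argument: upsample the image once
print("Number of faces detected: {}".format(len(dets)))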

It takes about 3 s to process one picture. Do you know what is wrong and how to fix it? Thanks~
Does the BLAS library have that much impact?

25 Answers

✔️Accepted Answer

My test code and compiler settings are here.

Updated RPI3 measurements:

Raspberry Pi 3 Model B [rev. a02082] (circa 2016)

armv7/1.2GHz (g++ (Raspbian 4.9.2-10/Raspbian))

Run   Flags                                Duration (ms)   Notes
5     -O3                                  ~2904           Compiled, ran.
6     -O3 -mfpu=neon                       ~1267           Compiled, ran.
10a   -O3 -mfpu=neon -fprofile-generate    ~5600           Compiled, ran.
10b   -O3 -mfpu=neon -fprofile-use         ~444            Did 10a, then compiled, ran.
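
For reference, the relative speedups implied by these durations can be worked out with a few lines of Python. This is only arithmetic on the numbers quoted in the table; the dictionary keys are labels made up for this sketch.

```python
# Durations (ms) copied from the RPI3 measurements; keys are illustrative labels.
durations_ms = {
    "-O3": 2904,
    "-O3 -mfpu=neon": 1267,
    "-O3 -mfpu=neon -fprofile-use": 444,
}

# Speedup of each configuration relative to plain -O3.
baseline = durations_ms["-O3"]
speedups = {flags: baseline / ms for flags, ms in durations_ms.items()}

for flags, factor in speedups.items():
    print(f"{flags}: {factor:.1f}x relative to plain -O3")
```

By this arithmetic, NEON alone gives roughly a 2.3x gain, and profile-guided optimization on top of NEON brings the total to roughly 6.5x.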

Wow! 🥇

Other Answers:

dets = detector(img, 1)

First, try changing this to dets = detector(img, 0). The second argument is the number of times the image is upsampled before detection.
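
To get a feel for why that argument matters, here is a rough sketch of how much image area the detector has to scan. It assumes that each upsampling step doubles both image dimensions and that scanning cost grows roughly with pixel count. The 400x600 size is the one mentioned elsewhere in this thread, and the function name is made up for illustration.

```python
def pixels_after_upsampling(width, height, upsample_times):
    """Pixel count after doubling both dimensions `upsample_times` times (assumed model)."""
    scale = 2 ** upsample_times
    return (width * scale) * (height * scale)

base = pixels_after_upsampling(400, 600, 0)       # corresponds to detector(img, 0)
upsampled = pixels_after_upsampling(400, 600, 1)  # corresponds to detector(img, 1)

print(f"no upsampling: {base} pixels")
print(f"one upsampling: {upsampled} pixels")
print(f"ratio: {upsampled / base}")
```

Under that assumption, skipping the upsampling step means scanning about a quarter of the pixels. The trade-off is that small faces are more likely to be missed.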

The next step is to use NEON optimizations, which are discussed in #276.
Another possibility is to run only part of the face detection (frontal faces only). This makes it about 2x faster, at the cost of missing some faces. You can try reading this for more info

The TK1's CPU is quite slow, and the whole idea of the TK1 is to use the GPU for all processing tasks. Dlib does not support FHOG detectors on the GPU, but there are some in OpenCV.

Another problem with the TK1 is its 32-bit architecture: the maximum CUDA version for it is 6.5, and dlib requires at least CUDA 7.5.

Switching to a Jetson TX1/TX2 is required to run dlib's DNN algorithms.

400x600 is quite a small resolution, so I don't think there is any need to try smaller images.
Next is the NEON question. This is not officially supported and should be double-checked.
Also check this doc for Jetson CPU speed tuning.

Other possible optimizations are to skip the image pyramid and to use only the frontal detector:

        typedef dlib::scan_fhog_pyramid<dlib::pyramid_down<6>, dlib::fhog_feature_extractor > image_scanner_type;
        image_scanner_type scanner;
        detector = dlib::object_detector<image_scanner_type>(scanner, detector.get_overlap_tester(), detector.get_w());

This new detector will run about 4x faster, but it will miss faces that are not frontal and will detect only a limited range of face sizes (about 80 pixels).
These are general optimizations that will work on a PC too, while a 50x gap is something very different. I assume the TK1 has about half the CPU frequency, which brings the gap down to 25x. SIMD should give about a 2x-4x performance improvement, and the rest is possibly due to architecture differences, memory speed, and bandwidth.
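
That back-of-the-envelope breakdown can be written out as follows. All factors are the rough estimates from this answer, not measurements, and the variable names are made up for this sketch.

```python
total_gap = 50.0                # TK1 vs. PC slowdown quoted in this answer
clock_factor = 2.0              # assumed: TK1 CPU clock is about half the PC's
simd_factor_range = (2.0, 4.0)  # assumed: speedup the PC gets from SIMD

# Gap left after accounting for the clock frequency difference.
after_clock = total_gap / clock_factor

# Gap still unexplained after also accounting for SIMD (upper and lower bound).
remaining = tuple(after_clock / factor for factor in simd_factor_range)

print(f"after clock difference: {after_clock}x")
print(f"left for architecture/memory: {remaining[1]}x to {remaining[0]}x")
```

That leaves roughly 6x to 12x to be explained by architecture differences, memory speed, and bandwidth.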
To understand the real situation, I recommend measuring the face detection stages separately. The first stage is FHOG feature extraction:

        dlib::array<dlib::array2d<double>> hog;
        dlib::impl_fhog::impl_extract_fhog_features(img, hog, 8, 1, 1);
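
If you are testing from Python, a minimal sketch of per-stage timing could look like the following. It uses only the standard library. The two stage functions are placeholders standing in for real work such as feature extraction or the full detector call, and `time_stage` is a hypothetical helper, not part of dlib.

```python
import time

def time_stage(fn, *args, repeats=10):
    """Return the mean wall-clock time of fn(*args) in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) * 1000.0 / repeats

# Placeholder stages; swap in the real calls, e.g. lambda: detector(img, 0).
def fake_feature_extraction(n):
    return sum(i * i for i in range(n))

def fake_scan(n):
    return max(i % 7 for i in range(n))

timings = {
    "feature extraction": time_stage(fake_feature_extraction, 10000),
    "window scan": time_stage(fake_scan, 10000),
}
for name, ms in timings.items():
    print(f"{name}: {ms:.3f} ms per call")
```

Averaging over several repeats smooths out one-off effects such as cache warm-up, which matters on a small board where single runs can vary a lot.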

The real way to make face detection work well on the Tegra TK1 is to rewrite the code in CUDA; this is the main idea behind all Jetsons.
