Using GPU Stream with OpenCV and Nvidia Jetson TK1

I purchased an Nvidia Jetson TK1 a few weeks ago and I am trying to use the CPU and GPU at the same time, hence the Stream class. A simple test shows that it is not doing what I expect. I am probably using it incorrectly, or perhaps the problem is in how I compile it.

I checked this question for answers before posting: How to use gpu::Stream in OpenCV?

Here is my code:

#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/gpu/gpu.hpp"
#include <time.h>

using namespace cv;
using namespace std;
using namespace gpu;

int main(int argc, char** argv)
{
    unsigned long AAtime = 0, BBtime = 0;
    gpu::setDevice(0);
    gpu::FeatureSet(FEATURE_SET_COMPUTE_30);
    Mat host_src = imread(argv[1], 0);   // load the image as grayscale
    GpuMat gpu_src, gpu_dst;

    Stream stream;

    gpu_src.upload(host_src);            // blocking upload, outside the timed region

    AAtime = getTickCount();
    // GPU work, passed the stream
    blur(gpu_src, gpu_dst, Size(5,5), Point(-1,-1), stream);

    // CPU work that I expect to overlap with the GPU work
    int k = 0;
    for (unsigned long long int j = 0; j < 10; j++)
        for (unsigned long long int i = 0; i < 10000000; i++)
            k += rand();

    stream.waitForCompletion();
    Mat host_dst;
    BBtime = getTickCount();
    cout << (BBtime - AAtime) / getTickFrequency() << endl;
    gpu_dst.download(host_dst);

    return 0;
}

      

With the timer, I saw that the total time is the CPU time plus the GPU time, not the longer of the two, so they are not running in parallel. I tried using CudaMem as jet47 showed, but when I look at the resulting image it contains only stripes, not my image:

CudaMem host_src_pl(Size(900, 1200), CV_8UC1, CudaMem::ALLOC_PAGE_LOCKED); // My image is 1200 by 900
CudaMem host_dst_pl;
Mat host_src= imread(argv[1],0);
host_src = host_src_pl;
//rest of the code
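
A note on the snippet above, offered as a guess rather than a confirmed diagnosis: `cv::Mat` assignment copies the header and shares the pixel buffer, so `host_src = host_src_pl;` would make `host_src` refer to the page-locked buffer (which has not been filled yet) and drop the pixels loaded by `imread`. The sketch below illustrates only that difference between assigning and copying. `FakeMat`, `pinnedPixelAfterAssignment` and `pinnedPixelAfterCopy` are hypothetical names invented for the illustration and are not OpenCV types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for cv::Mat / gpu::CudaMem, NOT an OpenCV type:
// a small header that shares one reference-counted pixel buffer.
struct FakeMat {
    std::shared_ptr<std::vector<unsigned char> > pixels;

    FakeMat(std::size_t count, unsigned char fill)
        : pixels(std::make_shared<std::vector<unsigned char> >(count, fill)) {}

    // Deep copy: overwrite dst's existing buffer with this image's pixels.
    void copyTo(FakeMat& dst) const { *dst.pixels = *pixels; }
};

// Pattern from the question: the pinned header is assigned to the loaded one.
// The loaded pixels are dropped and the pinned buffer is left unchanged.
unsigned char pinnedPixelAfterAssignment() {
    FakeMat pinned(4, 0xAA);  // 0xAA stands in for uninitialized memory
    FakeMat loaded(4, 0x10);  // 0x10 stands in for the decoded image
    loaded = pinned;          // analogous to `host_src = host_src_pl;`
    return (*pinned.pixels)[0];
}

// Alternative: copy the loaded pixels into the pinned buffer instead.
unsigned char pinnedPixelAfterCopy() {
    FakeMat pinned(4, 0xAA);
    FakeMat loaded(4, 0x10);
    loaded.copyTo(pinned);    // the pinned buffer now holds the image data
    return (*pinned.pixels)[0];
}
```

If this guess applies, the OpenCV 2.4 counterpart would presumably be to copy the loaded image into a `Mat` header obtained from the page-locked buffer (for example via `CudaMem::createMatHeader()`), with the `CudaMem` size matching the image so the copy does not reallocate. Note that `Size(900, 1200)` means width 900 and height 1200. I have not verified this on the TK1.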

      

I used this command to compile:

g++ -Ofast -mfpu=neon -funsafe-math-optimizations -fabi-version=8 -Wabi -std=c++11 -march=armv7-a testStream.cpp -fopenmp -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_calib3d -lopencv_contrib -lopencv_features2d -lopencv_flann -lopencv_gpu -lopencv_legacy -lopencv_ml -lopencv_objdetect -lopencv_photo -lopencv_stitching -lopencv_superres -lopencv_video -lopencv_videostab -o gpuStream

Some of these flags may be redundant. I tried without them and the behavior is the same.

What am I missing? Thanks for any answers :)
