Commit 6e3ea8b4 authored by WuZhiwen, committed by Alexander Alekhin

Merge pull request #12703 from wzw-intel:vkcom

* dnn: Add a Vulkan-based backend

This commit adds a new backend "DNN_BACKEND_VKCOM" and a
new target "DNN_TARGET_VULKAN". VKCOM stands for
"Vulkan-based computation library".

This backend uses the Vulkan API and SPIR-V shaders to do
the inference computation for layers. The layer types
implemented in DNN_BACKEND_VKCOM include:
Conv, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling,
AvePooling and Permute.

This is just the beginning of Vulkan support in OpenCV DNN;
more layer types will be supported and performance
tuning is on the way.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/vulkan: Add FindVulkan.cmake to detect Vulkan SDK

In order to build dnn with Vulkan support, install the
Vulkan SDK, set the environment variable "VULKAN_SDK", and
add "-DWITH_VULKAN=ON" to the cmake command.
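For example, on Linux (the SDK path below is only an illustration of wherever you unpacked the SDK):

```shell
# Point the build at the Vulkan SDK and enable the backend
export VULKAN_SDK=$HOME/VulkanSDK/x86_64   # example path
cd $BUILD_DIR
cmake -DWITH_VULKAN=ON ..
make -j8
```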

You can download Vulkan SDK from:
https://vulkan.lunarg.com/sdk/home#linux

For installation instructions, see
https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/mac/getting_started.html
for Linux, Windows and macOS respectively.

To run the Vulkan backend, a Vulkan driver must also be installed.
On Ubuntu, use 'sudo apt-get install mesa-vulkan-drivers'.

To test, run '$BUILD_DIR/bin/opencv_test_dnn --gtest_filter=*VkCom*'.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/Vulkan: dynamically load Vulkan runtime

There is no compile-time dependency on the Vulkan library.
If the Vulkan runtime is unavailable, the backend falls back to the CPU path.

Use the environment variable "OPENCV_VULKAN_RUNTIME" to specify the path
to your own Vulkan runtime library.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>

* dnn/Vulkan: Add a python script to compile GLSL shaders to SPIR-V shaders

The SPIR-V shaders are stored as text-based 32-bit hexadecimal
numbers and inserted into .cpp files as uint32_t arrays.
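The encoding step can be sketched in Python (a simplified illustration of the idea, not the actual generator script; the function name is made up):

```python
import struct

def spirv_to_cpp_array(spv_bytes, name):
    # SPIR-V is a stream of little-endian 32-bit words; emit each word as
    # a hexadecimal literal so the binary can be embedded in a .cpp file.
    words = struct.unpack("<%dI" % (len(spv_bytes) // 4), spv_bytes)
    body = ",\n    ".join("0x%08x" % w for w in words)
    return "const unsigned int %s_spv[] = {\n    %s\n};" % (name, body)

# The first SPIR-V word is always the magic number 0x07230203.
print(spirv_to_cpp_array(b"\x03\x02\x23\x07", "example"))
```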

* dnn/Vulkan: Put Vulkan headers into 3rdparty directory and some other fixes

Vulkan header files are copied from
https://github.com/KhronosGroup/Vulkan-Docs/tree/master/include/vulkan
to 3rdparty/include

Fix the Copyright declaration issue.

Refine OpenCVDetectVulkan.cmake

* dnn/Vulkan: Add Vulkan backend tests to the existing ones.

Also fixed some test failures.

- Don't use a bool variable as a uniform in shaders
- Fix an issue where the dispatched group count exceeded the maximum
- Bypass convolutions with group > 1; these should be supported in the future

* dnn/Vulkan: Fix multiple initializations in one thread.
parent 220b2785
//
// File: vk_platform.h
//
/*
** Copyright (c) 2014-2017 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
#ifndef VK_PLATFORM_H_
#define VK_PLATFORM_H_
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
/*
***************************************************************************************************
* Platform-specific directives and type declarations
***************************************************************************************************
*/
/* Platform-specific calling convention macros.
*
* Platforms should define these so that Vulkan clients call Vulkan commands
* with the same calling conventions that the Vulkan implementation expects.
*
* VKAPI_ATTR - Placed before the return type in function declarations.
* Useful for C++11 and GCC/Clang-style function attribute syntax.
* VKAPI_CALL - Placed after the return type in function declarations.
* Useful for MSVC-style calling convention syntax.
* VKAPI_PTR - Placed between the '(' and '*' in function pointer types.
*
* Function declaration: VKAPI_ATTR void VKAPI_CALL vkCommand(void);
* Function pointer type: typedef void (VKAPI_PTR *PFN_vkCommand)(void);
*/
#if defined(_WIN32)
// On Windows, Vulkan commands use the stdcall convention
#define VKAPI_ATTR
#define VKAPI_CALL __stdcall
#define VKAPI_PTR VKAPI_CALL
#elif defined(__ANDROID__) && defined(__ARM_ARCH) && __ARM_ARCH < 7
#error "Vulkan isn't supported for the 'armeabi' NDK ABI"
#elif defined(__ANDROID__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7 && defined(__ARM_32BIT_STATE)
// On Android 32-bit ARM targets, Vulkan functions use the "hardfloat"
// calling convention, i.e. float parameters are passed in registers. This
// is true even if the rest of the application passes floats on the stack,
// as it does by default when compiling for the armeabi-v7a NDK ABI.
#define VKAPI_ATTR __attribute__((pcs("aapcs-vfp")))
#define VKAPI_CALL
#define VKAPI_PTR VKAPI_ATTR
#else
// On other platforms, use the default calling convention
#define VKAPI_ATTR
#define VKAPI_CALL
#define VKAPI_PTR
#endif
#include <stddef.h>
#if !defined(VK_NO_STDINT_H)
#if defined(_MSC_VER) && (_MSC_VER < 1600)
typedef signed __int8 int8_t;
typedef unsigned __int8 uint8_t;
typedef signed __int16 int16_t;
typedef unsigned __int16 uint16_t;
typedef signed __int32 int32_t;
typedef unsigned __int32 uint32_t;
typedef signed __int64 int64_t;
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif
#endif // !defined(VK_NO_STDINT_H)
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif
#ifndef VULKAN_H_
#define VULKAN_H_ 1
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
#include "vk_platform.h"
#include "vulkan_core.h"
#ifdef VK_USE_PLATFORM_ANDROID_KHR
#include "vulkan_android.h"
#endif
#ifdef VK_USE_PLATFORM_FUCHSIA
#include <zircon/types.h>
#include "vulkan_fuchsia.h"
#endif
#ifdef VK_USE_PLATFORM_IOS_MVK
#include "vulkan_ios.h"
#endif
#ifdef VK_USE_PLATFORM_MACOS_MVK
#include "vulkan_macos.h"
#endif
#ifdef VK_USE_PLATFORM_MIR_KHR
#include <mir_toolkit/client_types.h>
#include "vulkan_mir.h"
#endif
#ifdef VK_USE_PLATFORM_VI_NN
#include "vulkan_vi.h"
#endif
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
#include <wayland-client.h>
#include "vulkan_wayland.h"
#endif
#ifdef VK_USE_PLATFORM_WIN32_KHR
#include <windows.h>
#include "vulkan_win32.h"
#endif
#ifdef VK_USE_PLATFORM_XCB_KHR
#include <xcb/xcb.h>
#include "vulkan_xcb.h"
#endif
#ifdef VK_USE_PLATFORM_XLIB_KHR
#include <X11/Xlib.h>
#include "vulkan_xlib.h"
#endif
#ifdef VK_USE_PLATFORM_XLIB_XRANDR_EXT
#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>
#include "vulkan_xlib_xrandr.h"
#endif
#endif // VULKAN_H_
#ifndef VULKAN_ANDROID_H_
#define VULKAN_ANDROID_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_android_surface 1
struct ANativeWindow;
#define VK_KHR_ANDROID_SURFACE_SPEC_VERSION 6
#define VK_KHR_ANDROID_SURFACE_EXTENSION_NAME "VK_KHR_android_surface"
typedef VkFlags VkAndroidSurfaceCreateFlagsKHR;
typedef struct VkAndroidSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkAndroidSurfaceCreateFlagsKHR flags;
struct ANativeWindow* window;
} VkAndroidSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateAndroidSurfaceKHR)(VkInstance instance, const VkAndroidSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateAndroidSurfaceKHR(
VkInstance instance,
const VkAndroidSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#define VK_ANDROID_external_memory_android_hardware_buffer 1
struct AHardwareBuffer;
#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_SPEC_VERSION 3
#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME "VK_ANDROID_external_memory_android_hardware_buffer"
typedef struct VkAndroidHardwareBufferUsageANDROID {
VkStructureType sType;
void* pNext;
uint64_t androidHardwareBufferUsage;
} VkAndroidHardwareBufferUsageANDROID;
typedef struct VkAndroidHardwareBufferPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkDeviceSize allocationSize;
uint32_t memoryTypeBits;
} VkAndroidHardwareBufferPropertiesANDROID;
typedef struct VkAndroidHardwareBufferFormatPropertiesANDROID {
VkStructureType sType;
void* pNext;
VkFormat format;
uint64_t externalFormat;
VkFormatFeatureFlags formatFeatures;
VkComponentMapping samplerYcbcrConversionComponents;
VkSamplerYcbcrModelConversion suggestedYcbcrModel;
VkSamplerYcbcrRange suggestedYcbcrRange;
VkChromaLocation suggestedXChromaOffset;
VkChromaLocation suggestedYChromaOffset;
} VkAndroidHardwareBufferFormatPropertiesANDROID;
typedef struct VkImportAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
struct AHardwareBuffer* buffer;
} VkImportAndroidHardwareBufferInfoANDROID;
typedef struct VkMemoryGetAndroidHardwareBufferInfoANDROID {
VkStructureType sType;
const void* pNext;
VkDeviceMemory memory;
} VkMemoryGetAndroidHardwareBufferInfoANDROID;
typedef struct VkExternalFormatANDROID {
VkStructureType sType;
void* pNext;
uint64_t externalFormat;
} VkExternalFormatANDROID;
typedef VkResult (VKAPI_PTR *PFN_vkGetAndroidHardwareBufferPropertiesANDROID)(VkDevice device, const struct AHardwareBuffer* buffer, VkAndroidHardwareBufferPropertiesANDROID* pProperties);
typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryAndroidHardwareBufferANDROID)(VkDevice device, const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo, struct AHardwareBuffer** pBuffer);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkGetAndroidHardwareBufferPropertiesANDROID(
VkDevice device,
const struct AHardwareBuffer* buffer,
VkAndroidHardwareBufferPropertiesANDROID* pProperties);
VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryAndroidHardwareBufferANDROID(
VkDevice device,
const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo,
struct AHardwareBuffer** pBuffer);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_FUCHSIA_H_
#define VULKAN_FUCHSIA_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_FUCHSIA_imagepipe_surface 1
#define VK_FUCHSIA_IMAGEPIPE_SURFACE_SPEC_VERSION 1
#define VK_FUCHSIA_IMAGEPIPE_SURFACE_EXTENSION_NAME "VK_FUCHSIA_imagepipe_surface"
typedef VkFlags VkImagePipeSurfaceCreateFlagsFUCHSIA;
typedef struct VkImagePipeSurfaceCreateInfoFUCHSIA {
VkStructureType sType;
const void* pNext;
VkImagePipeSurfaceCreateFlagsFUCHSIA flags;
zx_handle_t imagePipeHandle;
} VkImagePipeSurfaceCreateInfoFUCHSIA;
typedef VkResult (VKAPI_PTR *PFN_vkCreateImagePipeSurfaceFUCHSIA)(VkInstance instance, const VkImagePipeSurfaceCreateInfoFUCHSIA* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateImagePipeSurfaceFUCHSIA(
VkInstance instance,
const VkImagePipeSurfaceCreateInfoFUCHSIA* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_IOS_H_
#define VULKAN_IOS_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_MVK_ios_surface 1
#define VK_MVK_IOS_SURFACE_SPEC_VERSION 2
#define VK_MVK_IOS_SURFACE_EXTENSION_NAME "VK_MVK_ios_surface"
typedef VkFlags VkIOSSurfaceCreateFlagsMVK;
typedef struct VkIOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkIOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkIOSSurfaceCreateInfoMVK;
typedef VkResult (VKAPI_PTR *PFN_vkCreateIOSSurfaceMVK)(VkInstance instance, const VkIOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateIOSSurfaceMVK(
VkInstance instance,
const VkIOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_MACOS_H_
#define VULKAN_MACOS_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_MVK_macos_surface 1
#define VK_MVK_MACOS_SURFACE_SPEC_VERSION 2
#define VK_MVK_MACOS_SURFACE_EXTENSION_NAME "VK_MVK_macos_surface"
typedef VkFlags VkMacOSSurfaceCreateFlagsMVK;
typedef struct VkMacOSSurfaceCreateInfoMVK {
VkStructureType sType;
const void* pNext;
VkMacOSSurfaceCreateFlagsMVK flags;
const void* pView;
} VkMacOSSurfaceCreateInfoMVK;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMacOSSurfaceMVK)(VkInstance instance, const VkMacOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMacOSSurfaceMVK(
VkInstance instance,
const VkMacOSSurfaceCreateInfoMVK* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_MIR_H_
#define VULKAN_MIR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_mir_surface 1
#define VK_KHR_MIR_SURFACE_SPEC_VERSION 4
#define VK_KHR_MIR_SURFACE_EXTENSION_NAME "VK_KHR_mir_surface"
typedef VkFlags VkMirSurfaceCreateFlagsKHR;
typedef struct VkMirSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkMirSurfaceCreateFlagsKHR flags;
MirConnection* connection;
MirSurface* mirSurface;
} VkMirSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateMirSurfaceKHR)(VkInstance instance, const VkMirSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceMirPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, MirConnection* connection);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateMirSurfaceKHR(
VkInstance instance,
const VkMirSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceMirPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
MirConnection* connection);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_VI_H_
#define VULKAN_VI_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_NN_vi_surface 1
#define VK_NN_VI_SURFACE_SPEC_VERSION 1
#define VK_NN_VI_SURFACE_EXTENSION_NAME "VK_NN_vi_surface"
typedef VkFlags VkViSurfaceCreateFlagsNN;
typedef struct VkViSurfaceCreateInfoNN {
VkStructureType sType;
const void* pNext;
VkViSurfaceCreateFlagsNN flags;
void* window;
} VkViSurfaceCreateInfoNN;
typedef VkResult (VKAPI_PTR *PFN_vkCreateViSurfaceNN)(VkInstance instance, const VkViSurfaceCreateInfoNN* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateViSurfaceNN(
VkInstance instance,
const VkViSurfaceCreateInfoNN* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_WAYLAND_H_
#define VULKAN_WAYLAND_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_wayland_surface 1
#define VK_KHR_WAYLAND_SURFACE_SPEC_VERSION 6
#define VK_KHR_WAYLAND_SURFACE_EXTENSION_NAME "VK_KHR_wayland_surface"
typedef VkFlags VkWaylandSurfaceCreateFlagsKHR;
typedef struct VkWaylandSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkWaylandSurfaceCreateFlagsKHR flags;
struct wl_display* display;
struct wl_surface* surface;
} VkWaylandSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateWaylandSurfaceKHR)(VkInstance instance, const VkWaylandSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWaylandPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, struct wl_display* display);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateWaylandSurfaceKHR(
VkInstance instance,
const VkWaylandSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWaylandPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
struct wl_display* display);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_XCB_H_
#define VULKAN_XCB_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_xcb_surface 1
#define VK_KHR_XCB_SURFACE_SPEC_VERSION 6
#define VK_KHR_XCB_SURFACE_EXTENSION_NAME "VK_KHR_xcb_surface"
typedef VkFlags VkXcbSurfaceCreateFlagsKHR;
typedef struct VkXcbSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXcbSurfaceCreateFlagsKHR flags;
xcb_connection_t* connection;
xcb_window_t window;
} VkXcbSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateXcbSurfaceKHR)(VkInstance instance, const VkXcbSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXcbPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, xcb_connection_t* connection, xcb_visualid_t visual_id);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateXcbSurfaceKHR(
VkInstance instance,
const VkXcbSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXcbPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
xcb_connection_t* connection,
xcb_visualid_t visual_id);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_XLIB_H_
#define VULKAN_XLIB_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_KHR_xlib_surface 1
#define VK_KHR_XLIB_SURFACE_SPEC_VERSION 6
#define VK_KHR_XLIB_SURFACE_EXTENSION_NAME "VK_KHR_xlib_surface"
typedef VkFlags VkXlibSurfaceCreateFlagsKHR;
typedef struct VkXlibSurfaceCreateInfoKHR {
VkStructureType sType;
const void* pNext;
VkXlibSurfaceCreateFlagsKHR flags;
Display* dpy;
Window window;
} VkXlibSurfaceCreateInfoKHR;
typedef VkResult (VKAPI_PTR *PFN_vkCreateXlibSurfaceKHR)(VkInstance instance, const VkXlibSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);
typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXlibPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, Display* dpy, VisualID visualID);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateXlibSurfaceKHR(
VkInstance instance,
const VkXlibSurfaceCreateInfoKHR* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkSurfaceKHR* pSurface);
VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXlibPresentationSupportKHR(
VkPhysicalDevice physicalDevice,
uint32_t queueFamilyIndex,
Display* dpy,
VisualID visualID);
#endif
#ifdef __cplusplus
}
#endif
#endif
#ifndef VULKAN_XLIB_XRANDR_H_
#define VULKAN_XLIB_XRANDR_H_ 1
#ifdef __cplusplus
extern "C" {
#endif
/*
** Copyright (c) 2015-2018 The Khronos Group Inc.
**
** Licensed under the Apache License, Version 2.0 (the "License");
** you may not use this file except in compliance with the License.
** You may obtain a copy of the License at
**
** http://www.apache.org/licenses/LICENSE-2.0
**
** Unless required by applicable law or agreed to in writing, software
** distributed under the License is distributed on an "AS IS" BASIS,
** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
** See the License for the specific language governing permissions and
** limitations under the License.
*/
/*
** This header is generated from the Khronos Vulkan XML API Registry.
**
*/
#define VK_EXT_acquire_xlib_display 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1
#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"
typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);
typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
VkDisplayKHR display);
VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(
VkPhysicalDevice physicalDevice,
Display* dpy,
RROutput rrOutput,
VkDisplayKHR* pDisplay);
#endif
#ifdef __cplusplus
}
#endif
#endif
@@ -237,6 +237,7 @@ OCV_OPTION(WITH_GTK "Include GTK support" ON
OCV_OPTION(WITH_GTK_2_X "Use GTK version 2" OFF IF (UNIX AND NOT APPLE AND NOT ANDROID) )
OCV_OPTION(WITH_IPP "Include Intel IPP support" (NOT MINGW AND NOT CV_DISABLE_OPTIMIZATION) IF (X86_64 OR X86) AND NOT WINRT AND NOT IOS )
OCV_OPTION(WITH_HALIDE "Include Halide support" OFF)
OCV_OPTION(WITH_VULKAN "Include Vulkan support" OFF)
OCV_OPTION(WITH_INF_ENGINE "Include Intel Inference Engine support" OFF)
OCV_OPTION(WITH_JASPER "Include JPEG2K support" ON IF (NOT IOS) )
OCV_OPTION(WITH_JPEG "Include JPEG support" ON)
@@ -685,6 +686,11 @@ if(WITH_HALIDE)
include(cmake/OpenCVDetectHalide.cmake)
endif()
# --- VkCom ---
if(WITH_VULKAN)
include(cmake/OpenCVDetectVulkan.cmake)
endif()
# --- Inference Engine ---
if(WITH_INF_ENGINE)
include(cmake/OpenCVDetectInferenceEngine.cmake)
@@ -1456,6 +1462,15 @@ if(WITH_CUDA OR HAVE_CUDA)
endif()
endif()
if(WITH_VULKAN OR HAVE_VULKAN)
status("")
status(" Vulkan:" HAVE_VULKAN THEN "YES" ELSE "NO")
if(HAVE_VULKAN)
status(" Include path:" VULKAN_INCLUDE_DIRS THEN "${VULKAN_INCLUDE_DIRS}" ELSE "NO")
status(" Link libraries:" VULKAN_LIBRARIES THEN "${VULKAN_LIBRARIES}" ELSE "Dynamic load")
endif()
endif()
if(WITH_OPENCL OR HAVE_OPENCL)
ocv_build_features_string(opencl_features
IF HAVE_OPENCL_SVM THEN "SVM"
......
# Find Vulkan
#
# Vulkan_INCLUDE_DIRS
# Vulkan_LIBRARIES
# Vulkan_FOUND
if (WIN32)
find_path(Vulkan_INCLUDE_DIRS NAMES vulkan/vulkan.h HINTS
"$ENV{VULKAN_SDK}/Include"
"$ENV{VK_SDK_PATH}/Include")
if (CMAKE_CL_64)
find_library(Vulkan_LIBRARIES NAMES vulkan-1 HINTS
"$ENV{VULKAN_SDK}/Bin"
"$ENV{VK_SDK_PATH}/Bin")
else()
find_library(Vulkan_LIBRARIES NAMES vulkan-1 HINTS
"$ENV{VULKAN_SDK}/Bin32"
"$ENV{VK_SDK_PATH}/Bin32")
endif()
else()
find_path(Vulkan_INCLUDE_DIRS NAMES vulkan/vulkan.h HINTS
"$ENV{VULKAN_SDK}/include")
find_library(Vulkan_LIBRARIES NAMES vulkan HINTS
"$ENV{VULKAN_SDK}/lib")
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(Vulkan DEFAULT_MSG Vulkan_LIBRARIES Vulkan_INCLUDE_DIRS)
mark_as_advanced(Vulkan_INCLUDE_DIRS Vulkan_LIBRARIES)
set(VULKAN_INCLUDE_DIRS "${OpenCV_SOURCE_DIR}/3rdparty/include" CACHE PATH "Vulkan include directory")
set(VULKAN_LIBRARIES "")
try_compile(VALID_VULKAN
"${OpenCV_BINARY_DIR}"
"${OpenCV_SOURCE_DIR}/cmake/checks/vulkan.cpp"
CMAKE_FLAGS "-DINCLUDE_DIRECTORIES:STRING=${VULKAN_INCLUDE_DIRS}"
OUTPUT_VARIABLE TRY_OUT
)
if(NOT ${VALID_VULKAN})
message(WARNING "Can't use Vulkan")
return()
endif()
set(HAVE_VULKAN 1)
if(HAVE_VULKAN)
add_definitions(-DVK_NO_PROTOTYPES)
include_directories(${VULKAN_INCLUDE_DIRS})
endif()
#include <vulkan/vulkan.h>
int main(int /*argc*/, char** /*argv*/)
{
return 0;
}
@@ -92,6 +92,9 @@
/* Halide support */
#cmakedefine HAVE_HALIDE
/* Vulkan support */
#cmakedefine HAVE_VULKAN
/* Define to 1 if you have the <inttypes.h> header file. */
#cmakedefine HAVE_INTTYPES_H 1
......
@@ -69,7 +69,8 @@ CV__DNN_INLINE_NS_BEGIN
DNN_BACKEND_DEFAULT,
DNN_BACKEND_HALIDE,
DNN_BACKEND_INFERENCE_ENGINE,
DNN_BACKEND_OPENCV
DNN_BACKEND_OPENCV,
DNN_BACKEND_VKCOM
};
/**
@@ -81,7 +82,8 @@ CV__DNN_INLINE_NS_BEGIN
DNN_TARGET_CPU,
DNN_TARGET_OPENCL,
DNN_TARGET_OPENCL_FP16,
DNN_TARGET_MYRIAD
DNN_TARGET_MYRIAD,
DNN_TARGET_VULKAN
};
/** @brief This class provides all data needed to initialize layer.
@@ -263,6 +265,7 @@ CV__DNN_INLINE_NS_BEGIN
virtual Ptr<BackendNode> initInfEngine(const std::vector<Ptr<BackendWrapper> > &inputs);
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &inputs);
/**
* @brief Automatic Halide scheduling based on layer hyper-parameters.
* @param[in] node Backend node with Halide functions.
......
@@ -42,6 +42,7 @@
#include "precomp.hpp"
#include "op_halide.hpp"
#include "op_inf_engine.hpp"
#include "op_vkcom.hpp"
#include "halide_scheduler.hpp"
#include <set>
#include <algorithm>
@@ -892,6 +893,13 @@ static Ptr<BackendWrapper> wrapMat(int backendId, int targetId, cv::Mat& m)
#ifdef HAVE_INF_ENGINE
return Ptr<BackendWrapper>(new InfEngineBackendWrapper(targetId, m));
#endif // HAVE_INF_ENGINE
}
else if (backendId == DNN_BACKEND_VKCOM)
{
CV_Assert(haveVulkan());
#ifdef HAVE_VULKAN
return Ptr<BackendWrapper>(new VkComBackendWrapper(m));
#endif // HAVE_VULKAN
}
else
CV_Error(Error::StsNotImplemented, "Unknown backend identifier");
@@ -903,8 +911,21 @@ struct Net::Impl
typedef std::map<int, LayerShapes> LayersShapesMap;
typedef std::map<int, LayerData> MapIdToLayerData;
~Impl()
{
#ifdef HAVE_VULKAN
// Vulkan requires explicit releasing the child objects of
// VkDevice object prior to releasing VkDevice object itself.
layers.clear();
backendWrappers.clear();
vkcom::deinitPerThread();
#endif
}
Impl()
{
#ifdef HAVE_VULKAN
vkcom::initPerThread();
#endif
//allocate fake net input layer
netInputLayer = Ptr<DataLayer>(new DataLayer());
LayerData &inpl = layers.insert( make_pair(0, LayerData()) ).first->second;
......@@ -970,6 +991,12 @@ struct Net::Impl
{
return wrapMat(preferableBackend, preferableTarget, host);
}
else if (preferableBackend == DNN_BACKEND_VKCOM)
{
#ifdef HAVE_VULKAN
return Ptr<BackendWrapper>(new VkComBackendWrapper(baseBuffer, host));
#endif
}
else
CV_Error(Error::StsNotImplemented, "Unknown backend identifier");
}
......@@ -1078,6 +1105,8 @@ struct Net::Impl
preferableTarget == DNN_TARGET_OPENCL ||
preferableTarget == DNN_TARGET_OPENCL_FP16 ||
preferableTarget == DNN_TARGET_MYRIAD);
CV_Assert(preferableBackend != DNN_BACKEND_VKCOM ||
preferableTarget == DNN_TARGET_VULKAN);
if (!netWasAllocated || this->blobsToKeep != blobsToKeep_)
{
if (preferableBackend == DNN_BACKEND_OPENCV && IS_DNN_OPENCL_TARGET(preferableTarget))
......@@ -1107,6 +1136,12 @@ struct Net::Impl
}
}
#endif
if (preferableBackend == DNN_BACKEND_VKCOM && !haveVulkan())
{
preferableBackend = DNN_BACKEND_OPENCV;
preferableTarget = DNN_TARGET_CPU;
}
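The hunk above makes the backend choice degrade gracefully: if `DNN_BACKEND_VKCOM` is requested but the Vulkan runtime cannot be loaded, the net silently falls back to `DNN_BACKEND_OPENCV` on `DNN_TARGET_CPU`. A minimal standalone sketch of that selection logic (all names here are illustrative, not OpenCV's API):

```cpp
#include <cassert>

enum Backend { BACKEND_OPENCV, BACKEND_VKCOM };
enum Target  { TARGET_CPU, TARGET_VULKAN };

struct NetConfig {
    Backend backend;
    Target  target;
};

// Mirror of the guard in setUpNet(): a Vulkan-configured net drops to the
// default CPU path when the runtime is unavailable, and is left untouched
// otherwise.
void resolveBackend(NetConfig& net, bool vulkanAvailable) {
    if (net.backend == BACKEND_VKCOM && !vulkanAvailable) {
        net.backend = BACKEND_OPENCV;
        net.target  = TARGET_CPU;
    }
}
```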
clear();
allocateLayers(blobsToKeep_);
......@@ -1259,6 +1294,8 @@ struct Net::Impl
initHalideBackend();
else if (preferableBackend == DNN_BACKEND_INFERENCE_ENGINE)
initInfEngineBackend();
else if (preferableBackend == DNN_BACKEND_VKCOM)
initVkComBackend();
else
CV_Error(Error::StsNotImplemented, "Unknown backend identifier");
}
......@@ -1356,6 +1393,31 @@ struct Net::Impl
}
#endif // HAVE_INF_ENGINE
void initVkComBackend()
{
CV_TRACE_FUNCTION();
CV_Assert(preferableBackend == DNN_BACKEND_VKCOM);
#ifdef HAVE_VULKAN
if (!haveVulkan())
return;
MapIdToLayerData::iterator it = layers.begin();
for (; it != layers.end(); it++)
{
LayerData &ld = it->second;
Ptr<Layer> layer = ld.layerInstance;
if (!layer->supportBackend(preferableBackend))
{
continue;
}
ld.skip = false;
ld.backendNodes[DNN_BACKEND_VKCOM] =
layer->initVkCom(ld.inputBlobsWrappers);
}
#endif
}
void initInfEngineBackend()
{
CV_TRACE_FUNCTION();
......@@ -2254,6 +2316,10 @@ struct Net::Impl
{
forwardInfEngine(node);
}
else if (preferableBackend == DNN_BACKEND_VKCOM)
{
forwardVkCom(ld.outputBlobsWrappers, node);
}
else
{
CV_Error(Error::StsNotImplemented, "Unknown backend identifier");
......@@ -3110,6 +3176,13 @@ bool Layer::supportBackend(int backendId)
return backendId == DNN_BACKEND_OPENCV;
}
Ptr<BackendNode> Layer::initVkCom(const std::vector<Ptr<BackendWrapper> > &)
{
CV_Error(Error::StsNotImplemented, "VkCom pipeline of " + type +
" layers is not defined.");
return Ptr<BackendNode>();
}
Ptr<BackendNode> Layer::initHalide(const std::vector<Ptr<BackendWrapper> > &)
{
CV_Error(Error::StsNotImplemented, "Halide pipeline of " + type +
......
......@@ -44,6 +44,7 @@
#include "layers_common.hpp"
#include "../op_halide.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#ifdef HAVE_OPENCL
#include "opencl_kernels_dnn.hpp"
......@@ -105,7 +106,8 @@ public:
{
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_HALIDE && haveHalide() && axis == 1 && !padding || // By channels
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine() && !padding;
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine() && !padding ||
backendId == DNN_BACKEND_VKCOM && haveVulkan() && !padding;
}
class ChannelConcatInvoker : public ParallelLoopBody
......@@ -274,6 +276,16 @@ public:
}
}
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &input) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
vkcom::Tensor in = VkComTensor(input[0]);
int cAxis = clamp(axis, in.dimNum());
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpConcat(cAxis));
return Ptr<BackendNode>(new VkComBackendNode(input, op));
#endif // HAVE_VULKAN
return Ptr<BackendNode>();
}
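`initVkCom()` above normalizes the concat axis with `clamp()` against the input's rank, so a negative (count-from-the-end) axis becomes a plain index before it reaches `vkcom::OpConcat`. A standalone sketch of that normalization (the helper name is hypothetical; OpenCV's own version lives in `shape_utils.hpp`):

```cpp
#include <cassert>

// Normalize a possibly-negative axis against the tensor rank:
// axis -1 on a 4-D tensor resolves to 3, axis 1 stays 1.
int clampAxis(int axis, int dims) {
    int a = axis < 0 ? axis + dims : axis;
    assert(a >= 0 && a < dims);  // an out-of-range axis is a caller error
    return a;
}
```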
virtual Ptr<BackendNode> initHalide(const std::vector<Ptr<BackendWrapper> > &input) CV_OVERRIDE
{
......
......@@ -44,6 +44,7 @@
#include "layers_common.hpp"
#include "../op_halide.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include "opencv2/core/hal/hal.hpp"
#include "opencv2/core/hal/intrin.hpp"
#include <iostream>
......@@ -222,7 +223,9 @@ public:
if (backendId == DNN_BACKEND_INFERENCE_ENGINE)
return preferableTarget != DNN_TARGET_MYRIAD || dilation.width == dilation.height;
else
return backendId == DNN_BACKEND_OPENCV || backendId == DNN_BACKEND_HALIDE;
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_HALIDE ||
backendId == DNN_BACKEND_VKCOM && haveVulkan();
}
bool getMemoryShapes(const std::vector<MatShape> &inputs,
......@@ -384,6 +387,73 @@ public:
biasvec[outCn] = biasvec[outCn+1] = biasvec[outCn-1];
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
int out_channel = blobs[0].size[0];
bool has_bias = hasBias() || fusedBias;
int filter_size[2] = {kernel.height, kernel.width};
int pad_size[2] = {pad.height, pad.width};
int stride_size[2] = {stride.height, stride.width};
int dilation_size[2] = {dilation.height, dilation.width};
int activation = 0;
vkcom::Tensor input_tensor = VkComTensor(inputs[0]);
int in_channel = input_tensor.dimSize(1);
int group = in_channel / blobs[0].size[1];
// TODO: support group > 1
if (group != 1)
return Ptr<BackendNode>();
int padding_mode;
if (padMode.empty())
{
padding_mode = vkcom::kPaddingModeCaffe;
}
else if (padMode == "VALID")
{
padding_mode = vkcom::kPaddingModeValid;
}
else if (padMode == "SAME")
{
padding_mode = vkcom::kPaddingModeSame;
}
else
CV_Error(Error::StsError, "Unsupported padding mode " + padMode);
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpConv(out_channel, has_bias,
filter_size, pad_size,
stride_size, dilation_size,
activation, group,
padding_mode));
std::vector<Ptr<BackendWrapper> > blobsWrapper;
if (newWeightAndBias)
{
Mat wm;
weightsMat.copyTo(wm); // to handle the case of isContinuous() == false
wm = wm.reshape(1, blobs[0].dims, blobs[0].size); // reshape() returns a new header; assign it back
blobsWrapper.push_back(Ptr<BackendWrapper>(new VkComBackendWrapper(wm)));
}
else
{
blobsWrapper.push_back(Ptr<BackendWrapper>(new VkComBackendWrapper(blobs[0])));
}
if (has_bias)
{
Mat biasesMat({out_channel}, CV_32F, &biasvec[0]);
blobsWrapper.push_back(Ptr<BackendWrapper>(new VkComBackendWrapper(biasesMat)));
}
return Ptr<BackendNode>(new VkComBackendNode(inputs, op, blobsWrapper));
#endif // HAVE_VULKAN
return Ptr<BackendNode>();
}
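The convolution path above derives the group count from the ratio of input channels to the weight blob's channel dimension (`blobs[0]` is laid out `[outCh, inCh/group, kH, kW]`) and returns an empty node unless `group == 1`. A sketch of that check (helper names are assumptions, not OpenCV's API):

```cpp
// A grouped convolution stores inCh/group channels per filter, so the
// group count is recovered as a simple ratio. The Vulkan backend
// currently accepts only group == 1; grouped/depthwise support is a TODO.
int convGroups(int inputChannels, int weightInputChannels) {
    return inputChannels / weightInputChannels;
}

bool vkcomAcceptsConv(int inputChannels, int weightInputChannels) {
    return convGroups(inputChannels, weightInputChannels) == 1;
}
```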
virtual Ptr<BackendNode> initHalide(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_HALIDE
......
......@@ -44,6 +44,7 @@
#include "layers_common.hpp"
#include "../op_halide.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include "opencv2/imgproc.hpp"
#include <opencv2/dnn/shape_utils.hpp>
#include <iostream>
......@@ -161,6 +162,14 @@ public:
return Ptr<BackendNode>();
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> >& inputs) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
return Ptr<BackendNode>(new VkComBackendNode(inputs, func.initVkCom()));
#endif // HAVE_VULKAN
return Ptr<BackendNode>();
}
virtual bool tryFuse(Ptr<dnn::Layer>& top) CV_OVERRIDE
{
return func.tryFuse(top);
......@@ -252,7 +261,8 @@ struct ReLUFunctor
bool supportBackend(int backendId, int)
{
return backendId == DNN_BACKEND_OPENCV || backendId == DNN_BACKEND_HALIDE ||
backendId == DNN_BACKEND_INFERENCE_ENGINE;
backendId == DNN_BACKEND_INFERENCE_ENGINE ||
backendId == DNN_BACKEND_VKCOM;
}
void apply(const float* srcptr, float* dstptr, int len, size_t planeSize, int cn0, int cn1) const
......@@ -356,6 +366,16 @@ struct ReLUFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpReLU(slope));
return op;
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -465,6 +485,14 @@ struct ReLU6Functor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -539,6 +567,14 @@ struct TanHFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -613,6 +649,14 @@ struct SigmoidFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -688,6 +732,14 @@ struct ELUFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -760,6 +812,14 @@ struct AbsValFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -812,6 +872,14 @@ struct BNLLFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......@@ -935,6 +1003,14 @@ struct PowerFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>& top)
{
if (power != 1.0f && shift != 0.0f)
......@@ -1070,6 +1146,14 @@ struct ChannelsPReLUFunctor
}
#endif // HAVE_INF_ENGINE
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> initVkCom()
{
// TODO: add vkcom implementation
return std::shared_ptr<vkcom::OpBase>();
}
#endif // HAVE_VULKAN
bool tryFuse(Ptr<dnn::Layer>&) { return false; }
void getScaleShift(Mat&, Mat&) const {}
......
......@@ -44,6 +44,7 @@
#include "layers_common.hpp"
#include "../op_halide.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/dnn/shape_utils.hpp"
#include "opencv2/core/hal/hal.hpp"
......@@ -92,7 +93,8 @@ public:
{
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_HALIDE ||
backendId == DNN_BACKEND_INFERENCE_ENGINE && (preferableTarget != DNN_TARGET_MYRIAD || type == CHANNEL_NRM);
backendId == DNN_BACKEND_INFERENCE_ENGINE && (preferableTarget != DNN_TARGET_MYRIAD || type == CHANNEL_NRM) ||
backendId == DNN_BACKEND_VKCOM && haveVulkan() && (size % 2 == 1) && (type == CHANNEL_NRM);
}
#ifdef HAVE_OPENCL
......@@ -306,6 +308,15 @@ public:
}
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpLRN(size / 2, bias, alpha, beta, normBySize));
return Ptr<BackendNode>(new VkComBackendNode(inputs, op));
#endif
return Ptr<BackendNode>();
}
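`supportBackend()` above restricts the Vulkan LRN to across-channel normalization with an odd window, and `initVkCom()` hands `vkcom::OpLRN` the radius `size / 2`. A scalar sketch of that computation for a single element (illustrative only; `normBySize` mirrors the layer option of the same name):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Across-channel LRN at channel c: divide by
// (bias + scale * sum of squares over [c-radius, c+radius]) ^ beta,
// where radius = size / 2 and scale is alpha, optionally divided by size.
float lrnAt(const std::vector<float>& chans, int c, int size,
            float alpha, float beta, float bias, bool normBySize) {
    int radius = size / 2;
    float scale = normBySize ? alpha / size : alpha;
    float sum = 0.f;
    int lo = std::max(0, c - radius);
    int hi = std::min((int)chans.size() - 1, c + radius);
    for (int i = lo; i <= hi; ++i)
        sum += chans[i] * chans[i];
    return chans[c] / std::pow(bias + scale * sum, beta);
}
```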
virtual Ptr<BackendNode> initHalide(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_HALIDE
......
......@@ -43,6 +43,7 @@
#include "../precomp.hpp"
#include "layers_common.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include <float.h>
#include <algorithm>
......@@ -105,7 +106,8 @@ public:
virtual bool supportBackend(int backendId) CV_OVERRIDE
{
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine();
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine() ||
backendId == DNN_BACKEND_VKCOM && haveVulkan();
}
bool getMemoryShapes(const std::vector<MatShape> &inputs,
......@@ -370,6 +372,16 @@ public:
}
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &input) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
CV_Assert(!_order.empty());
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpPermute(_order));
return Ptr<BackendNode>(new VkComBackendNode(input, op));
#endif // HAVE_VULKAN
return Ptr<BackendNode>();
}
virtual Ptr<BackendNode> initInfEngine(const std::vector<Ptr<BackendWrapper> >&) CV_OVERRIDE
{
#ifdef HAVE_INF_ENGINE
......
......@@ -45,6 +45,7 @@
#include "opencv2/core/hal/intrin.hpp"
#include "../op_halide.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include <float.h>
#include <algorithm>
using std::max;
......@@ -155,7 +156,9 @@ public:
else
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_HALIDE && haveHalide() &&
(type == MAX || type == AVE && !pad_t && !pad_l && !pad_b && !pad_r);
(type == MAX || type == AVE && !pad_t && !pad_l && !pad_b && !pad_r) ||
backendId == DNN_BACKEND_VKCOM && haveVulkan() &&
(type == MAX || type == AVE);
}
#ifdef HAVE_OPENCL
......@@ -246,6 +249,41 @@ public:
}
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
int padding_mode;
vkcom::PoolType pool_type;
int filter_size[2] = {kernel.height, kernel.width};
int pad_size[2] = {pad.height, pad.width};
int stride_size[2] = {stride.height, stride.width};
pool_type = type == MAX ? vkcom::kPoolTypeMax :
(type == AVE ? vkcom::kPoolTypeAvg : vkcom::kPoolTypeNum);
if (padMode.empty())
{
padding_mode = vkcom::kPaddingModeCaffe;
}
else if (padMode == "VALID")
{
padding_mode = vkcom::kPaddingModeValid;
}
else if (padMode == "SAME")
{
padding_mode = vkcom::kPaddingModeSame;
}
else
CV_Error(Error::StsError, "Unsupported padding mode " + padMode);
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpPool(filter_size, pad_size,
stride_size, padding_mode,
pool_type, avePoolPaddedArea));
return Ptr<BackendNode>(new VkComBackendNode(inputs, op));
#endif
return Ptr<BackendNode>();
}
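The pooling hunk above maps the layer's `padMode` string onto vkcom's three padding modes. How each mode determines the output extent along one spatial axis can be sketched as follows (`kPaddingModeCaffe` corresponds to explicit padding with ceil rounding; these names are illustrative, not vkcom's API):

```cpp
enum PadMode { PadCaffe, PadValid, PadSame };

// Output extent along one spatial axis for each padding mode.
// VALID: no padding, floor rounding. SAME: the output covers the input
// at the given stride. Caffe: explicit pad with ceil rounding.
int poolOutSize(int in, int filter, int stride, int pad, PadMode mode) {
    switch (mode) {
    case PadValid: return (in - filter) / stride + 1;
    case PadSame:  return (in + stride - 1) / stride;
    default:       return (in + 2 * pad - filter + stride - 1) / stride + 1;
    }
}
```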
virtual Ptr<BackendNode> initHalide(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
if (type == MAX)
......
......@@ -43,6 +43,7 @@
#include "../precomp.hpp"
#include "layers_common.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include <float.h>
#include <algorithm>
#include <cmath>
......@@ -271,7 +272,8 @@ public:
virtual bool supportBackend(int backendId) CV_OVERRIDE
{
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine();
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine() ||
backendId == DNN_BACKEND_VKCOM && haveVulkan();
}
bool getMemoryShapes(const std::vector<MatShape> &inputs,
......@@ -480,6 +482,19 @@ public:
}
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &input) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpPriorBox(_stepX, _stepY,
_clip, _numPriors,
_variance, _offsetsX,
_offsetsY, _boxWidths,
_boxHeights));
return Ptr<BackendNode>(new VkComBackendNode(input, op));
#endif // HAVE_VULKAN
return Ptr<BackendNode>();
}
virtual Ptr<BackendNode> initInfEngine(const std::vector<Ptr<BackendWrapper> >&) CV_OVERRIDE
{
#ifdef HAVE_INF_ENGINE
......
......@@ -44,6 +44,7 @@
#include "layers_common.hpp"
#include "../op_halide.hpp"
#include "../op_inf_engine.hpp"
#include "../op_vkcom.hpp"
#include <algorithm>
#include <stdlib.h>
using std::max;
......@@ -90,7 +91,8 @@ public:
{
return backendId == DNN_BACKEND_OPENCV ||
backendId == DNN_BACKEND_HALIDE && haveHalide() && axisRaw == 1 ||
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine() && !logSoftMax;
backendId == DNN_BACKEND_INFERENCE_ENGINE && haveInfEngine() && !logSoftMax ||
backendId == DNN_BACKEND_VKCOM && haveVulkan();
}
#ifdef HAVE_OPENCL
......@@ -284,6 +286,18 @@ public:
}
}
virtual Ptr<BackendNode> initVkCom(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_VULKAN
vkcom::Tensor in = VkComTensor(inputs[0]);
int cAxis = clamp(axisRaw, in.dimNum());
std::shared_ptr<vkcom::OpBase> op(new vkcom::OpSoftmax(cAxis, logSoftMax));
return Ptr<BackendNode>(new VkComBackendNode(inputs, op));
#endif // HAVE_VULKAN
return Ptr<BackendNode>();
}
virtual Ptr<BackendNode> initHalide(const std::vector<Ptr<BackendWrapper> > &inputs) CV_OVERRIDE
{
#ifdef HAVE_HALIDE
......
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#include "precomp.hpp"
#include <opencv2/dnn/shape_utils.hpp>
#include "op_vkcom.hpp"
namespace cv
{
namespace dnn
{
#ifdef HAVE_VULKAN
void copyToTensor(vkcom::Tensor &dst, const Mat &src)
{
CV_Assert(src.isContinuous() && src.type() == CV_32F);
std::vector<int> mat_shape = shape(src);
dst.reshape((const char*)src.data, mat_shape);
}
void copyToMat(Mat &dst, vkcom::Tensor &src)
{
CV_Assert(dst.type() == CV_32F);
std::vector<int> shape = src.getShape();
void *data = src.map();
Mat tmp(shape, CV_32F, data);
tmp.copyTo(dst);
src.unMap();
}
vkcom::Tensor VkComTensor(const Ptr<BackendWrapper>& ptr)
{
CV_Assert(!ptr.empty());
return ptr.dynamicCast<VkComBackendWrapper>()->getTensor();
}
void setDirty(std::vector<Ptr<BackendWrapper> >& ptrs)
{
for (const Ptr<BackendWrapper>& ptr : ptrs)
{
ptr.dynamicCast<VkComBackendWrapper>()->setDeviceDirty();
}
}
std::vector<vkcom::Tensor> VkComTensors(const std::vector<Ptr<BackendWrapper> >& ptrs)
{
std::vector<vkcom::Tensor> vec;
vec.reserve(ptrs.size());
for (const Ptr<BackendWrapper>& ptr : ptrs)
{
vec.push_back(VkComTensor(ptr));
}
return vec;
}
VkComBackendNode::VkComBackendNode(const std::vector<Ptr<BackendWrapper> >& inputsWrapper,
const std::shared_ptr<vkcom::OpBase>& op,
const std::vector<Ptr<BackendWrapper> >& blobsWrapper)
: BackendNode(DNN_BACKEND_VKCOM)
{
operation = op;
inputsWrapper_ = inputsWrapper;
ins = VkComTensors(inputsWrapper_);
if (!blobsWrapper.empty())
{
blobs = VkComTensors(blobsWrapper);
}
}
bool VkComBackendNode::forward(std::vector<vkcom::Tensor>& outs)
{
for (int i = 0, n = inputsWrapper_.size(); i < n; ++i)
{
inputsWrapper_[i].dynamicCast<VkComBackendWrapper>()->copyToDevice();
}
return operation->forward(ins, blobs, outs);
}
VkComBackendWrapper::VkComBackendWrapper(Mat& m) : BackendWrapper(DNN_BACKEND_VKCOM, DNN_TARGET_VULKAN)
{
copyToTensor(tensor, m);
host = &m;
hostDirty = false;
deviceDirty = false;
}
VkComBackendWrapper::VkComBackendWrapper(const Ptr<BackendWrapper>& baseBuffer, Mat& m)
: BackendWrapper(DNN_BACKEND_VKCOM, DNN_TARGET_VULKAN)
{
Ptr<VkComBackendWrapper> base = baseBuffer.dynamicCast<VkComBackendWrapper>();
CV_Assert(!base.empty());
host = &m;
tensor = base->tensor;
CV_Assert(tensor.count() >= m.total());
tensor.reshape(0, shape(m));
hostDirty = false;
deviceDirty = false;
}
void VkComBackendWrapper::copyToHost()
{
if (deviceDirty)
copyToMat(*host, tensor);
}
void VkComBackendWrapper::setHostDirty()
{
hostDirty = true;
}
void VkComBackendWrapper::setDeviceDirty()
{
deviceDirty = true;
}
void VkComBackendWrapper::copyToDevice()
{
if (hostDirty)
{
copyToTensor(tensor, *host);
hostDirty = false;
}
}
vkcom::Tensor VkComBackendWrapper::getTensor()
{
return tensor;
}
#endif
void forwardVkCom(std::vector<Ptr<BackendWrapper> > &outputs,
const Ptr<BackendNode>& node)
{
#ifdef HAVE_VULKAN
CV_Assert(!node.empty());
Ptr<VkComBackendNode> node_ = node.dynamicCast<VkComBackendNode>();
std::vector<vkcom::Tensor> outs = VkComTensors(outputs);
node_->forward(outs);
setDirty(outputs);
#endif
}
bool haveVulkan()
{
#ifdef HAVE_VULKAN
return vkcom::isAvailable();
#else
return false;
#endif // HAVE_VULKAN
}
} // namespace dnn
} // namespace cv
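`VkComBackendWrapper` above synchronizes the host `Mat` and the device `Tensor` lazily through a pair of dirty flags: `copyToDevice()` copies only when the host side changed, and `copyToHost()` only when a kernel wrote the device side. The protocol, modeled with a plain vector standing in for the device buffer (an illustrative sketch, not the real wrapper):

```cpp
#include <vector>

// Lazy two-way sync: each direction copies only when the source side is
// marked dirty, mirroring hostDirty/deviceDirty in VkComBackendWrapper.
struct SyncBuffer {
    std::vector<float> host, device;
    bool hostDirty = false, deviceDirty = false;

    void writeHost(const std::vector<float>& v) { host = v; hostDirty = true; }
    void copyToDevice() {                     // before a kernel reads inputs
        if (hostDirty) { device = host; hostDirty = false; }
    }
    void markDeviceWritten() { deviceDirty = true; }  // kernel wrote output
    void copyToHost() {                       // before the CPU reads results
        if (deviceDirty) { host = device; deviceDirty = false; }
    }
};
```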
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_OP_VKCOM_HPP
#define OPENCV_DNN_OP_VKCOM_HPP
#include <opencv2/dnn/shape_utils.hpp>
#ifdef HAVE_VULKAN
#include "vkcom/include/vkcom.hpp"
#endif // HAVE_VULKAN
namespace cv
{
namespace dnn
{
#ifdef HAVE_VULKAN
std::vector<vkcom::Tensor> VkComTensors(const std::vector<Ptr<BackendWrapper> >& ptrs);
vkcom::Tensor VkComTensor(const Ptr<BackendWrapper>& ptr);
// Copy data between Mat and Tensor, reshaping dst if needed so
// that it matches the shape of src.
void copyToTensor(vkcom::Tensor &dst, const Mat &src);
void copyToMat(Mat &dst, vkcom::Tensor &src); // non-const: copying maps/unmaps the tensor
class VkComBackendNode : public BackendNode
{
public:
VkComBackendNode(const std::vector<Ptr<BackendWrapper> >& inputsWrapper,
const std::shared_ptr<vkcom::OpBase> &op,
const std::vector<Ptr<BackendWrapper> >& blobsWrapper =
std::vector<Ptr<BackendWrapper> >());
bool forward(std::vector<vkcom::Tensor>& outs);
private:
std::vector<vkcom::Tensor> ins;
std::vector<vkcom::Tensor> blobs;
std::vector<Ptr<BackendWrapper> > inputsWrapper_;
std::shared_ptr<vkcom::OpBase> operation;
};
class VkComBackendWrapper : public BackendWrapper
{
public:
VkComBackendWrapper(Mat& m);
VkComBackendWrapper(const Ptr<BackendWrapper>& baseBuffer, Mat& m);
virtual void copyToHost() CV_OVERRIDE;
virtual void setHostDirty() CV_OVERRIDE;
void setDeviceDirty();
void copyToDevice();
vkcom::Tensor getTensor();
private:
vkcom::Tensor tensor;
Mat* host;
bool hostDirty;
bool deviceDirty;
};
#endif // HAVE_VULKAN
void forwardVkCom(std::vector<Ptr<BackendWrapper> > &outputs, const Ptr<BackendNode>& node);
bool haveVulkan();
} // namespace dnn
} // namespace cv
#endif // OPENCV_DNN_OP_VKCOM_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_BUFFER_HPP
#define OPENCV_DNN_VKCOM_BUFFER_HPP
#ifdef HAVE_VULKAN
#include <vulkan/vulkan.h>
#endif // HAVE_VULKAN
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
class Buffer
{
public:
Buffer(VkDevice& device)
: device_(device), buffer_(VK_NULL_HANDLE), memory_(VK_NULL_HANDLE) {}
Buffer(VkDevice& device, size_t size_in_bytes, const char* data);
~Buffer();
VkDeviceMemory getVkMemory() { return memory_; }
VkBuffer getVkBuffer() { return buffer_; }
private:
Buffer();
bool init(size_t size_in_bytes, const char* data);
VkDevice device_;
VkBuffer buffer_;
VkDeviceMemory memory_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_BUFFER_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_BASE_HPP
#define OPENCV_DNN_VKCOM_OP_BASE_HPP
#include "../../precomp.hpp"
#include "vkcom.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
// Forward declaration
class Context;
class OpBase
{
public:
OpBase();
virtual ~OpBase();
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) = 0;
protected:
void initVulkanThing(int buffer_num);
void createDescriptorSetLayout(int buffer_num);
void createDescriptorSet(int buffer_num);
void createShaderModule(const uint32_t* spv, size_t sz, const std::string& source = std::string());
void createPipeline(size_t push_constants_size = 0);
void createCommandBuffer();
void recordCommandBuffer(void* push_constants = NULL, size_t push_constants_size = 0);
void runCommandBuffer();
const Context* ctx_;
VkPipeline pipeline_;
VkCommandBuffer cmd_buffer_;
VkDescriptorPool descriptor_pool_;
VkDescriptorSet descriptor_set_;
VkDevice device_;
VkDescriptorSetLayout descriptor_set_layout_;
VkPipelineLayout pipeline_layout_;
VkShaderModule module_;
int group_x_;
int group_y_;
int group_z_;
std::string type_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_BASE_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_CONCAT_HPP
#define OPENCV_DNN_VKCOM_OP_CONCAT_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
struct ConcatShaderConfig
{
int local_size_x;
int local_size_y;
int local_size_z;
int block_height;
int block_width;
int block_depth;
};
class OpConcat: public OpBase
{
public:
OpConcat(const int axis);
bool forward(std::vector<Tensor>& ins, Tensor& out);
void reshapeOutTensor(std::vector<Tensor *>& in, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool init(const int axis);
bool computeGroupCount();
ConcatShaderConfig config_;
int axis_;
int out_concat_axis_;
int accumulated_concat_axis_;
int concat_size_;
int total_concat_size_;
int thread_num_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_CONCAT_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_CONV_HPP
#define OPENCV_DNN_VKCOM_OP_CONV_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
enum ConvShaderType
{
kConvShaderTypeBasic = 0,
kConvShaderTypeIDLF = 1,
kConvShaderTypeNum
};
struct ConvShaderConfig
{
int local_size_x;
int local_size_y;
int local_size_z;
int block_height;
int block_width;
int block_depth;
ConvShaderType shader_type;
};
class OpConv : public OpBase
{
public:
OpConv(const int out_channel, const bool has_bias,
const int* filter_size, const int* pad,
const int* stride, const int* dilation,
const int activation, const int group,
const int padding_mode);
void reshapeOutTensor(Tensor& in, Tensor& out);
bool forward(Tensor& in, Tensor& filter_weights, Tensor& bias, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool init(const int out_channel, const bool has_bias,
const int* filter_size, const int* pad,
const int* stride, const int* dilation,
const int activation, const int group,
const int padding_mode);
bool computeGroupCount();
int batch_;
int in_height_;
int in_width_;
int in_channel_;
int out_height_;
int out_width_;
int out_channel_;
int filter_height_;
int filter_width_;
int stride_height_;
int stride_width_;
int padding_top_;
int padding_left_;
int dilation_height_;
int dilation_width_;
int activation_;
PaddingMode padding_mode_;
int group_;
int has_bias_;
Tensor swizzled_weights;
ConvShaderConfig config_;
bool dwconv_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_CONV_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_LRN_HPP
#define OPENCV_DNN_VKCOM_OP_LRN_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
enum LRNShaderType
{
kLRNShaderTypeBasic = 0,
kLRNShaderTypeNum
};
struct LRNShaderConfig
{
int local_size_x;
int local_size_y;
int local_size_z;
int block_height;
int block_width;
int block_depth;
LRNShaderType shader_type;
};
class OpLRN : public OpBase
{
public:
OpLRN(const int radius, const float bias,
const float alpha, const float beta,
const bool norm_by_size);
void reshapeOutTensor(Tensor& in, Tensor& out);
bool forward(Tensor& in, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool init(const int radius, const float bias,
const float alpha, const float beta,
const bool norm_by_size);
bool computeGroupCount();
int batch_;
int height_;
int width_;
int channels_;
int radius_;
float bias_;
float alpha_;
float beta_;
int filter_len_;
int thread_num_;
bool norm_by_size_;
LRNShaderConfig config_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_LRN_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_PERMUTE_HPP
#define OPENCV_DNN_VKCOM_OP_PERMUTE_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
class OpPermute: public OpBase
{
public:
OpPermute(std::vector<size_t>& order);
bool forward(std::vector<Tensor>& ins, std::vector<Tensor>& outs);
void reshapeOutTensor(std::vector<Tensor *>& in, std::vector<Tensor>& outs);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
void prepareStrides(const Shape &shape_before, const Shape &shape_after);
bool computeGroupCount();
std::vector<int> order_;
bool need_permute_;
int global_size_;
int nthreads_;
int dims_;
Tensor tensor_order_;
Tensor tensor_old_stride_;
Tensor tensor_new_stride_;
std::vector<int> old_stride_;
std::vector<int> new_stride_;
Shape in_shape_;
Shape out_shape_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_PERMUTE_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_POOL_HPP
#define OPENCV_DNN_VKCOM_OP_POOL_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
enum PoolType { kPoolTypeAvg, kPoolTypeMax, kPoolTypeNum };
struct PoolShaderConfig
{
int local_size_x;
int local_size_y;
int local_size_z;
int block_height;
int block_width;
int block_depth;
};
class OpPool: public OpBase
{
public:
OpPool(const int* filter_size, const int* pad, const int* stride,
const int padding_mode, const PoolType pool_type,
const bool avg_pool_padded_area);
bool forward(Tensor& in, Tensor& out, Tensor& mask);
void reshapeOutTensor(Tensor& in, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool init(const int* filter_size, const int* pad, const int* stride,
const int padding_mode, const PoolType type, const bool avg_pool_padded_area);
bool computeGroupCount();
int batch_;
int channels_;
int in_height_;
int in_width_;
int out_height_;
int out_width_;
int filter_height_;
int filter_width_;
int stride_height_;
int stride_width_;
int padding_left_;
int padding_top_;
PoolType pool_type_;
int avg_pool_padded_area_;
int need_mask_;
PaddingMode padding_mode_;
int activation_;
PoolShaderConfig config_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_POOL_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_PRIOR_BOX_HPP
#define OPENCV_DNN_VKCOM_OP_PRIOR_BOX_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
class OpPriorBox: public OpBase
{
public:
OpPriorBox(float step_x,
float step_y,
bool clip,
int num_priors,
std::vector<float>& variance,
std::vector<float>& offsets_x,
std::vector<float>& offsets_y,
std::vector<float>& box_widths,
std::vector<float>& box_heights);
bool forward(std::vector<Tensor>& in, Tensor& out);
void reshapeOutTensor(std::vector<Tensor *>& in, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool computeGroupCount();
int global_size_;
int nthreads_;
float step_x_;
float step_y_;
bool clip_;
int num_priors_;
std::vector<float> variance_;
std::vector<float> offsets_x_;
std::vector<float> offsets_y_;
std::vector<float> box_widths_;
std::vector<float> box_heights_;
int img_h_;
int img_w_;
int in_h_;
int in_w_;
int out_channel_;
int out_channel_size_;
Tensor tensor_offsets_x_;
Tensor tensor_offsets_y_;
Tensor tensor_widths_;
Tensor tensor_heights_;
Tensor tensor_variance_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_PRIOR_BOX_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_RELU_HPP
#define OPENCV_DNN_VKCOM_OP_RELU_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
class OpReLU: public OpBase
{
public:
OpReLU(const float slope = 1.f);
bool forward(Tensor& in, Tensor& out);
void reshapeOutTensor(Tensor& in, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool computeGroupCount();
int total_;
float slope_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_RELU_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_OP_SOFTMAX_HPP
#define OPENCV_DNN_VKCOM_OP_SOFTMAX_HPP
#include "vkcom.hpp"
#include "op_base.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
struct SoftmaxShaderConfig
{
int local_size_x;
int local_size_y;
int local_size_z;
int block_height;
int block_width;
int block_depth;
};
class OpSoftmax: public OpBase
{
public:
OpSoftmax(const int axis, const bool log_softmax = false);
~OpSoftmax();
void reshapeOutTensor(Tensor& in, Tensor& out);
bool forward(Tensor& in, Tensor& out);
virtual bool forward(std::vector<Tensor>& ins,
std::vector<Tensor>& blobs,
std::vector<Tensor>& outs) CV_OVERRIDE;
private:
bool init(const int axis, const bool log_softmax);
bool computeGroupCount();
int axis_;
int channels_;
int channel_size_;
int outer_size_;
bool log_softmax_;
SoftmaxShaderConfig config_;
Tensor* max_tensor_;
Tensor* sum_tensor_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_OP_SOFTMAX_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_TENSOR_HPP
#define OPENCV_DNN_VKCOM_TENSOR_HPP
#ifdef HAVE_VULKAN
#include <vulkan/vulkan.h>
#endif
#include <memory>
#include "vkcom.hpp"
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
class Buffer;
class Tensor
{
public:
Tensor(Format fmt = kFormatFp32);
Tensor(const char* data, std::vector<int>& shape, Format fmt = kFormatFp32);
void* map();
void unMap();
Shape getShape() const;
int dimSize(const int dim) const;
int dimNum() const;
int count(const int start_axis = 0, const int end_axis = -1) const;
// Change the shape and format to those passed in.
// Copy data if data != NULL.
// Allocate a new internal buffer if the new size exceeds the old size or if the alloc flag is true.
Tensor reshape(const char* data, const std::vector<int>& shape, bool alloc = false, Format fmt = kFormatInvalid);
void setTo(float val);
int getFormat() const;
size_t size() const { return size_in_byte_; }
bool isEmpty() { return size_in_byte_ == 0; }
void copyTo(Tensor& dst);
std::shared_ptr<Buffer> getBuffer() { return buffer_; }
private:
VkDevice device_;
std::vector<int> shape_;
size_t size_in_byte_;
std::shared_ptr<Buffer> buffer_;
Format format_;
};
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#endif // OPENCV_DNN_VKCOM_TENSOR_HPP
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#ifndef OPENCV_DNN_VKCOM_HPP
#define OPENCV_DNN_VKCOM_HPP
#include <vector>
namespace cv { namespace dnn { namespace vkcom {
#ifdef HAVE_VULKAN
enum Format{
kFormatInvalid = -1,
kFormatFp16,
kFormatFp32,
kFormatFp64,
kFormatInt32,
kFormatNum
};
enum OpType {
kOpTypeConv,
kOpTypePool,
kOpTypeDWConv,
kOpTypeLRN,
kOpTypeConcat,
kOpTypeSoftmax,
kOpTypeReLU,
kOpTypePriorBox,
kOpTypePermute,
kOpTypeNum
};
enum PaddingMode { kPaddingModeSame, kPaddingModeValid, kPaddingModeCaffe, kPaddingModeNum };
enum FusedActivationType { kNone, kRelu, kRelu1, kRelu6, kActivationNum };
typedef std::vector<int> Shape;
/* context APIs */
bool initPerThread();
void deinitPerThread();
bool isAvailable();
#endif // HAVE_VULKAN
}}} // namespace cv::dnn::vkcom
#include "tensor.hpp"
#include "buffer.hpp"
#include "op_base.hpp"
#include "op_concat.hpp"
#include "op_conv.hpp"
#include "op_lrn.hpp"
#include "op_softmax.hpp"
#include "op_relu.hpp"
#include "op_pool.hpp"
#include "op_prior_box.hpp"
#include "op_permute.hpp"
#endif // OPENCV_DNN_VKCOM_HPP
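The context APIs above are probed before any layer is dispatched: per the commit description, when the dynamically loaded Vulkan runtime is unavailable the backend falls back to the CPU path. A minimal sketch of that guard pattern; the stub functions here are hypothetical stand-ins, only the `vkcom` entry points declared above are real:

```cpp
#include <string>

// Hypothetical stand-ins for the two execution paths; the real backend
// dispatches either to vkcom ops or to the existing CPU layer code.
static std::string runVulkanPath() { return "vulkan"; }
static std::string runCpuPath()    { return "cpu"; }

// Stub for vkcom::isAvailable(); the real function reports whether the
// dynamically loaded Vulkan runtime could be initialized.
static bool isAvailableStub(bool runtime_loaded) { return runtime_loaded; }

// Guard pattern: prefer the Vulkan backend, fall back to CPU otherwise.
std::string forwardWithFallback(bool runtime_loaded)
{
    if (isAvailableStub(runtime_loaded))
        return runVulkanPath();
    return runCpuPath();
}
```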
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int channels;
int in_h;
int in_w;
int out_h;
int out_w;
int padding_h;
int padding_w;
int filter_h;
int filter_w;
int stride_h;
int stride_w;
int total;
int padded_area;
} p;
layout(binding = 0) readonly buffer Input0{
float in_buffer[];
};
layout(binding = 1) writeonly buffer Output{
float out_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
int global_size = int(gl_WorkGroupSize.x * gl_NumWorkGroups.x);
int gid = int(gl_GlobalInvocationID.x);
for (int index = gid; index < p.total; index += global_size)
{
const int pw = index % p.out_w;
const int ph = (index / p.out_w) % p.out_h;
const int c = (index / p.out_w / p.out_h) % p.channels;
const int n = index / p.out_w / p.out_h / p.channels;
int hstart = ph * p.stride_h - p.padding_h;
int wstart = pw * p.stride_w - p.padding_w;
int hend = min(hstart + p.filter_h, p.in_h + p.padding_h);
int wend = min(wstart + p.filter_w, p.in_w + p.padding_w);
int pool_size;
if (p.padded_area == 1)
{
pool_size = (hend - hstart) * (wend - wstart);
hstart = max(hstart, 0);
wstart = max(wstart, 0);
hend = min(hend, p.in_h);
wend = min(wend, p.in_w);
}
else
{
hstart = max(hstart, 0);
wstart = max(wstart, 0);
hend = min(hend, p.in_h);
wend = min(wend, p.in_w);
pool_size = (hend - hstart) * (wend - wstart);
}
float aveval = 0;
int off = (n * p.channels + c) * p.in_h * p.in_w;
for (int h = hstart; h < hend; ++h) {
for (int w = wstart; w < wend; ++w) {
aveval += in_buffer[off + h * p.in_w + w];
}
}
out_buffer[index] = aveval / pool_size;
}
}
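The `padded_area` branch in the shader above controls whether padding cells count toward the averaging divisor: when set, `pool_size` is taken from the window clipped to the padded extent before it is clipped to the input. A CPU reference of the same index math for a single H×W plane, assuming the usual output-size formula (the function name and signature are illustrative, not part of the backend):

```cpp
#include <algorithm>
#include <vector>

// Average pooling over one HxW plane, mirroring the shader's divisor logic:
// if count_padded_area is true, the divisor includes padding cells (window
// clamped to the padded extent) before the window is clipped to the input.
std::vector<float> avgPool2D(const std::vector<float>& in, int in_h, int in_w,
                             int filter_h, int filter_w, int stride_h, int stride_w,
                             int pad_h, int pad_w, bool count_padded_area)
{
    int out_h = (in_h + 2 * pad_h - filter_h) / stride_h + 1;
    int out_w = (in_w + 2 * pad_w - filter_w) / stride_w + 1;
    std::vector<float> out(out_h * out_w);
    for (int ph = 0; ph < out_h; ++ph)
        for (int pw = 0; pw < out_w; ++pw)
        {
            int hstart = ph * stride_h - pad_h, wstart = pw * stride_w - pad_w;
            int hend = std::min(hstart + filter_h, in_h + pad_h);
            int wend = std::min(wstart + filter_w, in_w + pad_w);
            int pool_size = (hend - hstart) * (wend - wstart); // includes padding
            hstart = std::max(hstart, 0); wstart = std::max(wstart, 0);
            hend = std::min(hend, in_h); wend = std::min(wend, in_w);
            if (!count_padded_area)
                pool_size = (hend - hstart) * (wend - wstart); // valid cells only
            float sum = 0.f;
            for (int h = hstart; h < hend; ++h)
                for (int w = wstart; w < wend; ++w)
                    sum += in[h * in_w + w];
            out[ph * out_w + pw] = sum / pool_size;
        }
    return out;
}
```

With a 2×2 input of ones, a 2×2 filter, stride 1, and padding 1, the corner outputs average over one valid cell (1.0) when padding is excluded, but divide by the full window of 4 (0.25) when it is included.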
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int out_concat_axis;
int accumulated_concat_axis;
int concat_size;
int total_concat_size;
int thread_num;
} p;
layout(binding = 0) readonly buffer Input0{
float data[];
} src;
layout(binding = 1) writeonly buffer Output{
float data[];
} dst;
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
int index = int(gl_GlobalInvocationID.x);
if (index < p.thread_num)
{
int concat_num = index / p.total_concat_size;
int concat_index = index % p.total_concat_size;
int out_index = concat_index + (concat_num * p.out_concat_axis + p.accumulated_concat_axis) * p.concat_size;
dst.data[out_index] = src.data[index];
}
}
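The `out_index` expression in the concat shader scatters each input element into its slice of the concatenated output: `concat_size` is the product of the dimensions after the concat axis, `total_concat_size` is the input's axis size times `concat_size`, and `accumulated_concat_axis` is this input's running offset along the axis. A CPU sketch of the same mapping (parameter names mirror the push constants; the test data is illustrative):

```cpp
#include <vector>

// Scatter one input tensor into the output of a concat along some axis,
// using the same index arithmetic as the shader.
void concatCopy(const std::vector<float>& src, std::vector<float>& dst,
                int out_concat_axis,          // output size along the concat axis
                int accumulated_concat_axis,  // offset of this input along the axis
                int concat_size,              // product of dims after the axis
                int total_concat_size)        // input axis size * concat_size
{
    for (int index = 0; index < (int)src.size(); ++index)
    {
        int concat_num = index / total_concat_size;
        int concat_index = index % total_concat_size;
        int out_index = concat_index +
            (concat_num * out_concat_axis + accumulated_concat_axis) * concat_size;
        dst[out_index] = src[index];
    }
}
```

For example, concatenating a (1, 2, 3) tensor and a (1, 1, 3) tensor along the channel axis gives a (1, 3, 3) output: the first copy lands at offsets 0..5 and the second at 6..8.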
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#include "../../precomp.hpp"
namespace cv { namespace dnn { namespace vkcom {
extern const unsigned int concat_spv[541] = {
0x07230203,0x00010000,0x00080001,0x0000004b,0x00000000,0x00020011,0x00000001,0x0006000b,
0x00000001,0x4c534c47,0x6474732e,0x3035342e,0x00000000,0x0003000e,0x00000000,0x00000001,
0x0006000f,0x00000005,0x00000004,0x6e69616d,0x00000000,0x0000000c,0x00060010,0x00000004,
0x00000011,0x00000100,0x00000001,0x00000001,0x00030003,0x00000002,0x000001c2,0x00040005,
0x00000004,0x6e69616d,0x00000000,0x00040005,0x00000008,0x65646e69,0x00000078,0x00080005,
0x0000000c,0x475f6c67,0x61626f6c,0x766e496c,0x7461636f,0x496e6f69,0x00000044,0x00050005,
0x00000013,0x68737570,0x636f6c42,0x0000006b,0x00070006,0x00000013,0x00000000,0x5f74756f,
0x636e6f63,0x615f7461,0x00736978,0x00090006,0x00000013,0x00000001,0x75636361,0x616c756d,
0x5f646574,0x636e6f63,0x615f7461,0x00736978,0x00060006,0x00000013,0x00000002,0x636e6f63,
0x735f7461,0x00657a69,0x00080006,0x00000013,0x00000003,0x61746f74,0x6f635f6c,0x7461636e,
0x7a69735f,0x00000065,0x00060006,0x00000013,0x00000004,0x65726874,0x6e5f6461,0x00006d75,
0x00030005,0x00000015,0x00000070,0x00050005,0x0000001e,0x636e6f63,0x6e5f7461,0x00006d75,
0x00060005,0x00000024,0x636e6f63,0x695f7461,0x7865646e,0x00000000,0x00050005,0x00000029,
0x5f74756f,0x65646e69,0x00000078,0x00040005,0x0000003b,0x7074754f,0x00007475,0x00050006,
0x0000003b,0x00000000,0x61746164,0x00000000,0x00030005,0x0000003d,0x00747364,0x00040005,
0x00000040,0x75706e49,0x00003074,0x00050006,0x00000040,0x00000000,0x61746164,0x00000000,
0x00030005,0x00000042,0x00637273,0x00040047,0x0000000c,0x0000000b,0x0000001c,0x00050048,
0x00000013,0x00000000,0x00000023,0x00000000,0x00050048,0x00000013,0x00000001,0x00000023,
0x00000004,0x00050048,0x00000013,0x00000002,0x00000023,0x00000008,0x00050048,0x00000013,
0x00000003,0x00000023,0x0000000c,0x00050048,0x00000013,0x00000004,0x00000023,0x00000010,
0x00030047,0x00000013,0x00000002,0x00040047,0x0000003a,0x00000006,0x00000004,0x00040048,
0x0000003b,0x00000000,0x00000019,0x00050048,0x0000003b,0x00000000,0x00000023,0x00000000,
0x00030047,0x0000003b,0x00000003,0x00040047,0x0000003d,0x00000022,0x00000000,0x00040047,
0x0000003d,0x00000021,0x00000001,0x00040047,0x0000003f,0x00000006,0x00000004,0x00040048,
0x00000040,0x00000000,0x00000018,0x00050048,0x00000040,0x00000000,0x00000023,0x00000000,
0x00030047,0x00000040,0x00000003,0x00040047,0x00000042,0x00000022,0x00000000,0x00040047,
0x00000042,0x00000021,0x00000000,0x00040047,0x0000004a,0x0000000b,0x00000019,0x00020013,
0x00000002,0x00030021,0x00000003,0x00000002,0x00040015,0x00000006,0x00000020,0x00000001,
0x00040020,0x00000007,0x00000007,0x00000006,0x00040015,0x00000009,0x00000020,0x00000000,
0x00040017,0x0000000a,0x00000009,0x00000003,0x00040020,0x0000000b,0x00000001,0x0000000a,
0x0004003b,0x0000000b,0x0000000c,0x00000001,0x0004002b,0x00000009,0x0000000d,0x00000000,
0x00040020,0x0000000e,0x00000001,0x00000009,0x0007001e,0x00000013,0x00000006,0x00000006,
0x00000006,0x00000006,0x00000006,0x00040020,0x00000014,0x00000009,0x00000013,0x0004003b,
0x00000014,0x00000015,0x00000009,0x0004002b,0x00000006,0x00000016,0x00000004,0x00040020,
0x00000017,0x00000009,0x00000006,0x00020014,0x0000001a,0x0004002b,0x00000006,0x00000020,
0x00000003,0x0004002b,0x00000006,0x0000002c,0x00000000,0x0004002b,0x00000006,0x00000030,
0x00000001,0x0004002b,0x00000006,0x00000034,0x00000002,0x00030016,0x00000039,0x00000020,
0x0003001d,0x0000003a,0x00000039,0x0003001e,0x0000003b,0x0000003a,0x00040020,0x0000003c,
0x00000002,0x0000003b,0x0004003b,0x0000003c,0x0000003d,0x00000002,0x0003001d,0x0000003f,
0x00000039,0x0003001e,0x00000040,0x0000003f,0x00040020,0x00000041,0x00000002,0x00000040,
0x0004003b,0x00000041,0x00000042,0x00000002,0x00040020,0x00000044,0x00000002,0x00000039,
0x0004002b,0x00000009,0x00000048,0x00000100,0x0004002b,0x00000009,0x00000049,0x00000001,
0x0006002c,0x0000000a,0x0000004a,0x00000048,0x00000049,0x00000049,0x00050036,0x00000002,
0x00000004,0x00000000,0x00000003,0x000200f8,0x00000005,0x0004003b,0x00000007,0x00000008,
0x00000007,0x0004003b,0x00000007,0x0000001e,0x00000007,0x0004003b,0x00000007,0x00000024,
0x00000007,0x0004003b,0x00000007,0x00000029,0x00000007,0x00050041,0x0000000e,0x0000000f,
0x0000000c,0x0000000d,0x0004003d,0x00000009,0x00000010,0x0000000f,0x0004007c,0x00000006,
0x00000011,0x00000010,0x0003003e,0x00000008,0x00000011,0x0004003d,0x00000006,0x00000012,
0x00000008,0x00050041,0x00000017,0x00000018,0x00000015,0x00000016,0x0004003d,0x00000006,
0x00000019,0x00000018,0x000500b1,0x0000001a,0x0000001b,0x00000012,0x00000019,0x000300f7,
0x0000001d,0x00000000,0x000400fa,0x0000001b,0x0000001c,0x0000001d,0x000200f8,0x0000001c,
0x0004003d,0x00000006,0x0000001f,0x00000008,0x00050041,0x00000017,0x00000021,0x00000015,
0x00000020,0x0004003d,0x00000006,0x00000022,0x00000021,0x00050087,0x00000006,0x00000023,
0x0000001f,0x00000022,0x0003003e,0x0000001e,0x00000023,0x0004003d,0x00000006,0x00000025,
0x00000008,0x00050041,0x00000017,0x00000026,0x00000015,0x00000020,0x0004003d,0x00000006,
0x00000027,0x00000026,0x0005008b,0x00000006,0x00000028,0x00000025,0x00000027,0x0003003e,
0x00000024,0x00000028,0x0004003d,0x00000006,0x0000002a,0x00000024,0x0004003d,0x00000006,
0x0000002b,0x0000001e,0x00050041,0x00000017,0x0000002d,0x00000015,0x0000002c,0x0004003d,
0x00000006,0x0000002e,0x0000002d,0x00050084,0x00000006,0x0000002f,0x0000002b,0x0000002e,
0x00050041,0x00000017,0x00000031,0x00000015,0x00000030,0x0004003d,0x00000006,0x00000032,
0x00000031,0x00050080,0x00000006,0x00000033,0x0000002f,0x00000032,0x00050041,0x00000017,
0x00000035,0x00000015,0x00000034,0x0004003d,0x00000006,0x00000036,0x00000035,0x00050084,
0x00000006,0x00000037,0x00000033,0x00000036,0x00050080,0x00000006,0x00000038,0x0000002a,
0x00000037,0x0003003e,0x00000029,0x00000038,0x0004003d,0x00000006,0x0000003e,0x00000029,
0x0004003d,0x00000006,0x00000043,0x00000008,0x00060041,0x00000044,0x00000045,0x00000042,
0x0000002c,0x00000043,0x0004003d,0x00000039,0x00000046,0x00000045,0x00060041,0x00000044,
0x00000047,0x0000003d,0x0000002c,0x0000003e,0x0003003e,0x00000047,0x00000046,0x000200f9,
0x0000001d,0x000200f8,0x0000001d,0x000100fd,0x00010038
};
}}} // namespace cv::dnn::vkcom
#version 450
#define LOCAL_SZ_X 256
layout(binding = 0) readonly buffer Input0{
float image_data[];
};
layout(binding = 1) readonly buffer Input1 {
float bias_data[];
};
layout(binding = 2) readonly buffer Input3{
float weight_data[];
};
layout(binding = 3) writeonly buffer Output{
float convolved_image_data[];
};
layout(push_constant) uniform pushBlock {
int in_h;
int in_w;
int out_h;
int out_w;
int stride_h;
int stride_w;
int pad_h;
int pad_w;
int filter_h;
int filter_w;
int dilation_h;
int dilation_w;
int channels;
int batch;
int has_bias;
int M;
int K;
int N;
} p;
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
int gx = int(gl_GlobalInvocationID.x);
int gy = int(gl_GlobalInvocationID.y);
int gz = int(gl_GlobalInvocationID.z);
if(gx < p.M && gy < p.N && gz < p.batch)
{
float sum = 0.0f;
int output_y = gx / p.out_w;
int output_x = gx % p.out_w;
int org_y = output_y * p.stride_h - p.pad_h;
int org_x = output_x * p.stride_w - p.pad_w;
int weight_off = gy * p.K;
int input_off = gz * p.in_h * p.in_w * p.channels + (org_y * p.in_w + org_x);
for(int c = 0; c < p.channels; c++)
{
for(int y = 0; y < p.filter_h; y++)
{
for(int x = 0; x < p.filter_w; x++)
{
if((org_y + y * p.dilation_h >= 0) && (org_y + y * p.dilation_h < p.in_h) && (org_x + x * p.dilation_w >= 0) && (org_x + x * p.dilation_w < p.in_w))
{
sum += image_data[input_off + x * p.dilation_w] * weight_data[weight_off + x];
}
}
input_off += p.in_w * p.dilation_h;
weight_off += p.filter_w;
}
input_off += p.in_h * p.in_w - p.in_w * p.filter_h * p.dilation_h;
}
int offset = gz * p.M * p.N + gx + gy * p.M;
if (p.has_bias == 1)
sum += bias_data[gy];
convolved_image_data[offset] = sum;
}
}
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int in_h;
int in_w;
int out_h;
int out_w;
int stride_h;
int stride_w;
int pad_h;
int pad_w;
int filter_h;
int filter_w;
int dilation_h;
int dilation_w;
int channels;
int batch;
int has_bias;
int M;
int K;
int N;
} p;
layout(binding = 0) readonly buffer Input0{
float in_buffer[];
};
layout(binding = 1) readonly buffer Input1 {
float bias_data[];
};
layout(binding = 2) readonly buffer Input3{
float weight_data[];
};
layout(binding = 3) writeonly buffer Output{
float out_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
/*
Each work item computes one output element, indexed by
(gx, gy, gz) = (out_x, out_y, channel): depthwise convolution
applies one filter plane per input channel.
*/
void main()
{
int gx = int(gl_GlobalInvocationID.x);
int gy = int(gl_GlobalInvocationID.y);
int gz = int(gl_GlobalInvocationID.z);
if(gx < p.out_w && gy < p.out_h && gz < p.channels)
{
float sum = 0.0f;
int org_y = gy * p.stride_h - p.pad_h;
int org_x = gx * p.stride_w - p.pad_w;
int weight_off = gz * p.filter_h * p.filter_w;
int input_off = gz * p.in_h * p.in_w + org_y * p.in_w + org_x;
for(int y = 0; y < p.filter_h; y++)
{
for(int x = 0; x < p.filter_w; x++)
{
if(org_y + y * p.dilation_h >= 0 && org_y + y * p.dilation_h < p.in_h && org_x + x * p.dilation_w >= 0 && org_x + x * p.dilation_w < p.in_w)
{
sum += in_buffer[input_off + x * p.dilation_w] * weight_data[weight_off + x];
}
}
weight_off += p.filter_w;
input_off += p.in_w * p.dilation_h;
}
int offset = gz * p.out_h * p.out_w + gy * p.out_w + gx;
if (p.has_bias == 1)
out_buffer[offset] = sum + bias_data[gz];
else
out_buffer[offset] = sum;
}
}
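The depthwise kernel above convolves each channel with its own single filter plane. A CPU reference of the same computation, with dilation and bias omitted for brevity (the function name and layout assumptions, CHW input and per-channel filters, are illustrative):

```cpp
#include <vector>

// Depthwise 2-D convolution: channel c of the output depends only on
// channel c of the input and filter plane c of the weights (CHW layout).
std::vector<float> depthwiseConv2D(const std::vector<float>& in, int channels,
                                   int in_h, int in_w,
                                   const std::vector<float>& weights, // channels*fh*fw
                                   int filter_h, int filter_w,
                                   int stride_h, int stride_w, int pad_h, int pad_w)
{
    int out_h = (in_h + 2 * pad_h - filter_h) / stride_h + 1;
    int out_w = (in_w + 2 * pad_w - filter_w) / stride_w + 1;
    std::vector<float> out(channels * out_h * out_w, 0.f);
    for (int c = 0; c < channels; ++c)
        for (int oy = 0; oy < out_h; ++oy)
            for (int ox = 0; ox < out_w; ++ox)
            {
                float sum = 0.f;
                for (int fy = 0; fy < filter_h; ++fy)
                    for (int fx = 0; fx < filter_w; ++fx)
                    {
                        int iy = oy * stride_h - pad_h + fy;
                        int ix = ox * stride_w - pad_w + fx;
                        if (iy >= 0 && iy < in_h && ix >= 0 && ix < in_w)
                            sum += in[(c * in_h + iy) * in_w + ix]
                                 * weights[(c * filter_h + fy) * filter_w + fx];
                    }
                out[(c * out_h + oy) * out_w + ox] = sum;
            }
    return out;
}
```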
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int thread_num;
int channels;
int height;
int width;
int filter_len;
int radius;
float alpha;
float bias;
float negative_beta;
} p;
layout(binding = 0) readonly buffer Input0{
float in_buffer[];
};
layout(binding = 1) writeonly buffer Output{
float dst_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
int gid = int(gl_GlobalInvocationID.x);
int gsz = int(gl_NumWorkGroups.x * gl_WorkGroupSize.x);
for (int index = gid; index < p.thread_num; index += gsz)
{
int x = index % p.width;
int y = (index / p.width) % p.height;
int b = index / (p.width * p.height);
int offset = b * p.channels * p.height * p.width + y * p.width + x;
int channel_off = p.height * p.width;
float scale_val;
int head = 0;
float accum_scale = 0.0f;
int min_val = p.radius < p.channels ? p.radius : p.channels;
while (head < min_val) {
accum_scale += in_buffer[offset + head * channel_off] * in_buffer[offset + head * channel_off];
++head;
}
while (head < p.channels) {
accum_scale += in_buffer[offset + head * channel_off] * in_buffer[offset + head * channel_off];
if (head - p.filter_len >= 0) {
accum_scale -= in_buffer[offset + (head - p.filter_len) * channel_off]
* in_buffer[offset + (head - p.filter_len) * channel_off];
}
scale_val = p.bias + accum_scale * p.alpha;
dst_buffer[offset + (head - p.radius) * channel_off] = in_buffer[offset + (head - p.radius) * channel_off] * pow(scale_val, p.negative_beta);
++head;
}
int pos = head - min_val;
while (pos >= 0 && pos < p.channels) {
if (head - p.filter_len >= 0) {
accum_scale -= in_buffer[offset + (head - p.filter_len) * channel_off]
* in_buffer[offset + (head - p.filter_len) * channel_off];
}
scale_val = p.bias + accum_scale * p.alpha;
dst_buffer[offset + pos * channel_off] = in_buffer[offset + pos * channel_off] * pow(scale_val, p.negative_beta);
++head;
++pos;
}
}
}
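The LRN shader keeps a running sum of squares so each channel's window is updated incrementally. The effect it computes, for across-channel normalization at one spatial position, is equivalent to this direct CPU reference (whether `alpha` is pre-normalized by the window size is decided on the host side and is not assumed here):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Across-channel LRN at one spatial position, matching the shader's
// sliding window of radius `radius` around each channel:
//   scale  = bias + alpha * sum_{k in [c-radius, c+radius], clipped} x[k]^2
//   out[c] = x[c] * pow(scale, negative_beta)    (negative_beta = -beta)
std::vector<float> lrnAcrossChannels(const std::vector<float>& x, int radius,
                                     float alpha, float bias, float negative_beta)
{
    int channels = (int)x.size();
    std::vector<float> out(channels);
    for (int c = 0; c < channels; ++c)
    {
        float accum = 0.f;
        int lo = std::max(0, c - radius), hi = std::min(channels - 1, c + radius);
        for (int k = lo; k <= hi; ++k)
            accum += x[k] * x[k];
        out[c] = x[c] * std::pow(bias + alpha * accum, negative_beta);
    }
    return out;
}
```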
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int channels;
int in_h;
int in_w;
int out_h;
int out_w;
int padding_h;
int padding_w;
int filter_h;
int filter_w;
int stride_h;
int stride_w;
int total;
int need_mask;
} p;
layout(binding = 0) readonly buffer Input0{
float in_buffer[];
};
layout(binding = 1) writeonly buffer Output{
float out_buffer[];
};
layout(binding = 2) writeonly buffer Mask{
float mask_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
int global_size = int(gl_WorkGroupSize.x * gl_NumWorkGroups.x);
int gid = int(gl_GlobalInvocationID.x);
for (int index = gid; index < p.total; index += global_size)
{
const int pw = index % p.out_w;
const int ph = (index / p.out_w) % p.out_h;
const int c = (index / p.out_w / p.out_h) % p.channels;
const int n = index / p.out_w / p.out_h / p.channels;
int hstart = ph * p.stride_h - p.padding_h;
int wstart = pw * p.stride_w - p.padding_w;
const int hend = min(hstart + p.filter_h, p.in_h);
const int wend = min(wstart + p.filter_w, p.in_w);
hstart = max(hstart, 0);
wstart = max(wstart, 0);
float maxval = -1./0.; // negative infinity
int maxidx = -1;
int off = (n * p.channels + c) * p.in_h * p.in_w;
for (int h = hstart; h < hend; ++h) {
for (int w = wstart; w < wend; ++w) {
if (in_buffer[off + h * p.in_w + w] > maxval) {
maxidx = h * p.in_w + w;
maxval = in_buffer[off + maxidx];
}
}
}
out_buffer[index] = maxval;
if (p.need_mask == 1)
mask_buffer[index] = maxidx;
}
}
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int nthreads;
int num_axes;
int global_size;
} p;
layout(binding = 0) readonly buffer Input0{
float in_buffer[];
};
layout(binding = 1) readonly buffer Input1{
int permute_order[];
};
layout(binding = 2) readonly buffer Input2{
int old_stride[];
};
layout(binding = 3) readonly buffer Input3{
int new_stride[];
};
layout(binding = 4) writeonly buffer Output{
float out_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
for (int i = int(gl_GlobalInvocationID.x); i < p.nthreads; i += p.global_size)
{
int old_pos = 0;
int new_pos = i;
for (int j = 0; j < p.num_axes; ++j)
{
int order = permute_order[j];
old_pos += (new_pos / new_stride[j]) * old_stride[order];
new_pos %= new_stride[j];
}
out_buffer[i] = in_buffer[old_pos];
}
}
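The permute shader's inner loop decomposes each output flat index with the new (output) strides and recomposes it with the old (input) strides of the permuted axes. A self-contained CPU version of the same stride walk, which also derives the strides the host would upload (the helper name is illustrative):

```cpp
#include <vector>

// Permute an N-D tensor: output axis j takes input axis order[j].
// Uses the same index arithmetic as the shader's main loop.
std::vector<float> permute(const std::vector<float>& in,
                           const std::vector<int>& shape,  // input shape
                           const std::vector<int>& order)
{
    int num_axes = (int)shape.size();
    std::vector<int> new_shape(num_axes);
    for (int j = 0; j < num_axes; ++j) new_shape[j] = shape[order[j]];

    // Row-major strides for the input (old) and output (new) layouts.
    std::vector<int> old_stride(num_axes), new_stride(num_axes);
    old_stride[num_axes - 1] = new_stride[num_axes - 1] = 1;
    for (int j = num_axes - 2; j >= 0; --j)
    {
        old_stride[j] = old_stride[j + 1] * shape[j + 1];
        new_stride[j] = new_stride[j + 1] * new_shape[j + 1];
    }

    std::vector<float> out(in.size());
    for (int i = 0; i < (int)in.size(); ++i)
    {
        int old_pos = 0, new_pos = i;
        for (int j = 0; j < num_axes; ++j)
        {
            old_pos += (new_pos / new_stride[j]) * old_stride[order[j]];
            new_pos %= new_stride[j];
        }
        out[i] = in[old_pos];
    }
    return out;
}
```

With `order = {1, 0}` this is a matrix transpose: a 2×3 input {0,1,2,3,4,5} becomes the 3×2 output {0,3,1,4,2,5}.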
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#include "../../precomp.hpp"
namespace cv { namespace dnn { namespace vkcom {
extern const unsigned int permute_spv[765] = {
0x07230203,0x00010000,0x00080001,0x00000069,0x00000000,0x00020011,0x00000001,0x0006000b,
0x00000001,0x4c534c47,0x6474732e,0x3035342e,0x00000000,0x0003000e,0x00000000,0x00000001,
0x0006000f,0x00000005,0x00000004,0x6e69616d,0x00000000,0x0000000c,0x00060010,0x00000004,
0x00000011,0x00000100,0x00000001,0x00000001,0x00030003,0x00000002,0x000001c2,0x00040005,
0x00000004,0x6e69616d,0x00000000,0x00030005,0x00000008,0x00000069,0x00080005,0x0000000c,
0x475f6c67,0x61626f6c,0x766e496c,0x7461636f,0x496e6f69,0x00000044,0x00050005,0x00000018,
0x68737570,0x636f6c42,0x0000006b,0x00060006,0x00000018,0x00000000,0x7268746e,0x73646165,
0x00000000,0x00060006,0x00000018,0x00000001,0x5f6d756e,0x73657861,0x00000000,0x00060006,
0x00000018,0x00000002,0x626f6c67,0x735f6c61,0x00657a69,0x00030005,0x0000001a,0x00000070,
0x00040005,0x00000021,0x5f646c6f,0x00736f70,0x00040005,0x00000022,0x5f77656e,0x00736f70,
0x00030005,0x00000024,0x0000006a,0x00040005,0x0000002f,0x6564726f,0x00000072,0x00040005,
0x00000031,0x75706e49,0x00003174,0x00070006,0x00000031,0x00000000,0x6d726570,0x5f657475,
0x6564726f,0x00000072,0x00030005,0x00000033,0x00000000,0x00040005,0x0000003a,0x75706e49,
0x00003374,0x00060006,0x0000003a,0x00000000,0x5f77656e,0x69727473,0x00006564,0x00030005,
0x0000003c,0x00000000,0x00040005,0x00000042,0x75706e49,0x00003274,0x00060006,0x00000042,
0x00000000,0x5f646c6f,0x69727473,0x00006564,0x00030005,0x00000044,0x00000000,0x00040005,
0x00000054,0x7074754f,0x00007475,0x00060006,0x00000054,0x00000000,0x5f74756f,0x66667562,
0x00007265,0x00030005,0x00000056,0x00000000,0x00040005,0x00000059,0x75706e49,0x00003074,
0x00060006,0x00000059,0x00000000,0x625f6e69,0x65666675,0x00000072,0x00030005,0x0000005b,
0x00000000,0x00040047,0x0000000c,0x0000000b,0x0000001c,0x00050048,0x00000018,0x00000000,
0x00000023,0x00000000,0x00050048,0x00000018,0x00000001,0x00000023,0x00000004,0x00050048,
0x00000018,0x00000002,0x00000023,0x00000008,0x00030047,0x00000018,0x00000002,0x00040047,
0x00000030,0x00000006,0x00000004,0x00040048,0x00000031,0x00000000,0x00000018,0x00050048,
0x00000031,0x00000000,0x00000023,0x00000000,0x00030047,0x00000031,0x00000003,0x00040047,
0x00000033,0x00000022,0x00000000,0x00040047,0x00000033,0x00000021,0x00000001,0x00040047,
0x00000039,0x00000006,0x00000004,0x00040048,0x0000003a,0x00000000,0x00000018,0x00050048,
0x0000003a,0x00000000,0x00000023,0x00000000,0x00030047,0x0000003a,0x00000003,0x00040047,
0x0000003c,0x00000022,0x00000000,0x00040047,0x0000003c,0x00000021,0x00000003,0x00040047,
0x00000041,0x00000006,0x00000004,0x00040048,0x00000042,0x00000000,0x00000018,0x00050048,
0x00000042,0x00000000,0x00000023,0x00000000,0x00030047,0x00000042,0x00000003,0x00040047,
0x00000044,0x00000022,0x00000000,0x00040047,0x00000044,0x00000021,0x00000002,0x00040047,
0x00000053,0x00000006,0x00000004,0x00040048,0x00000054,0x00000000,0x00000019,0x00050048,
0x00000054,0x00000000,0x00000023,0x00000000,0x00030047,0x00000054,0x00000003,0x00040047,
0x00000056,0x00000022,0x00000000,0x00040047,0x00000056,0x00000021,0x00000004,0x00040047,
0x00000058,0x00000006,0x00000004,0x00040048,0x00000059,0x00000000,0x00000018,0x00050048,
0x00000059,0x00000000,0x00000023,0x00000000,0x00030047,0x00000059,0x00000003,0x00040047,
0x0000005b,0x00000022,0x00000000,0x00040047,0x0000005b,0x00000021,0x00000000,0x00040047,
0x00000068,0x0000000b,0x00000019,0x00020013,0x00000002,0x00030021,0x00000003,0x00000002,
0x00040015,0x00000006,0x00000020,0x00000001,0x00040020,0x00000007,0x00000007,0x00000006,
0x00040015,0x00000009,0x00000020,0x00000000,0x00040017,0x0000000a,0x00000009,0x00000003,
0x00040020,0x0000000b,0x00000001,0x0000000a,0x0004003b,0x0000000b,0x0000000c,0x00000001,
0x0004002b,0x00000009,0x0000000d,0x00000000,0x00040020,0x0000000e,0x00000001,0x00000009,
0x0005001e,0x00000018,0x00000006,0x00000006,0x00000006,0x00040020,0x00000019,0x00000009,
0x00000018,0x0004003b,0x00000019,0x0000001a,0x00000009,0x0004002b,0x00000006,0x0000001b,
0x00000000,0x00040020,0x0000001c,0x00000009,0x00000006,0x00020014,0x0000001f,0x0004002b,
0x00000006,0x0000002b,0x00000001,0x0003001d,0x00000030,0x00000006,0x0003001e,0x00000031,
0x00000030,0x00040020,0x00000032,0x00000002,0x00000031,0x0004003b,0x00000032,0x00000033,
0x00000002,0x00040020,0x00000035,0x00000002,0x00000006,0x0003001d,0x00000039,0x00000006,
0x0003001e,0x0000003a,0x00000039,0x00040020,0x0000003b,0x00000002,0x0000003a,0x0004003b,
0x0000003b,0x0000003c,0x00000002,0x0003001d,0x00000041,0x00000006,0x0003001e,0x00000042,
0x00000041,0x00040020,0x00000043,0x00000002,0x00000042,0x0004003b,0x00000043,0x00000044,
0x00000002,0x00030016,0x00000052,0x00000020,0x0003001d,0x00000053,0x00000052,0x0003001e,
0x00000054,0x00000053,0x00040020,0x00000055,0x00000002,0x00000054,0x0004003b,0x00000055,
0x00000056,0x00000002,0x0003001d,0x00000058,0x00000052,0x0003001e,0x00000059,0x00000058,
0x00040020,0x0000005a,0x00000002,0x00000059,0x0004003b,0x0000005a,0x0000005b,0x00000002,
0x00040020,0x0000005d,0x00000002,0x00000052,0x0004002b,0x00000006,0x00000061,0x00000002,
0x0004002b,0x00000009,0x00000066,0x00000100,0x0004002b,0x00000009,0x00000067,0x00000001,
0x0006002c,0x0000000a,0x00000068,0x00000066,0x00000067,0x00000067,0x00050036,0x00000002,
0x00000004,0x00000000,0x00000003,0x000200f8,0x00000005,0x0004003b,0x00000007,0x00000008,
0x00000007,0x0004003b,0x00000007,0x00000021,0x00000007,0x0004003b,0x00000007,0x00000022,
0x00000007,0x0004003b,0x00000007,0x00000024,0x00000007,0x0004003b,0x00000007,0x0000002f,
0x00000007,0x00050041,0x0000000e,0x0000000f,0x0000000c,0x0000000d,0x0004003d,0x00000009,
0x00000010,0x0000000f,0x0004007c,0x00000006,0x00000011,0x00000010,0x0003003e,0x00000008,
0x00000011,0x000200f9,0x00000012,0x000200f8,0x00000012,0x000400f6,0x00000014,0x00000015,
0x00000000,0x000200f9,0x00000016,0x000200f8,0x00000016,0x0004003d,0x00000006,0x00000017,
0x00000008,0x00050041,0x0000001c,0x0000001d,0x0000001a,0x0000001b,0x0004003d,0x00000006,
0x0000001e,0x0000001d,0x000500b1,0x0000001f,0x00000020,0x00000017,0x0000001e,0x000400fa,
0x00000020,0x00000013,0x00000014,0x000200f8,0x00000013,0x0003003e,0x00000021,0x0000001b,
0x0004003d,0x00000006,0x00000023,0x00000008,0x0003003e,0x00000022,0x00000023,0x0003003e,
0x00000024,0x0000001b,0x000200f9,0x00000025,0x000200f8,0x00000025,0x000400f6,0x00000027,
0x00000028,0x00000000,0x000200f9,0x00000029,0x000200f8,0x00000029,0x0004003d,0x00000006,
0x0000002a,0x00000024,0x00050041,0x0000001c,0x0000002c,0x0000001a,0x0000002b,0x0004003d,
0x00000006,0x0000002d,0x0000002c,0x000500b1,0x0000001f,0x0000002e,0x0000002a,0x0000002d,
0x000400fa,0x0000002e,0x00000026,0x00000027,0x000200f8,0x00000026,0x0004003d,0x00000006,
0x00000034,0x00000024,0x00060041,0x00000035,0x00000036,0x00000033,0x0000001b,0x00000034,
0x0004003d,0x00000006,0x00000037,0x00000036,0x0003003e,0x0000002f,0x00000037,0x0004003d,
0x00000006,0x00000038,0x00000022,0x0004003d,0x00000006,0x0000003d,0x00000024,0x00060041,
0x00000035,0x0000003e,0x0000003c,0x0000001b,0x0000003d,0x0004003d,0x00000006,0x0000003f,
0x0000003e,0x00050087,0x00000006,0x00000040,0x00000038,0x0000003f,0x0004003d,0x00000006,
0x00000045,0x0000002f,0x00060041,0x00000035,0x00000046,0x00000044,0x0000001b,0x00000045,
0x0004003d,0x00000006,0x00000047,0x00000046,0x00050084,0x00000006,0x00000048,0x00000040,
0x00000047,0x0004003d,0x00000006,0x00000049,0x00000021,0x00050080,0x00000006,0x0000004a,
0x00000049,0x00000048,0x0003003e,0x00000021,0x0000004a,0x0004003d,0x00000006,0x0000004b,
0x00000024,0x00060041,0x00000035,0x0000004c,0x0000003c,0x0000001b,0x0000004b,0x0004003d,
0x00000006,0x0000004d,0x0000004c,0x0004003d,0x00000006,0x0000004e,0x00000022,0x0005008b,
0x00000006,0x0000004f,0x0000004e,0x0000004d,0x0003003e,0x00000022,0x0000004f,0x000200f9,
0x00000028,0x000200f8,0x00000028,0x0004003d,0x00000006,0x00000050,0x00000024,0x00050080,
0x00000006,0x00000051,0x00000050,0x0000002b,0x0003003e,0x00000024,0x00000051,0x000200f9,
0x00000025,0x000200f8,0x00000027,0x0004003d,0x00000006,0x00000057,0x00000008,0x0004003d,
0x00000006,0x0000005c,0x00000021,0x00060041,0x0000005d,0x0000005e,0x0000005b,0x0000001b,
0x0000005c,0x0004003d,0x00000052,0x0000005f,0x0000005e,0x00060041,0x0000005d,0x00000060,
0x00000056,0x0000001b,0x00000057,0x0003003e,0x00000060,0x0000005f,0x000200f9,0x00000015,
0x000200f8,0x00000015,0x00050041,0x0000001c,0x00000062,0x0000001a,0x00000061,0x0004003d,
0x00000006,0x00000063,0x00000062,0x0004003d,0x00000006,0x00000064,0x00000008,0x00050080,
0x00000006,0x00000065,0x00000064,0x00000063,0x0003003e,0x00000008,0x00000065,0x000200f9,
0x00000012,0x000200f8,0x00000014,0x000100fd,0x00010038
};
}}} // namespace cv::dnn::vkcom
#version 450
#define LOCAL_SZ_X 256
layout(push_constant) uniform pushBlock {
int global_size;
int nthreads;
float step_x;
float step_y;
int offset_x_size;
int width_size;
int layer_w;
int image_h;
int image_w;
int clip;
int variance_off;
} p;
layout(binding = 0) readonly buffer Input0{
float offset_x[];
};
layout(binding = 1) readonly buffer Input1{
float offset_y[];
};
layout(binding = 2) readonly buffer Input2{
float widths[];
};
layout(binding = 3) readonly buffer Input3{
float heights[];
};
layout(binding = 4) readonly buffer Input4{
vec4 variance[];
};
layout(binding = 5) writeonly buffer Output{
vec4 out_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
for (int index = int(gl_GlobalInvocationID.x); index < p.nthreads; index += p.global_size)
{
int w = index % p.layer_w;
int h = index / p.layer_w;
int output_offset = index * p.offset_x_size * p.width_size;
float box_w, box_h;
vec4 outer;
for (int i = 0; i < p.width_size; ++i)
{
box_w = widths[i];
box_h = heights[i];
for (int j = 0; j < p.offset_x_size; ++j)
{
float center_x = (w + offset_x[j]) * p.step_x;
float center_y = (h + offset_y[j]) * p.step_y;
outer.x = (center_x - box_w * 0.5f) / p.image_w; // xmin
outer.y = (center_y - box_h * 0.5f) / p.image_h; // ymin
outer.z = (center_x + box_w * 0.5f) / p.image_w; // xmax
outer.w = (center_y + box_h * 0.5f) / p.image_h; // ymax
// clip
if (p.clip == 1)
{
vec4 start = vec4(0.f, 0.f, 0.f, 0.f);
vec4 end = vec4(1.f, 1.f, 1.f, 1.f);
outer = min(max(outer, start), end);
}
//set variance
out_buffer[p.variance_off + output_offset] = variance[0];
out_buffer[output_offset] = outer;
output_offset++;
}
}
}
}
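The PriorBox shader above can be summarized as a CPU reference sketch. This is illustrative only, not the OpenCV implementation: the function name, the `(offset_x, offset_y)` pairing, and the returned list-of-boxes layout are assumptions; the per-box arithmetic mirrors the GLSL loop body.

```python
# CPU sketch of the PriorBox shader's per-invocation logic (hypothetical
# helper, not OpenCV API). Each (h, w) cell of the feature map emits one
# box per (width, height) x offset combination, as [xmin, ymin, xmax, ymax]
# normalized by the input image size and optionally clipped to [0, 1].
def prior_boxes(layer_h, layer_w, widths, heights, offsets,
                step_x, step_y, image_w, image_h, clip=True):
    boxes = []
    for h in range(layer_h):
        for w in range(layer_w):
            for box_w, box_h in zip(widths, heights):
                for off_x, off_y in offsets:
                    # box center in input-image coordinates
                    cx = (w + off_x) * step_x
                    cy = (h + off_y) * step_y
                    box = [(cx - box_w * 0.5) / image_w,   # xmin
                           (cy - box_h * 0.5) / image_h,   # ymin
                           (cx + box_w * 0.5) / image_w,   # xmax
                           (cy + box_h * 0.5) / image_h]   # ymax
                    if clip:
                        box = [min(max(v, 0.0), 1.0) for v in box]
                    boxes.append(box)
    return boxes
```

The shader additionally writes `variance[0]` at `p.variance_off + output_offset` for each box; that bookkeeping is omitted here to keep the geometry readable.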
#version 450
#define LOCAL_SZ_X 32
layout(push_constant) uniform pushBlock {
int total;
float slope;
} p;
layout(binding = 0) readonly buffer inbuf{
float in_buffer[];
};
layout(binding = 1) writeonly buffer outbuf{
float out_buffer[];
};
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
for (int i = int(gl_GlobalInvocationID.x); i < p.total; i += int(gl_NumWorkGroups.x * gl_WorkGroupSize.x))
{
float in_val = in_buffer[i];
out_buffer[i] = in_val >= 0.f ? in_val : p.slope * in_val;
}
}
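The ReLU shader above is an elementwise leaky ReLU. A minimal CPU sketch of the same computation (function name and list-based interface are illustrative, not OpenCV API):

```python
# CPU sketch of the ReLU shader: out = x if x >= 0 else slope * x,
# matching the GLSL ternary. slope = 0 gives plain ReLU.
def relu(values, slope=0.0):
    return [v if v >= 0.0 else slope * v for v in values]
```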
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
//
// Copyright (C) 2018, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
#include "../../precomp.hpp"
namespace cv { namespace dnn { namespace vkcom {
extern const unsigned int relu_spv[502] = {
0x07230203,0x00010000,0x00080001,0x0000004b,0x00000000,0x00020011,0x00000001,0x0006000b,
0x00000001,0x4c534c47,0x6474732e,0x3035342e,0x00000000,0x0003000e,0x00000000,0x00000001,
0x0007000f,0x00000005,0x00000004,0x6e69616d,0x00000000,0x0000000c,0x00000041,0x00060010,
0x00000004,0x00000011,0x00000020,0x00000001,0x00000001,0x00030003,0x00000002,0x000001c2,
0x00040005,0x00000004,0x6e69616d,0x00000000,0x00030005,0x00000008,0x00000069,0x00080005,
0x0000000c,0x475f6c67,0x61626f6c,0x766e496c,0x7461636f,0x496e6f69,0x00000044,0x00050005,
0x00000019,0x68737570,0x636f6c42,0x0000006b,0x00050006,0x00000019,0x00000000,0x61746f74,
0x0000006c,0x00050006,0x00000019,0x00000001,0x706f6c73,0x00000065,0x00030005,0x0000001b,
0x00000070,0x00040005,0x00000023,0x765f6e69,0x00006c61,0x00040005,0x00000025,0x75626e69,
0x00000066,0x00060006,0x00000025,0x00000000,0x625f6e69,0x65666675,0x00000072,0x00030005,
0x00000027,0x00000000,0x00040005,0x0000002d,0x6274756f,0x00006675,0x00060006,0x0000002d,
0x00000000,0x5f74756f,0x66667562,0x00007265,0x00030005,0x0000002f,0x00000000,0x00070005,
0x00000041,0x4e5f6c67,0x6f576d75,0x72476b72,0x7370756f,0x00000000,0x00040047,0x0000000c,
0x0000000b,0x0000001c,0x00050048,0x00000019,0x00000000,0x00000023,0x00000000,0x00050048,
0x00000019,0x00000001,0x00000023,0x00000004,0x00030047,0x00000019,0x00000002,0x00040047,
0x00000024,0x00000006,0x00000004,0x00040048,0x00000025,0x00000000,0x00000018,0x00050048,
0x00000025,0x00000000,0x00000023,0x00000000,0x00030047,0x00000025,0x00000003,0x00040047,
0x00000027,0x00000022,0x00000000,0x00040047,0x00000027,0x00000021,0x00000000,0x00040047,
0x0000002c,0x00000006,0x00000004,0x00040048,0x0000002d,0x00000000,0x00000019,0x00050048,
0x0000002d,0x00000000,0x00000023,0x00000000,0x00030047,0x0000002d,0x00000003,0x00040047,
0x0000002f,0x00000022,0x00000000,0x00040047,0x0000002f,0x00000021,0x00000001,0x00040047,
0x00000041,0x0000000b,0x00000018,0x00040047,0x0000004a,0x0000000b,0x00000019,0x00020013,
0x00000002,0x00030021,0x00000003,0x00000002,0x00040015,0x00000006,0x00000020,0x00000001,
0x00040020,0x00000007,0x00000007,0x00000006,0x00040015,0x00000009,0x00000020,0x00000000,
0x00040017,0x0000000a,0x00000009,0x00000003,0x00040020,0x0000000b,0x00000001,0x0000000a,
0x0004003b,0x0000000b,0x0000000c,0x00000001,0x0004002b,0x00000009,0x0000000d,0x00000000,
0x00040020,0x0000000e,0x00000001,0x00000009,0x00030016,0x00000018,0x00000020,0x0004001e,
0x00000019,0x00000006,0x00000018,0x00040020,0x0000001a,0x00000009,0x00000019,0x0004003b,
0x0000001a,0x0000001b,0x00000009,0x0004002b,0x00000006,0x0000001c,0x00000000,0x00040020,
0x0000001d,0x00000009,0x00000006,0x00020014,0x00000020,0x00040020,0x00000022,0x00000007,
0x00000018,0x0003001d,0x00000024,0x00000018,0x0003001e,0x00000025,0x00000024,0x00040020,
0x00000026,0x00000002,0x00000025,0x0004003b,0x00000026,0x00000027,0x00000002,0x00040020,
0x00000029,0x00000002,0x00000018,0x0003001d,0x0000002c,0x00000018,0x0003001e,0x0000002d,
0x0000002c,0x00040020,0x0000002e,0x00000002,0x0000002d,0x0004003b,0x0000002e,0x0000002f,
0x00000002,0x0004002b,0x00000018,0x00000033,0x00000000,0x0004002b,0x00000006,0x00000039,
0x00000001,0x00040020,0x0000003a,0x00000009,0x00000018,0x0004003b,0x0000000b,0x00000041,
0x00000001,0x0004002b,0x00000009,0x00000044,0x00000020,0x0004002b,0x00000009,0x00000049,
0x00000001,0x0006002c,0x0000000a,0x0000004a,0x00000044,0x00000049,0x00000049,0x00050036,
0x00000002,0x00000004,0x00000000,0x00000003,0x000200f8,0x00000005,0x0004003b,0x00000007,
0x00000008,0x00000007,0x0004003b,0x00000022,0x00000023,0x00000007,0x0004003b,0x00000022,
0x00000031,0x00000007,0x00050041,0x0000000e,0x0000000f,0x0000000c,0x0000000d,0x0004003d,
0x00000009,0x00000010,0x0000000f,0x0004007c,0x00000006,0x00000011,0x00000010,0x0003003e,
0x00000008,0x00000011,0x000200f9,0x00000012,0x000200f8,0x00000012,0x000400f6,0x00000014,
0x00000015,0x00000000,0x000200f9,0x00000016,0x000200f8,0x00000016,0x0004003d,0x00000006,
0x00000017,0x00000008,0x00050041,0x0000001d,0x0000001e,0x0000001b,0x0000001c,0x0004003d,
0x00000006,0x0000001f,0x0000001e,0x000500b1,0x00000020,0x00000021,0x00000017,0x0000001f,
0x000400fa,0x00000021,0x00000013,0x00000014,0x000200f8,0x00000013,0x0004003d,0x00000006,
0x00000028,0x00000008,0x00060041,0x00000029,0x0000002a,0x00000027,0x0000001c,0x00000028,
0x0004003d,0x00000018,0x0000002b,0x0000002a,0x0003003e,0x00000023,0x0000002b,0x0004003d,
0x00000006,0x00000030,0x00000008,0x0004003d,0x00000018,0x00000032,0x00000023,0x000500be,
0x00000020,0x00000034,0x00000032,0x00000033,0x000300f7,0x00000036,0x00000000,0x000400fa,
0x00000034,0x00000035,0x00000038,0x000200f8,0x00000035,0x0004003d,0x00000018,0x00000037,
0x00000023,0x0003003e,0x00000031,0x00000037,0x000200f9,0x00000036,0x000200f8,0x00000038,
0x00050041,0x0000003a,0x0000003b,0x0000001b,0x00000039,0x0004003d,0x00000018,0x0000003c,
0x0000003b,0x0004003d,0x00000018,0x0000003d,0x00000023,0x00050085,0x00000018,0x0000003e,
0x0000003c,0x0000003d,0x0003003e,0x00000031,0x0000003e,0x000200f9,0x00000036,0x000200f8,
0x00000036,0x0004003d,0x00000018,0x0000003f,0x00000031,0x00060041,0x00000029,0x00000040,
0x0000002f,0x0000001c,0x00000030,0x0003003e,0x00000040,0x0000003f,0x000200f9,0x00000015,
0x000200f8,0x00000015,0x00050041,0x0000000e,0x00000042,0x00000041,0x0000000d,0x0004003d,
0x00000009,0x00000043,0x00000042,0x00050084,0x00000009,0x00000045,0x00000043,0x00000044,
0x0004007c,0x00000006,0x00000046,0x00000045,0x0004003d,0x00000006,0x00000047,0x00000008,
0x00050080,0x00000006,0x00000048,0x00000047,0x00000046,0x0003003e,0x00000008,0x00000048,
0x000200f9,0x00000012,0x000200f8,0x00000014,0x000100fd,0x00010038
};
}}} // namespace cv::dnn::vkcom
#version 450
#define LOCAL_SZ_X 256
layout(binding = 0) readonly buffer buf0{
float input_buffer[]; // outer_size * channels * channel_size
};
layout(binding = 1) buffer buf1{
float max_buffer[]; // outer_size * channel_size
};
layout(binding = 2) buffer buf2{
float sum_buffer[]; // outer_size * channel_size
};
layout(binding = 3) buffer buf3{
float output_buffer[]; // outer_size * channels * channel_size
};
layout(push_constant) uniform pushBlock {
int channel_size;
int outer_size;
int channels;
} p;
layout(local_size_x = LOCAL_SZ_X, local_size_y = 1, local_size_z = 1) in;
void main()
{
int gid = int(gl_GlobalInvocationID.x);
if (gid >= p.outer_size) return;
int global_off = gid * p.channels * p.channel_size;
int reduced_buffer_off = gid * p.channel_size;
// find the max along channel
int index = global_off;
for (int i = 0; i < p.channel_size; ++i)
{
max_buffer[reduced_buffer_off + i] = input_buffer[index];
index++;
}
for (int c = 1; c < p.channels; ++c)
{
for (int i = 0; i < p.channel_size; ++i)
{
max_buffer[reduced_buffer_off + i] = max(max_buffer[reduced_buffer_off + i], input_buffer[index]);
index++;
}
}
// subtract, exp and accumulate along channel
for (int i = 0; i < p.channel_size; ++i)
sum_buffer[reduced_buffer_off + i] = 0.f;
index = global_off;
for (int c = 0; c < p.channels; ++c)
{
for (int i = 0; i < p.channel_size; ++i)
{
float exp_val = exp(input_buffer[index] - max_buffer[reduced_buffer_off + i]);
output_buffer[index] = exp_val;
sum_buffer[reduced_buffer_off + i] += exp_val;
index++;
}
}
// divide by computed sum
index = global_off;
for (int c = 0; c < p.channels; ++c)
{
for (int i = 0; i < p.channel_size; ++i)
{
float v = output_buffer[index] / sum_buffer[reduced_buffer_off + i];
#ifdef LOG_SOFTMAX
v = log(v);
#endif
output_buffer[index] = v;
index++;
}
}
}
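The softmax shader above reduces across the channel axis of an `outer_size * channels * channel_size` buffer, subtracting the per-position maximum before exponentiating for numerical stability. A CPU sketch of the same pass (the function name and flat-list interface are illustrative assumptions, not OpenCV API):

```python
import math

# CPU sketch of the softmax shader. The flat buffer layout follows the
# shader: outer_size * channels * channel_size, with softmax taken across
# the channel axis independently for each (outer, spatial) position.
def softmax_channels(data, outer_size, channels, channel_size, log_softmax=False):
    out = [0.0] * len(data)
    for o in range(outer_size):
        base = o * channels * channel_size
        for i in range(channel_size):
            # indices of this spatial position across all channels
            idx = [base + c * channel_size + i for c in range(channels)]
            m = max(data[k] for k in idx)            # max along channel
            exps = [math.exp(data[k] - m) for k in idx]
            s = sum(exps)                            # accumulate along channel
            for k, e in zip(idx, exps):
                v = e / s                            # divide by computed sum
                out[k] = math.log(v) if log_softmax else v
    return out
```

Note the shader trades memory for simplicity by staging the reductions in `max_buffer` and `sum_buffer`; the sketch keeps them in locals instead.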
@@ -285,6 +285,6 @@ TEST_P(DNNTestNetwork, FastNeuralStyle_eccv16)
     processNet("dnn/fast_neural_style_eccv16_starry_night.t7", "", inp, "", "", l1, lInf);
 }
-INSTANTIATE_TEST_CASE_P(/*nothing*/, DNNTestNetwork, dnnBackendsAndTargets(true, true, false));
+INSTANTIATE_TEST_CASE_P(/*nothing*/, DNNTestNetwork, dnnBackendsAndTargets(true, true, false, true));
 }} // namespace
@@ -161,6 +161,8 @@ TEST_P(setInput, normalization)
         throw SkipTestException("Myriad is not available/disabled in OpenCV");
     if (backend == DNN_BACKEND_OPENCV && target == DNN_TARGET_OPENCL_FP16 && dtype != CV_32F)
         throw SkipTestException("");
+    if (backend == DNN_BACKEND_VKCOM && dtype != CV_32F)
+        throw SkipTestException("");
     Mat inp(5, 5, CV_8UC3);
     randu(inp, 0, 255);