InfiniTensor

Commit Graph

Author	SHA1	Message	Date
xiaonans	a98573990b	Accelerate llama (#219 ) * [feature] add cudagraph support * modify code to pass the cuda_all_reduce test * modify rope op * support rmsnorm * add fp16 support to silu cuda op * fix bugs in rmsnorm op * uncomment simplify in onnx.py --------- Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2024-04-01 08:46:05 +08:00
Chenjie Duan	54a35772fb	feature: add parameter to config matmul compute type (#218 ) * feature: add parameter to config matmul compute type * fix format	2024-03-26 09:00:45 +08:00
zhangyue	00e6cc2587	XCCL support (#171 ) * add reduce_mean and gather * fix format * add kunlun allreduce and cmakefile * add kunlun allreduce and cmakefile * deltete cmake opt * fix format * fix makefile * add DIST option in Makefile * add xpu allgather * delete xpu_wait() * add xpu allgather * delete specific compiler * fix format * fix gather * add broadcast * fix format * fix * fix xpu, add where operation, fix element-wise operation * fix softmax * fix softmax * log internal input and output * fix kunlun gather bugs * update CMakeList.txt and Makefile * fix some kunlun kernels * fix Makefile * fix Makefile * set cmake version 3.12 * format * fix where, gather and support gpt2 * "fix format" * fix format * copy onnx.py from master * use KUNLUN_HOME instead of absolute path * fix torchvision models * support torchvison model-zoo * fix format * format fix, CMakeList fix * fix review * fix vecToString return value * fix format * delete empty file --------- Co-authored-by: wanghailu <wanghailu0717@163.com> Co-authored-by: wanghailu <wanghailu@qiyuanlab.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2024-02-29 11:48:35 +08:00
xiaonans	ae9f61de5a	add comment for rope operator	2024-02-04 10:57:01 +08:00
xiaonans	e8d111ef5d	add rope and silu support	2024-01-26 10:01:27 +08:00
xiaonans	afed5d3c3d	use workspace to optimize kvcache attention	2024-01-25 10:33:01 +08:00
xiaonans	6a1bfd6c45	[feature] support kvcache with static graph	2024-01-17 11:38:44 +08:00
zhangyunze	58993d4339	Remove the frontend's dependency on the onnx infershape feature (#206 ) * feat: SqueezeOp lift the dependency of onnx infershape. * feat: UnsqueezeOp lift the dependency of onnx infershape. * feat: lift the dependency of onnx infershape * fix: fix Makefile off nccl	2024-01-12 14:54:27 +08:00
Chenjie Duan	83f1de93d0	add frontend resize kernel (#194 ) * - add frontend resize kernel * - fix resize test * - fix bug - add onnx test for resize * fix: modify codes as reviewer suggested --------- Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-12-29 13:32:56 +08:00
Hardy	5ac0ab442f	Fix bang (#198 ) * fix bang batchnorm * fix pooling test bang * add test batchnorm * HIGH PRECISION ACTIVATION * fix pooling * fix matmul * fix test * add layernorm * fix softmax * fix * better code * fix * fix worlflow * fix workflow * fix * fix * fxi matmul * add LRN * fix lrn * fix lrn --------- Co-authored-by: wanghailu <wanghailu0717@163.com> Co-authored-by: Baoming Li <1508269885@qq.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-12-28 13:44:10 +08:00
xgqdut2016	a3929c25f8	Add send and recv operators based on NCCL (#182 ) * baseline sendrecv, bug * success sendrecv * get rank from comm * set output shape * successful:set output shape equal to input shape * shape as attribute * success:shape as attribute * success send recv, output 0 * add onnx test * split send and recv * success split send and recv * test-onnx bug * success test-onnx * modified onnx.py * solve review	2023-12-14 16:38:03 +08:00
xgqdut2016	a7293c12ba	Add layer normalization (#181 ) * - add layernorm kernel * success:add layernorm kernel and test * fix: remove unusalble comments * fix: modify code as reviewer suggested * debug,modified .cu and test * optional bias support * overloading function * fix bug after merging; remove time constrain in conv test --------- Co-authored-by: kilinchange <kilinchange@163.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-11-24 15:15:14 +08:00
PanZezhong1725	6ece3f4a77	Add ReduceSum op and kernel (#160 ) * Add reduceSum op and kernel * fix merge and format * Reduce: reuse cat macro, add doc string --------- Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-11-24 09:29:58 +08:00
zhangyunze	331f7ab2b8	support Dynamic tensor infer shape and fix memory pool (#176 ) * feat: support dynamic tensor part1 * feat: support dynamic-tensor part2 * feat: support dynamic tensor part 3 * fix: fix some .. * - add kvcache example * feat: support concat to identity kernel * add a simple mempory pool for allocator * fix: rebase to master * fix bug after merging * - remove outdated script * fix: fix as review --------- Co-authored-by: kilinchange <kilinchange@163.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-11-23 13:11:50 +08:00
xiaonans	965df4e294	[feature] add fused attention_kvcache operator support (#179 ) * [feature] add fused attention_kvcache operator support * add test to attention_kvcache op * Add space line at EOF --------- Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-11-14 23:44:22 +08:00
Hardy	50862df765	[Kunlun & CUDA & BANG] add depth2space operator (#178 ) * add depth2space operator * fix format * add depth2space on cambricon bang * add depth2space on gpu --------- Co-authored-by: wanghailu <wanghailu0717@163.com> Co-authored-by: wanghailu <wanghailu@qiyuanlab.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-11-10 17:58:26 +08:00
Haojie Wang	8e4d88fb9f	add transpose, concat and split for native cpu (#158 )	2023-10-12 10:14:28 +08:00
PanZezhong1725	36ae7b7fb6	Add GatherElements op and cuda kernel (#149 ) * Add GatherElements op and cuda kernel * fix format * remove print * remove unused var * fix spacing * fix format --------- Co-authored-by: panzezhong@qiyuanlab.com <panzezhong@zezhongpan> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-10-12 09:18:12 +08:00
PanZezhong1725	ed3034f878	Add HardSigmoid and HardSwish (#156 ) * Add HardSigmoid and HardSwish * fix format	2023-10-10 22:41:06 +08:00
ChengXiang Qi	7f16fa353e	【Hackathon No.108】Add Gelu operator, ffi, kernel for cpu and gpu. (#148 ) feat: Add Gelu kernel, operator, ffi.	2023-10-10 15:21:13 +08:00
Haojie Wang	7a9fcd93b2	Pooling ceil mode (#155 ) * add ceil mode for pooling * do not print debug info for allocator by default * fix test bugs after introducing pooling ceil mode * fix onnx import bug	2023-10-09 20:51:39 +08:00
constroy Li	f60767a770	impl distributed launch with NCCL (#106 ) * add cmake bits about NCCL * move example to examples/NNmodel * impl NCCL communicator * add comm related function to Runtime * export runtime interface * add launch.py * use unique name to distingush the the NCCL ID file * add timeout to communicator init * expose communicator obj from runtime obj, add unit test for nccl communicator * reformat files * Add allReduce operator and cuda nccl allReduce kernel * impl model parallel for resnet * add allGather nccl kernel and operator * Add allreduce allgather operator tests, change allgather kernel to output list of tensor, fix shape infer, handle nullptr output * fix format of onnx.py * use concat following AllGather * get tensor parallel for resnet * fix format of graph_handler.cc * change BUILD_DIST default to OFF * polish code of communicator * update .gitignore * Add broadcast operator and cuda kernel * Add comments for operators * remove const of class member * move communicator to CudaRuntimeObj * Add an empty line at EOF. --------- Co-authored-by: panzezhong <panzezhong@qiyuanlab.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-09-05 09:47:35 +08:00
zhangyunze	3e6ef305f1	Framework support for building bert/gpt2 model graphs (#94 ) * feat: support to sqrt op * feat: support to erf op * feat: support to expand op * feat: support to where op * fix: gather op index can be int64_t(hard coding) * fix: some wrong use * style: fix the format style * test: add test for change op * fix: rebase to master * fix: fix matmul b compute wrong * add expand and where kernel * Add int64 support for cuda gather kernel * add test_where.cc * add "expand.(cu/cc,test,cuda),modified where.cu" * Separate initialization of datatypes to avoid compile error * modify where.(cu/cc/h,test), expand and clip * Format fix * Format fix --------- Co-authored-by: xgqdut2016 <kenan_gewei@163.com> Co-authored-by: panzezhong <panzezhong@qiyuanlab.com> Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-08-29 16:06:52 +08:00
zhangyunze	ef672894d0	support mixed dtype (#102 ) * feat: support mixed dtype * feat: support cast op * test: add test for cast op * feat: support datatype BFloat16 * feat: support data convert fp32 <-> bfp16 * fix: fix all op's infershape func * fix as review comment	2023-08-16 21:49:43 +08:00
Derui Yang	57ac94d893	refactor(core): add new `OpType` definition (#99 ) * feat: add new OpType definition Signed-off-by: YdrMaster <ydrml@hotmail.com> * refactor: replace the old OpType with the new one and update the whole project Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: onnx import Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: fix issues in cuda and bang kernels Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: filter bang test Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: filter bang test Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix bang code. * fix code on bang * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: delete the specified files Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: delete two unused files and remove a comment of unknown purpose Signed-off-by: YdrMaster <ydrml@hotmail.com> --------- Signed-off-by: YdrMaster <ydrml@hotmail.com> Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>	2023-08-07 11:17:05 +08:00
zhangyunze	9b10a74788	Support fp16 dtype (#96 ) * add conv_half kernel * Conv Kernel FP16 * dcj: replace "DataType::Float32" with "op->getDType()" to support more DataType * feat: support Float16 dtype * fix: set default clang-format to 14 version * fix: revise according to review comments * fix: add data convert to convfp16 kernel test * test: add conv_fp16 kernel test --------- Co-authored-by: zhangyue207 <zhangyue@qiyuanlab.com> Co-authored-by: kilinchange <kilinchange@163.com>	2023-08-02 16:38:16 +08:00
YdrMaster	26f0d13c26	Dev for 202303ddl (#66 ) * add activation operatiopn relu, tanh, sigmoid on mlu * commit for format * add activation backward operation * add test for activation_backward * add test * add convbpfilter * fix * add transpsoe code and test * add trigon function operation on mlu: sin,cos,tan,asin,sinh,asinh * add copy operation on mlu * add ceil operation and floor operation * add operation clip * add operation cnnl div, test and test for divdemo bangc kernel * add divnonan operation and test * add erf operation * add exp operation * add operation fill * add log operation * add log1p operation * add l2loss operation * add maximum and minimum operation * add mseloss operation * add negTensor operation * add power operation * add reciprocal operation * add sqrt and rsqrt operation * add transform operation * add addn operation * add muln operation * cherrry pick some operation * add floordiv operation and floordivtrunc operation * add floormod operation * add cumsum operation * add det operation * add pad operation * format * add concat operation * format * add split operation * fix concat and split operation * add round operation * add pooling operation * add square operation * add squaredDifference operation * code format fix * add flip operation * code format fix * add hardtanh operation * add logic operation * add addcdiv and addcmul operation * add arange operation * add bitcompute operation * add net test * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * style: rename Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: replace CpuRuntime with NativeCpuRuntime Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix code * fix code * fix code by review suggestion * remove operation which is not the onnx operation * fix format * clang format * refactor: add a templated dataToString layer to tensor print Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: onnx export Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: add computation graph optimization interface Signed-off-by: YdrMaster <ydrml@hotmail.com> * add clip operation * feat: support importing clip Signed-off-by: YdrMaster <ydrml@hotmail.com> * test: add import/export tests to ci Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix batch norm * feat: add Shape operator Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support importing unsqueeze Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: fix clip interface feat: support importing transpose Signed-off-by: YdrMaster <ydrml@hotmail.com> * add broadcast operation * fix elementwise-broadcast * fix elementwise broadcast * add broadcast for gpu elementsie * feat: pad supports negative axes feat: export unsupported padding as a standalone pad operator feat: support importing inception processed by onnxsim Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: fix pooling tests Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: export pads, support inception import/export, added to ci Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support densenet import/export and add to ci Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: import squeeze Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix softmax * feat: export clip and transpose Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support bias for Conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: bias of conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: bias of conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: import split Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: export split Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: conv group Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: matmul bias was not placed in the inputs; corrected Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix exmaple * fix: fix reduce_mean export Signed-off-by: YdrMaster <ydrml@hotmail.com> * refactor: make slice implementation consistent with onnx Signed-off-by: YdrMaster <ydrml@hotmail.com> * style: do not export two runtime functions Signed-off-by: YdrMaster <ydrml@hotmail.com> * doc: Chinese user guide Signed-off-by: YdrMaster <ydrml@hotmail.com> * doc: complete the guide Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: fix data import issue Signed-off-by: YdrMaster <ydrml@hotmail.com> * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: add basic Dropout structure, but two outputs of different types are not supported Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: re-export optimization interface feat: dropout import Signed-off-by: YdrMaster <ydrml@hotmail.com> * build: add BANG option to Makefile Signed-off-by: YdrMaster <ydrml@hotmail.com> * fxi code, change of test/kernels/bang/test* is use NativeCpuRuntime. chaneg of include/bang/bang_runtime is for the cntoolkit upgrade. * feat: export bang runtime Signed-off-by: YdrMaster <ydrml@hotmail.com> * add USE_BANG=1 * fix matmul * fix reshape * fix * fix activation * fix transpose * format * format * update Makefile Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support importing and exporting ConvTranspose Signed-off-by: YdrMaster <ydrml@hotmail.com> * add prelu on mlu * fix: ConvTranspose Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support importing and exporting PRelu Signed-off-by: YdrMaster <ydrml@hotmail.com> * add convtrans on mlu * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * docs: update README_CN.md Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix code by review suggestions * style Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: Softmax axis can use the default value? Seems like onnx is non-standard here Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix cuda & intelcpu bugs after merging --------- Signed-off-by: YdrMaster <ydrml@hotmail.com> Co-authored-by: wanghailu <wanghailu0717@163.com> Co-authored-by: wanghailu <wanghailu@qiyuanlab.com> Co-authored-by: whjthu <haojie0429@gmail.com>	2023-04-18 15:10:33 +08:00
zhengly123	a1974aabcd	NNET supports TVM backend and kernels (#78 ) * Add: mutator InfoGAN minimum test * Add: cache and padding (bugs!!) * Add: expression reader as a cmake target * Fix: [Intermediate] NMutator::expressionToGraph To be fix: matmul with implicit broadcast * Add: matmul broadcast * Fix: GraphObj ctor should use cloneTensor * Fix: cuBLAS failure when codegen is enabled * Add: Exception for checkCuError * Fix: graph OpList ctor * Add: expr simplication for TVM * Add: TVM headers and CMake include paths * Add: CMake config * Add: PackedFunc (broken) * Fix: remove cuCtxCreate which makes TVM fails * Fix: membound_tvm * Fix: test_memboundOp * Add: PRelu Expr and AsTVMVisitor * Add: Random generator * Add: support TVM packed function * Fix: specify runtime * Add: CMake support of TVM * Add: detailed output of Matmul * Add: comments for Matmul * Chore: format and comments * Chore: GraphObj::selfCheck without assert control * Fix: CMAKE_CXX_FLAGS in CMakeLists * fix merge bug * update api for mkl batchnorm test * fix lotus env * fig header bug --------- Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com> Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn> Co-authored-by: whjthu <haojie0429@gmail.com>	2023-04-18 00:26:36 +08:00
wendy12022	43d4798323	ADD: sub graph replacement. (#56 ) reconfig: connections among op and tensor now is managered by GraphObj . add some comments merge from master merge from master ADD: sub graph replacement reconfig inputs of op resize, due to the check of operator inputs. ResizeObj::clone clang format fix some and add test for multi-output. replacement support multi-inputs and multi-outputs. add clone for all operators add replaceSubGraph addSubGraph remove extra code add more test remove extra print Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2023-04-17 13:09:07 +08:00
wendy12022	c8b2c8ed32	Cpu backend2 (#77 ) fix review change Device::MKL to Device::INTELCPU fix mkl linkage fix errors according to merge from master now can call mkl backend fix softmax/flatten with axis from onnx. modify README.md fix memory refree add env_lotus_intelcpu.sh fix compile merge from branch cpu_backend fix something add gather fix something FIX: directory rename from "mkl" to "intelcpu" ADD: use oneMKL dpcpp interface to implement matmul kernel. ADD: add dpcpp as compiler for mkl, and fix warnings for clang compiling. add dpcpp kernel for pow. ADD: mkl kernel for pad. ADD: slice mkl kernel. ADD: reshape/flatten/identity mkl kernel. ADD: split mkl kernel. fix compile error FIX: fix flattenObj with axis. ADD reduce_mean mkl kernel. Add concat mkl kernel. bathNorm for mkl kernel. sigmoid mkl kernel. ADD：add mkl kernel for pooling add more tests for softmax Now softmax cuda kernel supports any axises. mkl kernel for softmax softmax add axis to softmax operator add mkl kernel for abs tanh ADD: relu kernel for mkl fix binary mkl primitives. add mkl kernel for binary operators fix compiler error move stream to runtime clang format add MemoryFormat for tensorObj. use post_ops for fused conv/deconv Distinguish mkl op_timer from cuda op timer. add act optype to conv and deconv add operator timer add mkl kernel for convTransposed minor fix for group conv do not use cblas_sgemm_batch CpuRuntimeObj->NativeCpuRuntimeObj add matmul op for mkl	2023-04-17 12:15:23 +08:00
whjthu	d9886e9de3	fix: remove inline keyword in class; rename getter and setter for inputOf and outputOf	2023-03-25 12:04:24 +08:00
YdrMaster	e294e46436	feat: export pool to onnx Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-03-15 17:23:32 +08:00
YdrMaster	a5e692baea	feat: export batchnorm to onnx Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-03-15 17:23:32 +08:00
YdrMaster	71a87c27d1	feat: export ReduceMean to onnx Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-03-15 15:09:12 +08:00
YdrMaster	2a23669394	feat: export Reshape to onnx Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-03-15 15:09:12 +08:00
Haojie Wang	0f52d04882	Merge branch 'master' into dev-onnx	2023-03-15 14:52:03 +08:00
deathwings602	40d1b1c91b	Add ConvTransposedNHWC (#67 ) * Add: IT_ASSERT_TODO * [WIP] Add: ConvTranspose2d mutation test * add ConvTransposedNHWC * fix test_cuda_transposed_2d --------- Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com> Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>	2023-03-01 14:15:02 +08:00
YdrMaster	315763a83a	feat: frontend support for pad, with unit tests Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-02-15 11:41:06 +08:00
YdrMaster	f9d0076a86	opt: optimize SliceObj constructor implementation Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-02-14 16:44:08 +08:00
YdrMaster	62ceb78ae3	feat: frontend support for reduceMean, with unit tests Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-02-14 15:35:01 +08:00
YdrMaster	fb9d84dbb7	opt: optimize ReduceMeanObj constructor implementation Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-02-14 15:14:28 +08:00
YdrMaster	d11fb0ad5f	feat: frontend support for gather, with unit tests Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-02-14 14:16:01 +08:00
YdrMaster	7626efbfa8	feat: frontend support for reshape - cannot be tested because the backend does not support the INT64 type for shape opt: ReshapeObj constructor now takes all arguments by value and moves them internally Signed-off-by: YdrMaster <ydrml@hotmail.com>	2023-02-14 09:51:11 +08:00
whjthu	26be533faa	Add documentation for operators.	2023-02-13 22:51:15 +08:00
zhengly123	c7ec9ee6e7	Add search engine (#64 ) * Add: tensor fuid * [Intermediate state] Add: Graph ctor for OpVec * Add: clone for operators * tmp: search_engine * search: init search Engine. * Add: dummy mutator for the test of search engine * search: add print graph. * search: add partition. * search: update comments. * Fix: remain FUID in Tensor::clone * Chore: rename GUidBaseType to UidBaseType * Fix: connect NMutator to SearchEngine * Chore: output * Fix test_memboundOp: nmutator uses input runtime * Chore: clang-format * Chore: clang-format * Fix: comments in the review --------- Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com> Co-authored-by: mazx <dyxdy@live.com>	2023-02-12 18:27:52 +08:00
wendy12022	d780f687fc	ADD: reconfig ResizeObj, support "tf_crop_and_resize " and cubic coeff kernel. (#59 ) add cubic coef add tf_crop_and_resize	2022-12-24 04:02:21 +08:00
wendy12022	c5966f8d81	Add: resize operator and cuda kernel,support nearest/linear coef. (#51 ) ADD: resize operator and cuda kernel,support nearest/linear coef. fix some fix tests add more tests for linear mode. add linear coef mode. add scales add tests fix tests. add notLarger notSmaller fix add test ADD:resize operator and cuda kernel	2022-11-14 09:30:22 +08:00
wendy12022	d1c913010f	ADD:reduce_mean operator and cuda kernel. (#47 ) add new line at file ending.	2022-10-15 16:53:58 +08:00
wendy12022	a4d6426589	ADD: batch norm operator and cuda kernel. (#44 ) fix numInputs of batchNorm, add new line in file ending. ADD: batch norm operator and cuda kernel. add training remove comments. fix compile error. add batch norm operator and cuda kernel.	2022-10-15 16:29:28 +08:00
wendy12022	26cee55e81	ADD:extend operator and cuda kernel. (#40 ) Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2022-09-29 14:52:50 +08:00

67 Commits