InfiniTensor

Commit Graph

Author	SHA1	Message	Date
wendy12022	13b7a2604b	ADD add/mul/sub/div/pow operators and CPU/CUDA kernels (#26) Fix: remove some useless code. add div/pow kernel Add add/mul/sub operators. fix cpu kernel. add element wise kernel for cuda. ADD element wise operator.	2022-09-09 13:43:59 +08:00
Anmuliar	0409eafb5f	Operators g2bmm&gbmm transplantation (#24) * Function tune and corresponding testcase. Add: Tune function in /src/kernel/cuda/conv.cc and corresponding testcase in test_conv. Fix: A little bug of perfRecord using in /src/core/runtime.cc. * Tune part debug Add: recover the code, fixed the commit error. Add: some annotations in tune function * clang format test * Fix: mem leak in CUDA Runtime and Conv * Fix: sync in conv and default sync in timeit * Change the way to tune operator conv. Timeit function cudNNUnfused -> Timeit function cudnnConvolutionForward. * Change: merge the common part of cudnnunfused&tune into cudnndescriptoraccess * clang test * clang-format * clang-format bash. * Added operator G2BMM and corresponding testcase. Added files related to operator G2BMM creating&calling. Added custom_ops.cuh&custom_op.h. * Add operator GBMML * new version * Fix: G2BMM and GBMM kernel bugs * Added testcase of operator GBMML * clang format * Added cmake option REQUIRE_GCC9 * Delete redundant file * Renamed class GBMML into GBMM * clang format * Reviewed. * Added cudahostcompier option. * Add: explicit CMAKE_CUDA_HOST_COMPILER * Rename gbmm kernel * Fix: nvcc warning in GBMM and G2BMM Co-authored-by: wcz112 <wcz19@mails.tsinghua.edu.cn> Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-09-08 21:31:35 +08:00
Hardy	e1d43202d7	Verify wanghailu 0902 (#22 ) * commit for verify, add some difference function * add code for verify * add code for verify Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>	2022-09-05 15:45:52 +08:00
wendy12022	c3bc278c12	Op matmul (#20 ) ADD:add cuda kernel for matmul. matmul tune Add test_matmul.cc	2022-09-01 21:06:55 +08:00
Hardy	32a01efbbe	add code for backtrace (#21 ) * add code for backtrace * Add: infini::Exception ``` Test project /home/zly/InfiniTensor_aux/build Start 1: test_graph 1/4 Test #1: test_graph ....................... Passed 0.05 sec Start 2: test_hash 2/4 Test #2: test_hash ........................ Passed 0.02 sec Start 3: test_conv 3/4 Test #3: test_conv ........................ Passed 4.40 sec Start 4: test_pooling 4/4 Test #4: test_pooling ..................... Passed 2.47 sec 100% tests passed, 0 tests failed out of 4 Total Test time (real) = 6.94 sec ``` * Fix: USE_BACKTRACE in cmake Co-authored-by: wanghailu <wanghailu@qiyuanlab.com> Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-09-01 20:30:12 +08:00
wendy12022	48293576c0	Add maxpool and avgpool operators (#17 ) * ADD:maxpool&&avgpool operators. add OperatorObj::getDType() clang format FIX:timeit API has changed. * Fix: Tensor::getInputs is const method * Chore Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-31 14:44:53 +08:00
Anmuliar	bd63f738dc	cuDNN conv tuning (#16) * Function tune and corresponding testcase. Add: Tune function in /src/kernel/cuda/conv.cc and corresponding testcase in test_conv. Fix: A little bug of perfRecord using in /src/core/runtime.cc. * Tune part debug Add: recover the code, fixed the commit error. Add: some annotations in tune function * clang format test * Fix: mem leak in CUDA Runtime and Conv * Fix: sync in conv and default sync in timeit * Change the way to tune operator conv. Timeit function cudNNUnfused -> Timeit function cudnnConvolutionForward. * Change: merge the common part of cudnnunfused&tune into cudnndescriptoraccess * clang test * clang-format * clang-format bash. * Chore: remove print and blank lines Co-authored-by: wcz112 <wcz19@mails.tsinghua.edu.cn> Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-29 21:37:07 +08:00
Anmuliar	e076991f2f	Revert "Operator serialization (#14 )" (#15 ) This reverts commit `25f0c441d2`.	2022-08-29 16:02:48 +08:00
Anmuliar	25f0c441d2	Operator serialization (#14 ) Class "Cuda Runtime" fulfills function "tune" and adds corresponding testcase. Add: convCudnn::tune, convCudnn::cuDNNdescriptorAccess. Add: testcase tune. *Fix: a brief bug in CPU Runtime.	2022-08-29 15:59:03 +08:00
zhengly123	93f86d3f4d	Simplify tensor transfer between CPU and CUDA (#10 ) * Add: OP infers data type & Graph clones tensor * Fix: vecToString format * Add: static assert for Tensor methods * Rename: getDataRawPtr -> getRawDataPtr Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-25 11:29:16 +08:00
zhengly123	af08df32d2	Extended DataType class and Runtime interaction (#9 ) * Add: DataType class * Add: data-type-oblivious tensor interface * Rename: copyBlobToCPU Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-23 16:55:59 +08:00
zhengly123	bd5934279b	Fix: rename kerels -> kernels (#8 ) Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-22 16:42:55 +08:00
zhengly123	04ea5eed38	Add CUDA runtime (#6 ) * Fix: add warm-up and repetition in timing * Add: CUDA runtime and float support * Refactor: Cuda and Cpu runtimes inherit Runtime * Add: environment script for Lotus * Add: Lotus build instructions * Update README.md Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-22 15:01:03 +08:00
zhengly123	9303ddda8e	Add Conv operator and naive CPU implementation (#5) * Add: Conv definition * Add: tensor copy data from vector * Add: CPU conv kernel * Fix: replace Int32 with UInt32 in DataType Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-17 14:16:01 +08:00
zhengly123	a26890abce	Tensor hash and inferShape (#4 ) * Refactor: operator hash and inferShape * Add: hash without shape * Add: inferShape interface for given input tensors * Add: construct outputs in op ctor * Add: comments for matmul * Add: opType in AttrVector and WorkloadVector * Chore: _graph -> graph in Op ctor * Chore: change the "Node" suffix to "Obj" Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-08-15 15:08:56 +08:00
Haojie Wang	eda41b06a7	Merge pull request #1 from InfiniTensor/init Initialization	2022-08-09 20:21:20 +08:00
Liyan Zheng	a4fb9fa413	Chore: format dbg	2022-08-09 20:16:39 +08:00
Liyan Zheng	8219b0f7ff	Add: comments for Kernel	2022-08-09 20:05:01 +08:00
Liyan Zheng	ce5d49c79b	Add: clang format script	2022-08-09 19:50:23 +08:00
Liyan Zheng	cc78a756e1	Add: clang format check github action	2022-08-09 17:58:12 +08:00
Liyan Zheng	2054b0eda4	Chore: rename getOpAttrs to getOpPerfKey	2022-08-09 15:34:28 +08:00
Liyan Zheng	8b685ae4a6	Update: OpAttrs -> OpPerfKey	2022-08-09 14:58:45 +08:00
Liyan Zheng	b7e2096a26	Add: nnet code	2022-08-08 16:02:07 +08:00
Liyan Zheng	1205240218	Add: mutator abstract class	2022-08-08 15:54:17 +08:00
Liyan Zheng	efa966a3e2	Add: perf engine	2022-08-07 21:12:17 +08:00
Liyan Zheng	6c356d5b42	Add: kernel registry and naive Matmul kernel	2022-08-06 15:58:40 +08:00
Liyan Zheng	559be5866d	Add: Matmul operator	2022-08-05 12:50:34 +08:00
Liyan Zheng	e6101b0336	Add: graph, tensor, and operator	2022-07-31 21:44:03 +08:00
Haojie Wang	b89495a782	Initial commit	2022-07-27 22:40:23 +08:00

79 Commits

All Branches