update run info

2022-12-01 17:21:16 +08:00 · 2022-12-01 17:21:16 +08:00 · d4e918c12a
parent fbd55e782a
commit d4e918c12a
3 changed files with 98 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
 ## Introduction
 Open source software (OSS) licenses regulate the conditions under which OSS can be legally reused, distributed, and modified. However, a common issue arises when incorporating third-party OSS accompanied with licenses, i.e., license incompatibility, which occurs when multiple licenses exist in one project and there are conflicts between them. Despite being problematic, fixing license incompatibility issues requires substantial efforts
 due to the lack of license understanding and complex package dependency.
-In this paper, we propose LiResolver, a fine-grained, scalable, and flexible tool to resolve license incompatibility issues {for open source software}. Specifically, it first understands the semantics of licenses through fine-grained entity extraction and relation extraction. Then, it detects and resolves license incompatibility issues by recommending official licenses in priority. When no official licenses can satisfy the constraints, it generates a custom license as an alternative solution. Comprehensive experiments demonstrate the effectiveness of LiResolver, with 4.09\% FP rate and 0.02\% FN rate for incompatibility issue localization, and 62.61\% of 230 real-world incompatible projects resolved by LiResolver. Furthermore, we also evaluate the impacts of license hierarchy and copyright holder detection on the effectiveness of incompatibility resolution. We discuss lessons learned and made all the datasets and the replication package of LiResolver publicly available to facilitate follow-up research.
+In this paper, we propose ***LiResolver***, a fine-grained, scalable, and flexible tool to resolve license incompatibility issues for open source software. Specifically, it first understands the semantics of licenses through fine-grained entity extraction and relation extraction. Then, it detects and resolves license incompatibility issues, giving priority to official licenses in its recommendations. When no official licenses can satisfy the constraints, it generates a custom license as an alternative solution. Comprehensive experiments demonstrate the effectiveness of LiResolver, with a 4.09\% false-positive (FP) rate and a 0.02\% false-negative (FN) rate for incompatibility issue localization, and 62.61\% of 230 real-world incompatible projects resolved by LiResolver. We also evaluate the impacts of license hierarchy and copyright holder detection on the effectiveness of incompatibility resolution. We discuss lessons learned and make all the datasets and the replication package of LiResolver publicly available to facilitate follow-up research.

 ![image](img/overview_00.png)

@@ -21,6 +21,42 @@ In this paper, we propose LiResolver, a fine-grained, scalable, and flexible too
 ## Installation


+#### Download dependency files
+
+
+
+* Download the `roberta-base` pretrained model from `https://huggingface.co` into the folder `LiResolver/RE/roberta-base`.
+
+* Download the `glove` pretrained model from `https://nlp.stanford.edu/projects/glove/` into the folder `LiResolver/EE5/LocateTerms/data/glove.6B`.
+
+* Download the `bert-base-uncased` pretrained model from `https://huggingface.co` into the folder `LiResolver/AC/bert-base-uncased`.
+
+* Download the `corenlp` tool from `https://stanfordnlp.github.io/CoreNLP/` into the folder `LiResolver/model/stanford-corenlp-4.2.0`.
+
+
+
+
+
+
+#### Create the dependency environment
+
+`conda create ...`
+
+
+#### Run LiResolver
+
+**Input:**
+
+Place the OSS project you want to analyze in the folder `LiResolver/repos/`, then run:
+
+`cd LiResolver`
+
+`python3 main.py`
+
+
+**Output:**
+
+The incompatibility resolution results are written to the folder `LiResolver/REPAIRED/`; other processing information is printed to the console.



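The README additions above list four folders that must be populated before `main.py` is run. A minimal sketch of a pre-flight check follows, assuming the folder names given in the README; the helper `missing_dependencies` and the constant `REQUIRED_DIRS` are hypothetical and not part of LiResolver.

```python
import os

# Folder names are copied from the README instructions in this commit.
# The helper below is a hypothetical addition, not LiResolver code.
REQUIRED_DIRS = [
    os.path.join("RE", "roberta-base"),
    os.path.join("EE5", "LocateTerms", "data", "glove.6B"),
    os.path.join("AC", "bert-base-uncased"),
    os.path.join("model", "stanford-corenlp-4.2.0"),
]


def missing_dependencies(root_dir):
    """Return the required folders (relative to root_dir) that do not exist."""
    return [
        rel for rel in REQUIRED_DIRS
        if not os.path.isdir(os.path.join(root_dir, rel))
    ]
```

Calling `missing_dependencies` with the path of the `LiResolver` checkout returns an empty list once all four downloads are in place. The check only confirms that the folders exist, not that their contents are complete.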
--- a/main.py
+++ b/main.py
@@ -0,0 +1,59 @@
+#coding=utf-8
+import os
+import json
+import utils
+import LicenseRepair
+from LicenseDataset import Licensedataset
+from treelib import Tree, Node
+
+rootDir = os.path.dirname(os.path.abspath(__file__))
+unDir = os.path.join(rootDir, 'repos')
+
+
+
+
+# Load CoreNLP first
+from stanfordcorenlp import StanfordCoreNLP
+nlp = StanfordCoreNLP(os.path.join(rootDir, 'model', 'stanford-corenlp-4.2.0'))
+# Load the EE5 model
+from EE5.LocateTerms.nermodel.ner_model import NERModel
+from EE5.LocateTerms.nermodel.config import Config
+ner_config_ee5 = Config()
+ner_model_ee5 = NERModel(ner_config_ee5)
+ner_model_ee5.build()
+ner_model_ee5.restore_session(ner_config_ee5.dir_model)
+print('【ee5 loaded. 】')
+# Load the RE model
+from RE import re_predict
+re_args, re_model = re_predict.load_re_model()
+print('【re loaded. 】')
+# ner_model_ee5 = None
+# re_args = None
+# re_model = None
+# Load the AC model
+from tgrocery import Grocery
+ac_model = Grocery(os.path.join(rootDir, 'AC', 'ossl2_ac'))
+ac_model.load()
+print('【ac loaded. 】')
+# Load the license dataset (ld)
+ld = Licensedataset()
+ld.load_licenses_from_csv(nlp, ld, ner_model_ee5, re_args, re_model, ac_model)
+print('【ld loaded. 】')
+
+
+
+
+
+repo = "XXXX"
+lr, fg_hasPL, num_fixable, num_incom, num_repair, methods_repair \
+        = LicenseRepair.runLicenseRepair(repo, nlp, ld, ner_model_ee5, re_args, re_model, ac_model)
+
+
+
+# Close CoreNLP
+nlp.close()
+
+
+
+
+
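`main.py` as committed sets `repo = "XXXX"` as a placeholder, so the value has to be edited by hand before each run. One possible alternative is to enumerate the project folders under `repos/`. This is a sketch only: the helper `list_repos` is hypothetical, and the commit does not show whether `LicenseRepair.runLicenseRepair` expects a folder name or a full path.

```python
import os


def list_repos(un_dir):
    """Return the sorted names of the immediate sub-folders of un_dir."""
    if not os.path.isdir(un_dir):
        return []
    return sorted(
        name for name in os.listdir(un_dir)
        # Skip plain files; only folders are treated as projects.
        if os.path.isdir(os.path.join(un_dir, name))
    )
```

In `main.py` this could be used as `for repo in list_repos(unDir): ...`, with the call to `runLicenseRepair` inside the loop, once it is confirmed which form of the `repo` argument that function expects.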
--- a/projectLicenseTree.py
+++ b/projectLicenseTree.py
@@ -11,7 +11,8 @@ import os
 import utils

 rootDir = os.path.dirname(os.path.abspath(__file__))
-unDir = os.path.join(os.path.dirname(rootDir), 'repos') #####
+#unDir = os.path.join(os.path.dirname(rootDir), 'repos') #####
+unDir = os.path.join(rootDir, 'repos')


 outputDir000 = rootDir + '/output/'
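The change to `unDir` in `projectLicenseTree.py` moves the `repos` folder from next to the checkout to inside it, which matches the `LiResolver/repos/` location named in the README. The small example below makes the difference concrete. The install path `/home/user/LiResolver` is an assumption chosen for illustration, and `posixpath` is used so the result does not depend on the platform.

```python
import posixpath

# Hypothetical install location, used only to make the result concrete.
root_dir = "/home/user/LiResolver"

# Before this commit: a sibling of the checkout.
old_un_dir = posixpath.join(posixpath.dirname(root_dir), "repos")
# After this commit: a folder inside the checkout.
new_un_dir = posixpath.join(root_dir, "repos")

print(old_un_dir)  # /home/user/repos
print(new_un_dir)  # /home/user/LiResolver/repos
```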