update run info

anonymous123rainy 2022-12-01 17:21:16 +08:00
parent fbd55e782a
commit d4e918c12a
3 changed files with 98 additions and 2 deletions


@@ -4,7 +4,7 @@
## Introduction
Open source software (OSS) licenses regulate the conditions under which OSS can be legally reused, distributed, and modified. However, a common issue arises when incorporating third-party OSS accompanied by licenses, i.e., license incompatibility, which occurs when multiple licenses exist in one project and there are conflicts between them. Despite being problematic, license incompatibility issues require substantial effort to fix because of limited license understanding and complex package dependencies.
In this paper, we propose LiResolver, a fine-grained, scalable, and flexible tool to resolve license incompatibility issues {for open source software}. Specifically, it first understands the semantics of licenses through fine-grained entity extraction and relation extraction. Then, it detects and resolves license incompatibility issues by recommending official licenses in priority. When no official licenses can satisfy the constraints, it generates a custom license as an alternative solution. Comprehensive experiments demonstrate the effectiveness of LiResolver, with 4.09\% FP rate and 0.02\% FN rate for incompatibility issue localization, and 62.61\% of 230 real-world incompatible projects resolved by LiResolver. Furthermore, we also evaluate the impacts of license hierarchy and copyright holder detection on the effectiveness of incompatibility resolution. We discuss lessons learned and made all the datasets and the replication package of LiResolver publicly available to facilitate follow-up research.
In this paper, we propose ***LiResolver***, a fine-grained, scalable, and flexible tool to resolve license incompatibility issues for open source software. Specifically, it first understands the semantics of licenses through fine-grained entity extraction and relation extraction. Then, it detects and resolves license incompatibility issues, recommending official licenses first. When no official license can satisfy the constraints, it generates a custom license as an alternative solution. Comprehensive experiments demonstrate the effectiveness of LiResolver, with a 4.09\% FP rate and a 0.02\% FN rate for incompatibility issue localization, and 62.61\% of 230 real-world incompatible projects resolved by LiResolver. Furthermore, we evaluate the impacts of license hierarchy and copyright holder detection on the effectiveness of incompatibility resolution. We discuss lessons learned and make all the datasets and the replication package of LiResolver publicly available to facilitate follow-up research.
![image](img/overview_00.png)
@@ -21,6 +21,42 @@ In this paper, we propose LiResolver, a fine-grained, scalable, and flexible too
## Installation
#### Download dependency files
* Download the `roberta-base` pretrained model from `https://huggingface.co` into the folder `LiResolver/RE/roberta-base`.
* Download the `glove` pretrained model from `https://nlp.stanford.edu/projects/glove/` into the folder `LiResolver/EE5/LocateTerms/data/glove.6B`.
* Download the `bert-base-uncased` pretrained model from `https://huggingface.co` into the folder `LiResolver/AC/bert-base-uncased`.
* Download the `corenlp` tool from `https://stanfordnlp.github.io/CoreNLP/` into the folder `LiResolver/model/stanford-corenlp-4.2.0`.
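
Before running the tool, it may help to confirm that the four folders listed above are in place. The following is a minimal sketch, not part of LiResolver: the helper name `missing_dependencies` is hypothetical, and the script assumes it is run from the directory that contains the `LiResolver` folder. It only checks that the folders exist, not that their contents are complete.

```python
import os

# Folders named in the list above, relative to the LiResolver root.
REQUIRED_DIRS = [
    os.path.join("RE", "roberta-base"),
    os.path.join("EE5", "LocateTerms", "data", "glove.6B"),
    os.path.join("AC", "bert-base-uncased"),
    os.path.join("model", "stanford-corenlp-4.2.0"),
]


def missing_dependencies(root):
    """Return the required folders that do not exist under `root`."""
    return [d for d in REQUIRED_DIRS if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    # Assumption: the current working directory contains LiResolver/.
    for rel in missing_dependencies("LiResolver"):
        print("missing:", rel)
```

An empty result means every expected folder was found.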
#### Create the dependency environment
`conda create ...`
#### Run LiResolver
**Input:**
Place the OSS project you want to analyze in the folder `LiResolver/repos/`, then run:
`cd LiResolver`
`python3 main.py`
**Output:**
The incompatibility resolution results are written to the folder `LiResolver/REPAIRED/`, and other processing information is printed to the console.
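
As a quick sanity check around a run, the sketch below lists what is present in the input and output folders. It is an illustration only: the helper `list_projects` is hypothetical, it assumes each project sits in its own sub-folder directly under `LiResolver/repos/`, and it makes no assumption about the internal layout of `LiResolver/REPAIRED/` beyond listing its sub-folders.

```python
import os


def list_projects(folder):
    """Return the sorted names of the sub-folders directly inside `folder`."""
    if not os.path.isdir(folder):
        return []
    return sorted(
        name
        for name in os.listdir(folder)
        if os.path.isdir(os.path.join(folder, name))
    )


if __name__ == "__main__":
    # Assumption: the current working directory contains LiResolver/.
    print("projects to analyze:", list_projects(os.path.join("LiResolver", "repos")))
    print("repaired results:", list_projects(os.path.join("LiResolver", "REPAIRED")))
```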

main.py Normal file

@@ -0,0 +1,59 @@
#coding=utf-8
import os
import json
import utils
import LicenseRepair
from LicenseDataset import Licensedataset
from treelib import Tree, Node
rootDir = os.path.dirname(os.path.abspath(__file__))
unDir = os.path.join(rootDir, 'repos')
# Load CoreNLP first
from stanfordcorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP(os.path.join(rootDir, 'model', 'stanford-corenlp-4.2.0'))
# Load the EE5 model
from EE5.LocateTerms.nermodel.ner_model import NERModel
from EE5.LocateTerms.nermodel.config import Config
ner_config_ee5 = Config()
ner_model_ee5 = NERModel(ner_config_ee5)
ner_model_ee5.build()
ner_model_ee5.restore_session(ner_config_ee5.dir_model)
print('【ee5 loaded. 】')
# Load the RE model
from RE import re_predict
re_args, re_model = re_predict.load_re_model()
print('【re loaded. 】')
# ner_model_ee5 = None
# re_args = None
# re_model = None
# Load the AC model
from tgrocery import Grocery
ac_model = Grocery(os.path.join(rootDir, 'AC', 'ossl2_ac'))
ac_model.load()
print('【ac loaded. 】')
# Load the license dataset (ld)
ld = Licensedataset()
ld.load_licenses_from_csv(nlp, ld, ner_model_ee5, re_args, re_model, ac_model)
print('【ld loaded. 】')
repo = "XXXX"
lr, fg_hasPL, num_fixable, num_incom, num_repair, methods_repair \
= LicenseRepair.runLicenseRepair(repo, nlp, ld, ner_model_ee5, re_args, re_model, ac_model)
# Shut down CoreNLP
nlp.close()


@@ -11,7 +11,8 @@ import os
import utils
rootDir = os.path.dirname(os.path.abspath(__file__))
unDir = os.path.join(os.path.dirname(rootDir), 'repos') #####
#unDir = os.path.join(os.path.dirname(rootDir), 'repos') #####
unDir = os.path.join(rootDir, 'repos')
outputDir000 = rootDir + '/output/'