diff --git a/README.md b/README.md
index 628afbe..8008fa8 100644
--- a/README.md
+++ b/README.md
@@ -2,38 +2,28 @@
 http://docklet.unias.org
 
-## intro
+## Intro
 
-Docklet is an operating system for mini-datacener. Its goal is to help
-multi-user share cluster resources effectively. Unlike the "application
-framework oriented" cluster manager such as Mesos and Yarn, Docklet is
-**user oriented**. In Docklet, every user has their own private
-**virtual cluster (vcluster)**, which consists of a number of virtual
-Linux container nodes distributed over the physical cluster. Every
-vcluster is separated from others and can be operated like a real
-physical cluster. Therefore, most applications, especially those
-requiring a cluster environment, can run in vcluster seamlessly.
+Docklet is a cloud operating system for mini-datacenters. Its goal is
+to help multiple users share cluster resources effectively. In Docklet,
+every user has their own private **virtual cluster (vcluster)**, which
+consists of a number of virtual Linux container nodes distributed over
+the physical cluster. Each vcluster is separated from others and can be
+operated like a real physical cluster. Therefore, most applications,
+especially those requiring a cluster environment, can run in a vcluster
+seamlessly.
 
-Docklet provides a base image for creating virtual nodes. This image has
-pre-installed a lot of mainstream development tools and frameworks,
-including gcc/g++, openjdk, python3, R, MPI, scala, ruby, php, node.js,
-texlive, mpich2, spark,
-scipy/numpy/matplotlib/pandas/sympy/scikit-learn, jupyter notebook, etc.
-Users can get a ready vcluster with just one click within 1 second.
+Users manage and use their vcluster entirely through the web. The only
+client tool needed is a modern web browser supporting HTML5, like
+Safari, Firefox, or Chrome. The integrated *jupyter notebook* provides
+a web **Workspace**. In the Workspace, users can code, debug, test,
+and run their programs, and even visualize the outputs online.
+Therefore, it is ideal for data analysis and processing.
 
-The users are free to install their specific software in their vcluster.
-Docklet supports operating through **web terminal**. Users can do their
-work as an administrator working on a console. The base image system is
-ubuntu. The recommended way of installing new software is by
-**apt-get**.
-
-The users manage and use their vcluster all through web. The only client
-tool needed is a modern web browser, like safari, firefox, chrome. The
-integrated *jupyter notebook* provides a web workspace. By visiting the
-workspace, users can do coding, debugging and testing of their programs
-online. The **python scipy** series of tools can even display graphical
-pictures in the browser. Therefore, it is ideal for data analysis and
-processing.
+Docklet creates virtual nodes from a base image. Admins can
+pre-install development tools and frameworks according to their
+users' needs. The users are also free to install their own
+software in their vcluster.
 
 Docklet only need **one** public IP address. The vclusters are
 configured to use private IP address range, e.g., 172.16.0.0/16,
@@ -47,54 +37,34 @@ The Docklet system runtime consists of four components:
 
 - docklet master
 - docklet worker
 
-## install
+## Install
 
-Currently the docklet runtime is recommend to run in Unbuntu 15.10+.
+Currently the Docklet system is recommended to run on Ubuntu 15.10+.
 Ensure that python3.5 is the default python3 version.
 
-Unpack the docklet tarball to a directory ( /root/docklet as an
-example), will get
+Clone Docklet from GitHub:
 
 ```
-readme.md
-prepare.sh
-conf/
-    container.conf
-    docklet.conf.template
-    lxc-script/
-bin/
-    docklet-master
-    docklet-worker
-src/
-    httprest.py
-    worker.py
-    ...
-web/
-    web.py
-doc/
-tools/
-    update-basefs.sh
-    start_jupyter.sh
+git clone https://github.com/unias/docklet.git
 ```
 
-If it is the first time install, users should run **prepare.sh** to
-install necessary packages automatically. Note it may need to run this
-script several times to successfully install all the needed packages.
+Run **prepare.sh** from console to install the required packages and
+generate necessary configurations.
 
-A *root* users will be created for managing the system. The password is
-recorded in `FS_PREFIX/local/generated_password.txt` .
+A *root* user will be created for managing the Docklet system. The
+password is recorded in `FS_PREFIX/local/generated_password.txt` .
 
-## config ##
+## Config ##
 
 The main configuration file of docklet is conf/docklet.conf. Most
 default setting works for a single host environment. First copy
 docklet.conf.template to get docklet.conf.
 
-The following settings should be taken care of:
+Pay attention to the following settings:
 
-- NETWORK_DEVICE : the network device to use.
+- NETWORK_DEVICE : the network interface to use.
 - ETCD : the etcd server address. For distributed muli hosts
   environment, it should be one of the ETCD public server address. For
   single host environment, the default value should be OK.
@@ -111,7 +81,7 @@ The following settings should be taken care of:
   by visiting this address. If the system is behind a firewall, then a
   reverse proxy should be setup.
 
-## start ##
+## Start ##
 
 ### distributed file system ###
 
@@ -123,8 +93,7 @@ Lets presume the file system server export filesystem as nfs
 In each physical host to run docklet, mount **fileserver:/pub** to
 **FS_PEFIX/global** .
 
-For single host environment, it need not to configure distributed
-file system.
+For a single host environment, nothing needs to be done.
 
 ### etcd ###
 
@@ -133,7 +102,7 @@ Ubuntu releases have included **etcd** in the repository, just
 `apt-get install etcd`, and it need not to start etcd manually.
 
 For others, you should install etcd manually.
 
-For multi hosts distributed environment, start
+For a multi-host distributed environment, you **must** start
 **dep/etcd-multi-nodes.sh** in each etcd server hosts. This scripts
 requires users providing the etcd server address as parameters.
 
@@ -146,27 +115,19 @@ address, e.g., 172.16.0.1. This server will be the master.
 
 If it is the first time you start docklet, run `bin/docklet-master init`
 to init and start docklet master. Otherwise, run `bin/docklet-master
 start`, which will start master in recovery mode in background using
-conf/docklet.conf. It means docklet will recover workspaces existed.
-
-This script in fact will start three daemons: the docklet master of
-httprest.py, the configurable-http-proxy and the docklet web of web.py.
+conf/docklet.conf.
 
 You can check the daemon status by running `bin/docklet-master status`
 
-If the master failed to start, you could try `bin/docklet-master init`
-to initialize the whole system.
-
-More usages can be found by typing `bin/docklet-master`
-
 The master logs are in **FS_PREFIX/local/log/docklet-master.log** and
 **docklet-web.log**.
 
 ### worker ###
 
-Worker needs a basefs image to boot container.
+Worker needs a basefs image to create containers.
 
 You can create such an image with `lxc-create -n test -t download`,
-and then copy the rootfs to **FS_PREFIX/local**, and renamed `rootfs`
+then copy the rootfs to **FS_PREFIX/local**, and rename `rootfs`
 to `basefs`.
 
 Note the `jupyerhub` package must be installed for this image. And the
@@ -179,21 +140,17 @@ Run `bin/docklet-worker start`, will start worker in background.
 
 You can check the daemon status by running `bin/docklet-worker status`
 
-More usages can be found by typing `bin/docklet-worker`
-
 The log is in **FS_PREFIX/local/log/docklet-worker.log**
 
 Currently, the worker must be run after the master has been started.
-## usage ##
+## Usage ##
 
 Open a browser, visiting the address specified by PORTAL_URL , e.g., `
 http://docklet.info/ `
 
-If the system is just deployed in single host for testing purpose,
-then the PORTAL_URL defaults to `http://MASTER_IP:PROXY_PORT`,
-e.g., `http://localhost:8000`.
-
 That is it.
 
-## system admin ##
+# Contribute #
+
+Contributions are welcome. Please check [devguide](doc/devguide/devguide.md).
diff --git a/conf/docklet.conf.template b/conf/docklet.conf.template
index 1c2d161..0537c51 100644
--- a/conf/docklet.conf.template
+++ b/conf/docklet.conf.template
@@ -47,16 +47,19 @@
 # CLUSTER_NET: cluster network ip address range, default is 172.16.0.1/16
 # CLUSTER_NET=172.16.0.1/16
 
+# Deprecated since v0.2.7. Read from the quota group set in the web admin page.
 # CONTAINER_CPU: CPU quota of container, default is 100000
 #    A single CPU core has total=100000 (100ms), so the default 100000
 #    mean a single container can occupy a whole core.
 #    For a CPU with two cores, this can be set to 200000
 # CONTAINER_CPU=100000
 
+# Deprecated since v0.2.7. Read from the quota group set in the web admin page.
 # CONTAINER_DISK: disk quota of container image upper layer, count in MB,
 #    default is 1000
 # CONTAINER_DISK=1000
 
+# Deprecated since v0.2.7. Read from the quota group set in the web admin page.
 # CONTAINER_MEMORY: memory quota of container, count in MB, default is 1000
 # CONTAINER_MEMORY=1000
diff --git a/src/httprest.py b/src/httprest.py
index 56d6898..3becb32 100755
--- a/src/httprest.py
+++ b/src/httprest.py
@@ -161,6 +161,13 @@ class DockletHttpHandler(http.server.BaseHTTPRequestHandler):
             logger.info ("handle request : create cluster %s with image %s " % (clustername, image['name']))
             [status, result] = G_vclustermgr.create_cluster(clustername, user, image, user_info)
             if status:
+                user_info = G_usermgr.selfQuery(cur_user = cur_user)
+                quota = {}
+                quota['cpu'] = float(user_info['data']['groupinfo']['cpu'])/100000.0
+                quota['memory'] = float(user_info['data']['groupinfo']['memory'])*1000000/1024
+                etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (G_clustername))
+                for con in result['containers']:
+                    etcdser.setkey('/vnodes/%s/quota'%(con['containername']), quota)
                 self.response(200, {'success':'true', 'action':'create cluster', 'message':result})
             else:
                 self.response(200, {'success':'false', 'action':'create cluster', 'message':result})
@@ -177,6 +184,13 @@ class DockletHttpHandler(http.server.BaseHTTPRequestHandler):
             user_info = json.dumps(user_info)
             [status, result] = G_vclustermgr.scale_out_cluster(clustername, user, image, user_info)
             if status:
+                user_info = G_usermgr.selfQuery(cur_user = cur_user)
+                quota = {}
+                quota['cpu'] = float(user_info['data']['groupinfo']['cpu'])/100000.0
+                quota['memory'] = float(user_info['data']['groupinfo']['memory'])*1000000/1024
+                etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (G_clustername))
+                for con in result['containers']:
+                    etcdser.setkey('/vnodes/%s/quota'%(con['containername']), quota)
                 self.response(200, {'success':'true', 'action':'scale out', 'message':result})
             else:
                 self.response(200, {'success':'false', 'action':'scale out', 'message':result})
diff --git a/src/imagemgr.py b/src/imagemgr.py
index be85fff..0893fef 100755
--- a/src/imagemgr.py
+++ b/src/imagemgr.py
@@ -68,7 +68,7 @@ class ImageMgr():
             return [False,"target image is exists"]
         try:
             sys_run("mkdir -p %s" % imgpath+image,True)
-            sys_run("rsync -a --delete --exclude=lost+found/ --exclude=nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (self.dealpath(fspath),imgpath+image),True)
+            sys_run("rsync -a --delete --exclude=lost+found/ --exclude=root/nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (self.dealpath(fspath),imgpath+image),True)
             sys_run("rm -f %s" % (imgpath+"."+image+"_docklet_share"),True)
         except Exception as e:
             logger.error(e)
@@ -87,8 +87,8 @@ class ImageMgr():
             imgpath = self.imgpath + "private/" + user + "/"
         else:
             imgpath = self.imgpath + "public/" + imageowner + "/"
         try:
-            sys_run("rsync -a --delete --exclude=lost+found/ --exclude=nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (imgpath+imagename,self.dealpath(fspath)),True)
+            sys_run("rsync -a --delete --exclude=lost+found/ --exclude=root/nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (imgpath+imagename,self.dealpath(fspath)),True)
         except Exception as e:
             logger.error(e)
diff --git a/src/monitor.py b/src/monitor.py
index 928eae4..c95b789 100755
--- a/src/monitor.py
+++ b/src/monitor.py
@@ -7,17 +7,18 @@ from log import logger
 
 class Container_Collector(threading.Thread):
-    def __init__(self,etcdaddr,cluster_name,host,cpu_quota,mem_quota,test=False):
+    def __init__(self,etcdaddr,cluster_name,host,test=False):
         threading.Thread.__init__(self)
         self.thread_stop = False
         self.host = host
         self.etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (cluster_name))
-        self.etcdser.setkey('/vnodes/cpu_quota', cpu_quota)
-        self.etcdser.setkey('/vnodes/mem_quota', mem_quota)
-        self.cpu_quota = float(cpu_quota)/100000.0
-        self.mem_quota = float(mem_quota)*1000000/1024
+        #self.cpu_quota = float(cpu_quota)/100000.0
+        #self.mem_quota = float(mem_quota)*1000000/1024
         self.interval = 2
         self.test = test
+        self.cpu_last = {}
+        self.cpu_quota = {}
+        self.mem_quota = {}
         return
 
     def list_container(self):
@@ -46,22 +47,32 @@ class Container_Collector(threading.Thread):
             basic_info['PID'] = info['PID']
             basic_info['IP'] = info['IP']
             self.etcdser.setkey('/vnodes/%s/basic_info'%(container_name), basic_info)
+
             cpu_parts = re.split(' +',info['CPU use'])
             cpu_val = cpu_parts[0].strip()
             cpu_unit = cpu_parts[1].strip()
-            res = self.etcdser.getkey('/vnodes/%s/cpu_use'%(container_name))
-            cpu_last = 0
-            if res[0] == True:
-                last_use = dict(eval(res[1]))
-                cpu_last = float(last_use['val'])
+            if container_name not in self.cpu_last:
+                [ret, ans] = self.etcdser.getkey('/vnodes/%s/quota'%(container_name))
+                if ret == True:
+                    res = dict(eval(ans))
+                    self.cpu_quota[container_name] = res['cpu']
+                    self.mem_quota[container_name] = res['memory']
+                    self.cpu_last[container_name] = 0
+                else:
+                    logger.warning(ans)
+                    self.cpu_quota[container_name] = 1
+                    self.mem_quota[container_name] = 2000*1000000/1024
+                    self.cpu_last[container_name] = 0
             cpu_use = {}
             cpu_use['val'] = cpu_val
             cpu_use['unit'] = cpu_unit
-            cpu_usedp = (float(cpu_val)-float(cpu_last))/(self.cpu_quota*self.interval*1.3)
-            if(cpu_usedp > 1):
+            cpu_usedp = (float(cpu_val)-float(self.cpu_last[container_name]))/(self.cpu_quota[container_name]*self.interval*1.3)
+            if(cpu_usedp > 1 or cpu_usedp < 0):
                 cpu_usedp = 1
             cpu_use['usedp'] = cpu_usedp
+            self.cpu_last[container_name] = cpu_val
             self.etcdser.setkey('vnodes/%s/cpu_use'%(container_name), cpu_use)
+
             mem_parts = re.split(' +',info['Memory use'])
             mem_val = mem_parts[0].strip()
             mem_unit = mem_parts[1].strip()
@@ -70,7 +81,9 @@ class Container_Collector(threading.Thread):
             mem_use['unit'] = mem_unit
             if(mem_unit == "MiB"):
                 mem_val = float(mem_val) * 1024
-            mem_usedp = float(mem_val) / self.mem_quota
+            elif (mem_unit == "GiB"):
+                mem_val = float(mem_val) * 1024 * 1024
+            mem_usedp = float(mem_val) / self.mem_quota[container_name]
             mem_use['usedp'] = mem_usedp
             self.etcdser.setkey('/vnodes/%s/mem_use'%(container_name), mem_use)
             #print(output)
@@ -220,7 +233,6 @@ class Container_Fetcher:
         [ret, ans] = self.etcdser.getkey('/%s/cpu_use'%(container_name))
         if ret == True :
             res = dict(eval(ans))
-            res['quota'] = self.etcdser.getkey('/cpu_quota')[1]
             return res
         else:
             logger.warning(ans)
@@ -231,7 +243,6 @@ class Container_Fetcher:
         [ret, ans] = self.etcdser.getkey('/%s/mem_use'%(container_name))
         if ret == True :
             res = dict(eval(ans))
-            res['quota'] = self.etcdser.getkey('/mem_quota')[1]
             return res
         else:
             logger.warning(ans)
diff --git a/src/worker.py b/src/worker.py
index f3e9608..87ebb33 100755
--- a/src/worker.py
+++ b/src/worker.py
@@ -192,7 +192,7 @@ if __name__ == '__main__':
     logger.info ("using WORKER_PORT %s" % worker_port )
 
     con_collector = monitor.Container_Collector(etcdaddr, clustername,
-        ipaddr, cpu_quota, mem_quota)
+        ipaddr)
     con_collector.start()
     logger.info("CPU and Memory usage monitor started")
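A note for reviewers of this patch: the per-container quota record is written by httprest.py and read back by monitor.py with implicit units (a cgroup cfs quota of 100000 µs per 100 ms period is one core; memory is stored as MB * 1000000/1024, a KiB-like unit). The sketch below restates those conversions and the usage-fraction formula standalone; the helper names are illustrative and not part of the repository.

```python
# Standalone sketch of the quota math this patch adds. quota_from_group
# mirrors the conversion in httprest.py; cpu_usedp mirrors the sampling
# formula in monitor.py's Container_Collector.

def quota_from_group(groupinfo):
    """Convert a quota-group record to the units monitor.py expects."""
    return {
        # 100000 is one full core's cfs quota (100 ms period)
        'cpu': float(groupinfo['cpu']) / 100000.0,
        # MB -> the KiB-like unit the collector compares against
        'memory': float(groupinfo['memory']) * 1000000 / 1024,
    }

def cpu_usedp(cpu_now, cpu_last, quota_cores, interval=2, fudge=1.3):
    """Fraction of the CPU quota used since the last 2-second sample.

    As in the patch, out-of-range values (sampling jitter, or a counter
    reset after a container restart) are reported as 100%.
    """
    p = (float(cpu_now) - float(cpu_last)) / (quota_cores * interval * fudge)
    if p > 1 or p < 0:
        p = 1
    return p

q = quota_from_group({'cpu': '200000', 'memory': '2000'})
assert q['cpu'] == 2.0                  # two cores
assert cpu_usedp(1.3, 0, 1.0) == 0.5    # half of the padded budget
```

The 1.3 factor pads the denominator to absorb scheduling jitter between samples; clamping a negative delta to 1 reproduces the patch's behavior, though reporting 0 after a counter reset might be the clearer choice.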