Merge remote-tracking branch 'upstream/master'

commit c3ecd0d2a0

README.md
@ -2,38 +2,28 @@
http://docklet.unias.org

## intro
## Intro

Docklet is an operating system for a mini-datacenter. Its goal is to help
multiple users share cluster resources effectively. Unlike
"application-framework oriented" cluster managers such as Mesos and Yarn,
Docklet is **user oriented**. In Docklet, every user has their own private
**virtual cluster (vcluster)**, which consists of a number of virtual
Linux container nodes distributed over the physical cluster. Every
vcluster is separated from the others and can be operated like a real
physical cluster. Therefore, most applications, especially those
requiring a cluster environment, can run in a vcluster seamlessly.

Docklet is a cloud operating system for a mini-datacenter. Its goal is to
help multiple users share cluster resources effectively. In Docklet, every
user has their own private **virtual cluster (vcluster)**, which
consists of a number of virtual Linux container nodes distributed over
the physical cluster. Each vcluster is separated from the others and can
be operated like a real physical cluster. Therefore, most applications,
especially those requiring a cluster environment, can run in a vcluster
seamlessly.

Docklet provides a base image for creating virtual nodes. This image has
many mainstream development tools and frameworks pre-installed,
including gcc/g++, openjdk, python3, R, MPI, scala, ruby, php, node.js,
texlive, mpich2, spark,
scipy/numpy/matplotlib/pandas/sympy/scikit-learn, jupyter notebook, etc.
Users can get a ready vcluster with just one click, within one second.
Users manage and use their vcluster entirely through the web. The only
client tool needed is a modern web browser supporting HTML5, such as
Safari, Firefox, or Chrome. The integrated *jupyter notebook* provides a
web **Workspace**. In the Workspace, users can code, debug, test,
and run their programs, and even visualize the outputs online.
Therefore, it is ideal for data analysis and processing.

Users are free to install their specific software in their vcluster.
Docklet supports operation through a **web terminal**, where users can
work as an administrator on a console. The base image system is
Ubuntu. The recommended way of installing new software is via
**apt-get**.

Users manage and use their vcluster entirely through the web. The only
client tool needed is a modern web browser, such as Safari, Firefox, or
Chrome. The integrated *jupyter notebook* provides a web workspace. In
the workspace, users can code, debug, and test their programs
online. The **python scipy** family of tools can even display graphics
in the browser. Therefore, it is ideal for data analysis and
processing.

Docklet creates virtual nodes from a base image. Admins can
pre-install development tools and frameworks according to their users'
interests. Users are also free to install their specific software
in their vcluster.

Docklet needs only **one** public IP address. The vclusters are
configured to use a private IP address range, e.g., 172.16.0.0/16,
@ -47,54 +37,34 @@ The Docklet system runtime consists of four components:
- docklet master
- docklet worker

## install
## Install

Currently the docklet runtime is recommended to run on Ubuntu 15.10+.
Currently the Docklet system is recommended to run on Ubuntu 15.10+.

Ensure that python3.5 is the default python3 version.

Unpack the docklet tarball to a directory (/root/docklet as an
example); you will get:
Clone Docklet from GitHub:

```
readme.md
prepare.sh
conf/
    container.conf
    docklet.conf.template
    lxc-script/
bin/
    docklet-master
    docklet-worker
src/
    httprest.py
    worker.py
    ...
web/
    web.py
doc/
tools/
    update-basefs.sh
    start_jupyter.sh
git clone https://github.com/unias/docklet.git
```

If it is the first-time install, users should run **prepare.sh** to
install the necessary packages automatically. Note that this script may
need to be run several times to successfully install all the needed packages.
Run **prepare.sh** from the console to install the required packages and
generate the necessary configurations.

A *root* user will be created for managing the system. The password is
recorded in `FS_PREFIX/local/generated_password.txt`.
A *root* user will be created for managing the Docklet system. The
password is recorded in `FS_PREFIX/local/generated_password.txt`.

## config ##
## Config ##

The main configuration file of docklet is conf/docklet.conf. Most
default settings work for a single-host environment.

First, copy docklet.conf.template to get docklet.conf.

The following settings should be taken care of:
Pay attention to the following settings:

- NETWORK_DEVICE : the network device to use.
- NETWORK_DEVICE : the network interface to use.
- ETCD : the etcd server address. For a distributed multi-host
  environment, it should be one of the ETCD public server addresses.
  For a single-host environment, the default value should be OK.

@ -111,7 +81,7 @@ The following settings should be taken care of:
by visiting this address. If the system is behind a firewall, then
a reverse proxy should be set up.

## start ##
## Start ##

### distributed file system ###

@ -123,8 +93,7 @@ Let's presume the file system server exports the filesystem as nfs
On each physical host that runs docklet, mount **fileserver:/pub** to
**FS_PREFIX/global**.

For a single-host environment, there is no need to configure a
distributed file system.
For a single-host environment, nothing needs to be done.

### etcd ###

@ -133,7 +102,7 @@ Ubuntu releases have included **etcd** in the repository, just `apt-get
install etcd`, and there is no need to start etcd manually. For others, you
should install etcd manually.

For a multi-host distributed environment, start
For a multi-host distributed environment, you **must** start
**dep/etcd-multi-nodes.sh** on each etcd server host. This script
requires users to provide the etcd server addresses as parameters.

@ -146,27 +115,19 @@ address, e.g., 172.16.0.1. This server will be the master.
If it is the first time you start docklet, run `bin/docklet-master init`
to initialize and start the docklet master. Otherwise, run `bin/docklet-master start`,
which will start the master in recovery mode in the background using
conf/docklet.conf. This means docklet will recover existing workspaces.

This script in fact starts three daemons: the docklet master
(httprest.py), the configurable-http-proxy, and the docklet web frontend (web.py).
conf/docklet.conf.

You can check the daemon status by running `bin/docklet-master status`

If the master fails to start, you can try `bin/docklet-master init`
to initialize the whole system.

More usage information can be found by typing `bin/docklet-master`

The master logs are in **FS_PREFIX/local/log/docklet-master.log** and
**docklet-web.log**.

### worker ###

The worker needs a basefs image to boot containers.
The worker needs a basefs image to create containers.

You can create such an image with `lxc-create -n test -t download`,
and then copy the rootfs to **FS_PREFIX/local**, and rename `rootfs`
then copy the rootfs to **FS_PREFIX/local**, and rename `rootfs`
to `basefs`.

Note the `jupyterhub` package must be installed for this image. And the
@ -179,21 +140,17 @@ Run `bin/docklet-worker start`, will start worker in background.

You can check the daemon status by running `bin/docklet-worker status`

More usage information can be found by typing `bin/docklet-worker`

The log is in **FS_PREFIX/local/log/docklet-worker.log**

Currently, the worker must be run after the master has been started.

## usage ##
## Usage ##

Open a browser and visit the address specified by PORTAL_URL,
e.g., `http://docklet.info/`

If the system is deployed on a single host for testing purposes,
then PORTAL_URL defaults to `http://MASTER_IP:PROXY_PORT`,
e.g., `http://localhost:8000`.

That's it.

## system admin ##
# Contribute #

Contributions are welcome. Please check the [devguide](doc/devguide/devguide.md).
@ -47,16 +47,19 @@
# CLUSTER_NET: cluster network ip address range, default is 172.16.0.1/16
# CLUSTER_NET=172.16.0.1/16

# Deprecated since v0.2.7. Read from the quota group set in the web admin page.
# CONTAINER_CPU: CPU quota of container, default is 100000
# A single CPU core has total=100000 (100ms), so the default 100000
# means a single container can occupy a whole core.
# For a CPU with two cores, this can be set to 200000
# CONTAINER_CPU=100000

# Deprecated since v0.2.7. Read from the quota group set in the web admin page.
# CONTAINER_DISK: disk quota of container image upper layer, count in MB,
# default is 1000
# CONTAINER_DISK=1000

# Deprecated since v0.2.7. Read from the quota group set in the web admin page.
# CONTAINER_MEMORY: memory quota of container, count in MB, default is 1000
# CONTAINER_MEMORY=1000

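To make the CONTAINER_CPU comment above concrete: the value is a CFS-style quota, i.e., microseconds of CPU time granted per 100000-microsecond (100 ms) scheduling period, so dividing by 100000 gives the number of cores a container may occupy. A minimal sketch of that arithmetic; the helper name is an illustrative assumption, not Docklet code:

```python
# Sketch of the CONTAINER_CPU semantics described above; the helper
# name is an illustrative assumption, not part of Docklet.
CFS_PERIOD_US = 100000  # one 100 ms scheduling period, in microseconds

def quota_to_cores(container_cpu: int) -> float:
    """Convert a CONTAINER_CPU value (us of CPU time per period) to cores."""
    return container_cpu / CFS_PERIOD_US

print(quota_to_cores(100000))  # 1.0 -> a whole single core
print(quota_to_cores(200000))  # 2.0 -> two cores
```
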
@ -161,6 +161,13 @@ class DockletHttpHandler(http.server.BaseHTTPRequestHandler):
logger.info ("handle request : create cluster %s with image %s " % (clustername, image['name']))
|
||||
[status, result] = G_vclustermgr.create_cluster(clustername, user, image, user_info)
|
||||
if status:
|
||||
user_info = G_usermgr.selfQuery(cur_user = cur_user)
|
||||
quota = {}
|
||||
quota['cpu'] = float(user_info['data']['groupinfo']['cpu'])/100000.0
|
||||
quota['memory'] = float(user_info['data']['groupinfo']['memory'])*1000000/1024
|
||||
etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (G_clustername))
|
||||
for con in result['containers']:
|
||||
etcdser.setkey('/vnodes/%s/quota'%(con['containername']), quota)
|
||||
self.response(200, {'success':'true', 'action':'create cluster', 'message':result})
|
||||
else:
|
||||
self.response(200, {'success':'false', 'action':'create cluster', 'message':result})
|
||||
|
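One detail worth spelling out in the hunk above: the group's memory quota is configured in MB, but the value written to etcd is in KiB, because the monitor later compares it against lxc memory readings normalized to KiB. A standalone sketch of the conversion; the function name and sample value are illustrative assumptions:

```python
# Illustrative sketch of the MB -> KiB conversion used for quota['memory'];
# the function name and sample value are assumptions, not Docklet API.
def memory_mb_to_kib(mem_mb: float) -> float:
    # MB * 1,000,000 bytes per MB, divided by 1024 bytes per KiB
    return float(mem_mb) * 1000000 / 1024

print(memory_mb_to_kib(2000))  # 1953125.0 KiB, i.e. roughly 2 GB
```
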
@ -177,6 +184,13 @@ class DockletHttpHandler(http.server.BaseHTTPRequestHandler):
        user_info = json.dumps(user_info)
        [status, result] = G_vclustermgr.scale_out_cluster(clustername, user, image, user_info)
        if status:
            user_info = G_usermgr.selfQuery(cur_user = cur_user)
            quota = {}
            quota['cpu'] = float(user_info['data']['groupinfo']['cpu'])/100000.0
            quota['memory'] = float(user_info['data']['groupinfo']['memory'])*1000000/1024
            etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (G_clustername))
            for con in result['containers']:
                etcdser.setkey('/vnodes/%s/quota'%(con['containername']), quota)
            self.response(200, {'success':'true', 'action':'scale out', 'message':result})
        else:
            self.response(200, {'success':'false', 'action':'scale out', 'message':result})

@ -68,7 +68,7 @@ class ImageMgr():
            return [False,"target image already exists"]
        try:
            sys_run("mkdir -p %s" % imgpath+image,True)
            sys_run("rsync -a --delete --exclude=lost+found/ --exclude=nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (self.dealpath(fspath),imgpath+image),True)
            sys_run("rsync -a --delete --exclude=lost+found/ --exclude=root/nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (self.dealpath(fspath),imgpath+image),True)
            sys_run("rm -f %s" % (imgpath+"."+image+"_docklet_share"),True)
        except Exception as e:
            logger.error(e)

@ -87,9 +87,8 @@ class ImageMgr():
imgpath = self.imgpath + "private/" + user + "/"
|
||||
else:
|
||||
imgpath = self.imgpath + "public/" + imageowner + "/"
|
||||
|
||||
try:
|
||||
sys_run("rsync -a --delete --exclude=lost+found/ --exclude=nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (imgpath+imagename,self.dealpath(fspath)),True)
|
||||
sys_run("rsync -a --delete --exclude=lost+found/ --exclude=root/nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (imgpath+imagename,self.dealpath(fspath)),True)
|
||||
except Exception as e:
|
||||
logger.error(e)
|
||||
|
||||
|
|
|
@ -7,17 +7,18 @@ from log import logger
class Container_Collector(threading.Thread):

    def __init__(self,etcdaddr,cluster_name,host,cpu_quota,mem_quota,test=False):
    def __init__(self,etcdaddr,cluster_name,host,test=False):
        threading.Thread.__init__(self)
        self.thread_stop = False
        self.host = host
        self.etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (cluster_name))
        self.etcdser.setkey('/vnodes/cpu_quota', cpu_quota)
        self.etcdser.setkey('/vnodes/mem_quota', mem_quota)
        self.cpu_quota = float(cpu_quota)/100000.0
        self.mem_quota = float(mem_quota)*1000000/1024
        #self.cpu_quota = float(cpu_quota)/100000.0
        #self.mem_quota = float(mem_quota)*1000000/1024
        self.interval = 2
        self.test = test
        self.cpu_last = {}
        self.cpu_quota = {}
        self.mem_quota = {}
        return

    def list_container(self):
@ -46,22 +47,32 @@ class Container_Collector(threading.Thread):
        basic_info['PID'] = info['PID']
        basic_info['IP'] = info['IP']
        self.etcdser.setkey('/vnodes/%s/basic_info'%(container_name), basic_info)

        cpu_parts = re.split(' +',info['CPU use'])
        cpu_val = cpu_parts[0].strip()
        cpu_unit = cpu_parts[1].strip()
        res = self.etcdser.getkey('/vnodes/%s/cpu_use'%(container_name))
        cpu_last = 0
        if res[0] == True:
            last_use = dict(eval(res[1]))
            cpu_last = float(last_use['val'])
        if not container_name in self.cpu_last.keys():
            [ret, ans] = self.etcdser.getkey('/vnodes/%s/quota'%(container_name))
            if ret == True :
                res = dict(eval(ans))
                self.cpu_quota[container_name] = res['cpu']
                self.mem_quota[container_name] = res['memory']
                self.cpu_last[container_name] = 0
            else:
                logger.warning(ans)
                self.cpu_quota[container_name] = 1
                self.mem_quota[container_name] = 2000*1000000/1024
                self.cpu_last[container_name] = 0
        cpu_use = {}
        cpu_use['val'] = cpu_val
        cpu_use['unit'] = cpu_unit
        cpu_usedp = (float(cpu_val)-float(cpu_last))/(self.cpu_quota*self.interval*1.3)
        if(cpu_usedp > 1):
        cpu_usedp = (float(cpu_val)-float(self.cpu_last[container_name]))/(self.cpu_quota[container_name]*self.interval*1.3)
        if(cpu_usedp > 1 or cpu_usedp < 0):
            cpu_usedp = 1
        cpu_use['usedp'] = cpu_usedp
        self.cpu_last[container_name] = cpu_val;
        self.etcdser.setkey('vnodes/%s/cpu_use'%(container_name), cpu_use)

        mem_parts = re.split(' +',info['Memory use'])
        mem_val = mem_parts[0].strip()
        mem_unit = mem_parts[1].strip()
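The usage percentage computed above is the CPU-seconds consumed since the last sample, divided by the CPU-seconds the quota allows in one sampling interval, padded by a 1.3 slack factor; the new version also clamps negative values that appear when a counter resets. A standalone sketch under those assumptions; names and sample values are illustrative, not Docklet code:

```python
# Illustrative sketch of the cpu_usedp formula above; names and sample
# values are assumptions, not Docklet code.
def cpu_used_fraction(cpu_now, cpu_last, quota_cores, interval=2, slack=1.3):
    # delta CPU-seconds / CPU-seconds allowed per interval (with slack)
    usedp = (cpu_now - cpu_last) / (quota_cores * interval * slack)
    # clamp: counter resets or timing jitter can push the ratio outside [0, 1]
    return min(max(usedp, 0.0), 1.0)

print(cpu_used_fraction(12.8, 10.2, quota_cores=1.0))  # 1.0 (saturated)
print(cpu_used_fraction(10.5, 10.2, quota_cores=1.0))  # ~0.115
```
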
@ -70,7 +81,9 @@ class Container_Collector(threading.Thread):
        mem_use['unit'] = mem_unit
        if(mem_unit == "MiB"):
            mem_val = float(mem_val) * 1024
            mem_usedp = float(mem_val) / self.mem_quota
        elif (mem_unit == "GiB"):
            mem_val = float(mem_val) * 1024 * 1024
        mem_usedp = float(mem_val) / self.mem_quota[container_name]
        mem_use['usedp'] = mem_usedp
        self.etcdser.setkey('/vnodes/%s/mem_use'%(container_name), mem_use)
        #print(output)
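For reference, the memory reading reported by lxc carries a unit, so the branches above normalize it to KiB before dividing by the per-container quota (also held in KiB). A standalone sketch, with names and the sample reading assumed for illustration:

```python
# Illustrative normalization of an lxc memory reading to KiB, mirroring
# the branches above; names and sample values are assumptions.
def mem_used_fraction(mem_val: str, mem_unit: str, quota_kib: float) -> float:
    val = float(mem_val)
    if mem_unit == "MiB":
        val *= 1024            # MiB -> KiB
    elif mem_unit == "GiB":
        val *= 1024 * 1024     # GiB -> KiB
    return val / quota_kib

print(mem_used_fraction("512.0", "MiB", 1953125.0))  # ~0.268 of a ~2 GB quota
```
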
@ -220,7 +233,6 @@ class Container_Fetcher:
        [ret, ans] = self.etcdser.getkey('/%s/cpu_use'%(container_name))
        if ret == True :
            res = dict(eval(ans))
            res['quota'] = self.etcdser.getkey('/cpu_quota')[1]
            return res
        else:
            logger.warning(ans)

@ -231,7 +243,6 @@ class Container_Fetcher:
        [ret, ans] = self.etcdser.getkey('/%s/mem_use'%(container_name))
        if ret == True :
            res = dict(eval(ans))
            res['quota'] = self.etcdser.getkey('/mem_quota')[1]
            return res
        else:
            logger.warning(ans)

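As an editorial aside, both fetchers parse the etcd value with `dict(eval(ans))`, which executes whatever string comes back from the store. A hedged alternative sketch using the standard library's `ast.literal_eval`, which accepts only Python literals; this is a suggestion, not the project's implementation:

```python
# Hedged alternative to dict(eval(ans)): ast.literal_eval parses only
# Python literals, so a malicious or corrupted etcd value cannot
# execute code. This is a suggestion, not Docklet's implementation.
import ast

def parse_etcd_value(ans: str) -> dict:
    value = ast.literal_eval(ans)
    if not isinstance(value, dict):
        raise ValueError("expected a dict, got %r" % type(value))
    return value

print(parse_etcd_value("{'val': '12.8', 'unit': 'seconds'}"))
```
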
@ -192,7 +192,7 @@ if __name__ == '__main__':
logger.info ("using WORKER_PORT %s" % worker_port )
|
||||
|
||||
con_collector = monitor.Container_Collector(etcdaddr, clustername,
|
||||
ipaddr, cpu_quota, mem_quota)
|
||||
ipaddr)
|
||||
con_collector.start()
|
||||
logger.info("CPU and Memory usage monitor started")
|
||||
|
||||
|
|