Merge remote-tracking branch 'upstream/master'

This commit is contained in:
zhongyehong 2016-04-11 14:37:42 +08:00
commit c3ecd0d2a0
6 changed files with 86 additions and 102 deletions

README.md

@ -2,38 +2,28 @@
http://docklet.unias.org
## intro
## Intro
Docklet is an operating system for a mini-datacenter. Its goal is to help
multiple users share cluster resources effectively. Unlike "application
framework oriented" cluster managers such as Mesos and YARN, Docklet is
**user oriented**. In Docklet, every user has their own private
**virtual cluster (vcluster)**, which consists of a number of virtual
Linux container nodes distributed over the physical cluster. Every
vcluster is separated from others and can be operated like a real
physical cluster. Therefore, most applications, especially those
requiring a cluster environment, can run in vcluster seamlessly.
Docklet is a cloud operating system for a mini-datacenter. Its goal is to
help multiple users share cluster resources effectively. In Docklet, every
user has their own private **virtual cluster (vcluster)**, which
consists of a number of virtual Linux container nodes distributed over
the physical cluster. Each vcluster is separated from others and can be
operated like a real physical cluster. Therefore, most applications,
especially those requiring a cluster environment, can run in vcluster
seamlessly.
Docklet provides a base image for creating virtual nodes. This image
comes with many mainstream development tools and frameworks
pre-installed, including gcc/g++, openjdk, python3, R, MPI, scala, ruby,
php, node.js, texlive, mpich2, spark,
scipy/numpy/matplotlib/pandas/sympy/scikit-learn, jupyter notebook, etc.
Users can get a ready vcluster with just one click within 1 second.
Users manage and use their vcluster all through web. The only client
tool needed is a modern web browser supporting HTML5, like Safari,
Firefox, or Chrome. The integrated *jupyter notebook* provides a web
**Workspace**. In the Workspace, users can code, debug, test,
and run their programs, and even visualize the outputs online.
Therefore, it is ideal for data analysis and processing.
The users are free to install their specific software in their vcluster.
Docklet supports operating through a **web terminal**. Users can do their
work as an administrator working on a console. The base image system is
Ubuntu. The recommended way of installing new software is via
**apt-get**.
Users manage and use their vcluster entirely through the web. The only
client tool needed is a modern web browser, like Safari, Firefox, or
Chrome. The integrated *jupyter notebook* provides a web workspace. By
visiting the workspace, users can code, debug, and test their programs
online. The **python scipy** series of tools can even display plots in
the browser. Therefore, it is ideal for data analysis and processing.
Docklet creates virtual nodes from a base image. Admins can
pre-install development tools and frameworks according to their
interests. The users are also free to install their specific software
in their vcluster.
Docklet only needs **one** public IP address. The vclusters are
configured to use private IP address range, e.g., 172.16.0.0/16,
@ -47,54 +37,34 @@ The Docklet system runtime consists of four components:
- docklet master
- docklet worker
## install
## Install
Currently the docklet runtime is recommended to run on Ubuntu 15.10+.
Currently the Docklet system is recommended to run on Ubuntu 15.10+.
Ensure that python3.5 is the default python3 version.
Unpack the docklet tarball to a directory (/root/docklet as an
example); you will get:
Clone Docklet from GitHub:
```
readme.md
prepare.sh
conf/
container.conf
docklet.conf.template
lxc-script/
bin/
docklet-master
docklet-worker
src/
httprest.py
worker.py
...
web/
web.py
doc/
tools/
update-basefs.sh
start_jupyter.sh
git clone https://github.com/unias/docklet.git
```
If it is a first-time install, users should run **prepare.sh** to
install the necessary packages automatically. Note that this script may
need to be run several times to successfully install all needed packages.
Run **prepare.sh** from the console to install the required packages and
generate the necessary configuration.
A *root* user will be created for managing the system. The password is
recorded in `FS_PREFIX/local/generated_password.txt`.
A *root* user will be created for managing the Docklet system. The
password is recorded in `FS_PREFIX/local/generated_password.txt`.
## config ##
## Config ##
The main configuration file of docklet is conf/docklet.conf. Most
default settings work for a single-host environment.
First, copy docklet.conf.template to docklet.conf.
The following settings should be taken care of:
Pay attention to the following settings:
- NETWORK_DEVICE : the network device to use.
- NETWORK_DEVICE : the network interface to use.
- ETCD : the etcd server address. For a distributed multi-host
environment, it should be one of the etcd public server addresses.
For a single-host environment, the default value should be OK.
@ -111,7 +81,7 @@ The following settings should be taken care of:
by visiting this address. If the system is behind a firewall, then
a reverse proxy should be set up.
## start ##
## Start ##
### distributed file system ###
@ -123,8 +93,7 @@ Let's presume the file server exports the filesystem as NFS
On each physical host that runs docklet, mount **fileserver:/pub** at
**FS_PREFIX/global** .
For a single-host environment, there is no need to configure a
distributed file system.
For a single-host environment, nothing needs to be done.
### etcd ###
@ -133,7 +102,7 @@ Ubuntu releases have included **etcd** in the repository; just `apt-get
install etcd`, and there is no need to start etcd manually. For other
systems, you should install etcd manually.
For a multi-host distributed environment, start
For a multi-host distributed environment, you **must** run
**dep/etcd-multi-nodes.sh** on each etcd server host. This script
requires users to provide the etcd server addresses as parameters.
@ -146,27 +115,19 @@ address, e.g., 172.16.0.1. This server will be the master.
If it is the first time you start docklet, run `bin/docklet-master init`
to init and start docklet master. Otherwise, run `bin/docklet-master start`,
which will start the master in recovery mode in the background using
conf/docklet.conf. This means docklet will recover existing workspaces.
This script will in fact start three daemons: the docklet master
(httprest.py), the configurable-http-proxy, and the docklet web (web.py).
conf/docklet.conf.
You can check the daemon status by running `bin/docklet-master status`
If the master fails to start, you can try `bin/docklet-master init`
to initialize the whole system.
More usage information can be found by typing `bin/docklet-master`
The master logs are in **FS_PREFIX/local/log/docklet-master.log** and
**docklet-web.log**.
### worker ###
Worker needs a basefs image to boot container.
Worker needs a basefs image to create containers.
You can create such an image with `lxc-create -n test -t download`,
and then copy the rootfs to **FS_PREFIX/local**, renaming `rootfs`
then copy the rootfs to **FS_PREFIX/local**, renaming `rootfs`
to `basefs`.
Note the `jupyterhub` package must be installed for this image. And the
@ -179,21 +140,17 @@ Run `bin/docklet-worker start` to start the worker in the background.
You can check the daemon status by running `bin/docklet-worker status`
More usage information can be found by typing `bin/docklet-worker`
The log is in **FS_PREFIX/local/log/docklet-worker.log**
Currently, the worker must be run after the master has been started.
## usage ##
## Usage ##
Open a browser and visit the address specified by PORTAL_URL,
e.g., `http://docklet.info/`
If the system is deployed on a single host for testing purposes,
then PORTAL_URL defaults to `http://MASTER_IP:PROXY_PORT`,
e.g., `http://localhost:8000`.
That is it.
## system admin ##
# Contribute #
Contributions are welcome. Please check [devguide](doc/devguide/devguide.md)


@ -47,16 +47,19 @@
# CLUSTER_NET: cluster network ip address range, default is 172.16.0.1/16
# CLUSTER_NET=172.16.0.1/16
# Deprecated since v0.2.7. read from quota group set in web admin page
# CONTAINER_CPU: CPU quota of container, default is 100000
# A single CPU core has total=100000 (100ms), so the default 100000
# means a single container can occupy a whole core.
# For a CPU with two cores, this can be set to 200000
# CONTAINER_CPU=100000
# Deprecated since v0.2.7. read from quota group set in web admin page
# CONTAINER_DISK: disk quota of container image upper layer, count in MB,
# default is 1000
# CONTAINER_DISK=1000
# Deprecated since v0.2.7. read from quota group set in web admin page
# CONTAINER_MEMORY: memory quota of container, count in MB, default is 1000
# CONTAINER_MEMORY=1000


@ -161,6 +161,13 @@ class DockletHttpHandler(http.server.BaseHTTPRequestHandler):
logger.info ("handle request : create cluster %s with image %s " % (clustername, image['name']))
[status, result] = G_vclustermgr.create_cluster(clustername, user, image, user_info)
if status:
user_info = G_usermgr.selfQuery(cur_user = cur_user)
quota = {}
quota['cpu'] = float(user_info['data']['groupinfo']['cpu'])/100000.0
quota['memory'] = float(user_info['data']['groupinfo']['memory'])*1000000/1024
etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (G_clustername))
for con in result['containers']:
etcdser.setkey('/vnodes/%s/quota'%(con['containername']), quota)
self.response(200, {'success':'true', 'action':'create cluster', 'message':result})
else:
self.response(200, {'success':'false', 'action':'create cluster', 'message':result})
@ -177,6 +184,13 @@ class DockletHttpHandler(http.server.BaseHTTPRequestHandler):
user_info = json.dumps(user_info)
[status, result] = G_vclustermgr.scale_out_cluster(clustername, user, image, user_info)
if status:
user_info = G_usermgr.selfQuery(cur_user = cur_user)
quota = {}
quota['cpu'] = float(user_info['data']['groupinfo']['cpu'])/100000.0
quota['memory'] = float(user_info['data']['groupinfo']['memory'])*1000000/1024
etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (G_clustername))
for con in result['containers']:
etcdser.setkey('/vnodes/%s/quota'%(con['containername']), quota)
self.response(200, {'success':'true', 'action':'scale out', 'message':result})
else:
self.response(200, {'success':'false', 'action':'scale out', 'message':result})
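The quota conversion above is duplicated in both the create-cluster and scale-out handlers. A minimal sketch of factoring it into one helper; the helper name and the unit assumptions (CFS-style cpu units where 100000 equals one core, memory in MB converted to KiB for the monitor) are my reading of the code, not stated in the source:

```python
# Hedged sketch: factor the duplicated quota conversion into a helper.
# Assumed units: groupinfo['cpu'] follows the CFS convention where
# 100000 == 1 full core; groupinfo['memory'] is in MB, while the
# monitor compares against KiB values reported by lxc.

def group_quota_to_monitor_units(groupinfo):
    """Convert a quota-group record into the units the monitor expects."""
    return {
        # 100000 CFS-quota units == 1 core, so 200000 -> 2.0 cores
        'cpu': float(groupinfo['cpu']) / 100000.0,
        # MB -> bytes -> KiB (the monitor's memory values are in KiB)
        'memory': float(groupinfo['memory']) * 1000000 / 1024,
    }

quota = group_quota_to_monitor_units({'cpu': '200000', 'memory': '2000'})
# quota['cpu'] == 2.0 ; quota['memory'] == 1953125.0
```

Both handlers could then write the returned dict to `/vnodes/<container>/quota` as they do now.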


@ -68,7 +68,7 @@ class ImageMgr():
return [False,"target image already exists"]
try:
sys_run("mkdir -p %s" % imgpath+image,True)
sys_run("rsync -a --delete --exclude=lost+found/ --exclude=nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (self.dealpath(fspath),imgpath+image),True)
sys_run("rsync -a --delete --exclude=lost+found/ --exclude=root/nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (self.dealpath(fspath),imgpath+image),True)
sys_run("rm -f %s" % (imgpath+"."+image+"_docklet_share"),True)
except Exception as e:
logger.error(e)
@ -87,9 +87,8 @@ class ImageMgr():
imgpath = self.imgpath + "private/" + user + "/"
else:
imgpath = self.imgpath + "public/" + imageowner + "/"
try:
sys_run("rsync -a --delete --exclude=lost+found/ --exclude=nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (imgpath+imagename,self.dealpath(fspath)),True)
sys_run("rsync -a --delete --exclude=lost+found/ --exclude=root/nfs/ --exclude=dev/ --exclude=mnt/ --exclude=tmp/ --exclude=media/ --exclude=proc/ --exclude=sys/ %s/ %s/" % (imgpath+imagename,self.dealpath(fspath)),True)
except Exception as e:
logger.error(e)
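The change above narrows one rsync exclude from `nfs/` to `root/nfs/` in two nearly identical command strings. A sketch of composing that rsync call from a single exclude list so the two call sites cannot drift apart; the helper and constant names are mine, not from the source:

```python
# Hedged sketch: build the rsync command used to copy an image tree
# from one shared exclude list (names are illustrative, not from the
# source).
EXCLUDES = ["lost+found/", "root/nfs/", "dev/", "mnt/", "tmp/",
            "media/", "proc/", "sys/"]

def rsync_image_cmd(src, dst, excludes=EXCLUDES):
    """Compose 'rsync -a --delete --exclude=... src/ dst/'."""
    opts = " ".join("--exclude=%s" % e for e in excludes)
    return "rsync -a --delete %s %s/ %s/" % (opts, src, dst)
```

Each `sys_run` call could then pass `rsync_image_cmd(self.dealpath(fspath), imgpath+image)` instead of a hand-maintained string.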


@ -7,17 +7,18 @@ from log import logger
class Container_Collector(threading.Thread):
def __init__(self,etcdaddr,cluster_name,host,cpu_quota,mem_quota,test=False):
def __init__(self,etcdaddr,cluster_name,host,test=False):
threading.Thread.__init__(self)
self.thread_stop = False
self.host = host
self.etcdser = etcdlib.Client(etcdaddr,"/%s/monitor" % (cluster_name))
self.etcdser.setkey('/vnodes/cpu_quota', cpu_quota)
self.etcdser.setkey('/vnodes/mem_quota', mem_quota)
self.cpu_quota = float(cpu_quota)/100000.0
self.mem_quota = float(mem_quota)*1000000/1024
#self.cpu_quota = float(cpu_quota)/100000.0
#self.mem_quota = float(mem_quota)*1000000/1024
self.interval = 2
self.test = test
self.cpu_last = {}
self.cpu_quota = {}
self.mem_quota = {}
return
def list_container(self):
@ -46,22 +47,32 @@ class Container_Collector(threading.Thread):
basic_info['PID'] = info['PID']
basic_info['IP'] = info['IP']
self.etcdser.setkey('/vnodes/%s/basic_info'%(container_name), basic_info)
cpu_parts = re.split(' +',info['CPU use'])
cpu_val = cpu_parts[0].strip()
cpu_unit = cpu_parts[1].strip()
res = self.etcdser.getkey('/vnodes/%s/cpu_use'%(container_name))
cpu_last = 0
if res[0] == True:
last_use = dict(eval(res[1]))
cpu_last = float(last_use['val'])
if not container_name in self.cpu_last.keys():
[ret, ans] = self.etcdser.getkey('/vnodes/%s/quota'%(container_name))
if ret == True :
res = dict(eval(ans))
self.cpu_quota[container_name] = res['cpu']
self.mem_quota[container_name] = res['memory']
self.cpu_last[container_name] = 0
else:
logger.warning(ans)
self.cpu_quota[container_name] = 1
self.mem_quota[container_name] = 2000*1000000/1024
self.cpu_last[container_name] = 0
cpu_use = {}
cpu_use['val'] = cpu_val
cpu_use['unit'] = cpu_unit
cpu_usedp = (float(cpu_val)-float(cpu_last))/(self.cpu_quota*self.interval*1.3)
if(cpu_usedp > 1):
cpu_usedp = (float(cpu_val)-float(self.cpu_last[container_name]))/(self.cpu_quota[container_name]*self.interval*1.3)
if(cpu_usedp > 1 or cpu_usedp < 0):
cpu_usedp = 1
cpu_use['usedp'] = cpu_usedp
self.cpu_last[container_name] = cpu_val
self.etcdser.setkey('/vnodes/%s/cpu_use'%(container_name), cpu_use)
mem_parts = re.split(' +',info['Memory use'])
mem_val = mem_parts[0].strip()
mem_unit = mem_parts[1].strip()
@ -70,7 +81,9 @@ class Container_Collector(threading.Thread):
mem_use['unit'] = mem_unit
if(mem_unit == "MiB"):
mem_val = float(mem_val) * 1024
mem_usedp = float(mem_val) / self.mem_quota
elif (mem_unit == "GiB"):
mem_val = float(mem_val) * 1024 * 1024
mem_usedp = float(mem_val) / self.mem_quota[container_name]
mem_use['usedp'] = mem_usedp
self.etcdser.setkey('/vnodes/%s/mem_use'%(container_name), mem_use)
#print(output)
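The usage math above can be summarized in two small functions. This is a sketch of my reading of the collector's arithmetic, not the source's own API: cumulative CPU time deltas are divided by the quota times the sampling interval (with the collector's 1.3 slack factor) and clamped to [0, 1], and memory readings are normalized to KiB before dividing by the quota:

```python
# Hedged sketch of the per-container usage math (assumed inputs:
# cpu_val/cpu_last are cumulative CPU seconds from lxc-info, sampled
# every `interval` seconds; 1.3 is the collector's slack factor).

def cpu_usedp(cpu_val, cpu_last, quota_cores, interval=2):
    """Fraction of the CPU quota used since the last sample, in [0, 1]."""
    p = (float(cpu_val) - float(cpu_last)) / (quota_cores * interval * 1.3)
    return min(max(p, 0.0), 1.0)  # clamp, as the collector does

def mem_kib(mem_val, mem_unit):
    """Normalize an lxc memory reading to KiB."""
    val = float(mem_val)
    if mem_unit == "MiB":
        val *= 1024
    elif mem_unit == "GiB":
        val *= 1024 * 1024
    return val
```

For example, a container that burned 2.6 CPU-seconds in a 2-second window against a 1-core quota exactly saturates the slack-adjusted budget, giving a usage fraction of 1.0.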
@ -220,7 +233,6 @@ class Container_Fetcher:
[ret, ans] = self.etcdser.getkey('/%s/cpu_use'%(container_name))
if ret == True :
res = dict(eval(ans))
res['quota'] = self.etcdser.getkey('/cpu_quota')[1]
return res
else:
logger.warning(ans)
@ -231,7 +243,6 @@ class Container_Fetcher:
[ret, ans] = self.etcdser.getkey('/%s/mem_use'%(container_name))
if ret == True :
res = dict(eval(ans))
res['quota'] = self.etcdser.getkey('/mem_quota')[1]
return res
else:
logger.warning(ans)


@ -192,7 +192,7 @@ if __name__ == '__main__':
logger.info ("using WORKER_PORT %s" % worker_port )
con_collector = monitor.Container_Collector(etcdaddr, clustername,
ipaddr, cpu_quota, mem_quota)
ipaddr)
con_collector.start()
logger.info("CPU and Memory usage monitor started")