Merge pull request #251 from FirmlyReality/publicIP

Public IP
Yujian Zhu 2017-05-29 23:35:39 +08:00 committed by GitHub
commit d71ea855cf
9 changed files with 148 additions and 68 deletions

View File

@ -1,4 +1,4 @@
# Docklet
http://docklet.unias.org
@ -11,24 +11,24 @@ consists of a number of virtual Linux container nodes distributed over
the physical cluster. Each vcluster is separated from others and can be
operated like a real physical cluster. Therefore, most applications,
especially those requiring a cluster environment, can run in vcluster
seamlessly.
Users manage and use their vcluster entirely through the web. The only client
tool needed is a modern web browser supporting HTML5, like Safari,
Firefox, or Chrome. The integrated *jupyter notebook* provides a web
**Workspace**. In the Workspace, users can code, debug, test,
and run their programs, and even visualize the outputs online.
Therefore, it is ideal for data analysis and processing.
Docklet creates virtual nodes from a base image. Admins can
pre-install development tools and frameworks according to their
interests. The users are also free to install their specific software
in their vcluster.
Docklet needs only **one** public IP address. The vclusters are
configured to use private IP address ranges, e.g., 172.16.0.0/16,
192.168.0.0/16, 10.0.0.0/8. A proxy is set up to help
users visit their vclusters behind the firewall/gateway.
The Docklet system runtime consists of four components:
@ -50,7 +50,7 @@ git clone https://github.com/unias/docklet.git
```
Run **prepare.sh** from the console to install the required packages and
generate the necessary configurations.
A *root* user will be created for managing the Docklet system. The
password is recorded in `FS_PREFIX/local/generated_password.txt` .
@ -58,13 +58,13 @@ password is recorded in `FS_PREFIX/local/generated_password.txt` .
## Config ##
The main configuration file of docklet is conf/docklet.conf. Most
default settings work for a single host environment.
First, copy docklet.conf.template to docklet.conf.
Pay attention to the following settings:
- NETWORK_DEVICE : the network interface to use.
- ETCD : the etcd server address. For a distributed multi-host
environment, it should be one of the etcd public server addresses.
For a single host environment, the default value should be OK.
@ -73,20 +73,25 @@ Pay attention to the following settings:
- FS_PREFIX: the working dir of docklet runtime. default is
/opt/docklet.
- CLUSTER_NET: the vcluster network ip address range, default is
172.16.0.1/16. This network range should all be allocated to and
managed by docklet.
- PROXY_PORT : the listening port of configurable-http-proxy. It proxies
connections from the external public network to the internal private
container networks.
- PORTAL_URL : the portal of the system. Users access the system
by visiting this address. If the system is behind a firewall, then
a reverse proxy should be set up.
- NGINX_PORT : the access port of the public portal. Users use this
port to visit the docklet system.
- DISTRIBUTED_GATEWAY : whether the users' gateways are distributed
or not. Master and workers must all be set to the same value (see the
sample configuration below).
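As an illustration, a minimal single-host configuration might look like the
following sketch (all values are examples, not defaults taken from this commit):

```
# conf/docklet.conf -- illustrative single-host values
NETWORK_DEVICE=eth0
ETCD=127.0.0.1:2379
FS_PREFIX=/opt/docklet
CLUSTER_NET="172.16.0.1/16"
PROXY_PORT=8000
PORTAL_URL="http://docklet.info"
NGINX_PORT=80
DISTRIBUTED_GATEWAY=False
```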
## Start ##
### distributed file system ###
For a multi-host distributed environment, a distributed file system is
needed to store global data. Currently, glusterfs has been tested.
Let's presume the file server exports the filesystem as NFS
**fileserver:/pub** :
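A hedged sketch of mounting that export on each host, assuming global data
lives under FS_PREFIX/global with FS_PREFIX=/opt/docklet:

```
# mount the shared NFS export for docklet's global data (paths are assumptions)
mount -t nfs fileserver:/pub /opt/docklet/global
```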
@ -99,7 +104,7 @@ For single host environment, nothing to do.
For a single host environment, start **tools/etcd-one-node.sh** . Some recent
Ubuntu releases include **etcd** in their repositories; just `apt-get
install etcd`, and there is no need to start etcd manually. For others, you
should install etcd manually.
For a multi-host distributed environment, you **must** start
@ -113,9 +118,10 @@ public IP address/url, e.g., docklet.info; the other having a private IP
address, e.g., 172.16.0.1. This server will be the master.
If it is the first time you start docklet, run `bin/docklet-master init`
to init and start the docklet master. Otherwise, run `bin/docklet-master start`,
which will start the master in recovery mode in the background using
conf/docklet.conf. (Note: if docklet will run in distributed gateway mode
and recovery mode, start the workers first.)
You can check the daemon status by running `bin/docklet-master status`
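In summary:

```
bin/docklet-master init     # first start: initialize and start the master
bin/docklet-master start    # later starts: recovery mode, in the background
bin/docklet-master status   # check the daemon status
```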
@ -126,11 +132,11 @@ The master logs are in **FS_PREFIX/local/log/docklet-master.log** and
Worker needs a basefs image to create containers.
You can create such an image with `lxc-create -n test -t download`,
then copy the rootfs to **FS_PREFIX/local**, and rename `rootfs`
to `basefs`.
Note that the `jupyterhub` package must be installed in this image, and the
start script `tools/start_jupyter.sh` should be placed at
`basefs/home/jupyter`.
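A hedged sketch of these steps, assuming the default LXC path /var/lib/lxc
and FS_PREFIX=/opt/docklet:

```
lxc-create -n test -t download      # choose a distro/release/arch at the prompts
cp -r /var/lib/lxc/test/rootfs /opt/docklet/local/basefs
# jupyterhub must be installed inside basefs; then place the start script:
cp tools/start_jupyter.sh /opt/docklet/local/basefs/home/jupyter/
```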
@ -146,7 +152,7 @@ Currently, the worker must be run after the master has been started.
## Usage ##
Open a browser and visit the address specified by PORTAL_URL,
e.g., ` http://docklet.info/ `
That is it.

View File

@ -141,9 +141,15 @@
# DATA_QUOTA_CMD="gluster volume quota docklet-volume limit-usage %s %s"
# DISTRIBUTED_GATEWAY : whether the users' gateways are distributed or not
# Must be set by same value on master and workers.
# True or False, default: False
# DISTRIBUTED_GATEWAY=False
# PUBLIC_IP : public IP of this machine. If DISTRIBUTED_GATEWAY is True,
# users' gateways can be set up on this machine and users can visit it
# via this public IP. default: the IP of NETWORK_DEVICE.
# PUBLIC_IP=0.0.0.0
# NGINX_CONF: the config path of nginx, default: /etc/nginx
# NGINX_CONF="/etc/nginx"
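# Example (illustrative values, not defaults): a worker that hosts users'
# gateways behind its own public address might set:
# DISTRIBUTED_GATEWAY=True
# PUBLIC_IP=203.0.113.10
# NGINX_CONF="/etc/nginx"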

View File

@ -233,7 +233,7 @@ IP=%s
        config = open(jconfigpath, 'r')
        context = config.read()
        config.close()
        # replace only the "/go" route prefix so other occurrences of the IP survive
        context = context.replace(old_ip+"/go", new_ip+"/go")
        config = open(jconfigpath, 'w')
        config.write(context)
        config.close()

View File

@ -1,4 +1,4 @@
import os, netifaces
def getenv(key):
    if key == "CLUSTER_NAME":
@ -56,6 +56,13 @@ def getenv(key):
        return os.environ.get("DATA_QUOTA_CMD", "gluster volume quota docklet-volume limit-usage %s %s")
    elif key == 'DISTRIBUTED_GATEWAY':
        return os.environ.get("DISTRIBUTED_GATEWAY", "False")
    elif key == "PUBLIC_IP":
        device = os.environ.get("NETWORK_DEVICE", "eth0")
        addr = netifaces.ifaddresses(device)
        # 2 == netifaces.AF_INET: fall back to the device's first IPv4 address
        if 2 in addr:
            return os.environ.get("PUBLIC_IP", addr[2][0]['addr'])
        else:
            return os.environ.get("PUBLIC_IP", "0.0.0.0")
    elif key == "NGINX_CONF":
        return os.environ.get("NGINX_CONF", "/etc/nginx")
    elif key == "USER_IP":

View File

@ -696,6 +696,10 @@ if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1] == "new":
        mode = 'new'
    # get the public IP and record it in etcd under machines/publicIP/<addr>
    public_IP = env.getenv("PUBLIC_IP")
    etcdclient.setkey("machines/publicIP/"+ipaddr, public_IP)
    # do some initialization for mode: new/recovery
    if mode == 'new':
        # clean and initialize the etcd table

View File

@ -8,6 +8,7 @@ from log import logger
import env
import proxytool
import requests
import traceback
userpoint = "http://" + env.getenv('USER_IP') + ":" + str(env.getenv('USER_PORT'))
def post_to_user(url = '/', data={}):
@ -124,6 +125,7 @@ class VclusterMgr(object):
        hostpath = self.fspath+"/global/users/"+username+"/hosts/"+str(clusterid)+".hosts"
        hosts = "127.0.0.1\tlocalhost\n"
        proxy_server_ip = ""
        proxy_public_ip = ""
        containers = []
        for i in range(0, clustersize):
            workerip = workers[random.randint(0, len(workers)-1)]
@ -135,10 +137,14 @@ class VclusterMgr(object):
            if i == 0:
                self.networkmgr.load_usrgw(username)
                proxy_server_ip = self.networkmgr.usrgws[username]
                [status, proxy_public_ip] = self.etcd.getkey("machines/publicIP/"+proxy_server_ip)
                if not status:
                    logger.error("Fail to get proxy_public_ip %s."%(proxy_server_ip))
                    return [False, "Fail to get proxy server public IP."]
            lxc_name = username + "-" + str(clusterid) + "-" + str(i)
            hostname = "host-"+str(i)
            logger.info ("create container with : name-%s, username-%s, clustername-%s, clusterid-%s, hostname-%s, ip-%s, gateway-%s, image-%s" % (lxc_name, username, clustername, str(clusterid), hostname, ips[i], gateway, image_json))
            [success,message] = oneworker.create_container(lxc_name, proxy_public_ip, username, uid, json.dumps(setting), clustername, str(clusterid), str(i), hostname, ips[i], gateway, image_json)
            if success is False:
                logger.info("container create failed, so vcluster create failed")
                return [False, message]
@ -149,8 +155,11 @@ class VclusterMgr(object):
        hostfile.write(hosts)
        hostfile.close()
        clusterfile = open(clusterpath, 'w')
        proxy_url = env.getenv("PORTAL_URL") +"/"+ proxy_public_ip +"/_web/" + username + "/" + clustername
        info = {'clusterid':clusterid, 'status':'stopped', 'size':clustersize, 'containers':containers, 'nextcid': clustersize, 'create_time':datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), 'start_time':"------"}
        info['proxy_url'] = proxy_url
        info['proxy_server_ip'] = proxy_server_ip
        info['proxy_public_ip'] = proxy_public_ip
        clusterfile.write(json.dumps(info))
        clusterfile.close()
        return [True, info]
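        # Illustrative (hypothetical values): the proxy_url written above follows
        #   PORTAL_URL + "/" + proxy_public_ip + "/_web/" + username + "/" + clustername
        # e.g. "http://docklet.info/203.0.113.10/_web/alice/mycluster"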
@ -180,8 +189,9 @@ class VclusterMgr(object):
        lxc_name = username + "-" + str(clusterid) + "-" + str(cid)
        hostname = "host-" + str(cid)
        proxy_server_ip = clusterinfo['proxy_server_ip']
        proxy_public_ip = clusterinfo['proxy_public_ip']
        uid = json.loads(user_info)["data"]["id"]
        [success, message] = oneworker.create_container(lxc_name, proxy_public_ip, username, uid, json.dumps(setting), clustername, clusterid, str(cid), hostname, ip, gateway, image_json)
        if success is False:
            logger.info("create container failed, so scale out failed")
            return [False, message]
@ -212,9 +222,9 @@ class VclusterMgr(object):
        clusterinfo['proxy_ip'] = ip + ":" + port
        if self.distributedgw == 'True':
            worker = self.nodemgr.ip_to_rpc(clusterinfo['proxy_server_ip'])
            worker.set_route("/"+ clusterinfo['proxy_public_ip'] + "/_web/" + username + "/" + clustername, target)
        else:
            proxytool.set_route("/" + clusterinfo['proxy_public_ip'] + "/_web/" + username + "/" + clustername, target)
        clusterfile = open(self.fspath + "/global/users/" + username + "/clusters/" + clustername, 'w')
        clusterfile.write(json.dumps(clusterinfo))
        clusterfile.close()
@ -227,9 +237,9 @@ class VclusterMgr(object):
        clusterinfo.pop('proxy_ip')
        if self.distributedgw == 'True':
            worker = self.nodemgr.ip_to_rpc(clusterinfo['proxy_server_ip'])
            worker.delete_route("/" + clusterinfo['proxy_public_ip'] + "/_web/" + username + "/" + clustername)
        else:
            proxytool.delete_route("/" + clusterinfo['proxy_public_ip'] + "/_web/" + username + "/" + clustername)
        clusterfile = open(self.fspath + "/global/users/" + username + "/clusters/" + clustername, 'w')
        clusterfile.write(json.dumps(clusterinfo))
        clusterfile.close()
@ -399,6 +409,29 @@ class VclusterMgr(object):
            disk += int(container['setting']['disk'])
        return [True, {'cpu':cpu, 'memory':memory, 'disk':disk}]
    def update_cluster_baseurl(self, clustername, username, oldip, newip):
        [status, info] = self.get_clusterinfo(clustername, username)
        if not status:
            return [False, "cluster not found"]
        logger.info("%s %s: base_url needs to be modified (%s -> %s)."%(username,clustername,oldip,newip))
        for container in info['containers']:
            worker = xmlrpc.client.ServerProxy("http://%s:%s" % (container['host'], env.getenv("WORKER_PORT")))
            if worker is None:
                return [False, "The worker can't be found or has been stopped."]
            worker.update_baseurl(container['containername'],oldip,newip)
            worker.stop_container(container['containername'])

    def check_public_ip(self, clustername, username):
        [status, info] = self.get_clusterinfo(clustername, username)
        [status, proxy_public_ip] = self.etcd.getkey("machines/publicIP/"+info['proxy_server_ip'])
        if not info['proxy_public_ip'] == proxy_public_ip:
            logger.info("%s %s proxy_public_ip has been changed, base_url needs to be modified."%(username,clustername))
            oldpublicIP = info['proxy_public_ip']
            self.update_proxy_ipAndurl(clustername,username,info['proxy_server_ip'])
            self.update_cluster_baseurl(clustername,username,oldpublicIP,proxy_public_ip)
            return False
        else:
            return True
    def start_cluster(self, clustername, username, uid):
        [status, info] = self.get_clusterinfo(clustername, username)
@ -406,34 +439,36 @@ class VclusterMgr(object):
            return [False, "cluster not found"]
        if info['status'] == 'running':
            return [False, "cluster is already running"]
        # set proxy
        if not "proxy_server_ip" in info.keys():
            info['proxy_server_ip'] = self.addr
            self.write_clusterinfo(info,clustername,username)
        try:
            target = 'http://'+info['containers'][0]['ip'].split('/')[0]+":10000"
            if self.distributedgw == 'True':
                worker = self.nodemgr.ip_to_rpc(info['proxy_server_ip'])
                # check public ip
                if not self.check_public_ip(clustername,username):
                    [status, info] = self.get_clusterinfo(clustername, username)
                worker.set_route("/" + info['proxy_public_ip'] + '/go/'+username+'/'+clustername, target)
            else:
                if not info['proxy_server_ip'] == self.addr:
                    logger.info("%s %s proxy_server_ip has been changed, base_url needs to be modified."%(username,clustername))
                    oldpublicIP = info['proxy_public_ip']
                    self.update_proxy_ipAndurl(clustername,username,self.addr)
                    [status, info] = self.get_clusterinfo(clustername, username)
                    self.update_cluster_baseurl(clustername,username,oldpublicIP,info['proxy_public_ip'])
                # check public ip
                if not self.check_public_ip(clustername,username):
                    [status, info] = self.get_clusterinfo(clustername, username)
                proxytool.set_route("/" + info['proxy_public_ip'] + '/go/'+username+'/'+clustername, target)
        except:
            logger.info(traceback.format_exc())
            return [False, "start cluster failed with setting proxy failed"]
        # check gateway for user
        # after a reboot, the user gateway goes down and loses its configuration,
        # so this check is necessary
        self.networkmgr.check_usergw(username, uid, self.nodemgr, self.distributedgw=='True')
        # start containers
        for container in info['containers']:
            # set up gre from user's gateway host to container's host.
            self.networkmgr.check_usergre(username, uid, container['host'], self.nodemgr, self.distributedgw=='True')
@ -465,35 +500,40 @@ class VclusterMgr(object):
        [status, info] = self.get_clusterinfo(clustername, username)
        if not status:
            return [False, "cluster not found"]
        if not "proxy_server_ip" in info.keys():
            info['proxy_server_ip'] = self.addr
            self.write_clusterinfo(info,clustername,username)
            [status, info] = self.get_clusterinfo(clustername, username)
        if not "proxy_public_ip" in info.keys():
            self.update_proxy_ipAndurl(clustername,username,info['proxy_server_ip'])
            [status, info] = self.get_clusterinfo(clustername, username)
            self.update_cluster_baseurl(clustername,username,info['proxy_server_ip'],info['proxy_public_ip'])
        if info['status'] == 'stopped':
            return [True, "cluster no need to start"]
        # recover proxy of cluster
        try:
            target = 'http://'+info['containers'][0]['ip'].split('/')[0]+":10000"
            if self.distributedgw == 'True':
                worker = self.nodemgr.ip_to_rpc(info['proxy_server_ip'])
                # check public ip
                if not self.check_public_ip(clustername,username):
                    [status, info] = self.get_clusterinfo(clustername, username)
                worker.set_route("/" + info['proxy_public_ip'] + '/go/'+username+'/'+clustername, target)
            else:
                if not info['proxy_server_ip'] == self.addr:
                    logger.info("%s %s proxy_server_ip has been changed, base_url need to be modified."%(username,clustername))
                    oldpublicIP = info['proxy_public_ip']
                    self.update_proxy_ipAndurl(clustername,username,self.addr)
                    [status, info] = self.get_clusterinfo(clustername, username)
                    self.update_cluster_baseurl(clustername,username,oldpublicIP,info['proxy_public_ip'])
                # check public ip
                if not self.check_public_ip(clustername,username):
                    [status, info] = self.get_clusterinfo(clustername, username)
                proxytool.set_route("/" + info['proxy_public_ip'] + '/go/'+username+'/'+clustername, target)
        except:
            return [False, "start cluster failed with setting proxy failed"]
        # need to check and recover gateway of this user
        self.networkmgr.check_usergw(username, uid, self.nodemgr, self.distributedgw=='True')
        # recover containers of this cluster
        for container in info['containers']:
            # set up gre from user's gateway host to container's host.
@ -516,9 +556,9 @@ class VclusterMgr(object):
            return [False, 'cluster is already stopped']
        if self.distributedgw == 'True':
            worker = self.nodemgr.ip_to_rpc(info['proxy_server_ip'])
            worker.delete_route("/" + info['proxy_public_ip'] + '/go/'+username+'/'+clustername)
        else:
            proxytool.delete_route("/" + info['proxy_public_ip'] + '/go/'+username+'/'+clustername)
        for container in info['containers']:
            worker = xmlrpc.client.ServerProxy("http://%s:%s" % (container['host'], env.getenv("WORKER_PORT")))
            if worker is None:
@ -577,6 +617,21 @@ class VclusterMgr(object):
            logger.error ("internal error: cluster:%s info file has no clusterid " % clustername)
            return -1
    def update_proxy_ipAndurl(self, clustername, username, proxy_server_ip):
        [status, info] = self.get_clusterinfo(clustername, username)
        if not status:
            return [False, "cluster not found"]
        info['proxy_server_ip'] = proxy_server_ip
        [status, proxy_public_ip] = self.etcd.getkey("machines/publicIP/"+proxy_server_ip)
        if not status:
            logger.error("Fail to get proxy_public_ip %s."%(proxy_server_ip))
            # fall back to the private gateway address if no public IP is recorded
            proxy_public_ip = proxy_server_ip
        info['proxy_public_ip'] = proxy_public_ip
        proxy_url = env.getenv("PORTAL_URL") +"/"+ proxy_public_ip +"/_web/" + username + "/" + clustername
        info['proxy_url'] = proxy_url
        self.write_clusterinfo(info,clustername,username)
        return proxy_public_ip

    def get_clusterinfo(self, clustername, username):
        clusterpath = self.fspath + "/global/users/" + username + "/clusters/" + clustername
        if not os.path.isfile(clusterpath):

View File

@ -180,6 +180,8 @@ class Worker(object):
logger.info("Monitor Collector has been started.")
# worker change it state itself. Independedntly from master.
self.etcd.setkey("machines/runnodes/"+self.addr, "work")
publicIP = env.getenv("PUBLIC_IP")
self.etcd.setkey("machines/publicIP/"+self.addr,publicIP)
self.thread_sendheartbeat = threading.Thread(target=self.sendheartbeat)
self.thread_sendheartbeat.start()
# start serving for rpc

View File

@ -55,7 +55,7 @@
<button type="button" class="btn btn-xs btn-default"> Delete </button>
</td>
<td>
<a href="/{{ cluster['proxy_server_ip'] }}/go/{{ mysession['username'] }}/{{ cluster['name'] }}" target="_blank"><button type="button" class="btn btn-xs btn-success">&nbsp;&nbsp;&nbsp;Go&nbsp;&nbsp;&nbsp;</button></a>
<a href="/{{ cluster['proxy_public_ip'] }}/go/{{ mysession['username'] }}/{{ cluster['name'] }}" target="_blank"><button type="button" class="btn btn-xs btn-success">&nbsp;&nbsp;&nbsp;Go&nbsp;&nbsp;&nbsp;</button></a>
</td>
{% else %}
<td><a href="/vclusters/"><div class="text-warning"><i class="fa fa-stop "></i> Stopped</div></a></td>

View File

@ -24,7 +24,7 @@ class dashboardView(normalView):
                message = message.get("message")
                single_cluster['status'] = message['status']
                single_cluster['id'] = message['clusterid']
                single_cluster['proxy_public_ip'] = message['proxy_public_ip']
                full_clusters.append(single_cluster)
            else:
                self.error()