Calyptos: Running midonet-api on non-CLC hosts should fail validation #70

dmccue · 2015-09-09T18:19:19Z

Successful install logs: https://eucalyptus.atlassian.net/secure/attachment/25710/calyptos-1441818819.tgz

[root@odc-f-09 ~]# euca-describe-instances i-b6698673
RESERVATION r-689b39b6  000251786737    default
INSTANCE    i-b6698673  emi-990a5431    10.116.156.1    172.31.1.94 pending admin   0       m1.medium   2015-09-09T17:51:29.152Z    az-01               monitoring-enabled  10.116.156.1    172.31.1.94 vpc-7e0cb490    subnet-7aea6bd4 instance-store                  hvm         sg-91fc5339             x86_64
NETWORKINTERFACE    eni-33419845    subnet-7aea6bd4 vpc-7e0cb490    000251786737    in-use  172.31.1.94 euca-172-31-1-94.eucalyptus.internal    true
ATTACHMENT      0   attached    2015-09-09T17:51:29.157Z    true
ASSOCIATION 10.116.156.1        172.31.1.94
GROUP   sg-91fc5339 default
PRIVATEIPADDRESS    172.31.1.94 euca-172-31-1-94.eucalyptus.internal    primary
TAG instance    i-b6698673  Name    test1
TAG instance    i-b6698673  euca:node   10.105.1.209

Useful reference: http://jeevanullas.in/blog/aws-vpc-eucalyptus-midonet-2/

This is more than likely VPC related. Debugging will be required to see whether the configuration is set; routes may be missing, as this is a non-BGP setup.

viglesiasce · 2015-09-09T18:42:16Z

@dmccue this is caused by the midonet-api not being colocated with the CLC/eucanetd. In 4.2 that is a requirement. We need to add a validator for this for sure.

dmccue · 2015-09-10T13:04:02Z

This is what happens when the midonet-api is set to the CLC IP:
https://eucalyptus.atlassian.net/secure/attachment/25720/calyptos-1441889948.tgz

midokura.midonet-api-url changed from http://10.105.10.70:8080/midonet-api to http://10.105.10.51:8080/midonet-api

[10.105.10.70] out:   * execute[Create TunnelZone] action run[2015-09-10T05:55:41-07:00] INFO: Processing execute[Create TunnelZone] action run (midokura::create-first-resources line 8)
[10.105.10.70] out: [2015-09-10T05:55:41-07:00] INFO: Retrying execution of execute[Create TunnelZone], 19 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:55:52-07:00] INFO: Retrying execution of execute[Create TunnelZone], 18 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:56:02-07:00] INFO: Retrying execution of execute[Create TunnelZone], 17 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:56:12-07:00] INFO: Retrying execution of execute[Create TunnelZone], 16 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:56:23-07:00] INFO: Retrying execution of execute[Create TunnelZone], 15 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:56:33-07:00] INFO: Retrying execution of execute[Create TunnelZone], 14 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:56:43-07:00] INFO: Retrying execution of execute[Create TunnelZone], 13 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:56:54-07:00] INFO: Retrying execution of execute[Create TunnelZone], 12 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:57:04-07:00] INFO: Retrying execution of execute[Create TunnelZone], 11 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:57:14-07:00] INFO: Retrying execution of execute[Create TunnelZone], 10 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:57:25-07:00] INFO: Retrying execution of execute[Create TunnelZone], 9 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:57:35-07:00] INFO: Retrying execution of execute[Create TunnelZone], 8 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:57:45-07:00] INFO: Retrying execution of execute[Create TunnelZone], 7 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:57:56-07:00] INFO: Retrying execution of execute[Create TunnelZone], 6 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:58:06-07:00] INFO: Retrying execution of execute[Create TunnelZone], 5 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:58:16-07:00] INFO: Retrying execution of execute[Create TunnelZone], 4 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:58:27-07:00] INFO: Retrying execution of execute[Create TunnelZone], 3 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:58:37-07:00] INFO: Retrying execution of execute[Create TunnelZone], 2 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:58:47-07:00] INFO: Retrying execution of execute[Create TunnelZone], 1 attempt(s) left
[10.105.10.70] out: [2015-09-10T05:58:58-07:00] INFO: Retrying execution of execute[Create TunnelZone], 0 attempt(s) left

dmccue · 2015-09-11T14:01:10Z

Have unset midokura.midonet-api-url to default to http://localhost:8080/midonet-api which seems to have worked...

However there's now a fatal issue with cassandra:

ERROR 06:57:52,308 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1296)
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:457)
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:671)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:623)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:515)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:424)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:554)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:643)
java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1296)
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:457)
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:671)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:623)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:515)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:424)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:554)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:643)
Exception encountered during startup: Unable to gossip with any seeds

Which leads to:
https://stackoverflow.com/questions/20690987/apache-cassandra-unable-to-gossip-with-any-seeds

dmccue · 2015-09-11T14:05:19Z

[root@odc-f-28 ~]# grep 'listen_address\|broadcast_address' /etc/cassandra/conf/cassandra.yaml
listen_address: odc-f-28.prc.eucalyptus-systems.com
# Leaving this blank will set it to the same value as listen_address
# broadcast_address: 1.2.3.4
#    Uses public IPs as broadcast_address to allow cross-region

This doesn't work because odc-f-28.prc.eucalyptus-systems.com resolves to the public interface rather than the private one. What is the best way to address this: change the /etc/hosts file, or modify listen_address in cassandra.yaml to use the private interface IP?

The recipe would need to change from node['fqdn'] to something that can be overridden with the private IP address:
https://github.com/eucalyptus/midokura-cookbook/blob/master/recipes/cassandra.rb#L13

execute "CASSANDRA: set listening address" do
 command "sed -i -e 's/localhost/#{node['fqdn']}/g' /etc/cassandra/conf/cassandra.yaml"
end
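
A minimal sketch of the override idea in Python, not the cookbook's actual fix. It assumes a hypothetical `set_listen_address` helper that takes an optional override (for example the private IP) and falls back to the FQDN otherwise. Unlike the sed command above, it rewrites only the `listen_address` line instead of every occurrence of `localhost`.

```python
import re


def set_listen_address(yaml_text, fqdn, override=None):
    """Return yaml_text with the listen_address line set.

    `override` is a hypothetical attribute (e.g. the private IP);
    when it is not given, the node's FQDN is used, as the recipe does today.
    """
    address = override or fqdn
    # Replace only the uncommented listen_address line; commented lines
    # such as "# broadcast_address: ..." are left untouched.
    return re.sub(
        r"(?m)^listen_address:.*$",
        lambda _match: "listen_address: " + address,
        yaml_text,
    )
```

Calling it with `override="10.105.10.70"` would pin Cassandra to the private interface regardless of how the FQDN resolves.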

viglesiasce · 2015-09-11T18:27:09Z

@dmccue glad to hear the localhost change fixed up the mido side.

I'm going to move the cassandra problem to a separate issue so we don't cross wires on this one.

viglesiasce · 2015-09-11T18:31:18Z

I opened #72 to continue the cassandra work

viglesiasce · 2015-09-11T18:45:39Z

Looking more at that cloud, it looks like instances are now going to running but not able to get their addresses via DHCP. Need to investigate that further.

viglesiasce · 2015-09-11T21:00:52Z

Eucanetd is not running on the CLC, which is another requirement and needs a validator. Once that was fixed, we still had issues because eucanetd could not work out which mido hosts were running the instances. This is caused by the lack of a reverse mapping from the nodes' hostnames to their registered IP addresses (both in mido and euca). To work around this I added the following to the /etc/hosts file on the CLC/eucanetd host, and instances then began to get their IPs properly:

10.105.10.51 odc-f-09.prc.eucalyptus-systems.com
10.105.10.73 odc-f-31.prc.eucalyptus-systems.com
10.105.10.78 odc-f-36.prc.eucalyptus-systems.com
10.105.1.209 odc-d-30.prc.eucalyptus-systems.com

Instances are now booting and getting their IP addresses/metadata as expected.
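
The workaround above could be automated along these lines. This is only a sketch: `missing_hosts_entries` is a hypothetical helper, and it assumes the host mapping is a dict of hostname to IP, as the mappings in environment.yml appear to be.

```python
def missing_hosts_entries(host_mapping, hosts_text):
    """Return /etc/hosts-style lines for mappings not already present.

    host_mapping: dict of hostname -> IP (assumed shape).
    hosts_text:   current contents of the hosts file as a string.
    """
    present = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into "ip name [alias...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            ip = fields[0]
            for name in fields[1:]:
                present.add((ip, name))
    return [
        "%s %s" % (ip, name)
        for name, ip in sorted(host_mapping.items())
        if (ip, name) not in present
    ]
```

The returned lines can be appended to the hosts file on the CLC/eucanetd host; entries that already exist are skipped, so the helper is safe to run repeatedly.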

The diff for the env file is as follows:

[root@odc-f-09 calyptos-deploy]# diff environment.yml environment-vic.yml
46,47c46
<     # Mappings for only NCs and CCs
<       odc-f-28.prc.eucalyptus-systems.com: 10.105.10.70
---
>     # Mappings for only NCs and CLC
54a54,55
>   - &EUCANETD_HOST
>     odc-f-09.prc.eucalyptus-systems.com
83c84
<           EucanetdHost: *MIDO_GATEWAY_HOST
---
>           EucanetdHost: *EUCANETD_HOST
[root@odc-f-09 calyptos-deploy]#

dmccue · 2015-09-14T14:45:27Z

Made those changes: https://eucalyptus.atlassian.net/secure/attachment/25801/calyptos-1442240272.tgz
Not able to connect to the midonet-api; will investigate.

dmccue · 2015-09-14T16:35:21Z

https://eucalyptus.atlassian.net/secure/attachment/25802/calyptos-1442248064.tgz

(on clc)
[root@odc-f-09 ~]# netstat -antp | grep 8080
tcp 0 0 :::8080 :::* LISTEN 26516/java
[root@odc-f-09 ~]# midonet-cli --midonet-url=http://localhost:8080/midonet-api -A -e add tunnel-zone name mido-tz type gre
The API server failed to respond normally. The network DB is possibly down. Bye.
[root@odc-f-09 ~]# tail -1 /var/log/eucalyptus/eucanetd.log
2015-09-14 09:24:53 FATAL 000022010 mido_check_state | midonet-api is not reachable after 120 retries: eucanetd shutting down

The midonet-api is clearly installed on the CLC (10.105.10.51), but the REST API returns 404 for all calls. This is likely either a tomcat configuration issue or a backend issue where tomcat can't communicate with zookeeper.

viglesiasce · 2015-09-14T19:06:40Z

@dmccue looks like the midonet-api is pointing at 10.105.10.70 but zookeeper is running on 10.104.10.5. Can you rerun with 10.105.10.70 as your zookeeper host? The cookbook currently installs zookeeper only on the midonet-api host.

viglesiasce · 2015-09-14T19:11:23Z

Sorry @dmccue, I meant rerunning with 10.105.10.51

dmccue · 2015-09-14T20:45:01Z

@viglesiasce That has now built with exit code 0, but ingress connectivity issues remain over both private and public IPs.

Validators required:

midokura.zookeepers contains a minimum of one array item pointing to eucalyptus.topology.clc-1
midokura.midonet-api-url contains ip address of eucalyptus.topology.clc-1
eucalyptus.network.config-json.Mido.EucanetdHost contains hostname of eucalyptus.topology.clc-1
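
The three checks above could be sketched roughly as follows. This is not Calyptos's validator API: `validate_mido_colocation` is a hypothetical function, the environment is assumed to be a nested dict keyed as in the dotted names above, zookeeper entries are assumed to be `host` or `host:port` strings, and clc-1 is assumed to hold an IP address. An unset or localhost API URL is accepted, since that is what worked earlier in this thread.

```python
import socket
from urllib.parse import urlparse


def validate_mido_colocation(env, resolve=socket.gethostbyname):
    """Return a list of error strings; an empty list means all checks passed."""
    errors = []
    euca = env.get("eucalyptus", {})
    mido = env.get("midokura", {})
    clc_ip = euca.get("topology", {}).get("clc-1")
    if not clc_ip:
        return ["eucalyptus.topology.clc-1 is not set"]

    # Check 1: at least one zookeeper entry points at the CLC.
    zookeepers = mido.get("zookeepers") or []
    if not any(str(zk).split(":")[0] == clc_ip for zk in zookeepers):
        errors.append("midokura.zookeepers has no entry for clc-1 (%s)" % clc_ip)

    # Check 2: the API URL uses the CLC address (unset means localhost).
    api_url = mido.get("midonet-api-url")
    if api_url is not None:
        api_host = urlparse(api_url).hostname
        if api_host not in (clc_ip, "localhost"):
            errors.append("midokura.midonet-api-url points at %s, not clc-1" % api_host)

    # Check 3: EucanetdHost is a hostname that resolves to the CLC.
    eucanetd_host = (euca.get("network", {}).get("config-json", {})
                     .get("Mido", {}).get("EucanetdHost"))
    if not eucanetd_host:
        errors.append("Mido.EucanetdHost is not set")
    else:
        try:
            resolved = resolve(eucanetd_host)
        except OSError:  # socket.gaierror is a subclass of OSError
            resolved = None
        if resolved != clc_ip:
            errors.append("Mido.EucanetdHost %s does not resolve to clc-1" % eucanetd_host)
    return errors
```

The `resolve` parameter exists so that the hostname check can be driven from the environment's own host mappings instead of live DNS, which matters here because the FQDNs in this cloud resolved to the public interface.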

viglesiasce · 2015-09-14T20:48:25Z

Thanks @dmccue! You saved me the work of going back through this journey to figure out the right validators 👍

dmccue · 2015-09-21T17:31:11Z

Have switched over to using midonet on clc and specifying localhost as midonet api endpoint

viglesiasce changed the title ~~Calyptos: After successful install, instances hang at pending - VPCMIDO~~ Calyptos: Running midonet-api on non-CLC hosts should fail validation Sep 9, 2015

viglesiasce added the validation label Sep 9, 2015

dmccue mentioned this issue Sep 14, 2015

Calyptos: Populate CLC /etc/hosts with midokura.midolman-host-mapping attributes #73

Open

dmccue closed this as completed Sep 21, 2015

dmccue reopened this Sep 21, 2015
