
Monday, April 17, 2017

Networking Penalty Box

In networking, we can't always have both price and performance.

     In many cases it is the equivalent of 'having your cake and eating it too.'

This is governed by the delivery mechanism of the network: software or hardware.

In software, the cost of networking is relatively low and favors extremely rapid change.
   
     It is important to remember that software networking is constrained by the software architecture, as well as by the queues and threads that must be processed in concert with the operating system, hypervisor, application, etc.

     All of these are contending for time with the CPU and executing a software instruction takes a certain relative amount of time based on the CPU architecture.

In hardware, the cost of networking is high and favors rapid packet exchange over the ability to modify the networking function.

     I'm being very generous in this statement; the sole purpose of hardware is to move the packet from one spot to another as rapidly as possible.

     Because the majority of the work is done in silicon, the only means to modify the network is to subroutine into software (which undermines the purpose and value of the hardware) OR to replace the silicon (which can take months to years and costs a lot).

Figure 1.  The price vs performance curve
Utilizing generically programmable x86 delivery mechanisms, it is possible to do many of the things required of the network at a tolerable, but not fast or optimized, level.

     Host bridges and OVS, for example, are eminently capable of meeting the relative bandwidth and latency requirements of an application within the confines of a hypervisor.  They can be remarkably efficient, at least with respect to the application requirements.  The moment the traffic exits a hypervisor or OS, it becomes considerably more complex, particularly under high virtualization ratios.

Figure 2.  The Network Penalty Box
Network chipset vendors and network infrastructure vendors have maintained the continuing escalation in performance by designing capability into silicon.

All the while, arguably, continuing to put a downward pressure on the cost per bit transferred.

Virtualization vendors, on the other hand, have rapidly introduced network functions to support their use cases.

At issue is the performance penalty for networking in x86 and where that performance penalty affects the network execution.

In general, there is a performance penalty for executing layer 3 routing using generic x86 instructions vs silicon in the neighborhood of 20-25x.

For L2 and L3 (plus encapsulation) networking in x86 instructions vs silicon, the penalty is higher, in the neighborhood of 60-100x.

This adds latency to a system we'd prefer not to have, especially with workload bandwidth shifting heavily in the East-West direction.

Worse, it consumes a portion of the CPU and memory of the host that could be used to support more workloads.  The consumption is so unwieldy, bursty and application dependent that it becomes difficult to calculate the impact except in extremely narrow timeslices.
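     To put rough numbers on those ratios (illustrative arithmetic only, assuming a silicon forwarding path on the order of 1 microsecond per packet as a baseline):

     1 us in silicon x 20-25 (L3 routing in x86)                ≈ 20-25 us per packet
     1 us in silicon x 60-100 (L2/L3 plus encapsulation in x86) ≈ 60-100 us per packet

     Multiply that by the packet rate of a busy East-West flow and the added latency and consumed CPU time become visible very quickly.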

Enter virtio/SR-IOV/DPDK

The theory is, take network instructions that can be optimized and send them to the 'thing' that optimizes them.

Examples include libvirt/virtio, which evolve the para-virtualization of the network interface through driver optimizations that can occur at the rate of change of software.

SR-IOV increases performance by taking a more direct route from the OS or hypervisor to the bus that supports the network interface via an abstraction layer.  This provides a means for the direct offload of instructions to the network interface for more optimized execution.

DPDK creates a direct-to-hardware abstraction layer that may be called from the OS or hypervisor, similarly offloading instructions for optimized execution in hardware.
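The host-side mechanics look something like the following (a minimal sketch, assuming a Linux host, an SR-IOV capable NIC exposed as eth0 and the DPDK devbind helper script; the VF count and PCI address are illustrative):

# carve 4 SR-IOV virtual functions out of the physical NIC
echo 4 | sudo tee /sys/class/net/eth0/device/sriov_numvfs

# confirm the VFs are present and ready to hand to a VM or container
ip link show eth0

# bind one function to a userspace-capable driver for DPDK packet processing
# (the PCI address below is a placeholder)
sudo dpdk-devbind.py --bind=vfio-pci 0000:03:10.0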

What makes these particularly useful, from a networking perspective, is that elements or functions normally executed in the OS, hypervisor, switch, router, firewall, encryptor, encoder, decoder, etc., may now be moved into a physical interface for silicon based execution.

The cost of moving functions to the physical interface can be relatively small compared to putting these functions into a switch or router.  The volumes and rate of change of CPUs, chipsets and network interface cards have historically been higher, making introduction faster.

     Further, vendors of these cards and chipsets have practical reasons to support hardware offloads that favor their product over other vendors (or at the very least to remain competitive).

This means that network functions are moving closer to the hypervisor.

As the traditional network device vendors of switches, routers, load balancers, VPNs, etc., move to create Virtual Network Functions (VNFs) of their traditional business (in the form of virtual machines and containers) the abstractions to faster hardware execution will become ever more important.

This all, to avoid the Networking Penalty Box.

Monday, April 3, 2017

The what all, of Enterprise Cloud adoption, and all

Was asked my thoughts on Enterprise Public Cloud adoption rate.



Sensors leading to Public Cloud adoption:

From an enterprise perspective, the volume of servers sold in the previous year is soft. http://www.gartner.com/newsroom/id/3530117

The partnership that can underpin legacy enterprise app deployments in hybrid cloud has been announced.  https://blogs.vmware.com/vsphere/2016/10/vmware-aws-announce-strategic-partnership.html

The drivers for cost containment are going to become increasingly important as customers look for lower cost of service.  This will probably start looking like “Use (AWS or Azure) for disaster recovery” and is likely to evolve into “Application Transformation” discussions as optimizations within the cloud.  The best way to see this is with the directionality of utility mapping like the value chains between enterprise and public cloud: http://www.abusedbits.com/2016/11/why-public-cloud-wins-workload-love.html

Microsoft has announced an Azure Stack appliance to extend the reach of capabilities into the private cloud.  http://www.zdnet.com/article/microsoft-to-release-azure-stack-as-an-appliance-in-mid-2017/

The cost / unit consumption of private Data Center estate is anywhere between ~40 and 100% more than public cloud.  This is further being eroded by co-location vendors continuing to drive down the cost/unit for data center vs new build.  It is becoming very costly to build or retrofit older data centers rather than simply consuming a service at cost that can be easily liquidated per month.  The large scale DC vendors are also creating ecosystem connections for networking directly with the public cloud vendors and where those don’t exist, companies like AT&T, etc are enabling this type of service connection via their MPLS capabilities.

Then there’s the eventual large scale adoption of containers, which present some additional optimizations, particularly as they relate to DevOps: they further increase density over hypervisor based virtualization and dramatically increase the speed of change.  Further extending this capability, the network vendors, historically the last to move in 3rd platform, are starting to adopt these concepts.
http://www.abusedbits.com/2017/03/creation-and-destruction-without-remorse.html
http://www.investors.com/news/technology/can-cisco-take-aristas-best-customers-with-software-bait/
http://www.crn.com/slide-shows/channel-programs/300084323/arista-networks-exec-on-new-container-software-offensive-and-its-biggest-fundamental-advantage-over-cisco.htm?_lrsc=9ce3424f-25d3-4521-95e4-eeae8e96b525

This culminates in public cloud providers positioning themselves for legacy applications, cost containment, cost based on their operating models, positions in DR if they can’t get production workloads, integration into private cloud where they can’t get principal workloads and certainly new workloads in cost/volume based on scale.

This leads me to believe that the position on Public Cloud, from an enterprise perspective, is just starting…..

Friday, March 31, 2017

Enterprise WAN is evolving!

Figure 1.  Enterprise WAN Reference Architecture
Figure 1 represents the high level Enterprise WAN Reference Architecture that current network capabilities seem to be indicating for the support of enterprise services. 

The MPLS network will be extended and enhanced utilizing gateway functions like VPN (which we currently do), CSP access that enables direct connectivity via the MPLS network and SD-WAN that will allow the extension of the MPLS via the Internet to small and medium size locations (maybe even large locations).

SD-WAN will extend the capability of the MPLS network to locations not natively available with individual carriers.  It avoids the need for carrier NNIs unless absolutely necessary.  The carriage mechanism is tunneling over the Internet and can support vendor/protocol specific optimizations for some quality of service (an abstraction of the underlying IP connectivity).
     Where SD-WAN cannot be on an MPLS gateway, the internet direct to DC will be able to support this functionality.

This model also represents the dissection and reduction of networks that must be "carried twice", ingressing and egressing the Data Center perimeter security controls. These controls will eventually be migrated to the Carrier Cloud WAN Services.  They will be provisioned for specificity in the enterprise application usage model or virtualized per application within the workload execution model.
     Traffic destined for CSPs and SaaS can use a more direct path via the Internet if allowed by the Enterprise.

The CSPs, connected to the Internet, a CSP gateway to MPLS, and Ecosystem networks connected directly to Data Centers, will extend the Enterprise Network to support enhanced consumption of services like SaaS and IoT as well as the various Cloud Service Providers.

Individuals will come in over a variety of connectivity mechanisms including broadband and telco wireless.

Provided the cost structure is competitive, backup paths for many of these networks are likely to shift toward future implementations of Telco 5G.

Wednesday, March 8, 2017

Creation and destruction without remorse

Recent announcements by network vendors (Arista, Juniper, Cisco) specifically taking aim at containers as part of their strategy indicate that the evolution of the platform continues.

When I say The Platform, I mean specifically the Value Chain of workload execution that exists as separate entities within the Enterprise and the Public Cloud.

Adoption of methods that provide 'like' services in both the Enterprise and Public Cloud poses an interesting point of view on the direction the network industry will take.  In this view of the world, there is a history of expectation that supporting capabilities within the Enterprise will resemble the capabilities in the Public Cloud and vice-versa.

The two service areas are advancing, often with differing goals in mind, in the use of their virtualization services.  The capabilities are being enhanced on different trajectories: AWS, for example, is moving its Platform-as-a-Service toward serverless functions, while Enterprise services are evolving as Virtual Infrastructure Managers that are starting to tie into container capabilities.

The "least common denominator" capabilities that allow meaningful ubiquity in an all encompassing service model of today may very well disappear over time.  This to be replaced by the service chaining of decoupled application functions.

Based on some of these recent announcements, the network vendors mean to enter this brave new world in their areas of strength.

     Arista is pursuing the support of complex workflow execution, where its network code can be spun up or down as workloads change, with the same code on hardware, in virtualization and within containers.

     Juniper's Contrail similarly supports virtual services like their vRouter that interacts directly with the virtual infrastructure management for automation (service chaining).

     Cisco's Project Contiv may be the most ambitious, applying policies (and networking) that characterize the 'intent' of an application's deployment.

Some of the key things about DevOps that will play a role in how this all works out, and that the network vendors should take to heart:


  •      Application development is not an isolated activity.  When one finds a useful capability, they will share it.
  •      Because containers can be easily shared, applications are unlikely to be created from scratch.  Making sure developers can share useful capability is vital.
  •      Network methods used to support applications must be easily created and destroyed.


Or, as posed to me by Rick Wilhelm, "Containers allow creation and destruction of application environments without drama or remorse."  -- so must it be for the network.

remorseless creation and destruction



   

Friday, February 24, 2017

CSPs, SaaS and Network Ecosystems

Networking between Virtual Private Cloud (VPC) deployments is possible, but if you're looking to avoid some of the pitfalls of either multi-region or even multi-vendor deployments:

  • it may be necessary to build a substantial part of the network yourself
  • you may have to trombone the traffic through your CSP to Data Center connection

      Neither of these are great options for an infrastructure that is nearly all automated and programmatic.

So, considering the alternatives, there may be some interesting possibilities as Network Ecosystem vendors enhance their services with additional automation and integration.

Consider AT&T's NetBond for instance.  In a situation where you are already using NetBond to create interconnection points for your enterprise integration and consumption of CSP services, imagine the possibility of using the NetBond headends to instrument a connection between extra-regional VPCs in a CSP, like Amazon Web Services.

The major advantage: NetBond is a programmable interface to the Direct Route AND it can pass traffic on the AT&T AVPN without having to traverse the Enterprise WAN.

Here's a high level of what that would look like:

Figure 1.  AWS VPC to VPC

At first glance, this looks remarkably similar to VPC routing, but notice that this configuration is completely EXTRA-REGIONAL, it could be used to connect a VPC in US West to a VPC in Singapore.

This could provide some really interesting availability and DR models for application designers.

A second possibility is to enhance a Hybrid Cloud service with execution in more than one vendor CSP.

Consider the following figure:

Figure 2.  Amazon VPC connecting to Azure Cloud
In this model, creating a truly vendor independent cloud deployment becomes possible.  Not only will this instrument application delivery across multiple CSPs, but it makes some of the container application deployment possibilities a lot less "sticky."  And yes, it's entirely programmatic.

There's always a question around moving data to the right place.  Considering that quite a number of enterprises use a variety of SaaS services today, it may be nice to move specific volumes of data from one place to another to act on them with some Big Data analytics (and maybe even some #AI in the future).

Consider the next figure:

Figure 3.  CSP to SaaS
As an example: with this method it would be possible in the future to send SFDC data (or even a stream) to an interactive visualization of the data in Microsoft Azure via Power BI.  Again, all done programmatically AND secure.

Ultimately, once network connecting points are made available, interesting things can start to happen with Network Ecosystems.

Update:

.@abusedbits Love it-- A realtime market opportunity feed for the @CSC + @MicrosoftR IML: http://bit.ly/2ibqZpk  #CSCTechTalk

https://twitter.com/JerryAOverton/status/835493717389754368

A compelling use of real time data feed, programmatically applied to a network integration and delivery of interactive visualization with MicrosoftR.


Wednesday, January 25, 2017

Network Abstraction Virtualization SDN VNF

Recent question asked:  What is this network virtualization stuff I keep hearing about?


Figure 1.  Network packets and trains


Network virtualization can apply to multiple areas of networking.  At a high level....

Network Virtualization technically started with the VLAN, which stands for virtual LAN, where the broadcast domain was abstracted away from ALL of the physical endpoints in the network.  This made it possible to group computers on a network with some level of logic; it's done in software rather than by changing wires and can be considered an abstraction of the wiring.
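As a quick illustration of that abstraction (a sketch using standard Linux iproute2 commands; the interface name and VLAN ID are arbitrary):

# tag traffic on eth0 with VLAN 100 and give the virtual LAN its own addressing
sudo ip link add link eth0 name eth0.100 type vlan id 100
sudo ip addr add 192.168.100.1/24 dev eth0.100
sudo ip link set eth0.100 up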

There are a couple of different types of Software Defined Networking (SDN), the leading one right now is an "overlay"  in a tunnel over an "underlay" or "provider" network.  It exists as an abstraction of one network on top of another, where the underlay is responsible for fast packet performance (traditional networking) and the overlay is responsible for specific awareness or intelligence of the communicating endpoints.
     The simple example:  If you consider a train the "underlay" network (it moves packets efficiently) then a person riding on the train with their own bag is the "overlay."  The train doesn't have to know where the person is going, just that a portion of their travel is between these two endpoints.  This abstracts the path of the data packets from the logic of how they are connected by placing the traffic in a network tunnel.  Common tunnel types are VxLAN, GRE and NVGRE.  This type is associated with technology like VMware NSX and Microsoft Hyper-V networking.
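     A minimal sketch of the overlay/underlay split using standard Linux tooling (VxLAN in this case; the VNI, addresses and interface names are illustrative):

     # the overlay: a VxLAN tunnel endpoint riding on the eth0 underlay
     sudo ip link add vxlan42 type vxlan id 42 dev eth0 remote 192.0.2.10 dstport 4789
     sudo ip addr add 10.42.0.1/24 dev vxlan42
     sudo ip link set vxlan42 up
     # the underlay only sees UDP port 4789 packets between the two tunnel endpoints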

   There is another SDN type that acts on the flow of packets between their source and destination.  This also abstracts the path of the data packets from the logic of how they are connected, but in contrast to the concept above, this type of SDN acts primarily on the forwarding plane of network hardware.  This type is associated with technology like OpenFlow.
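   A small sketch of acting on the forwarding plane with Open vSwitch and OpenFlow (the bridge name and port numbers are illustrative):

   # create a bridge and install a flow that forwards anything arriving on port 1 out port 2
   sudo ovs-vsctl add-br br0
   sudo ovs-ofctl add-flow br0 "in_port=1,actions=output:2"
   # inspect the installed flows
   sudo ovs-ofctl dump-flows br0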

And there is also another type of network virtualization happening right now, where the "function" or software coding of a network device is built within a software package, like a virtual host or container, that can be run on a standard server.  This is called a Virtual Network Function (VNF) and is closely associated with the advocacy of moving from  hardware to software delivery of services, often called Network Function Virtualization (NFV).
     The simple example:  A router has been historically a device with interfaces that moves packets from one physical or logical interface to another according to a configured pattern.  A VNF router is software (not a device) that runs on a server that moves packets from one software or logical interface to another.  This abstracts away the hardware in favor of software delivery of the capability.  There's a bit of this in the enterprise and a lot starting in the Telecommunications Carrier space.
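     A hedged sketch of the VNF idea, routing delivered as software on a standard server (the frrouting/frr container image is an assumption here; any routing daemon packaged as a VM or container makes the same point):

     # run a routing function as a container instead of as a physical router
     docker run -d --name vnf-router --privileged --net=host frrouting/frr
     # the 'router' is now software that can be created, scaled and destroyed like any other workload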

Again, this is at a high level and hope that it helps.  There are other network abstractions currently in use, but these are the primary ones getting all of the media attention today.

Wednesday, October 26, 2016

NFV - Service Consumption

As the telecommunications industry starts delivering on Network Function Virtualization (NFV) via delivery of Virtual Network Functions (VNF) there should be a consideration for the service consumption mechanism that is driving this industry.

Consider that startup players are approaching this market segment by enabling developers to directly integrate with their systems.  As the network ecosystems have evolved to include demand based services, these new players are providing the means to directly consume services that have historically been managed services.

There is a direct parallel of this model with the likes of Amazon Web Services and Microsoft Azure.  They have built a platform and enhanced it over time to directly address the service consumption model.  As a demand based service, compute and storage have largely been commoditized, or in the vernacular of the Value Chain, they are utility services.  You pay for what you use.

Telecommunications carriers need to be aware of the conditions this placed on the entirety of the IT market.  It shifted major capabilities to Hybrid Cloud and may further shift the entirety of workload execution to this demand based service area before the next major scale out.

During this evolution, traditional managed services may not survive in their current state.  Further, the directionality of OSS and BSS has almost always been northbound.  As the digital shift continues, these functions need to be both northbound and southbound.

Finally, this cannot be emphasized enough: this is technology segregated by logic.  Policy Enforcement that is well understood and tied together, from MANO Service Chaining to the VIM and finally to the consumer, needs to be a foundational part of the service delivery plan, enabled and enacted upon by API and made available to the masses that will be in a position to consume it.

The evolution of this space is ripe for a lambda function like execution in its maturity.

Friday, September 23, 2016

Network Function Virtualization - Value Chain 2016



Value Chain for NFV in 2016
I'm using this to describe the relative position of functions in the value chain for Network Function Virtualization (NFV).

Where the customer is consuming Virtual Network Functions (VNFs) there are a large number of supporting functions underlying the delivery of the VNFs.

This is intended to show the relationship between those elements that would perform part of a MANO Functional Block in the VIM (Virtualized Infrastructure Manager). #mapping

Monday, July 25, 2016

Docker Network Demo - Part 5



A couple of useful links:

https://github.com/wsargent/docker-cheat-sheet

https://blog.docker.com/2013/10/docker-0-6-5-links-container-naming-advanced-port-redirects-host-integration/

Also figured out where the interesting docker names come from:

https://github.com/docker/docker/blob/master/pkg/namesgenerator/names-generator.go

BTW, there is a lot of REM in the file with some Easter Egg kind of info in it.

https://docs.docker.com/engine/reference/commandline/attach/

You can create your own names using --name foo as in "docker run --name test -it alpine /bin/sh".

Resuming from Part 4….

First thing, I just simply didn't have it in me to continue to use a complete /16. So:


docker network create -d bridge --subnet 172.16.2.0/24 docker2

nelson@lab1:~$ docker network ls
NETWORK ID          NAME                DRIVER
5ef6f5f7f40f        bridge              bridge
11f4ac20d39d        docker1             bridge
5d150019b8a9        docker2             bridge
d1a03332c0c1        host                host
91b70cf2593b        none                null

I feel so much better…..

Also, I updated the Ubuntu system and rebooted it, so I'm going to need to recreate the containers I'm playing with.

Now that I know how to name the docker containers, I can re-create the lab setup rapidly with the following commands:

docker run --name=test1 --net=docker1 -it alpine /bin/sh

docker run --name=test2 --net=docker1 -it alpine /bin/sh

docker run --name=test3 --net=docker2 -it alpine /bin/sh


nelson@lab1:~$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
9f9a5604108b        alpine              "/bin/sh"           2 minutes ago       Up 2 minutes                            test3
61acf893dac5        alpine              "/bin/sh"           2 minutes ago       Up 2 minutes                            test2
b501988db295        alpine              "/bin/sh"           3 minutes ago       Up 2 minutes                            test1

Docker revised containers and networks

Let's look at the connectivity again. The vSwitch isn't allowing the traffic to pass from one bridge to the other.

From test1 to test3


/ # ping 172.16.2.2
PING 172.16.2.2 (172.16.2.2): 56 data bytes
^C
--- 172.16.2.2 ping statistics ---
8 packets transmitted, 0 packets received, 100% packet loss

From test3 to test1

/ # ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2): 56 data bytes
^C
--- 172.16.1.2 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

What does it take to get the containers to be able to talk to each other?

https://docs.docker.com/v1.8/articles/networking/ -> Search "Communication between containers"

There's a nice section on the rules here, but basically it can be turned off if --iptables=false is invoked when the Docker daemon starts.

Be aware: This is not considered a secure way of allowing containers to communicate. Look up --icc=true and https://docs.docker.com/v1.8/userguide/dockerlinks/
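A more surgical alternative (a sketch, rather than disabling iptables entirely) is to insert ACCEPT rules ahead of the DOCKER-ISOLATION drops for just the two bridges.  The br-* interface names follow Docker's convention of using the first 12 characters of the network ID shown by docker network ls:

# allow docker1 (br-11f4ac20d39d) to reach docker2 (br-5d150019b8a9) and back
sudo iptables -I DOCKER-ISOLATION -i br-11f4ac20d39d -o br-5d150019b8a9 -j ACCEPT
sudo iptables -I DOCKER-ISOLATION -i br-5d150019b8a9 -o br-11f4ac20d39d -j ACCEPT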

Before:


nelson@lab1:/etc/default$ sudo iptables -L -n
[sudo] password for nelson:
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
DOCKER-ISOLATION  all  --  0.0.0.0/0            0.0.0.0/0
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain DOCKER (3 references)
target     prot opt source               destination

Chain DOCKER-ISOLATION (1 references)
target     prot opt source               destination
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
RETURN     all  --  0.0.0.0/0            0.0.0.0/0


Insert the following options in /etc/default/docker using your favorite editor

#nelson - remove iptables remove masquerade

DOCKER_OPTS="--iptables=false --ip-masq=false"


Rebooting - in too much of a hurry to figure out iptables right now

     update:  sudo iptables -F -t nat  -- flushes the nat table
                     sudo iptables -F -t filter  -- flushes the filter table

Then re-start and re-attach the containers in each putty window


/ # nelson@lab1:~$ docker start test1
test3
nelson@lab1:~$ docker attach test1
/ #
/ # ifconfig -a
eth0      Link encap:Ethernet  HWaddr 02:42:AC:10:01:02
          inet addr:172.16.1.2  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::42:acff:fe10:102%32734/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:24 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:5361 (5.2 KiB)  TX bytes:648 (648.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1%32734/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

After Docker default change.

nelson@lab1:~$ sudo iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Ping test1 to test3

/ # ping 172.16.2.2
PING 172.16.2.2 (172.16.2.2): 56 data bytes
64 bytes from 172.16.2.2: seq=0 ttl=63 time=0.163 ms
64 bytes from 172.16.2.2: seq=1 ttl=63 time=0.138 ms
64 bytes from 172.16.2.2: seq=2 ttl=63 time=0.133 ms
^C
--- 172.16.2.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.133/0.144/0.163 ms

Ping test3 to test1

/ # ping 172.16.1.2
PING 172.16.1.2 (172.16.1.2): 56 data bytes
64 bytes from 172.16.1.2: seq=0 ttl=63 time=0.280 ms
64 bytes from 172.16.1.2: seq=1 ttl=63 time=0.126 ms
64 bytes from 172.16.1.2: seq=2 ttl=63 time=0.136 ms
64 bytes from 172.16.1.2: seq=3 ttl=63 time=0.129 ms
64 bytes from 172.16.1.2: seq=4 ttl=63 time=0.139 ms
^C
--- 172.16.1.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.126/0.162/0.280 ms

What you should probably be thinking now, OMG what have I done!

     Update:  from here, all isolation rules must be made specifically in iptables
                      make sure the FORWARD-DROP rules provide all of the required isolation
                           think direction AND address range

                      this method may be very useful if the network area is behind a sufficient perimeter

                      host routes for specific networks could be applied for connectivity

                      a routing function on the host would be used for communicating with the
                      outside world.  Look at:
                      http://www.admin-magazine.com/Articles/Routing-with-Quagga 
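     For example, a minimal sketch of explicit isolation for this lab's two subnets (direction matters, so both rules are needed):

     # drop traffic between the two container networks in both directions
     sudo iptables -A FORWARD -s 172.16.1.0/24 -d 172.16.2.0/24 -j DROP
     sudo iptables -A FORWARD -s 172.16.2.0/24 -d 172.16.1.0/24 -j DROP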


#-REM out the statement in the default docker file and rebooted

Once again all is right with the world.


nelson@lab1:~$ sudo iptables -L -n
[sudo] password for nelson:
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
DOCKER-ISOLATION  all  --  0.0.0.0/0            0.0.0.0/0
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain DOCKER (3 references)
target     prot opt source               destination

Chain DOCKER-ISOLATION (1 references)
target     prot opt source               destination
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
RETURN     all  --  0.0.0.0/0            0.0.0.0/0
nelson@lab1:~$

Friday, July 22, 2016

Docker Network Demo - Part 4

So, there's always the oops moment when you know that you did something wrong, often before you did it.

I closed one of the putty windows.  Wasn't sure how to get back to my new container. 

Update:  https://github.com/docker/docker/issues/2838 Control-P and Control-Q on the console allow you to move into and out of the pseudo-shell

As it turns out, the container is automatically given a name (assuming a name could also be applied to it explicitly).

docker ps - to see the running containers

nelson@lab1:~$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
4a567ec8d878        alpine              "/bin/sh"           4 hours ago         Up 4 hours                              serene_jennings
60f137369165        alpine              "/bin/sh"           18 hours ago        Up 18 hours                             nauseous_meninsky

I'm after nauseous_meninsky (have to look up where they get these names later).


nelson@lab1:~$ docker attach nauseous_meninsky
/ #

Whew!  Disaster averted.  Back in my container!
……

Getting back to the networking, the default docker network is an RFC1918 class B.  It seemed like a waste of address space to me, so let's create another network in docker.

docker network create -d bridge --subnet 172.16.1.0/24 docker1

-d is the driver, we want a bridge 
--subnet defines the network range, looks like the default gateway is always the first in the range

docker1 is the defined name, like docker0 in the ifconfig -a from the host
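For what it's worth, the gateway can also be pinned explicitly rather than relying on the first-address default (same command, using the documented --gateway flag):

docker network create -d bridge --subnet 172.16.1.0/24 --gateway 172.16.1.1 docker1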

nelson@lab1:~$ docker network create -d bridge --subnet 172.16.1.0/24 docker1
11f4ac20d39dd523c48fe3ac6462dd8bcb4a7247dba5162bec37d46208315bc2

docker network ls - to see if it added to the networks

nelson@lab1:~$ docker network ls
NETWORK ID          NAME                DRIVER
1c9307d1163e        bridge              bridge
11f4ac20d39d        docker1             bridge
72a37254aedb        host                host
ae03349bbf0e        none                null

Let's create a container and associate it to the new network.

docker run --net=docker1 -it alpine /bin/sh


nelson@lab1:~$ docker run --net=docker1 -it alpine /bin/sh
/ # ifconfig -a
eth0      Link encap:Ethernet  HWaddr 02:42:AC:10:01:02
          inet addr:172.16.1.2  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::42:acff:fe10:102%32720/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:54 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11822 (11.5 KiB)  TX bytes:648 (648.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1%32720/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

/ #

nelson@lab1:~$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
2b9cacaef4f6        alpine              "/bin/sh"           36 seconds ago      Up 35 seconds                           sick_mclean
4a567ec8d878        alpine              "/bin/sh"           4 hours ago         Up 4 hours                              serene_jennings
60f137369165        alpine              "/bin/sh"           19 hours ago        Up 19 hours                             nauseous_meninsky

Now, let's see what it can talk to from the new shell.

Internet - Success

/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=57 time=24.809 ms
64 bytes from 8.8.8.8: seq=1 ttl=57 time=25.089 ms
64 bytes from 8.8.8.8: seq=2 ttl=57 time=29.708 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 24.809/26.535/29.708 ms

Gateway - Success

/ # ping 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes
64 bytes from 172.16.1.1: seq=0 ttl=64 time=0.130 ms
64 bytes from 172.16.1.1: seq=1 ttl=64 time=0.117 ms
64 bytes from 172.16.1.1: seq=2 ttl=64 time=0.111 ms
^C
--- 172.16.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.111/0.119/0.130 ms

Container 1 - Failure

/ # ping 172.17.0.2
PING 172.17.0.2 (172.17.0.2): 56 data bytes
^C
--- 172.17.0.2 ping statistics ---
9 packets transmitted, 0 packets received, 100% packet loss

Container 2 - Failure

/ # ping 172.17.0.3
PING 172.17.0.3 (172.17.0.3): 56 data bytes
^C
--- 172.17.0.3 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

So, there's no path between 172.16.1.0/24 and 172.17.0.0/16

The routes from the host

nelson@lab1:~$ ip route
default via 192.168.123.254 dev wlan0  proto static
172.16.1.0/24 dev br-11f4ac20d39d  proto kernel  scope link  src 172.16.1.1
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
192.168.123.0/24 dev wlan0  proto kernel  scope link  src 192.168.123.24  metric 9
Modified for 2 bridges attached to docker
So, maybe it looks a little more like this.

Docker Network Demo - Part 3

Let's have a look at what is happening between the host and the container.

docker network ls - from the physical host shows the networks attached to docker

There is a bridge (softswitch), a host network on the bridge and a (none) null network (don't know what this is yet)

nelson@lab1:~$ docker network ls
NETWORK ID          NAME                DRIVER
1c9307d1163e        bridge              bridge
72a37254aedb        host                host
ae03349bbf0e        none                null

ifconfig -a to show the host connected network interfaces

docker0 is the bridge for the containers, eth0,eth1 currently unused, lo the host loopback and
wlan0, the currently connected host network (also where host default route resides)

There are also two networks with 'veth' prefixes.  These are the virtual interfaces to docker0 for each container.

nelson@lab1:~$ ifconfig -a
docker0   Link encap:Ethernet  HWaddr 02:42:5e:2d:df:17
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:5eff:fe2d:df17/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:235 errors:0 dropped:0 overruns:0 frame:0
          TX packets:251 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:16644 (16.6 KB)  TX bytes:27519 (27.5 KB)

eth0      Link encap:Ethernet  HWaddr fc:aa:14:98:ca:29
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth1      Link encap:Ethernet  HWaddr fc:aa:14:98:ca:2b
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:20 Memory:f7e00000-f7e20000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:1747 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1747 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:180141 (180.1 KB)  TX bytes:180141 (180.1 KB)

vethc07b410 Link encap:Ethernet  HWaddr b6:c1:69:71:74:31
          inet6 addr: fe80::b4c1:69ff:fe71:7431/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:94 errors:0 dropped:0 overruns:0 frame:0
          TX packets:172 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:7805 (7.8 KB)  TX bytes:19445 (19.4 KB)

vethd678055 Link encap:Ethernet  HWaddr 9a:e2:9a:71:7f:3a
          inet6 addr: fe80::98e2:9aff:fe71:7f3a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4176 (4.1 KB)  TX bytes:10628 (10.6 KB)

wlan0     Link encap:Ethernet  HWaddr d8:fc:93:47:01:fd
          inet addr:192.168.1.24  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::dafc:93ff:fe47:1fd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:683977 errors:0 dropped:7 overruns:0 frame:0
          TX packets:2165426 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:110733511 (110.7 MB)  TX bytes:2883791106 (2.8 GB)
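
To confirm which of those veth interfaces are attached to the docker0 bridge (a quick check; brctl comes from the bridge-utils package, and the iproute2 form works without it):

# list the bridge and its member interfaces
brctl show docker0
# or, without bridge-utils
ip link show master docker0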

Just for my edification, wanted to see if the host can reach the container

First Container

nelson@lab1:~$  ping 172.17.0.2
PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.106 ms
64 bytes from 172.17.0.2: icmp_seq=2 ttl=64 time=0.066 ms
64 bytes from 172.17.0.2: icmp_seq=3 ttl=64 time=0.073 ms
64 bytes from 172.17.0.2: icmp_seq=4 ttl=64 time=0.079 ms
^C
--- 172.17.0.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.066/0.081/0.106/0.015 ms

Second Container

nelson@lab1:~$ ping 172.17.0.3
PING 172.17.0.3 (172.17.0.3) 56(84) bytes of data.
64 bytes from 172.17.0.3: icmp_seq=1 ttl=64 time=0.048 ms
64 bytes from 172.17.0.3: icmp_seq=2 ttl=64 time=0.047 ms
^C
--- 172.17.0.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.047/0.047/0.048/0.006 ms

docker network inspect bridge - shows what the bridge (by name from docker network ls) is and how it is configured, as a JSON object  http://www.json.org/

Notice the containers identified in the container section

nelson@lab1:~$ docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "1c9307d1163e9d46a0a34a6430e4031ba7c41e1c33cd55304965e389905667bf",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {
            "4a567ec8d878c73614a72db1d465e811cbb345384a2a02507596f3d161f8e77b": {
                "Name": "serene_jennings",
                "EndpointID": "58d1e794d6abe6ac142008080c78f2a072f76ad3514485238b2ee36aff69442d",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": ""
            },
            "60f1373691651b1b9694cc20e8ee4940611e7744a7526c7d513581f3a0c71e30": {
                "Name": "nauseous_meninsky",
                "EndpointID": "8c8ff1ccb10110f4befec2c83fb9af32247af5f8584be21ca7dc681c2a4b679e",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

Feel free to repeat this command for host and none.
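As an aside, on reasonably recent Docker versions the JSON can be filtered with the --format flag (Go templates) when only one piece is needed, for example just the Containers section:

docker network inspect --format '{{json .Containers}}' bridge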

Wondering where the traffic is going…

ip route - from the host for specific traffic directions

nelson@lab1:~$ ip route
default via 192.168.1.254 dev wlan0  proto static
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
192.168.1.0/24 dev wlan0  proto kernel  scope link  src 192.168.1.24  metric 9

Also from one of the containers

/ # ip route
default via 172.17.0.1 dev eth0
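
Worth noting as a final sketch: outbound traffic from the containers reaches the outside world because, by default, Docker adds a MASQUERADE (NAT) rule for the bridge subnet.  It can be seen from the host with:

sudo iptables -t nat -L POSTROUTING -n
# look for a MASQUERADE entry covering 172.17.0.0/16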