In addition to a whole host of cool stuff announced last month as part of Azure's product updates, VNet gateways now come in multiple sizes. I've posted previously on the speed limitations of conventional VNet to VNet IPsec connections (i.e. less than 100 Mbit/s), and these new gateway sizes look to address this.

As far as I can tell these cannot yet be created through the web portal (like a lot of things), but you can use the following syntax to provision a high performance VNet gateway on which to terminate your on-premises VPN, ExpressRoute, or VNet to VNet connections.

New-AzureVNetGateway -VNetName "ExistingVnetName" -GatewaySKU HighPerformance -GatewayType DynamicRouting
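
Once that completes, you can check the gateway's state with the companion cmdlet from the same classic Azure PowerShell module. A minimal sketch, reusing the placeholder VNet name from above:

# Show the provisioning state (and details such as the VIP) of this VNet's gateway
Get-AzureVNetGateway -VNetName "ExistingVnetName"

As far as I'm aware, VNet to VNet connections require a dynamic routing gateway, hence -GatewayType DynamicRouting above.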

It's exciting to see that multiple network interfaces are now supported on Azure VMs. This is bound to open up lots of opportunities for the networking vendors that we know and love but that aren't represented in Azure (it's still pretty much only Barracuda Networks there at the moment). You're not yet able to manipulate routing tables, so this is still limited: you couldn't create a Linux VM running iptables with an interface in each subnet (i.e. a DMZ firewall) and route other VMs through it. It's bound to be supported soon enough, which will open up these do-it-yourself approaches.
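
For reference, multi-NIC VMs are currently a PowerShell-only affair as well. The sketch below is a rough illustration using the classic service management cmdlets; the service, VM, image, credentials and subnet names are all hypothetical placeholders, and as I understand it the VM must be one of the larger instance sizes and deployed into a VNet.

# Hypothetical placeholders - substitute your own image name and credentials
$ubuntuImage = "<image name from Get-AzureVMImage>"
$password    = "<admin password>"

# Build a Linux VM config with the primary NIC in one subnet and a second NIC in another
$vm = New-AzureVMConfig -Name "fw01" -InstanceSize "Large" -ImageName $ubuntuImage |
    Add-AzureProvisioningConfig -Linux -LinuxUser "azureuser" -Password $password |
    Set-AzureSubnet -SubnetNames "FrontendSubnet" |
    Add-AzureNetworkInterfaceConfig -Name "eth1" -SubnetName "BackendSubnet"

# Deploy it, creating a new cloud service in the VNet (names and region are placeholders)
New-AzureVM -ServiceName "fw-service" -Location "North Europe" -VNetName "ExistingVnetName" -VMs $vm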

I've not yet had a chance to run through the same tests I did in this post, but I will do so shortly. Hopefully we see throughput of 1 Gbps and upwards.

I've finally got around to provisioning two Ubuntu machines in two separate VNets to test Azure VNet to VNet throughput. They are connected with the native Azure VPN. The machines are size "Standard_A1", which is 1 core and 1.75 GB of memory, although I don't think that should have too much of a bearing on the throughput. They are running Ubuntu Linux with iperf on both ends. The following represents transfers between the two geographic regions (Dublin and Amsterdam).
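
For reference, the runs below used iperf in its default 10-second TCP mode; roughly speaking the commands looked like the following, where 10.3.125.6 is the private IP of the receiving VM and -P sets the number of parallel client threads:

# On the receiving VM
iperf -s

# On the sending VM: single thread, then 10 and 20 parallel threads
iperf -c 10.3.125.6
iperf -c 10.3.125.6 -P 10
iperf -c 10.3.125.6 -P 20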

First up, latency:

[Figure: VNet to VNet latency between two VNets – Dublin and Amsterdam]

Next up, throughput on a single thread:

Test 1 :

------------------------------------------------------------
Client connecting to 10.3.125.6, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[ 3] local 10.3.160.4 port 50855 connected with 10.3.125.6 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 12.6 MBytes 10.5 Mbits/sec

Test 2:

------------------------------------------------------------
Client connecting to 10.3.125.6, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[ 3] local 10.3.160.4 port 50852 connected with 10.3.125.6 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 24.4 MBytes 20.4 Mbits/sec

Test 3:

------------------------------------------------------------
Client connecting to 10.3.125.6, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[ 3] local 10.3.160.4 port 50860 connected with 10.3.125.6 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.1 sec 14.6 MBytes 12.2 Mbits/sec

Test 4:

------------------------------------------------------------
Client connecting to 10.3.125.6, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[ 3] local 10.3.160.4 port 50861 connected with 10.3.125.6 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.2 sec 34.8 MBytes 28.7 Mbits/sec

Next up, throughput on 10 threads:

Test 1:

[ ID] Interval Transfer Bandwidth
[ 12] 0.0-10.3 sec 5.25 MBytes 4.29 Mbits/sec
[ 9] 0.0-10.3 sec 7.25 MBytes 5.92 Mbits/sec
[ 3] 0.0-10.3 sec 7.62 MBytes 6.21 Mbits/sec
[ 5] 0.0-10.4 sec 4.50 MBytes 3.64 Mbits/sec
[ 4] 0.0-10.4 sec 6.00 MBytes 4.84 Mbits/sec
[ 8] 0.0-10.4 sec 5.00 MBytes 4.03 Mbits/sec
[ 10] 0.0-10.4 sec 5.50 MBytes 4.44 Mbits/sec
[ 6] 0.0-10.5 sec 4.75 MBytes 3.80 Mbits/sec
[ 7] 0.0-10.5 sec 4.38 MBytes 3.50 Mbits/sec
[ 11] 0.0-10.5 sec 5.12 MBytes 4.10 Mbits/sec
[SUM] 0.0-10.5 sec 55.4 MBytes 44.3 Mbits/sec

Test 2:

[ ID] Interval Transfer Bandwidth
[ 7] 0.0-10.2 sec 6.62 MBytes 5.43 Mbits/sec
[ 12] 0.0-10.3 sec 5.62 MBytes 4.58 Mbits/sec
[ 10] 0.0-10.3 sec 6.88 MBytes 5.61 Mbits/sec
[ 3] 0.0-10.3 sec 7.38 MBytes 5.99 Mbits/sec
[ 11] 0.0-10.4 sec 6.88 MBytes 5.54 Mbits/sec
[ 5] 0.0-10.5 sec 7.00 MBytes 5.62 Mbits/sec
[ 4] 0.0-10.5 sec 7.50 MBytes 6.01 Mbits/sec
[ 6] 0.0-10.5 sec 5.75 MBytes 4.59 Mbits/sec
[ 9] 0.0-10.6 sec 6.62 MBytes 5.25 Mbits/sec
[ 8] 0.0-11.0 sec 6.50 MBytes 4.94 Mbits/sec
[SUM] 0.0-11.0 sec 66.8 MBytes 50.7 Mbits/sec

Test 3:

[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.2 sec 8.25 MBytes 6.77 Mbits/sec
[ 11] 0.0-10.2 sec 7.25 MBytes 5.94 Mbits/sec
[ 10] 0.0-10.3 sec 6.50 MBytes 5.30 Mbits/sec
[ 7] 0.0-10.4 sec 7.62 MBytes 6.17 Mbits/sec
[ 12] 0.0-10.4 sec 7.88 MBytes 6.37 Mbits/sec
[ 4] 0.0-10.4 sec 7.12 MBytes 5.74 Mbits/sec
[ 6] 0.0-10.5 sec 7.88 MBytes 6.28 Mbits/sec
[ 3] 0.0-10.6 sec 6.38 MBytes 5.04 Mbits/sec
[ 9] 0.0-10.7 sec 7.75 MBytes 6.09 Mbits/sec
[ 8] 0.0-10.8 sec 5.75 MBytes 4.46 Mbits/sec
[SUM] 0.0-10.8 sec 72.4 MBytes 56.1 Mbits/sec

Test 4:

[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.1 sec 10.8 MBytes 8.90 Mbits/sec
[ 11] 0.0-10.1 sec 5.75 MBytes 4.76 Mbits/sec
[ 6] 0.0-10.2 sec 11.4 MBytes 9.40 Mbits/sec
[ 8] 0.0-10.2 sec 8.50 MBytes 7.02 Mbits/sec
[ 4] 0.0-10.2 sec 7.25 MBytes 5.97 Mbits/sec
[ 9] 0.0-10.2 sec 7.00 MBytes 5.75 Mbits/sec
[ 7] 0.0-10.2 sec 6.75 MBytes 5.54 Mbits/sec
[ 12] 0.0-10.2 sec 7.38 MBytes 6.04 Mbits/sec
[ 3] 0.0-10.3 sec 6.00 MBytes 4.90 Mbits/sec
[ 10] 0.0-10.3 sec 7.12 MBytes 5.81 Mbits/sec
[SUM] 0.0-10.3 sec 77.9 MBytes 63.6 Mbits/sec

Next up, throughput on 20 threads, where it starts to look like diminishing returns:

Test 1:

[ 12] 0.0-10.3 sec 3.25 MBytes 2.65 Mbits/sec
[ 14] 0.0-10.4 sec 4.12 MBytes 3.31 Mbits/sec
[ 10] 0.0-10.5 sec 3.00 MBytes 2.39 Mbits/sec
[ 16] 0.0-10.5 sec 3.00 MBytes 2.39 Mbits/sec
[ 9] 0.0-10.5 sec 3.75 MBytes 2.98 Mbits/sec
[ 13] 0.0-10.6 sec 3.25 MBytes 2.58 Mbits/sec
[ 15] 0.0-10.5 sec 3.00 MBytes 2.39 Mbits/sec
[ 19] 0.0-10.5 sec 3.75 MBytes 2.99 Mbits/sec
[ 18] 0.0-10.6 sec 4.62 MBytes 3.67 Mbits/sec
[ 17] 0.0-10.6 sec 3.62 MBytes 2.87 Mbits/sec
[ 3] 0.0-10.6 sec 3.25 MBytes 2.57 Mbits/sec
[ 21] 0.0-10.6 sec 3.25 MBytes 2.56 Mbits/sec
[ 11] 0.0-10.7 sec 3.25 MBytes 2.54 Mbits/sec
[ 7] 0.0-10.8 sec 2.88 MBytes 2.24 Mbits/sec
[ 6] 0.0-10.8 sec 3.88 MBytes 3.02 Mbits/sec
[ 8] 0.0-10.8 sec 3.62 MBytes 2.81 Mbits/sec
[ 5] 0.0-10.9 sec 4.50 MBytes 3.48 Mbits/sec
[ 22] 0.0-10.9 sec 2.88 MBytes 2.22 Mbits/sec
[ 20] 0.0-10.9 sec 2.75 MBytes 2.11 Mbits/sec
[ 4] 0.0-11.1 sec 3.00 MBytes 2.27 Mbits/sec
[SUM] 0.0-11.1 sec 68.6 MBytes 51.8 Mbits/sec

Test 2:

[ 10] 0.0-10.5 sec 3.38 MBytes 2.69 Mbits/sec
[ 20] 0.0-10.5 sec 3.75 MBytes 2.98 Mbits/sec
[ 12] 0.0-10.6 sec 3.00 MBytes 2.38 Mbits/sec
[ 16] 0.0-10.6 sec 3.50 MBytes 2.76 Mbits/sec
[ 5] 0.0-10.7 sec 3.62 MBytes 2.85 Mbits/sec
[ 11] 0.0-10.7 sec 3.25 MBytes 2.55 Mbits/sec
[ 4] 0.0-10.7 sec 3.75 MBytes 2.93 Mbits/sec
[ 9] 0.0-10.7 sec 3.12 MBytes 2.44 Mbits/sec
[ 3] 0.0-10.8 sec 3.00 MBytes 2.33 Mbits/sec
[ 8] 0.0-10.8 sec 3.50 MBytes 2.72 Mbits/sec
[ 22] 0.0-10.9 sec 3.88 MBytes 2.99 Mbits/sec
[ 14] 0.0-10.9 sec 3.12 MBytes 2.40 Mbits/sec
[ 19] 0.0-11.0 sec 3.62 MBytes 2.76 Mbits/sec
[ 15] 0.0-11.1 sec 2.62 MBytes 1.98 Mbits/sec
[ 6] 0.0-11.2 sec 3.25 MBytes 2.44 Mbits/sec
[ 21] 0.0-11.3 sec 3.50 MBytes 2.60 Mbits/sec
[ 18] 0.0-11.3 sec 3.25 MBytes 2.41 Mbits/sec
[ 17] 0.0-11.4 sec 4.25 MBytes 3.14 Mbits/sec
[ 13] 0.0-11.4 sec 3.25 MBytes 2.39 Mbits/sec
[ 7] 0.0-11.5 sec 3.62 MBytes 2.64 Mbits/sec
[SUM] 0.0-11.5 sec 68.2 MBytes 49.7 Mbits/sec

Test 3:

[ 9] 0.0-10.4 sec 4.75 MBytes 3.85 Mbits/sec
[ 12] 0.0-10.4 sec 5.00 MBytes 4.03 Mbits/sec
[ 13] 0.0-10.4 sec 3.62 MBytes 2.93 Mbits/sec
[ 10] 0.0-10.4 sec 4.75 MBytes 3.83 Mbits/sec
[ 14] 0.0-10.4 sec 5.38 MBytes 4.32 Mbits/sec
[ 18] 0.0-10.4 sec 3.25 MBytes 2.61 Mbits/sec
[ 19] 0.0-10.5 sec 5.38 MBytes 4.31 Mbits/sec
[ 20] 0.0-10.5 sec 4.38 MBytes 3.50 Mbits/sec
[ 8] 0.0-10.5 sec 4.12 MBytes 3.28 Mbits/sec
[ 16] 0.0-10.6 sec 3.50 MBytes 2.78 Mbits/sec
[ 11] 0.0-10.6 sec 4.50 MBytes 3.57 Mbits/sec
[ 6] 0.0-10.7 sec 5.62 MBytes 4.43 Mbits/sec
[ 3] 0.0-10.7 sec 3.50 MBytes 2.75 Mbits/sec
[ 17] 0.0-10.7 sec 5.88 MBytes 4.62 Mbits/sec
[ 21] 0.0-10.8 sec 4.00 MBytes 3.11 Mbits/sec
[ 5] 0.0-10.8 sec 3.88 MBytes 3.01 Mbits/sec
[ 15] 0.0-10.8 sec 5.25 MBytes 4.07 Mbits/sec
[ 7] 0.0-10.8 sec 3.00 MBytes 2.32 Mbits/sec
[ 4] 0.0-10.9 sec 3.50 MBytes 2.70 Mbits/sec
[ 22] 0.0-11.2 sec 6.00 MBytes 4.49 Mbits/sec
[SUM] 0.0-11.2 sec 89.2 MBytes 66.8 Mbits/sec

Test 4:

[ ID] Interval Transfer Bandwidth
[ 8] 0.0-10.2 sec 3.62 MBytes 2.99 Mbits/sec
[ 18] 0.0-10.3 sec 3.75 MBytes 3.07 Mbits/sec
[ 20] 0.0-10.3 sec 4.50 MBytes 3.68 Mbits/sec
[ 6] 0.0-10.3 sec 4.75 MBytes 3.87 Mbits/sec
[ 5] 0.0-10.3 sec 4.75 MBytes 3.87 Mbits/sec
[ 11] 0.0-10.3 sec 4.50 MBytes 3.65 Mbits/sec
[ 4] 0.0-10.4 sec 3.12 MBytes 2.53 Mbits/sec
[ 22] 0.0-10.4 sec 5.00 MBytes 4.03 Mbits/sec
[ 13] 0.0-10.4 sec 5.62 MBytes 4.53 Mbits/sec
[ 19] 0.0-10.4 sec 3.50 MBytes 2.82 Mbits/sec
[ 14] 0.0-10.4 sec 3.75 MBytes 3.02 Mbits/sec
[ 21] 0.0-10.4 sec 3.25 MBytes 2.61 Mbits/sec
[ 16] 0.0-10.5 sec 3.75 MBytes 3.01 Mbits/sec
[ 15] 0.0-10.5 sec 4.25 MBytes 3.41 Mbits/sec
[ 9] 0.0-10.5 sec 3.62 MBytes 2.90 Mbits/sec
[ 17] 0.0-10.5 sec 4.12 MBytes 3.30 Mbits/sec
[ 12] 0.0-10.5 sec 3.00 MBytes 2.39 Mbits/sec
[ 10] 0.0-10.6 sec 4.25 MBytes 3.36 Mbits/sec
[ 3] 0.0-10.8 sec 4.12 MBytes 3.20 Mbits/sec
[ 7] 0.0-10.9 sec 4.25 MBytes 3.27 Mbits/sec
[SUM] 0.0-10.9 sec 81.5 MBytes 62.8 Mbits/sec

To conclude, the highest throughput I saw on a single thread was around 28 Mbit/s, which is pretty slow when you might be used to 1 Gbit/s or even 10 Gbit/s links within your on-premises data center. Replication of data between geographic regions could be very troublesome on systems where a lot of data is being written. The best throughput I saw was around 66 Mbit/s on 20 threads, although there were comparable speeds on 10 threads. It seems there's no huge advantage to using more than about 10-15 threads.

In another update I will test intra-DC VNet to VNet throughput (i.e. within the same data center) and also to other regions.

The client I'm currently working for is starting a fairly large migration of its applications to the Azure cloud platform. One aspect of this that I've been interested in exploring is the performance of the network stack, both throughput and VNet latency. A logical way of separating environments is to create VNets and join them with the natively supported VNet to VNet VPNs.

I'm a little surprised by the VNet latency, even within the same datacenter. I have deployed two machines in two separate VNets in the same geographic region (North Europe – Dublin) and the typical latency is 2-3 ms, but it frequently jumps to 6-9 ms. While this is just an observation using a standard Windows ping, I plan to look into this more closely and a little more scientifically. iperf and throughput stats to come!
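
If you want to capture the same observation as numbers you can average rather than eyeballing ping output, PowerShell's Test-Connection returns the response times as objects. A quick sketch, with a placeholder IP for the VM in the other VNet:

# Hypothetical target: the internal IP of the VM in the other VNet
Test-Connection -ComputerName 10.0.1.4 -Count 50 |
    Measure-Object -Property ResponseTime -Average -Maximum -Minimum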

Are there any applications that are so latency dependent that this alone would prevent you from moving them into cloud infrastructure? Low latency on an on-premises network is almost a given, so this might be important to you when considering a move away from on-prem. 'Noisy Neighbours' are more easily dealt with when you look after the network end to end, of course; troubleshooting this latency variation could be difficult.