

Last Week In AWS Podcast
Corey Quinn
The latest in AWS news, sprinkled with snark. Posts about AWS come out over sixty times a day. We filter through it all to find the hidden gems, the community contributions--the stuff worth hearing about! Then we summarize it with snark and share it with you--minus the nonsense.
Episodes

Feb 21, 2020 • 14min
Whiteboard Confessional: Route 53 DB
About Corey Quinn
Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.

Transcript
Corey: Welcome to AWS Morning Brief: Whiteboard Confessional. I’m Cloud Economist Corey Quinn. This weekly show exposes the semipolite lie that is whiteboard architecture diagrams. You see, a child can draw a whiteboard architecture, but the real world is a mess. We discuss the hilariously bad decisions that make it into shipping products, the unfortunate hacks the real world forces us to build, and that the best thing to call your staging environment is “theory”, because invariably whatever you’ve built works in theory, but not in production. Let’s get to it.

But first… On this show, I talk an awful lot about architectural patterns that are horrifying. Let's instead talk for a moment about something that isn't horrifying: CHAOSSEARCH. Architecturally, they do things right. They provide a log analytics solution that separates out your storage from your compute. The data lives inside of your S3 buckets, and you can access it using APIs you've come to know and tolerate, through a series of containers that live next to that S3 storage. Rather than running massive clusters that you have to care for and feed yourself, you now get to focus on just storing data, treating it like you normally would other S3 data: not replicating it, not storing it on expensive disks in triplicate, and fundamentally not having to deal with the pains of running other log analytics infrastructure. Check them out today at chaossearch.io.

I frequently joke on Twitter about my favorite database being Route 53, which is AWS’s managed database service. It’s a fun joke, to the point where I’ve become Route 53’s de facto technical evangelist. But where did this whole joke come from? It turns out that this started life as an unfortunate architecture that was taken in a terrible direction. Let's go back in time, at this point almost 15 years from the time of this recording in the year of our Lord 2020. We had a data center that was running a whole bunch of instances—in fact, we had a few data centers, or datas center, depending upon how you chose to pluralize; that’s not the point of this ridiculous story. Instead, what we’re going to talk about is what was inside these data centers. In this case: servers. I know, serverless fans, clutch your pearls, because that was a thing that people had many, many, many years ago. Also known as roughly 2007. And on those servers there was a new technology running that was really changing our perspective on how we dealt with systems. I am, of course, referring to the amazing, transformative revelation known as virtualization. This solved the problem of computers being bored and not being able to process things in a parallelized fashion—because you didn’t want all of your applications running on all of your systems—by building artificial boundaries between different application containers, for lack of a better term. Now, in those days, these weren’t applications. These were full-on virtualized operating systems, so you had servers running inside of servers, and this was very early days. Cloud wasn’t really a thing.
It was something that was on the horizon, if you’ll pardon the pun. So, this led to an interesting question: “All right. I wound up connecting to one of my virtual machines, and there’s no good way for me to tell which physical server that virtual machine is running on.” How could we solve for this? Now, back in those days, the hypervisor technology we used was Xen, that’s X-E-N—incidentally the same virtualization technology that AWS started out with for many years before releasing their Nitro hypervisor, which is KVM-derived, a couple of years ago. Again, not the point of this particular story. One of the interesting pieces of how this works was that Xen didn’t really expose anything, at least in those days, that you could use to query the physical host a guest was running on. So, how would we wind up doing this? Now, at very small scale, where you have two or three servers sitting somewhere, it’s pretty easy. You log in and you can check. At significant scale, that starts to get a little bit more concerning. How do you figure out which physical host a virtual instance is running on?

Well, there are a bunch of schools of thought you can approach this from. But what you’re trying to build is known, technically, as a configuration management database, or CMDB. This is, of course, radically different from configuration management, such as Puppet, Chef, Ansible, Salt, and other similar tooling. But, again, this is technology, and naming things has never been one of our collective strong suits. So, what do we wind up doing? You can have a database, or an Excel spreadsheet, or something like that that has all of these things listed, but what happens when you turn an old instance off and spin up a new instance on a different physical server? These things become rapidly out-of-date. So, what we did was sort of the worst possible option. It didn’t solve all of these problems, but it at least let us address the perceived problem, in a way that is, of course, architecturally terrible, or it wouldn’t have been on this show.

DNS has a whole bunch of interesting capabilities. You can view it, more or less, as the phone book for the internet. It translates names to numbers: fully qualified domain names, in most cases, to IP addresses. But it does more than that. You can query an IP address and wind up getting the PTR, or reverse record, that tells you what the name of a given IP address is, assuming that they match. You can set those to different things, but that’s a different pile of madness that I’m certain we will touch upon a different day. So, what we did is we took advantage of a little-known record type: the TXT, or text, record. You can put arbitrary strings inside of TXT records and then consume them programmatically, for a whole bunch of different purposes. One of the ways we can use that that isn’t patently ridiculous: domains generally have a TXT record that contains their SPF record, which shows which systems are authorized to send mail on their behalf as an anti-spam measure. So, if something else that isn’t authorized starts claiming to send email from your domain, that gets flagged as spam by many receiving servers. We misused TXT records, because there is no limit, really, to how many TXT records you can have, and wound up using them as our configuration management database.
So, you could query a given instance, we’ll call it webserver003.production.losangeles.company.com, which was our naming scheme for these things, and it would return a record that was itself a fully qualified domain name, but it was the name of the physical host on top of which it was running. So, yeah, we could then propagate that, as we could with any other DNS records, to other places in the ...
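For the curious, here is a minimal sketch of what that lookup looks like in practice, using Python's dnspython library. The zone layout and record contents are hypothetical, modeled on the episode's webserver003 example; this illustrates the technique, not the original tooling.

```python
# Querying a TXT record as a poor man's CMDB, per the episode.
# Assumes dnspython is installed (pip install dnspython); the record
# names and contents below are hypothetical examples.
import dns.resolver

def physical_host_for(instance_fqdn: str) -> str:
    """Return the physical host name stored in the instance's TXT record."""
    answer = dns.resolver.resolve(instance_fqdn, "TXT")
    # TXT rdata arrives as a tuple of byte strings; join and decode them.
    return b"".join(answer[0].strings).decode()

# Hypothetical usage, matching the episode's naming scheme:
#   physical_host_for("webserver003.production.losangeles.company.com")
# might return something like "hypervisor012.losangeles.company.com".
```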

Feb 17, 2020 • 13min
EBS Gets Overly Multi-Attached
AWS Morning Brief for the week of February 17, 2020.

Feb 10, 2020 • 12min
Polly Brand Voice Want a Platypus?
AWS Morning Brief for the week of February 10, 2020.

Feb 6, 2020 • 22min
Networking in the Cloud Fundamentals: BGP Revisited with Ivan Pepelnjak
About Corey Quinn
Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.

Transcript
Corey: Hello and welcome to our Networking In The Cloud mini series, sponsored by ThousandEyes. That's right: there may be just one of you, but there are a thousand eyes. On a more serious note, ThousandEyes published their cloud performance benchmarking report for 2019 at the end of last year, talking about what it looks like when you race various cloud providers. They looked at all the big cloud providers and determined what performance looks like from an end user perspective. What does the user experience look like among and between different cloud providers? To get your copy of this report, you can visit snark.cloud/realclouds. Why real clouds? Well, because they raced AWS, Azure, GCP, IBM Cloud, and Alibaba, all of which are real clouds. They did not include Oracle Cloud because, once again, they are real clouds. Check out your copy of the report at snark.cloud/realclouds.

Welcome to week 12 of the Networking In The Cloud mini series of the AWS Morning Brief, sponsored by ThousandEyes. One of the early episodes of this mini series had me opining in relatively uninformed broad brush strokes about the nature of BGP. Today I am joined by Ivan Pepelnjak, a former CCIE who wrote a fascinating blog post, which I will link to in the show notes, saying, in effect, "This is great, but this is what happens when someone who's good at one thing steps completely out of their comfort zone into things they don't fully understand and starts opining confidently, if not authoritatively." Ivan, thank you for taking the time to speak with me.

Ivan: Thanks for having me on. And no, I was way more polite than your summary.

Corey: Absolutely. I believe that there's a way to tell a story as the hero's journey that everyone talks about when they're building a narrative arc. Instead, I go for the moron's journey, and I always like to be the moron because, generally, I tend to be. As I walk through the world and get things sometimes right, occasionally wrong, I love being corrected when I stumble blindly into an area I don't know. First because it gives me an opportunity to learn something new, which is great, but it also gives me that opportunity to be the dumbest person in the room again, which is awesome. So...

Ivan: That's exactly why I blog: to get your opinions.

Corey: Exactly. "You have data, I have opinions, and mine are louder" seems to be the way that discourse works in the modern era. So, from a high level, what did I get wrong about BGP?

Ivan: Well, you got everything right about the mess that we are in and the fragility of the generic internet infrastructure. The only thing you got wrong was that you blamed the tool, but not the people using the tool.

Corey: It always feels safer, on some level, to blame technology, because if the takeaway is, "Well, the user experience around tool X isn't great, and that's a contributing factor to why things break," that seems to be a message that carries slightly better than, "And thus the answer is for everyone to be smarter and stop screwing up." And that may very well be the answer.
It's just a bitter pill to swallow sometimes. So I find blaming a tool is easy.

Ivan: Yeah, but it's like blaming the knife when people get cut, or blaming the chainsaw when people cut off their arm because they were not properly trained.

Corey: One of my assertions was that BGP is more or less a hot mess because it was designed for an era when people on the internet fundamentally could trust one another, and that doesn't seem to be the case today. The analogy in my mind, which I don't think I mentioned, was SMTP, the email protocol, for lack of a better term. When that was built, the internet was more or less comprised of researchers, and who in the world would ever abuse a protocol like email? It's not like there was any money involved in the internet. Fast forward to today, and your spam folder is inherently a garbage fire.

Ivan: Yeah, but BGP has a slightly different history. It was redesigned a few times. There were several attempts to get the global routing protocol right. And BGP, the last attempt, already included the tools that allow entities that don't trust each other, like commercial internet service providers, to exchange information and apply policies on inbound and outbound updates. So, for example: I don't want to hear about your customers, because I hate you and I don't want to peer with you. Or: I don't want to tell you about my customer, because that customer has a special deal and their traffic can only go through some other transit providers, so I will not tell you about that customer. Those things were already a major requirement when BGP was designed, and it always included the tools to implement the policies that individual commercial entities wanted to have, which, by the way, never happened with SMTP. We have BGP version 4 now, and we are still on SMTP version 0.1 plus enhancements.

Corey: I guess the best analogy I can come up with through my exposure to BGP is plumbing, because I tend to handle internetworking between various groups about as well as I write code: I have some vague awareness that there are things you should be doing here that I will almost certainly not get right, so I back away slowly and leave it to professionals. As a result, every time I really see how BGP works in any hands-on sense, or a point where it's forced upon my awareness, it's similar to how I become aware of plumbing. I don't think about it. I don't question it. I just expect when I turn the faucet on or flush the toilet that water will do what it's going to do. I don't expect the toilet to explode. So the only time I think about BGP is when there is a peering dispute, or when there's a flap, or, on one notable occasion, when I was at a security conference and, as a demo, some folks hijacked the entire ASN for the conference and rerouted it halfway around the world and back, which explained why everything was super latent and crappy.

Ivan: Yeah. You're absolutely right, but all the incidents you mentioned are not the fault of the tool. They are the fault of the tool not being properly used. And also, let's be honest, it took hundreds of years to get the plumbing to the point where you can just turn on the faucet and clean, drinkable water comes out of it. It's not like that would have happened in the last year or two, and very probably it wouldn't have happened without public pressure to bring us dri...
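To make the export-policy idea concrete, here is a toy sketch in Python of the kind of per-neighbor outbound filtering Ivan describes. This is emphatically not a BGP implementation; the neighbor names, customer tags, and prefixes are all invented for illustration, and real routers express this with route maps, communities, and AS paths.

```python
# Toy illustration of BGP-style outbound (export) policy: before
# announcing routes to a neighbor, apply that neighbor's filters.
# All names and prefixes are invented for this sketch.

routes = {
    "203.0.113.0/24": {"customer": "special-deal-customer"},
    "198.51.100.0/24": {"customer": "ordinary-customer"},
}

# Per-neighbor policy: whose routes we refuse to announce, and why.
export_deny = {
    "peer-we-hate": {"ordinary-customer", "special-deal-customer"},
    "transit-provider-a": {"special-deal-customer"},  # their traffic must go elsewhere
}

def announce(neighbor: str) -> list[str]:
    """Prefixes we'd announce to this neighbor after export filtering."""
    denied = export_deny.get(neighbor, set())
    return [p for p, attrs in routes.items() if attrs["customer"] not in denied]

print(announce("transit-provider-a"))  # ['198.51.100.0/24']
print(announce("peer-we-hate"))        # []
```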

Feb 3, 2020 • 11min
Lies, Damned Lies, and Sponsored Benchmarks
AWS Morning Brief for the week of February 3, 2020.

Jan 30, 2020 • 15min
Networking in the Cloud Fundamentals: Cloud and the Last Mile
About Corey Quinn
Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.

Transcript
Corey: Hello and welcome to our Networking in the Cloud mini series, sponsored by ThousandEyes. That's right: there may be just one of you, but there are a thousand eyes. On a more serious note, ThousandEyes published their cloud performance benchmarking report for 2019 at the end of last year, talking about what it looks like when you race various cloud providers. They looked at all the big cloud providers and determined what performance looks like from an end user perspective. What does the user experience look like among and between different cloud providers? To get your copy of this report, you can visit snark.cloud/realclouds. Why real clouds? Well, because they raced AWS, Azure, GCP, IBM Cloud and Alibaba, all of which are real clouds. They did not include Oracle Cloud because, once again, they are real clouds. Check out your copy of the report at snark.cloud/realclouds.

It's interesting that that report focuses on the end user experience, because as this mini series begins to wind down, we're talking today about the last mile and its impact on perceived cloud performance. And I will admit that even having presented this entire mini series, and having a bit of a network engineering background once upon a time, I still wind up in a fun world of always defaulting to blaming my local crappy ISP.

Now, today, my local ISP is amazing. I use Sonic in San Francisco. I get symmetric gigabit. It's the exact opposite of Comcast, who was my last provider until Sonic came to my neighborhood. It was fun that day, because I looked up and down the block and saw no fewer than six Sonic trucks ripping Comcast out by the short and curlies, which, let's not kid ourselves, is something we all wish we could do, and I was the happiest boy in town the day I got to do it. Now, the hard part is figuring out that yes, it is in fact a local ISP problem, because it isn't always. This is also fortuitous because I spent the last month or so fixing my own local internet situation, and today I'd like to tell you a little bit more about that, as well as how and why.

Originally, when I first moved into my roughly 2,800 square foot house, spread across three stories, I wound up getting Eeros, that's E-E-R-O. They're a mesh network setup from a company that was acquired by Amazon after I'd purchased them. These were generation one. The wireless environment in San Francisco is challenging, and in certain parts of my house the reception, as a result, wound up being a steaming bowl of horse crap. The big challenge was figuring out that that's what the problem was. With weird dropouts and handoff issues, it was interesting. The one change that caused immediate improvement was not having these things talk to each other wirelessly, as most full mesh systems will do, but instead making sure that they were cabled up appropriately, through the central patch panel, to a switch.
Now, you have to be careful with switches, because a lot of gear won't do anything approaching full throughput; that can get expensive, and a lot of consumer gear is crap. Mine was a managed HP ProCurve device, from back in the days when HP made networking equipment. That was great. And it's still crap, but it is crap that works at full line rate. So there's that.

Next, I figured... all right, it's time to take this seriously. So I did some research and talked to people I know who are actually good at things, instead of sounding on the internet like they're good at things, and I figured the next step was to buy some Ubiquiti gear. Great. We go ahead and trot some of that out. It's enterprise gear. It's full mesh. I, of course, now have a guest Wi-Fi hotspot that you have to pay to use, with the SSID "Toss a coin to your Wi-Fi," because of course it is. I have problems. And it's fun, and I can play these stupid games, but suddenly every weird internet problem I had in my house started getting better as a result.

And it's astonishing how that changed my perception of various third party services, none of which, by the way, had anything to do with my actual problem. But there were still some perceptual differences. And this impacts the cloud in a number of subtle ways, and that's what I want to talk about today.

So, one of the biggest impacts is DNS. And I don't mean that in the sense of big cloud provider DNS; we've already talked about how DNS works in a previous episode. Rather, I mean what resolver you wind up using yourself. One of the things that I did as a part of this upgrade was roll out a piece of software called Pi-hole, which sounds incredibly insulting as applied to people, as in, "You know what you should shut? Your Pi-hole." It's designed to run on top of a Raspberry Pi and provide a DNS server that creatively blocks ads.

And that's super neat. I liked the idea of just blocking ad servers, but you have to trust whatever you're using for a DNS resolver, because of a few specific things I stumbled over as I went down this process. One, it turns out that handing something a list of every website you care to visit is not really the most privacy-conscious thing in the universe. Now, for some reason, the internet collectively decided, you know who we trust with all the things that we look at on the internet and have no worries about giving that information to? That's right: freaking Google. So 8.8.8.8 was a famously easy-to-remember open resolver, and it works super well. It's quick. It returns everything. The problem is that Google's primary business model is very clearly surveillance, and I don't do anything particularly interesting. If you look at my DNS history, you're going to find a lot of things that you'd think you could use to blackmail me, but it turns out you actually can't, because I talk about them on podcasts. That's right: I use Route 53 as a database. What of it? And it's all very strange; even without anything to hide, I still feel this sense of pervasive creepiness at the idea that a giant company can look at my browsing history. So blocking things like that is of interest to me. So okay, instead, I run Pi-hole, which acts as my own resolver but then winds up passing queries on to an upstream provider.
I mean, I could run my own, but that has other latency concerns, and DNS latency when you're making requests is super impactful, because the entire internet has gone collectively dumb and decided that to display a simple static webpage, you need to make 30 distinct DNS requests in series, wait for them all to come back, and other ridiculous nonsense that is the modern web today.

What makes this extra special is I figured out, okay, I'm not going to go with Google or CloudFlar...
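Since the episode cuts off mid-comparison, here is a rough sketch of how you might measure the resolver latency Corey is weighing, using Python's dnspython library. The Pi-hole address is an assumption; 8.8.8.8 is the Google resolver mentioned above.

```python
# Rough resolver latency comparison. Assumes dnspython is installed
# (pip install dnspython); 192.168.1.2 is a hypothetical Pi-hole address.
import time
import dns.resolver

def lookup_ms(nameserver: str, name: str = "example.com") -> float:
    """Time a single A-record lookup against one nameserver, in milliseconds."""
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [nameserver]
    start = time.perf_counter()
    resolver.resolve(name, "A")
    return (time.perf_counter() - start) * 1000

for ns in ("8.8.8.8", "192.168.1.2"):  # Google vs. hypothetical local Pi-hole
    print(f"{ns}: {lookup_ms(ns):.1f} ms")
```

A page that fires off 30 DNS lookups in series multiplies whatever number this prints by 30, which is why the resolver you pick shows up in perceived page load time.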

Jan 27, 2020 • 12min
Dedicated T3 Instances Burst My Understanding
AWS Morning Brief for the week of January 27th, 2020.

Jan 23, 2020 • 15min
Networking in the Cloud Fundamentals: Connectivity Issues in EC2
About Corey Quinn
Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.

Transcript
Corey: Welcome to the AWS Morning Brief's miniseries, Networking In the Cloud, sponsored by ThousandEyes. ThousandEyes has released their cloud performance benchmark report for 2020. They effectively race the top five cloud providers: AWS, Google Cloud Platform, Microsoft Azure, IBM Cloud, and Alibaba Cloud, notably not including Oracle Cloud, because it is restricted to real clouds, not law firms. It delivers an unbiased, third-party, metric-based perspective on cloud performance as it relates to end user experience. So this comes down to what real users see, not arbitrary benchmarks that can be gamed. It talks about architectural and connectivity differences between those five cloud providers and how that impacts performance. It talks about AWS Global Accelerator in exhausting detail. It talks about the Great Firewall of China and what effect that has on cloud performance in that region, and it talks about why regions like Asia and Latin America experience increased network latency on certain providers. To get your copy of this fascinating and detailed report, visit snark.cloud/realclouds, because again, Oracle's not invited. That's snark.cloud/realclouds, and my thanks to ThousandEyes for their continuing sponsorship of this ridiculous podcast segment.

Now, let's say you go ahead and spin up a pair of EC2 instances, and, as would never happen until suddenly it does, you find that those two EC2 instances can't talk to one another. This episode of the AWS Morning Brief's Networking in the Cloud podcast focuses on diagnosing connectivity issues in EC2. It is something that people don't have to care about until suddenly they really, really do. Let's start with our baseline premise: we've spun up an EC2 instance, and a second EC2 instance can't talk to it. How do we go about troubleshooting our way through that process?

The first thing to check, above all else, and this goes back to my grumpy Unix systems administrator days, is: are both EC2 instances actually up? Yes, the console says they're up. It is certainly billing you for both of those instances; I mean, this is the cloud we're talking about. And it even says that the monitoring checks, of which there are two by default for each instance, are passing. That doesn't necessarily mean as much as you might hope. If you go into the EC2 console, you can validate through the system logs that they booted successfully. You can pull a screenshot out of them. If everything else were working, you could use AWS Systems Manager Session Manager, and, if you'll forgive the ridiculous name, that's not a half bad way to go about getting access to an instance. It spins up a shell in a browser that you can use to poke around inside that instance, but that may or may not get you where you need to go. I'm assuming you're trying to connect to one or both of those instances and failing, so validate that you can get into both of those instances independently.

Something else to check: consider protocols. Very often, you may not have permitted SSH access to these things.
Okay, or maybe you can't ping these instances and you're assuming they're down. Well, an awful lot of networks block certain types of ICMP traffic: echo requests (type 8), for example. Otherwise, you may very well find that whatever protocol you're attempting to use isn't permitted all the way through. Note, incidentally, just as an aside, that blocking all ICMP traffic is going to cause problems for your network. When packets are fragmented, or senders need to adjust sizes for traffic being sent across the internet, ICMP is how systems are made aware of that. You'll see increased latency if you block all ICMP traffic, and it's very difficult to diagnose, so please, for the love of God, don't do that.

Something else to consider as you go down the process of tearing apart what could possibly be going on with these EC2 instances not being able to speak to each other: try to connect to them via IP addresses rather than DNS names. Just because there's ... I'm not saying the problem is always DNS, but it usually is DNS, and this removes a whole host of different problems that could be manifesting. If you just go by IP address, suddenly resolution failures, timeouts, bad DNS records, et cetera, fall by the wayside. When you have one system trying to talk to another system using only IPs, there's a whole host of problems you don't have to think about. It goes well.

Something else to consider in the wonderful world of AWS is network ACLs. The best practice around network ACLs is, of course: don't use them. Have an ACL that permits all traffic, and then do everything else further down the stack. The reason is that no one thinks about network ACLs when diagnosing these problems. So if this is the issue, you're going to spend a lot of time spinning around and trying to figure out what's going on.

The next, more likely approach, and something to consider whenever you're trying to set up different ways of dividing traffic across various regimes of segmentation, is security groups. Security groups are fascinating, and the way that they interact with one another is not hugely well understood. Some people treat security groups like old school IP address restrictions, where anything in a given network, expressed in CIDR notation the way one would expect, or C-I-D-R, depending on how you enjoy pronouncing or mispronouncing things, is allowed in. Sure, that works, but you can also say that members of a particular security group are themselves allowed to speak to this other thing. That, in turn, is extraordinarily useful, but it can also get extremely complex, especially when you have multiple security groups layering upon one another. If you have multiple security group rules in place, any rule that allows the traffic takes precedence. Note as well that there's a security group rule in place by default that allows all outbound traffic. If that's been removed, that could be a terrific reason why an instance is not able to speak to the larger internet.

One thing to consider when talking about the larger internet is what ThousandEyes does other than releasing cloud benchmark performance reports. That's right. They are a monitoring company that gives a global observer perspective on the current state of the internet.
If certain providers are having problems, they're well positioned to figure out who that provider is, where that provider is having the issue, and how that manifests, and then present that in real time to their customers. So if you have widely dispersed users and want to keep a bit ahead of what t...
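As a companion to the checklist in this episode, here is a hedged boto3 sketch of the first two checks: are the instances actually up, and what do their security groups permit? The instance IDs are placeholders, and region and credentials are assumed to be configured in your environment; this is an illustration, not the episode's own tooling.

```python
# Sketch of two checks from the episode, via boto3 (pip install boto3):
# instance state plus status checks, then security group ingress rules.
# Instance IDs below are placeholders.
import boto3

ec2 = boto3.client("ec2")
instance_ids = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # placeholders

# Check 1: state plus the two default status checks per instance.
status = ec2.describe_instance_status(
    InstanceIds=instance_ids, IncludeAllInstances=True
)
for s in status["InstanceStatuses"]:
    print(s["InstanceId"], s["InstanceState"]["Name"],
          "system:", s["SystemStatus"]["Status"],
          "instance:", s["InstanceStatus"]["Status"])

# Check 2: list each instance's security groups and their ingress rules.
for r in ec2.describe_instances(InstanceIds=instance_ids)["Reservations"]:
    for inst in r["Instances"]:
        group_ids = [g["GroupId"] for g in inst["SecurityGroups"]]
        groups = ec2.describe_security_groups(GroupIds=group_ids)
        for sg in groups["SecurityGroups"]:
            print(inst["InstanceId"], sg["GroupId"], sg["IpPermissions"])
```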

Jan 20, 2020 • 10min
AWS Back-All-The-Way-Up
AWS Morning Brief for the week of January 20th, 2020.

Jan 16, 2020 • 17min
Networking in the Cloud Fundamentals: Data Transfer Pricing
About Corey Quinn
Over the course of my career, I’ve worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I’m a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you’re about to listen to.

Transcript
Corey: Welcome to the AWS Morning Brief, specifically our 12-part mini series, Networking In The Cloud, sponsored by ThousandEyes. ThousandEyes recently released their state of the cloud benchmark performance report. They raced five clouds together and gave a comparative view of the networking strengths, weaknesses, and approaches of those various providers. Take a look at what it means for you. There's actionable advice hidden within, as well as incredibly useful comparative data, so you can start comparing apples to oranges instead of apples to baseballs. Check them out and get your copy today at snark.cloud/realclouds. That's snark.cloud/realclouds, because Oracle Cloud was not invited to participate.

Now, one thing that they did not bother to talk about in that report is how much all of that data transfer across different providers costs. Today I'd like to talk about that, which is a bit of a lie, because I'm not here to talk about it at all; I'm here to rant like a freaking lunatic, for which I make no apologies whatsoever. This episode is about data transfer pricing in AWS, because honestly I need to rant about something, and this topic is entirely too near and dear to my heart, given that I spend most of my time fixing AWS bills for various interesting and sophisticated clients.

Let's begin with a simple question, the answer to which is guaranteed to piss you off like almost nothing else: what does it cost to move a gigabyte of data in AWS? Think about that for a second. The correct answer, of course, is that nobody freaking knows. There is no way to get a deterministic answer to that question without asking a giant boatload of other questions.

Let me give you some examples. Before I do, I would like to call out that every number I'm about to mention applies only to us-east-1, because of course different regions have varying costs, and every single one of these numbers is different in other places, sometimes, but not always. Why? Because things are awful. I told you I was going to rant. I'm not apologizing for it at this point.

Let's begin simply and talk about what it takes to just shove a gigabyte of data into AWS. Now, in most cases that's free. Inbound bandwidth to AWS is almost always free, until it passes through a load balancer or does something else, but we'll get there. What does it cost to move data between two AWS regions? Great. The answer is two cents per gigabyte between the primary regions, except for one pairing that costs slightly less: moving between us-east-1 and us-east-2. One is in Virginia, two is in Ohio. That is half price, at one cent per gigabyte. My working theory is that it's because even data wants to get the hell out of Ohio.

Let's take it a step further. Let's say you're in an individual region. What does it cost to move data from one AZ to another?
The documentation was exquisitely unclear, and I had to do some experiments: spinning up a few instances in otherwise empty AWS accounts, using dd and netcat to hurl data across various links, and then waiting until the answer showed up on my bill. The answer is that it also costs two cents per gigabyte, the same cost as region to region. It's one cent per gigabyte out of an AZ and one cent per gigabyte into an AZ. That's right: you get charged on both sides. If you move 10 gigabytes, you are charged for 20 gigabytes on that particular metric. This also has the fun ancillary side effect of meaning that cross-region transfer between Virginia and Ohio is cheaper than moving that same data between AZs within a single region.

Oh wait, it gets dumber than that. What do load balancer data transfer fees look like? The correct answer is: who the hell knows? On the old classic load balancers, it was 0.8 cents per gigabyte in or out to the internet, and there was also an instance fee, but that's not what we're talking about today. Traffic from an existing load balancer to something inside an AZ is free, unless it crosses an availability zone, and then we're back into cross-AZ data transfer territory; anything going from an availability zone to a load balancer costs one cent per gigabyte.

Now, the newer load balancer generations, the ALBs and the NLBs, what do those cost? Nobody freaking knows, because data throughput is just one of several dimensions that go into a load balancer capacity unit, which means that what your data transfer price looks like will vary wildly. In this particular case, it's not data transfer alone. There's still that as traffic leaves, but you also pay an additional through-the-load-balancer fee that's blended into an LCU, so it's not at all obvious at times that that is in fact what you're being billed for.

In another episode of this mini series, we talked about Global Accelerator. There has been a site-to-site VPN option for a while, but at re:Invent last year they announced an accelerated VPN option that leverages a lot of Global Accelerator technology to let that site-to-site VPN take significant advantage of the Global Accelerator network. Now, what does that cost? I could not freaking tell you. There are, and I am not exaggerating, five distinct billing line items if you run an accelerated site-to-site VPN, and of course all of them cost you money. That is the actual state of the world. It is incredibly annoying. It is so annoying that I'm going to have to take a break, before I blow a blood vessel, to tell you more about ThousandEyes instead.

So, other than the cloud report, what is ThousandEyes? They effectively act as a global observer that watches the entire internet from a whole bunch of different listening posts and keeps track, in near real time, of what's going on: what's being slow, and which providers are having issues. They give that information directly to folks on your side so they can understand, adapt to, and mitigate those outages and slowdowns. It helps you immediately get to the question: is this a global networking problem, or did our last crappy code deploy break things? If this sounds like something that might be useful for you or your team, I encourage you to check them out at thousandeyes.com. They're a fantastic company with a fantastic product, and best of all, their billing makes sense.

We're back to ranting again.
That's right. My problem with AWS data transfer pricing is not just that it's shitty and complex, but also that it's expensive. Pricing largely has not changed since AWS...
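To see why the Ohio punchline lands, here is the arithmetic from this episode in a few lines of Python. The rates are the ones quoted above for us-east-1 in early 2020; current pricing may well differ.

```python
# Data transfer rates as quoted in this episode (us-east-1, early 2020).
CROSS_REGION_PER_GB = 0.02       # most primary region pairs
VA_OHIO_PER_GB = 0.01            # us-east-1 <-> us-east-2 discount
CROSS_AZ_PER_GB = 0.01 + 0.01    # 1 cent out of an AZ + 1 cent into an AZ

gb = 10
print(f"Cross-region:          ${gb * CROSS_REGION_PER_GB:.2f}")  # $0.20
print(f"Virginia <-> Ohio:     ${gb * VA_OHIO_PER_GB:.2f}")       # $0.10
print(f"Cross-AZ, same region: ${gb * CROSS_AZ_PER_GB:.2f}")      # $0.20
```

Under those quoted rates, moving 10 GB between Virginia and Ohio really does cost half of what moving it between two AZs in the same region does.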


