Hack Your API First

 Introduction

    The Age of the API



Hi. This is Amir Shahzad, and welcome to Hack Your API First. This course is all about how to identify vulnerabilities in the APIs behind many of the rich client apps we build today. It's a great continuation of one of my previous courses, Hack Yourself First: How to go on the Cyber-Offense. That course was all about finding common vulnerabilities in web applications. It could be any technology stack, and they're vulnerabilities that you can easily find using mostly the web browser and a few common development tools. This time we're going to move on to mobile apps and apply the same approach, that is, looking at risks and vulnerabilities that you can easily find yourself in any app on any technology stack. You don't have to have seen the first course for this one to make sense, but if you're doing any web development I do highly recommend it.

That said, let's jump into the course and start talking about connected devices. In this course, these are the sorts of devices that we're going to be most commonly referring to: things like smartphones, tablets, and modern Windows apps. It doesn't matter what make or model they are. If they talk over the web via HTTP or HTTPS, then we're going to cover them in this course. Now of course these devices talk to backend services built in all sorts of different web technologies on different frameworks, and the great thing about this course is that it's equally relevant regardless of what's running on the server. I've made sure it makes sense to everyone by ensuring that we're really just looking at externally observable patterns, so things we'll see in tools that aren't specifically tied to, say, .NET or PHP or Node or any specific server framework. That keeps it broadly relevant to anyone building APIs and the rich clients that consume them.
One of the things I'd like to touch on before we start delving into the APIs themselves is just how important it is these days to focus on the security of the services behind rich apps, and I'd like to show you some statistics that should reinforce the messages we'll be covering in this course. Here's a chart of the number of apps in Apple's iOS App Store from about the middle of 2008 all the way through to the present time of recording, which is about mid-2014. It is astronomical growth, and at the present time we're approaching 1.2 million apps in Apple's App Store. It's a very similar story with Google's Play store. They kicked in a little bit later, but by many reports they're even bigger now. Regardless, they're both north of a million apps in the store, and a huge proportion of those talk over the web using the sorts of services that we're going to be looking at.

Now, the number of apps is one thing, but the number of devices that these apps have made their way onto is really staggering, and in fact, when we look at the figures we can see that we're approaching 80 billion downloads of apps to date. Apparently the average mobile device today has somewhere between 40 and 50 different apps installed on it, and again a significant portion of those are talking over the web. Put that in the context of this graph heading up towards the 80 billion mark. Again, that's just for Apple. That's not Google Play, that's not Windows Store, and it's not the other providers out there that distribute apps through other channels, yet those apps still talk over the web to communicate with backend services. All of these are within the scope of this course. Of course, if you are building mobile apps, the scale and the size are probably nothing new to you, but it is important to set that context about just how ubiquitous these apps and services are becoming in today's technology world.
But there's something else that's happening in our technology space today that's also very significant.

Let's take a quick look at that before we move on. We are now well and truly entering this era that has become known as the internet of things. Now, "things" is a pretty generic term, and in reality it covers pretty much anything that is a connected device, so anything that has a communication channel, and very frequently that communication channel with external services is again HTTP over the web. Let me give you a few examples, because all of these fall into the scope of what we're going to cover in this course.

First up, Withings scales. I have a set of these myself. They're great. You stand on them, obviously. They take your weight, they also take your fat mass, and then they talk over an API via the web to a backend service. How Withings secures its service is critical when it comes to protecting its customer data. And we're now starting to talk about health data as well, so that's pretty sensitive information.

Another good example that can have a serious security impact is physical security devices. This is Lockitron. It sits over the deadbolt on an existing door, so it's a retrofit, but it is an IP-enabled retrofit that you can control from your smartphone and that operators can control from a central location. They control it once again via HTTP services over the web. It is another device dependent on the sort of APIs that you're probably building if you're watching this course.

One more before we move on, and this is a pretty significant one: cars. In this case, Tesla has provided an API framework where developers can communicate over HTTP with Tesla's services and subsequently with vehicles. This is serious stuff when developers start having the ability to connect to vehicles. It's very early days too, so we're going to see a lot more, not just of Teslas and door locks and connected scales, but of all sorts of devices or things that are connected to the internet using HTTP-based services.
So, we're actually becoming extremely reliant on these services, but the other thing is that they're sort of out of sight. They sit behind the veneer of either an interface on our devices like our smartphones or tablets, or they run in the background when I stand on the scales each morning, so they're not quite as immediately obvious as something like a web page that you're loading in your browser, and in many ways that creates its own risks. This course is going to teach you how to find those risks. Not only that, but it's also going to teach you what the secure patterns look like, because obviously the ultimate goal here is to help you go away and write more secure apps, of which a very significant component is frequently the APIs. Now, I just touched on how the APIs behind these sorts of devices and apps tend to be a little bit out of sight. Let's move on, and I'm going to give you an example of what I mean by that.

The Hidden Nature of API Security



I want to talk a little bit about what I'd call the hidden nature of API security. To explain what I mean by that, let's first look at a traditional website. In fact, what you see on the screen here is the dedicated insecure website I created for the Hack Yourself First course, the predecessor to this one if you like. What we're looking at is a login form. It takes an email address and a password. We can also see that this login form is served over HTTP and not HTTPS. It's not a secure connection, there's no assurance of who we're connected to, yet it's asking for sensitive data. Now, in that course I explain why this is a risk even if it posts to HTTPS, and the point I want to make here is that it's very clear, it's overtly obvious, that we do not have a fundamental security tenet, which is SSL. No HTTPS. No green padlock. No certificate.

Now, let's compare this to a secure site, Pluralsight. When you log on to Pluralsight and you provide your user name and password, we can very clearly see just from the address bar that it is indeed an HTTPS connection. It's gone green, we have a padlock, and if we'd like we can click on the padlock, inspect the certificate, see who it was issued to, see who the certificate authority was, and of course also see when the certificate is valid until. This is very clear and present security, and indeed we try to ingrain into people that they should expect to see these very, very basics on any website where security is important.

But of course it's not just SSL that is overtly obvious, either by its presence or by its omission. It's also other security aspects, things like cross-site scripting and SQL injection. It only takes one or two simple requests in the browser, the sort of things I talk about in my previous course, to see that a site might actually be seriously vulnerable. That's one or two requests that the developer can easily make when they're building a site. Same for the QA people. Same for even the security team. It's very easy to find security risks in web applications. You can also easily do things like look at cookies. Are they flagged as HttpOnly or secure? Do they have long expirations? All of these things are easily observable with one or two clicks inside the browser.

Let's compare that to the paradigm of apps running on today's modern devices. Here's a good example. I've got three apps here. One of these does a very poor job of login security. Which one is it? Well, it's kind of hard to tell. We don't get a URL that's visible within the app before we entrust our personal sensitive data to the app. No HTTPS. No green padlocks. No certificates we can inspect. We trust, or at least we hope, that the developers of the app have done a good job of it, but unfortunately we really have no assurance, nothing beyond reputation, and reputation is a very poor security control.

Now, this not only makes things harder for the consumer, but it actually makes it harder for the developers and the testers and, as I said earlier, even the security teams to find immediately obvious risks like you can in the browser. It's certainly easy enough if you know how to go about it, but it's not the same as just looking at the URL or the developer tools in Chrome. That is a different story altogether, so we're going to touch on all that in this course. But before we go on with that, let's just recap on exactly what I mean when I talk about APIs.
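Going back to the cookie checks mentioned a moment ago, here is a minimal sketch in Python of what those "one or two clicks" are actually looking for, written so the same questions can be asked of a response you've captured from an app rather than a browser. It assumes you already have the raw value of a Set-Cookie header in hand. The function name, the sample cookie values, and the one-day threshold for "long expiration" are made up for illustration and are not from the course.

```python
from http.cookies import SimpleCookie

# Assumed threshold for a "long" expiration; choose whatever suits your app.
ONE_DAY_SECONDS = 24 * 60 * 60


def audit_set_cookie(header_value, max_age_limit=ONE_DAY_SECONDS):
    """Return a list of findings for one raw Set-Cookie header value.

    Hypothetical helper: it only checks the HttpOnly flag, the Secure flag,
    and the Max-Age attribute.
    """
    cookie = SimpleCookie()
    cookie.load(header_value)

    findings = []
    for name, morsel in cookie.items():
        # Flags that are absent come back as an empty string.
        if not morsel["httponly"]:
            findings.append(f"{name}: missing HttpOnly")
        if not morsel["secure"]:
            findings.append(f"{name}: missing Secure")

        max_age = morsel["max-age"]
        if max_age.isdigit() and int(max_age) > max_age_limit:
            findings.append(f"{name}: long-lived (Max-Age={max_age})")
    return findings


if __name__ == "__main__":
    # Made-up header value, similar in shape to what a proxy would show.
    for finding in audit_set_cookie("AuthCookie=abc123; Path=/; Max-Age=31536000"):
        print(finding)
```

In the browser you'd read the same three things straight off the developer tools. With an app, the header first has to be captured some other way, which is the point being made here.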

Leaky APIs and Hidden APIs

Introduction

Welcome to Hack Your API First. In this module we're going to take a look at leaky APIs, so APIs that disclose information that they shouldn't, and hidden APIs, or in other words APIs that the developer didn't think people would find yet are discoverable in a number of ways. I'm going to walk you through some examples of this in both the sample vulnerable app and also some real-world apps that are out there today. But before I do that, I'd like to take you through a story that perfectly illustrates the risks we're going to be talking about in this module.

Back in 2011 I wrote a blog post about a risk in a mobile app that had the potential to impact the privacy of shoppers at our local Westfield Shopping Center here in Sydney, Australia. This blog post got a lot of media attention, and everyone was frankly a little bit outraged that an organization would put out an app that took such a casual approach to people's privacy. Let me walk you through what the app did.

The premise of the app was that when you go shopping in the Westfield Shopping Center in Bondi, Sydney, there's a good chance that after a busy day of hard shopping you may forget where you left your car parked. So, what you could do is open up the app, and this is the app in front of you here, search for your number plate, and it would show you four pictures of vehicles with number plates close to the one you were searching for. I say close because this is obviously an OCR process, an optical character recognition process, where the number plates of the vehicles are photographed via the service and then identified by the software and converted to plain text with some degree of confidence. So, you search for the vehicle, and you get four results like you see here. They're pretty small pictures, just thumbnails. You can't quite make out the number plate. By design, it's not disclosing the actual plate of the vehicle in the car park, and of course, it's limiting it to only four results, so inevitably the thinking here was that the risk to privacy was minimized. Can't really read the plate; only four results.

Now, when you found the vehicle you were looking for you'd tap it, and it would show a map of where the car was located. Very handy, and now you can just wander over to your vehicle without trawling through the car park the rest of the afternoon trying to remember where you parked. On the surface of it, this seems fine, and it was only when I actually proxied the data through Fiddler, doing exactly what we saw in the last module, that I found something rather interesting. Let's have a look at the JSON response when a vehicle was searched for. This is what it looks like, and you can see that there's actually quite a lot of information here. All of this is about just one vehicle.

Clearly, the information here includes things like a map, so what image should be shown once someone selects the vehicle, information about the sensor that's actually taking the picture of the vehicle, a whole bunch of timestamp data, and the bit that I've highlighted, the visitor information. What we can see in the visitor information is the plain text of the license plate, in this case AWC11A. Immediately this poses a risk because it actually gives us the plain text version of the plate. You could start enumerating through potential plates and actually pulling this data back out. You could automate that, but the service did only return four results, so not that bad, right? You'd have to enumerate through quite a large number of results.

The problem was that the limit of four results was specified in the query string, so you could change it to 40 or 400 or 4000, in which case you'd get the result for every single parking bay in the shopping center, because there were only about 2500 of them. Not only could you get those 2500 results, but you could replay the query over and over and over again, perhaps every minute, and have a very good profile of the vehicles that come and go from the shopping center. Clearly, this is a serious privacy risk because now you can track the movements of vehicles. And ultimately, because a number plate really is personally identifiable information, you can pretty reliably tie it back to just one or two individuals. This posed a really serious privacy risk. We're going to have a look at a similar example of this in the sample app throughout the rest of this module.

But there's one other thing that this Westfield implementation did that I'm also going to talk about in this module, and that is that it exposed the administration facilities used to control the parking sensors. So, think about things like the signs that show the number of bays that are currently available in many modern-day car parks, and the messages shown on boards. All of these were controlled by a web interface that the exposure of this API effectively made publicly available. And again, I'm going to show you how easy it is to locate those sorts of risks throughout the remainder of this module.

So, I hope that gives you some real-world context for the sort of things we're about to look at. These are not hypothetical. These happen, and this certainly isn't the only example. Let's go and take a look at that vulnerable mobile app now.
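The two leaks in the car park story, a result limit the client could raise at will and a response carrying far more than the app displayed, can be sketched in a few lines of Python. This is not Westfield's code or its API. The field names, the sample records, and the simple substring match standing in for the OCR "close match" search are all assumptions made for illustration. What it shows is the general shape of a safer handler: the server caps the limit itself and returns only the fields the screen needs.

```python
# The app only ever displayed four thumbnails, so the server enforces that cap.
MAX_RESULTS = 4

# Hypothetical field names: the only data the client needs to draw a result.
PUBLIC_FIELDS = ("thumbnail_url", "map_url")


def search_vehicles(records, plate_query, requested_limit):
    """Return trimmed search results for a number plate query.

    records         -- list of dicts, each holding the full server-side record
    plate_query     -- text typed by the user
    requested_limit -- raw value taken from the query string (untrusted)
    """
    # Never trust the client's limit: parse it, then clamp it to 1..MAX_RESULTS.
    try:
        limit = int(requested_limit)
    except (TypeError, ValueError):
        limit = MAX_RESULTS
    limit = max(1, min(limit, MAX_RESULTS))

    # Stand-in for the real fuzzy OCR matching: a case-insensitive substring test.
    query = plate_query.upper()
    matches = [r for r in records if query in r["plate"].upper()]

    # Copy across only the public fields, so the plain-text plate, sensor
    # details, and timestamps never leave the server.
    return [{field: r[field] for field in PUBLIC_FIELDS} for r in matches[:limit]]


if __name__ == "__main__":
    # Made-up records shaped loosely like the response described above.
    bays = [
        {
            "plate": f"AWC{i:02d}A",
            "thumbnail_url": f"/thumbs/{i}.jpg",
            "map_url": f"/maps/{i}.png",
            "sensor_id": i,
            "entered_at": "2011-09-01T10:00:00",
        }
        for i in range(10)
    ]
    # Asking for 4000 results still yields four trimmed entries.
    print(search_vehicles(bays, "awc", "4000"))
```

Capping the limit doesn't stop someone from enumerating plates one query at a time, but combined with not returning the plate text at all, a single request can no longer dump the whole car park.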
