Learning from the July 2019 Capital One Breach

News broke today of a breach involving Capital One: the theft of data stored in non-public S3 buckets through a multi-step targeted attack.

The first thing I want to make clear is that I sympathize with the Capital One security and operations teams at this difficult time. Capital One is a well-known innovator in cloud security, has very competent people dedicated to it, and has even developed high-quality open source solutions such as Cloud Custodian that benefit the entire community. No security is perfect, and if you are a big enough target, get used to the idea that being breached is a question of "when", not "if".

I sincerely hope that this incident is not blown out of proportion and does not cause a slowdown of their strategy of being a cloud-first bank by next year, as they announced on stage at AWS re:Inforce. In any case, I think that an organization as heavily invested in cloud security as Capital One being breached is a good reminder that security is hard. And even though the cloud offers many advantages over on-premises environments when it comes to applying security, it is still a relatively new and evolving technology, and there's a non-trivial learning curve.

See the Resources section below for detailed write-ups and the affidavit describing what is known so far about this case. After reading through these materials and discussing them with others, I'd like to document a few practical guidelines and mitigations that might help prevent similar attacks in the future.

Least Privilege

The attack involved the attacker obtaining temporary credentials associated with an IAM role referenced as *****-WAF-Role in the affidavit. A big part of the problem seems to be that this role had excessive privileges:

The unnecessary "list buckets" privilege granted to the role allowed the attacker to discover and explore multiple buckets until they found interesting data. Had that privilege not been granted to the role, and assuming they didn't have insider knowledge of the infrastructure, the attacker might have been slowed down, or even prevented from finding buckets with sensitive data entirely.

The same IAM role credentials were then used to list and download data from the buckets, apparently using the AWS CLI s3 sync command, which under the hood performs multiple bucket list and object read operations to do its work.
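For reference, a download via s3 sync needs roughly the following two permissions (the bucket name here is hypothetical). This also illustrates why a stolen role with nothing more than list and read access to a bucket is enough to exfiltrate its entire contents, and why list permissions apply to the bucket ARN while read permissions apply to the objects within it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListTheBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-bucket"
    },
    {
      "Sid": "ReadTheObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```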

The name indicates that this role was used by a Web Application Firewall solution, so presumably the S3 privileges were meant either to let it access websites published via S3 or to write logs or configuration backups to S3. It wouldn't surprise me if someone simply granted access to all buckets, or even to all S3 operations, during deployment, either manually or through an overly permissive AWS Managed Policy attached to the same role.

So two lessons here:

  • Ensure your policies allow as few Actions and Resources as possible! Avoid using * like the plague, and always provide an explicit list of Actions and Resources in your policies.

  • Don't trust AWS Managed Policies blindly. Review them and, when necessary, create more specific and more secure versions for your own use. Particularly when they violate the first lesson.

If the WAF role in this incident only had access to writing new objects into a single bucket containing WAF logs, the impact from this intrusion might have been minimized.
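To make that concrete, here is a sketch of what such a minimally scoped policy might look like, assuming a hypothetical bucket named example-waf-logs that only receives log writes:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowWafLogWritesOnly",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-waf-logs/*"
    }
  ]
}
```

Note that there is no s3:ListAllMyBuckets, no s3:GetObject, and no wildcard resource: a role like this can write new log objects and nothing else.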

Control Permission Use Context with Policy Conditions

The other thing that caught my attention on this breach is that the attacker was able to get temporary credentials from an IAM role assigned to a resource on Capital One's VPC and then use them from an external computer under their control to download data from the S3 buckets.

On an on-premises environment, a "private" FTP or NFS server would be on an internal network not accessible from the Internet, so the attacker would not only need credentials but also network access in order to use them. That's not how S3 works, however. Even non-public S3 buckets can be accessed by default from anywhere in the world by someone holding the right credentials.

It turns out, though, that you can implement something similar for S3 buckets. A very infrequently used security measure is to limit access to API requests originating from a specific VPC, VPC endpoint, or IP address range by using policy conditions.

So at a minimum, if you have an IAM role granting access to a particular S3 bucket, and that role is only meant to be used by in-VPC resources such as EC2 instances, you could make it so that access is only granted when the request comes from the VPC or VPC endpoint you expect.
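A sketch of that idea using the aws:SourceVpce condition key (the bucket name and VPC endpoint ID are hypothetical; this pattern assumes in-VPC traffic to S3 goes through a gateway VPC endpoint):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadsOnlyViaOurVpcEndpoint",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": {
          "aws:SourceVpce": "vpce-1a2b3c4d"
        }
      }
    }
  ]
}
```

With a condition like this attached to the role's policy, the same temporary credentials stop working the moment they leave your network path to S3.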

Even better, if a particular bucket will only be used by resources on your VPC, you could add a statement to the bucket policy denying any access attempt that does not originate from the specific VPC or VPC endpoints you control. This would ensure that even if excessive privileges are granted to users or roles, the Deny statement will take precedence and no access to that bucket from outside of your VPC will ever happen. This would be the closest equivalent to the "FTP or NFS server in the internal network" analogy on an on-premises environment.
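Such a Deny statement in the bucket policy might look like this (again, bucket name and VPC endpoint ID are made up). Be aware that a blanket Deny like this also locks out console and administrative access from outside the VPC unless you carve out exceptions, which is part of why testing matters:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAccessFromOutsideOurVpcEndpoint",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "vpce-1a2b3c4d"
        }
      }
    }
  ]
}
```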

Either of those measures might have prevented the attacker from using the role's temporary credentials from a computer outside the company VPC to access those non-public buckets and download the data. The attacker would then have needed to take additional steps, such as compromising or creating a new resource on the Capital One VPC with sufficient outbound network permissions, in order to exfiltrate the data. This might have created the conditions that allowed the intrusion to be detected before it was successful.

This technique can be generalized to other services beyond S3, of course, but make sure you test the necessary scenarios thoroughly to avoid blocking legitimate requests. Not all condition keys are available in all API access scenarios and service combinations, because why would things be this easy?


Here are a few other resources if you want to learn more about the incident:
