AWS Cloud Practitioner Notes
Key points before diving in:
- AWS is based on the client/server model
- You only pay for what you use!!!
- An instance is a server (a VM) on AWS
- AWS cloud computing is the delivery of IT resources over the internet with pay-as-you-go pricing (you can create servers or get more storage as you need them)
The three cloud computing deployment models are:
- cloud-based
- on-premises
- hybrid - for when you want to integrate cloud with legacy apps
Benefits of cloud computing
- Trade upfront expense for variable expense - Upfront expense refers to data centers, physical servers, and other resources that you would need to invest in before using them. Variable expense means you only pay for computing resources you consume instead of investing heavily in data centers and servers before you know how you’re going to use them.
- Stop spending money to run and maintain data centers. Computing in data centers often requires you to spend more money and time managing infrastructure and servers. A benefit of cloud computing is the ability to focus less on these tasks and more on your applications and customers.
- Stop guessing capacity (pay as you go)
- Benefit from massive economies of scale. By using cloud computing, you can achieve a lower variable cost than you can get on your own. Because usage from hundreds of thousands of customers can aggregate in the cloud, providers, such as AWS, can achieve higher economies of scale. The economy of scale translates into lower pay-as-you-go prices.
- Increase speed and agility
- Go global in minutes. The global footprint of the AWS Cloud enables you to deploy applications to customers around the world quickly, while providing them with low latency.
EC2 (Elastic Compute Cloud) - the service you use to get virtual servers, called instances (they are highly flexible, cost effective, and quick to set up)
If you stop an EC2 instance, you stop paying for its compute time.
Multi-tenancy - sharing hardware among multiple virtual machines; the VMs are isolated and secure from one another
You can vertically scale an EC2 instance and control the network access to it
To set up an EC2 instance
- LAUNCH - provision it by choosing the OS, applications, instance type, and hardware configuration (see the sketch after this list)
- CONNECT to it (in various ways, commonly from your own desktop)
- USE - install software, add storage, etc.
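A minimal sketch of the LAUNCH step from code using boto3 (the Python SDK). The AMI ID, key pair name, and region here are placeholders, not real values:

    # Launch a single EC2 instance (placeholder AMI ID and key pair).
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    response = ec2.run_instances(
        ImageId="ami-xxxxxxxxxxxxxxxxx",  # the OS/application image (placeholder)
        InstanceType="t3.micro",          # the hardware configuration
        KeyName="my-key-pair",            # placeholder key pair, used later to CONNECT
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])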
EC2 instance types (aka instance families)
- General Purpose - general, such as normal web server
- Compute (CPU) Optimized - Gaming server
- Memory Optimized - High performance DB
- Accelerated Computing - graphics intense, streaming
- Storage Optimized - high sequential read/write to large local data sets (e.g. data warehousing, distributed file systems)
Pricing Plans
- On Demand - good for getting started to figure out your usage
- Savings Plan - low prices if you commit to certain level of usage for 1 to 3 yrs
- Reserved Instances - for when you know your usage for 1 to 3 years (up to 70% off vs on-demand)
- Spot Instances - if your workload can withstand interruptions (batch jobs; AWS may shut them down temporarily to use the resources elsewhere)
- Dedicated Hosts - physical servers for your use only (the most expensive option)
Scaling EC2 - you can provision your resource usage to change throughout the day to match your business needs
Scaling up (vertical - a bigger instance) vs scaling out (horizontal - more instances)
Dynamic scaling - responds to changing demand in real time. Predictive scaling - automatically schedules the right number of instances based on predicted demand.
When you create an Auto Scaling group, you can set the minimum number of Amazon EC2 instances. The minimum capacity is the number of Amazon EC2 instances that launch immediately after you have created the Auto Scaling group. This is horizontal scaling
When you create an Auto Scaling group, you set the:
- minimum capacity
- desired capacity
- maximum capacity
AWS then automatically scales the group between these limits based on demand (see the sketch below)
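A minimal sketch of creating an Auto Scaling group with boto3; the group name, launch template, and Availability Zones are placeholders:

    # Create an Auto Scaling group with minimum, desired, and maximum capacity.
    import boto3

    autoscaling = boto3.client("autoscaling")

    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="my-asg",  # placeholder name
        LaunchTemplate={"LaunchTemplateName": "my-launch-template", "Version": "$Latest"},
        MinSize=1,          # minimum capacity
        DesiredCapacity=2,  # desired capacity
        MaxSize=4,          # maximum capacity
        AvailabilityZones=["us-east-1a", "us-east-1b"],
    )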
Load Balancing across EC2 instances - ELB (Elastic Load Balancing). All instances sit behind a single URL and ELB distributes incoming requests across them.
Messaging and Queuing - avoid tightly coupled architectures, where if one component fails the entire system fails. If App A and App B work together directly and B fails, then A will also fail.
Use a message queue to deal with this: App A sends requests to the queue, and if B fails the messages remain in the queue until B can process them.
There are two messaging services in AWS: Amazon Simple Notification Service (SNS)
- send, store, and receive messages between services at any volume
- uses pub/sub
Amazon Simple Queue Service (SQS)
- messages stay in the queue until they are processed (see the sketch below)
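A minimal SQS sketch with boto3, illustrating why the queue decouples A from B; the queue URL is a placeholder:

    # App A sends a message; App B receives and deletes it once processed.
    # Until it is deleted, the message stays in the queue (so B can fail and retry).
    import boto3

    sqs = boto3.client("sqs")
    queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # placeholder

    # App A: send
    sqs.send_message(QueueUrl=queue_url, MessageBody="order #42 placed")

    # App B: receive, process, then delete
    response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
    for msg in response.get("Messages", []):
        print(msg["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])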
You are responsible for updating your EC2 instances, setting up the scaling plan, and architecting them, unless you use the serverless approach (in which case you cannot see or access the underlying infrastructure). This is AWS Lambda. You can run code without provisioning an EC2 instance, and you only pay for the compute time used (the time the function is actually running). An example is a Lambda function that resizes uploaded images.
Upload your code into a Lambda function, then configure a trigger. The code then runs in a managed environment that is automatically scalable and highly available. It's not suited for long-running processes (like deep learning), only operations that finish in 15 minutes or less (a handler sketch is below).
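A sketch of what the image-resize example's handler could look like in Python; the event shape is the standard S3 trigger payload, and the actual resize logic is left out:

    # Shape of a Python Lambda handler invoked by an S3 upload trigger.
    def lambda_handler(event, context):
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print(f"Would resize s3://{bucket}/{key} here")  # resize logic omitted
        return {"statusCode": 200}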
Container - package for your code (Docker) that runs in isolation from other containers. Containers that work together are called clusters
ECS - Elastic Container Service
EKS - Elastic Kubernetes Service
You need to monitor, start, stop, and restart them. This is called orchestration. ECS and EKS help you orchestrate. You could run your containers on an EC2 instance (or multiple), but if you don't want to manage the instance (and you don't need access to the OS) you can use AWS Fargate (a serverless compute engine for containers).
AWS Fargate is good for when your containers meet these criteria:
- You host short running functions
- you need service oriented apps (?)
- you need event driven apps
- you don't need to provision servers
AWS Global Infrastructure - high availability and fault tolerance
AWS has Regions - data centers in large groups
Your data stays in the Region you choose (it does not get spread out to other Regions, for security reasons)
Some governments insist that certain data be stored in a Region within their borders
Choosing a Region
- Compliance (is your data required to stay in a region)
- Proximity to your customer base
- Feature availability
- Pricing (some regions are more expensive than others)
Latency - the time it takes data to travel from one place to another
Amazon creates availability zones for each region (which are physically separated for fault tolerance)
You should always run an app in at least 2 Availability Zones. ELB (Elastic Load Balancing) automatically distributes traffic across multiple Availability Zones.
Edge locations - sites where Amazon CloudFront stores cached copies of your data/content to reduce latency. It's a CDN!
Edge locations are separate from Regions (there are many more edge locations than Regions).
Amazon CloudFront - a global content delivery service (a CDN that caches content in the locations closest to your customers)
Amazon Route 53 Here's the Wikipedia definition: Amazon Route 53 (Route 53) is a scalable and highly available Domain Name System (DNS) service. "53" is a reference to the TCP/UDP port 53, where DNS server requests are addressed. In addition to being able to route users to various AWS services, including EC2 instances, Route 53 also enables AWS customers to route users to non-AWS infrastructure and to monitor the health of their application and its endpoints. Route 53's servers are distributed throughout the world. Amazon Route 53 supports full, end-to-end DNS resolution over IPv6. Recursive DNS resolvers on IPv6 networks can use either IPv4 or IPv6 transport to send DNS queries to Amazon Route 53
AWS Outposts - run AWS infrastructure and services inside your own data center (AWS delivers and manages the hardware, effectively extending a Region into your own facility)
You set up your AWS environment through API calls! To do this you could use:
- AWS Management Console
- AWS CLI
- AWS SDKs
- Various other tools
Managed tools for provisioning:
AWS Elastic Beanstalk - give your code to Beanstalk and it builds out your environment for you and can save environment configurations (you focus on the app, not the infrastructure)
AWS CloudFormation - build an environment as code; define the settings in a .json or .yaml template (see the sketch below)
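A minimal CloudFormation sketch via boto3; the stack name is a placeholder and the template just declares a single S3 bucket:

    # Create a stack from an inline JSON template.
    import json
    import boto3

    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {"NotesBucket": {"Type": "AWS::S3::Bucket"}},  # one bucket, auto-named
    }

    cloudformation = boto3.client("cloudformation")
    cloudformation.create_stack(StackName="my-demo-stack", TemplateBody=json.dumps(template))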
ELB (Elastic Load Balancing), SQS (Simple Queue Service), and SNS (Simple Notification Service) automatically spread across multiple Availability Zones.
Networking
Amazon Virtual Private Cloud (VPC) - you can create a virtual network for your EC2 instances. The instances can be divided into public and private subnets.
You assign IP address ranges to the subnets and can make them public or private.
Public facing resources - you must attach an internet gateway to your VPC to make it publicly available.
For private access, you can restrict which IP address ranges can reach the VPC by using a virtual private gateway (a VPN connection). AWS Direct Connect - set up a dedicated, private connection from your office/data center to your AWS VPC (you work with an AWS Direct Connect partner in your area).
Security
Network hardening - every packet that passes into or out of a subnet can be checked against a network access control list (ACL), based on where it came from and how it is trying to communicate. The default network ACL is stateless and allows all inbound and outbound traffic (so you need to set up rules to restrict anything).
Packet - a unit of data sent over a network
Security groups - you can allow certain types of traffic (e.g. HTTP) to a group of instances. They are virtual firewalls that control inbound and outbound traffic to an instance. By default they block all inbound traffic but allow all outbound traffic (see the sketch below).
If you have multiple instances in the same VPC you can use the same security group settings for them.
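A minimal sketch of creating a security group that lets HTTP in, using boto3; the VPC ID and group name are placeholders:

    # Create a security group and open inbound TCP port 80 to the world.
    # Everything else inbound stays blocked by default.
    import boto3

    ec2 = boto3.client("ec2")

    sg = ec2.create_security_group(
        GroupName="web-sg",            # placeholder name
        Description="Allow HTTP",
        VpcId="vpc-xxxxxxxx",          # placeholder VPC ID
    )
    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 80,
            "ToPort": 80,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }],
    )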
ACLs are stateless which means that any time traffic (packets) leave and/or return to a subnet they must pass ACL checks.
Security groups are stateful - they remember incoming packets and automatically allow the corresponding outbound response traffic back out.
Global Networking
Route 53 - AWS's DNS service (translates domain names into IP addresses). Route 53 can route requests based on various policies:
- latency based routing
- Geolocation
- Geoproximity
- Weighted round robin
You can register domain names via Route 53
Route 53 can direct a request to the appropriate Amazon CloudFront edge location (a CDN) to help with latency
EBS (Elastic Block Store)
Block-level storage - data stored as blocks of bytes (like a hard drive)
Instance store volumes - the local hard drive of the host an instance runs on. If you stop or terminate the instance, the data on its instance store volumes is lost.
EBS (Elastic Block Store) - virtual hard drives (volumes) that you can attach to your instances; the data persists between stops and starts of the instance. An EBS volume must be in the same Availability Zone as the instance it attaches to.
You can take snapshots of your EBS volumes to back them up. A snapshot is an incremental backup; it stores only the changes since the last snapshot (see the sketch below).
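A minimal snapshot sketch with boto3; the volume ID is a placeholder:

    # Take an (incremental) snapshot of an EBS volume.
    import boto3

    ec2 = boto3.client("ec2")
    snapshot = ec2.create_snapshot(
        VolumeId="vol-xxxxxxxxxxxxxxxxx",  # placeholder volume ID
        Description="Nightly backup",
    )
    print(snapshot["SnapshotId"])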
S3 (Simple Storage Service)
Store and retrieve unlimited amounts of data.
Files are stored as objects in buckets (think of a bucket as a folder). Each object is composed of a key, the data, and metadata. When you upload a file, you can set permissions and control its visibility (an upload sketch is below).
You can also use S3 versioning to track changes.
Max object size is 5TB
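A minimal upload sketch with boto3 showing the key/data/metadata pieces of an object; the bucket name is a placeholder:

    # Upload one object (key + data + user-defined metadata) to a bucket.
    import boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="my-example-bucket",        # placeholder bucket
        Key="reports/2024/summary.csv",    # the object's key
        Body=b"id,total\n1,42\n",          # the data
        Metadata={"department": "sales"},  # user-defined metadata
    )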
There are different tiers (storage classes) of S3 storage based on how often you need to retrieve the data and how available it needs to be.
- S3 Standard - Data is stored in at least 3 physical locations
- S3 static website hosting - your static files can be served directly as a website (a feature of S3 rather than a storage class)
- S3 Intelligent-Tiering - for data with unknown or changing access patterns
- S3 Standard-IA (Infrequent Access) - for data that is accessed infrequently but needs to be highly available when requested. Great for backups.
- S3 Glacier Instant Retrieval - for archived data that still requires immediate access.
- S3 Glacier Flexible Retrieval - for archived data you rarely need; offers several retrieval options (retrievals take minutes to hours)
- S3 Glacier Deep Archive - the lowest-cost storage class, for long-term archives accessed rarely; retrievals take hours (this came up in a review question but wasn't in my original notes)
Lifecycle policies - you can move data between the different storage classes automatically by setting policies (e.g., data that is 90 days old gets moved to Glacier; see the sketch below)
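A minimal lifecycle-rule sketch with boto3 implementing the 90-day example; the bucket name is a placeholder:

    # Move every object to Glacier Flexible Retrieval 90 days after creation.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-example-bucket",  # placeholder bucket
        LifecycleConfiguration={
            "Rules": [{
                "ID": "archive-after-90-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to every object
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            }]
        },
    )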
EBS vs S3:
- EBS - volumes up to 16TB; for large files that require complex read/write operations
- S3 - objects up to 5TB; high availability, web-enabled objects, regionally distributed (high durability), serverless (no instance necessary)
EFS (Elastic File System)
- managed file system (volume)
- multiple instances can access the files and scale automatically
- unlike EBS volumes, which are limited to a single availability zone, EFS is region wide
- multiple instances can attach to the volume
RDBMS (Relational Database Management Systems)
AWS supports
- MySQL
- PostgreSQL
- Oracle
- MS SQL Server
Lift and Shift - migrate your in-house db to AWS
Amazon RDS (Relational Database Service)
- Automated patching
- Backups
- Redundancy
- Failover
- Disaster recovery
Amazon Aurora - An enterprise-class relational database
- compatible with MySQL and PostgreSQL
- about 1/10th the cost of commercial databases
- great for high-availability needs
- automatic data replication (6 replicas/copies across 3 availability zones)
- up to 15 read replicas
- continuous backups to S3
- point in time recoveries
DynamoDB (a NoSQL key/value database; see the sketch after this list)
- A serverless database - you don't need to provision/patch/maintain a server
- Create tables of items, each item has attributes
- You don't need to worry about scaling (it's automatic)
- You don't need to worry about redundancy (it's automatic)
- It's schemaless (non-relational)
- Every item has a key (it's a key/value database)
- Great when item attributes may vary
- Millisecond response time
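A minimal DynamoDB sketch with boto3; the table name and attribute names are placeholders (it assumes a table with partition key customer_id already exists):

    # Write an item, then read it back by its key.
    import boto3

    table = boto3.resource("dynamodb").Table("Customers")  # placeholder table

    table.put_item(Item={"customer_id": "c-001", "name": "Ada", "plan": "pro"})
    item = table.get_item(Key={"customer_id": "c-001"})["Item"]
    print(item["name"])

Note how the non-key attributes (name, plan) are just part of the item - there is no schema to declare, which is why DynamoDB works well when item attributes vary.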
SQL vs No-SQL DBs
- Business analytics with complex joins - use RDS
Amazon Redshift
- A data warehousing service that you can use for big data analytics.
- It offers the ability to collect data from many sources and helps you to understand relationships and trends across your data
- Used for business intelligence (BI)
- For when the data does not change once it's been captured
AWS Database Migration Service (DMS)
- secure, easy way to migrate data from your other servers (relational or non-relational) to AWS
- your 'source' to aws 'target'
If the schema is NOT the same for both places:
- Use the AWS schema conversion tool
use cases
- Dev and test db migrations - so your developers and testers can work with copies of the live db
- Database consolidation
- Continuous replication - sending ongoing copies of your data to targets instead of a one-time migration
Additional Database services
Amazon DocumentDB (with MongoDB compatibility)
- great for content management systems, catalogs
Amazon Neptune - a graph database
Amazon Quantum Ledger Database (QLDB)
A ledger database service that, like a blockchain, keeps an immutable history of all changes to your data
Amazon Managed Blockchain
A distributed ledger
Amazon ElastiCache
Adds a caching layer on top of your database(s) to improve read times for common requests (supports Redis and Memcached)
Amazon DynamoDB Accelerator (DAX)
An in-memory cache for DynamoDB
AWS Security Model
The AWS shared responsibility model - AWS provides security mechanisms, the customer must use them.
AWS is responsible for security of the cloud, you are responsible for security in the cloud (your VPC).
AWS Responsibilities
- physical infrastructure
- hardware
- network (VPC)
- hypervisor for instances
- data replication (for some services)
Customer Responsibilities
- Instance operating system settings (ex: apply patches)
- data security
- IAM
- Client-side encryption
User Permissions and Access
AWS root user account - the primary admin of the account. You should use MFA for this account.
When you create an IAM user, it has NO permissions by default. You must explicitly allow it (least privilege principle).
You can attach an IAM Policy to a user (or multiple policies to a user)
Example Policy:
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "s3:ListBucket",
    "Resource": "arn:aws:s3:::AWS-SOME-BUCKET"
  }
}
The Effect must be Allow or Deny.
The Action is the AWS API action the policy applies to (here, s3:ListBucket, which lists the objects in a bucket).
The Resource is the ARN of the AWS resource the action applies to (here, a specific bucket).
You can manage users and their permissions by organizing them into IAM groups: apply policies to the group, then add users to the group (see the sketch below).
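A minimal groups-and-policies sketch with boto3; the group and user names are placeholders, and the policy ARN shown is the AWS managed read-only S3 policy:

    # Create a group, attach a managed policy to it, and add a user.
    import boto3

    iam = boto3.client("iam")

    iam.create_group(GroupName="developers")  # placeholder group name
    iam.attach_group_policy(
        GroupName="developers",
        PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    )
    iam.add_user_to_group(GroupName="developers", UserName="some-iam-user")  # placeholder user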
A role is a type of identity similar to a user, but it has no username or password. Roles are for granting permissions for temporary amounts of time.
When an identity assumes a role, the role's permissions supersede any and all other permissions associated with that identity (the role becomes the single source of permissions).
If you want users to be able to log in to AWS with their existing Active Directory accounts, you can associate AD users with a role (this is identity federation).
Direct quote from source: "Before an IAM user, application, or service can assume an IAM role, they must be granted permissions to switch to the role."
AWS Organizations
A centralized place to manage all AWS accounts associated with your organization
You can consolidate billing of all the accounts into one
You can organize accounts into hierarchical groups
Use Service control policies (SCPs) to control access to AWS services, resources, and APIs
You can group accounts into organizational units (OUs) and apply policies to them. OUs are useful when several accounts need overlapping permissions/policies: apply the policy once to the OU.
Compliance
You need to make sure that you are using AWS in compliance with the standards that apply to your specific business.
AWS meets many compliance standards for many industries. You can request documentation from AWS that shows AWS is following the relevant data security standards (your own compliance on top of that is still your responsibility).
AWS Artifact - a service that provides on-demand access to AWS security and compliance reports and select online agreements.
AWS Artifact Agreements - review, accept, and manage agreements with AWS (for example, agreements covering workloads subject to HIPAA or FERPA).
AWS Artifact Reports - reports from third-party audits that verify that AWS is compliant with various global, regional, and industry-specific standards and regulations.
Distributed Denial-of-service attacks (DDoS)
Types of DDoS attacks:
UDP flood - make requests to a service that returns a lot of data per request (like the national weather service) while spoofing the source address, so the large responses get sent to the target's server
HTTP-level attacks
Slowloris attack - an attacker opens and maintains many simultaneous HTTP connections by sending 'partial' HTTP requests
AWS already works to prevent DDoS attacks
- Security groups can help prevent these attacks (they operate at the AWS network level, so the malicious traffic never reaches the EC2 instances)
- Elastic Load Balancing can mitigate Slowloris attacks
AWS Shield - a service that protects applications against DDoS attacks. AWS Shield provides two levels of protection: Standard and Advanced.
- AWS Shield Standard - automatically protects you at no cost
- AWS Shield Advanced - a paid service that integrates with other AWS services to give you detailed diagnostics to detect and prevent DDoS attacks
AWS WAF (Web Application Firewall) - a firewall that allows you to monitor and filter the network requests reaching your web applications.
It uses a web access control list (acl) to allow/block requests (like a network acl).
Example: you can block requests from an IP address.
AWS WAF uses machine learning to recognize malicious requests.
WAF can be used in conjunction with CloudFront and Application Load Balancer.
AWS Key Management Service (KMS)
For creating and managing the cryptographic keys that are used to encrypt and decrypt your data (see the sketch after this list).
Encryption can be applied for
- Data at rest
- Data in transit
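A minimal KMS sketch with boto3; the key alias is a placeholder and must refer to a key you have already created in KMS:

    # Encrypt a small piece of data with a KMS key, then decrypt it again.
    import boto3

    kms = boto3.client("kms")

    ciphertext = kms.encrypt(
        KeyId="alias/my-app-key",         # placeholder key alias
        Plaintext=b"secret config value",
    )["CiphertextBlob"]

    plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
    print(plaintext)  # b'secret config value'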
DynamoDB automatically encrypts its data at rest
Amazon Inspector
Performs automated security tests and assessments
Amazon GuardDuty
Provides intelligent threat detection by monitoring network activity and account behavior.
You must enable it for your account (after a free trial it's billed based on the volume of data it analyzes)