General background
- AWS introduction
- AWS has regions and each region has multiple availability zones
- How to choose a region
- Compliance with laws and rules
- Proximity to customers: reduce latency
- Service availability: some services are not available in all regions
- Pricing should be considered
- AZs are isolated from each other and connected by a high-bandwidth, low-latency network
- Some services are global, others are region-scoped
- AWS management console is a web-based user interface
IAM and AWS CLI
- IAM introduction
- IAM = identity and access management, global service
- Root account is created by default, should not be used or shared
- Users should be grouped, groups can contain users but not other groups, users can be in multiple groups and don't have to belong to a group
- IAM permissions
- Users and groups can be assigned JSON documents called policies
- Policies define permissions for users
- The least privilege principle: don't give more permissions than a user needs
- Policy structure: version, id, statement (sid, effect, principal, action, resource, condition)
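To make the policy structure concrete, here is a minimal sketch of an identity-based policy built as a Python dictionary and serialized to JSON. The bucket name and the `Sid` value are hypothetical. The principal is left out because a policy attached to a user or group applies to that identity implicitly, and the optional id is omitted.

```python
import json

# Hypothetical bucket ARN, used only for illustration.
BUCKET_ARN = "arn:aws:s3:::example-bucket"

policy = {
    "Version": "2012-10-17",  # policy language version
    "Statement": [
        {
            "Sid": "AllowListAndRead",  # optional statement identifier
            "Effect": "Allow",          # Allow or Deny
            "Action": ["s3:ListBucket", "s3:GetObject"],
            "Resource": [BUCKET_ARN, f"{BUCKET_ARN}/*"],
            # Optional condition: only apply when the request uses TLS.
            "Condition": {"Bool": {"aws:SecureTransport": "true"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Granting only the two read actions on a single bucket is one way to follow the least privilege principle.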
- IAM security
- IAM password policy: length, character types, expiration, reuse prevention, etc.
- IAM MFA: multi-factor authentication
- Virtual MFA device
- Universal 2nd Factor(U2F) Security Key
- Hardware Key Fob MFA device
- Hardware Fob MFA device for AWS GovCloud
- How to access AWS
- AWS Management Console (also you can use AWS CloudShell)
- AWS CLI: use `aws --version` to check the version
- AWS SDKs (for code)
- IAM Roles
- IAM roles are sets of permissions assigned to AWS services
- Example: EC2 instance roles, Lambda function roles, Roles for CloudFormation
- IAM security Tools(audit purpose)
- IAM credentials report(account-level)
- IAM access advisor(user-level)
EC2 Fundamentals
You can set up budgets for your AWS account
EC2 Basics
- EC2 = Elastic Compute Cloud = Infrastructure as a Service(IaaS)
- EC2(rent virtual machines), EBS(store data on virtual drives), ELB(distribute load across machines), ASG(scaling the services using an auto-scaling group)
- You can use EC2 User data to bootstrap EC2 instances (the script runs only once, at first start, as the `root` user)
AWS services
- EC2 Fundamentals
- Introduction
- AWS EC2 is one of the most popular services
- EC2 (Elastic Compute Cloud) is an infrastructure as a service (IaaS)
- It covers renting virtual machines (EC2), storing data on virtual drives (EBS), distributing load across machines (ELB), and scaling services with an auto-scaling group (ASG)
- You need to specify OS, CPU, RAM, storage space(network-attached: EBS&EFS, or hardware: EC2 instance store), network card(speed of the card, public IP address), firewall rules(security group), bootstrap script(EC2 user data)
- You can bootstrap an EC2 instance with an EC2 user data script; it runs only once, at the instance's first start
- Create an EC2 instance
- Tips: create key pair for login, allow HTTP traffic from the internet, add setup user data, the public IP may change after restart, but the private IP remains the same
- EC2 instance types
- AWS instance naming convention: [instance class] + [generation].[size within instance class]
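The naming convention can be illustrated with a small parser. This is a sketch, not an official AWS tool: the `InstanceType` tuple and the `parse_instance_type` helper are hypothetical names, and the optional `extra` field (letters that can follow the generation, as in `c6gn`) is an assumption beyond the three-part convention stated above.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    instance_class: str  # e.g. "m", "c", "t"
    generation: int      # e.g. 5
    extra: str           # optional letters after the generation (assumption)
    size: str            # e.g. "2xlarge"


# [instance class] + [generation] + optional letters, then "." and the size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)$")


def parse_instance_type(name: str) -> InstanceType:
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    cls, gen, extra, size = match.groups()
    return InstanceType(cls, int(gen), extra, size)


print(parse_instance_type("m5.2xlarge"))
print(parse_instance_type("t2.micro"))
```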
- Security groups
- Security groups are the foundation of network security in AWS
- Security groups control how traffic is allowed into or out of EC2 instances (access to ports, authorized IP ranges, inbound/outbound traffic, etc.)
- Security groups can only contain allow rules; rules can reference IP ranges or other security groups
- A security group can be attached to multiple instances, and an instance can have multiple security groups
- It's good to maintain one separate security group for SSH access
- Troubleshooting: a timeout usually points to a security group problem; a "connection refused" error means the traffic reached the instance, so the application has an error or was not launched
- By default, all inbound traffic is blocked, all outbound traffic is allowed
- Classic ports to know
- port 22: SSH(secure shell), login into a Linux instance
- port 21: FTP(file transfer protocol), upload files into a file share
- port 22: SFTP(SSH file transfer protocol) upload files through SSH
- port 80: HTTP(hypertext transfer protocol), access unsecured websites
- port 443: HTTPS(hypertext transfer protocol secure), access secured websites
- port 3389: RDP(remote desktop protocol), login into a Windows instance
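The allow-only behaviour of security groups described above can be modelled in a few lines. This is a simplified local sketch, not how AWS evaluates rules internally: `InboundRule` and `is_inbound_allowed` are hypothetical names, the CIDR ranges are documentation addresses, and only port and source IP are checked.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    port: int
    cidr: str  # authorized source range, e.g. "0.0.0.0/0" for anywhere


def is_inbound_allowed(rules, source_ip: str, port: int) -> bool:
    """Return True if any allow rule matches; with no match, traffic is blocked."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.port == port and addr in ipaddress.ip_network(rule.cidr)
        for rule in rules
    )


rules = [
    InboundRule(port=22, cidr="203.0.113.0/24"),  # SSH from one office range only
    InboundRule(port=80, cidr="0.0.0.0/0"),       # HTTP from anywhere
    InboundRule(port=443, cidr="0.0.0.0/0"),      # HTTPS from anywhere
]

print(is_inbound_allowed(rules, "198.51.100.7", 80))  # True
print(is_inbound_allowed(rules, "198.51.100.7", 22))  # False
print(is_inbound_allowed(rules, "203.0.113.9", 22))   # True
```

Because there are no deny rules, an empty rule list blocks all inbound traffic, which matches the default described above.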
- Connect to EC2 instances
- The default user name is `ec2-user` (on Amazon Linux AMIs)
- Use a local SSH client with the private key of your key pair
- Use EC2 instance connect(a browser based SSH terminal) provided by AWS
- Do not run `aws configure` on EC2 instances; otherwise anyone who can log in to the instance can read your credentials. Attach IAM roles to EC2 instances instead: instance -> right click -> security -> modify IAM role
- EC2 instance purchasing options:
- On-demand instances: pay full price after launch
- Savings Plans: commit to a given amount of usage for a time period, get a discount
- Reserved instances: commit to a consistent instance configuration (like instance type, region), get a discount
- Spot instances: use spare EC2 capacity at a large discount; instances can be reclaimed by AWS
- Spot fleet: set of spot instances + optional on-demand instances
- Strategies to allocate spot instances within a spot fleet
- lowestPrice: from the pool with the lowest price, cost optimization, short workload
- diversified: distributed across all pools, great for availability, long workload
- capacityOptimized: pool with the optimal capacity for the number of instances
- Dedicated hosts: a physical server with EC2 instance capacity fully dedicated to your use
- Dedicated instances: instances in a virtual private cloud that run on hardware dedicated to a single customer
- Capacity reservations: allow customers to reserve capacity in a specific Availability Zone for any duration
- EC2 Solutions Architect Associate Level
- Public IP & private IP
- Networking has two sorts of IPs: IPv4 and IPv6
- At present, IPv4 is more common than IPv6
- IPv4 format: [0-255].[0-255].[0-255].[0-255]
- A public IP is reachable from anywhere on the Internet; a private IP is reachable only from within its private network
- Elastic IP
- When you stop and then start an EC2 instance, it can change its public IP
- If you need to have a fixed public IP for your instance, you need an Elastic IP
- An Elastic IP is a public IPv4 address you own as long as you don't release it; you can attach it to one instance at a time
- Try to avoid Elastic IPs; instead use a random public IP and register a DNS name for it
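The IPv4 format and the public/private distinction above can be checked with Python's standard `ipaddress` module. This sketch assumes "private" means the RFC 1918 ranges; `classify_ipv4` is a hypothetical helper name.

```python
import ipaddress

# Private IPv4 ranges defined by RFC 1918.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def classify_ipv4(text: str) -> str:
    """Return "private" or "public"; raise ValueError if the format is invalid."""
    addr = ipaddress.IPv4Address(text)  # enforces four octets, each 0-255
    if any(addr in net for net in PRIVATE_RANGES):
        return "private"
    return "public"


print(classify_ipv4("172.31.5.10"))  # private
print(classify_ipv4("8.8.8.8"))      # public
```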
- EC2 Placement groups
- Cluster
- Places instances into a low-latency group in a single Availability Zone
- Great network performance, but if the rack fails then all instances fail at the same time
- Suitable for big data jobs and applications that need low latency and high network throughput
- Spread
- Spreads instances across underlying hardware
- Can span Availability Zones, reducing the risk of simultaneous failure, but limited to 7 instances per Availability Zone per placement group
- Suitable for applications that need to maximize high availability, and for those that must be isolated from each other
- Partition
- Spreads instances across many different partitions within an Availability Zone
- Can span multiple Availability Zones; instances in different partitions do not share racks, so a partition failure can affect many instances but won't affect other partitions
- Suitable for distributed applications like HDFS, HBase, Kafka, Cassandra
- Elastic Network Interfaces(ENI)
- ENI is a logical component in a VPC that represents a virtual network card
- Attributes of ENI
- One primary private IPv4, one or more secondary IPv4
- One Elastic IPv4 per private IPv4
- One public IPv4
- One or more security groups
- A MAC address
- You can create ENIs independently and attach them to EC2 instances
- Bound to a specific Availability Zone
- EC2 Hibernate
- Stop/terminate an instance
- Stop: the data on disk (EBS) is kept intact until the next start
- Terminate: any EBS volumes set to be deleted on termination are lost
- On start
- For the first start: the OS boots & the EC2 User Data script is run
- Following starts: the OS boots up
- Then your application starts, caches get warmed up, and that can take time
- EC2 Hibernate
- The RAM state is preserved, so the instance boots much faster
- Under the hood, the RAM state is written to a file in the root EBS volume, and the root EBS volume must be encrypted
- EC2 hibernate supports many instance families, but there are still some hardware requirements; it is available for on-demand, reserved and spot instances
- An instance cannot be hibernated for more than 60 days
- EC2 Nitro
- This is the underlying platform for the next generation of EC2 instances
- Uses new virtualization technology, which allows for better performance and better underlying security
- EC2 vCPU
- Multiple threads can run on one CPU in EC2
- Each thread is represented as a virtual CPU
- EC2 instances come with a combination of RAM and CPU, you can optimize CPU options(set number of CPU cores, and threads per core) during instance launch
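The relationship between cores, threads and vCPUs is simple arithmetic. The sketch below assumes each thread counts as one vCPU, as stated above; `vcpu_count` is a hypothetical helper name.

```python
def vcpu_count(cores: int, threads_per_core: int) -> int:
    """Each thread is exposed as one vCPU, so vCPUs = cores * threads per core."""
    if cores < 1 or threads_per_core < 1:
        raise ValueError("cores and threads_per_core must be positive")
    return cores * threads_per_core


print(vcpu_count(cores=2, threads_per_core=2))  # 4
print(vcpu_count(cores=2, threads_per_core=1))  # 2 (multithreading disabled)
```

Lowering the threads per core to 1 is the kind of CPU option you can set at instance launch.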
- EC2 Capacity Reservations
- Capacity reservations ensure you have EC2 capacity when needed
- A reservation ends either manually or at a planned end date
- No long-term commitment is required
- Capacity access is immediate, you get billed as soon as it starts
- EC2 instance storage
- EBS
- EBS intro
- An EBS (Elastic Block Store) volume is a network drive (not a physical drive, so there may be a bit of latency) that you can attach to your instances while they run
- It allows your instances to persist data, even after their termination
- An EBS volume can be mounted to one instance at a time and is bound to a specific Availability Zone
- EBS can be detached from an EC2 instance and attached to another quickly
- EBS delete on termination attribute
- When you terminate an EC2 instance, the root EBS volume is deleted by default
- You can change the delete on termination attribute to false, so that the root EBS volume is not deleted when the instance is terminated
- You can also change the delete on termination attribute to false for any additional EBS volumes you attach to your instance
- EBS snapshots
- Make a backup of your EBS volume at a point in time
- Detaching the volume before taking a snapshot is not required, but it is recommended
- Snapshots can be copied across Availability Zones or regions
- Snapshots can be archived to reduce cost
- You can set up a recycle bin for snapshots to recover from accidental deletion
- EBS volume
- Types
- gp2/gp3 (general purpose SSD): general purpose SSD volume that balances price and performance
- In gp2, IOPS and throughput are linked to volume size; in gp3 you can set IOPS and throughput independently
- io1/io2 (provisioned IOPS SSD): highest performance SSD for low-latency or high-throughput workloads
- Great for IOPS-intensive workloads such as databases
- io1/io2 can increase provisioned IOPS (PIOPS) independently from storage size
- io2 has higher durability and more IOPS per GiB
- st1 (throughput optimized HDD): low cost volume for frequently accessed, throughput-intensive workloads
- st1 is throughput optimized HDD
- Suitable for big data, data warehouses, log processing
- sc1 (cold HDD): lowest cost volume designed for less frequently accessed workloads
- sc1 is cold HDD
- Suitable for scenarios where low cost is important
- Only gp2/gp3/io1/io2 can be used as root volumes
- EBS multi-attach
- Attach the same EBS volume to multiple EC2 instances
- Only applies to the io1/io2 family
- EBS encryption
- If you create an encrypted EBS volume, data is encrypted, snapshots are encrypted, all volumes created from the snapshots are encrypted
- Encryption has minimal impact on performance
- EBS encryption leverages keys from KMS
- Procedure to encrypt an unencrypted EBS volume
- Create an EBS snapshot of the current volume; this snapshot is unencrypted
- Copy the unencrypted snapshot to create a new encrypted snapshot
- Create a new volume from the encrypted snapshot, which is also encrypted
- Attach the encrypted volume to the original instance
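The four steps above can be sketched with the boto3 EC2 client. This is an outline under assumptions, not a production script: `encrypt_volume` is a hypothetical helper, `ec2` is expected to be a client such as `boto3.client("ec2")` (boto3 is not imported here so the function stays library-agnostic), and the new volume is attached under a free device name rather than swapped in place of the original.

```python
def encrypt_volume(ec2, volume_id, instance_id, availability_zone, region, device):
    """Create an encrypted copy of an EBS volume and attach it to an instance.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    # 1. Snapshot the current volume; this snapshot is unencrypted.
    snapshot = ec2.create_snapshot(VolumeId=volume_id)
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

    # 2. Copy the snapshot with encryption enabled (default KMS key assumed).
    encrypted = ec2.copy_snapshot(
        SourceSnapshotId=snapshot["SnapshotId"],
        SourceRegion=region,
        Encrypted=True,
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[encrypted["SnapshotId"]])

    # 3. Create a volume from the encrypted snapshot; it is encrypted too.
    volume = ec2.create_volume(
        SnapshotId=encrypted["SnapshotId"],
        AvailabilityZone=availability_zone,  # must match the instance's AZ
    )
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

    # 4. Attach the encrypted volume to the original instance.
    ec2.attach_volume(VolumeId=volume["VolumeId"], InstanceId=instance_id, Device=device)
    return volume["VolumeId"]
```

To replace the original volume under the same device name, it would have to be detached first; that step is left out of this sketch.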
- AMI
- An AMI (Amazon Machine Image) is a customization of an EC2 instance
- You can add your own software, configuration, OS, monitoring
- AMI is built for a specific region
- You can launch EC2 instances from
- A public AMI: provided by AWS
- Your own AMI: you make and maintain it yourself
- An AWS Marketplace AMI: a third-party AMI
- To create an AMI: right click on instance -> image and template -> create image
- EC2 instance store
- EBS volumes are network drives with good but limited performance
- If you need a high performance hardware drive, you can use EC2 instance store
- EC2 instance store has better I/O performance, suitable for buffer/cache/scratch data
- An EC2 instance store loses its data when the instance is stopped, so keep backups and replicas to avoid data loss
- Amazon EFS
- EFS (Elastic File System): a managed NFS (network file system) that can be mounted on many EC2 instances
- EFS works with EC2 instances in multiple Availability Zones
- EFS is highly available, scalable(the size of EFS can be scaled automatically), and expensive, it's compatible with Linux but not Windows
- Use security group to control access to EFS
- Performance mode (set at EFS creation time): general purpose mode (default) or max I/O mode; throughput mode is a separate setting
- EFS storage classes: standard tier for frequently accessed files, infrequent access(EFS-IA) tier for low cost
- Availability and durability: standard (multiple Availability Zones) or one zone (a single Availability Zone; combined with EFS-IA it gives EFS One Zone-IA, a large cost reduction)
- Comparison of EBS and EFS
- EBS volumes
- Can be attached to only one instance at a time. To migrate an EBS volume across Availability Zones, take a snapshot and restore it in the other Availability Zone
- Are locked to a single Availability Zone
- gp2: IO increases as disk size increases
- io1: IO is independent of disk size
- By default, the root EBS volume is deleted when the EC2 instance is terminated
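The link between gp2 size and IO can be expressed as a formula. This sketch assumes the published gp2 baseline of 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000 IOPS; it ignores burst credits, and `gp2_baseline_iops` is a hypothetical helper name.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB, clamped to [100, 16000]."""
    if size_gib < 1:
        raise ValueError("size_gib must be at least 1")
    return max(100, min(3 * size_gib, 16_000))


print(gp2_baseline_iops(10))      # 100 (floor applies)
print(gp2_baseline_iops(100))     # 300
print(gp2_baseline_iops(10_000))  # 16000 (ceiling applies)
```

With io1, by contrast, the provisioned IOPS figure is chosen directly rather than derived from the size.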
- EFS
- Can be mounted to multiple instances across multiple Availability Zones
- A good fit for sharing website files across instances
- EFS only works for Linux, but not Windows
- EFS has a higher price point than EBS; you can use EFS-IA for cost savings
- High availability and scalability