This is my set of notes for the AWS Developer Associate exam. It is based on having already done the SA Associate.
Times shown are as at 18-Jan-2019
- Revisit IAM / EC2 / S3 (5:26:18 - but content all covered in SA)
- Look at Serverless (2:26:45)
- Build something in Serverless (2 week project)
- DynamoDB (1:36:02)
- KMS and Other AWS Services (1:37:54)
- Developer Theory (2:50:50)
- Advanced IAM, Monitoring (1:15:42)
- Practice Papers (2 weeks - 5 mock tests allowing about 2 hours for each test and review)
Total Time: 10 weeks
- Free Test - June 2018
- Free Test CDA
- Test 1
- Test 2
- Test 3
- Test 4
- Test 5
- Use CloudTrail to monitor STS
- To use a role, call `sts:AssumeRole` to get access as that role
- If you then get an access error, use `aws sts decode-authorization-message` to get the detail
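- A rough CLI sketch of that flow (the role ARN, session name and encoded message are placeholders):
```bash
# Assume the role and get temporary credentials back (AccessKeyId, SecretAccessKey, SessionToken)
aws sts assume-role \
    --role-arn arn:aws:iam::123456789012:role/MyAppRole \
    --role-session-name debug-session

# If a call made with those credentials is denied, decode the encoded error message
aws sts decode-authorization-message --encoded-message <encoded-message-from-error>
```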
- Managed Policies (owned by AWS, cannot be edited => copy to a Customer Managed policy to change)
- Customer Managed Policy (exists only within your own account)
- Inline Policy (embedded directly in the User, Group or Role ==> Customer Managed generally recommended)
- Use `aws iam simulate-custom-policy` to test permissions; you need to get the context keys to supply to the CLI
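- A minimal sketch of the simulator calls (the policy file, action and context values are illustrative):
```bash
# Find which context keys the policy references so they can be supplied to the simulator
aws iam get-context-keys-for-custom-policy \
    --policy-input-list file://policy.json

# Simulate the policy against a specific action with a context value
aws iam simulate-custom-policy \
    --policy-input-list file://policy.json \
    --action-names s3:GetObject \
    --context-entries "ContextKeyName=aws:SourceIp,ContextKeyValues=203.0.113.10,ContextKeyType=ip"
```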
- Authenticate with external identity providers (e.g. Google, Facebook, Amazon, SAML (Active Directory) and OpenID Connect)
- Cognito acts as an identity broker between AWS and these providers (OAuth 2.0 flows)
- Can customise look and feel (logo labels etc)
- Can hold user pool as well as other providers (built in signup, signin and guest users)
- Can have password rules and MFA requirements
- Verification code flow
- Has groups as well as users with mapping to IAM
- User logs into the provider and gets a JWT token back (e.g. from Facebook); Cognito exchanges it for temporary, limited-privilege IAM credentials (see the CLI sketch at the end of this section)
- Push Synchronization with SNS to send user updates to devices
- SAML
- Security Assertion Markup Language 2.0
- Endpoint https://signin.aws.amazon.com/saml
- https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_enable-console-saml.html
- SAML Federation for organisation SSO to AD
- Streams
- Allows access to data stored in Cognito
- Can detect compromised credentials on User Pool
- Use `Block use` in advanced security; checks run on Sign In, Sign Up and Password Change
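- A rough sketch of the identity-pool credential exchange mentioned above (the pool ID and Facebook token are placeholders):
```bash
# Exchange the provider token (here Facebook) for a Cognito identity ID
aws cognito-identity get-id \
    --identity-pool-id "eu-west-1:11111111-2222-3333-4444-555555555555" \
    --logins graph.facebook.com=<facebook-jwt-token>

# Swap the identity ID + provider token for temporary, limited-privilege AWS credentials
aws cognito-identity get-credentials-for-identity \
    --identity-id <identity-id-from-previous-call> \
    --logins graph.facebook.com=<facebook-jwt-token>
```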
- Fully managed orchestration service
- Run a docker container
- IAM Role controls access to resources (like an EC2) set at Task level
- Part of the VPC so usual VPC access
- Can use Security Groups on instances to isolate
- Often stick an Application LB in front of ECS
- Static website URL: http://[bucket].s3-website-[region].amazonaws.com (website endpoints are HTTP only; put CloudFront in front for HTTPS)
- MultiPart upload - 3 steps: Initiate, Upload part, Complete
- Rough performance is 3,500 writes / second and 5,500 reads / second per prefix
- If encrypted will be bottlenecked by KMS as well (5,500 / second)
- Still use prefix if absolutely needed...
- To block unencrypted uploads, use a Bucket Policy that denies `PutObject` if there is no `x-amz-server-side-encryption` header (see the policy sketch below)
- 503 errors can be when have millions of versions of a file - check inventory
- CloudFront TTL: set a minimum TTL; the origin can add Cache-Control or Expires headers to control caching
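- A sketch of a bucket policy that denies `PutObject` when the `x-amz-server-side-encryption` header is missing (the bucket name is a placeholder):
```bash
cat > deny-unencrypted.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyUnencryptedPuts",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": { "Null": { "s3:x-amz-server-side-encryption": "true" } }
  }]
}
EOF
aws s3api put-bucket-policy --bucket my-bucket --policy file://deny-unencrypted.json
```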
- Can use a dead letter queue for failed function invocations
- Either SQS and SNS
- Function is retried twice if invoked asynchronously
- Need to grant the Lambda function permission to `SendMessage` to SQS or `Publish` to SNS
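- A sketch of wiring up a DLQ (function and queue names are placeholders; the function's execution role also needs `sqs:SendMessage` on the queue):
```bash
# Point failed asynchronous invocations at an SQS queue
aws lambda update-function-configuration \
    --function-name my-function \
    --dead-letter-config TargetArn=arn:aws:sqs:eu-west-1:123456789012:my-function-dlq
```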
- Lambda defaults to 3s for timeout (max 15 minutes)
- Step Functions orchestrate and co-ordinate discrete Lambda functions or tasks
- Lambda@Edge allows for code in front of CloudFront
- By default an alias points at a single version (pointer to a specific version)
- `routing-config` allows an alias to point at two versions and controls the percentage sent to each
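- A sketch of weighted alias routing (function name, alias and weights are illustrative):
```bash
# Send 5% of traffic on the "live" alias to version 2, the rest to the alias's primary version
aws lambda update-alias \
    --function-name my-function \
    --name live \
    --routing-config '{"AdditionalVersionWeights": {"2": 0.05}}'
```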
- Client ==> Method Request ==> Integration Request ==> Backend
- Backend ==> Integration Response ==> Method Response ==> Client
- Methods are the API interface / frontend
- Integrations are where the API interacts with the backend
- Fast, flexible NoSQL db
- Single digit ms latency at any scale
- Fully managed
- Both document and key value data models
- Backed by SSD storage
- Encryption must be chosen at creation
- Always across 3 DCs (avoids single point of failure)
- Global tables
- multi region
- multi master
- Consistency: Eventually Consistent (default, usually within 1s) or Strongly Consistent
- Model:
- Tables, Items (like a row), Attributes (like a column)
- Key: Name of Data, Value: data
- Documents in JSON, HTML or XML
- Stored by Primary Key (also called HashKey)
- 2 Types:
- Partition Key
- Used as part of hash to determine physical location data is stored
- No 2 items can have same Primary Key
- Composite Key
- Partition Key (HASH) and Sort Key (RANGE)
- 2 items may have same partition key but not primary key
- Same Partition Key stored together, then sorted by sort key
- Access managed by IAM
- Control access and creation
- Create role allowing you to get temporary access keys to access table
- IAM Condition allowing only own record access
- Use the UserID as the Partition Key and then allow access only if it matches the `dynamodb:LeadingKeys` condition value (see the policy sketch below)
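- A sketch of that fine-grained access policy (role/policy names are placeholders; the substitution variable assumes Cognito-federated users):
```bash
cat > dynamodb-own-items.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
    "Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/students",
    "Condition": {
      "ForAllValues:StringEquals": {
        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
      }
    }
  }]
}
EOF
aws iam put-role-policy --role-name MyAppRole \
    --policy-name OwnItemsOnly --policy-document file://dynamodb-own-items.json
```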
- Two classes of Table
- Standard
- Global
- Supports indexes
- Performance boost as per SQL
- 2 Types: Local or Global Secondary Index
- Local Secondary Index
- Created at table creation (and fixed)
- Same Partition Key, different sort key
- Global Secondary Index
- Can be added at any time
- Different partition key and sort key
- Transactions (Reinvent 2018)
- ACID (Atomic, Consistent, Isolated and Durable)
- Ideal properties of transaction
- All or nothing action across multiple tables
- Durable across system failure
- e.g. buying an item in a game
- TTL
- Time to live
- Expiry time for data
- Stored as attribute in Unix timestamp format
- Reduce storage cost
- Automatically deleted at some point after (within 48 hours)
- Can filter in scans/queries
- Manage TTL lets you set the TTL attribute and preview which items will expire
- By default all attributes are returned; use a `ProjectionExpression` to get a subset
- Query:
- finds by Primary Key (not necessarily unique)
- Optional Sort Key name and value to refine the results
- Always sorted by Sort Key
- Order can be reversed using the `ScanIndexForward` parameter (only on Queries, despite the name)
- Eventually Consistent by Default, need to specify for Strongly Consistent
- Can use secondary index as well
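- A sketch pulling these options together (table key and attribute names are illustrative):
```bash
aws dynamodb query \
    --table-name students \
    --key-condition-expression "StudentId = :id" \
    --expression-attribute-values '{":id": {"S": "1234"}}' \
    --projection-expression "FirstName, LastName" \
    --no-scan-index-forward \
    --consistent-read \
    --return-consumed-capacity TOTAL
```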
- Scan:
- Examines all entries in a table
- Can filter, but the filter is applied after reading the entire table
- Slower than Query
- Lower page size to stop Scan from blocking access to tables
- Can run as a parallel scan
- Logically divide the table into segments and scan them in parallel
- Impact on performance
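- A sketch of a parallel scan; each worker runs one segment (4 segments shown, table name illustrative):
```bash
# Worker 0 of 4 - run the other workers with --segment 1, 2 and 3
aws dynamodb scan \
    --table-name students \
    --total-segments 4 \
    --segment 0 \
    --page-size 100
```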
- Additionally Get and BatchGet
- Retrieve by Primary Key(s)
- Specify in Read and Write Capacity Units (per second)
- 1 WCU = 1x 1KB Write
- 1 RCU = 1x Strongly Consistent Read of 4KB or 2x Eventually Consistent Read of 4KB
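- Worked example (assuming 6 KB items read and 1.5 KB items written):
  - 10 strongly consistent reads/sec of 6 KB: ceil(6/4) = 2 RCU each => 20 RCU (eventually consistent would need 10 RCU)
  - 12 writes/sec of 1.5 KB: ceil(1.5/1) = 2 WCU each => 24 WCU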
- Support for autoscaling
- On-Demand Capacity pricing (Reinvent 2018)
- Pay per request
- Use if the workload is unknown or unpredictable
- Can change price model once a day
- Write costs more than read
- Partial items round up
- ProvisionedThroughputExceededExceptions
- Request rate too high for capacity
- AWS SDK will automatically retry using exponential back off (feature of every AWS SDK)
- If hand rolled then use exponential back off approach
- Use `ReturnConsumedCapacity` in a query to see the capacity used: NONE (off) / TOTAL / INDEXES
- In Memory cache for DynamoDB
- Write through cache - i.e. written at same time as Table update
- If present gets from Cache on read (Cache Hit)
- Otherwise does Eventually Consistent read (Cache Miss)
- Point API calls at DAX cluster rather than DynamoDB
- Read only boost
- 10x moves to microsecond performance
- Ideal for read-heavy or bursty workloads
- Can use ElastiCache as well if preferred
- Memcached or Redis
- Memcached is not Multi-AZ
- Redis has option of multi az redundancy
- Allows both Write Through or Lazy Load
- Have to watch for data becoming stale on Lazy Load
- Handle this by setting a Time To Live in ElastiCache (doesn't avoid it, just lowers the chance)
- Lazy Load has less data being held so more resource efficient
- Use for RDS
- First give the EC2 instance or Lambda permission via IAM (e.g. the DynamoDB full access managed policy)
- Create a table, e.g. (the key attribute name here is illustrative):
aws dynamodb create-table \
    --table-name students \
    --attribute-definitions AttributeName=StudentId,AttributeType=S \
    --key-schema AttributeName=StudentId,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
- Time order sequence of modification events (CUD)
- Stored for 24 hours
- Encrypted at rest
- Dedicated endpoint
- Trigger lambdas (lambda polls the stream)
- Near real time
- Creation of materialised views
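- A sketch of hooking a Lambda up to the stream (names and ARNs are placeholders; the function's role needs stream read permissions):
```bash
aws lambda create-event-source-mapping \
    --function-name process-student-stream \
    --event-source-arn arn:aws:dynamodb:eu-west-1:123456789012:table/students/stream/2019-01-18T00:00:00.000 \
    --starting-position LATEST \
    --batch-size 100
```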
- Key Management Service
- Main difference from CloudHSM is that KMS is shared hardware rather than dedicated
- Manage encryption keys
- 2 roles - use (encrypt/decrypt) or manage
- Can add external users as well as IAM Users and Groups
- Included in EBS, S3, RDS, Redshift, Workmail, Elastic Transcoder and others
- Part of IAM
- Not global - keys are regional
- When creating a key - Key material can be from KMS or external
- When deleting a key
- First disable the key
- Then schedule deletion (within 7 - 30 days)
- Customer Master Key
- Metadata: Alias, creation date, description and key state
- Key material
- Can not be exported (need to use Cloud HSM if you want to export)
- Envelope Encryption
- Envelope Key (key used to encrypt data)
- Envelope Key is encrypted by the customer master key
- The Data Key is the decrypted Envelope Key (the plaintext key actually used to encrypt the data)
- Command: `GenerateDataKey` or `GenerateDataKeyWithoutPlaintext`
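- A sketch of generating a data key under a CMK (the key alias is a placeholder); the response contains the plaintext key plus the same key encrypted under the CMK:
```bash
aws kms generate-data-key \
    --key-id alias/my-app-key \
    --key-spec AES_256
# Use the Plaintext key to encrypt data locally; store only the CiphertextBlob alongside the data
```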
- KMS API Calls
- aws kms encrypt --key-id <key-id or alias> --plaintext <file/text> --output text --query CiphertextBlob
- aws kms decrypt --ciphertext-blob <file/blob> --output text --query Plaintext
- aws kms re-encrypt --ciphertext-blob <file/blob> --destination-key-id <key-id>
- Takes the encrypted blob, decrypts it and re-encrypts it under the destination key
- aws kms enable-key-rotation --key-id <key-id>
- Key will automatically be rotated once a year
- Only if the key material is from KMS (not imported)
- Decouple components
- Oldest AWS service
- Webservice access to a queue
- 256KB of text (any format e.g. JSON/XML)
- Retention is 1 min to 14 days (default 4 days)
- Poll based system not a push based systems (use SNS)
- Can work as a buffer layer (producer faster than consumer)
- Could use length of queue to autoscale processors
- Messages marked as invisible while being processed
- If processing fails re-appears after visibility time out
- Default time out is 30 s
- Maximum time out is 12 hours
- Two Types of Queue
- Standard
- Delivered at least once
- Best effort ordering
- Nearly unlimited rate
- FIFO
- Exactly once
- First in, first out
- Max of 300 transactions / second (3,000 with batching)
- Support message groups
- Long Polling
- Doesn't return from the request until either the timeout expires or a message arrives
- Reduces costs
- Compatible with JMS (v1.1) - only standard queues
- Delay Queues - can delay a message for up to 15 minutes (default is 0). Can be set on individual messages.
- Can have message attributes as well as payload
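- A sketch of the main queue operations (queue name and URL are placeholders):
```bash
# Queue with a 60s visibility timeout and long polling enabled by default
aws sqs create-queue --queue-name my-queue \
    --attributes VisibilityTimeout=60,ReceiveMessageWaitTimeSeconds=20

aws sqs send-message --queue-url <queue-url> --message-body '{"job": 42}' --delay-seconds 30

# Long poll for up to 20 seconds, then delete once processed
aws sqs receive-message --queue-url <queue-url> --wait-time-seconds 20 --max-number-of-messages 10
aws sqs delete-message --queue-url <queue-url> --receipt-handle <receipt-handle-from-receive>
```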
- Web service to send notifications
- Push to mobile devices, SMS, email, SQS or HTTP endpoint
- Also can trigger a lambda
- Simple APIs
- Grouped by topics
- Allows recipients to subscribe
- Support different target types and more than one subscriber
- Redundant storage across multiple AZs
- Push based (pub-sub)
- Subscribers can add a filter policy when they subscribe to receive only a subset of messages (see the CLI sketch after the pricing list)
- Pricing
- $0.5 / 1m Requests
- $0.06 / 100k HTTP deliveries
- $0.75 / 100 SMS
- $2 / 100k emails
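- A sketch of adding a filter policy to an existing subscription (the ARN and attribute name are placeholders):
```bash
aws sns set-subscription-attributes \
    --subscription-arn arn:aws:sns:eu-west-1:123456789012:orders:1111-2222 \
    --attribute-name FilterPolicy \
    --attribute-value '{"order_type": ["priority"]}'
```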
- SES
- Simple Email Service
- Automated emails (e.g. Marketing email, shipping email)
- Can receive email as well (to S3 or trigger SNS / Lambda)
- Not subscription based; just need the target's email address
- https://www.youtube.com/watch?v=SrwxAScdyT0
- Deploying and scaling Web Apps (or Docker environments)
- Deployment
- Provisioning
- Load Balancing / AutoScaling
- Written in Java, .Net, PHP, NodeJS, Python, Ruby, Go, Docker
- Packer can be used to create custom platforms using an AMI and a platform.yaml
- Widely used platforms: Tomcat, NGINX, IIS, etc.
- Can be within a VPC
- Can include additional resources such as RDS
- GUI driven
- Control of EC2 type
- Elastic Beanstalk can fully manage the instances, or you can take over full EC2 management
- Managed platform updates
- OS, PHP etc
- Control times
- Pay for and control of deployed resources
- Elastic Beanstalk Deployment Policies
- All at Once
- All simultaneous
- All go out of service (not for prod systems, for test & dev)
- If fails would need to roll back
- Works on single instance
- Rolling
- Deploys in batches
- Performance impact as cluster shrinks by batch size
- If fails would need to roll back
- Rolling with Additional Batch
- Adds another batch
- Performance not impacted
- If fails would need to roll back
- Immutable
- Completely new fresh instances in new auto scaling group
- Once healthy moved to existing group and old terminated
- Preferred option for mission critical systems
- Roll back easy as just involves killing new ASG
- Works on single instance
- Blue/Green
- As per immutable but swap URL in DNS at end
- Create new environment, deploy to it, swap URLs
- Code and configuration in an S3 bucket
- Config written in JSON or YAML
- Config files are called `*.config` and live in an `.ebextensions` folder at the top level of the application source code bundle
- Use, for example, to change the instance size (see the sketch below)
- Precedence (lowest to highest): Default values, .ebextensions, Saved Config, Settings directly applied
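- A sketch of an instance-size override via `.ebextensions` (file name and instance type are illustrative):
```bash
mkdir -p .ebextensions
cat > .ebextensions/instances.config <<'EOF'
option_settings:
  aws:autoscaling:launchconfiguration:
    InstanceType: t3.small
EOF
# Zip this folder up with the application source bundle before deploying
```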
- Delete of environment deletes whole stack
- When using with RDS
- Good for Dev / Test as database coupled with environment
- For production, decouple and launch separately
- Additional Security Group on ASG
- Provides connection string configuration to the application servers
- Can control what happens to the database with the `Retention` setting - e.g. Create snapshot to keep the DB
- Every deployment creates a version
- Will hit version limit
- Use Application Version Lifecycle policy to delete old versions
- Each application can have multiple versions
- Periodic tasks use a `cron.yaml` file
- The EB CLI allows for monitoring and working with the environment
- Change environment to change runtime
- Streaming data from 1000s of sources simultaneously in small sizes (KBs)
- Load, Analyze Streaming Data
- 3 Services
- Streams
- Data stored in Shards
- 24hr to 7 day retention
- Number of Shards controls capacity of streams
- Per shard: writes up to 1,000 records / second (1 MB/s), reads up to 2 MB/s
- Consumers read from the shards and send the records on for processing
- Order is only guaranteed within a shard; use the `sequenceNumberForOrdering` parameter
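- A sketch of an ordered put (stream name, key and data are placeholders; pass the SequenceNumber returned by the previous put):
```bash
aws kinesis put-record \
    --stream-name my-stream \
    --partition-key user-42 \
    --data "click-event-payload" \
    --sequence-number-for-ordering <sequence-number-from-previous-record>
# CLI v2 may additionally need --cli-binary-format raw-in-base64-out for raw --data values
```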
- Firehose
- No need to worry about shards or streams
- Completely automated
- Can use Lambda to analyse / transform
- Sends data to S3
- No automatic retention - either straight to Lambda, S3, Elasticsearch (+Splunk)
- If going to Redshift, data is staged via S3 first
- Analytics
- SQL queries on the data as it arrives in Firehose or Streams
- Store in S3, Redshift, Elasticsearch
- Streams
- White Paper
- Cloud9 - Code Editor
- Code Commit (Git) ==> Code Build (Build System) ==> Tests ==> Code Deploy ==> Environments
- Code Pipeline links it all together
- CodeStar is the overall service for SDLC in the cloud
- Git
- HTTPS and SSH
- IAM based need to create Git credentials in IAM
- Notifications to SNS or CloudWatch
- Cross Account
- Create cross account IAM role
- Grant role access
- Provide ARN to users
- Fully managed code build service
- e.g. CodeCommit => Docker Build ==> ECR host on ECS
- Code in Git in Code Commit
- ECS Cluster running Linux ASG in a VPC
- Docker repository in ECR
- Docker build and then push to ECR (docker build, docker tag, docker push)
- Create a Task definition in ECS for image
- Create a Service to launch the Task in the Cluster (allows control of placement of instances as well)
- Controlled by `buildspec.yml`
- Defaults to the spec file in the source code
- Can be hard coded in the console (useful if you can't change the source code)
- Can be passed in the `start-build` command (`buildspecOverride`)
- Environment variables (`env`) section (key/value pairs)
- Constants or values from Parameter Store
- Phases (`phases`): install, pre_build, build, post_build
- Each phase is a sequence of shell commands
- Specify output `artifacts`
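- A minimal `buildspec.yml` sketch showing those sections (the commands are illustrative):
```bash
cat > buildspec.yml <<'EOF'
version: 0.2
env:
  variables:
    IMAGE_NAME: mywebapp
phases:
  install:
    commands:
      - echo Installing dependencies
  pre_build:
    commands:
      - echo Logging in to ECR
  build:
    commands:
      - docker build -t $IMAGE_NAME .
  post_build:
    commands:
      - echo Pushing image
artifacts:
  files:
    - '**/*'
EOF
```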
- CodeBuild runs on a docker image
- AWS provide some standard one (Ubuntu)
- Can be any image
- Shortened logs in console, full logs in CloudWatch
- To access VPC resources need to add VPC-specific configuration
- Deploy to EC2, on-premises or Lambda
- Integrates with Jenkins, GitHub and CodePipeline
- Works alongside config management tools such as Ansible, Puppet and Chef
- Two deployment methods: In-Place or Blue/Green
- In-Place (or Rolling update):
- One/Half/All instance at a time is upgraded
- Capacity loss, rollback involves redeploy
- Take out of Load Balancer during upgrade
- Not available for Lambda
- Blue-Green
- New instances created and deployed
- No capacity loss
- New instances added to ELB and then ready
- Much easier to switch back if problem (just ELB config)
- Deployment Configuration: a set of rules as well as success/fail conditions
- `appspec.yml` file defines the deployment actions
- Defines the parameters for the deployment
- Same location as the files to deploy
- EC2 (YAML):
  - `version`: for future use, currently 0.0
  - `os`: windows or linux
  - `files` to copy, from `source` to `destination`
  - `hooks` (lifecycle events, in run order):
    - `BeforeBlockTraffic`, `BlockTraffic`, `AfterBlockTraffic`
    - `ApplicationStop`
    - `DownloadBundle`
    - `BeforeInstall`, `Install` and `AfterInstall`
    - `ApplicationStart` and `ValidateService`
    - `BeforeAllowTraffic`, `AllowTraffic`, `AfterAllowTraffic`
    - Each hook runs a shell script (runas support and timeouts)
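- A minimal EC2 `appspec.yml` sketch (paths and script names are illustrative):
```bash
cat > appspec.yml <<'EOF'
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/html
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server.sh
  ValidateService:
    - location: scripts/health_check.sh
EOF
```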
- Lambda (YAML or JSON):
  - `version`: for future use, currently 0.0
  - `resources`: name and properties of the Lambda function
  - `hooks`: as per EC2, e.g. to validate the deployment at the `BeforeAllowTraffic` and `AfterAllowTraffic` stages
  - Can specify the version of the Lambda to be deployed
- CodeDeploy agent installed on EC2 or on-premises machines
- Service role in IAM for CodeDeploy controls permissions
- Download the agent from the AWS URL and run the install
- Runs as a service on Amazon Linux
- Example: deploy a web app from an S3 bucket to EC2
aws deploy create-application --application-name mywebapp
aws deploy push --application-name mywebapp --s3-location s3://bucket/stuff.zip --ignore-hidden-files
- Next:
- Define deployment group which is set of target machines
- Choose Deployment type
- Choose ASG, EC2 by Tag, OnPremise
- Point at Load Balancer for config
- Create the Deployment
- Gives additional options about file overwrites
- Can override default roll back configuration
- CI/CD service
- Orchestrate build, test and deployment on code changes
- Links with Lambda, Elastic Beanstalk, Cloud Formation, ECS
- Source code from CodeCommit, GitHub, or S3
- Trigger by CloudWatch alert
- S3 enable versioning and trigger on new version
- Can integrate with Jenkins
- Infrastructure as code - Manage, configure, provision
- Version control
- Consistency
- Saves time and effort
- Defined in templates (YAML or JSON)
- Store in S3 (can be uploaded directly in console into S3 bucket)
- Sample templates from AWS
- Free to use
- Under management tools
- Dependencies built in sequence
- Managed as Stacks (build and delete)
- Structure:
  - `AWSTemplateFormatVersion: "2010-09-09"`: specifies the file format (only one supported)
  - `Description`
  - `Metadata` - custom fields
  - `Parameters` - input values, provided at stack launch by the user
  - `Conditions` - custom expressions to allow the template to make decisions
  - `Mappings` - define values in a dictionary key (e.g. RegionMap)
  - `Transform` - include snippets of code from other files in S3 (https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/CHAP_TemplateQuickRef.html), code re-use
  - `Resources` - the AWS Resources you are deploying (required!)
  - `Outputs` - outputs from the stack, can be passed downstream
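- A minimal template sketch touching a few of those sections, plus the launch command (names are placeholders):
```bash
cat > template.yaml <<'EOF'
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal demo stack
Parameters:
  BucketName:
    Type: String
Resources:
  DemoBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Ref BucketName
Outputs:
  BucketArn:
    Value: !GetAtt DemoBucket.Arn
EOF

aws cloudformation create-stack --stack-name demo \
    --template-body file://template.yaml \
    --parameters ParameterKey=BucketName,ParameterValue=my-unique-bucket-name
```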
- Rollback on failure by default
- Can be disabled for debugging etc
- --disable-rollback from CLI/API
- Nested Stacks
- Allow for reuse of template within a template
- Part of the `Resources` section as a Stack type
- Must have a `TemplateURL`, can have `Parameters`
- For EC2 can use `cfn-init` to install software on instances
- StackSets extend stacks over multiple regions and accounts
- ChangeSets used to change running resources
- Cloudformation extension for serverless
- Simplified syntax
- Add a `Transform: AWS::Serverless-2016-10-31` line to the template after the format version
- Tells AWS it is a SAM template
- Place yaml file in same folder as Lambda code
- Package whole lot including YAML to S3 bucket
- SAM CLI
- sam package: packages all the local resources for a SAM app to an S3 bucket (applies the transform)
- sam deploy: deploys the serverless app using CF
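- A sketch of the two commands (bucket and stack names are placeholders):
```bash
sam package --template-file template.yaml \
    --s3-bucket my-artifact-bucket \
    --output-template-file packaged.yaml

sam deploy --template-file packaged.yaml \
    --stack-name my-serverless-app \
    --capabilities CAPABILITY_IAM
```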
- CloudWatch - monitor performance and logs
- CloudTrail - monitors API calls to AWS (think audit)
- AWS Config - records state of AWS environment and notifies of changes (think version control of environment)
- XRay allow you to trace through execution path
- Needs IAM permission to write to XRay
- Docker image in ECS
- Use interceptors to catch all HTTP requests
- XRay Lambda Environment Variables
  - `_X_AMZN_TRACE_ID` - trace ID for X-Ray
  - `AWS_XRAY_CONTEXT_MISSING` - what happens if there is no trace ID (error log by default)
  - `AWS_XRAY_DAEMON_ADDRESS` - IP:PORT of the X-Ray daemon
- To monitor ELBs look at the access logs
- Can aggregate in CloudWatch using statistic sets
- Can monitor Compute (ASG, ELBs, R53 Healthcheck), Storage and CDN, Databases and Analytics
- Also monitors billing (alerts on threshold)
- Gathers logs into log streams
- EC2 Monitors by default (CPU, Network, Disk IOs, Status Check) every 5 mins (cost for every 1 minute)
- RAM and disk space are not monitored by default - use a custom metric (minimum resolution of 1 minute for standard custom metrics)
- Stored indefinitely by default (including terminated instances)
- Can get data using GetMetricStatistics API
- Alert on all metrics, trigger an action including a lambda
- Can include outside resources of AWS - SSM agent and Cloudwatch agent
- Default EC2 metrics are 5 min (basic monitoring), detailed 1 min, High resolution metrics allow for 10s or 30s
- Alerts over multiple evaluation periods, control for period length, number of points
- For logs from EC2 need to install CloudWatch agent
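- A sketch of pushing a custom metric and reading stats back (namespace, instance ID and times are placeholders):
```bash
# Publish a custom RAM metric (e.g. from a cron job on the instance)
aws cloudwatch put-metric-data --namespace "Custom/EC2" \
    --metric-name MemoryUtilization --unit Percent --value 72.5

# Read back standard EC2 CPU statistics
aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --start-time 2019-01-18T00:00:00Z --end-time 2019-01-18T01:00:00Z \
    --period 300 --statistics Average
```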