Preface:
Notes taken while studying the book <AWS Certified Solutions Architect Associate All-in-One Exam Guide (Exam SAA-C01)>.
Because the book is entirely in English, most of the notes are written in English.
Index
- Storage
- 1. Three major categories
- 2. Amazon S3 (Simple Storage Service)
- (1) Advantages
- (2) Usage of S3 in real life
- (3) Basic concepts
- (4) Data consistency model
- (5) Performance considerations
- (6) Encryption
- (7) Access Control
- (8) S3 Storage Class
- (9) Versioning of Objects in Amazon S3
- (10) Amazon S3 Lifecycle Management
- (11) Amazon S3 Cross-region Replication
- (12) Static Website Hosting in Amazon S3
- 3. Amazon Glacier
- 4. Amazon EBS (Elastic Block Store)
- 5. Amazon EFS (Elastic File System)
- 6. On-premise Storage Integration with AWS
Storage
1. Three major categories
File: Amazon EFS
Block: Amazon EBS, Amazon EC2 Instance Store
Object: Amazon S3, Amazon Glacier
2. Amazon S3 (Simple Storage Service)
(1) Advantages
Simple, Scalable, Durable, Secure, High performance, Available, Low cost, Easy to manage, Easy integration
Notes:
· Easy integration with third parties, provides REST APIs and SDKs
· Scales easily to petabytes of data; storage is effectively unlimited
· 99.999999999% (11 nines) durability: data is stored redundantly across multiple data centers and on multiple devices, and can sustain the concurrent loss of data in two facilities
· Data upload -> can be automatically encrypted; Data transfer -> supports SSL
· Integrates with Amazon CloudFront, a content delivery web service, to distribute content with low latency
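To make the 11 nines figure concrete, here is a minimal arithmetic sketch. It assumes the durability figure applies per object per year and that losses are independent; the function names are made up for illustration.

```python
# Toy arithmetic: what 99.999999999% annual durability means in expectation.
# Assumption: the figure applies per object per year, and losses are independent.
DURABILITY = 0.99999999999


def expected_annual_losses(object_count: int, durability: float = DURABILITY) -> float:
    """Expected number of objects lost in one year."""
    return object_count * (1.0 - durability)


def years_per_single_loss(object_count: int, durability: float = DURABILITY) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_annual_losses(object_count, durability)


if __name__ == "__main__":
    n = 10_000_000  # hypothetical number of stored objects
    print(f"expected losses per year: {expected_annual_losses(n):.6f}")
    print(f"years per single loss:    {years_per_single_loss(n):.0f}")
```

With 10,000,000 objects this works out to roughly one lost object every 10,000 years on average.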
(2) Usage of S3 in real life
Backup
Popular among enterprises for storing backup files. Each file is stored as three copies distributed across multiple AZs within an AWS region.
Tape replacement
Static web site hosting
Application hosting
S3 can be used to host mobile and Internet-based apps.
Disaster recovery
Content distribution
Distribute content over the Internet (directly from S3 or via CloudFront). The content can be anything, such as files or media (photos, videos, …). S3 can also serve as a software delivery platform from which customers download software.
Data lake
popular in the world of big data as a big data store to keep all kinds of data. S3 is often used with EMR, Redshift, Redshift Spectrum, Athena, Glue, and QuickSight for running big data analytics.
(A data lake is a central place for storing massive amounts of data that can be processed, analyzed, and consumed by different business units in an organization.)
Private repository
S3 can back a private repository such as Git, Yum, or Maven.
(3) Basic concepts
Bucket: a container for storing objects in Amazon S3
Notes: The bucket name must be unique across all of Amazon S3 (even across multiple regions)
e.g.
object name: ringtone.mp3,
bucket name: newringtones,
file accessible URL: http://newringtones.s3.amazonaws.com/ringtone.mp3
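The URL pattern in the example above can be sketched as a small helper. This is only an illustration of the bucket-plus-key naming scheme from the notes. The function name is hypothetical, and the object is only reachable this way if it has been made publicly readable.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, scheme: str = "http") -> str:
    """Build a URL of the form <scheme>://<bucket>.s3.amazonaws.com/<key>."""
    # Percent-encode the key, but keep "/" so that prefixes stay readable.
    return f"{scheme}://{bucket}.s3.amazonaws.com/{quote(key, safe='/')}"


if __name__ == "__main__":
    print(s3_object_url("newringtones", "ringtone.mp3"))
```

Because the bucket name becomes part of the host name, it has to be unique across all of S3, not just within one region.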
Object: anything stored in an S3 bucket is called an object
Notes:
· An object is uniquely identified within a bucket by a name or key and by a version ID.
· Every object in a bucket has only one key.
Region: the object stored in the region never leaves the region unless you explicitly transfer it to a different region.
APIs: allow developers to write applications on top of S3.
Notes:
· Order of preference: REST over HTTPS > REST over HTTP > SOAP over HTTPS
· HTTP verbs correspond to S3 CRUD Operations
GET -> Read, PUT -> Create, DELETE -> Delete, POST -> Create
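As a rough illustration of the verb mapping, the sketch below builds (but does not send) plain HTTP requests against the example object URL used earlier in these notes. It only uses Python's standard library. `build_request` is a hypothetical helper, and real S3 requests normally also need authentication headers unless the object is public.

```python
from typing import Optional
from urllib.request import Request

# Example object URL from the bucket section of these notes.
OBJECT_URL = "http://newringtones.s3.amazonaws.com/ringtone.mp3"

# CRUD operation -> HTTP verb, following the mapping in the notes.
# POST also maps to Create in the notes; it is omitted to keep the sketch small.
OPERATION_TO_VERB = {"read": "GET", "create": "PUT", "delete": "DELETE"}


def build_request(operation: str, body: Optional[bytes] = None) -> Request:
    """Return an unsent, unsigned request for the given operation."""
    return Request(OBJECT_URL, data=body, method=OPERATION_TO_VERB[operation])


if __name__ == "__main__":
    for op in ("read", "create", "delete"):
        req = build_request(op, b"payload" if op == "create" else None)
        print(op, "->", req.get_method(), req.full_url)
```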
SDKs: In addition to APIs, lots of SDKs are available in various platforms including in browsers, on mobile devices (Android and iOS), and in multiple programming languages (Java, .NET, Node.js, PHP, Python, Ruby, Go, C++).
AWS CLI (command-line interface): unified tool to manage all your AWS services.
Notes:
· The CLI is often used with Amazon S3 in conjunction with the REST APIs and SDKs.
· All S3 operations can be performed from the command line through the CLI.
(4) Data consistency model
“write once, read many”
Amazon S3 Standard: data is stored in a minimum of three AZs
Amazon S3 One Zone-Infrequent Access: data is stored in a single AZ
“read-after-write consistency”
Eventually consistent system: the data is automatically replicated and propagated across multiple systems and across multiple AZs within a region, so sometimes you will have a situation where you won’t be able to see the updates or changes instantly.
Update case: if you PUT to an existing key, a subsequent read might return the old data or the updated data, but it will never return corrupted or partial data.
No object locking: if two requests write to the same key, the request with the latest timestamp wins. (Users can achieve object locking by building it into the application.)
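The "latest timestamp wins" rule can be pictured with a toy in-memory model. This is not the S3 API. `ToyBucket` and its methods are made up for illustration, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class ToyBucket:
    """In-memory stand-in for a bucket, used only to illustrate last-writer-wins."""

    def __init__(self) -> None:
        self._objects = {}  # key -> (timestamp, data)

    def put(self, key: str, data: bytes, timestamp: float) -> None:
        current = self._objects.get(key)
        # Keep the write only if it is at least as recent as what is stored.
        if current is None or timestamp >= current[0]:
            self._objects[key] = (timestamp, data)

    def get(self, key: str) -> bytes:
        return self._objects[key][1]


if __name__ == "__main__":
    bucket = ToyBucket()
    # Two writers target the same key; the later timestamp arrives first.
    bucket.put("ringtone.mp3", b"from writer B", timestamp=2.0)
    bucket.put("ringtone.mp3", b"from writer A", timestamp=1.0)
    print(bucket.get("ringtone.mp3"))
```

Neither writer is blocked or told about the other, which is why any locking has to be implemented in the application itself.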
(5) Performance considerations
Partition: If the workload you are planning to run on an Amazon S3 bucket is going to exceed 100 PUT/LIST/DELETE requests per second or 300 GET requests per second, follow the partitioning guidelines below to avoid a performance bottleneck.
S3 automatically partitions based on the key prefix (chapter2/image/… → ‘c’).
Change the chapter na