AWS: Introduction to Databases
What is a Database?
Data access (reads and writes) is needed on a recurring basis.
It allows multiple users to access data for reads and writes.
It safeguards against unintentional mistakes, or unexpected power or hardware failure, and can recover the last known state.
A relational database is a data structure that allows you to link information from different tables, or different types of data buckets.
It normalizes data into structured tables.
A schema is used to strictly define tables, columns, indexes, and relations between tables.
- The same kind of item is always stored in the same place in a table (rows/columns)
- Relational databases can save data in multiple-joined tables (rows and columns)
Information about an entity can be distributed across multiple related tables.
Virtually all relational DBs use Structured Query Language (SQL)
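To make the schema and the table linking concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (customers, orders) and the sample rows are assumptions made up for illustration; they are not from any AWS service.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# The schema strictly defines tables, columns, an index, and the relation.
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    item        TEXT NOT NULL
);
CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

# A JOIN links the information stored in the two related tables.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()
print(rows)
```

The customer's name is stored once, and each order only carries a reference to it. That is the normalization idea in a very small form.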
They are best suited for OLTP (Online Transaction Processing).
- OLTP facilitates and manages transaction-oriented applications.
- An ATM transaction is an OLTP example.
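As a rough sketch of what an ATM-style transaction looks like, the example below uses sqlite3 again. The accounts and ledger tables and the withdraw helper are hypothetical names chosen for illustration. The point is that both writes succeed together or neither is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
CREATE TABLE ledger (
    account_id INTEGER NOT NULL,
    amount     INTEGER NOT NULL
);
INSERT INTO accounts (id, balance) VALUES (1, 100);
""")

def withdraw(conn, account_id, amount):
    """Record and apply a withdrawal atomically; return True if committed."""
    try:
        # `with conn` commits on success and rolls back on any exception.
        with conn:
            conn.execute(
                "INSERT INTO ledger (account_id, amount) VALUES (?, ?)",
                (account_id, -amount),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the ledger row
        # written just before it is rolled back as well.
        return False

ok_first = withdraw(conn, 1, 30)    # within the balance
ok_second = withdraw(conn, 1, 500)  # would overdraw the account

final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
ledger_rows = conn.execute("SELECT COUNT(*) FROM ledger").fetchone()[0]
```

After both calls only the first withdrawal is visible: the failed one leaves no trace in either table.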
Relational DBs are usually used in Enterprise applications/scenarios
- An exception is MySQL, which is used for web applications.
Inability to scale out (horizontally) to the needs of Web 2.0 & Big Data Applications.
Requires expensive hardware to scale up (vertically), since performance depends on the hardware.
Requires more investment to span a distributed system.
A data warehouse is a collection of data integrated from multiple sources, which is then subjected to long, complex queries for analytical and management reporting.
Introduction to Non-Relational Databases
In the simplest form, non-relational databases store data without a structured mechanism to link data from different tables to one another.
They are high-performance databases that, unlike relational DBs, are not schema-based.
Use non-structured or semi-structured data.
Storage and retrieval of data is modeled by means other than the tabular relations used in SQL DBs.
- Non-relational or NoSQL databases use a variety of data models, including document, graph, key-value and columnar.
They scale out (Horizontally) using distributed clusters to increase throughput without increasing latency.
- This meets today’s needs in social media, big data and IoT
Require only commodity (low-cost) hardware
Often deliver much faster performance than relational DBs for workloads that fit their data model
Easier to develop
Scalable in performance with high availability and resilience
Automatically spread the data over multiple servers
Multiple related values/ entries can be stored into one DB entry unlike RDBMS.
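The idea of spreading data over multiple servers can be sketched with simple hash partitioning. This is only a toy model: ShardedStore and the node names are hypothetical, the "servers" are plain dictionaries, and real NoSQL systems use more elaborate schemes (for example consistent hashing with replication).

```python
import hashlib

class ShardedStore:
    """Toy key-value store that spreads keys over several 'nodes'."""

    def __init__(self, node_names):
        # Each node is modeled as a plain dict standing in for a server.
        self.nodes = {name: {} for name in node_names}
        self._names = sorted(self.nodes)

    def node_for(self, key):
        # A stable hash decides which node owns the key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def put(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self.node_for(key)].get(key)

store = ShardedStore(["node-a", "node-b", "node-c"])

# One entry can hold several related values, unlike a single RDBMS row.
store.put("user:1", {"name": "Alice", "phones": ["555-0100", "555-0101"]})
for i in range(2, 11):
    store.put(f"user:{i}", {"name": f"user-{i}"})

total_entries = sum(len(data) for data in store.nodes.values())
```

Adding throughput then means adding nodes, not buying a bigger machine. Note that this naive modulo scheme would move most keys when the node count changes, which is one reason production systems use consistent hashing.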
Best suited for Online Analytical Processing (OLAP)
Example use cases are:
Business reporting for sales
Business process management
OLAP tools enable multi-dimensional analysis of data from multiple perspectives.
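A small sketch of what "multi-dimensional analysis" means in practice: the same sales facts are summed along different dimensions. The sales records and the roll_up helper below are made up for illustration; real OLAP tools do this over far larger data sets.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
sales = [
    {"region": "EU", "product": "laptop", "quarter": "Q1", "amount": 1200},
    {"region": "EU", "product": "phone",  "quarter": "Q1", "amount": 800},
    {"region": "EU", "product": "laptop", "quarter": "Q2", "amount": 1500},
    {"region": "US", "product": "laptop", "quarter": "Q1", "amount": 2000},
    {"region": "US", "product": "phone",  "quarter": "Q2", "amount": 900},
]

def roll_up(facts, dimensions, measure="amount"):
    """Sum the measure for every combination of the given dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)

# The same data viewed from different perspectives.
by_region = roll_up(sales, ["region"])
by_product = roll_up(sales, ["product"])
by_region_quarter = roll_up(sales, ["region", "quarter"])
```

Each call answers a different business question (sales per region, per product, per region and quarter) without changing the underlying data.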
- Columnar databases
- Document databases
- Graph databases
- In-memory key value databases
- Columnar Databases:
Optimized for reading and writing columns, not rows
Reduce the amount of data to be loaded from the disk
Scale out using distributed clusters (low-cost hardware)
Column-oriented storage drastically reduces the I/O requirements to read/write data
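Why column-oriented storage reduces I/O can be shown with a toy comparison. The sketch below counts how many values have to be touched to sum one column in a row layout versus a column layout. The data and function names are illustrative assumptions, and in-memory lists only approximate what happens on disk.

```python
# Row-oriented layout: every record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Berlin", "amount": 10},
    {"id": 2, "city": "Lisbon", "amount": 20},
    {"id": 3, "city": "Austin", "amount": 30},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

def total_from_rows(rows):
    """Sum 'amount', counting every value loaded along the way."""
    total, values_read = 0, 0
    for row in rows:
        values_read += len(row)  # the whole row is loaded
        total += row["amount"]
    return total, values_read

def total_from_columns(columns):
    """Sum 'amount' by loading only that one column."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)

row_total, row_values_read = total_from_rows(rows)
col_total, col_values_read = total_from_columns(columns)
```

Both layouts give the same total, but the row layout touches 9 values while the column layout touches 3. With hundreds of columns and millions of rows the gap becomes much larger.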
- On AWS you have a choice of running your own NoSQL DB using EC2 and EBS, or taking advantage of the fully managed AWS columnar databases such as:
- Open source databases such as
_ Apache Cassandra
_ Apache HBase
NoSQL Databases-Document DBs
Store semi-structured data as documents, typically in JSON or XML format
- This makes it easy to load objects along with their relevant data and properties
The schema for each NoSQL document can vary:
- More flexibility for developers, DB admins, and IT pros to organize and store data
- Reduces storage needs for optional values
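Here is a minimal sketch of documents whose schema varies from one to the next, using Python's json module. The three user documents and the with_field helper are invented for illustration and do not reflect any particular database's API.

```python
import json

# Three JSON documents in the same collection, each with different fields.
raw_documents = [
    '{"id": 1, "name": "Alice", "email": "alice@example.com"}',
    '{"id": 2, "name": "Bob", "phones": ["555-0100", "555-0101"]}',
    '{"id": 3, "name": "Carol", "address": {"city": "Lisbon"}}',
]
documents = [json.loads(raw) for raw in raw_documents]

def with_field(docs, field):
    """Return only the documents that actually contain the field."""
    return [doc for doc in docs if field in doc]

# Optional values are simply absent; no NULL columns are stored.
docs_with_phones = with_field(documents, "phones")
cities = [doc.get("address", {}).get("city") for doc in documents]
```

Bob's document carries a list of phones, Carol's carries a nested address, and Alice's has neither. No table had to be altered to allow that.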
Can scale “out” using distributed clusters of low-cost hardware to increase throughput without increasing latency.
On AWS you can run your own NoSQL document database or use the fully managed service from AWS
- DynamoDB is a fully managed document database (it also supports the key-value model)
It is extremely fast and delivers predictable performance with seamless scalability
Use it for applications that need consistent, single-digit millisecond latency at any scale.
It is a fully managed AWS NoSQL database that supports both document and key-value data models
It is a great fit for:
- Mobile Apps
- Web Apps
- Gaming Apps
- Ad-tech Apps
- Internet of Things (IoT)
NoSQL Databases-In-Memory Key-Value Store DBs
An in-memory key-value store is a NoSQL DB optimized for read-heavy application workloads.
- Examples are social networking, gaming, media sharing and Q&A portals
It is also optimized for compute-intensive workloads
- They improve application performance by storing critical pieces of data in memory for low latency access.
_ Cached information may include:
- Results of I/O intensive database queries or
- Results of computationally intensive calculations
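The caching idea can be sketched as a small "check the cache first, compute on a miss" helper. TTLCache and slow_report below are hypothetical names and the cache is just a Python dict, so treat this as an illustration of the pattern and not of any managed service's API.

```python
import time

class TTLCache:
    """Tiny in-memory key-value cache with a per-entry time to live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]  # served from memory
        # Miss or expired entry: do the expensive work and remember it.
        self.misses += 1
        value = compute()
        self._entries[key] = (value, now + self._ttl)
        return value

calls = []

def slow_report(customer_id):
    """Stand-in for an I/O-intensive query or a heavy calculation."""
    calls.append(customer_id)
    return {"customer_id": customer_id, "total": customer_id * 100}

cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("report:7", lambda: slow_report(7))
second = cache.get_or_compute("report:7", lambda: slow_report(7))
```

The second lookup returns the stored result without calling slow_report again, which is where the latency saving comes from. The time to live bounds how stale a cached answer can get.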
On AWS you can use your own or take advantage of fully managed AWS offerings
Amazon ElastiCache as an In-Memory Key-Value Store
- Is a web service that makes it easy to deploy, operate, and scale an in-memory cache
- It improves the performance of web applications by allowing faster retrieval of information from managed, in-memory caches.
Amazon ElastiCache automatically detects and replaces failed nodes
- No need to worry about node failures or replacement
ElastiCache supports two open-source in-memory caching engines.
In the next blog post, I will discuss AWS RDS.