Ten years ago, a database was needed to scale out for both reads and writes to meet the long-term needs of a growing business. The Amazon Dynamo database was developed, and WERNER VOGELS, CTO of Amazon.com, explains how it has grown since then.
It all started in 2004 when Amazon was running Oracle’s enterprise edition with clustering and replication. We had an advanced team of database administrators and access to top experts within Oracle. We were pushing the limits of what was a leading commercial database at the time and were unable to sustain the availability, scalability and performance needs that our growing Amazon business demanded.
Our straining database infrastructure on Oracle led us to evaluate if we could develop a purpose-built database that would support our business needs for the long term. We prioritized focusing on requirements that would support high-scale, mission-critical services like Amazon’s shopping cart, and questioned assumptions traditionally held by relational databases such as the requirement for strong consistency. Our goal was to build a database that would have the unbounded scalability, consistent performance and the high availability to support the needs of our rapidly growing business.
A deep dive on how we were using our existing databases revealed that they were frequently not used for their relational capabilities. About 70 percent of operations were of the key-value kind, where only a primary key was used and a single row would be returned. About 20 percent would return a set of rows, but still operate on only a single table.
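The two access patterns described above can be illustrated with a small sketch. This is a toy in-memory table, not Dynamo's design: the class `KeyValueTable`, its method names, and the split of the primary key into a `partition_key` and a `sort_key` are assumptions made purely for illustration.

```python
from collections import defaultdict


class KeyValueTable:
    """Toy in-memory table: partition key -> {sort key -> item}. Hypothetical."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put(self, partition_key, sort_key, item):
        # Store one item under its full primary key.
        self._partitions[partition_key][sort_key] = item

    def get(self, partition_key, sort_key):
        # Key-value lookup: a full primary key in, at most one item out.
        return self._partitions.get(partition_key, {}).get(sort_key)

    def query(self, partition_key):
        # Multi-row read that still touches only this one table:
        # every item sharing the partition key, ordered by sort key.
        rows = self._partitions.get(partition_key, {})
        return [rows[key] for key in sorted(rows)]


# Example: a shopping cart keyed by customer, then by item.
cart = KeyValueTable()
cart.put("customer-1", "item-a", {"qty": 2})
cart.put("customer-1", "item-b", {"qty": 1})

single_row = cart.get("customer-1", "item-a")  # the key-value case
row_set = cart.query("customer-1")             # the set-of-rows case
```

Neither operation needs a join or a second table, which is the point the usage analysis makes.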
With these requirements in mind, and a willingness to question the status quo, a small group of distributed systems experts came together and designed a horizontally scalable distributed database that would scale out for both reads and writes to meet the long-term needs of our business. This was the genesis of the Amazon Dynamo database.
The success of our early results with the Dynamo database encouraged us to write Amazon’s Dynamo whitepaper and share it at the 2007 ACM Symposium on Operating Systems Principles (SOSP conference), so that others in the industry could benefit. The Dynamo paper was well-received and served as a catalyst to create the category of distributed database technologies commonly known today as “NoSQL.”
Of course, no technology change happens in isolation, and at the same time NoSQL was evolving, so was cloud computing. As we began growing the AWS business, we realized that external customers might find our Dynamo database just as useful as we found it within Amazon.com. So, we set out to build a fully hosted AWS database service based upon the original Dynamo design.
The requirements for a fully hosted cloud database service needed to be at an even higher bar than what we had set for our Amazon internal system. The cloud-hosted version would need to be:
- Scalable – The service would need to support hundreds of thousands, or even millions of AWS customers, each supporting their own internet-scale applications.
- Secure – The service would have to store critical data for external AWS customers which would require an even higher bar for access control and security.
- Durable and Highly-Available – The service would have to be extremely resilient to failure so that all AWS customers could trust it for their mission-critical workloads as well.
- Performant – The service would need to be able to maintain consistent performance in the face of diverse customer workloads.
- Manageable – The service would need to be easy to manage and operate. This was perhaps the most important requirement if we wanted a broad set of users to adopt the service.
With these goals in mind, in January 2012 we launched Amazon DynamoDB, our cloud-based NoSQL database service designed from the ground up to support extreme scale, with the security, availability, performance and manageability needed to run mission-critical workloads.
Today, DynamoDB powers the next wave of high-performance, internet-scale applications that would overburden traditional relational databases. Many of the world’s largest internet-scale businesses such as Lyft, Tinder and Redfin as well as enterprises such as Comcast, Under Armour, BMW, Nordstrom and Toyota depend on DynamoDB’s scale and performance to support their mission-critical workloads.
DynamoDB is used by Lyft to store GPS locations for all their rides, Tinder to store millions of user profiles and make billions of matches, Redfin to scale to millions of users and manage data for hundreds of millions of properties, Comcast to power their XFINITY X1 video service running on more than 20 million devices, BMW to run its car-as-a-sensor service that can scale up and down by two orders of magnitude within 24 hours, Nordstrom for their recommendations engine reducing processing time from 20 minutes to a few seconds, Under Armour to support its connected fitness community of 200 million users, Toyota Racing to make real time decisions on pit-stops, tire changes, and race strategy, and another 100,000+ AWS customers for a wide variety of high-scale, high-performance use cases.
With all the real-world customer use, DynamoDB has proven itself on those original design dimensions:
- Scalable – DynamoDB supports customers with single tables that serve millions of requests per second, store hundreds of terabytes, or contain over 1 trillion items of data. In support of Amazon Prime Day 2017, the biggest day in Amazon retail history, DynamoDB served over 12.9 million requests per second. DynamoDB operates in all AWS regions (16 geographic regions now with announced plans for six more Regions in Bahrain, China, France, Hong Kong, Sweden), so you can have a scalable database in the geographic region you need.
- Secure – DynamoDB provides fine-grained access control at the table, item, and attribute level, integrated with AWS Identity and Access Management. VPC Endpoints give you the ability to control whether network traffic between your application and DynamoDB traverses the public Internet or stays within your virtual private cloud. Integration with AWS CloudWatch, AWS CloudTrail, and AWS Config enables support for monitoring, audit, and configuration management. SOC, PCI, ISO, FedRAMP, HIPAA BAA, and DoD Impact Level 4 certifications allow customers to meet a wide range of compliance standards.
- Durable and Highly-Available – DynamoDB maintains data durability and 99.99 percent availability in the event of a server, a rack of servers, or an Availability Zone failure. DynamoDB automatically re-distributes your data to healthy servers to ensure there are always multiple replicas of your data without you needing to intervene.
- Performant – DynamoDB consistently delivers single-digit millisecond latencies even as your traffic volume increases. In addition, DynamoDB Accelerator (DAX), a fully managed, highly available, in-memory cache, further speeds up DynamoDB response times from milliseconds to microseconds and can continue to do so at millions of requests per second.
- Manageable – DynamoDB eliminates the need for manual capacity planning, provisioning, server monitoring, software upgrades, security patching, infrastructure scaling, performance tuning, replication across distributed datacenters for high availability, and replication across new nodes for data durability. All of this is done for you automatically and with zero downtime so that you can focus on your customers, your applications, and your business.
- Adaptive Capacity – DynamoDB intelligently adapts to your table’s unique storage needs, scaling table storage up by horizontally partitioning it across many servers, or down with Time To Live (TTL), which deletes items that you have marked to expire. DynamoDB also provides Auto Scaling, which automatically adapts your table throughput up or down in response to actual traffic to your tables and indexes. Auto Scaling is on by default for all new tables and indexes.
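The Time To Live idea in the last point can be sketched in a few lines. This is a minimal illustration of expiry-by-timestamp, not how DynamoDB implements it; the function `sweep_expired` and the attribute name `expires_at` (holding epoch seconds) are hypothetical.

```python
import time


def sweep_expired(items, now=None):
    """Return only the items that have not yet expired.

    Each item may carry an 'expires_at' value in epoch seconds (a
    hypothetical attribute name). Items without it never expire.
    """
    now = time.time() if now is None else now
    return {
        key: item
        for key, item in items.items()
        if item.get("expires_at", float("inf")) > now
    }


# Example with a fixed clock so the result does not depend on the current time.
sessions = {
    "a": {"expires_at": 100},  # already expired at now=200
    "b": {"expires_at": 300},  # still live at now=200
    "c": {},                   # no expiry marked
}
live_sessions = sweep_expired(sessions, now=200)
```

Here the table shrinks on its own as marked items age out, which is the "scaling down" behaviour the text describes.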
Ten years ago, we never would have imagined the lasting impact our efforts on Dynamo would have. What started out as an exercise in solving our own needs in a customer-obsessed way turned into a catalyst for a broader industry movement towards non-relational databases and, ultimately, an enabler for a new class of internet-scale applications.
As we say at AWS, it is still Day One for DynamoDB. We believe we are in the midst of a transformative period for databases, and the adoption of purpose-built databases like DynamoDB is only getting started. We expect that the next ten years will see even more innovation in databases than the last ten. I know the team is working on some exciting new things for DynamoDB – I can’t wait to share them with you over the upcoming months.
How to rob a bank in the 21st century
In the early 1980s, South Africans were gripped by tales of the most infamous bank robbery gangs the country had ever known: The Stander Gang. The gang would boldly walk into banks, brandishing weapons, demand cash and simply disappear. These days, a criminal doesn’t even have to be in the same country as the bank he or she intends to rob. Cyber criminals are quite capable of emptying bank accounts without even stepping out of their own homes.
As we become more and more aware of cybersecurity and the breaches that can occur, we’ve become more vigilant. Criminals, however, are still going to follow the money and, even though security may be beefed up in many organisations, hackers are going to go for the weakest links. This makes it essential for consumers and enterprises to stay one step ahead of the game.
“Not only do these cyber bank criminals get away with the cash, they also end up damaging an organisation’s reputation and the integrity of its infrastructure,” says Indi Siriniwasa, Vice President of Trend Micro, Sub-Saharan Africa. “And sometimes, these breaches mean they get away with more than just cash – they can make off with data and personal information as well.”
Because the cyber criminals operate outside bricks and mortar, going for the cash register or robbing the customers is not where their misdeeds end. Bank employees – from the tellers to the CEO – are all fair game.
But how do they do it? Taking money out of an account is not the only way to steal money. Cyber criminals can zero in on the bank’s infrastructure, or hack into payment systems and even payment documents. Part of a successful operation for them may also include hacking into telecommunications to gain access to one-time pins or mobile networks.
“It’s not just about hacking,” says Siriniwasa. “It’s also about the hackers trying to get an ‘inside man’ in the bank who could help them or even using a person’s personal details to get a new SIM so that they can have access to OTPs. Of course, they also use the tried and tested method of phishing which continues to be exceptionally effective – despite the education in the market to thwart it.”
The range of malware and attacks available for gaining access to bank funds is strikingly vast, from web injection scripts and social engineering to targeting internal networks and point-of-sale systems. If there is an internet connection and a system, you can be assured that there is a cybercriminal trying to crack it. The impact on the bank itself is also massive, with reputations left in tatters and customers moving their business elsewhere.
“We see that cyber criminals use multi-faceted attacks,” says Siriniwasa. “This means that we need to come at security from multiple angles as well. Every single layer of an organisation’s online perimeter needs to be secured. Threat isolation is exceptionally important and having security with intrusion protection is vital. Again, vigilance on the part of staff and customers also goes a long way to preventing attacks. These criminals might not carry guns like Andre Stander and his gang, but they are just as dangerous – in fact – probably more so.”
Beaten by big data? AI is the answer
by ZAKES SOCIKWA, cloud big data and analytics lead at Oracle
In 2019, it’s estimated we’ll generate more data than we did in the previous 5,000 years. Data is fast becoming the most valuable asset of any modern organisation, and while most have access to their internal data, they continue to struggle to derive maximum value from it by effectively monetising the information that they hold.
The foundation of any analytics or Business Intelligence (BI) reporting capability is an efficient data collection system that ensures events/transactions are properly recorded, captured, processed and stored. Some of this information might not provide any valuable insights on its own but, analysed together with other sources, it might yield interesting patterns.
Big data opens up possibilities of enhancing internal sources with unstructured data and information from Internet of Things (IoT) devices. Furthermore, as we move to a digital age, more businesses are implementing customer experience solutions and there is a growing need for them to improve their service and personalise customer engagements.
The digital behaviour of customers, such as social media postings and the networks or platforms they engage with, further provides valuable information for data collection. Information gathering methods are being expanded to accommodate all types and formats of data, including images, videos, and more.
In the past, BI and Data Mining were left to highly technical and analytical individuals, but the introduction of data visualisation tools is democratising the analytics world. However, business users and report consumers often do not have a clear understanding of what they need or what is possible.
AI now embedded into day to day applications
To this end, artificial intelligence (AI) is finishing what business intelligence started. By gathering, contextualising, understanding, and acting on huge quantities of data, AI has given rise to a new breed of applications – one that’s continuously improving and adapting to the conditions around it. The more data that is available for the analysis, the better is the quality of the outcomes or predictions.
In addition, AI changes the productivity equation for many jobs by automating activities and adapting current jobs to solve more complex and time-consuming problems, from recruiters being able to source better candidates faster to financial analysts eliminating manual error-prone reporting.
This type of automation will not replace all jobs but will invent new ones. This enables businesses to reduce the time to complete tasks and the costs of maintenance, and will lead to the creation of higher-value jobs and new engagement models. Oracle predicts that by 2025, the productivity gains delivered by AI, emerging technologies, and augmented experiences could double compared to today’s operations.
According to IDC, worldwide revenues for big data and business analytics (BDA) solutions were expected to total $166 billion in 2018, and are forecast to reach $260 billion in 2022, with a compound annual growth rate of 11.9% over the 2017-2022 forecast period. It adds that two of the fastest growing BDA technology categories will be Cognitive/AI Software Platforms (36.5% CAGR) and Non-relational Analytic Data Stores (30.3% CAGR)¹.
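As a rough check on how these figures fit together, the compound growth arithmetic can be worked through directly. This sketch assumes four compounding years between the 2018 and 2022 figures, which is narrower than IDC’s stated 2017-2022 forecast period because no 2017 figure is quoted here. The function names `project` and `cagr` are illustrative.

```python
def project(value, rate, years):
    """Grow a value at a fixed annual rate, compounded yearly."""
    return value * (1 + rate) ** years


def cagr(start, end, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the text, in billions of US dollars.
revenue_2018 = 166
revenue_2022 = 260

# Growing the 2018 figure at 11.9% a year for four years.
projected_2022 = project(revenue_2018, 0.119, 4)

# The annual rate implied by the 2018 and 2022 figures alone.
implied_rate = cagr(revenue_2018, revenue_2022, 4)

print(f"Projected 2022 revenue: ${projected_2022:.1f} billion")
print(f"Implied annual growth: {implied_rate:.1%}")
```

Under that assumption, the projection lands close to the $260 billion forecast and the implied rate close to the quoted 11.9%.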
Informed decisions, now and in the future
As new layers of technology are introduced and more complex data sources are added to the ecosystem, the need for a tightly integrated technology stack becomes a challenge. It is advisable to choose your technology components very carefully and always have the end state in mind.
More development on emerging technologies such as blockchain, AI, IoT, virtual reality and others will probably be available on cloud first before coming on premise. For those organisations that are adopting public cloud, there are opportunities to consume the benefits of public cloud and drive down costs of doing business.
While the introduction of public cloud poses a challenge for data sovereignty and other regulations, technology providers such as Oracle have developed a ‘Cloud at Customer’ model that provides the full benefits of public cloud – but located on premise, within an organisation’s own data centre.
The best organisations will innovate and optimise faster than the rest. Best decisions must be made around choice of technology, business processes, integration and architectures that are fit for business. In the information marketplace, speed and informed decision making will be key differentiators amongst competitors.
¹ IDC Press Release, Revenues for Big Data and Business Analytics Solutions Forecast to Reach $260 Billion in 2022, Led by the Banking and Manufacturing Industries, According to IDC, 15 August 2018