Ten years ago, Amazon needed a database that could scale out for both reads and writes to meet the long-term needs of a growing business. The result was the Amazon Dynamo database, and WERNER VOGELS, CTO of Amazon.com, explains how it has grown since then.
It all started in 2004 when Amazon was running Oracle’s enterprise edition with clustering and replication. We had an advanced team of database administrators and access to top experts within Oracle. We were pushing the limits of what was a leading commercial database at the time, yet were unable to sustain the availability, scalability, and performance that our growing Amazon business demanded.
Our straining database infrastructure on Oracle led us to evaluate whether we could develop a purpose-built database that would support our business needs for the long term. We prioritized requirements that would support high-scale, mission-critical services like Amazon’s shopping cart, and questioned assumptions traditionally held by relational databases, such as the requirement for strong consistency. Our goal was to build a database with the unbounded scalability, consistent performance, and high availability needed to support our rapidly growing business.
A deep dive into how we were using our existing databases revealed that they were frequently not used for their relational capabilities. About 70 percent of operations were of the key-value kind, where only a primary key was used and a single row would be returned. About 20 percent would return a set of rows, but still operate on only a single table.
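To make those two dominant access patterns concrete, here is a minimal sketch of what they look like when expressed against DynamoDB today using the boto3 SDK; the table names, keys, and values are hypothetical and purely illustrative, not taken from Amazon’s internal systems.

```python
import boto3

# Hypothetical tables and keys, for illustration only.
dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# The ~70% case: a key-value lookup that fetches a single item by primary key.
cart = dynamodb.get_item(
    TableName="ShoppingCart",
    Key={"CustomerId": {"S": "customer-123"}},
)
print(cart.get("Item"))

# The ~20% case: a query that returns a set of items, still from a single table.
items = dynamodb.query(
    TableName="CartItems",
    KeyConditionExpression="CustomerId = :cid",
    ExpressionAttributeValues={":cid": {"S": "customer-123"}},
)
print(items["Items"])
```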
With these requirements in mind, and a willingness to question the status quo, a small group of distributed systems experts came together and designed a horizontally scalable distributed database that would scale out for both reads and writes to meet the long-term needs of our business. This was the genesis of the Amazon Dynamo database.
The success of our early results with the Dynamo database encouraged us to write Amazon’s Dynamo whitepaper and share it at the 2007 ACM Symposium on Operating Systems Principles (SOSP), so that others in the industry could benefit. The Dynamo paper was well received and served as a catalyst for the category of distributed database technologies commonly known today as “NoSQL.”
Of course, no technology change happens in isolation, and at the same time NoSQL was evolving, so was cloud computing. As we began growing the AWS business, we realized that external customers might find our Dynamo database just as useful as we found it within Amazon.com. So, we set out to build a fully hosted AWS database service based upon the original Dynamo design.
The bar for a fully hosted cloud database service had to be even higher than the one we had set for our internal Amazon system. The cloud-hosted version would need to be:
- Scalable – The service would need to support hundreds of thousands, or even millions of AWS customers, each supporting their own internet-scale applications.
- Secure – The service would have to store critical data for external AWS customers, which would require an even higher bar for access control and security.
- Durable and Highly-Available – The service would have to be extremely resilient to failure so that all AWS customers could trust it for their mission-critical workloads as well.
- Performant – The service would need to be able to maintain consistent performance in the face of diverse customer workloads.
- Manageable – The service would need to be easy to manage and operate. This was perhaps the most important requirement if we wanted a broad set of users to adopt the service.
With these goals in mind, in January 2012 we launched Amazon DynamoDB, our cloud-based NoSQL database service designed from the ground up to support extreme scale, with the security, availability, performance, and manageability needed to run mission-critical workloads.
Today, DynamoDB powers the next wave of high-performance, internet-scale applications that would overburden traditional relational databases. Many of the world’s largest internet-scale businesses such as Lyft, Tinder and Redfin as well as enterprises such as Comcast, Under Armour, BMW, Nordstrom and Toyota depend on DynamoDB’s scale and performance to support their mission-critical workloads.
Lyft uses DynamoDB to store GPS locations for all of its rides; Tinder stores millions of user profiles and makes billions of matches; Redfin scales to millions of users and manages data for hundreds of millions of properties; Comcast powers its XFINITY X1 video service running on more than 20 million devices; BMW runs its car-as-a-sensor service, which can scale up and down by two orders of magnitude within 24 hours; Nordstrom’s recommendations engine has cut processing time from 20 minutes to a few seconds; Under Armour supports its connected fitness community of 200 million users; and Toyota Racing makes real-time decisions on pit stops, tire changes, and race strategy. Another 100,000+ AWS customers rely on DynamoDB for a wide variety of high-scale, high-performance use cases.
With all the real-world customer use, DynamoDB has proven itself on those original design dimensions:
- Scalable – DynamoDB supports customers with single tables that serve millions of requests per second, store hundreds of terabytes, or contain over 1 trillion items of data. In support of Amazon Prime Day 2017, the biggest day in Amazon retail history, DynamoDB served over 12.9 million requests per second. DynamoDB operates in all AWS regions (16 geographic regions today, with six more announced, including Bahrain, China, France, Hong Kong, and Sweden), so you can have a scalable database in the geographic region you need.
- Secure – DynamoDB provides fine-grained access control at the table, item, and attribute level, integrated with AWS Identity and Access Management (a policy sketch follows this list). VPC Endpoints give you the ability to control whether network traffic between your application and DynamoDB traverses the public Internet or stays within your virtual private cloud. Integration with AWS CloudWatch, AWS CloudTrail, and AWS Config enables support for monitoring, auditing, and configuration management. SOC, PCI, ISO, FedRAMP, HIPAA BAA, and DoD Impact Level 4 certifications allow customers to meet a wide range of compliance standards.
- Durable and Highly-Available – DynamoDB maintains data durability and 99.99 percent availability in the event of a server, a rack of servers, or an Availability Zone failure. DynamoDB automatically redistributes your data to healthy servers to ensure there are always multiple replicas of your data, without you needing to intervene.
- Performant – DynamoDB consistently delivers single-digit millisecond latencies even as your traffic volume increases. In addition, DynamoDB Accelerator (DAX), a fully managed, highly available, in-memory cache, further speeds up DynamoDB response times from milliseconds to microseconds, and can continue to do so at millions of requests per second.
- Manageable – DynamoDB eliminates the need for manual capacity planning, server provisioning and monitoring, software upgrades, security patching, infrastructure scaling, performance tuning, replication across distributed data centers for high availability, and replication across new nodes for data durability. All of this is done for you automatically and with zero downtime, so that you can focus on your customers, your applications, and your business.
- Adaptive Capacity – DynamoDB intelligently adapts to your table’s unique storage needs, scaling your table’s storage up by horizontally partitioning your data across many servers, or down by using Time To Live (TTL) to delete items that you have marked to expire. DynamoDB also provides Auto Scaling, which automatically adjusts your table’s throughput up or down in response to actual traffic to your tables and indexes, and which is on by default for all new tables and indexes (see the configuration sketch after this list).
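As a hedged illustration of the fine-grained access control mentioned in the Secure bullet above, the sketch below shows an IAM policy document, written here as a Python dictionary, that limits a caller to reading only the items whose partition key matches their own federated identity. The table name, account ID, and key choice are assumptions made for illustration, not values from the article.

```python
import json

# A sketch of DynamoDB fine-grained access control: the policy allows reads
# only on items whose leading (partition) key equals the caller's Cognito
# identity. Table name and account ID are hypothetical.
fine_grained_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/UserProfiles",
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Restrict access to items keyed by the caller's own identity.
                    "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
                }
            },
        }
    ],
}

print(json.dumps(fine_grained_policy, indent=2))
```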
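Similarly, as a sketch of the TTL and Auto Scaling behaviour described in the Adaptive Capacity bullet, the following boto3 calls enable item expiry on a hypothetical table and attach a target-tracking scaling policy to its read capacity. The table name, capacity limits, and 70 percent utilization target are illustrative assumptions.

```python
import boto3

dynamodb = boto3.client("dynamodb")
autoscaling = boto3.client("application-autoscaling")

# Enable Time To Live: items whose "expires_at" epoch timestamp has passed
# are deleted automatically. Table and attribute names are hypothetical.
dynamodb.update_time_to_live(
    TableName="SessionData",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# Register the table's read capacity as a scalable target, then attach a
# target-tracking policy that keeps consumed capacity near 70% utilization.
autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/SessionData",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=5,
    MaxCapacity=500,
)
autoscaling.put_scaling_policy(
    PolicyName="SessionDataReadScaling",
    ServiceNamespace="dynamodb",
    ResourceId="table/SessionData",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
    },
)
```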
Ten years ago, we never would have imagined the lasting impact our efforts on Dynamo would have. What started out as an exercise in solving our own needs in a customer-obsessed way turned into a catalyst for a broader industry movement towards non-relational databases and, ultimately, an enabler for a new class of internet-scale applications.
As we say at AWS, it is still Day One for DynamoDB. We believe we are in the midst of a transformative period for databases, and the adoption of purpose-built databases like DynamoDB is only getting started. We expect that the next ten years will see even more innovation in databases than the last ten. I know the team is working on some exciting new things for DynamoDB – I can’t wait to share them with you over the coming months.
VoD cuts the cord in SA
Some 20% of South Africans who sign up for a subscription video on demand (SVOD) service such as Netflix or Showmax do so with the intention of cancelling their pay television subscription.
That’s according to GfK’s international ViewScape survey*, which this year covers Africa (South Africa, Kenya and Nigeria) for the first time.
The study—which surveyed 1,250 people representative of urban South African adults with Internet access—shows that 90% of the country’s online adults use at least one online video service and that just over half pay to view digital online content. The average user spends around seven hours and two minutes a day consuming video content, with broadcast television accounting for just 42% of the time South Africans spend in front of a screen.
Consumers in South Africa spend nearly as much of their daily viewing time – 39% of the total – watching free digital video sources such as YouTube and Facebook as they do watching linear television. People aged 18 to 24 spend more than eight hours a day watching video content, and they devote more of that time to free digital video than older viewers do.
Says Benjamin Ballensiefen, managing director for Sub Sahara Africa at GfK: “The media industry is experiencing a revolution as digital platforms transform viewers’ video consumption behaviour. The GfK ViewScape study is one of the first to not only examine broadcast television consumption in Kenya, Nigeria and South Africa, but also to quantify how linear and online forms of content distribution fit together in the dynamic world of video consumption.”
The study finds that just over a third of South African adults are using subscription video on demand (SVOD) services, with only 16% of SVOD users subscribing to multiple services. Around 23% use pay-per-view platforms such as DSTV Box Office, while about 10% download pirated content from the Internet. Around 82% still sometimes watch content on disc-based media.
“Linear and non-linear television both play significant roles in South Africa’s video landscape, though disruption from digital players poses a growing threat to the incumbents,” says Molemo Moahloli, general manager for media research & regional business development at GfK Sub Sahara Africa. “Among most demographics, usage of paid online content is incremental to consumption of linear television, but there are signs that younger consumers are beginning to substitute SVOD for pay-television subscriptions.”
New data rules raise business trust challenges
When the General Data Protection Regulation comes into effect on May 25th, financial services firms will face a potential new threat to their ongoing efforts to build strong customer relationships, writes DARREL ORSMOND, Financial Services Industry Head at SAP Africa.
The regulation – dubbed GDPR for short – is aimed at giving European citizens control back over their personal data. Any firm that creates, stores, manages or transfers personal information of an EU citizen can be held liable under the new regulation. Non-compliance is not an option: the fines are steep, with a maximum penalty of €20-million – or nearly R300-million – for transgressors.
GDPR marks a step toward strengthening individual rights relative to large corporates and states, preventing the latter from using and abusing personal information at their discretion. Considering the prevailing trust deficit – one global EY survey found that 60% of global consumers worry about hacking of bank accounts or bank cards, and 58% worry about the amount of personal and private data organisations hold about them – the new regulation comes at an opportune time. But it is almost certain to disrupt normal business practices when implemented, and therein lies both a threat and an opportunity.
The fundamentals of trust
GDPR is set to disturb two fundamental factors underpinning the implicit trust between financial services providers and their customers. Firstly, customers will suddenly be challenged to verify that what they thought companies were already doing – storing and managing their personal data in a manner that respects their privacy – is actually happening. Secondly, the steady stream of stories about companies mistreating customer data or exposing customers through security breaches increases the likelihood that customers will seek tangible reassurance from their providers that their data is stored correctly.
The recent news of Facebook’s indiscriminate sharing of the personal data of 50 million of its members with an outside firm has not only led to public outcry but could cost the company $2-trillion in fines should the Federal Trade Commission choose to pursue the matter to its fullest extent. The matter of trust also extends beyond personal data: in EY’s 2016 Global Consumer Banking Survey, less than a third of respondents had complete trust that their banks were being transparent about fees and charges.
This is forcing companies to reconsider their role in building and maintaining trust with their customers. In any customer relationship, much is done on the basis of implicit trust. A personal banking customer will enjoy a measure of familiarity that often provides them with some latitude – for example when applying for access to a new service or an overdraft facility – that can save them a lot of time and energy. Under GDPR and South Africa’s POPI Act, this process is drastically complicated: banks may now be obliged to obtain permission to share customer data between different business units (for example because the units are part of different legal entities and have not expressly received permission). A customer may now allow a bank to use their personal data in risk-scoring models, but prevent it from determining whether they qualify for private banking services.
What used to happen naturally within standard banking processes may be suddenly constrained by regulation, directly affecting the bank’s relationship with its customers, as well as its ability to upsell to existing customers.
The risk of compliance
Are we moving to an overly bureaucratic world where even the simplest action is subject to a string of onerous processes? Compliance officers are already embedded within every function in a typical financial services institution, as well as at management level. Often the reporting of risk processes sits outside formal line functions and ends up going straight to the board. This can have a stifling effect on innovation, with potentially negative consequences for customer service.
A typical banking environment is already creaking under the weight of close to 100 acts, which makes it difficult to take the calculated risks needed to develop and launch innovative new banking products. Entire new industries could now emerge, focusing purely on the matter of compliance and associated litigation. GDPR already requires the services of Data Protection Officers, but the growing complexity of regulatory compliance could add a swathe of new job functions and disciplines. None of this points to the type of innovation that the modern titans of business are renowned for.
A three-step plan of action
So how must banks and other financial services firms respond? I would argue there are three main elements to successfully navigating the immediate impact of the new regulations:
Firstly, ensuring that the technologies you use to secure, manage and store personal data are sufficiently robust. Modern financial services providers have a wealth of customer data at their disposal, including unstructured data from non-traditional sources such as social media. The tools they use to process and safeguard this data need to be able to withstand the threats posed by potential data breaches and malicious attacks.
Secondly, rethinking the core organisational processes governing interactions with customers. This includes the internal measures for setting terms and conditions, how customers are informed of the firm’s intention to use their data, and how risk is assessed. A customer applying for medical insurance will disclose deeply personal information to the insurance provider: it is imperative that the insurer provides reassurance that the customer’s data will be treated respectfully, with discretion, and only with their express permission.
Thirdly, financial services firms need to define a core set of principles for how they treat customers and what constitutes fair treatment. This should be an extension of a broader organisational focus on treating customers fairly, and can go some way to repairing the trust deficit between the financial services industry and the customers they serve.